Purpose: With large amounts of multidimensional molecular data on cancers now deposited in public repositories such as The Cancer Genome Atlas and the International Cancer Genome Consortium, a cancer type-agnostic, integrative platform can help identify signatures with clinical relevance. We devised such a platform and demonstrate it by identifying a molecular signature for patients with metastatic and recurrent (MR) head and neck squamous cell carcinoma (HNSCC).
Methods: We devised a statistical framework accompanied by a graphical user interface-driven application, Clinical Association of Functionally Established MOlecular CHAnges (CAFE MOCHA; https://github.com/binaypanda/CAFEMOCHA), to discover molecular signatures linked to a specific clinical attribute in a cancer type. The platform integrates mutations and indels, gene expression, DNA methylation, and copy number variations. It first discovers a classifier and then predicts the same clinical attribute for an incoming tumor, pulling defined class variables into a single framework that incorporates a coordinate geometry-based algorithm called complete specificity margin-based clustering, which ensures maximum specificity. CAFE MOCHA classifies an incoming tumor sample using either its matched normal or a built-in database of normal tissues. The application is packaged and deployed using the install4j multiplatform installer. To discover a signature linked to distant MR, we tested CAFE MOCHA in HNSCC tumors (n = 513) and then validated it in tumors from an independent cohort (n = 18).
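The idea of a margin-based rule that admits no false positives can be illustrated in one dimension. The sketch below is not the complete specificity margin-based clustering algorithm, whose details are not given here. It is a simplified stand-in that assumes each sample has a single numeric score and that higher scores indicate the positive class. The function name `complete_specificity_cutoff` and its inputs are hypothetical.

```python
def complete_specificity_cutoff(positive_scores, negative_scores):
    """Return (cutoff, sensitivity) for a cutoff lying above every negative score.

    Hypothetical 1-D illustration only; this is not CAFE MOCHA code.
    Assumes higher scores indicate the positive class.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both classes need at least one score")

    max_negative = max(negative_scores)

    # Only positives that exceed every negative can be called positive
    # without producing a false positive on the training data.
    separable = [s for s in positive_scores if s > max_negative]
    if not separable:
        return None, 0.0

    # Put the cutoff midway between the highest negative and the lowest
    # separable positive, leaving an equal margin on both sides.
    cutoff = (max_negative + min(separable)) / 2
    sensitivity = len(separable) / len(positive_scores)
    return cutoff, sensitivity
```

With such a rule, specificity on the training data is 100% by construction, and sensitivity is whatever fraction of positives clears the cutoff. Sensitivity is traded away to keep specificity complete.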
Results: CAFE MOCHA identified an integrated signature, MR44, associated with distant MR HNSCC, with 80% sensitivity and 100% specificity in the discovery stage and 100% sensitivity and 100% specificity in the validation stage.
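For reference, sensitivity and specificity follow from confusion-matrix counts as TP / (TP + FN) and TN / (TN + FP). The short sketch below computes both. The counts in the usage line are hypothetical, chosen only to reproduce an 80% / 100% pattern. They are not the study's patient counts, which are not reported here.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Compute (sensitivity, specificity) from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    sensitivity = tp / (tp + fn)  # fraction of true positives detected
    specificity = tn / (tn + fp)  # fraction of true negatives correctly rejected
    return sensitivity, specificity


# Hypothetical counts for illustration only (not data from the study).
example = sensitivity_specificity(tp=8, fn=2, tn=10, fp=0)
```

Here `example` evaluates to `(0.8, 1.0)`: two positives are missed, and no negative is misclassified.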
Conclusion: CAFE MOCHA is a cancer type- and clinical attribute-agnostic statistical framework for discovering integrated molecular signatures.