Computational systems biology analysis of biomarkers in lung cancer; unravelling genomic regions which frequently encode biomarkers, enriched pathways, and new candidates

Gene. 2018 Jun 15;659:29-36. doi: 10.1016/j.gene.2018.03.038. Epub 2018 Mar 17.

Abstract

The exponential growth of knowledge in scientific publications has given rise to literature mining as a new interdisciplinary science. In text mining, the machine reads the published literature and translates the extracted knowledge into mathematical-like formulas. In this study we took an integrative approach, combining text mining with network discovery, pathway analysis, and enrichment analysis of genomic regions to better understand biomarkers in lung cancer. Particular attention was paid to non-coding biomarkers. In total, 60 microRNA biomarkers have been reported for lung cancer, including some prognostic biomarkers. MIR21, MIR155, MALAT1, and MIR31 were the top non-coding RNA biomarkers of lung cancer. Text mining identified 447 proteins that have been studied as biomarkers in lung cancer. EGFR (receptor), TP53 (transcription factor), KRAS, CDKN2A, ENO2, KRT19, RASSF1, GRP (ligand), SHOX2 (transcription factor), and ERBB2 (receptor) were the most studied proteins. Among small molecules, thymosin-α1, oestrogen, and 8-OHdG have received the most attention. We found that some chromosomal bands, such as 7q32.2, 18q12.1, 6p12, 11p15.5, and 3p21.3, frequently encode lung cancer biomarkers.
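
The abstract does not say which statistic was used for the enrichment analysis of genomic regions. As a minimal sketch, assuming a one-sided hypergeometric test (a common choice for over-representation analysis, not necessarily the authors' method), the enrichment of a chromosomal band for biomarker genes could be computed as follows. The function name and all counts in the usage lines are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_p(k: int, K: int, n: int, N: int) -> float:
    """One-sided hypergeometric p-value P(X >= k).

    N: total number of genes in the background (assumed genome-wide list)
    K: number of background genes located in the chromosomal band
    n: number of biomarker genes in the query list
    k: number of biomarker genes that fall in the band
    """
    upper = min(K, n)
    total = comb(N, n)
    # Sum the probability mass for k, k+1, ..., min(K, n) hits in the band.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Illustrative (made-up) counts: 20,000 background genes, 50 of them in the
# band, 447 biomarker genes in the query list, 5 of which map to the band.
p_value = hypergeom_enrichment_p(k=5, K=50, n=447, N=20000)
print(f"enrichment p-value: {p_value:.4g}")
```

In practice the p-values for all bands would also need correction for multiple testing (for example Benjamini-Hochberg), which this sketch omits.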

Keywords: Biomarker; Lung Cancer; MicroRNA; Systems biology.

MeSH terms

  • Biomarkers, Tumor / genetics*
  • Cell Line, Tumor
  • Data Mining
  • Gene Regulatory Networks
  • Humans
  • Lung Neoplasms / genetics*
  • RNA, Untranslated / genetics
  • Systems Biology / methods*

Substances

  • Biomarkers, Tumor
  • RNA, Untranslated