Computational identification of biomarker genes for lung cancer considering treatment and non-treatment studies

BMC Bioinformatics. 2020 Dec 3;21(Suppl 9):218. doi: 10.1186/s12859-020-3524-8.

Abstract

Background: Lung cancer is the leading cause of cancer death worldwide, with more than 142,670 deaths estimated in the United States alone in 2019. Consequently, there is an overarching need to identify the key biomarkers for lung cancer. The aim of this study is to computationally identify biomarker genes for lung cancer that can aid in its diagnosis and treatment. The gene expression profiles of two different types of studies, namely non-treatment and treatment, are considered for discovering biomarker genes. In non-treatment studies, healthy samples are the controls and cancer samples are the cases, whereas in treatment studies, the controls are cancer cell lines without treatment and the cases are cancer cell lines with treatment.

Results: Differentially expressed genes (DEGs) for lung cancer were identified from Gene Expression Omnibus (GEO) datasets using the R-based tool GEO2R. A total of 407 DEGs (254 upregulated and 153 downregulated) were obtained from non-treatment studies and 547 DEGs (133 upregulated and 414 downregulated) from treatment studies. Two Cytoscape apps, CytoHubba and MCODE, were used to identify biomarker genes from functional networks built from the DEGs. This study discovered two distinct sets of biomarker genes, one from non-treatment studies and the other from treatment studies, each containing 16 genes. Survival analysis shows that most non-treatment biomarker genes have prognostic capability: low-expression groups have a higher chance of survival than high-expression groups. In contrast, most treatment biomarker genes have prognostic capability in the opposite direction: high-expression groups have a higher chance of survival than low-expression groups.
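
As a rough illustration of the two computational steps summarized above (splitting DEGs into upregulated and downregulated sets, then ranking network nodes to find hub genes), the following minimal Python sketch may help. It is not the authors' pipeline. The cutoffs (|logFC| >= 1, adjusted p-value < 0.05), the function names split_degs and rank_by_degree, and the placeholder gene names are all assumptions made for illustration; the abstract does not state the thresholds or the ranking method used. Degree is used here only because it is one of the simplest topological measures for ranking hubs.

```python
from collections import defaultdict

# Hypothetical cutoffs: the abstract does not report the thresholds used.
LOGFC_CUTOFF = 1.0
ADJ_P_CUTOFF = 0.05


def split_degs(rows, logfc_cutoff=LOGFC_CUTOFF, adj_p_cutoff=ADJ_P_CUTOFF):
    """Split (gene, logFC, adjusted p-value) rows into up/down DEG lists."""
    up, down = [], []
    for gene, logfc, adj_p in rows:
        if adj_p >= adj_p_cutoff:
            continue  # not statistically significant under the assumed cutoff
        if logfc >= logfc_cutoff:
            up.append(gene)    # higher expression in cases than in controls
        elif logfc <= -logfc_cutoff:
            down.append(gene)  # lower expression in cases than in controls
    return up, down


def rank_by_degree(edges, top_n=3):
    """Rank genes by their number of distinct interaction partners."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Sort by descending degree; break ties alphabetically for determinism.
    ranked = sorted(neighbors, key=lambda g: (-len(neighbors[g]), g))
    return ranked[:top_n]


# Toy input with placeholder gene names (not data from the study).
rows = [
    ("GENE_A", 2.3, 0.001),
    ("GENE_B", -1.8, 0.010),
    ("GENE_C", 0.4, 0.020),   # significant but below the fold-change cutoff
    ("GENE_D", 1.5, 0.200),   # large fold change but not significant
    ("GENE_E", -2.1, 0.003),
]
up, down = split_degs(rows)

# Toy interaction network among the placeholder genes.
edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_E"),
    ("GENE_B", "GENE_E"),
    ("GENE_A", "GENE_F"),
]
hubs = rank_by_degree(edges, top_n=2)
```

In practice the rows would come from an exported differential expression table and the edges from a functional interaction network; both inputs here are toy values chosen only to keep the sketch self-contained.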

Conclusion: A computational framework is developed to identify biomarker genes for lung cancer using gene expression profiles. Two different types of studies, non-treatment and treatment, are considered. Most of the biomarker genes from non-treatment studies are involved in mitosis and play a vital role in DNA repair and cell-cycle regulation, whereas most of the biomarker genes from treatment studies are associated with ubiquitination and cellular response to stress. This study provides a list of biomarkers that could help experimental scientists design laboratory experiments to further explore the detailed dynamics of lung cancer development.

Keywords: Bioinformatics; Computational identification of biomarker; Lung cancer biomarkers; Non-treatment studies; Treatment studies.

MeSH terms

  • Biomarkers, Tumor / genetics*
  • Biomarkers, Tumor / metabolism
  • Computational Biology / methods*
  • Gene Expression Profiling
  • Gene Expression Regulation, Neoplastic
  • Gene Ontology
  • Gene Regulatory Networks
  • Humans
  • Lung Neoplasms / genetics*
  • Prognosis
  • Protein Interaction Maps / genetics
  • Signal Transduction / genetics
  • Survival Analysis

Substances

  • Biomarkers, Tumor