Bioinformatics analysis of differentially expressed genes in tumor and paracancerous tissues of patients with lung adenocarcinoma

J Thorac Dis. 2020 Dec;12(12):7355-7364. doi: 10.21037/jtd-20-3453.

Abstract

Background: Lung adenocarcinoma is the main pathological type of non-small cell lung cancer (NSCLC). In this study, we used bioinformatics to analyze the gene expression profiles of lung adenocarcinoma tumor and paracancerous tissues and to identify the genes and signaling pathways related to lung adenocarcinoma.

Methods: Expression data for GSE7670, GSE27262, and GSE32863 were downloaded from the Gene Expression Omnibus (GEO) database. The three microarray data sets were integrated to obtain the differentially expressed genes they share between lung adenocarcinoma tumor and adjacent tissues. The STRING database was used to construct the protein-protein interaction (PPI) network of lung adenocarcinoma and to mine the gene modules and core genes in the network. The online tools GEPIA and Kaplan-Meier plotter were used to further verify and analyze the core genes.
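
The integration step can be illustrated with a minimal sketch. It assumes each data set has already been reduced to a table of log2 fold changes (tumor versus adjacent tissue) and that a gene counts as "common" only if it passes the same cutoff, in the same direction, in every data set. The abstract does not state the cutoff or the software used, so the function name `common_degs`, the cutoff of 1.0, and the toy gene labels and values below are all hypothetical.

```python
from typing import Dict, Set


def common_degs(
    datasets: Dict[str, Dict[str, float]], lfc_cutoff: float = 1.0
) -> Dict[str, Set[str]]:
    """Return genes that pass the cutoff, in the same direction, in every data set.

    `datasets` maps a data set label to {gene symbol: log2 fold change}.
    The cutoff is an illustrative assumption, not a value from the study.
    """
    up_sets, down_sets = [], []
    for lfc in datasets.values():
        up_sets.append({g for g, v in lfc.items() if v >= lfc_cutoff})
        down_sets.append({g for g, v in lfc.items() if v <= -lfc_cutoff})
    return {
        "up": set.intersection(*up_sets),
        "down": set.intersection(*down_sets),
    }


# Toy placeholder values (NOT data from GSE7670, GSE27262, or GSE32863).
toy = {
    "dataset_1": {"GENE_A": 2.1, "GENE_B": -1.8, "GENE_C": 0.3, "GENE_D": -2.2},
    "dataset_2": {"GENE_A": 1.5, "GENE_B": -1.2, "GENE_C": 1.4, "GENE_D": 1.1},
    "dataset_3": {"GENE_A": 1.9, "GENE_B": -2.5, "GENE_C": 1.1, "GENE_D": -1.6},
}

result = common_degs(toy)
print(sorted(result["up"]), sorted(result["down"]))
```

Requiring a consistent direction across data sets is a design choice made here for illustration: in the toy data, GENE_D is excluded because it is down-regulated in two data sets but up-regulated in the third.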

Results: There were 109 pairs of lung adenocarcinoma tissues and matched paracancerous normal lung tissues in the three data sets. Eighty-three differentially expressed genes were identified, including 16 up-regulated and 67 down-regulated genes, and 60 differentially expressed genes were successfully incorporated into the PPI network complex. Eleven core genes were identified in the PPI network complex, including three up-regulated genes (COMP, SPP1, COL1A1) and eight down-regulated genes (CDH5, CAV1, CLDN5, LYVE1, IL6, VWF, TEK, PECAM1). These core genes were verified using the GEPIA tumor database. Survival analysis showed that expression of the core genes was significantly related to the prognosis of lung adenocarcinoma. KEGG pathway analysis of the core genes showed that six genes (COMP, SPP1, COL1A1, IL6, VWF, TEK) were significantly enriched in the PI3K-Akt signaling pathway (P=1.62E-06).
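
For readers unfamiliar with how a pathway enrichment P value of this kind arises, the following is a minimal sketch of a one-sided hypergeometric test, a common basis for over-representation analysis. The abstract does not say which tool, background gene set, or test variant produced P=1.62E-06, so this sketch is not expected to reproduce that value. The function name `hypergeom_enrichment_p` is hypothetical, and the pathway size and background size in the second call are placeholders, not figures from the study or from KEGG.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided hypergeometric P(X >= k).

    k: genes in the query list that belong to the pathway
    n: size of the query list
    K: pathway members in the background
    N: size of the background gene set
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Small worked case that can be checked by hand:
# (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = 40/120
p_toy = hypergeom_enrichment_p(k=2, n=3, K=4, N=10)
print(p_toy)

# Shape of the calculation for 6 pathway genes among 11 core genes.
# K and N below are PLACEHOLDER assumptions for illustration only.
p_sketch = hypergeom_enrichment_p(k=6, n=11, K=350, N=20000)
print(p_sketch)
```

The result depends strongly on the choice of background: a smaller background or a larger pathway gives a larger P value for the same overlap.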

Conclusions: By analyzing the differentially expressed genes of lung adenocarcinoma and paracancerous normal tissues with bioinformatics, 11 genes with significant differential expression and a significant influence on prognosis were identified. The findings may provide new leads for developing diagnostic and therapeutic targets and prognostic markers for lung adenocarcinoma.

Keywords: Gene Expression Omnibus (GEO) database; Lung adenocarcinoma; PI3K-Akt signaling-pathway; bioinformatics analysis; differential expression genes.