TLSEA: a tool for lncRNA set enrichment analysis based on multi-source heterogeneous information fusion

Front Genet. 2023 May 2;14:1181391. doi: 10.3389/fgene.2023.1181391. eCollection 2023.

Abstract

Long non-coding RNAs (lncRNAs) play an important regulatory role in gene transcription and post-transcriptional modification, and lncRNA regulatory dysfunction leads to a variety of complex human diseases. Hence, it might be beneficial to detect the underlying biological pathways and functional categories of genes that encode lncRNAs. This can be done with gene set enrichment analysis, a widely used bioinformatic technique. However, accurately performing gene set enrichment analysis of lncRNAs remains a challenge. Most conventional enrichment analysis methods do not fully incorporate the rich association information among genes, which usually affects their regulatory functions. Here, we developed a novel tool for lncRNA set enrichment analysis (TLSEA) to improve the accuracy of gene functional enrichment analysis. TLSEA extracts low-dimensional vectors of lncRNAs from two functional annotation networks using a graph representation learning method. A novel lncRNA-lncRNA association network was constructed by merging lncRNA-related heterogeneous information obtained from multiple sources with different lncRNA-related similarity networks. In addition, the random walk with restart method was adopted to expand the set of lncRNAs submitted by users according to the lncRNA-lncRNA association network of TLSEA. Finally, a case study of breast cancer was performed, which demonstrated that TLSEA could detect breast cancer more accurately than conventional tools. TLSEA can be accessed freely at http://www.lirmed.com:5003/tlsea.
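
The random walk with restart step mentioned in the abstract can be illustrated with a minimal sketch. This is not the TLSEA implementation: the function name `random_walk_with_restart`, the restart probability of 0.5, the convergence tolerance, and the four-node toy network are all assumptions chosen for illustration. The sketch iterates p(t+1) = (1 - r) W p(t) + r p(0), where W is the column-normalized adjacency matrix of an association network and p(0) places equal probability on the user-submitted (seed) lncRNAs.

```python
import numpy as np

def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for every node.

    adj     : square, symmetric adjacency (or similarity) matrix
    seeds   : indices of the seed nodes (hypothetical user-submitted lncRNAs)
    restart : probability of jumping back to the seed set (assumed value)
    """
    adj = np.asarray(adj, dtype=float)
    # Column-normalize so each column sums to 1 (isolated nodes keep a zero column).
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0
    W = adj / col_sums

    # Initial distribution: uniform over the seed nodes.
    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)

    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (W @ p) + restart * p0
        # Stop once the L1 change between iterations is negligible.
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Toy association network: a chain 0 - 1 - 2 - 3, with node 0 as the only seed.
toy_adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(toy_adj, seeds=[0])
# Rank non-seed nodes by score; the top-ranked ones would extend the input set.
ranked = [int(i) for i in np.argsort(-scores) if i != 0]
print(scores, ranked)
```

In this toy chain, nodes closer to the seed receive higher scores, so an expansion step would add node 1 before node 2 or node 3. How many top-ranked lncRNAs to add, and which restart probability to use, are design choices the abstract does not specify.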

Keywords: functional enrichment analysis; heterogeneous network representation learning; lncRNA; lncRNA–lncRNA association network; random walk with restart; web server.

Grants and funding

This work was supported by the National Natural Science Foundation of China under Grant Nos. 62072154 and 62202330.