PaCMAP-embedded convolutional neural network for multi-omics data integration

Heliyon. 2023 Dec 5;10(1):e23195. doi: 10.1016/j.heliyon.2023.e23195. eCollection 2024 Jan 15.

Abstract

Aims: Multi-omics data integration has emerged as a prominent avenue in healthcare, with substantial potential for enhancing predictive models. This study is motivated by the need to advance prognostic methods in cancer diagnosis, an area where precision is pivotal for effective clinical decision-making. To that end, it introduces a methodology that integrates copy number alteration (CNA), DNA methylation, and gene expression data.

Methods: The three omics data types were merged into a two-dimensional (2D) map using the PaCMAP dimensionality reduction technique. Using an RGB coloring scheme, a visual representation of the integration was produced for each sample from that sample's values in the three omics. The colored 2D maps were then fed into a convolutional neural network (CNN) to predict the Gleason score.
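The map-coloring step can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not confirm: each gene already has a 2D coordinate from a PaCMAP embedding (the coordinates below are hypothetical stand-ins), the channels are assigned as red = CNA, green = DNA methylation, blue = gene expression, all omics values are pre-scaled to [0, 1], and genes that fall into the same pixel are averaged. The helper names `min_max_scale`, `genes_to_cells`, and `sample_to_rgb` are invented for illustration.

```python
from typing import List, Sequence, Tuple

Coord = Tuple[float, float]
Pixel = Tuple[float, float, float]


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Rescale values linearly to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def genes_to_cells(coords: Sequence[Coord], size: int) -> List[Tuple[int, int]]:
    """Assign each gene's 2D embedding coordinate to a (row, col) pixel."""
    xs = min_max_scale([x for x, _ in coords])
    ys = min_max_scale([y for _, y in coords])
    # Clamp so that a scaled value of exactly 1.0 lands in the last pixel.
    return [
        (min(int(y * size), size - 1), min(int(x * size), size - 1))
        for x, y in zip(xs, ys)
    ]


def sample_to_rgb(
    cells: Sequence[Tuple[int, int]],
    cna: Sequence[float],
    methylation: Sequence[float],
    expression: Sequence[float],
    size: int,
) -> List[List[Pixel]]:
    """Build one size x size RGB image for a single sample.

    Assumed channel mapping: R = CNA, G = methylation, B = expression.
    Genes sharing a pixel are averaged; empty pixels stay black.
    """
    sums = [[[0.0, 0.0, 0.0] for _ in range(size)] for _ in range(size)]
    counts = [[0] * size for _ in range(size)]
    for (row, col), r, g, b in zip(cells, cna, methylation, expression):
        sums[row][col][0] += r
        sums[row][col][1] += g
        sums[row][col][2] += b
        counts[row][col] += 1
    image: List[List[Pixel]] = []
    for row in range(size):
        image_row: List[Pixel] = []
        for col in range(size):
            n = counts[row][col]
            if n == 0:
                image_row.append((0.0, 0.0, 0.0))
            else:
                s = sums[row][col]
                image_row.append((s[0] / n, s[1] / n, s[2] / n))
        image.append(image_row)
    return image


# Hypothetical embedding of three genes (stand-in for PaCMAP output).
gene_coords = [(0.0, 0.0), (1.0, 1.0), (1.0, 1.0)]
cells = genes_to_cells(gene_coords, size=2)

# Hypothetical omics values for one sample, already scaled to [0, 1].
image = sample_to_rgb(
    cells,
    cna=[1.0, 0.2, 0.4],
    methylation=[0.0, 0.5, 0.5],
    expression=[0.5, 1.0, 0.0],
    size=2,
)
```

The gene-to-pixel layout is computed once and shared across samples, so every image has the same spatial arrangement and only the colors differ. One such image per sample would then form the input to the CNN.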

Results: Our proposed model, which integrates multi-omics data with a CNN architecture, outperforms the state-of-the-art i-SOM-GSN model, achieving an accuracy of 98.89% and an AUC of 0.9996.

Conclusion: This study demonstrates the effectiveness of multi-omics data integration in predicting health outcomes. The proposed methodology, which combines PaCMAP for dimensionality reduction, RGB coloring for visualization, and a CNN for prediction, offers a comprehensive framework for integrating heterogeneous omics data and improving predictive accuracy. These findings contribute to the advancement of personalized medicine and have the potential to aid clinical decision-making for prostate cancer patients.

Keywords: Convolutional neural network; Embedding techniques; Multi-omics data integration; PaCMAP.