Deep Learning for Genomics: From Early Neural Nets to Modern Large Language Models

Int J Mol Sci. 2023 Nov 1;24(21):15858. doi: 10.3390/ijms242115858.

Abstract

The data explosion driven by advances in genomic research, such as high-throughput sequencing techniques, is constantly challenging the conventional methods used in genomics. In parallel with the urgent demand for robust algorithms, deep learning has succeeded in various fields such as vision, speech, and text processing. Yet genomics poses unique challenges for deep learning, since we expect deep learning to provide a superhuman intelligence that explores beyond our knowledge to interpret the genome. A powerful deep learning model should rely on the insightful use of task-specific knowledge. In this paper, we briefly discuss the strengths of different deep learning models from a genomic perspective so as to match each particular task with a proper deep learning-based architecture, and we remark on practical considerations in developing deep learning architectures for genomics. We also provide a concise review of deep learning applications in various aspects of genomic research and point out current challenges and potential research directions for future genomics applications. We believe the collaborative use of ever-growing, diverse data and the fast iteration of deep learning models will continue to contribute to the future of genomics.

Keywords: computer vision; deep learning; genomics; large language model; multi-modal machine learning.

Publication types

  • Review

MeSH terms

  • Algorithms
  • Deep Learning*
  • Genomics / methods
  • Intelligence
  • Language

Grants and funding

This research received no external funding.