Opportunities and Challenges of Data-Driven Virus Discovery

Biomolecules. 2022 Aug 4;12(8):1073. doi: 10.3390/biom12081073.

Abstract

Virus discovery has been fueled by new technologies ever since the first viruses were discovered at the end of the 19th century. Starting with mechanical devices that provided evidence for virus presence in sick hosts, virus discovery gradually transitioned into a sequence-based scientific discipline, which can now characterize virus identity and explore viral diversity at unprecedented resolution and depth. Sequencing technologies are now being used routinely and at ever-increasing scales, producing an avalanche of novel viral sequences found in a multitude of organisms and environments. In this perspective article, we argue that virus discovery has started to undergo another transformation prompted by the emergence of new approaches that are centered on sequence data and primarily computational, setting them apart from previous technology-driven innovations. The data-driven virus discovery approach is largely uncoupled from the collection and processing of biological samples, and exploits the availability of massive amounts of publicly and freely accessible data from sequencing archives. We discuss open challenges that must be solved to unlock the full potential of data-driven virus discovery, and we highlight the benefits it can bring to classical (mostly molecular) virology and molecular biology in general.

Keywords: computational virology; data mining; sequencing archives; virosphere in health and disease; virus discovery.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Molecular Biology
  • Sequence Analysis
  • Viruses* / genetics

Grants and funding

C.L. is supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy—EXC 2155—project number 390874280. C.L. and S.S. received support from the project “Virological and immunological determinants of COVID-19 pathogenesis—lessons to get prepared for future pandemics (KA1-Co-02 “CoViPa”)”, a grant from the Helmholtz Association’s Initiative and Network Fund.