An application of slow feature analysis to the genetic sequences of coronaviruses and influenza viruses

Hum Genomics. 2021 May 7;15(1):26. doi: 10.1186/s40246-021-00327-2.

Abstract

Background: Mathematical approaches have been used for decades to probe the structure of DNA sequences. This has led to the development of bioinformatics. In this exploratory work, a novel mathematical method is applied to probe the DNA structure of two related viral families: coronaviruses and influenza viruses. The coronaviruses are SARS-CoV-2, SARS-CoV-1, and MERS. The influenza viruses include H1N1-1918, H1N1-2009, H2N2-1957, and H3N2-1968.

Methods: The mathematical method used is slow feature analysis (SFA), a relatively new but promising technique for delineating complex structure in DNA sequences.
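
For readers unfamiliar with SFA, the following is a minimal sketch of the linear variant of the technique, not the authors' pipeline. The one-hot nucleotide encoding, the function names one_hot and linear_sfa, and the eigenvalue cutoff of 1e-10 are all illustrative assumptions; the abstract does not state how the sequences were encoded or which SFA variant was used.

```python
import numpy as np


def one_hot(seq, alphabet="ACGT"):
    """Encode a nucleotide string as a (T, 4) array (assumed encoding, for illustration)."""
    index = {base: i for i, base in enumerate(alphabet)}
    out = np.zeros((len(seq), len(alphabet)))
    for t, base in enumerate(seq.upper()):
        if base in index:  # unknown symbols (e.g. N) stay all-zero
            out[t, index[base]] = 1.0
    return out


def linear_sfa(x, n_features=1):
    """Linear slow feature analysis on a (T, n) signal.

    Returns the slowest output signals, shape (T, n_features), and their
    slowness values (variance of the step-to-step differences).
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean(axis=0)                      # zero mean
    cov = x.T @ x / (len(x) - 1)
    eigval, eigvec = np.linalg.eigh(cov)
    keep = eigval > 1e-10                       # drop degenerate directions
    whiten = eigvec[:, keep] / np.sqrt(eigval[keep])
    z = x @ whiten                              # unit-covariance signal
    dz = np.diff(z, axis=0)                     # discrete "time" derivative
    dcov = dz.T @ dz / (len(dz) - 1)
    slowness, dvec = np.linalg.eigh(dcov)       # eigenvalues in ascending order
    w = whiten @ dvec[:, :n_features]           # slowest directions first
    return x @ w, slowness[:n_features]


if __name__ == "__main__":
    # Toy sequence only; a real analysis would read a genome from a FASTA file.
    toy = "ATGCGTACGTTAGCATGCCAGGTTAACC" * 10
    features, slowness = linear_sfa(one_hot(toy), n_features=2)
    print(features.shape, slowness)
```

The sketch whitens the encoded signal and then picks the directions whose step-to-step differences have the smallest variance, which is the core idea of SFA. Position along the sequence plays the role of time here, which is itself an assumption of this illustration.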

Results: The analysis indicates that the DNA sequences exhibit an elaborate and convoluted structure akin to complex networks. We define a measure of complexity and show that each DNA sequence exhibits a certain degree of complexity within itself, while at the same time there exist complex inter-relationships between the sequences within a family and between the two families. From these relationships, we find evidence, especially for the coronavirus family, that increasing complexity in a sequence is associated with a higher transmission rate but lower mortality.

Conclusions: The complexity measure defined here may hold promise and could become a useful tool for predicting transmission and mortality rates in future viral strains.

Keywords: Coronaviruses; DNA complexity; Influenza viruses; Slow feature analysis.

MeSH terms

  • Betacoronavirus / classification*
  • Betacoronavirus / genetics*
  • Betacoronavirus / physiology
  • Coronavirus Infections / mortality
  • Coronavirus Infections / transmission
  • Coronavirus Infections / virology
  • Evolution, Molecular
  • Humans
  • Influenza A virus / classification*
  • Influenza A virus / genetics*
  • Influenza A virus / physiology
  • Influenza, Human / mortality
  • Influenza, Human / transmission
  • Influenza, Human / virology
  • Models, Genetic*
  • Sequence Analysis, DNA
  • Species Specificity
  • Time Factors