Applications of 2D and 3D-Dynamic Representations of DNA/RNA Sequences for a Description of Genome Sequences of Viruses

Comb Chem High Throughput Screen. 2022;25(3):429-438. doi: 10.2174/1386207324666210804120454.

Abstract

The aim of this study is to show that graphical bioinformatics methods are good tools for describing the genome sequences of viruses. A new approach to the identification of unknown virus strains is proposed.

Methods: Biological sequences have been represented graphically through 2D and 3D-Dynamic Representations of DNA/RNA Sequences, theoretical methods for the graphical representation of sequences that we developed previously. In these approaches, ideas from classical dynamics are introduced into bioinformatics. The sequences are represented by sets of material points in 2D or 3D space. The distribution of the points in space is characteristic of the sequence. The numerical parameters (descriptors) characterizing the sequences correspond to quantities typical of classical dynamics.
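The idea of treating a sequence as a set of material points can be illustrated with a minimal Python sketch. The abstract does not specify how bases map to steps or which descriptors are computed, so everything below is an assumption made for illustration: the base-to-step table `STEPS`, the rule that a point's mass equals the number of times the walk lands on it, and the choice of center of mass and principal moments of inertia as descriptors. The function names `dynamic_graph_2d` and `descriptors` are hypothetical.

```python
import math
from collections import Counter

# Assumed base-to-step assignment (illustrative; not given in the abstract).
STEPS = {"A": (-1, 0), "G": (1, 0), "C": (0, 1), "T": (0, -1), "U": (0, -1)}


def dynamic_graph_2d(sequence):
    """Walk through the sequence and return a {(x, y): mass} mapping.

    Assumption: each base moves the walk by one unit step, and the mass of
    a point is the number of times the walk lands on it.
    """
    x, y = 0, 0
    masses = Counter()
    for base in sequence.upper():
        if base not in STEPS:
            continue  # skip ambiguous symbols such as N
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        masses[(x, y)] += 1
    return masses


def descriptors(masses):
    """Return the center of mass and the principal moments of inertia."""
    total = sum(masses.values())
    if total == 0:
        raise ValueError("sequence contains no recognised bases")

    cx = sum(m * x for (x, _), m in masses.items()) / total
    cy = sum(m * y for (_, y), m in masses.items()) / total

    # Planar inertia tensor about the center of mass.
    ixx = sum(m * (y - cy) ** 2 for (_, y), m in masses.items())
    iyy = sum(m * (x - cx) ** 2 for (x, _), m in masses.items())
    ixy = -sum(m * (x - cx) * (y - cy) for (x, y), m in masses.items())

    # Eigenvalues of the symmetric 2x2 tensor [[ixx, ixy], [ixy, iyy]].
    mean = (ixx + iyy) / 2
    radius = math.hypot((ixx - iyy) / 2, ixy)
    return {
        "center_of_mass": (cx, cy),
        "principal_moments": (mean - radius, mean + radius),
    }


if __name__ == "__main__":
    graph = dynamic_graph_2d("ATGCGTAC")
    print(descriptors(graph))
```

Descriptors of this kind give each sequence a small numerical signature, so two genomes can be compared without aligning them.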

Results: Some applications of the theoretical methods have been briefly reviewed. 2D-dynamic graphs representing the complete genome sequences of SARS-CoV-2 are shown.

Conclusion: The 3D-Dynamic Representation of DNA/RNA Sequences, coupled with the random forest algorithm, is shown to successfully classify the subtypes of influenza A virus strains.

Keywords: 2D and 3D-Dynamic Representations of DNA/RNA Sequences; Boruta algorithm; Graphical bioinformatics; machine learning; random forest; supervised learning.

MeSH terms

  • Base Sequence
  • COVID-19*
  • DNA
  • Humans
  • RNA
  • SARS-CoV-2
  • Viruses*

Substances

  • RNA
  • DNA