Mapping Biomolecular Sequences: Graphical Representations - Their Origins, Applications and Future Prospects

Comb Chem High Throughput Screen. 2022;25(3):354-364. doi: 10.2174/1386207324666210510164743.

Abstract

The exponential growth in the depositories of biological sequence data has generated an urgent need to store, retrieve and analyse the data efficiently and effectively. The standard practice of using alignment procedures is not adequate for this purpose because of its high demand on computing resources and time. Graphical representation of sequences has become one of the most popular alignment-free strategies for analysing biological sequences: each basic unit of a sequence - the bases adenine, cytosine, guanine and thymine (uracil in RNA) for DNA/RNA, and the 20 amino acids for proteins - is plotted on a multi-dimensional grid. The resulting curve in 2D or 3D space, or the implied graph in higher dimensions, conveys the underlying information of the sequence through visual inspection; numerical analyses of the plots, in geometrical or matrix terms, provide a measure of comparison between sequences and thus enable the study of sequence hierarchies. The approach has also enabled comparisons of DNA sequences many thousands of bases long and provided new insights into the base composition of DNA sequences. In this article we briefly review the origins and applications of graphical representations and highlight future perspectives in this field.
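As an illustration of the idea described above, the following minimal Python sketch plots a DNA sequence as a walk on a 2D grid and derives a simple numerical descriptor from it. The axis assignment in `STEPS` (A and G along the x-axis, C and T along the y-axis) is only one of several conventions found in the literature and is an assumption here, as are the function names `walk_2d`, `centroid` and `centroid_distance`, which are hypothetical helpers written for this example. The centroid distance is a deliberately simple stand-in for the geometrical descriptors the abstract mentions, not a specific published measure.

```python
from math import hypot

# Assumed axis assignment: one unit step per base.
# Other conventions exist; this choice is illustrative only.
STEPS = {"A": (-1, 0), "G": (1, 0), "C": (0, 1), "T": (0, -1)}


def walk_2d(seq):
    """Return the list of (x, y) points visited by the 2D walk of `seq`."""
    x, y = 0, 0
    points = []
    for base in seq.upper():
        if base == "U":          # treat RNA uracil like thymine
            base = "T"
        if base not in STEPS:    # skip ambiguous symbols such as N
            continue
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


def centroid(points):
    """Mean x and mean y of the plotted points; (0.0, 0.0) if empty."""
    n = len(points)
    if n == 0:
        return (0.0, 0.0)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def centroid_distance(seq_a, seq_b):
    """Euclidean distance between the centroids of two sequence walks.

    A simple alignment-free dissimilarity used here for illustration.
    """
    ax, ay = centroid(walk_2d(seq_a))
    bx, by = centroid(walk_2d(seq_b))
    return hypot(ax - bx, ay - by)


if __name__ == "__main__":
    print(walk_2d("ACGT"))
    print(centroid_distance("ACGTACGT", "AAGGCCTT"))
```

Because each base contributes one constant-time step, the walk and its descriptor are computed in time linear in the sequence length, without any alignment. The points returned by `walk_2d` can be passed to any plotting library for visual inspection.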

Keywords: DNA mapping; GRANCH techniques; Graphical representation; base distribution; peptide vaccines; sequence comparisons; sequence descriptors; sequence visualization.

Publication types

  • Review

MeSH terms

  • DNA* / genetics
  • RNA*
  • Sequence Analysis, DNA / methods

Substances

  • RNA
  • DNA