How ambiguity codes specify molecular descriptors and information flow in Code Biology

Nikola Štambuk; Paško Konjevoda; Albert Štambuk

doi:10.1016/j.biosystems.2023.105034

Biosystems. 2023 Nov:233:105034. doi: 10.1016/j.biosystems.2023.105034. Epub 2023 Sep 21.

Authors

Nikola Štambuk¹, Paško Konjevoda², Albert Štambuk³

Affiliations

¹ Centre for Nuclear Magnetic Resonance, Ruđer Bošković Institute, Bijenička cesta 54, HR-10000, Zagreb, Croatia. Electronic address: stambuk@irb.hr.
² Laboratory for Epigenomics, Division of Molecular Medicine, Ruđer Bošković Institute, Bijenička cesta 54, HR-10000, Zagreb, Croatia. Electronic address: pkonjev@irb.hr.
³ Faculty of Kinesiology, University of Zagreb, Horvaćanski zavoj 15, HR-10000 Zagreb, Croatia.

PMID: 37739308
DOI: 10.1016/j.biosystems.2023.105034

Abstract

This article presents the IUPAC ambiguity codes for incompletely specified nucleic acids and their use in Code Biology. We investigated ambiguity codes as mathematical and logical operators, and as truth table elements, for the encoding of amino acids by the Standard Genetic Code. We explain how this nomenclature, combined with truth functions, can be used to extract accurate information on different properties of biological systems. Nucleotide ambiguity codes could be applied to: 1. encoding descriptive information about nucleotides, amino acids and proteins (e.g., polarity, relative solvent accessibility, atom depth), and 2. system modelling ranging from standard bioinformatics tools to classic evolutionary models (i.e., from the Miyazawa-Jernigan statistical potential to the Kimura three-substitution-type model, respectively). It is shown that algorithms based on IUPAC ambiguity codes, Boolean functions and truth tables, the Probabilistic Square of Opposition/Semiotic Square, and Klein 4-groups could be used for bioinformatics analyses and relational data modelling in natural science. The underlying mathematical, logical and semiotic concepts are presented and discussed.
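The idea of treating ambiguity codes as logical operators can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes each IUPAC code is represented as a 4-bit mask over the bases A, C, G, T, so that Boolean OR, AND and NOT become set union, intersection and complement. The function names (`union`, `intersection`, `negate`, `amino_acids`) are hypothetical, and the codon table is the Standard Genetic Code written in the usual TCAG order.

```python
from itertools import product

# One bit per base; an ambiguity code is the OR of the bases it stands for.
BASE_BITS = {"A": 0b0001, "C": 0b0010, "G": 0b0100, "T": 0b1000}

# IUPAC nucleotide codes and the bases each one may represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def to_mask(code):
    """Return the 4-bit mask of an IUPAC code."""
    mask = 0
    for base in IUPAC[code]:
        mask |= BASE_BITS[base]
    return mask


# Reverse lookup: every non-empty mask corresponds to exactly one code.
MASK_TO_CODE = {to_mask(code): code for code in IUPAC}


def union(a, b):
    """Logical OR: the code covering every base allowed by a or b."""
    return MASK_TO_CODE[to_mask(a) | to_mask(b)]


def intersection(a, b):
    """Logical AND: the code for bases allowed by both, or None if empty."""
    return MASK_TO_CODE.get(to_mask(a) & to_mask(b))


def negate(a):
    """Logical NOT (set complement, not the Watson-Crick complement)."""
    return MASK_TO_CODE.get(0b1111 & ~to_mask(a))


# Standard Genetic Code, codons enumerated in TCAG order ('*' marks a stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product("TCAG", repeat=3), AMINO)
}


def amino_acids(ambiguous_codon):
    """Return the set of amino acids encoded by an ambiguous codon."""
    choices = (IUPAC[symbol] for symbol in ambiguous_codon)
    return {CODON_TABLE["".join(codon)] for codon in product(*choices)}
```

Under this representation a single ambiguous codon acts as a compact descriptor of an amino acid group: `amino_acids("NTN")`, for example, collects every amino acid whose codon has T in the second position.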

Keywords: Genetic code; IUPAC ambiguity codes; Klein 4-group; Relational model; Semiotic square; Square of opposition; Truth function.
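The Klein 4-group named in the keywords can be sketched in the same spirit. The example below is an illustration under stated assumptions, not the paper's implementation: it takes the identity plus the three nucleotide substitution classes conventionally associated with the Kimura three-substitution-type model (transitions and the two kinds of transversion) as permutations of the bases, and checks that they compose as a Klein 4-group. The labels `complementary` and `amino_keto` and the helper names are hypothetical.

```python
BASES = "ACGT"

# Four permutations of the bases. Each non-identity element swaps bases
# inside one pair of IUPAC two-base classes: R/Y, W/S and M/K respectively.
GROUP = {
    "identity": {"A": "A", "C": "C", "G": "G", "T": "T"},
    "transition": {"A": "G", "G": "A", "C": "T", "T": "C"},     # within R, Y
    "complementary": {"A": "T", "T": "A", "C": "G", "G": "C"},  # within W, S
    "amino_keto": {"A": "C", "C": "A", "G": "T", "T": "G"},     # within M, K
}


def compose(f, g):
    """Return the permutation 'apply g first, then f'."""
    return {base: f[g[base]] for base in BASES}


def name_of(mapping):
    """Look up the group element equal to the given permutation."""
    for name, element in GROUP.items():
        if element == mapping:
            return name
    raise ValueError("permutation is not an element of the group")


def cayley_table():
    """Build the full composition table of the four elements."""
    return {
        (a, b): name_of(compose(GROUP[a], GROUP[b]))
        for a in GROUP
        for b in GROUP
    }


if __name__ == "__main__":
    table = cayley_table()
    for a in GROUP:
        print(a.ljust(14), [table[(a, b)] for b in GROUP])
```

Every element is its own inverse and composition is commutative, which are the defining properties of the Klein 4-group; composing any two distinct non-identity substitutions yields the third.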