Quantitative elucidation of associations between nucleotide identity and physicochemical properties of amino acids and the functional insight

Comput Struct Biotechnol J. 2021 Jul 17:19:4042-4048. doi: 10.1016/j.csbj.2021.07.012. eCollection 2021.

Abstract

Studying codon properties can deepen our understanding of the origin of primitive life and inform biotechnological applications. Here, we proposed a quantitative measure of codon-amino acid association and found that seven of 13 physicochemical properties show stronger associations with the nucleotide identity at the second codon position, indicating that protein structure and function may be associated more closely with this position than with the other two. When the codon-amino acid association was extended to the protein level, the correlation between the second codon position (measured by the relative frequencies of nucleobases T and A at this site) and hydrophobicity (expressed as the GRAVY value) became stronger, with 96% of genomes having R > 0.90 and p < 1e-60. Furthermore, we found that proteins encoded by informational genes have lower GRAVY values than operational proteins (p < 3e-37) in both prokaryotic and eukaryotic genomes. Together, these results reveal a complete link from codon identity (A2 versus T2) to amino acid property (hydrophilic versus hydrophobic) and then to protein function (informational versus operational). Hence, our work may help explain how the nucleotide sequence determines protein function.

Keywords: A2 versus T2 frequency; Amino acid physicochemical property; Codon-amino acid association; Informational versus operational functions; Nucleotide combination at specific codon position; The hydropathy and GRAVY value.
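
The two quantities that the abstract correlates can be illustrated with a minimal Python sketch. It assumes that GRAVY is computed in the usual way, as the mean Kyte-Doolittle hydropathy of a protein's residues, and that the second-position signal is the relative frequency of each base at codon position 2. The abstract does not state how the T2 and A2 frequencies are combined into one measure, so the sketch reports them separately. The function names `gravy` and `second_position_freqs` are hypothetical and are not taken from the paper.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein):
    """Mean Kyte-Doolittle hydropathy; non-standard residues are skipped."""
    residues = [aa for aa in protein.upper() if aa in KD]
    if not residues:
        raise ValueError("protein contains no standard residues")
    return sum(KD[aa] for aa in residues) / len(residues)


def second_position_freqs(cds):
    """Relative frequency of A, C, G and T at the second codon position.

    Assumes `cds` is an in-frame coding sequence; trailing bases that do
    not form a complete codon are ignored.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    second = [cds[i + 1] for i in range(0, usable, 3)]
    if not second:
        raise ValueError("sequence contains no complete codon")
    return {base: second.count(base) / len(second) for base in "ACGT"}


if __name__ == "__main__":
    # Toy example (not data from the paper): ATG CTG GTT AAA GAT = M L V K D
    freqs = second_position_freqs("ATGCTGGTTAAAGAT")
    print("T2:", round(freqs["T"], 3), "A2:", round(freqs["A"], 3))
    print("GRAVY:", round(gravy("MLVKD"), 3))
```

In this toy sequence, three of the five codons carry T at position 2 and encode hydrophobic residues (M, L, V), while the two codons with A at position 2 encode hydrophilic residues (K, D). This gives T2 = 0.6, A2 = 0.4 and a GRAVY of 0.5, which is the direction of the A2 versus T2 relationship the abstract describes.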