Bioinformatic studies suggest that additional information carried by nucleic acids is necessary to construct the three-dimensional structures of proteins. We find underlying correlations between base contents, all of which occur at the third codon position of a gene sequence. Four inverse relationships are observed: between u(3) and c(3), between a(3) and g(3), between u(3) and g(3), and between c(3) and a(3). Two positive relationships are apparent: between u(3) and a(3), and between c(3) and g(3). The corresponding correlation coefficients reach -0.92, -0.89, -0.83, -0.85, 0.83, and 0.66, respectively, for large proteins with multistate folding kinetics. This interconnection of bases can be ascribed to the choice of synonymous codons associated with protein folding in vivo. In this study, the refolding rate constants of large proteins correlate with the content of the third base, suggesting that there is an underlying biochemical rationale, related to guiding protein folding, in the choice of synonymous codons.
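The quantities named above can be illustrated with a minimal sketch. It assumes that u(3), c(3), a(3), and g(3) denote the fractions of each base at the third codon position of a coding sequence, and that the reported coefficients are Pearson correlations taken across a set of genes; neither definition is spelled out here, and the function names `third_base_content` and `pearson` are hypothetical helpers, not the authors' code.

```python
from math import sqrt


def third_base_content(cds):
    """Return the fraction of u, c, a and g at the third codon position.

    Assumption: `cds` is an in-frame coding sequence; DNA input is accepted
    and T is read as U. Trailing bases that do not fill a codon are ignored.
    """
    seq = cds.lower().replace("t", "u")
    usable = len(seq) - len(seq) % 3
    thirds = [seq[i + 2] for i in range(0, usable, 3)]
    if not thirds:
        raise ValueError("sequence contains no complete codon")
    n = len(thirds)
    return {base: thirds.count(base) / n for base in "ucag"}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


def third_base_correlation(sequences, base_x, base_y):
    """Correlate two third-position base contents across a set of genes."""
    contents = [third_base_content(s) for s in sequences]
    return pearson([c[base_x] for c in contents],
                   [c[base_y] for c in contents])


# Toy sequences invented for illustration only; they are not data from the study.
toy_genes = ["AUGGCUAAAUUU", "AUGGCCAAGUUC", "AUGGCAAAAUUU", "AUGGCGAAGUUC"]
r_uc = third_base_correlation(toy_genes, "u", "c")
```

Applied to a real set of coding sequences, calling `third_base_correlation` for each of the six base pairs would give the analogue of the six coefficients listed above, under the stated assumptions.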
Copyright © 2012 Wiley Periodicals, Inc.