Determination of the Amino Acid Recruitment Order in Early Life by Genome-Wide Analysis of Amino Acid Usage Bias

Biomolecules. 2022 Jan 21;12(2):171. doi: 10.3390/biom12020171.

Abstract

The mechanisms that shaped the pattern of amino acid recruitment into proteins early in the history of life remain largely unknown. In this study, we conducted genome-wide analyses of amino acid usage and genetic codon structure in 7270 species across the three domains of life. These analyses revealed a ubiquitous amino acid usage bias that was likely independent of codon usage bias. Taking advantage of codon usage bias, we performed pseudotime analysis to re-determine the chronological order of species emergence, which suggested new species relationships traced through the imprint of codon usage evolution. Furthermore, multidimensional data integration showed that the amino acids A, D, E, G, L, P, R, S, T and V might have been the first recruited into the proteins of the last universal common ancestor (LUCA). The data analysis also indicated that the remaining amino acids were most probably incorporated gradually into proteogenesis along two parallel, long-timescale evolutionary routes: I→F→Y→C→M→W and K→N→Q→H. This study provides new insight into the origin of life, particularly regarding the basic protein composition of early life. Our work provides crucial information that will aid further understanding of protein structure and function in relation to their evolutionary history.
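As an illustration of the two quantities the abstract contrasts, the following minimal Python sketch computes amino acid usage frequencies and a codon usage bias measure from coding sequences. The abstract does not state which metrics the authors used; relative synonymous codon usage (RSCU), a widely used measure, is assumed here, along with the standard genetic code. The function names (`count_codons`, `amino_acid_usage`, `rscu`) and the toy sequence are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
# Assumption: the standard code applies; some species use variant codes.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))  # "*" marks stop codons


def count_codons(coding_sequences):
    """Count in-frame codons across a collection of coding sequences."""
    counts = Counter()
    for seq in coding_sequences:
        seq = seq.upper().replace("U", "T")
        # Ignore any trailing bases that do not form a full codon.
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1
    return counts


def amino_acid_usage(counts):
    """Fraction of each amino acid among all encoded residues (stops excluded)."""
    aa_counts = Counter()
    for codon, n in counts.items():
        aa = CODON_TABLE[codon]
        if aa != "*":
            aa_counts[aa] += n
    total = sum(aa_counts.values())
    return {aa: n / total for aa, n in aa_counts.items()} if total else {}


def rscu(counts):
    """RSCU: observed codon count divided by the count expected if all
    synonymous codons of that amino acid were used equally."""
    synonyms = {}
    for codon, aa in CODON_TABLE.items():
        synonyms.setdefault(aa, []).append(codon)
    values = {}
    for aa, codons in synonyms.items():
        total = sum(counts.get(c, 0) for c in codons)
        if aa == "*" or total == 0:
            continue  # undefined for stops and for unused amino acids
        expected = total / len(codons)
        for c in codons:
            values[c] = counts.get(c, 0) / expected
    return values


if __name__ == "__main__":
    toy_cds = ["ATGGCTGCTGCCAAATAA"]  # hypothetical example: M A A A K stop
    codon_counts = count_codons(toy_cds)
    print(amino_acid_usage(codon_counts))
    print(rscu(codon_counts))
```

Amino acid usage is computed over residues regardless of which synonymous codon encodes them, whereas RSCU is normalized within each amino acid's codon family. That separation is what makes it meaningful to ask whether the two biases vary independently.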

Keywords: LUCA; genetic codon; proteogenesis; pseudotime analysis; recruitment.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acids* / genetics
  • Amino Acids* / metabolism
  • Base Composition
  • Codon / genetics
  • Codon Usage
  • Evolution, Molecular
  • Genome-Wide Association Study*

Substances

  • Amino Acids
  • Codon