Tracing molecular properties throughout evolution: A chemoinformatic approach

J Theor Biol. 2021 Apr 21:515:110601. doi: 10.1016/j.jtbi.2021.110601. Epub 2021 Jan 26.

Abstract

The evolution of metabolism is a longstanding yet unresolved question, and several hypotheses have been proposed to address this complex process from a Darwinian point of view. Modern statistical bioinformatic approaches aimed at the comparative analysis of genomes are being used to detect signatures of natural selection at the gene and population level, in an attempt to understand the origin of primordial metabolism and its expansion. These studies, however, are still mainly centered on genes and the proteins they encode, somewhat neglecting the small organic chemicals that support life processes. In this work, we selected steroids as an ancient family of metabolites widely distributed in all eukaryotes and applied unsupervised machine learning techniques to reveal the traits that natural selection has imprinted on molecular properties throughout the evolutionary process. Our results clearly show that sterols, the primal steroids that first appeared, have more conserved properties and that, from then on, more complex compounds with increasingly diverse properties have emerged, suggesting that chemical diversification parallels the expansion of biological complexity. In a wider context, these findings highlight the value of chemoinformatic approaches for better understanding the evolution of metabolism.

Keywords: Chemoinformatics; Machine Learning; Metabolic evolution; Molecular properties; Steroids.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cheminformatics*
  • Eukaryota
  • Evolution, Molecular
  • Genome
  • Phylogeny
  • Selection, Genetic*