One Among Millions: The Chemical Space of Nucleic Acid-Like Molecules

J Chem Inf Model. 2019 Oct 28;59(10):4266-4277. doi: 10.1021/acs.jcim.9b00632. Epub 2019 Sep 23.

Abstract

Biology encodes hereditary information in DNA and RNA, which are finely tuned to their biological functions and modes of biological production. The central role of nucleic acids in biological information flow makes them key targets of pharmaceutical research. Indeed, other nucleic acid-like polymers can play similar roles to natural nucleic acids both in vivo and in vitro; yet despite remarkable advances over the last few decades, much remains unknown regarding which structures are compatible with molecular information storage. Chemical space describes the structures and properties of molecules that could exist within a given molecular formula or other classification system. Using structure generation methods, we explore nucleic acid analogues within the formula ranges BC₃₋₇H₅₋₁₅O₂₋₄ and BC₃₋₆H₅₋₁₅N₁₋₂O₀₋₄, where B is a recognition element (e.g., a nucleobase). Other restrictions included two obligatory points of attachment for inclusion into a linear polymer and substructures predicting chemical stability. These sets contain 86,007 (CHO) and 75,309 (CHNO) compositionally isomeric structures, representing 706,568 CHO and 454,422 CHNO stereoisomers, that diversely and densely occupy this space. These libraries point toward there being large spaces of unexplored chemistry relevant to pharmacology and biochemistry and efforts to understand the origins of life.
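To make the size of the search space concrete, the following minimal Python sketch enumerates the element-count combinations that fall inside the two formula ranges given in the abstract and totals the reported structure and stereoisomer counts. It is an illustration only, not the structure generation method used in the paper: the names `CHO_RANGES`, `CHNO_RANGES`, `REPORTED`, and `candidate_formulas` are hypothetical, and the enumeration is a naive grid over atom counts that applies no valence, attachment-point, or stability filtering, so many of the listed compositions may not correspond to any valid structure.

```python
from itertools import product

# Atom-count ranges taken from the abstract (inclusive bounds).
# The recognition element B is treated as a fixed substituent and not counted.
CHO_RANGES = {"C": range(3, 8), "H": range(5, 16), "O": range(2, 5)}
CHNO_RANGES = {"C": range(3, 7), "H": range(5, 16), "N": range(1, 3), "O": range(0, 5)}

# Counts reported in the abstract.
REPORTED = {
    "CHO": {"structures": 86_007, "stereoisomers": 706_568},
    "CHNO": {"structures": 75_309, "stereoisomers": 454_422},
}


def candidate_formulas(ranges):
    """Return every atom-count combination inside the given ranges.

    This is a plain Cartesian product: no chemical validity check is applied.
    """
    elements = list(ranges)
    return [dict(zip(elements, counts)) for counts in product(*ranges.values())]


def totals(reported):
    """Sum structures and stereoisomers across both libraries."""
    structures = sum(entry["structures"] for entry in reported.values())
    stereoisomers = sum(entry["stereoisomers"] for entry in reported.values())
    return structures, stereoisomers


if __name__ == "__main__":
    print("CHO compositions in range: ", len(candidate_formulas(CHO_RANGES)))
    print("CHNO compositions in range:", len(candidate_formulas(CHNO_RANGES)))
    total_structures, total_stereoisomers = totals(REPORTED)
    print("Total structures:   ", total_structures)
    print("Total stereoisomers:", total_stereoisomers)
    for name, entry in REPORTED.items():
        ratio = entry["stereoisomers"] / entry["structures"]
        print(f"{name}: about {ratio:.2f} stereoisomers per structure")
```

Summing the reported figures gives 161,316 structures and 1,160,990 stereoisomers across the two libraries, which is consistent with the "millions" of the title only at the stereoisomer level.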

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cheminformatics
  • Databases, Nucleic Acid*
  • Drug Discovery
  • Nucleic Acid Conformation
  • Nucleic Acids / chemistry*
  • Small Molecule Libraries*

Substances

  • Nucleic Acids
  • Small Molecule Libraries