A genome-phenome association study in native microbiomes identifies a mechanism for cytosine modification in DNA and RNA

Elife. 2021 Nov 8:10:e70021. doi: 10.7554/eLife.70021.

Abstract

Shotgun metagenomic sequencing is a powerful approach to study microbiomes in an unbiased manner and of increasing relevance for identifying novel enzymatic functions. However, the potential of metagenomics to relate from microbiome composition to function has thus far been underutilized. Here, we introduce the Metagenomics Genome-Phenome Association (MetaGPA) study framework, which allows linking genetic information in metagenomes with a dedicated functional phenotype. We applied MetaGPA to identify enzymes associated with cytosine modifications in environmental samples. From the 2365 genes that met our significance criteria, we confirm known pathways for cytosine modifications and proposed novel cytosine-modifying mechanisms. Specifically, we characterized and identified a novel nucleic acid-modifying enzyme, 5-hydroxymethylcytosine carbamoyltransferase, that catalyzes the formation of a previously unknown cytosine modification, 5-carbamoyloxymethylcytosine, in DNA and RNA. Our work introduces MetaGPA as a novel and versatile tool for advancing functional metagenomics.

Keywords: 5-hydroxymethylcytosine carbamoyltransferase; DNA modification; GWAS; MetaGPA; RNA modification; functional metagenomics; genetics; genomics; infectious disease; microbiology; microbiome.

Plain language summary

Many industrial processes, such as starch processing and oil refinement, use chemicals that cause harm to the environment. These can often be switched to more sustainable biological processes that are powered by proteins called enzymes. Enzymes are micro-factories that speed up biochemical reactions in most living things. Communities of microorganisms (also known as microbiomes) are an amazing but often untapped resource for discovering enzymes that can be harnessed for industrial purposes. To gain a better picture of the microbes present within a population, researchers often extract and sequence the genetic material of all microorganisms in an environmental sample, also known as the metagenome. While current methods for analyzing the metagenome are good at identifying new species, they often provide limited information about the microorganism’s functional role within the community. This makes it difficult to find new enzymes that may be useful for industry. Here, Yang, Lin et al. have developed a new technique called Metagenomics Genome-Phenome Association, or MetaGPA for short. The method works in a similar way to genome-wide association studies (GWAS) which are used to identify genes involved in human disease. However, instead of disease associated genes in humans, MetaGPA finds microbial genes that are associated with a biological process useful for biotechnology. Like GWAS, the new approach created by Yang, Lin et al. compares two groups: the first contains microorganisms that carry out a specific process, and the second contains all organisms in the microbiome. The metagenome of each group is extracted and a computational pipeline is then applied to identify genes, including those coding for enzymes, that are found more often in the group performing the desired task. To test the technique, Yang, Lin et al. used MetGPA to find new enzymes involved in DNA modification. Microbiome samples were collected from coastal water and sewage, and the computational pipeline was applied to discover genes that are associated with this process. Further analysis revealed that one of the identified genes codes for an enzyme that introduces a previously unknown change to DNA. MetaGPA could be applied to other processes and microbiomes, and, if successful, may help researchers to identify more diverse enzymes than is currently available. This could scale up the discovery of new enzymes that can be used to power industrial reactions.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cytosine / metabolism*
  • DNA, Bacterial / metabolism*
  • Escherichia coli K12 / genetics*
  • Genome, Bacterial*
  • Microbiota / genetics*
  • RNA, Bacterial / metabolism*

Substances

  • DNA, Bacterial
  • RNA, Bacterial
  • Cytosine

Grants and funding

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.