Definitive demonstration by synthesis of genome annotation completeness

Proc Natl Acad Sci U S A. 2019 Nov 26;116(48):24206-24213. doi: 10.1073/pnas.1905990116. Epub 2019 Nov 12.

Abstract

We develop a method for completing the genetics of natural living systems by which the absence of expected future discoveries can be established. We demonstrate the method using bacteriophage øX174, the first DNA genome to be sequenced. As with many well-studied natural organisms, closely related genome sequences are available: 23 Bullavirinae genomes related to øX174. Using bioinformatic tools, we first identified 315 potential open reading frames (ORFs) within the genome, including the 11 established essential genes and 82 highly conserved ORFs that have no known gene products or assigned functions. Using genome-scale design and synthesis, we made a mutant genome in which all 11 essential genes are simultaneously disrupted, leaving intact only the 82 conserved but cryptic ORFs. The resulting genome is not viable. Cell-free gene expression followed by mass spectrometry revealed only a single peptide expressed from both the cryptic ORF and wild-type genomes, suggesting a potential new gene. A second synthetic genome in which 71 conserved cryptic ORFs were simultaneously disrupted is viable but with ∼50% reduced fitness relative to the wild type. However, rather than finding any new genes, repeated evolutionary adaptation revealed a single point mutation that modulates expression of gene H, a known essential gene, and fully suppresses the fitness defect. Taken together, we conclude that the annotation of currently functional ORFs for the øX174 genome is formally complete. More broadly, we show that sequencing and bioinformatics followed by synthesis-enabled reverse genomics, proteomics, and evolutionary adaptation can definitively establish the sufficiency and completeness of natural genome annotations.
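
The abstract does not say which bioinformatic tools or thresholds were used to enumerate the potential ORFs. As a rough illustration only, a six-frame ORF scan might look like the sketch below. The function names, the start-codon set (ATG, GTG, TTG), the minimum-length cutoff, and the treatment of the genome as linear are all assumptions made for this example and are not taken from the paper; øX174 is circular, so a faithful scan would also need to handle ORFs that wrap around the origin.

```python
# Minimal six-frame ORF scan. All names and thresholds here are illustrative
# assumptions, not the method used in the paper.

START_CODONS = {"ATG", "GTG", "TTG"}   # assumed bacterial-style start codons
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=10):
    """Return (strand, frame, start, end) tuples for start-to-stop ORFs.

    The sequence is treated as linear. Coordinates are 0-based, end-exclusive,
    and refer to the strand that was scanned. min_codons counts the codons
    before the stop codon, including the start codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in START_CODONS:
                    start = i  # open an ORF at the first start codon seen
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None  # close the ORF and keep scanning
    return orfs


if __name__ == "__main__":
    toy = "ATG" + "GCT" * 12 + "TAA"   # toy sequence, not phage data
    for orf in find_orfs(toy):
        print(orf)
```

A scan of this kind only proposes candidates. Deciding which candidates are conserved, as in the comparison against related Bullavirinae genomes described above, would require a separate alignment step that this sketch does not attempt.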

Keywords: cleanomics; gene discovery; reverse genomics; synthetic biology; synthetic genomics.

MeSH terms

  • Base Sequence
  • Codon
  • Coliphages / genetics*
  • Conserved Sequence
  • Directed Molecular Evolution
  • Gene Expression Regulation, Viral
  • Genes, Essential
  • Genome, Viral*
  • Genomics / methods
  • Microorganisms, Genetically-Modified
  • Molecular Sequence Annotation / methods*
  • Mutation
  • Open Reading Frames*
  • Viral Proteins / genetics

Substances

  • Codon
  • Viral Proteins