A beginners guide to estimating the non-synonymous to synonymous rate ratio of all protein-coding genes in a genome

Methods Mol Biol. 2015:1201:65-90. doi: 10.1007/978-1-4939-1438-8_4.

Abstract

The ratio of non-synonymous to synonymous substitutions (dN/dS) is a useful measure of the strength and mode of natural selection acting on protein-coding genes. It is widely used to study patterns of selection on protein-coding genes at a genomic scale, from the small genomes of viruses, bacteria, and parasitic eukaryotes to the largest eukaryotic genomes. In this chapter we describe all the steps necessary to calculate dN/dS for all genes, using at least two genomes. We include a brief discussion of assigning orthologs and of codon-aware alignment of orthologs. We then describe how to use the CODEML program of the PAML package for phylogenetic analysis to calculate dN/dS and to perform some statistical tests for positive selection. We then outline some methods for interpreting the output and describe how these data can be used to make discoveries about the biology of the species under study. Finally, as a worked example, we show all the steps we used to calculate dN/dS for 3,261 orthologs from six Plasmodium species, including tests for adaptive evolution (see worked_example.pdf).
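
The statistical tests for positive selection mentioned above are typically likelihood ratio tests between nested codon models. The following is a minimal sketch of that calculation, not the chapter's own code. It assumes the two models differ by two free parameters (as in the commonly used CODEML site-model comparisons M1a vs M2a and M7 vs M8), so the test statistic is compared against a chi-square distribution with 2 degrees of freedom. The function name and the log-likelihood values are hypothetical; in practice the log-likelihoods would be read from CODEML output.

```python
import math
from typing import Tuple


def lrt_pvalue_df2(lnl_null: float, lnl_alt: float) -> Tuple[float, float]:
    """Likelihood ratio test for nested models differing by 2 parameters.

    lnl_null: log-likelihood of the null model (no positive selection).
    lnl_alt:  log-likelihood of the alternative model (allows dN/dS > 1).
    Returns (test statistic, p-value).
    """
    # Test statistic: twice the log-likelihood difference.
    stat = 2.0 * (lnl_alt - lnl_null)
    # Optimisation noise can give a slightly negative value; clamp it to zero.
    stat = max(stat, 0.0)
    # For 2 degrees of freedom the chi-square survival function
    # has the closed form exp(-x / 2), so no external library is needed.
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical log-likelihoods for one ortholog group (illustration only).
stat, p_value = lrt_pvalue_df2(lnl_null=-1234.56, lnl_alt=-1228.10)
print(f"2*deltaLnL = {stat:.2f}, p = {p_value:.4g}")
```

When such a test is run on every gene in a genome, as with the thousands of orthologs analysed here, the resulting p-values would normally need a correction for multiple testing before any gene is reported as positively selected.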

MeSH terms

  • Codon
  • Databases, Genetic
  • Evolution, Molecular
  • Genomics / methods*
  • Models, Genetic
  • Phylogeny
  • Plasmodium / genetics
  • Proteins / genetics*
  • Selection, Genetic
  • Software*
  • User-Computer Interface

Substances

  • Codon
  • Proteins