RatesTools: a Nextflow pipeline for detecting de novo germline mutations in pedigree sequence data

Bioinformatics. 2023 Jan 1;39(1):btac784. doi: 10.1093/bioinformatics/btac784.

Abstract

Summary: Here, we introduce RatesTools, an automated pipeline to infer de novo mutation rates from parent-offspring trio data of diploid organisms. Given a reference genome and high-coverage, whole-genome resequencing data from a minimum of three individuals (sire, dam and offspring), RatesTools provides a list of candidate de novo mutations and calculates a putative mutation rate. To find potential de novo mutations, RatesTools applies several quality filtering steps, such as discarding sites with low mappability, sites in highly repetitive regions and sites with low genotype or mapping quality. In addition, RatesTools implements several optional filters based on post hoc assumptions of the heterozygosity and mutation rate of the organism. Filters are highly customizable to user specifications in order to maximize utility across a wide range of applications.
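
The logic summarized above can be illustrated with a minimal Python sketch. This is not the RatesTools implementation: the `TrioSite` record, the function names and the quality thresholds are hypothetical. The rate estimator (candidate count divided by twice the number of callable sites) is a commonly used per-site, per-generation formula for diploids, assumed here because the abstract does not state the exact calculation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrioSite:
    """Genotypes at one site, coded as alternate-allele counts (0, 1 or 2)."""
    sire: int
    dam: int
    offspring: int
    min_gq: int   # lowest genotype quality among the three individuals
    mapq: float   # mapping quality at the site


def is_candidate_dnm(site, min_gq=65, min_mapq=40.0):
    """Return True if the site passes quality filters and looks de novo.

    The thresholds are illustrative placeholders, not RatesTools defaults.
    """
    if site.min_gq < min_gq or site.mapq < min_mapq:
        return False
    # Both parents homozygous for the same allele, offspring heterozygous.
    parents_match = site.sire == site.dam and site.sire in (0, 2)
    return parents_match and site.offspring == 1


def mutation_rate(n_candidates, callable_sites):
    """Assumed estimator: candidates / (2 * callable sites) for a diploid."""
    if callable_sites <= 0:
        raise ValueError("callable_sites must be positive")
    return n_candidates / (2 * callable_sites)


sites = [
    TrioSite(0, 0, 1, min_gq=99, mapq=60.0),  # candidate
    TrioSite(0, 1, 1, min_gq=99, mapq=60.0),  # allele inherited from dam
    TrioSite(0, 0, 1, min_gq=20, mapq=60.0),  # fails genotype-quality filter
    TrioSite(2, 2, 1, min_gq=99, mapq=60.0),  # candidate
]
candidates = [s for s in sites if is_candidate_dnm(s)]
rate = mutation_rate(len(candidates), callable_sites=100_000_000)
print(len(candidates), rate)
```

In this sketch the callable-site count stands in for the portion of the genome that survives the mappability, repeat and quality filters, so stricter filters shrink both the numerator and the denominator of the estimate.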

Availability and implementation: RatesTools is freely available at https://github.com/campanam/RatesTools under a Creative Commons Zero (CC0) license. The pipeline is implemented in Nextflow (Di Tommaso et al., 2017), Ruby (http://www.ruby-lang.org), Bash (https://www.gnu.org/software/bash/) and R (R Core Team, 2020) with reliance upon several other freely available tools. RatesTools is compatible with macOS and Linux operating systems.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genome
  • Germ-Line Mutation*
  • Humans
  • Pedigree
  • Sequence Analysis, DNA
  • Software*