FMFilter: A fast model based variant filtering tool

J Biomed Inform. 2016 Apr:60:319-27. doi: 10.1016/j.jbi.2016.02.013. Epub 2016 Feb 27.

Abstract

The availability of whole exome and genome sequencing has completely changed the structure of genetic disease studies. It is now possible to uncover disease-causing mechanisms in less time and on smaller budgets. As a result, mining the valuable information from the huge amount of data produced by next generation sequencing techniques has become a challenging task. Current tools analyze sequencing data in various ways. However, there is still a need for fast, easy-to-use and effective tools. For genetic disease studies in particular, there is a lack of publicly available tools that support compound heterozygous and de novo models. Also, existing tools either require advanced IT expertise or handle large variant files inefficiently. In this work, we present FMFilter, an efficient filtering tool for next generation sequencing data produced by genetic disease studies. The software lets the user choose the inheritance model (recessive, dominant, compound heterozygous or de novo) and the affected and control individuals. It provides a user-friendly graphical user interface, which removes the need for advanced computer skills. Its various filtering options make it possible to eliminate the majority of false alarms. FMFilter requires negligible memory, so it can easily handle very large variant files, such as multiple whole genomes, on ordinary computers. We demonstrate the variant reduction capability and effectiveness of the proposed tool with public and in-house data for different inheritance models. We also compare FMFilter with existing filtering software. We conclude that FMFilter provides an effective and easy-to-use environment for analyzing next generation sequencing data from Mendelian diseases.
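
The four inheritance models named in the abstract can be illustrated with a minimal Python sketch. It is not FMFilter's implementation: the `Variant` record, the genotype encoding (number of alternate alleles per sample), the sample names and the exact rule chosen for each model are assumptions made for illustration only. The compound heterozygous check reads variants as a stream and keeps only one gene's candidates in memory at a time, which is one plausible way to keep memory use low on large files; it assumes the input is already grouped by gene.

```python
from itertools import groupby
from typing import Dict, Iterable, Iterator, List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical variant record; not a format used by FMFilter."""
    gene: str
    pos: int
    genotypes: Dict[str, int]  # sample name -> alternate allele count (0, 1 or 2)


def is_recessive(v: Variant, affected: List[str], controls: List[str]) -> bool:
    # Assumed rule: every affected sample is homozygous alternate,
    # and no control sample is homozygous alternate.
    return (all(v.genotypes[s] == 2 for s in affected)
            and all(v.genotypes[s] < 2 for s in controls))


def is_dominant(v: Variant, affected: List[str], controls: List[str]) -> bool:
    # Assumed rule: every affected sample carries the alternate allele,
    # and no control sample carries it.
    return (all(v.genotypes[s] >= 1 for s in affected)
            and all(v.genotypes[s] == 0 for s in controls))


def is_de_novo(v: Variant, child: str, parents: List[str]) -> bool:
    # Assumed rule: the child carries the alternate allele, neither parent does.
    return v.genotypes[child] >= 1 and all(v.genotypes[p] == 0 for p in parents)


def compound_het(variants: Iterable[Variant], child: str,
                 mother: str, father: str) -> Iterator[Variant]:
    """Yield candidate variants from genes in which the child is heterozygous
    for at least one variant seen only in the mother and at least one seen
    only in the father. Input must be grouped by gene."""
    for _, group in groupby(variants, key=lambda v: v.gene):
        maternal: List[Variant] = []
        paternal: List[Variant] = []
        for v in group:
            g = v.genotypes
            if g[child] != 1:
                continue  # only heterozygous calls in the child qualify
            if g[mother] == 1 and g[father] == 0:
                maternal.append(v)
            elif g[father] == 1 and g[mother] == 0:
                paternal.append(v)
        if maternal and paternal:
            yield from sorted(maternal + paternal, key=lambda v: v.pos)
```

Because each predicate looks at a single variant, the recessive, dominant and de novo checks can be applied line by line to a variant file without loading it into memory; only the compound heterozygous model needs the per-gene buffering shown above.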

Keywords: Next generation sequencing; Rare diseases; Variant filtering.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Alleles
  • Computational Biology / methods*
  • Computer Graphics
  • Databases, Genetic
  • Exome
  • Genome, Human
  • Heterozygote
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Medical Informatics / methods*
  • Programming Languages
  • Software*
  • Statistics as Topic
  • User-Computer Interface