DDIG-in: discriminating between disease-associated and neutral non-frameshifting micro-indels

Genome Biol. 2013 Mar 13;14(3):R23. doi: 10.1186/gb-2013-14-3-r23.

Abstract

Micro-indels (insertions or deletions shorter than 21 bp) constitute the second most frequent class of human gene mutation after single nucleotide variants. Despite the relative abundance of non-frameshifting indels, their damaging effect on protein structure and function has gone largely unstudied. We have developed a support vector machine-based method named DDIG-in (Detecting disease-causing genetic variations due to indels) to prioritize non-frameshifting indels by comparing disease-associated mutations with putatively neutral mutations from the 1000 Genomes Project. The final model gives good discrimination for indels and is robust against annotation errors. A webserver implementing DDIG-in is available at http://sparks-lab.org/ddig.
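
To make the two definitions in the abstract concrete, the following is a minimal sketch of how a variant could be labelled as a micro-indel and as frameshifting or non-frameshifting. It is not part of DDIG-in: the names `classify_indel`, `IndelCall` and `MICRO_INDEL_MAX_BP` are hypothetical, the input is assumed to be a simple insertion or deletion given as VCF-style REF/ALT allele strings, and the variant is assumed to lie entirely within coding sequence, so that a net length divisible by 3 preserves the reading frame.

```python
from dataclasses import dataclass

# "Shorter than 21 bp" in the abstract means at most 20 bp.
MICRO_INDEL_MAX_BP = 20


@dataclass(frozen=True)
class IndelCall:
    kind: str               # "insertion" or "deletion"
    length: int             # net number of bases inserted or deleted
    is_micro: bool          # True if length is shorter than 21 bp
    is_frameshifting: bool  # True if length is not a multiple of 3


def classify_indel(ref: str, alt: str) -> IndelCall:
    """Classify a simple indel from REF/ALT allele strings (illustrative only)."""
    diff = len(alt) - len(ref)
    if diff == 0:
        raise ValueError("REF and ALT have equal length; not a simple indel")
    kind = "insertion" if diff > 0 else "deletion"
    length = abs(diff)
    return IndelCall(
        kind=kind,
        length=length,
        is_micro=length <= MICRO_INDEL_MAX_BP,
        # Assumes the variant is in coding sequence: only a net change that
        # is a multiple of 3 leaves the downstream reading frame intact.
        is_frameshifting=length % 3 != 0,
    )
```

For example, `classify_indel("ATCT", "A")` describes a 3 bp deletion that is a non-frameshifting micro-indel, the class of variant DDIG-in is designed to prioritize, whereas `classify_indel("A", "AG")` describes a 1 bp insertion that shifts the reading frame.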

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Conserved Sequence / genetics
  • DNA / genetics
  • Databases, Genetic
  • Disease / genetics*
  • Frameshift Mutation / genetics*
  • Gene Frequency / genetics
  • Humans
  • INDEL Mutation / genetics*
  • ROC Curve
  • Sequence Homology, Nucleic Acid
  • Support Vector Machine

Substances

  • DNA