Limited utility of residue masking for positive-selection inference

Mol Biol Evol. 2014 Sep;31(9):2496-500. doi: 10.1093/molbev/msu183. Epub 2014 Jun 3.

Abstract

Errors in multiple sequence alignments (MSAs) can reduce accuracy in positive-selection inference, so filtering MSAs before further analysis has been suggested. One widely used filter, Guidance, allows users to remove MSA positions aligned with low confidence. However, Guidance's utility in positive-selection inference has been disputed in the literature. We conducted an extensive simulation-based study to fully characterize how Guidance impacts positive-selection inference, specifically for protein-coding sequences of realistic divergence levels. We also investigated whether novel scoring algorithms, which phylogenetically correct confidence scores, and a new gap-penalization score-normalization scheme improved Guidance's performance. We found that no filter, including original Guidance, consistently benefited positive-selection inference. Moreover, all improvements detected were extremely small, and in certain circumstances Guidance-based filters worsened inferences.
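
To make the filtering step concrete, the following is a minimal sketch of residue masking under stated assumptions; it is not the Guidance implementation. It assumes each residue already has a confidence score between 0 and 1, and it replaces residues scoring below a cutoff with a mask character. The function name `mask_residues`, the cutoff of 0.9, the `?` mask character, the toy sequences, and the scores are all hypothetical and chosen only for illustration.

```python
def mask_residues(alignment, scores, cutoff=0.9, mask_char="?"):
    """Return a copy of `alignment` with low-confidence residues masked.

    alignment: dict mapping sequence name -> aligned sequence string
    scores:    dict mapping sequence name -> list of per-residue confidences
    Gap characters ("-") are left untouched; only residues are masked.
    """
    masked = {}
    for name, seq in alignment.items():
        row_scores = scores[name]
        if len(row_scores) != len(seq):
            raise ValueError(f"score/sequence length mismatch for {name}")
        masked[name] = "".join(
            res if res == "-" or s >= cutoff else mask_char
            for res, s in zip(seq, row_scores)
        )
    return masked


# Toy alignment and made-up confidence scores (illustrative only).
alignment = {
    "seq1": "MKT-LV",
    "seq2": "MRTALV",
}
scores = {
    "seq1": [1.0, 0.95, 0.40, 0.0, 0.92, 1.0],
    "seq2": [1.0, 0.95, 0.85, 0.30, 0.92, 1.0],
}
masked = mask_residues(alignment, scores)
print(masked)
```

Masking individual residues keeps the alignment dimensions intact, so the masked alignment can be handed to downstream software that treats the mask character as missing data; whether that helps positive-selection inference is exactly what the study calls into question.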

Keywords: alignment filters; multiple sequence alignment; positive-selection inference; sequence simulation.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Computational Biology / methods*
  • Computer Simulation
  • Proteins / genetics
  • Selection, Genetic
  • Sequence Alignment / methods*
  • Software

Substances

  • Proteins