Depth dependent amino acid substitution matrices and their use in predicting deleterious mutations

Prog Biophys Mol Biol. 2017 Sep:128:14-23. doi: 10.1016/j.pbiomolbio.2017.02.004. Epub 2017 Feb 15.

Abstract

The 20 naturally occurring amino acids differ in the structural environments where they are most likely to occur in proteins. Environments in a protein can be classified by their proximity to solvent using the residue depth measure. Because amino acid frequencies differ across depth levels, substitution frequencies should also vary with depth. To quantify these substitution frequencies, we built depth-dependent substitution matrices. The dataset used to create the matrices consisted of 3696 high-quality, non-redundant pairwise protein structural alignments. One application of these matrices is predicting the tolerance of mutations in different protein environments. Using these substitution scores, we predicted deleterious mutations for 3500 mutations in T4 lysozyme and CcdB. The accuracy of the technique, measured by the Matthews Correlation Coefficient (MCC), is 0.48 on the CcdB testing set, while the best of the other tested methods has an MCC of 0.40. Further development of these substitution matrices could help improve structure-sequence alignment for protein 3D structure modeling.
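
The two quantitative ideas in the abstract, substitution scores computed separately per depth level and evaluation by MCC, can be illustrated with a minimal Python sketch. It uses a generic log-odds construction of the kind behind standard substitution matrices, which is not necessarily the authors' exact procedure. The function names, the input layout (aligned residue pairs tagged with a depth value), and the bin edge in the usage note are all assumptions made for illustration; the abstract does not specify them.

```python
import math
from collections import Counter


def log_odds_matrix(pairs):
    """Symmetric log-odds scores (base 2) from aligned residue pairs.

    Generic construction, assumed here for illustration only.
    """
    pair_counts = Counter()
    for a, b in pairs:
        # Count each aligned pair in both directions so scores are symmetric.
        pair_counts[(a, b)] += 1
        pair_counts[(b, a)] += 1
    total = sum(pair_counts.values())

    # Background frequency of each residue is the marginal of the pair table.
    background = Counter()
    for (a, _), n in pair_counts.items():
        background[a] += n

    scores = {}
    for (a, b), n in pair_counts.items():
        observed = n / total
        expected = (background[a] / total) * (background[b] / total)
        scores[(a, b)] = math.log2(observed / expected)
    return scores


def depth_dependent_matrices(aligned, bin_edges):
    """Group aligned pairs by depth bin and score each bin separately.

    aligned:   iterable of (residue_a, residue_b, depth) tuples (hypothetical layout)
    bin_edges: ascending depth thresholds (hypothetical values)
    Returns {bin_index: scores}, where bin 0 is the shallowest.
    """
    by_bin = {}
    for a, b, depth in aligned:
        idx = sum(depth >= edge for edge in bin_edges)
        by_bin.setdefault(idx, []).append((a, b))
    return {idx: log_odds_matrix(p) for idx, p in by_bin.items()}


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

As a usage example under the same assumptions, `depth_dependent_matrices(aligned, [5.0])` would yield one score table for pairs shallower than the threshold and one for deeper pairs, and a mutation could then be scored by looking up its wild-type and mutant residues in the table for its depth bin. Predicted labels compared against experimental ones give the four counts passed to `mcc`.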

Keywords: Alignment; Deleterious mutation; Depth; Substitution matrix.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Substitution*
  • Bacterial Proteins / chemistry
  • Bacterial Proteins / genetics
  • Bacterial Proteins / metabolism
  • Bacteriophage T4 / enzymology
  • Computational Biology*
  • Models, Molecular
  • Muramidase / chemistry
  • Muramidase / genetics
  • Muramidase / metabolism
  • Point Mutation*
  • Protein Conformation

Substances

  • Bacterial Proteins
  • CcdB protein, Plasmid F
  • Muramidase