CONRAD: a method for identification of variable and conserved regions within proteins by scale-space filtering

Comput Appl Biosci. 1996 Jun;12(3):197-203. doi: 10.1093/bioinformatics/12.3.197.

Abstract

Advanced sequencing techniques allow rapid deduction of the individual amino acid sequences of highly related proteins. Owing to their quasi-species nature, viral genomes (e.g. HIV-1) are among the most common sources of related proteins; immunoglobulins are another example. Local differences in amino acid conservation are useful indicators of potential domain structures and of immunological or functional epitopes prior to structural analysis of proteins. Although variability indices can be calculated by several methods, delineation of the boundaries between sequence stretches with similar variability indices is left to the user. We use scale-space filtering to delineate conserved and variable sequence stretches within a protein on an algorithmic basis, avoiding arbitrary assignments. Our method correctly identified the variable regions of the human immunoglobulin lambda-chain V-regions (subgroup I). Prediction of the variable regions of the HIV-1 gp120 env protein was in agreement with empirically derived definitions. These examples indicate that our method is useful for the regional assignment of protein variability solely on the basis of amino acid sequences.
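The general idea of delineating regions by scale-space filtering can be illustrated with a minimal sketch. The abstract does not state which variability index, smoothing kernel, or boundary criterion CONRAD uses, so everything below is an assumption made for illustration: per-column Shannon entropy as the variability index, a Gaussian kernel for smoothing, and sign changes of the discrete second derivative (inflection points) as candidate boundaries. The function names `shannon_variability`, `gaussian_smooth`, and `inflection_boundaries` are hypothetical and are not taken from the CONRAD software.

```python
import math
from collections import Counter


def shannon_variability(alignment):
    """Per-column Shannon entropy (bits) of equal-length aligned sequences.

    Assumption: entropy stands in for whichever variability index is used.
    """
    n = len(alignment)
    profile = []
    for column in zip(*alignment):
        counts = Counter(column)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        profile.append(entropy)
    return profile


def gaussian_smooth(signal, sigma):
    """Smooth a 1-D profile with a truncated, normalised Gaussian kernel."""
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    total = sum(kernel)
    kernel = [w / total for w in kernel]
    last = len(signal) - 1
    smoothed = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in zip(offsets, kernel):
            j = min(max(i + k, 0), last)  # clamp indices at the sequence ends
            acc += w * signal[j]
        smoothed.append(acc)
    return smoothed


def inflection_boundaries(smoothed, eps=1e-9):
    """Positions where the discrete second derivative changes sign.

    Values within eps of zero are ignored so flat stretches do not
    produce spurious boundaries from floating-point noise.
    """
    boundaries = []
    prev_sign = 0
    for i in range(1, len(smoothed) - 1):
        d2 = smoothed[i - 1] - 2 * smoothed[i] + smoothed[i + 1]
        sign = (d2 > eps) - (d2 < -eps)
        if sign and prev_sign and sign != prev_sign:
            boundaries.append(i)
        if sign:
            prev_sign = sign
    return boundaries


# Toy variability profile: conserved, variable, conserved (made-up values).
profile = [0.0] * 20 + [1.0] * 20 + [0.0] * 20
for sigma in (1.0, 2.0, 4.0):  # coarser scales keep only robust boundaries
    print(sigma, inflection_boundaries(gaussian_smooth(profile, sigma)))
```

In a scale-space analysis, boundaries that persist as `sigma` grows would be treated as the stable limits of a region, while those that appear only at fine scales would be attributed to local noise. How boundaries are tracked and selected across scales in the published method is not described in the abstract.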

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Conserved Sequence
  • Genetic Variation
  • HIV Envelope Protein gp120 / genetics
  • Humans
  • Immunoglobulin Variable Region / genetics
  • Immunoglobulin lambda-Chains / genetics
  • Molecular Sequence Data
  • Proteins / genetics*
  • Sequence Alignment / methods*
  • Sequence Homology, Amino Acid
  • Software*

Substances

  • HIV Envelope Protein gp120
  • Immunoglobulin Variable Region
  • Immunoglobulin lambda-Chains
  • Proteins