Analysis of mutational spectra: locating hotspots and clusters of mutations using recursive segmentation

Stat Med. 2002 Jul 15;21(13):1867-85. doi: 10.1002/sim.1145.

Abstract

Mutations within different regions of disease-causing genes can vary in their impact on disease initiation and progression. Determining how individual mutations within such genes affect disease risk and progression can improve the accuracy of prognoses and help guide treatment selection. Estimates of mutation-specific risks can be poor, however, when a gene has a large number of distinct mutations and data for any given mutation are sparse. To address this problem, we present a method of analysing the spectrum of mutations observed across a gene that pools together mutations that appear to have similar effects on disease. One assumption underlying the analysis of mutational spectra created in this manner is that the frequency of a mutation in the sample reflects the degree of its effect on disease development. Additionally, mutations that disrupt the same functionally important region of the gene are expected to have a similar impact on disease development, and such mutations tend to form a cluster within the spectrum. We therefore developed an algorithm that segments a spectrum into regions containing sites with similar mutational frequencies, and derived, by simulation, equations that allow one to evaluate whether segmentation is needed. We used this approach to investigate the spectrum of mutations observed in the p53 tumour suppressor gene in colorectal cancer tumours. Here, recursive segmentation identified the boundaries of apparent clusters better than other methods did, and it identified clusters of mutations that corresponded to biologically important regions of the p53 protein.
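
The abstract does not give the splitting criterion or the simulation-derived equations, so the following is only a minimal sketch of how recursive segmentation of a mutational spectrum might look, not the authors' algorithm. It assumes the spectrum is a list of per-site mutation counts, models each segment with a single Poisson rate, and splits a segment at the cut that most improves the log-likelihood. The names `segment` and `_loglik` and the fixed `min_gain` threshold are hypothetical; the threshold stands in for the paper's test of whether segmentation is needed.

```python
import math


def _loglik(counts):
    """Poisson log-likelihood of counts sharing one rate, up to a constant.

    The log(x!) terms are omitted because they cancel when comparing
    a segment against its own split.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    rate = total / len(counts)  # maximum-likelihood rate for the segment
    return total * math.log(rate) - total


def segment(counts, start=0, end=None, min_gain=5.0):
    """Recursively split counts[start:end] into runs of similar frequency.

    Returns a list of half-open (start, end) index pairs.
    min_gain is an assumed, illustrative threshold on the log-likelihood
    improvement; it is not taken from the paper.
    """
    if end is None:
        end = len(counts)
    base = _loglik(counts[start:end])
    best_gain, best_cut = 0.0, None
    for cut in range(start + 1, end):
        gain = _loglik(counts[start:cut]) + _loglik(counts[cut:end]) - base
        if gain > best_gain:
            best_gain, best_cut = gain, cut
    if best_cut is None or best_gain < min_gain:
        return [(start, end)]  # no split is worthwhile; keep as one region
    return (segment(counts, start, best_cut, min_gain)
            + segment(counts, best_cut, end, min_gain))


# Toy spectrum: a dense cluster of mutations flanked by sparse sites.
spectrum = [0, 0, 1, 0, 0, 9, 8, 10, 9, 0, 1, 0, 0]
regions = segment(spectrum)
```

On this toy spectrum the sketch isolates the high-frequency sites as their own region. In practice the threshold would need to be calibrated, for example by simulation as the abstract describes, rather than fixed by hand.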

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms*
  • Colorectal Neoplasms / genetics
  • Computer Simulation
  • DNA Mutational Analysis / methods*
  • Genes, p53 / genetics
  • Humans
  • Models, Statistical*
  • Point Mutation / genetics