Analysis of alcoholism data using support vector machines

BMC Genet. 2005 Dec 30;6 Suppl 1(Suppl 1):S136. doi: 10.1186/1471-2156-6-S1-S136.

Abstract

A supervised learning method, the support vector machine, was used to analyze the microsatellite marker dataset of the Collaborative Study on the Genetics of Alcoholism Problem 1 for the Genetic Analysis Workshop 14. Twelve binary-valued phenotype variables were chosen for analysis using the markers from all autosomal chromosomes. Using various polynomial kernel functions of the support vector machine and randomly divided genome regions, we were able to observe the association of some marker sets with the chosen phenotypes and thus reduce the size of the dataset. The successful classifications established with the chosen support vector machine kernel function had a high rate of correct predictions, e.g., 96% in the fourfold cross-validations. However, owing to the limited sample data, we were not able to test the predictions of the classifiers on new sample data.
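
The abstract does not give the kernel parameters, the marker encoding, or the solver used, so the following is only a minimal sketch of the general idea: a polynomial kernel evaluated inside a fourfold cross-validation loop. Everything in it is an assumption made for illustration: the synthetic 0/1/2 "marker" features, the toy phenotype rule, the names `poly_kernel`, `train_kernel_perceptron`, and `fourfold_accuracy`, and the kernel settings `degree=2`, `coef0=1.0`. To keep the sketch free of external dependencies, a kernel perceptron stands in for a real support vector machine solver; it uses the same kernel but does not maximize the margin.

```python
import random


def poly_kernel(x, z, degree=2, coef0=1.0):
    """Polynomial kernel K(x, z) = (x . z + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, z)) + coef0) ** degree


def decision_score(X_train, y_train, alpha, kernel, x):
    """Kernel expansion: sum_j alpha_j * y_j * K(x_j, x)."""
    return sum(a * yj * kernel(xj, x)
               for a, xj, yj in zip(alpha, X_train, y_train) if a)


def train_kernel_perceptron(X, y, kernel, epochs=20):
    """Fit dual coefficients with a kernel perceptron.

    NOTE: a stand-in for an SVM solver (assumption for illustration).
    """
    alpha = [0] * len(X)
    for _ in range(epochs):
        mistakes = 0
        for i, xi in enumerate(X):
            if y[i] * decision_score(X, y, alpha, kernel, xi) <= 0:
                alpha[i] += 1          # reinforce a misclassified sample
                mistakes += 1
        if mistakes == 0:              # training set separated; stop early
            break
    return alpha


def fourfold_accuracy(X, y, kernel, seed=0):
    """Return the held-out accuracy of each of four folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::4] for k in range(4)]
    accuracies = []
    for k in range(4):
        test = folds[k]
        train = [i for j in range(4) if j != k for i in folds[j]]
        X_tr = [X[i] for i in train]
        y_tr = [y[i] for i in train]
        alpha = train_kernel_perceptron(X_tr, y_tr, kernel)
        correct = sum(
            (1 if decision_score(X_tr, y_tr, alpha, kernel, X[i]) > 0 else -1) == y[i]
            for i in test
        )
        accuracies.append(correct / len(test))
    return accuracies


# Synthetic data (hypothetical): 40 samples, 6 features coded 0/1/2.
rng = random.Random(1)
X = [[rng.randint(0, 2) for _ in range(6)] for _ in range(40)]
# Toy binary phenotype: +1 when the first two features agree, else -1.
y = [1 if row[0] == row[1] else -1 for row in X]

accuracies = fourfold_accuracy(X, y, poly_kernel)
print("per-fold accuracy:", accuracies)
```

The accuracies printed here describe only the synthetic data and say nothing about the 96% figure reported in the abstract. In practice one would replace the perceptron with a proper support vector machine implementation and run one such loop per phenotype and marker set.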

MeSH terms

  • Alcoholism / genetics*
  • Algorithms*
  • Databases, Genetic*
  • Genome, Human / genetics
  • Humans
  • Microsatellite Repeats / genetics
  • Phenotype