Insulin resistance: regression and clustering

PLoS One. 2014 Jun 2;9(6):e94129. doi: 10.1371/journal.pone.0094129. eCollection 2014.

Abstract

In this paper we try to define insulin resistance (IR) precisely for a group of Chinese women. Our definition deliberately does not depend on body mass index (BMI) or age, although in other studies, which use particular random effects models quite different from the models used here, BMI accounts for a large part of the variability in IR. We accomplish our goal by applying Gauss mixture vector quantization (GMVQ), a clustering technique that was developed for lossy data compression. The defining data come from measurements that play major roles in medical practice: levels of lipids and the results of an oral glucose tolerance test (OGTT). Section 1 states precisely what the data are, and the family structures in the data are described in detail. We apply GMVQ to residuals obtained from regressions of OGTT outcomes and lipids on functions of age and BMI that are inferred from the data. A bootstrap procedure developed for our family data, supplemented by insights from other approaches, leads us to believe that two clusters are appropriate for defining IR precisely. One cluster consists of women who are IR, and the other of women who seem not to be. Genes and other features are used to predict cluster membership. We argue that prediction with "main effects" is not satisfactory, but prediction that includes interactions may be.
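
The regress-then-cluster idea in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure, and it rests on several assumptions. The data are synthetic, and the variable names and numeric values are hypothetical. A straight-line fit on BMI alone stands in for the functions of age and BMI that the paper infers from the data. Plain two-cluster Lloyd iteration (k-means) stands in for GMVQ, which additionally fits a Gaussian model to each cluster. The point is only to show why clustering residuals, rather than raw measurements, removes the BMI trend from the cluster definition.

```python
import random

def ols_residuals(x, y):
    """Residuals from the least-squares line y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def sq_dist(p, c):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((pi - ci) ** 2 for pi, ci in zip(p, c))

def two_means(points, iters=50):
    """Lloyd's algorithm with k=2 (a simplified stand-in for GMVQ)."""
    # Deterministic start: lexicographically smallest and largest points.
    centers = [min(points), max(points)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
                  for p in points]
        # Move each center to the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels, centers

# Synthetic data (hypothetical values): two latent groups whose
# measurements share a BMI trend but differ by a constant shift.
rng = random.Random(0)
n = 200
group = [i % 2 for i in range(n)]
bmi = [rng.gauss(23.0, 3.0) for _ in range(n)]
glucose = [4.0 + 0.10 * b + 2.0 * g + rng.gauss(0, 0.5)
           for b, g in zip(bmi, group)]
lipid = [0.5 + 0.05 * b + 1.0 * g + rng.gauss(0, 0.3)
         for b, g in zip(bmi, group)]

# Step 1: remove the BMI trend from each outcome.
residuals = list(zip(ols_residuals(bmi, glucose), ols_residuals(bmi, lipid)))

# Step 2: cluster the residual vectors into two groups.
labels, centers = two_means(residuals)

# Cluster labels are arbitrary, so count agreement up to relabeling.
match = sum(lab == g for lab, g in zip(labels, group))
print("agreement with latent group:", max(match, n - match) / n)
```

Because the clusters are formed from residuals, two women with different BMI but the same deviation from the fitted trend are treated alike, which is the sense in which a definition built this way does not depend on BMI.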

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Blood Glucose / metabolism
  • Cluster Analysis
  • Female
  • Glucose Tolerance Test
  • Humans
  • Insulin Resistance* / genetics
  • Male
  • Polymorphism, Single Nucleotide / genetics
  • Principal Component Analysis
  • Regression Analysis
  • Reproducibility of Results
  • Support Vector Machine

Substances

  • Blood Glucose