Effect of the scale of quantitative trait data on the representativeness of a cotton germplasm sub-core collection

J Zhejiang Univ Sci B. 2013 Feb;14(2):162-70. doi: 10.1631/jzus.B1200075.

Abstract

A cotton germplasm collection with data for 20 quantitative traits was used to investigate the effect of the scale of quantitative trait data on the representativeness of plant sub-core collections. The relationship between the representativeness of a sub-core collection and two influencing factors, the number of traits and the sampling percentage, was studied. A mixed linear model approach was used to eliminate environmental errors and predict genotypic values of accessions. Sub-core collections were constructed using a least distance stepwise sampling (LDSS) method combining standardized Euclidean distance and an unweighted pair-group method with arithmetic means (UPGMA) cluster method. The mean difference percentage (MD), variance difference percentage (VD), coincidence rate of range (CR), and variable rate of coefficient of variation (VR) served as evaluation parameters. Monte Carlo simulation was conducted to study the relationship among the number of traits, the sampling percentage, and the four evaluation parameters. The results showed that the representativeness of a sub-core collection was affected greatly by the number of traits and the sampling percentage, and that these two influencing factors were closely connected. Increasing the number of traits improved the representativeness of a sub-core collection when the data of genotypic values were used. The change in the genetic diversity of sub-core collections with different sampling percentages showed a linear tendency when the number of traits was small, and a logarithmic tendency when the number of traits was large. However, the change in the genetic diversity of sub-core collections with different numbers of traits always showed a strong logarithmic tendency when the sampling percentage was changing. A CR threshold method based on Monte Carlo simulation is proposed to determine the rational number of traits for a relevant sampling percentage of a sub-core collection.
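
The abstract names the evaluation parameters without giving their formulas. As a minimal sketch, two of them (CR and VR) can be computed as below, assuming the definitions commonly used in the core-collection literature: CR is the mean over traits of the sub-core range divided by the full-collection range, and VR is the mean over traits of the sub-core coefficient of variation divided by the full-collection coefficient of variation, both expressed as percentages. The function names, the random sampling, and the simulated data are illustrative assumptions, not the authors' implementation; MD and VD, which rely on per-trait significance tests, are omitted.

```python
import numpy as np

def coincidence_rate_of_range(full, sub):
    """CR (%): mean over traits of (sub-core range / full-collection range).

    Assumed definition; rows are accessions, columns are traits.
    """
    full_range = full.max(axis=0) - full.min(axis=0)
    sub_range = sub.max(axis=0) - sub.min(axis=0)
    return 100.0 * np.mean(sub_range / full_range)

def variable_rate_of_cv(full, sub):
    """VR (%): mean over traits of (sub-core CV / full-collection CV).

    Assumed definition; CV is the sample standard deviation over |mean|.
    """
    cv_full = full.std(axis=0, ddof=1) / np.abs(full.mean(axis=0))
    cv_sub = sub.std(axis=0, ddof=1) / np.abs(sub.mean(axis=0))
    return 100.0 * np.mean(cv_sub / cv_full)

# Hypothetical data: 200 accessions x 20 traits of simulated genotypic values.
rng = np.random.default_rng(0)
full = rng.normal(loc=50.0, scale=10.0, size=(200, 20))

# Random 20% sample used only as a stand-in; the study used LDSS, not random sampling.
idx = rng.choice(full.shape[0], size=40, replace=False)
sub = full[idx]

cr = coincidence_rate_of_range(full, sub)
vr = variable_rate_of_cv(full, sub)
print(f"CR = {cr:.1f}%, VR = {vr:.1f}%")
```

Because a sub-core collection is a subset of the full collection, CR computed this way cannot exceed 100%, which is what makes it usable as a threshold when repeating the calculation over simulated trait subsets and sampling percentages.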

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Biological Specimen Banks*
  • Data Interpretation, Statistical*
  • Genetic Variation / genetics*
  • Gossypium / genetics*
  • Quantitative Trait Loci / genetics*
  • Seeds / genetics*