Using molecular sizes of simple sequence repeats vs. discrete binned data in assessing probability of ancestry: application to maize hybrids

Genetics. 2005 May;170(1):365-74. doi: 10.1534/genetics.103.022061. Epub 2004 Sep 30.

Abstract

Most inferential methods for profiling genotypes from DNA fragments use molecular-size data transcribed into discrete bins, which are intervals of DNA fragment sizes. Categorizing fragments into bins is labor intensive and inevitably somewhat arbitrary, and the binning may vary between laboratories. We describe and evaluate an algorithm for determining probabilities of parentage based on raw molecular-size data without establishing bins. We determine the standard deviation of DNA fragment size and assess the association of standard deviation with fragment size. We consider a pool of potential ancestors for an index line that is a hybrid with unknown pedigree. We evaluate the identification of inbred parents of maize hybrids with simple sequence repeat data in the form of actual molecular sizes received from two laboratories. We find the standard deviation to be essentially constant across fragment sizes. We compare these results with those of parallel analyses based on the same data transcribed into discrete bins by the respective laboratories. The conclusions were quite similar in the two cases, with excellent performance using either binned or molecular-size data. We demonstrate the algorithm's utility and robustness through simulations with varying levels of missing and misscored molecular-size data.
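The idea of scoring candidate parents directly on raw fragment sizes, without bins, can be illustrated with a minimal sketch. This is not the algorithm of the paper, whose details the abstract does not give. It assumes, purely for illustration, that each candidate inbred carries one allele size per locus, that measurement error is Gaussian with a constant standard deviation `sd` (the value 0.5 is arbitrary), that all candidates are equally likely a priori, and that loci with missing data are simply skipped. The function names, locus names, and sizes are all hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def log_likelihood(hybrid, inbred, sd=0.5):
    """Log-likelihood that `inbred` contributed one allele per locus to `hybrid`.

    hybrid: dict mapping locus -> tuple of observed fragment sizes
    inbred: dict mapping locus -> one fragment size, or None if missing
    sd:     assumed constant measurement standard deviation (illustrative)
    """
    total = 0.0
    for locus, hybrid_sizes in hybrid.items():
        inbred_size = inbred.get(locus)
        if inbred_size is None or not hybrid_sizes:
            continue  # assumption: loci with missing data are skipped
        # Score the hybrid allele that lies closest to the inbred's allele.
        best = max(normal_pdf(h, inbred_size, sd) for h in hybrid_sizes)
        total += math.log(max(best, 1e-300))  # floor guards against log(0)
    return total


def posterior(hybrid, candidates, sd=0.5):
    """Posterior probability of each candidate under a uniform prior."""
    logs = {name: log_likelihood(hybrid, inbred, sd)
            for name, inbred in candidates.items()}
    top = max(logs.values())
    # Subtract the maximum before exponentiating for numerical stability.
    weights = {name: math.exp(value - top) for name, value in logs.items()}
    norm = sum(weights.values())
    return {name: weight / norm for name, weight in weights.items()}


# Hypothetical data: raw fragment sizes at two loci.
hybrid = {"locusA": (120.1, 134.0), "locusB": (98.2, 98.2)}
candidates = {
    "P1": {"locusA": 120.0, "locusB": 98.0},
    "P2": {"locusA": 134.2, "locusB": 110.0},
    "P3": {"locusA": 150.0, "locusB": None},  # missing score at locusB
}
probabilities = posterior(hybrid, candidates)
print(max(probabilities, key=probabilities.get))
```

Because no bin boundaries are involved, a size of 120.1 and a size of 120.0 are compared by their distance relative to `sd` rather than by whether they fall in the same interval. Note that skipping a missing locus means a candidate with fewer scored loci accumulates fewer penalty terms; a real analysis would need to handle this more carefully.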

Publication types

  • Comparative Study

MeSH terms

  • Algorithms
  • Analysis of Variance
  • Data Interpretation, Statistical
  • Hybridization, Genetic*
  • Minisatellite Repeats*
  • Zea mays / genetics*