Soybean genomic survey: BAC-end sequences near RFLP and SSR markers

Genome. 2001 Aug;44(4):572-81.

Abstract

We are building a framework physical infrastructure across the soybean genome by using SSR (simple sequence repeat) and RFLP (restriction fragment length polymorphism) markers to identify BACs (bacterial artificial chromosomes) from two soybean BAC libraries. The libraries were prepared from two genotypes, each digested with a different restriction enzyme. The BACs identified by each marker were grouped into contigs. We have obtained BAC-end sequences from BACs within each contig. The sequences were analyzed by the University of Minnesota Center for Computational Genomics and Bioinformatics using BLAST algorithms to search nucleotide and protein databases. The SSR-identified BACs had a higher percentage of significant BLAST hits than did the RFLP-identified BACs. This difference was due to a higher percentage of hits to repetitive-type sequences among the SSR-identified BACs; it was offset in part, however, by a somewhat larger proportion of RFLP-identified significant hits with similarity to experimentally defined genes and soybean ESTs (expressed sequence tags). These genes represented a wide range of metabolic functions. In these analyses, only repetitive sequences from SSR-identified contigs appeared to be clustered. The BAC-end sequences also allowed us to identify microsynteny between soybean and the model plants Arabidopsis thaliana and Medicago truncatula. This map-based approach to genome sampling provides a means of assaying soybean genome structure and organization.
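The comparison described in the abstract (percentage of significant BLAST hits per marker type, broken down by hit class) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `summarize_hits`, the E-value cutoff of 1e-5, the hit-class labels, and the toy records are all assumptions made for illustration, and the numbers are not results from the study.

```python
from collections import defaultdict

# Assumed significance cutoff; the abstract does not state the threshold used.
E_VALUE_CUTOFF = 1e-5


def summarize_hits(records, cutoff=E_VALUE_CUTOFF):
    """Tally significant BLAST hits per marker type.

    records: iterable of (marker_type, best_evalue, hit_class) tuples, where
    best_evalue is None when a BAC-end sequence had no hit at all.
    """
    totals = defaultdict(int)
    significant = defaultdict(int)
    class_counts = defaultdict(lambda: defaultdict(int))

    for marker_type, best_evalue, hit_class in records:
        totals[marker_type] += 1
        if best_evalue is not None and best_evalue <= cutoff:
            significant[marker_type] += 1
            class_counts[marker_type][hit_class] += 1

    return {
        marker: {
            "n_sequences": n,
            "pct_significant": 100.0 * significant[marker] / n,
            "class_counts": dict(class_counts[marker]),
        }
        for marker, n in totals.items()
    }


# Toy, made-up records purely to show the input shape.
toy_records = [
    ("SSR", 1e-30, "repetitive"),
    ("SSR", 1e-12, "repetitive"),
    ("SSR", 1e-8, "gene"),
    ("SSR", None, None),
    ("RFLP", 1e-20, "gene"),
    ("RFLP", 1e-9, "EST"),
    ("RFLP", 0.5, "gene"),
    ("RFLP", None, None),
]

summary = summarize_hits(toy_records)
for marker, stats in summary.items():
    print(marker, stats)
```

Keeping the per-class counts alongside the overall percentage makes it possible to see whether a higher hit rate for one marker type comes from repetitive sequences or from gene and EST matches, which is the distinction the abstract draws.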

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Arabidopsis / genetics
  • Chromosomes, Artificial, Bacterial*
  • Contig Mapping
  • Databases as Topic
  • Expressed Sequence Tags
  • Gene Library
  • Genetic Markers*
  • Genotype
  • Glycine max / genetics*
  • Medicago / genetics
  • Models, Genetic
  • Molecular Sequence Data
  • Polymorphism, Genetic*
  • Polymorphism, Restriction Fragment Length*
  • Sequence Analysis, DNA
  • Software

Substances

  • Genetic Markers

Associated data

  • GENBANK/AF180335