R classes and methods for SNP array data

Robert B Scharpf; Ingo Ruczinski

doi:10.1007/978-1-60327-194-3_4

Methods Mol Biol. 2010:593:67-79. doi: 10.1007/978-1-60327-194-3_4.

Authors

Robert B Scharpf¹, Ingo Ruczinski

Affiliation

¹ Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.

Abstract

The Bioconductor project is an "open source and open development software project for the analysis and comprehension of genomic data" (1), primarily based on the R programming language. Infrastructure packages, such as Biobase, are maintained by Bioconductor core developers and serve several key roles for the broader community of Bioconductor software developers and users. In particular, Biobase introduces an S4 class, the eSet, for high-dimensional assay data. Encapsulating the assay data together with the meta-data on the samples, features, and experiment in the eSet class definition ensures that the relevant sample and feature meta-data propagate throughout an analysis. Extending the eSet class promotes code reuse through inheritance, supports interoperability with other R packages, and is less error-prone. Recently proposed class definitions for high-throughput SNP arrays extend the eSet class. This chapter highlights the advantages of adopting and extending Biobase class definitions through a working example of one implementation of classes for the analysis of high-throughput SNP arrays.
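
The encapsulation idea described above can be illustrated with a small, language-neutral sketch. The Python below is not Biobase's API and not the chapter's implementation: the class names `AssaySet` and `SnpCallSet`, the `subset` method, and the `calls` assay name are all hypothetical, chosen only to show how keeping assay data and meta-data in one validated container lets a subclass inherit consistent subsetting.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AssaySet:
    """Toy analogue of an eSet-style container (hypothetical, not Biobase's API)."""
    assays: Dict[str, List[List[float]]]  # assay name -> features x samples matrix
    feature_ids: List[str]                # one identifier per row (feature)
    sample_meta: List[dict]               # one record per column (sample)

    def __post_init__(self):
        # Validity check: every assay matrix must agree with the meta-data dimensions.
        for name, mat in self.assays.items():
            if len(mat) != len(self.feature_ids):
                raise ValueError(f"{name}: row count differs from feature_ids")
            if any(len(row) != len(self.sample_meta) for row in mat):
                raise ValueError(f"{name}: column count differs from sample_meta")

    def subset(self, rows, cols):
        # Subset assays and meta-data in one step so they cannot drift apart.
        return type(self)(
            assays={k: [[m[i][j] for j in cols] for i in rows]
                    for k, m in self.assays.items()},
            feature_ids=[self.feature_ids[i] for i in rows],
            sample_meta=[self.sample_meta[j] for j in cols],
        )


@dataclass
class SnpCallSet(AssaySet):
    """Hypothetical subclass: requires a 'calls' assay, inherits everything else."""

    def __post_init__(self):
        super().__post_init__()
        if "calls" not in self.assays:
            raise ValueError("SnpCallSet requires a 'calls' assay")


# Two SNPs (rows) genotyped in three samples (columns); values are made up.
snps = SnpCallSet(
    assays={"calls": [[0, 1, 2], [1, 1, 0]]},
    feature_ids=["snp_a", "snp_b"],
    sample_meta=[{"id": "s1"}, {"id": "s2"}, {"id": "s3"}],
)
small = snps.subset(rows=[1], cols=[0, 2])
```

Here `small` is still a `SnpCallSet`, and its feature and sample meta-data were subset along with the genotype calls. The subclass adds only its own validity rule and reuses the parent's subsetting logic, which is the kind of reuse through inheritance the abstract attributes to extending the eSet class.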

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Chromosome Aberrations
Chromosomes, Human, Pair 1 / genetics
Computer Simulation
Humans
Oligonucleotide Array Sequence Analysis / methods*
Polymorphism, Single Nucleotide / genetics*
Programming Languages*
