Towards the integration, annotation and association of historical microarray experiments with RNA-seq

BMC Bioinformatics. 2013;14(Suppl 14):S4. doi: 10.1186/1471-2105-14-S14-S4. Epub 2013 Oct 9.

Abstract

Background: Transcriptome analysis by microarrays has produced important advances in biomedicine. For instance, in multiple myeloma (MM), microarray approaches led to the development of effective disease subtyping via cluster assignment and a 70-gene risk score. Both have improved the molecular understanding of MM and provide prognostic information for clinical management. Many researchers are now transitioning to next-generation sequencing (NGS) approaches, and RNA-seq in particular, because of its discovery-based nature and its improved sensitivity and dynamic range. Additionally, RNA-seq allows for the analysis of gene isoforms, splice variants, and novel gene fusions. Given the large volume of historical microarray data, there is now a need to associate and integrate microarray and RNA-seq data via advanced bioinformatic approaches.

Methods: Custom software was developed following a model-view-controller (MVC) approach to integrate Affymetrix probe set IDs and gene annotation information from a variety of sources. The tool employs an assortment of strategies to integrate, cross-reference, and associate microarray and RNA-seq datasets.

Results: Output from a variety of transcriptome reconstruction and quantitation tools (e.g., Cufflinks) can be directly integrated and/or associated with Affymetrix probe set data, as well as with the necessary gene identifiers and/or symbols from a variety of sources. Several strategies are employed to maximize annotation and cross-referencing. Custom gene sets (e.g., the MM 70-gene risk score, GEP-70) can be specified, and the tool can be incorporated directly into an RNA-seq pipeline.
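
As a rough illustration of the kind of association described above, the following minimal Python sketch joins RNA-seq quantitation records to microarray probe sets by gene symbol and optionally restricts the result to a custom gene set. It is not the authors' software: the function names (build_symbol_index, associate), the record layout, and all gene symbols, probe set IDs, and FPKM values are hypothetical placeholders chosen for the example.

```python
from collections import defaultdict


def build_symbol_index(probe_annotation):
    """Invert a probe-set-ID -> gene-symbol map into gene-symbol -> [probe set IDs]."""
    index = defaultdict(list)
    for probe_id, symbol in probe_annotation.items():
        # Normalize case so that "gene_b" and "GENE_B" are treated as one symbol.
        index[symbol.upper()].append(probe_id)
    return index


def associate(rnaseq_rows, probe_annotation, gene_set=None):
    """Attach matching probe set IDs to each RNA-seq record.

    rnaseq_rows: iterable of dicts with "gene" and "fpkm" keys (hypothetical layout).
    gene_set: optional collection of gene symbols; if given, other genes are skipped.
    """
    index = build_symbol_index(probe_annotation)
    wanted = {g.upper() for g in gene_set} if gene_set is not None else None
    merged = []
    for row in rnaseq_rows:
        symbol = row["gene"].upper()
        if wanted is not None and symbol not in wanted:
            continue
        merged.append({
            "gene": symbol,
            "fpkm": row["fpkm"],
            # An empty list means no probe set is annotated to this gene.
            "probe_sets": sorted(index.get(symbol, [])),
        })
    return merged


# Placeholder data only; these are not real genes, probe sets, or measurements.
rnaseq_rows = [
    {"gene": "GENE_A", "fpkm": 12.5},
    {"gene": "gene_b", "fpkm": 0.8},
    {"gene": "GENE_C", "fpkm": 3.1},
]
probe_annotation = {
    "probe_0001_at": "GENE_A",
    "probe_0002_s_at": "GENE_A",
    "probe_0003_at": "GENE_B",
}

result = associate(rnaseq_rows, probe_annotation, gene_set={"GENE_A", "GENE_C"})
for record in result:
    print(record)
```

In this sketch a gene may map to several probe sets or to none. A production tool would also need to reconcile alternative identifiers and symbol synonyms, which the example does not attempt.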

Conclusion: A novel bioinformatic approach that facilitates both the annotation of historical microarray data and its association with richer RNA-seq data is now assisting the study of MM cancer biology.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Cell Line
  • Gene Expression Profiling / methods*
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Molecular Sequence Annotation
  • RNA / chemistry*
  • RNA / genetics
  • Sequence Analysis, RNA
  • Software Design
  • Transcriptome

Substances

  • RNA