rePROBE: Workflow for Revised Probe Assignment and Updated Probe-set Annotation in Microarrays

Genomics Proteomics Bioinformatics. 2021 Dec;19(6):1043-1049. doi: 10.1016/j.gpb.2020.06.007. Epub 2021 Feb 11.

Abstract

Commercial and customized microarrays are valuable tools for the analysis of holistic expression patterns, but require the integration of the latest genomic information. This study provides a comprehensive workflow implemented in an R package (rePROBE) to assign the entire probes and to annotate the probe sets based on up-to-date genomic and transcriptomic information. The rePROBE package can be applied to available gene expression microarray platforms and addresses both public and custom databases. The revised probe assignment and updated probe-set annotation are applied to commercial microarrays available for different livestock species, i.e., chicken (Gallus gallus; ChiGene-1_0-st: 443,579 probes and 18,530 probe sets), pig (Sus scrofa; PorGene-1_1-st: 592,005 probes and 25,779 probe sets), and cattle (Bos Taurus; BovGene-1_0-st: 530,717 probes and 24,759 probe sets), as well as available for human (Homo sapiens; HuGene-1_0-st) and mouse (Mus musculus; HT_MG-430_PM). Using current species-specific transcriptomic information (RefSeq, Ensembl, and partially non-redundant nucleotide sequences) and genomic information, the applied workflow reveals 297,574 probes (15,689 probe sets) for chicken, 384,715 probes (21,673 probe sets) for pig, 363,077 probes (21,238 probe sets) for cattle, 481,168 probes (23,495 probe sets) for human, and 324,942 probes (32,494 probe sets) for mouse. These are representative of 12,641, 15,758, 18,046, 20,167, and 16,335 unique genes that are both annotated and positioned for chicken, pig, cattle, human, and mouse, respectively. Additionally, the workflow collects information on the number of single nucleotide polymorphisms (SNPs) within respective targeted genomic regions and thus provides a detailed basis for comprehensive analyses such as expression quantitative trait locus (eQTL) studies to identify quantitative and functional traits. The rePROBE R package is freely available at https://github.com/friederhadlich/rePROBE.

Keywords: Genomic database; Mapping; Microarray; Probe assignment; Probe-set annotation.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Cattle / genetics
  • Databases, Genetic*
  • Gene Expression Profiling / veterinary
  • Genomics*
  • Humans
  • Mice
  • Oligonucleotide Array Sequence Analysis
  • Quantitative Trait Loci
  • Workflow