NCBI GEO: archive for gene expression and epigenomics data sets: 23-year update

Nucleic Acids Res. 2024 Jan 5;52(D1):D138-D144. doi: 10.1093/nar/gkad965.

Abstract

The Gene Expression Omnibus (GEO) is an international public repository that archives gene expression and epigenomics data sets generated by next-generation sequencing and microarray technologies. Data are typically submitted to GEO by researchers in compliance with widespread journal and funder mandates to make generated data publicly accessible. The resource handles raw data files, processed data files and descriptive metadata for over 200 000 studies and 6.5 million samples, all of which are indexed, searchable and downloadable. Additionally, GEO offers web-based tools that facilitate analysis and visualization of differential gene expression. This article presents the current status and recent advancements in GEO, including the generation of consistently computed gene expression count matrices for thousands of RNA-seq studies, and new interactive graphical plots in GEO2R that help users identify differentially expressed genes and assess data set quality. The GEO repository is built and maintained by the National Center for Biotechnology Information (NCBI), a division of the National Library of Medicine (NLM), and is publicly accessible at https://www.ncbi.nlm.nih.gov/geo/.
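Because GEO records are indexed and searchable, they can also be reached programmatically. The following is a minimal sketch, not part of the article: it assumes the NCBI E-utilities ESearch endpoint with the GEO DataSets database name `gds`, and the GEO accession viewer at `acc.cgi`. The function names, the example query and the accession `GSE12345` are hypothetical placeholders chosen for illustration. The code only builds URLs and sends no requests.

```python
from urllib.parse import urlencode

# Assumed endpoints (not given in the abstract): NCBI E-utilities ESearch
# and the GEO accession viewer.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
GEO_ACCESSION_VIEWER = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"


def build_geo_search_url(term: str, retmax: int = 20) -> str:
    """Build an ESearch URL that queries the GEO DataSets index (db=gds)."""
    params = {
        "db": "gds",        # GEO DataSets database in Entrez
        "term": term,       # free-text or fielded Entrez query
        "retmax": retmax,   # maximum number of record IDs to return
        "retmode": "json",  # request a JSON response
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def build_geo_accession_url(accession: str) -> str:
    """Build the URL of the GEO record page for an accession such as a GSE ID."""
    return f"{GEO_ACCESSION_VIEWER}?{urlencode({'acc': accession})}"


if __name__ == "__main__":
    # Hypothetical example query and placeholder accession.
    print(build_geo_search_url("RNA-seq AND Homo sapiens[Organism]", retmax=5))
    print(build_geo_accession_url("GSE12345"))
```

Fetching the first URL with any HTTP client should return a list of matching record IDs; NCBI's usage guidelines on request rates and API keys apply to real queries.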

MeSH terms

  • Databases, Genetic
  • Epigenomics*
  • Gene Expression Profiling*
  • Gene Expression*
  • Oligonucleotide Array Sequence Analysis