The European Nucleotide Archive in 2023

Nucleic Acids Res. 2024 Jan 5;52(D1):D92-D97. doi: 10.1093/nar/gkad1067.

Abstract

The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena) is maintained by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI). The ENA is one of the three members of the International Nucleotide Sequence Database Collaboration (INSDC). It serves the bioinformatics community worldwide via the submission, processing, archiving and dissemination of sequence data. The ENA supports data types ranging from raw reads, through alignments and assemblies to functional annotation. The data is enriched with contextual information relating to samples and experimental configurations. In this article, we describe recent progress and improvements to ENA services. In particular, we focus upon three areas of work in 2023: FAIRness of ENA data, pandemic preparedness and foundational technology. For FAIRness, we have introduced minimal requirements for spatiotemporal annotation, created a metadata-based classification system, incorporated third party metadata curations with archived records, and developed a new rapid visualisation platform, the ENA Notebooks. For foundational enhancements, we have improved the INSDC data exchange and synchronisation pipelines, and invested in site reliability engineering for ENA infrastructure. In order to support genomic surveillance efforts, we have continued to provide ENA services in support of SARS-CoV-2 data mobilisation and have adapted these for broader pathogen surveillance efforts.

MeSH terms

  • Computational Biology
  • Databases, Nucleic Acid
  • Europe
  • Genomics*
  • Internet
  • Nucleotides*
  • Reproducibility of Results

Substances

  • Nucleotides