Guidelines for reproducible analysis of adaptive immune receptor repertoire sequencing data

Brief Bioinform. 2024 Mar 27;25(3):bbae221. doi: 10.1093/bib/bbae221.

Abstract

Enhancing the reproducibility and comprehension of adaptive immune receptor repertoire sequencing (AIRR-seq) data analysis is critical for scientific progress. This study presents guidelines for reproducible AIRR-seq data analysis, and a collection of ready-to-use pipelines with comprehensive documentation. To this end, ten common pipelines were implemented using ViaFoundry, a user-friendly interface for pipeline management and automation. The pipelines are accompanied by versioned containers, documentation and archiving capabilities. The automation of pre-processing analysis steps and the ability to modify pipeline parameters according to specific research needs are emphasized. AIRR-seq data analysis is highly sensitive to varying parameters and setups; using the guidelines presented here, the ability to reproduce previously published results is demonstrated. This work promotes transparency, reproducibility, and collaboration in AIRR-seq data analysis, serving as a model for handling and documenting bioinformatics pipelines in other research domains.

Keywords: AIRR-seq; FAIR; annotation; pipelines; preprocessing; reproducibility.

MeSH terms

  • Adaptive Immunity / genetics
  • Computational Biology* / methods
  • Guidelines as Topic
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Receptors, Immunologic / genetics
  • Reproducibility of Results
  • Software*