Measuring, visualizing, and diagnosing reference bias with biastools

Genome Biol. 2024 Apr 19;25(1):101. doi: 10.1186/s13059-024-03240-8.

Abstract

Many bioinformatics methods seek to reduce reference bias, but no existing method measures it comprehensively. Biastools analyzes and categorizes instances of reference bias. It works in three scenarios: when the donor's variants are known and reads are simulated; when the donor's variants are known and reads are real; and when the donor's variants are unknown and reads are real. Using biastools, we observe that more inclusive graph genomes result in fewer biased sites. We find that end-to-end alignment reduces bias at indels relative to local alignment. Finally, we use biastools to characterize how telomere-to-telomere (T2T) references reduce large-scale bias.
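As a rough illustration of what measuring bias at a single site can mean, the sketch below computes the fraction of reads supporting the reference allele at a heterozygous site, where roughly 0.5 is expected without bias. This is a generic allelic-balance calculation and not necessarily how biastools defines or categorizes bias; the function names, the read counts, and the 0.1 tolerance are all assumptions made for the example.

```python
def ref_fraction(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads at a site that support the reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover the site")
    return ref_reads / total


def classify_site(ref_reads: int, alt_reads: int, tolerance: float = 0.1) -> str:
    """Label a heterozygous site by how far its reference fraction is from 0.5.

    The tolerance is a hypothetical cutoff chosen for illustration only.
    """
    frac = ref_fraction(ref_reads, alt_reads)
    if frac > 0.5 + tolerance:
        return "ref-biased"
    if frac < 0.5 - tolerance:
        return "alt-biased"
    return "balanced"
```

For instance, a heterozygous site covered by 30 reference-supporting and 10 alternate-supporting reads has a reference fraction of 0.75 and would be labeled "ref-biased" under these assumed settings.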

Keywords: Pangenomics; Reference bias; Sequence alignment.

MeSH terms

  • Bias
  • Computational Biology
  • Genome*
  • Genomics* / methods
  • High-Throughput Nucleotide Sequencing / methods
  • INDEL Mutation
  • Sequence Analysis, DNA / methods
  • Software