YClon: Ultrafast clustering of B cell clones from high-throughput immunoglobulin repertoire sequencing data

J Immunol Methods. 2023 Dec:523:113576. doi: 10.1016/j.jim.2023.113576. Epub 2023 Oct 30.

Abstract

Motivation: Next-generation sequencing technologies have transformed our understanding of immunoglobulin (Ig) profiles in various immune states. Clonotyping, which groups Ig sequences into B cell clones, is crucial for investigating repertoire diversity and changes in antigen exposure. Despite its importance, there is no widely accepted method for clonotyping, and existing methods are computationally intensive for large sequencing datasets.

Results: To address this challenge, we introduce YClon, a fast and efficient approach for clonotyping Ig repertoire data. Like other methods, YClon uses hierarchical clustering to group Ig sequences into B cell clones, and it does so with high sensitivity and specificity. On the repertoires analyzed, YClon was 30 to 5000 times faster than other methods. It can also handle up to 2 million Ig sequences on a standard laptop computer. This enables in-depth analysis of large and numerous antibody repertoires.
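
For readers unfamiliar with clonotyping by hierarchical clustering, the general idea can be sketched as below. This is an illustration only and is not YClon's implementation: the abstract does not state the distance measure, linkage, or threshold, so the sketch assumes a common convention (bucket sequences by V gene, J gene, and CDR3 length, then apply single-linkage clustering on normalized Hamming distance between CDR3 sequences). The names `hamming_fraction`, `cluster_clones`, the record layout, and the default `threshold` are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def hamming_fraction(a, b):
    """Fraction of mismatched positions between two equal-length strings."""
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_clones(records, threshold=0.1):
    """Return one clone label per (v_gene, j_gene, cdr3) record.

    Assumed scheme (not taken from the paper): single-linkage
    agglomerative clustering cut at `threshold`, computed as the
    connected components of the graph linking sequences whose
    normalized Hamming distance is at most `threshold`.
    """
    # Step 1: bucket record indices by V gene, J gene, and CDR3 length,
    # so that only comparable sequences are ever compared.
    buckets = defaultdict(list)
    for i, (v_gene, j_gene, cdr3) in enumerate(records):
        buckets[(v_gene, j_gene, len(cdr3))].append(i)

    # Union-find structure: parent[i] points towards the cluster root.
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Step 2: within each bucket, merge every pair within the threshold.
    for members in buckets.values():
        for a, b in combinations(members, 2):
            if hamming_fraction(records[a][2], records[b][2]) <= threshold:
                parent[find(a)] = find(b)

    # Step 3: relabel cluster roots as consecutive integers,
    # in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(records))]


if __name__ == "__main__":
    example = [
        ("IGHV1-2", "IGHJ4", "CARDYWGQGT"),
        ("IGHV1-2", "IGHJ4", "CARDYWGQGS"),
        ("IGHV1-2", "IGHJ4", "CTTTTTTTTT"),
        ("IGHV3-23", "IGHJ4", "CARDYWGQGT"),
    ]
    print(cluster_clones(example, threshold=0.2))
```

The pairwise comparison inside each bucket is quadratic in bucket size, which is the kind of cost that makes clonotyping expensive on large repertoires and that motivates faster approaches.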

Availability and implementation: YClon is implemented in Python 3 and is freely available on GitHub.

Keywords: Agglomerative clustering; Antibody repertoire sequencing; B cell clonotyping method.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • B-Lymphocytes*
  • Clone Cells
  • Cluster Analysis
  • High-Throughput Nucleotide Sequencing / methods
  • Immunoglobulins* / genetics

Substances

  • Immunoglobulins