Fast multiclonal clusterization of V(D)J recombinations from high-throughput sequencing

BMC Genomics. 2014 May 28;15(1):409. doi: 10.1186/1471-2164-15-409.

Abstract

Background: V(D)J recombinations in lymphocytes are essential for immunological diversity. They are also useful markers of pathologies. In leukemia, they are used to quantify the minimal residual disease during patient follow-up. However, the breadth of lymphocyte diversity is not fully understood.

Results: We propose new algorithms that process high-throughput sequencing (HTS) data to extract unnamed V(D)J junctions and gather them into clones for quantification. This analysis is based on a seed heuristic and is fast and scalable because no alignment with germline database sequences is performed in the first phase. The algorithms were applied to TR γ HTS data from a patient with acute lymphoblastic leukemia, and to data simulating hypermutations. Our methods identified the main clone, as well as additional clones that were not identified with standard protocols.
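
As a rough illustration of a seed-based first phase, the following Python sketch labels each k-mer of a read as V or J by lookup in a small index, locates the switch from V to J, and groups reads by the window of sequence around that point. No alignment is computed. The gene sequences, the values of K and W, and all function names are hypothetical and chosen for illustration; this is a toy under those assumptions, not the Vidjil implementation.

```python
from collections import Counter

K = 4   # seed (k-mer) length; toy value, an assumption
W = 10  # length of the window kept around the junction; toy value


def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(v_genes, j_genes, k=K):
    """Map each germline k-mer to 'V' or 'J'; '?' marks k-mers seen in both."""
    index = {}
    for label, genes in (("V", v_genes), ("J", j_genes)):
        for gene in genes:
            for km in kmers(gene, k):
                if km in index and index[km] != label:
                    index[km] = "?"  # ambiguous seed, ignored later
                else:
                    index[km] = label
    return index


def junction_window(read, index, k=K, w=W):
    """Return the w-length window centred between the last V seed and the
    first J seed that follows it, or None if no such pair is found."""
    labels = [index.get(km) for km in kmers(read, k)]
    v_pos = [i for i, lab in enumerate(labels) if lab == "V"]
    if not v_pos:
        return None
    last_v = max(v_pos)
    j_after = [i for i, lab in enumerate(labels) if lab == "J" and i > last_v]
    if not j_after:
        return None
    first_j = min(j_after)
    # Midpoint between the end of the last V seed and the start of the J seed.
    center = (last_v + k + first_j) // 2
    start = center - w // 2
    if start < 0 or start + w > len(read):
        return None  # read too short to hold a full window
    return read[start:start + w]


def cluster_reads(reads, index):
    """Count reads per window; reads with an identical window form one clone."""
    windows = (junction_window(r, index) for r in reads)
    return Counter(wd for wd in windows if wd is not None)


if __name__ == "__main__":
    # Hypothetical germline segments and reads, for illustration only.
    V, J = "ACGTACGGTTCA", "GGATCCTTAGCA"
    index = build_index([V], [J])
    reads = [V + "CCC" + J, V[2:] + "CCC" + J[:-2], V + "TG" + J, V]
    for window, count in cluster_reads(reads, index).most_common():
        print(window, count)
```

In this toy input, the first two reads carry the same junction but are trimmed differently at their ends, so they fall into the same clone; the third read has a different insertion and forms its own clone, and the last read has no J seed and is discarded.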

Conclusions: The proposed algorithms provide new insight into the analysis of high-throughput sequencing data for leukemia, and into the quantitative assessment of any immunological profile. The methods described here are implemented in a C++ open-source program called Vidjil.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Neoplasm, Residual / diagnosis
  • Precursor Cell Lymphoblastic Leukemia-Lymphoma / diagnosis*
  • Precursor Cell Lymphoblastic Leukemia-Lymphoma / genetics
  • Sequence Analysis, DNA / methods*
  • Software
  • V(D)J Recombination*