Reconstruction of full antibody sequences in NGS datasets and accurate VL:VH coupling by cluster coordinate matching of non-overlapping reads

Comput Struct Biotechnol J. 2022 May 31:20:2723-2727. doi: 10.1016/j.csbj.2022.05.054. eCollection 2022.

Abstract

Next-generation sequencing (NGS) is an indispensable tool in antibody discovery projects. However, limits on NGS read length make it difficult to reconstruct full antibody sequences from sequencing runs, especially when all six CDRs are randomized. To overcome this limitation, we took advantage of Illumina's cluster mapping capabilities to pair non-overlapping reads and reconstruct full Fab sequences with accurate VL:VH pairings. The method relies on in silico cluster coordinate information rather than on extensive in vitro manipulation, which makes the protocol easy to deploy and less prone to PCR-derived errors. The approach maintains the throughput necessary for antibody discovery campaigns while preserving a high degree of fidelity, which benefits not only phage-display and synthetic library-based discovery methods, but also the NGS-driven analysis of naïve and immune libraries.
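The idea of coupling reads by cluster coordinates can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes that the VL and VH reads are available as two FASTQ streams whose headers follow the standard Illumina layout (`@instrument:run:flowcell:lane:tile:x:y`, followed by a space and the read descriptor), and that two reads belong to the same cluster when all of those fields match. The function names (`cluster_key`, `read_fastq`, `pair_by_cluster`) and the example records are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple

# (instrument, run, flowcell, lane, tile, x, y): identifies one cluster on the flow cell
ClusterKey = Tuple[str, str, str, int, int, int, int]


def cluster_key(header: str) -> ClusterKey:
    """Extract the cluster identity from an Illumina-style FASTQ header."""
    name = header[1:] if header.startswith("@") else header
    name = name.split()[0]  # drop the read descriptor, e.g. "1:N:0:1"
    instrument, run, flowcell, lane, tile, x, y = name.split(":")[:7]
    return (instrument, run, flowcell, int(lane), int(tile), int(x), int(y))


def read_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) for each four-line FASTQ record."""
    it = iter(lines)
    for header in it:
        header = header.strip()
        if not header:
            continue  # tolerate blank lines between records
        seq = next(it).strip()
        next(it)  # '+' separator line
        next(it)  # quality line (ignored in this sketch)
        yield header, seq


def pair_by_cluster(
    vl_lines: Iterable[str], vh_lines: Iterable[str]
) -> Dict[ClusterKey, Tuple[str, str]]:
    """Pair VL and VH reads that originate from the same cluster coordinates."""
    vl_by_cluster = {cluster_key(h): s for h, s in read_fastq(vl_lines)}
    pairs: Dict[ClusterKey, Tuple[str, str]] = {}
    for header, vh_seq in read_fastq(vh_lines):
        key = cluster_key(header)
        if key in vl_by_cluster:  # keep only clusters seen in both streams
            pairs[key] = (vl_by_cluster[key], vh_seq)
    return pairs


# Hypothetical example records (made-up instrument ID and sequences).
VL_FASTQ = """@M00001:7:000000000-AAAAA:1:1101:15589:1332 1:N:0:1
GACATCCAGATG
+
IIIIIIIIIIII
@M00001:7:000000000-AAAAA:1:1101:16001:1400 1:N:0:1
GAAATTGTGTTG
+
IIIIIIIIIIII
""".splitlines()

VH_FASTQ = """@M00001:7:000000000-AAAAA:1:1101:15589:1332 2:N:0:1
GAGGTGCAGCTG
+
IIIIIIIIIIII
@M00001:7:000000000-AAAAA:1:1101:17000:1500 2:N:0:1
CAGGTGCAGCTG
+
IIIIIIIIIIII
""".splitlines()

pairs = pair_by_cluster(VL_FASTQ, VH_FASTQ)
for key, (vl_seq, vh_seq) in pairs.items():
    print(key, vl_seq, vh_seq)
```

In this toy input only one cluster appears in both streams, so only one VL:VH pair is reported; reads without a partner at the same coordinates are dropped. A real workflow would also need quality filtering and CDR annotation, which are outside the scope of this sketch.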

Keywords: CDR; Diversity; Fab; Next-generation Sequencing; Phage-display; Randomization; Synthetic Libraries.