A comparison of clustering models for inference of T cell receptor antigen specificity

Dan Hudson; Alex Lubbock; Mark Basham; Hashem Koohy

doi:10.1016/j.immuno.2024.100033

Immunoinformatics (Amst). 2024 Mar;13:100033. doi: 10.1016/j.immuno.2024.100033.

Authors

Dan Hudson¹˒², Alex Lubbock², Mark Basham², Hashem Koohy¹˒³˒⁴

Affiliations

¹ MRC Human Immunology Unit, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK.
² The Rosalind Franklin Institute, Didcot, UK.
³ Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK.
⁴ Alan Turing Fellow in Health and Medicine, UK.

Abstract

The vast potential sequence diversity of TCRs and their ligands has presented an historic barrier to computational prediction of TCR epitope specificity, a holy grail of quantitative immunology. One common approach is to cluster sequences together, on the assumption that similar receptors bind similar epitopes. Here, we provide the first independent evaluation of widely used clustering algorithms for TCR specificity inference, observing some variability in predictive performance between models, and marked differences in scalability. Despite these differences, we find that different algorithms produce clusters with high degrees of similarity for receptors recognising the same epitope. Our analysis strengthens the case for use of clustering models to identify signals of common specificity from large repertoires, whilst highlighting scope for improvement of complex models over simple comparators.
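
To make the clustering assumption concrete, the following is a minimal sketch of a simple sequence-similarity comparator. It is not one of the algorithms evaluated in the paper. It assumes that receptors are represented only by their CDR3 amino acid strings, that two sequences are linked when they have equal length and differ at no more than `max_dist` positions (Hamming distance), and that clusters are the connected components of the resulting graph. The function names, the threshold, and the toy sequences are all hypothetical and chosen for illustration.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def cluster_cdr3(seqs, max_dist=1):
    """Group sequences into connected components.

    Two sequences are linked when they have the same length and their
    Hamming distance is at most max_dist (an assumed threshold).
    """
    # Union-find over the unique sequences.
    parent = {s: s for s in seqs}

    def find(s):
        # Follow parent pointers to the root, compressing the path.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    # All-against-all comparison: quadratic in the number of sequences.
    for a, b in combinations(parent, 2):
        if len(a) == len(b) and hamming(a, b) <= max_dist:
            parent[find(a)] = find(b)

    clusters = {}
    for s in parent:
        clusters.setdefault(find(s), []).append(s)
    return sorted(sorted(c) for c in clusters.values())


# Toy input: illustrative strings, not data from the study.
toy = [
    "CASSLGQAYEQYF",
    "CASSLGQGYEQYF",
    "CASSLAQGYEQYF",
    "CASSPGTEAFF",
    "CASSIRSSYEQYF",
]
for cluster in cluster_cdr3(toy):
    print(cluster)
```

In this toy input the three sequences beginning `CASSL` end up in one cluster because each is one substitution away from at least one other member, while the remaining two stay as singletons. The pairwise loop compares every sequence with every other, so its cost grows quadratically with repertoire size, which is one reason scalability can differ so much between clustering approaches.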

Keywords: Clustering models; Deorphanizing TCRs; T cell antigen specificity; T cell receptor repertoire analysis.

Grants and funding

MC_UU_12010/3/MRC_/Medical Research Council/United Kingdom