Machine learning analysis of the T cell receptor repertoire identifies sequence features of self-reactivity

Cell Syst. 2023 Dec 20;14(12):1059-1073.e5. doi: 10.1016/j.cels.2023.11.004. Epub 2023 Dec 6.

Abstract

The T cell receptor (TCR) determines specificity and affinity for both foreign and self-peptides presented by the major histocompatibility complex (MHC). Although the strength of TCR interactions with self-pMHC impacts T cell function, it has been challenging to identify TCR sequence features that predict T cell fate. To discern patterns distinguishing TCRs from naive CD4+ T cells with low versus high self-reactivity, we used data from 42 mice to train a machine learning (ML) algorithm that identifies population-level differences between TCRβ sequence sets. This approach revealed that weakly self-reactive T cell populations were enriched for longer CDR3β regions and acidic amino acids. We tested our ML predictions of self-reactivity using retrogenic mice with fixed TCRβ sequences. Extrapolating our analyses to independent datasets, we predicted high self-reactivity for regulatory T cells and slightly reduced self-reactivity for T cells responding to chronic infections. Our analyses suggest a potential trade-off between TCR repertoire diversity and self-reactivity. A record of this paper's transparent peer review process is included in the supplemental information.
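The two sequence features named above, CDR3β length and acidic amino acid content, can be illustrated with a minimal sketch. This is not the authors' ML algorithm, which the abstract does not specify. It only computes per-set summary statistics that such a model might draw on. The function names (`cdr3_features`, `summarize`) and the two example sequences are hypothetical placeholders, not data from the study. Acidic residues are taken to be aspartate (D) and glutamate (E).

```python
from statistics import mean

# Acidic amino acids in one-letter code: aspartate (D) and glutamate (E).
ACIDIC = frozenset("DE")


def cdr3_features(seq: str) -> dict:
    """Return the length and acidic-residue fraction of one CDR3 sequence."""
    seq = seq.strip().upper()
    if not seq:
        raise ValueError("empty CDR3 sequence")
    acidic = sum(1 for aa in seq if aa in ACIDIC)
    return {"length": len(seq), "acidic_fraction": acidic / len(seq)}


def summarize(seqs) -> dict:
    """Average the per-sequence features over a set of CDR3 sequences."""
    feats = [cdr3_features(s) for s in seqs]
    return {
        "mean_length": mean(f["length"] for f in feats),
        "mean_acidic_fraction": mean(f["acidic_fraction"] for f in feats),
    }


if __name__ == "__main__":
    # Hypothetical placeholder sequences, for illustration only.
    example_set = ["CASSLGQNTEVFF", "CASSDEGDTQYF"]
    print(summarize(example_set))
```

Comparing `summarize` output between two sequence sets (for example, low versus high self-reactivity populations) gives a crude population-level contrast. A trained classifier would be needed to reproduce the kind of predictions described in the abstract.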

Keywords: CD4 T cells; CD5; CDR3 beta chain; T cell receptor; chronic infection; heterogeneity; machine learning; retrogenic mice; self-reactivity; thymic development.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Cell Membrane
  • Major Histocompatibility Complex
  • Mice
  • Peptides / chemistry
  • Receptors, Antigen, T-Cell* / genetics
  • T-Lymphocytes, Regulatory*

Substances

  • Receptors, Antigen, T-Cell
  • Peptides