Generative models of T-cell receptor sequences

Phys Rev E. 2020 Jun;101(6-1):062414. doi: 10.1103/PhysRevE.101.062414.

Abstract

T-cell receptors (TCRs) are key proteins of the adaptive immune system, generated randomly in each individual, whose diversity underlies our ability to recognize infections and malignancies. Modeling the distribution of TCR sequences is of key importance for immunology and medical applications. Here, we compare two inference methods trained on high-throughput sequencing data: a knowledge-guided approach, which accounts for the details of sequence generation, supplemented by a physics-inspired model of selection; and a knowledge-free variational autoencoder based on deep artificial neural networks. We show that the knowledge-guided model outperforms the deep network approach at predicting TCR probabilities, while being more interpretable and having a lower computational cost.
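
To make the idea of a generation model supplemented by a selection model more concrete, the following is a minimal toy sketch in Python. It is not the authors' implementation: the sequences, the generation probabilities, the energy values, and the function names are all invented for illustration, and the Boltzmann-like form of the selection factor is an assumption about what "physics-inspired" could look like in the simplest case.

```python
import math

# Hypothetical generation probabilities for four made-up sequences
# (illustrative numbers only, not taken from any data set).
p_gen = {"CASSL": 0.4, "CASSF": 0.3, "CATSL": 0.2, "CATSF": 0.1}

# Hypothetical selection energies keyed by (position, amino acid).
# Positive energy disfavors a feature, negative energy favors it.
energies = {(2, "T"): 0.5, (4, "F"): -0.3}


def selection_factor(seq):
    """Return exp(-E), where E sums the energies of the features present in seq."""
    total = sum(energies.get((i, aa), 0.0) for i, aa in enumerate(seq))
    return math.exp(-total)


def post_selection(gen_probs):
    """Reweight generation probabilities by the selection factor and renormalize."""
    weights = {s: p * selection_factor(s) for s, p in gen_probs.items()}
    z = sum(weights.values())  # normalization constant
    return {s: w / z for s, w in weights.items()}


p_post = post_selection(p_gen)
for seq in p_gen:
    print(f"{seq}: generated {p_gen[seq]:.3f}, after selection {p_post[seq]:.3f}")
```

In this toy setting, sequences carrying a favored feature gain probability after selection and those carrying a disfavored feature lose it, while the reweighted distribution still sums to one.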

MeSH terms

  • Amino Acid Sequence
  • Deep Learning
  • Ligands
  • Models, Biological*
  • Receptors, Antigen, T-Cell / chemistry*
  • Receptors, Antigen, T-Cell / metabolism*

Substances

  • Ligands
  • Receptors, Antigen, T-Cell