Protocol for Classification Single-Cell PBMC Types from Pathological Samples Using Supervised Machine Learning

Methods Mol Biol. 2023:2673:53-67. doi: 10.1007/978-1-0716-3239-0_4.

Abstract

Peripheral blood mononuclear cells (PBMC) are a mixed population of blood cells composed of five cell types. PBMC are widely used in the study of the immune system, infectious diseases, cancer, and vaccine development. Single-cell transcriptomics (SCT) allows cells from biological samples to be labeled by cell type according to their gene expression patterns. Classifying cells into cell types and states is essential for single-cell analyses, especially for classifying diseases and assessing therapeutic interventions, and for many secondary analyses. Most cell type classification from SCT data uses unsupervised clustering or a combination of unsupervised and supervised methods, including manual correction. In this chapter, we describe a protocol that uses supervised machine learning (ML) methods with SCT data to classify PBMC cell types in samples representing pathological states. The protocol has three parts: (1) data preprocessing, (2) labeling of reference PBMC SCT datasets and training supervised ML models, and (3) labeling new PBMC datasets from disease samples. It enables building classification models that are highly accurate and efficient. Our example focuses on 10x Genomics technology, but the protocol applies to datasets from other SCT platforms.
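
The three parts of the protocol can be illustrated with a minimal sketch in plain Python. This is not the chapter's implementation. It assumes library-size normalization followed by a log transform for preprocessing, and it substitutes a simple nearest-centroid classifier for whichever supervised ML models the protocol actually trains. The function names, the three marker genes, and all counts are hypothetical toy values chosen only to make the example runnable.

```python
import math
from collections import defaultdict

def preprocess(counts, scale=10_000):
    """Part 1 (assumed): library-size normalize one cell, then apply log1p."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [math.log1p(c / total * scale) for c in counts]

def train_centroids(cells, labels):
    """Part 2 (stand-in model): average the reference cells of each label."""
    sums = {}
    n_cells = defaultdict(int)
    for vec, lab in zip(cells, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        n_cells[lab] += 1
    return {lab: [s / n_cells[lab] for s in vec_sum]
            for lab, vec_sum in sums.items()}

def predict(model, vec):
    """Part 3: assign the label whose centroid is closest (squared Euclidean)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, vec))
    return min(model, key=lambda lab: sq_dist(model[lab]))

# Toy reference counts (made up). Columns: CD3E, CD19, CD14.
reference_counts = [
    [50, 1, 2], [40, 1, 3],   # labeled as T cells
    [1, 45, 2], [2, 60, 1],   # labeled as B cells
    [3, 1, 70], [1, 2, 55],   # labeled as monocytes
]
reference_labels = ["T cell", "T cell", "B cell", "B cell",
                    "Monocyte", "Monocyte"]

# Train on the preprocessed, labeled reference cells.
model = train_centroids([preprocess(c) for c in reference_counts],
                        reference_labels)

# Toy "new sample" cells (also made up), preprocessed the same way.
new_counts = [[30, 2, 1], [1, 3, 80]]
predicted = [predict(model, preprocess(c)) for c in new_counts]
print(predicted)
```

The key design point the sketch shows is that new cells must pass through the same preprocessing as the reference cells before the trained model is applied to them.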

Keywords: Cell type classification; Disease; Peripheral blood mononuclear cells; Protocol; Single-cell transcriptomics; Supervised machine learning.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, N.I.H., Extramural

MeSH terms

  • Gene Expression Profiling / methods
  • Genomics
  • Humans
  • Leukocytes, Mononuclear*
  • Neoplasms*
  • Supervised Machine Learning