Impact of sequencing depth and read length on single cell RNA sequencing data of T cells

Sci Rep. 2017 Oct 6;7(1):12781. doi: 10.1038/s41598-017-12989-x.

Abstract

Single cell RNA sequencing (scRNA-seq) offers great potential for measuring the gene expression profiles of heterogeneous cell populations. In immunology, scRNA-seq has enabled the characterisation of transcript sequence diversity of functionally relevant T cell subsets, and the identification of the full length T cell receptor (TCRαβ), which defines the specificity against cognate antigens. Several factors, e.g. RNA library capture, cell quality, and sequencing output, affect the quality of scRNA-seq data. We studied the effects of read length and sequencing depth on the quality of gene expression profiles, cell type identification, and TCRαβ reconstruction, utilising 1,305 single cells from 8 publicly available scRNA-seq datasets, and simulation-based analyses. Gene expression was characterised by an increased number of unique genes identified with short read lengths (<50 bp), but these featured higher technical variability compared to profiles from longer reads. Successful TCRαβ reconstruction was achieved for 6 datasets (81% to 100%) with at least 0.25 million paired-end (PE) reads of length >50 bp, while it failed for datasets with <30 bp reads. Sufficient read length and sequencing depth can control technical noise to enable accurate identification of TCRαβ and gene expression profiles from scRNA-seq data of T cells.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • CD8-Positive T-Lymphocytes / metabolism
  • Cluster Analysis
  • Databases as Topic
  • Gene Expression Profiling
  • Hepacivirus / immunology
  • Humans
  • Receptors, Antigen, T-Cell, alpha-beta / metabolism
  • Sequence Analysis, RNA / methods*
  • Single-Cell Analysis*
  • T-Lymphocytes / metabolism*

Substances

  • Receptors, Antigen, T-Cell, alpha-beta