The need to reassess single-cell RNA sequencing datasets: the importance of biological sample processing

F1000Res. 2021 Aug 6:10:767. doi: 10.12688/f1000research.54864.2. eCollection 2021.

Abstract

Background: The advent of single-cell RNA sequencing (scRNAseq) and additional single-cell omics technologies has provided scientists with unprecedented tools to explore biology at cellular resolution. However, obtaining an adequate number of good-quality reads per cell and a reasonable number of cells within each population of interest is key to drawing relevant conclusions about the underlying biology of the dataset. For these reasons, scRNAseq studies are constantly increasing the number of cells analysed and the granularity of the resultant transcriptomic analyses. Methods: We aimed to identify previously described fibroblast subpopulations in healthy adult human skin by using the largest dataset published to date (528,253 sequenced cells) and an unsupervised population-matching algorithm. Results: Our reanalysis of this landmark resource demonstrates that a substantial proportion of cell transcriptomic signatures may be biased by cellular stress and response to hypoxic conditions. Conclusions: We postulate that careful design of experimental conditions is needed to avoid long processing times of biological samples. Additionally, the computational demands of large datasets might limit the extent of the analysis, possibly because of long processing times.
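
As a rough illustration of how a stress-related bias of the kind described in the Results might be screened for, the sketch below scores each cell for a gene signature and flags high-scoring cells. It is a minimal example under stated assumptions, not the method used in the study: the gene sets, the toy counts, the threshold, and the helper names (`log_normalize`, `signature_score`, `flag_cells`) are all hypothetical and chosen only for demonstration.

```python
import math

# Illustrative gene sets only (assumptions; not taken from the article).
STRESS_GENES = {"FOS", "JUN", "HSPA1A"}
HYPOXIA_GENES = {"VEGFA", "HIF1A"}


def log_normalize(counts, scale=10_000):
    """Scale raw counts of one cell to a fixed library size, then log1p."""
    total = sum(counts.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: math.log1p(c / total * scale) for gene, c in counts.items()}


def signature_score(counts, signature):
    """Mean normalized expression of signature genes minus the cell-wide mean.

    A simplified score; dedicated tools use matched control gene sets instead
    of the plain cell-wide mean used here.
    """
    norm = log_normalize(counts)
    in_signature = [v for gene, v in norm.items() if gene in signature]
    if not norm or not in_signature:
        return 0.0
    background = sum(norm.values()) / len(norm)
    return sum(in_signature) / len(in_signature) - background


def flag_cells(cells, signature, threshold=0.5):
    """Return IDs of cells whose score exceeds an (arbitrary) threshold."""
    return [
        cell_id
        for cell_id, counts in cells.items()
        if signature_score(counts, signature) > threshold
    ]


# Toy data invented for this example; real input would be a count matrix.
toy_cells = {
    "cell_A": {"FOS": 50, "JUN": 40, "HSPA1A": 30, "COL1A1": 5, "ACTB": 5},
    "cell_B": {"FOS": 1, "JUN": 0, "HSPA1A": 0, "COL1A1": 60, "ACTB": 60},
}

flagged = flag_cells(toy_cells, STRESS_GENES)
print(flagged)
```

With these toy counts only `cell_A` is flagged. The same functions can be called with `HYPOXIA_GENES` to screen for a hypoxia-like signature; in practice both the gene lists and the threshold would need to be justified for the tissue and protocol at hand.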

Keywords: Python; computational analysis; fibroblasts; reproducibility; single-cell RNA-seq; skin.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Gene Expression Profiling*
  • Humans
  • Sequence Analysis, RNA
  • Single-Cell Analysis*
  • Specimen Handling
  • Transcriptome