lncDIFF: a novel quasi-likelihood method for differential expression analysis of non-coding RNA

BMC Genomics. 2019 Jul 2;20(1):539. doi: 10.1186/s12864-019-5926-4.

Abstract

Background: Long non-coding RNA (lncRNA) expression data are increasingly used to find diagnostic and prognostic biomarkers in cancer studies. Existing differential analysis tools for RNA sequencing do not effectively accommodate low-abundance genes, which are common among lncRNAs.

Results: We investigated the statistical distribution of normalized counts for low-expression genes in lncRNAs and mRNAs, and propose a new tool, lncDIFF, that uses the underlying distribution pattern to detect differentially expressed (DE) lncRNAs. lncDIFF adopts a generalized linear model with a zero-inflated exponential quasi-likelihood to estimate the group effect on normalized counts, and employs the likelihood ratio test to detect differentially expressed genes. The proposed method and tool are applicable to data processed with standard RNA-Seq preprocessing and normalization pipelines. In simulations, lncDIFF detected DE genes with more power and a lower false discovery rate than DESeq2, edgeR, limma, zinbwave, DEsingle, and ShrinkBayes, regardless of the data pattern. In the analysis of a head and neck squamous cell carcinoma dataset, lncDIFF also appeared to have higher sensitivity in identifying novel lncRNA genes with relatively large fold change and prognostic value.
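
The idea of a zero-inflated exponential model combined with a likelihood ratio test can be illustrated with a minimal sketch. This is not the lncDIFF implementation: it assumes a plain likelihood rather than the quasi-likelihood used by the tool, a two-group comparison with no other covariates, and a zero proportion shared by both groups so that only the exponential mean differs. The function names `zie_loglik` and `zie_lrt` are hypothetical.

```python
import math


def _xlogy(x, y):
    """Return x * log(y), treating 0 * log(0) as 0."""
    return 0.0 if x == 0 else x * math.log(y)


def zie_loglik(values, pi, mu):
    """Log-likelihood of a zero-inflated exponential model.

    Assumed form: P(y = 0) = pi, and for y > 0 the density is
    (1 - pi) * (1 / mu) * exp(-y / mu), where mu is the mean of
    the positive part.
    """
    positives = [v for v in values if v > 0]
    n_pos = len(positives)
    n_zero = len(values) - n_pos
    zero_part = _xlogy(n_zero, pi) + _xlogy(n_pos, 1.0 - pi)
    exp_part = -n_pos * math.log(mu) - sum(positives) / mu if n_pos else 0.0
    return zero_part + exp_part


def zie_lrt(group_a, group_b):
    """Likelihood ratio test for a difference in exponential mean.

    Null: both groups share one mean. Alternative: one mean per group.
    The zero proportion is shared under both hypotheses (an assumption
    of this sketch), so the test has one degree of freedom.
    """
    pos_a = [v for v in group_a if v > 0]
    pos_b = [v for v in group_b if v > 0]
    if not pos_a or not pos_b:
        raise ValueError("each group needs at least one positive value")

    pooled = list(group_a) + list(group_b)
    # Maximum likelihood estimates: zero fraction and positive-part means.
    pi_hat = 1.0 - (len(pos_a) + len(pos_b)) / len(pooled)
    mu_a = sum(pos_a) / len(pos_a)
    mu_b = sum(pos_b) / len(pos_b)
    mu_0 = (sum(pos_a) + sum(pos_b)) / (len(pos_a) + len(pos_b))

    ll_null = zie_loglik(pooled, pi_hat, mu_0)
    ll_alt = zie_loglik(group_a, pi_hat, mu_a) + zie_loglik(group_b, pi_hat, mu_b)

    stat = max(0.0, 2.0 * (ll_alt - ll_null))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Made-up normalized counts for one gene in two groups.
    normal = [0.0, 1.0, 1.0, 1.0, 1.0]
    tumor = [0.0, 10.0, 10.0, 10.0, 10.0]
    print(zie_lrt(normal, tumor))
```

In a genome-wide analysis, a test like this would be run once per gene and the resulting p-values adjusted for multiple testing before calling DE genes.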

Conclusions: lncDIFF is a powerful differential analysis tool for low-abundance non-coding RNA expression data. The method is compatible with various existing RNA-Seq quantification and normalization tools. lncDIFF is implemented in an R package available at https://github.com/qianli10000/lncDIFF .

Keywords: Differential analysis; Head and neck squamous cell carcinomas; Quasi-likelihood; lncRNA.

MeSH terms

  • Area Under Curve
  • Computational Biology / methods
  • Computational Biology / statistics & numerical data*
  • Gene Expression Regulation, Neoplastic
  • Head and Neck Neoplasms / genetics
  • Humans
  • Likelihood Functions
  • Linear Models
  • Models, Genetic
  • RNA, Long Noncoding / genetics*
  • Software*
  • Squamous Cell Carcinoma of Head and Neck / genetics

Substances

  • RNA, Long Noncoding