Next-Generation Sequencing Data Analysis on Pool-Seq and Low-Coverage Retinoblastoma Data

Interdiscip Sci. 2020 Sep;12(3):302-310. doi: 10.1007/s12539-020-00374-8. Epub 2020 Jun 9.

Abstract

Next-generation sequencing (NGS) is related to massively parallel or deep deoxyribonucleic acid (DNA) sequencing technology which has revolutionized genomic researches in recent years. Although the cost of generating NGS data was decreased compared to the one at the time of emerging this technology, its cost might still be somewhat a problem. Hence, new strategies as pool-seq and low-coverage NGS data have been developed to overcome the cost problem. Despite decreasing cost, it is important to elucidate whether they are efficient in NGS studies. We applied a bioinformatics pipeline on pool-seq and low-coverage retinoblastoma data retrieved from only tumor data. Retinoblastoma is an eye malignancy in childhood that is initiated by RB1 mutation or MYCN amplification and can lead to the loss of vision of eye(s), and even sometimes life. We applied our pipeline on both retinoblastoma disease data and two other particular data to testify the validity and also for comparison purposes in the aspect of performance. High-confidence variant calls from Genome in a Bottle Consortium were used for fulfilling these purposes. We observed that our pipeline successfully called higher number of variants than a standard pipeline for all these three different data. Besides, the recall and F-score values were quite better in our pipeline as being noteworthy. We further presented our results on disease data in the aspects of the variants, variant types and disease-related genes. This study provides a guideline for performing NGS data analysis pipeline on pool-seq and low-coverage sequencing data in conjunction. To get more conclusive outcomes of these two strategies, we recommend using cancer data having higher mutation rates and larger pools.

Keywords: Low-coverage sequencing; NGS data analysis; Pool-seq; Retinoblastoma.

MeSH terms

  • Computational Biology
  • Data Analysis
  • Genomics
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Mutation / genetics
  • Retinoblastoma / genetics*
  • Sequence Analysis, DNA