Correspondence on NanoVar's performance outlined by Jiang T. et al. in "Long-read sequencing settings for efficient structural variation detection based on comprehensive evaluation"

Cheng Yong Tham; Touati Benoukraf

doi:10.1186/s12859-023-05484-w

BMC Bioinformatics. 2023 Sep 20;24(1):350. doi: 10.1186/s12859-023-05484-w.

Authors

Cheng Yong Tham¹, Touati Benoukraf²,³

Affiliations

¹ Cancer Science Institute of Singapore, National University of Singapore, Singapore, 117599, Singapore.
² Cancer Science Institute of Singapore, National University of Singapore, Singapore, 117599, Singapore. tbenoukraf@mun.ca.
³ Division of BioMedical Sciences, Faculty of Medicine, Memorial University of Newfoundland, St. John's, NL, A1B 3V6, Canada. tbenoukraf@mun.ca.

Abstract

A recent paper by Jiang et al. in BMC Bioinformatics presented guidelines on long-read sequencing settings for structural variation (SV) calling and benchmarked the performance of various SV calling tools, including NanoVar. In their simulation-based benchmarking, NanoVar performed poorly compared with other tools, mostly because of low SV recall rates. To investigate the causes of NanoVar's poor performance, we regenerated the simulation datasets (3× to 20×) as specified by Jiang et al. and benchmarked NanoVar and Sniffles. Our results did not reproduce the findings described by Jiang et al. In our analysis, NanoVar achieved F1 scores and recall rates more than three times those reported by Jiang et al. across all sequencing coverages, indicating that its performance had previously been underestimated. We also observed that NanoVar outperformed Sniffles in calling SVs with genotype concordance by more than 0.13 in F1 score, which is contrary to the trend reported by Jiang et al. In addition, we identified multiple detrimental errors encountered during the analysis that were not addressed by Jiang et al. We hope that this commentary clarifies NanoVar's validity as a long-read SV caller and provides assurance to its users and the scientific community.
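For readers less familiar with the metrics compared here, the following is a minimal sketch of how recall and F1 are conventionally derived from true-positive, false-positive, and false-negative SV call counts. The function name `sv_metrics` and the counts are hypothetical and purely illustrative; they are not taken from this commentary or from Jiang et al., and the exact matching criteria used in either benchmark are not specified in this abstract.

```python
def sv_metrics(tp: int, fp: int, fn: int) -> dict:
    """Return precision, recall, and F1 from SV call counts.

    tp: calls matching a truth-set SV (true positives)
    fp: calls with no matching truth-set SV (false positives)
    fn: truth-set SVs that were not called (false negatives)
    """
    # Guard against division by zero when a denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, chosen only to illustrate the arithmetic.
low_recall = sv_metrics(tp=30, fp=10, fn=70)
balanced = sv_metrics(tp=80, fp=20, fn=20)
```

Because F1 is a harmonic mean, a low recall pulls the score down sharply even when precision is high, which is consistent with the abstract's observation that low recall rates drove the poor F1 scores originally reported.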

Keywords: Benchmark; Long-read sequencing; NanoVar; SV calling; Structural variation.

MeSH terms

Benchmarking*
Computer Simulation
Genotype

Grants and funding

4334/Canada Research Chairs