Fusion Gene Detection Using Whole-Exome Sequencing Data in Cancer Patients

Front Genet. 2022 Feb 16:13:820493. doi: 10.3389/fgene.2022.820493. eCollection 2022.

Abstract

Several fusion genes are directly involved in the initiation and progression of cancers. Numerous bioinformatics tools have been developed to detect fusion events, but they are mainly based on RNA-seq data. The whole-exome sequencing (WES) represents a powerful technology that is widely used for disease-related DNA variant detection. In this study, we build a novel analysis pipeline called Fuseq-WES to detect fusion genes at DNA level based on the WES data. The same method applies also for targeted panel sequencing data. We assess the method to real datasets of acute myeloid leukemia (AML) and prostate cancer patients. The result shows that two of the main AML fusion genes discovered in RNA-seq data, PML-RARA and CBFB-MYH11, are detected in the WES data in 36 and 63% of the available samples, respectively. For the targeted deep-sequencing of prostate cancer patients, detection of the TMPRSS2-ERG fusion, which is the most frequent chimeric alteration in prostate cancer, is 91% concordant with a manually curated procedure based on four other methods. In summary, the overall results indicate that it is challenging to detect fusion genes in WES data with a standard coverage of ∼ 15-30x, where fusion candidates discovered in the RNA-seq data are often not detected in the WES data and vice versa. A subsampling study of the prostate data suggests that a coverage of at least 75x is necessary to achieve high accuracy.

Keywords: acute myeloid leukemia; discordant read; fusion gene; prostate cancer; split read; whole exome sequencing.