Evaluation of Long-Read Sequencing Simulators to Assess Real-World Applications for Food Safety

Katrina L Counihan; Siddhartha Kanrar; Shannon Tilman; Andrew Gehring

doi:10.3390/foods13010016

Foods. 2023 Dec 19;13(1):16. doi: 10.3390/foods13010016.

Authors

Katrina L Counihan¹, Siddhartha Kanrar¹, Shannon Tilman¹, Andrew Gehring¹

Affiliation

¹ Eastern Regional Research Center, United States Department of Agriculture, Agricultural Research Service, Wyndmoor, PA 19038, USA.

Abstract

Shiga toxin-producing Escherichia coli (STEC) and Listeria monocytogenes are routinely responsible for severe foodborne illnesses in the United States. Current identification methods used by the U.S. Food Safety and Inspection Service require at least four days to identify STEC and six days to identify L. monocytogenes. Adopting long-read whole-genome sequencing for food safety testing could significantly reduce the time needed for identification, but method development costs are high. Therefore, the goal of this project was to use NanoSim-H software to simulate Oxford Nanopore sequencing reads in order to assess the feasibility of sequencing-based foodborne pathogen detection and to guide experimental design. Sequencing reads were simulated for STEC, L. monocytogenes, and a 1:1 combination of STEC and Bos taurus genomes. At least 2,500 simulated reads were needed to identify the seven genes of interest targeted in STEC, and at least 500 reads were needed to detect the gene targeted in L. monocytogenes. Genome coverage of 30x was estimated to require 21,521 reads for STEC and 11,802 reads for L. monocytogenes. Approximately 5-6% of reads simulated from both bacteria did not align with their respective reference genomes because of errors introduced into the simulated reads. For the 1:1 STEC and B. taurus genome mixture, all genes of interest were detected with 1,000,000 reads, but less than 1x coverage was obtained. The results suggested that sample enrichment would be necessary to detect foodborne pathogens with long-read sequencing, but that this approach would still take less time than current methods. Additionally, simulation data will be useful for reducing the time and expense associated with laboratory experimentation.

Keywords: Bos taurus; Listeria monocytogenes; Shiga toxin-producing Escherichia coli O157:H7; foodborne pathogens; virulence genes.
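
The 30x read estimates in the abstract relate read count, read length, and genome size. The sketch below illustrates that general relationship (coverage = reads x mean read length / genome size). It is not the authors' method. The function names, the `usable_fraction` parameter, and all numeric inputs are hypothetical placeholders, since the abstract does not report the genome sizes or read-length distribution used.

```python
import math


def reads_for_coverage(genome_size_bp, mean_read_length_bp, coverage,
                       usable_fraction=1.0):
    """Estimate how many reads are needed to reach a target mean coverage.

    usable_fraction is a hypothetical knob for the share of reads expected
    to align to the reference (e.g. 0.95 if about 5% fail to align).
    """
    if genome_size_bp <= 0 or mean_read_length_bp <= 0:
        raise ValueError("genome size and read length must be positive")
    if coverage < 0 or not 0 < usable_fraction <= 1:
        raise ValueError("coverage must be >= 0 and usable_fraction in (0, 1]")
    # Total bases needed, divided by the bases each usable read contributes.
    needed = coverage * genome_size_bp / (mean_read_length_bp * usable_fraction)
    return math.ceil(needed)


def expected_coverage(n_reads, mean_read_length_bp, genome_size_bp):
    """Mean coverage obtained from n_reads of a given mean length."""
    return n_reads * mean_read_length_bp / genome_size_bp


# Placeholder inputs, NOT values from the study:
# a 5 Mb bacterial genome sequenced with 8 kb mean read length.
print(reads_for_coverage(5_000_000, 8_000, 30))                        # 18750
print(reads_for_coverage(5_000_000, 8_000, 30, usable_fraction=0.95))  # 19737
```

Under these assumptions, a mixed sample dilutes coverage of the target genome in proportion to the share of reads that originate from it, which is consistent with the abstract's conclusion that enrichment would be needed.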
