Microbial source tracking in impaired watersheds using PhyloChip and machine-learning classification

Water Res. 2016 Nov 15:105:56-64. doi: 10.1016/j.watres.2016.08.035. Epub 2016 Aug 23.

Abstract

Sources of fecal indicator bacteria are difficult to identify in watersheds that are impacted by a variety of non-point sources. We developed a molecular source tracking test using the PhyloChip microarray that detects and distinguishes fecal bacteria from humans, birds, ruminants, horses, pigs and dogs with a single test. The multiplexed assay targets 9001 different 25-mer fragments of 16S rRNA genes that are common to the bacterial community of each source type. Both random forests and SourceTracker were tested as discrimination tools, with SourceTracker classification producing superior specificity and sensitivity for all source types. Validation with 12 different mammalian sources in mixtures found 100% correct identification of the dominant source and 84-100% specificity. The test was applied to identify sources of fecal indicator bacteria in the Russian River watershed in California. We found widespread contamination by human sources during the wet season proximal to settlements with antiquated septic infrastructure and during the dry season at beaches during intense recreational activity. The test was more sensitive than common fecal indicator tests that failed to identify potential risks at these sites. Conversely, upstream beaches and numerous creeks with less reliance on onsite wastewater treatment contained no fecal signal from humans or other animals; however, these waters did contain high counts of fecal indicator bacteria after rain. Microbial community analysis revealed that increased E. coli and enterococci at these locations did not co-occur with common fecal bacteria, but rather co-varied with copiotrophic bacteria that are common in freshwaters with high nutrient and carbon loading, suggesting runoff likely promoted the growth of environmental strains of E. coli and enterococci. These results indicate that machine-learning classification of PhyloChip microarray data can outperform conventional single marker tests that are used to assess health risks, and is an effective tool for distinguishing numerous fecal and environmental sources of pathogen indicators.
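
The abstract compares classifiers by per-source sensitivity and specificity. As a minimal sketch of how those two figures are conventionally computed for one source type in a one-vs-rest fashion, the Python below may help; it is not the authors' pipeline, the function name sensitivity_specificity is hypothetical, and the labels are invented for illustration rather than taken from the study.

```python
def sensitivity_specificity(true_labels, predicted_labels, source):
    """Return (sensitivity, specificity) for one source type, one-vs-rest.

    Hypothetical helper for illustration; not part of PhyloChip or SourceTracker.
    """
    tp = fn = fp = tn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == source and pred == source:
            tp += 1  # source present and correctly called
        elif truth == source:
            fn += 1  # source present but missed
        elif pred == source:
            fp += 1  # source absent but called anyway
        else:
            tn += 1  # source absent and not called
    # Sensitivity: fraction of true-source samples that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of other-source samples correctly not called.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Invented example labels (assumption: one dominant source per sample).
truth = ["human", "human", "bird", "dog", "ruminant", "bird"]
pred = ["human", "bird", "bird", "dog", "ruminant", "bird"]

for source in ("human", "bird"):
    print(source, sensitivity_specificity(truth, pred, source))
```

In this toy data one human sample is misassigned to birds, which lowers sensitivity for the human class and specificity for the bird class while leaving the other classes unaffected.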

Keywords: Fecal indicator bacteria; Machine learning; Microbial community analysis; Microbial source tracking; Pathogen TMDL; PhyloChip microarray.

MeSH terms

  • Animals
  • Dogs
  • Enterococcus / genetics
  • Environmental Monitoring
  • Escherichia coli / genetics*
  • Feces / microbiology
  • Horses
  • Humans
  • RNA, Ribosomal, 16S / genetics*
  • Rivers / microbiology
  • Swine
  • Water Microbiology

Substances

  • RNA, Ribosomal, 16S