Replacing the internal standard to estimate micropollutants using deep and machine learning

Water Res. 2021 Jan 1:188:116535. doi: 10.1016/j.watres.2020.116535. Epub 2020 Oct 19.

Abstract

As urbanization has spread worldwide, micropollutants have increasingly entered aquatic and ecological systems. Because these pollutants can harm human health and ecosystems, they need to be monitored continuously in the environment. Micropollutants are commonly quantified via target analysis using high-resolution mass spectrometry and a stable isotope labeled (SIL) standard. However, the high cost of this standard is a major obstacle to measuring micropollutants. This study addressed this problem by developing data-driven models, including deep learning (DL) and machine learning (ML) models, to estimate micropollutant concentrations without resorting to the SIL standard. We hypothesized that natural organic matter (NOM) could replace the internal standard if a specific mass spectrum (MS) subset containing NOM information correlated with the SIL standard peak. We therefore analyzed the MS to find subsets that could replace the SIL standard peak, and identified thirty-five alternative MS subsets to use as input data for the DL and ML models. We then trained four DL models (ResNet101, GoogLeNet, VGG16, and Inception v3) and three ML models (random forest (RF), support vector machine (SVM), and artificial neural network (ANN)). A total of 680 MS data were used for model training to estimate three micropollutants, namely Sulpiride, Metformin, and Benzotriazole. Among the DL models, ResNet101 performed best, with an average validation R2 and MSE of 0.84 and 0.26 ng/L, respectively, while RF was the best of the ML models, with R2 and MSE values of 0.69 and 0.58 ng/L. The trained models gave accurate training and validation results for the three micropollutant concentrations. Therefore, this study demonstrates that the suggested analysis has potential as a rapid and economical alternative for micropollutant measurement.
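The abstract compares models by validation R2 and MSE. As a minimal sketch of how these two metrics are conventionally computed, the Python below uses only the standard library. The function names `mse` and `r2` and the sample concentrations are hypothetical illustrations, not taken from the study, and the authors' actual evaluation code may differ.

```python
from statistics import fmean


def _check(observed, predicted):
    """Validate that both sequences are non-empty and the same length."""
    if len(observed) == 0 or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")


def mse(observed, predicted):
    """Mean squared error between observed and predicted values."""
    _check(observed, predicted)
    return fmean([(o - p) ** 2 for o, p in zip(observed, predicted)])


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    _check(observed, predicted)
    mean_obs = fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical concentrations (illustrative only, not data from the paper).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print(f"MSE = {mse(observed, predicted):.3f}")
print(f"R2  = {r2(observed, predicted):.3f}")
```

An R2 closer to 1 and an MSE closer to 0 indicate better agreement between estimated and reference concentrations, which is the sense in which ResNet101 (0.84, 0.26) outperforms RF (0.69, 0.58) in the reported results.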

Keywords: Deep learning; High Resolution Mass Spectrometry; Machine learning; Micropollutant.

MeSH terms

  • Humans
  • Isotopes
  • Machine Learning*
  • Neural Networks, Computer*
  • Reference Standards
  • Support Vector Machine

Substances

  • Isotopes