An integrated approach based on virtual data augmentation and deep neural networks modeling for VFA production prediction in anaerobic fermentation process

Water Res. 2020 Oct 1:184:116103. doi: 10.1016/j.watres.2020.116103. Epub 2020 Jun 30.

Abstract

Data-driven models are suitable for simulating biological wastewater treatment processes with complex intrinsic mechanisms. However, the raw data collected in the early stage of biological experiments are normally insufficient to train such models. In this study, an integrated modeling approach combining the random standard deviation sampling (RSDS) method with deep neural network (DNN) models was established to predict volatile fatty acid (VFA) production in the anaerobic fermentation process. The RSDS method, based on the mean values (x̄) and standard deviations (α) calculated from multiple experimental determinations, was first developed for virtual data augmentation. DNN models were then established to learn features from the virtual data and predict VFA production. When 20,000 virtual samples, each comprising five input variables of the anaerobic fermentation process, were used to train a DNN model with 16 hidden layers and 100 hidden neurons per layer, the best correlation coefficient (0.998) and the lowest mean absolute percentage error (3.28%) were achieved. This integrated approach can learn nonlinear information from virtual data generated by the RSDS method, and consequently broadens the applicability of DNN models for simulating biological wastewater treatment processes with small datasets.
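The augmentation and evaluation steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the exact RSDS sampling rule, so the sketch assumes each virtual value is drawn uniformly from the interval [x̄ − α, x̄ + α]. The function names (`rsds_augment`, `mape`, `pearson_r`) and the nested-list data layout are hypothetical, chosen only for illustration.

```python
import math
import random
import statistics


def rsds_augment(replicates, n_virtual, seed=0):
    """Generate virtual samples from replicate measurements.

    replicates: list of experimental conditions; each condition is a list of
    variables, and each variable is a list of repeated determinations.
    ASSUMPTION: each virtual value is drawn uniformly from
    [mean - sd, mean + sd]; the paper's actual sampling rule may differ.
    """
    rng = random.Random(seed)
    # Per-condition, per-variable (mean, sample standard deviation) pairs.
    stats = [
        [(statistics.mean(v), statistics.stdev(v)) for v in condition]
        for condition in replicates
    ]
    virtual = []
    for _ in range(n_virtual):
        condition = rng.choice(stats)  # pick one experimental condition
        virtual.append([rng.uniform(m - s, m + s) for m, s in condition])
    return virtual


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    errors = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(errors) / len(errors)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    mt, mp = statistics.mean(y_true), statistics.mean(y_pred)
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)
```

Under these assumptions, a call such as `rsds_augment(data, 20000)` would produce a virtual training set of the size reported in the abstract, and `mape` and `pearson_r` correspond to the two reported performance metrics. The DNN itself (16 hidden layers of 100 neurons) is left out of the sketch because the abstract does not specify activations, optimizer, or training settings.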

Keywords: Anaerobic fermentation; Datasets; Deep neural networks (DNNs); Random standard deviation sampling method (RSDS); Volatile fatty acid (VFA).

MeSH terms

  • Anaerobiosis
  • Fatty Acids, Volatile*
  • Fermentation
  • Neural Networks, Computer*
  • Wastewater

Substances

  • Fatty Acids, Volatile
  • Waste Water