Machine learning approaches for estimation of prediction interval for the model output

Neural Netw. 2006 Mar;19(2):225-35. doi: 10.1016/j.neunet.2006.01.012. Epub 2006 Mar 10.

Abstract

A novel method for estimating prediction uncertainty using machine learning techniques is presented. Uncertainty is expressed as two quantiles of the underlying distribution of prediction errors, which together constitute the prediction interval. The idea is to partition the input space into zones, or clusters, with similar model errors using fuzzy c-means clustering. A prediction interval is constructed for each cluster from the empirical distribution of the errors of all instances belonging to that cluster, and is then propagated from the clusters to the individual examples according to their membership grades in each cluster. A regression model is then built on the in-sample data using the computed prediction limits as targets, and finally this model is applied to estimate the prediction intervals (limits) for out-of-sample data. The method was tested on artificial and real hydrologic data sets using various machine learning techniques. Preliminary results show that the method is superior to other methods for estimating the prediction interval. A new method for evaluating the performance of prediction interval estimation is also proposed.
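
The steps described in the abstract (cluster, build per-cluster limits, propagate by membership, regress) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The details the abstract does not specify are assumptions: the membership-weighted quantile, the plain membership-weighted average used for propagation, the fuzzifier m = 2, the 90% level (alpha = 0.1), the synthetic data, and the linear least-squares regressor, which stands in for whatever machine learning model is actually used. All function names are hypothetical.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, seed=0):
    """Basic fuzzy c-means; returns centers and memberships U of shape (n, c)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)            # each row sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U

def weighted_quantile(values, weights, q):
    """Quantile of the weighted empirical distribution (assumed weighting scheme)."""
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cdf = np.cumsum(w) / w.sum()
    idx = min(np.searchsorted(cdf, q), len(v) - 1)
    return v[idx]

def cluster_limits(errors, U, alpha=0.1):
    """Lower and upper error quantiles for each cluster."""
    k = U.shape[1]
    lower = np.array([weighted_quantile(errors, U[:, j], alpha / 2) for j in range(k)])
    upper = np.array([weighted_quantile(errors, U[:, j], 1 - alpha / 2) for j in range(k)])
    return lower, upper

def propagate(U, lower_c, upper_c):
    """Per-example limits as membership-weighted averages of cluster limits (assumption)."""
    return U @ lower_c, U @ upper_c

def fit_linear(X, y):
    """Least-squares linear model; a stand-in for the regression model."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

# Synthetic in-sample data: model errors whose spread grows with the input.
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(400, 1))
errors = rng.normal(0, 0.1 + X[:, 0])

centers, U = fuzzy_c_means(X, c=3)
lo_c, up_c = cluster_limits(errors, U, alpha=0.1)
lo_i, up_i = propagate(U, lo_c, up_c)

# Regress the propagated limits on the inputs, then apply out of sample.
coef_lo, coef_up = fit_linear(X, lo_i), fit_linear(X, up_i)
X_new = np.array([[0.1], [0.9]])
pl_lower = predict_linear(coef_lo, X_new)
pl_upper = predict_linear(coef_up, X_new)
print(pl_lower, pl_upper)
```

In this sketch the limits are quantiles of the model error, so an interval for the model output itself would be obtained by adding them to the model's point prediction for each new input.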

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Artificial Intelligence*
  • Cluster Analysis
  • Computer Simulation*
  • Ecosystem
  • Evaluation Studies as Topic
  • Fuzzy Logic
  • Neural Networks, Computer*
  • Nonlinear Dynamics
  • Predictive Value of Tests*
  • Reproducibility of Results
  • Time