Epistemic uncertainty quantification in deep learning classification by the Delta method

Neural Netw. 2022 Jan;145:164-176. doi: 10.1016/j.neunet.2021.10.014. Epub 2021 Oct 23.

Abstract

The Delta method is a classical procedure for quantifying epistemic uncertainty in statistical models, but its direct application to deep neural networks is prevented by the large number of parameters P. We propose a low-cost approximation of the Delta method applicable to L2-regularized deep neural networks, based on the top K eigenpairs of the Fisher information matrix. We address efficient computation of full-rank approximate eigendecompositions in terms of the exact inverse Hessian, the inverse outer-products of gradients approximation, and the so-called Sandwich estimator. Moreover, we provide bounds on the approximation error for the uncertainty of the predictive class probabilities. We show that when the smallest computed eigenvalue of the Fisher information matrix is near the L2-regularization rate, the approximation error will be close to zero even when K ≪ P. A demonstration of the methodology is presented using a TensorFlow implementation, and we show that meaningful rankings of images based on predictive uncertainty can be obtained for two LeNet- and ResNet-based neural networks using the MNIST and CIFAR-10 datasets. Further, we observe that false positives have, on average, a higher predictive epistemic uncertainty than true positives. This suggests that the uncertainty measure carries supplementary information not captured by the classification alone.
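In outline, the construction described above proceeds as follows: with an L2-regularized Fisher estimate F + lam*I, one computes the top K eigenpairs and approximates the remaining P - K eigenvalues by the regularization rate lam, so that the Delta-method epistemic variance of a class probability p_c(x) is g^T F^{-1} g with g the gradient of p_c(x) with respect to the weights. The snippet below is a minimal Python/TensorFlow sketch of this idea, not the authors' implementation: it assumes a small Keras classifier `model` with softmax outputs and labeled data pairs `(xs, ys)`, uses the outer-products-of-gradients (OPG) Fisher estimate, omits scaling conventions (e.g., a 1/n factor on the parameter covariance), and the helper names `flat_grad`, `opg_fisher_topk`, and `delta_sigma` are hypothetical.

```python
# Minimal sketch of Delta-method predictive uncertainty with a rank-K
# approximation of the inverse of an L2-regularized OPG Fisher estimate.
import numpy as np
import tensorflow as tf

def flat_grad(model, x, y=None, class_idx=None):
    """Flattened gradient w.r.t. all trainable weights of either the
    per-example loss (if y is given) or one class probability
    (if class_idx is given). Assumes `model` outputs softmax probabilities."""
    with tf.GradientTape() as tape:
        p = model(x[None, ...], training=False)[0]
        out = (tf.keras.losses.sparse_categorical_crossentropy(y, p)
               if y is not None else p[class_idx])
    grads = tape.gradient(out, model.trainable_variables)
    return np.concatenate([tf.reshape(g, [-1]).numpy() for g in grads])

def opg_fisher_topk(model, xs, ys, lam, K):
    """Top-K eigenpairs of F + lam*I, where F = G^T G / n is the OPG Fisher
    built from the n x P matrix G of per-example loss gradients."""
    G = np.stack([flat_grad(model, x, y=y) for x, y in zip(xs, ys)])  # n x P
    n = G.shape[0]
    # Nonzero eigenvalues of G^T G / n equal those of the small n x n Gram
    # matrix G G^T / n, so the eigendecomposition stays cheap when n << P.
    w, U = np.linalg.eigh(G @ G.T / n)
    idx = np.argsort(w)[::-1][:K]                 # K largest eigenvalues
    lams = w[idx] + lam                           # regularized eigenvalues
    V = (G.T @ U[:, idx]) / np.sqrt(w[idx] * n)   # P x K eigenvectors
    return lams, V

def delta_sigma(model, x, c, lams, V, lam):
    """Delta-method std. dev. of the class-c probability, approximating
    (F + lam*I)^{-1} by V diag(1/lams) V^T + (1/lam) (I - V V^T), i.e.
    every uncomputed eigenvalue is replaced by the L2 rate lam."""
    g = flat_grad(model, x, class_idx=c)
    gv = V.T @ g
    var = np.sum(gv**2 / lams) + (g @ g - gv @ gv) / lam
    return np.sqrt(var)
```

Replacing the uncomputed eigenvalues by the L2 rate `lam` corresponds to the regime identified above: when the smallest computed eigenvalue of the Fisher information matrix is near the regularization rate, the rank-K approximation error of the predictive uncertainty is close to zero.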

Keywords: Deep learning; Fisher information; Hessian; Neural networks; Predictive epistemic uncertainty; Uncertainty quantification.

MeSH terms

  • Deep Learning*
  • Models, Statistical
  • Neural Networks, Computer
  • Probability
  • Uncertainty