Neural Network with Optimal Neuron Activation Functions Based on Additive Gaussian Process Regression

J Phys Chem A. 2023 Sep 21;127(37):7823-7835. doi: 10.1021/acs.jpca.3c02949. Epub 2023 Sep 12.

Abstract

Feed-forward neural networks (NNs) are a staple machine learning method widely used in many areas of science and technology, including physical chemistry, computational chemistry, and materials informatics. While even a single-hidden-layer NN is a universal approximator, its expressive power is limited by the use of simple neuron activation functions (such as sigmoid functions) that are typically the same for all neurons. More flexible neuron activation functions would allow the use of fewer neurons and layers, thereby saving computational cost and improving expressive power. We show that additive Gaussian process regression (GPR) can be used to construct optimal neuron activation functions that are individual to each neuron. An approach is also introduced that avoids nonlinear fitting of neural network parameters by defining them with rules. The resulting method combines the robustness of linear regression with the higher expressive power of an NN. We demonstrate the approach by fitting the potential energy surfaces of the water molecule and formaldehyde. Without requiring any nonlinear optimization, the additive-GPR-based approach outperforms a conventional NN in the high-accuracy regime, where a conventional NN suffers more from overfitting.
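The core idea can be illustrated with a minimal sketch. In an additive GPR model, the kernel is a sum of one-dimensional kernels, one per scalar projection of the input; each additive component then plays the role of a learned, neuron-specific activation function, and the only fitting step is a linear solve. The sketch below is illustrative, not the authors' implementation: it assumes an RBF kernel, random rule-defined hidden weights, and a toy additive target standing in for a potential energy surface.

```python
import numpy as np

def rbf(a, b, ell=1.0):
    # 1-D squared-exponential kernel between scalar projections
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

rng = np.random.default_rng(0)

# Toy 2-D data; the target is additive in two hidden-neuron projections,
# standing in for a smooth potential energy surface
X = rng.uniform(-1, 1, size=(200, 2))

# Hidden-layer weights fixed by a simple rule (here: random directions),
# so no nonlinear optimization of NN parameters is needed
n_neurons = 8
W = rng.normal(size=(n_neurons, 2))
Z = X @ W.T  # (n_samples, n_neurons) scalar pre-activations

y = np.tanh(Z[:, 0]) + np.sin(Z[:, 1])  # illustrative target

# Additive GPR: the total kernel is a sum of 1-D kernels, one per neuron;
# training reduces to a single linear solve (with a small jitter term)
K = sum(rbf(Z[:, n], Z[:, n]) for n in range(n_neurons))
alpha = np.linalg.solve(K + 1e-6 * np.eye(len(X)), y)

def predict(Xnew):
    # Each neuron's learned "activation" is the 1-D GPR component
    # f_n(z) = k(z, z_train) @ alpha; the prediction sums them
    Znew = Xnew @ W.T
    Ksum = sum(rbf(Znew[:, n], Z[:, n]) for n in range(n_neurons))
    return Ksum @ alpha
```

Because the hidden weights are fixed by rule and the kernel coefficients come from a linear system, the model inherits the robustness of linear regression while each neuron still carries a flexible, data-driven activation shape.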