Parallel Multistage Wide Neural Network

IEEE Trans Neural Netw Learn Syst. 2023 Aug;34(8):4019-4032. doi: 10.1109/TNNLS.2021.3120331. Epub 2023 Aug 4.

Abstract

Deep learning networks have achieved great success in many areas, such as large-scale image processing. They usually demand substantial computing resources and time, and they process easy and hard samples in the same way, which is inefficient. Another drawback is that the network generally needs to be retrained to learn new incoming data. Efforts have been made to reduce the computing resources and realize incremental learning by adjusting architectures, such as scalable effort classifiers, multi-grained cascade forest (gcForest), conditional deep learning (CDL), tree CNN, decision tree structure with knowledge transfer (ERDK), and forest of decision trees with radial basis function (RBF) networks and knowledge transfer (FDRK). In this article, a parallel multistage wide neural network (PMWNN) is presented. It is composed of multiple stages that classify different parts of the data. First, a wide radial basis function (WRBF) network is designed to learn features efficiently in the wide direction. It can work on both vector and image instances and can be trained in one epoch using subsampling and least squares (LS). Second, successive stages of WRBF networks are combined to make up the PMWNN. Each stage focuses on the samples misclassified by the previous stage. The network can stop growing at an early stage, and a stage can be added incrementally when new training data are acquired. Finally, the stages of the PMWNN can be tested in parallel, thus speeding up the testing process. In summary, the proposed PMWNN has three advantages: 1) optimized computing resources; 2) incremental learning; and 3) parallel testing with stages.
Experimental results on the MNIST data, several large hyperspectral remote sensing datasets, and other data from different application areas, including many image and nonimage datasets, show that the WRBF and PMWNN work well on both image and nonimage data and achieve very competitive accuracy compared with learning models such as stacked autoencoders, deep belief nets, support vector machine (SVM), multilayer perceptron (MLP), LeNet-5, RBF network, and the recently proposed CDL, broad learning, gcForest, ERDK, and FDRK.
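
As an illustration only, the multistage idea described in the abstract can be sketched in Python with NumPy. The abstract does not specify the WRBF architecture, so the sketch below rests on several assumptions that are not taken from the article: Gaussian kernels centered on randomly sampled training points (a simple stand-in for the subsampling step), a single shared width `gamma`, one-hot targets fitted by least squares, and prediction by summing the scores of all stages. All function names are hypothetical.

```python
import numpy as np

def rbf_features(X, centers, gamma):
    # Gaussian response of every sample to every center (assumed kernel form).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def fit_stage(X, y, n_classes, n_centers, gamma, rng):
    # Assumption: centers are a random subsample of the training points.
    idx = rng.choice(len(X), size=min(n_centers, len(X)), replace=False)
    centers = X[idx]
    H = rbf_features(X, centers, gamma)
    T = np.eye(n_classes)[y]                   # one-hot targets
    W, *_ = np.linalg.lstsq(H, T, rcond=None)  # one-pass least-squares fit
    return centers, W

def stage_scores(stage, X, gamma):
    centers, W = stage
    return rbf_features(X, centers, gamma) @ W

def fit_stages(X, y, n_classes, max_stages=3, n_centers=20, gamma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    stages = []
    for _ in range(max_stages):
        stage = fit_stage(X, y, n_classes, n_centers, gamma, rng)
        stages.append(stage)
        wrong = stage_scores(stage, X, gamma).argmax(axis=1) != y
        if not wrong.any():
            break                              # stop growing early
        X, y = X[wrong], y[wrong]              # next stage sees only the errors
    return stages

def predict(stages, X, gamma=1.0):
    # Stages are independent, so their scores could be computed concurrently.
    # Assumption: stage outputs are combined by summation.
    total = sum(stage_scores(s, X, gamma) for s in stages)
    return total.argmax(axis=1)

# Toy usage on two well-separated 2-D clusters (synthetic data).
data_rng = np.random.default_rng(1)
X = np.vstack([data_rng.normal(0.0, 0.5, (50, 2)),
               data_rng.normal(4.0, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
stages = fit_stages(X, y, n_classes=2)
pred = predict(stages, X)
print(len(stages), (pred == y).mean())
```

Because each stage depends only on its own centers and weights, a new stage could be appended when new training data arrive without refitting the earlier ones, which is the incremental property the abstract refers to. How the article actually combines the stage outputs is not stated in the abstract, so the summation used here should be read as a placeholder.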