Machine learning performance validation and training using a 'perfect' expert system

MethodsX. 2021 Aug 2:8:101477. doi: 10.1016/j.mex.2021.101477. eCollection 2021.

Abstract

A method is proposed for generating application-domain-agnostic data for training and evaluating machine learning systems. The method randomly generates an expert system network based on user-specified parameters. This expert system serves as a generic model of an unspecified phenomenon. The expert system is run to determine the ideal output for a set of random inputs, and these inputs and ideal outputs are then used to train and test a machine learning system. This allows a machine learning technology to be developed and tested without first collecting compatible test data, or before data collection as a proof-of-concept validation of system operations. It also allows systems to be tested without data noise, with known levels of noise, or with other perturbations, to facilitate analysis. It may also facilitate testing system security, studying adversarial attacks, and conducting other types of research into machine learning systems.

• Provides an application-domain-agnostic way to test machine learning technologies and facilitates the generalization of results.
• Allows technologies to be tested on data with different characteristics without having to locate datasets that have these characteristics.
• Uses a randomly generated network to represent non-specific phenomena, which can be used for training and testing machine learning techniques.
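The generate-run-label workflow described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the rule form (two input facts combined by a weighted sum into one output fact), the fixed number of evaluation passes, the Gaussian noise model, and all names (`Rule`, `generate_network`, `run_network`, `make_dataset`) are assumptions chosen only to make the idea concrete.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: out = weight_a * in_a + (1 - weight_a) * in_b."""
    in_a: int
    in_b: int
    out: int
    weight_a: float


def generate_network(num_facts, num_rules, seed):
    """Randomly build a rule network from user-specified sizes (assumed parameters)."""
    rng = random.Random(seed)
    rules = []
    for _ in range(num_rules):
        # Pick three distinct fact indices: two inputs and one output.
        in_a, in_b, out = rng.sample(range(num_facts), 3)
        rules.append(Rule(in_a, in_b, out, rng.random()))
    return rules


def run_network(rules, facts, passes=3):
    """Apply every rule in order for a fixed number of passes; return all fact values."""
    values = list(facts)
    for _ in range(passes):
        for r in rules:
            values[r.out] = (r.weight_a * values[r.in_a]
                             + (1.0 - r.weight_a) * values[r.in_b])
    return values


def make_dataset(rules, num_facts, output_fact, num_samples, seed, noise_sd=0.0):
    """Pair random inputs with the network's output, optionally with known noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(num_samples):
        inputs = [rng.random() for _ in range(num_facts)]
        label = run_network(rules, inputs)[output_fact]
        if noise_sd > 0:
            # Assumed noise model: Gaussian, clamped back into [0, 1].
            label = min(1.0, max(0.0, label + rng.gauss(0.0, noise_sd)))
        data.append((inputs, label))
    return data


if __name__ == "__main__":
    net = generate_network(num_facts=10, num_rules=8, seed=1)
    clean = make_dataset(net, 10, net[-1].out, num_samples=5, seed=2)
    noisy = make_dataset(net, 10, net[-1].out, num_samples=5, seed=2, noise_sd=0.1)
    print(len(clean), len(noisy))
```

Because the generator and the sampler are both seeded, the same network and dataset can be reproduced exactly, and a noise-free run can be compared against a run with a known noise level on identical inputs.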

Keywords: Knowledge engineering; Learning model; Machine learning; System performance evaluation.