Improving screening systems of autism using data sampling

Technol Health Care. 2021;29(5):897-909. doi: 10.3233/THC-202538.

Abstract

Objectives: Autism Spectrum Disorder (ASD) is a complex range of neurodevelopmental conditions that impact individuals' social behaviour and communication skills. However, ASD data often contain far more controls than cases. This poses a serious challenge when creating classification models, because the derived models tend to favour controls when classifying individuals. This problem is known as class imbalance, and it may reduce the performance of classification models derived by machine learning (ML) techniques because individuals with ASD may remain undetected.
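As a minimal illustration of why class imbalance matters for screening, the sketch below scores a trivial model that labels everyone as a control. The class counts are hypothetical and chosen only for the arithmetic; they are not taken from the study's dataset, and the function names are illustrative.

```python
# Hypothetical class counts for illustration only (not the study's data).
n_controls, n_cases = 900, 100

y_true = [0] * n_controls + [1] * n_cases  # 1 = ASD case, 0 = control
y_pred = [0] * len(y_true)                 # trivial model: always predict "control"


def accuracy(y_true, y_pred):
    """Fraction of individuals classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def sensitivity(y_true, y_pred):
    """Fraction of true cases that the model detects (recall on class 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0


print(accuracy(y_true, y_pred))     # 0.9
print(sensitivity(y_true, y_pred))  # 0.0
```

The model looks accurate overall yet detects no cases, which is why the abstract evaluates sensitivity and specificity rather than accuracy alone.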

Methods: ML may help with this distressing disorder by improving outcome quality and by speeding up access to early diagnosis and subsequent treatment. A screening dataset of over 1100 instances was used for an extensive quantitative analysis. We measured the effect of class imbalance on autism screening performance by applying different data resampling techniques with an ML classifier and evaluating the resulting models with respect to sensitivity, specificity, and F1-measure. The aim was to determine which resampling methods work well in balancing autism screening data.
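The abstract does not name the specific resampling methods evaluated. As a sketch under that caveat, the code below shows plain random oversampling, one common technique, together with the three metrics named above. The function names `random_oversample` and `screening_metrics` and the toy data are assumptions made for illustration, not the authors' implementation.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest one (illustrative helper)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def screening_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity, and F1 computed from the confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if tp else 0.0,
    }


# Toy data: 8 controls (0) and 2 cases (1).
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))  # both classes now have 8 rows

print(screening_metrics([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1]))
# {'sensitivity': 0.5, 'specificity': 0.75, 'f1': 0.5}
```

In practice, resampling is usually applied only to the training portion of the data, so that the metrics are computed on data with the original class distribution.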

Results: The results reveal that data resampling, and especially oversampling, improves the results derived by the considered ML classifier. More importantly, models derived by the Naive Bayes classifier showed superior sensitivity and specificity when oversampling methods were used to pre-process the autism data considered.

Conclusion: The reported results encourage further improvement of the design and implementation of ASD screening systems using intelligent technology.

Keywords: Artificial intelligence; autism screening; class imbalance; classification; data resampling; machine learning.

MeSH terms

  • Autism Spectrum Disorder* / diagnosis
  • Autistic Disorder* / diagnosis
  • Bayes Theorem
  • Humans
  • Machine Learning
  • Mass Screening