On the use of machine learning algorithms in forensic anthropology

Leg Med (Tokyo). 2020 Nov:47:101771. doi: 10.1016/j.legalmed.2020.101771. Epub 2020 Aug 6.

Abstract

This study examines the classification performance of several methods in skeletal sex and ancestry estimation. The statistical methods are binary logistic regression (BLR), multinomial and penalized multinomial logistic regression (MLR, pMLR), and linear discriminant analysis (LDA). The machine learning algorithms are naïve Bayes classification (NBC), decision trees (DT), random forest (RF), artificial neural networks (ANN), support vector machines (SVM) with linear, polynomial, or radial kernels, multivariate adaptive regression splines (MARS), and extreme gradient boosting (XGB). The datasets used to test these methods were obtained from a documented human skeletal collection, the Athens Collection, and from the Howells Craniometric data set. To implement the methods, an R package was written that searches for the optimum tuning parameters under cross-validation and performs sex/ancestry classification. Classification performance was found to vary significantly depending on the problem. Of the methods tested, LDA and the machine learning technique of linear SVM exhibit the best performance, with high prediction accuracy and relatively low bias in most of the tests. ANN and pMLR can generally be considered to give satisfactory predictions, whereas DT and, when metric traits are used, NBC are the worst of the classification methods examined. The possibility of applying the models developed via the machine learning algorithms to other assemblages without the use of a training sample is also discussed.
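The idea of searching for optimum tuning parameters under cross-validation can be illustrated with a minimal sketch. This is not the authors' R package, and the abstract does not describe its interface. Everything here is an assumption made for illustration: the synthetic two-group data, the function names (`make_data`, `knn_predict`, `cv_accuracy`, `tune`), the candidate grid, and the choice of a k-nearest-neighbours classifier, which is a simple stand-in and not one of the methods evaluated in the study.

```python
import random
from statistics import mean


def make_data(n_per_group=60, seed=0):
    """Synthetic two-group 'measurements' (illustrative only, not real skeletal data)."""
    rng = random.Random(seed)
    data = []
    for label, centre in (("F", (0.0, 0.0)), ("M", (2.0, 2.0))):
        for _ in range(n_per_group):
            x = (rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0))
            data.append((x, label))
    rng.shuffle(data)
    return data


def knn_predict(train, x, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(
        train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x))
    )[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    # Iterate over sorted labels so that ties are broken deterministically.
    return max(sorted(votes), key=votes.get)


def cv_accuracy(data, k, n_folds=5):
    """Mean held-out accuracy of k-NN over n_folds cross-validation folds."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    scores = []
    for i, test in enumerate(folds):
        # Train on every fold except the one currently held out.
        train = [item for j, fold in enumerate(folds) if j != i for item in fold]
        correct = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(correct / len(test))
    return mean(scores)


def tune(data, grid=(1, 3, 5, 7, 9)):
    """Return the tuning value with the best cross-validated accuracy, plus all scores."""
    results = {k: cv_accuracy(data, k) for k in grid}
    best_k = max(results, key=results.get)
    return best_k, results


data = make_data()
best_k, results = tune(data)
print("cross-validated accuracy per k:", results)
print("selected k:", best_k)
```

The same loop applies to any classifier with a tuning parameter: each candidate value is scored only on cases held out of training, and the value with the highest mean held-out accuracy is selected.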

Keywords: Ancestry estimation; Cranium; Forensic anthropology; Pelvis; Sex estimation; Statistical methods.

MeSH terms

  • Algorithms*
  • Body Remains
  • Discriminant Analysis
  • Female
  • Forensic Anthropology / methods
  • Humans
  • Machine Learning*
  • Male
  • Pelvis
  • Racial Groups / classification*
  • Sex Determination by Skeleton / methods
  • Skull
  • Support Vector Machine