Sex estimation: a comparison of techniques based on binary logistic, probit and cumulative probit regression, linear and quadratic discriminant analysis, neural networks, and naïve Bayes classification using ordinal variables

Int J Legal Med. 2020 May;134(3):1213-1225. doi: 10.1007/s00414-019-02148-4. Epub 2019 Aug 23.

Abstract

This study examines the performance of seven classification methods in skeletal sex estimation: binary logistic (BLR), probit (PR), and cumulative probit (CPR) regression; linear (LDA) and quadratic (QDA) discriminant analysis; artificial neural networks (ANN); and naïve Bayes classification (NBC). The methods were tested on sexually dimorphic cranial and pelvic traits recorded in a modern documented collection, the Athens Collection. To implement them, an R package was written that performs cross-validated (CV) sex classification and returns the discriminant function for each method. A simple algorithm that combines two discriminant functions is also proposed. Overall, the differences in classification performance among the seven methods are small. However, LDA is simpler and more flexible than CPR, QDA, and ANN, and has a small but clear advantage over BLR, NBC, and PR. Consequently, LDA may be preferred in skeletal sex estimation. Finally, it is striking that combining pelvic and cranial traits via their discriminant functions, determined by either BLR or LDA, removes practically all population specificity and yields much better predictions than the individual functions: prediction accuracy rises above 97%.
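
As an illustration of what cross-validated classification with a linear discriminant function can look like, the following is a minimal sketch in Python, not the R package described above. Everything in it is an assumption made for illustration: the data are synthetic ordinal scores on a 1 to 5 scale, the two features (a hypothetical cranial score and a hypothetical pelvic score) are invented, and leave-one-out is used as one possible form of cross-validation. Treating the ordinal scores as numeric values is likewise a simplification of this sketch.

```python
import math
import random


def fit_lda(X, y):
    """Fit a two-class LDA on two features; return (weights, intercept).

    A case is assigned to class 1 when w[0]*x0 + w[1]*x1 + b > 0.
    """
    groups = {c: [x for x, t in zip(X, y) if t == c] for c in (0, 1)}
    means = {c: [sum(r[j] for r in rows) / len(rows) for j in range(2)]
             for c, rows in groups.items()}

    # Pooled within-class covariance matrix (2 x 2).
    S = [[0.0, 0.0], [0.0, 0.0]]
    for c in (0, 1):
        for x in groups[c]:
            d = [x[0] - means[c][0], x[1] - means[c][1]]
            for i in range(2):
                for j in range(2):
                    S[i][j] += d[i] * d[j]
    dof = len(X) - 2
    S = [[v / dof for v in row] for row in S]

    # Closed-form inverse of the 2 x 2 covariance matrix.
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    inv = [[S[1][1] / det, -S[0][1] / det],
           [-S[1][0] / det, S[0][0] / det]]

    diff = [means[1][0] - means[0][0], means[1][1] - means[0][1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(means[0][0] + means[1][0]) / 2, (means[0][1] + means[1][1]) / 2]
    log_prior_ratio = math.log(len(groups[1]) / len(groups[0]))
    b = -(w[0] * mid[0] + w[1] * mid[1]) + log_prior_ratio
    return w, b


def loocv_accuracy(X, y):
    """Leave-one-out CV: refit without case i, then classify case i."""
    hits = 0
    for i in range(len(X)):
        w, b = fit_lda(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        score = w[0] * X[i][0] + w[1] * X[i][1] + b
        hits += int((score > 0) == (y[i] == 1))
    return hits / len(X)


def ordinal_score(mu, rng):
    """Synthetic ordinal trait score, clamped to the 1..5 scale."""
    return max(1, min(5, round(rng.gauss(mu, 1.0))))


# Synthetic data (assumption): class 0 scores centre near 2.2, class 1 near 3.8.
rng = random.Random(0)
X, y = [], []
for label, mu in ((0, 2.2), (1, 3.8)):
    for _ in range(60):
        X.append([ordinal_score(mu, rng), ordinal_score(mu, rng)])
        y.append(label)

accuracy = loocv_accuracy(X, y)
w, b = fit_lda(X, y)  # discriminant function fitted on the full sample
print(f"LOOCV accuracy: {accuracy:.3f}")
print(f"discriminant: {w[0]:.3f}*cranial + {w[1]:.3f}*pelvic + {b:.3f}")
```

The accuracy printed here reflects only the synthetic data and says nothing about the results reported in the abstract.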

Keywords: Cranium; Forensic anthropology; Pelvis; Sex estimation; Statistical methods.

Publication types

  • Comparative Study

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Discriminant Analysis
  • Female
  • Humans
  • Logistic Models
  • Machine Learning
  • Male
  • Neural Networks, Computer*
  • Pelvis / anatomy & histology
  • Sex Determination by Skeleton / methods*
  • Skull / anatomy & histology
  • Statistics as Topic / methods*