Explainable artificial intelligence in forensics: Realistic explanations for number of contributor predictions of DNA profiles

Forensic Sci Int Genet. 2022 Jan;56:102632. doi: 10.1016/j.fsigen.2021.102632. Epub 2021 Nov 21.

Abstract

Machine learning achieves good accuracy in determining the number of contributors (NOC) in short tandem repeat (STR) mixture DNA profiles. However, the models used so far are not understandable to users, as they only output a prediction without any reasoning for that conclusion. Therefore, we leverage techniques from the field of explainable artificial intelligence (XAI) to help users understand why specific predictions are made. Where previous attempts at explainability for NOC estimation have relied upon simpler, more understandable models that achieve lower accuracy, we use techniques that can be applied to any machine learning model. Our explanations incorporate SHAP values and counterfactual examples for each prediction into a single visualization. Existing methods for generating counterfactuals focus on uncorrelated features. This makes them inappropriate for the highly correlated features derived from STR data for NOC estimation, as these techniques simulate combinations of features that could not have resulted from an STR profile. For this reason, we have constructed a new counterfactual method, Realistic Counterfactuals (ReCo), which generates realistic counterfactual explanations for correlated data. We show that ReCo outperforms state-of-the-art methods on traditional metrics, as well as on a novel realism score. A user evaluation of the visualization shows positive opinions of end-users, which is ultimately the most appropriate metric for assessing explanations in real-world settings.
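The two explanation ingredients named above can be illustrated with a minimal, self-contained sketch. This is not the paper's ReCo algorithm or its NOC model; all names, the toy linear scorer, and the data are illustrative assumptions. It shows (a) exact Shapley-style attributions for a linear scorer and (b) the core idea of keeping counterfactuals "realistic" for correlated features by restricting candidates to observed instances rather than perturbing features independently.

```python
import numpy as np

# Hypothetical stand-in for an NOC classifier: a fixed linear scorer over two
# correlated "profile features" (illustrative only, not the paper's model).
W = np.array([1.5, -2.0])

def predict_noc(x):
    # Pretend: score > 0 -> "3 contributors", otherwise "2 contributors".
    return 3 if x @ W > 0 else 2

# Simulated observations with strongly correlated features, mimicking the
# correlation structure of features derived from an STR profile.
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 1))
X = np.hstack([z + 0.1 * rng.normal(size=(200, 1)),
               z + 0.1 * rng.normal(size=(200, 1))])

# (a) For a linear scorer, SHAP-style attributions reduce to the exact
# closed form w_i * (x_i - E[x_i]) per feature.
x = np.array([0.5, 0.6])
attributions = W * (x - X.mean(axis=0))

# (b) A realistic counterfactual: instead of freely flipping features (which
# can produce feature combinations no STR profile could yield), pick the
# nearest *observed* instance that receives the desired prediction.
def realistic_counterfactual(x, target, X_obs):
    candidates = [xi for xi in X_obs if predict_noc(xi) == target]
    return min(candidates, key=lambda xi: np.linalg.norm(xi - x))

cf = realistic_counterfactual(x, target=3, X_obs=X)
print(predict_noc(x), predict_noc(cf))  # original vs. counterfactual class
```

Restricting the search to observed instances is one simple way to guarantee that a counterfactual lies on the data manifold; it trades the minimal-change property of unconstrained methods for realism, which is the tension the abstract's realism score is designed to measure.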

Keywords: Counterfactual explanations; DNA mixtures; Explainable artificial intelligence; Machine learning; Number of contributors.

MeSH terms

  • Artificial Intelligence*
  • DNA / genetics
  • Forensic Medicine
  • Humans
  • Machine Learning*

Substances

  • DNA