A guide to measuring expert performance in forensic pattern matching

Samuel G Robson; Rachel A Searston; Matthew B Thompson; Jason M Tangen

doi:10.3758/s13428-024-02354-y

Behav Res Methods. 2024 Mar 14. doi: 10.3758/s13428-024-02354-y. Online ahead of print.

Authors

Samuel G Robson¹ ², Rachel A Searston³, Matthew B Thompson⁴ ⁵, Jason M Tangen⁶

Affiliations

¹ School of Psychology, The University of Queensland, St Lucia, QLD, Australia. sam.robson@unsw.edu.au.
² School of Psychology, The University of New South Wales, Kensington, NSW, Australia. sam.robson@unsw.edu.au.
³ School of Psychology, The University of Adelaide, Adelaide, SA, Australia.
⁴ School of Psychology, Murdoch University, Murdoch, WA, Australia.
⁵ Centre for Biosecurity and One Health, Harry Butler Institute, Murdoch University, Murdoch, WA, Australia.
⁶ School of Psychology, The University of Queensland, St Lucia, QLD, Australia.

PMID: 38485882
DOI: 10.3758/s13428-024-02354-y

Abstract

Decisions in forensic science are often binary. A firearms expert must decide whether a bullet was fired from a particular gun or not. A face comparison expert must decide whether a photograph matches a suspect or not. A fingerprint examiner must decide whether a crime scene fingerprint belongs to a suspect or not. Researchers who study these decisions have therefore quantified expert performance using measurement models derived largely from signal detection theory. Here we demonstrate that the design and measurement choices researchers make can have a dramatic effect on the conclusions drawn about the performance of forensic examiners. We introduce several performance models - proportion correct, diagnosticity ratio, and parametric and non-parametric signal detection measures - and apply them to forensic decisions. We use data from expert and novice fingerprint comparison decisions along with a resampling method to demonstrate how experimental results can change as a function of the task, case materials, and measurement model chosen. We also graphically show how response bias, prevalence, inconclusive responses, floor and ceiling effects, case sampling, and number of trials might affect one's interpretation of expert performance in forensics. Finally, we discuss several considerations for experimental and diagnostic accuracy studies: (1) include an equal number of same-source and different-source trials; (2) record inconclusive responses separately from forced choices; (3) include a control comparison group; (4) counterbalance or randomly sample trials for each participant; and (5) present as many trials to participants as is practical.
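As a rough illustration of the four families of measures named in the abstract, the following Python sketch computes each from a 2x2 table of forced-choice decisions. The function name, the log-linear correction, the choice of A' as the non-parametric index, and the definition of the diagnosticity ratio as hit rate over false-alarm rate are assumptions made for illustration; they are not taken from the paper, which may define these measures differently.

```python
from statistics import NormalDist


def performance_measures(hits, misses, false_alarms, correct_rejections):
    """Summarise a 2x2 table of same-source / different-source decisions.

    hits: "same source" responses on same-source trials
    misses: "different source" responses on same-source trials
    false_alarms: "same source" responses on different-source trials
    correct_rejections: "different source" responses on different-source trials
    """
    n_same = hits + misses
    n_diff = false_alarms + correct_rejections

    # Proportion correct uses the raw counts and depends on prevalence.
    proportion_correct = (hits + correct_rejections) / (n_same + n_diff)

    # Assumption: log-linear correction (add 0.5 to each cell) so that
    # rates of exactly 0 or 1 do not produce infinite z-scores.
    hit_rate = (hits + 0.5) / (n_same + 1)
    fa_rate = (false_alarms + 0.5) / (n_diff + 1)

    # Assumption: diagnosticity ratio taken as hit rate / false-alarm rate,
    # computed on the corrected rates to avoid division by zero.
    diagnosticity_ratio = hit_rate / fa_rate

    # Parametric signal detection measures (equal-variance Gaussian model).
    z = NormalDist().inv_cdf
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))

    # Non-parametric sensitivity: A' (one common choice among several).
    h, f = hit_rate, fa_rate
    if h >= f:
        a_prime = 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
    else:
        a_prime = 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))

    return {
        "proportion_correct": proportion_correct,
        "diagnosticity_ratio": diagnosticity_ratio,
        "d_prime": d_prime,
        "criterion": criterion,
        "a_prime": a_prime,
    }


if __name__ == "__main__":
    # Hypothetical counts: 50 same-source and 50 different-source trials.
    for name, value in performance_measures(45, 5, 10, 40).items():
        print(f"{name}: {value:.3f}")
```

Running the same function on tables with different response biases or unequal numbers of same-source and different-source trials is one way to see how the measures can diverge for what is nominally the same set of decisions.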

Keywords: Decision-making; Expertise; Fingerprints; Forensic pattern matching; Forensic science; Proficiency tests; Signal detection.