Reliability in performance assessment creates a potential application of artificial intelligence in veterinary education: evaluation of suturing skills at a single institution

Am J Vet Res. 2023 Jun 27;84(8):ajvr.23.03.0058. doi: 10.2460/ajvr.23.03.0058. Print 2023 Aug 1.

Abstract

Objectives: To evaluate suturing skills of veterinary students using 3 common performance assessments (PAs) and to compare findings to data obtained by an electromyographic armband.

Sample: 16 second-year veterinary students.

Procedures: Students performed 4 suturing tasks on synthetic tissue models 1, 3, and 5 weeks after a surgical skills course. Digital videos were scored by 4 expert surgeons using 3 PAs (an Objective Structured Clinical Examination [OSCE]-style surgical binary checklist, an Objective Structured Assessment of Technical Skill [OSATS] checklist, and a surgical Global Rating Scale [GRS]). Surface electromyography (sEMG) data collected from the dominant forearm were input to machine learning algorithms. Performance assessment scores were compared between experts and correlated to task completion times and sEMG data. Inter-rater reliability was calculated using the intraclass correlation coefficient (ICC). Inter-rater agreement was calculated using percent agreement with varying levels of tolerance.
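
The abstract does not say how tolerance was defined or how agreement was pooled. As a minimal sketch of percent agreement with tolerance, the Python below assumes that two raters agree on a video when their scores differ by no more than the tolerance, and that agreement is pooled over every pair of raters on every video. The function name and the example scores are hypothetical and are not data from the study.

```python
from itertools import combinations


def percent_agreement(scores, tolerance=0):
    """Pairwise percent agreement among raters.

    scores: list of per-video lists, one numeric score per rater.
    tolerance: largest absolute score difference still counted as
               agreement (assumed definition, not taken from the study).
    """
    agreeing_pairs = 0
    total_pairs = 0
    for video_scores in scores:
        # Compare every pair of raters who scored the same video.
        for a, b in combinations(video_scores, 2):
            total_pairs += 1
            if abs(a - b) <= tolerance:
                agreeing_pairs += 1
    if total_pairs == 0:
        raise ValueError("need at least one video scored by two or more raters")
    return 100.0 * agreeing_pairs / total_pairs


# Hypothetical checklist scores: 3 videos, each scored by 4 raters.
example_scores = [
    [7, 7, 8, 6],
    [5, 5, 5, 5],
    [9, 6, 8, 7],
]

for tol in (0, 1, 2):
    print(f"tolerance {tol}: {percent_agreement(example_scores, tol):.1f}% agreement")
```

Under these assumptions, widening the tolerance can only keep agreement the same or raise it, which is consistent with checklist agreement being reached only once moderate tolerance was applied.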

Results: Reliability was moderate for the OSCE and OSATS checklists and poor for the GRS. Agreement was achieved for the checklists when moderate tolerance was applied but remained poor for the GRS. sEMG signals did not correlate well with checklist scores or task times, but features extracted from signals permitted task differentiation by routine statistical comparison and correct task classification using machine learning algorithms.

Clinical relevance: Reliability and agreement of an OSCE-style checklist, OSATS checklist, and surgical GRS assessment were insufficient to characterize suturing skills of veterinary students. To avoid subjectivity associated with PA by raters, further study of kinematics and EMG data is warranted in the surgical skills evaluation of veterinary students.

MeSH terms

  • Animals
  • Artificial Intelligence*
  • Education, Veterinary*
  • Reproducibility of Results