Automatic MeSH Indexing: Revisiting the Subheading Attachment Problem

AMIA Annu Symp Proc. 2021 Jan 25:2020:1031-1040. eCollection 2020.

Abstract

This year fewer than 200 National Library of Medicine indexers expect to index 1 million articles, which would not be possible without the assistance of the Medical Text Indexer (MTI) system. MTI is an automated indexing system that provides MeSH main heading/subheading pair recommendations to assist indexers with their heavy workload. Over the years, considerable research effort has focused on improving main heading prediction performance, but automated fine-grained indexing with main heading/subheading pairs has received much less attention. This work revisits the subheading attachment problem and demonstrates substantial performance improvements using modern Convolutional Neural Network classifiers. The best performing method is shown to outperform the current MTI implementation with a 3.7% absolute improvement in precision and a 27.6% absolute improvement in recall. We also conducted a manual review of false positive predictions, and 70% were found to be acceptable indexing.
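To make the reported metrics concrete, the following minimal sketch shows one way precision and recall could be computed over sets of main heading/subheading pairs, and how an absolute improvement is the plain difference between two scores in percentage points. It is not the paper's evaluation code: the function name, the pair representation (a tuple with None for a heading without a subheading), and all example pairs are assumptions made purely for illustration.

```python
def precision_recall(predicted, gold):
    """Precision and recall over sets of (main heading, subheading) pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical indexing for one article; None means "no subheading attached".
gold = {
    ("Abstracting and Indexing", "methods"),
    ("Neural Networks, Computer", None),
    ("Medical Subject Headings", None),
    ("MEDLINE", None),
}

# Hypothetical output of a baseline system and of an improved system.
baseline = {
    ("Abstracting and Indexing", "methods"),
    ("MEDLINE", "standards"),
}
improved = {
    ("Abstracting and Indexing", "methods"),
    ("Neural Networks, Computer", None),
    ("Medical Subject Headings", None),
    ("MEDLINE", "standards"),
}

base_p, base_r = precision_recall(baseline, gold)
new_p, new_r = precision_recall(improved, gold)

# Absolute improvement: difference of the two scores, in percentage points.
precision_gain = (new_p - base_p) * 100
recall_gain = (new_r - base_r) * 100
print(f"precision: {precision_gain:+.1f} points, recall: {recall_gain:+.1f} points")
```

In this sketch a pair counts as correct only when both the main heading and the subheading match, so a prediction with the right heading but the wrong subheading is a false positive. The manual review mentioned in the abstract suggests that some such false positives may still be acceptable indexing.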

Publication types

  • Research Support, N.I.H., Intramural

MeSH terms

  • Abstracting and Indexing / methods*
  • Humans
  • MEDLINE
  • Machine Learning
  • Medical Subject Headings*
  • National Library of Medicine (U.S.)
  • Natural Language Processing
  • Neural Networks, Computer*
  • Unified Medical Language System
  • United States