Deep neural models for ICD-10 coding of death certificates and autopsy reports in free-text

J Biomed Inform. 2018 Apr:80:64-77. doi: 10.1016/j.jbi.2018.02.011. Epub 2018 Feb 26.

Abstract

We address the assignment of ICD-10 codes for causes of death by analyzing free-text descriptions in death certificates, together with the associated autopsy reports and clinical bulletins, from the Portuguese Ministry of Health. We leverage a deep neural network that combines word embeddings, recurrent units, and neural attention to generate intermediate representations of the textual contents. The neural network also exploits the hierarchical nature of the input data by building representations from the sequences of words within individual fields, which are then combined according to the sequences of fields that compose the inputs. Moreover, we explore innovative mechanisms for initializing the weights of the final nodes of the network, leveraging co-occurrences between classes together with the hierarchical structure of ICD-10. Experimental results attest to the contribution of the different neural network components. Our best model achieves accuracy scores over 89%, 81%, and 76% for ICD-10 chapters, blocks, and full codes, respectively. Through examples, we also show that our method can produce interpretable results that are useful for public health surveillance.
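
The two-level composition described in the abstract (words within a field, then fields within a record) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`attention_pool`, `encode_document`), the dot-product scoring, and the fixed context vectors are assumptions made for illustration, and the recurrent units and learned embeddings are omitted, so each word is assumed to be already represented by a vector.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def attention_pool(vectors: List[Vector], context: Vector) -> Tuple[Vector, List[float]]:
    """Weighted average of `vectors`, weighted by similarity to `context`.

    Dot-product scoring is an illustrative assumption; in a trained model
    the context vector would be a learned parameter.
    """
    weights = softmax([dot(v, context) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights


def encode_document(
    fields: List[List[Vector]], word_context: Vector, field_context: Vector
) -> Tuple[Vector, List[float]]:
    """Hierarchical encoding: pool words per field, then pool the fields.

    Returns the document vector and the field-level attention weights.
    """
    field_vectors = [attention_pool(words, word_context)[0] for words in fields]
    return attention_pool(field_vectors, field_context)


# Toy record with two fields (e.g. a certificate line and an autopsy note),
# each holding hypothetical 2-dimensional word vectors.
record = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]]]
doc_vector, field_weights = encode_document(record, [1.0, 0.0], [0.0, 1.0])
```

The attention weights returned at each level sum to one, so they can be read as the relative contribution of each word or field to the final representation, which is the kind of inspection the abstract refers to when it mentions interpretable results.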

Keywords: Artificial intelligence in medicine; Automated ICD coding; Clinical text mining; Deep learning; Natural language processing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Autopsy / methods
  • Clinical Coding / methods*
  • Data Mining / methods*
  • Death Certificates*
  • Electronic Health Records
  • Humans
  • International Classification of Diseases*
  • Natural Language Processing
  • Neural Networks, Computer*