An intent classification method for questions in "Treatise on Febrile diseases" based on TinyBERT-CNN fusion model

Comput Biol Med. 2023 Aug:162:107075. doi: 10.1016/j.compbiomed.2023.107075. Epub 2023 May 29.

Abstract

"Treatise on Febrile Diseases" is an important classic book in the academic history of Chinese material medica. Based on the knowledge map of traditional Chinese medicine established by the study of "Treatise on Febrile Diseases", a question-answering system of traditional Chinese medicine was established to help people better understand and use traditional Chinese medicine. Intention classification is the basis of the question-answering system of traditional Chinese medicine, but as far as we know, there is no research on question intention classification based on "Treatise on Febrile Diseases". In this paper, the intent classification research is carried out based on the Chinese material medica-related content materials in "Treatise on Febrile Diseases" as data. Most of the existing models perform well on long text classification tasks, with high costs and a lot of memory requirements. However, the intent classification data of this paper has the characteristics of short text, a small amount of data, and unbalanced categories. In response to these problems, this paper proposes a knowledge distillation-based bidirectional Transformer encoder combined with a convolutional neural network model (TinyBERT-CNN), which is used for the task of question intent classification in "Treatise on Febrile Diseases". The model used TinyBERT as an embedding and encoding layer to obtain the global vector information of the text and then completed the intent classification by feeding the encoded feature information into the CNN. The experimental results indicated that the model outperformed other models in terms of accuracy, recall, and F1 values of 96.4%, 95.9%, and 96.2%, respectively. The experimental results prove that the model proposed in this paper can effectively classify the intent of the question sentences in "Treatise on Febrile Diseases", and provide technical support for the question-answering system of "Treatise on Febrile Diseases" later.

Keywords: Data mining; Deep learning models; Intent classification; Short text; Typhoid theory.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Intention*
  • Language
  • Neural Networks, Computer*