Using Semantic Text Similarity calculation for question matching in a rheumatoid arthritis question-answering system

Quant Imaging Med Surg. 2023 Apr 1;13(4):2183-2196. doi: 10.21037/qims-22-749. Epub 2023 Mar 15.

Abstract

Background: When users search the internet for knowledge in a particular field, an intelligent question-answering system based on frequently asked questions (FAQs) can provide numerous concise and accurate answers that have been manually verified. However, few question-answering systems are dedicated to chronic diseases such as rheumatoid arthritis, and the technology for building such systems is not yet sufficiently mature.

Methods: We embedded the classification information of each question into its sentence vector using the bidirectional encoder representations from transformers (BERT) language model. First, we calculated similarity using edit distance to recall a candidate set of similar questions. Next, we used the pretrained BERT model to map the sentence information to its corresponding embedding representation. Finally, the feature of the sentence in each dimension was obtained by passing the sentence vector through a multihead attention layer and a fully connected feedforward layer. The resulting features were concatenated and fused, and then used for the semantic similarity calculation.
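The first stage described above, recalling candidates by edit distance, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the length-normalized similarity score, and the candidate-set size `k` are assumptions, since the abstract does not specify them.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Map edit distance to [0, 1]; this normalization is an assumption."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def recall_candidates(query: str, faq_questions: List[str], k: int = 10) -> List[str]:
    """Return the k FAQ questions most similar to the query (hypothetical helper)."""
    scored = [(edit_similarity(query, q), q) for q in faq_questions]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [q for _, q in scored[:k]]
```

In a pipeline of this shape, the cheap character-level recall narrows the FAQ set so that the more expensive BERT-based semantic comparison runs only on the shortlisted candidates.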

Results: Our improved model achieved a Top-1 precision of 0.551, a Top-3 precision of 0.767, and a Top-5 precision of 0.813 on 176 test questions. In an analysis of the model's performance in actual application, we found that it performed well in understanding users' actual intentions.
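For readers unfamiliar with the metric, Top-k precision can be computed as in the sketch below. It assumes the common reading that a test question counts as correct when its reference FAQ question appears among the model's k highest-ranked candidates; the abstract does not define the metric, and the function name and toy data are hypothetical.

```python
from typing import List, Sequence


def top_k_precision(ranked_lists: Sequence[List[str]],
                    gold: Sequence[str],
                    k: int) -> float:
    """Fraction of queries whose gold match is within the top-k ranked candidates."""
    if len(ranked_lists) != len(gold):
        raise ValueError("ranked_lists and gold must have the same length")
    if not gold:
        return 0.0
    hits = sum(1 for ranked, g in zip(ranked_lists, gold) if g in ranked[:k])
    return hits / len(gold)


# Toy example (made-up identifiers, not data from the study).
rankings = [["q1", "q7", "q3"], ["q5", "q2", "q9"], ["q4", "q8", "q6"]]
references = ["q1", "q9", "q0"]
top1 = top_k_precision(rankings, references, k=1)
top3 = top_k_precision(rankings, references, k=3)
```

Under this definition the score can only stay the same or rise as k grows, which matches the ordering of the reported Top-1, Top-3, and Top-5 values.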

Conclusions: Our deep learning model takes into account the background and classifications of questions and combines the efficiency of deep learning technology with the comprehensibility of semantics. It enables the intelligent question-answering system to better understand the deeper meaning of a user's question and to provide answers that are more relevant to the original query.

Keywords: Intelligent question-answering system; Semantic Text Similarity (STS); classification; deep learning; rheumatoid arthritis.