Automatic symptoms identification from a massive volume of unstructured medical consultations using deep neural and BERT models

Hossam Faris; Mohammad Faris; Maria Habib; Alaa Alomari

doi:10.1016/j.heliyon.2022.e09683

Heliyon. 2022 Jun 10;8(6):e09683. doi: 10.1016/j.heliyon.2022.e09683. eCollection 2022 Jun.

Authors

Hossam Faris¹ ² ³, Mohammad Faris³, Maria Habib³, Alaa Alomari³ ⁴

Affiliations

¹ King Abdullah II School for Information Technology, The University of Jordan, 11942, Jordan.
² Research Centre for Information and Communications Technologies of the University of Granada (CITIC-UGR), University of Granada, Granada, Spain.
³ Altibbi (https://altibbi.com), Amman, Jordan.
⁴ School of Informatics and Telecommunications Engineering, University of Granada, Granada, Spain.

Abstract

Automatic symptom identification plays a crucial role in assisting doctors during the diagnosis process in telemedicine. In general, physicians spend considerable time on clinical documentation and symptom identification, which is hard to sustain given their full schedules. In text-based telemedicine consultation services, identifying symptoms from a user's consultation is a complex and time-consuming process. Moreover, at Altibbi, an Arabic telemedicine platform and the context of this work, users consult doctors and describe their conditions in different Arabic dialects, which makes the problem more challenging. Therefore, in this work, an advanced deep learning approach is developed to identify symptoms from consultations written in multiple dialects. The approach is formulated as a multi-label, multi-class classification problem using features extracted with AraBERT and fine-tuned on a bidirectional long short-term memory (BiLSTM) network. The fine-tuning of the BiLSTM relies on features engineered from different variants of the bidirectional encoder representations from transformers (BERT). Evaluating the models on precision, recall, and a customized hit rate showed successful identification of symptoms from Arabic texts with promising accuracy. Hence, this paves the way toward deploying an automated symptom identification model in production at Altibbi, which can help general practitioners in telemedicine provide more efficient and accurate consultations.
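To make the multi-label formulation and its evaluation concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes that each consultation is mapped to a 0/1 vector over a symptom vocabulary, that the model emits one score per symptom, and that scores are thresholded independently. The symptom names, the scores, and the threshold are hypothetical. The abstract does not define the customized hit rate, so `hit_rate_at_k` below is only one plausible reading (a consultation counts as a hit when at least one of its top-k predicted symptoms is correct).

```python
import math

# Hypothetical symptom vocabulary; the real label set is not given in the abstract.
SYMPTOMS = ["fever", "cough", "headache", "nausea"]


def multi_hot(labels, vocab=SYMPTOMS):
    """Encode a set of symptom names as a 0/1 vector, one slot per label."""
    return [1 if s in labels else 0 for s in vocab]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode(scores, threshold=0.5):
    """Apply an independent sigmoid per label, so several labels can fire."""
    return [1 if sigmoid(z) >= threshold else 0 for z in scores]


def micro_precision_recall(y_true, y_pred):
    """Micro-averaged precision and recall over all (sample, label) pairs."""
    pairs = [(t, p) for yt, yp in zip(y_true, y_pred) for t, p in zip(yt, yp)]
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def hit_rate_at_k(y_true, scores, k):
    """Assumed definition: share of samples with a true label in the top-k scores."""
    hits = 0
    for yt, row in zip(y_true, scores):
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += any(yt[i] == 1 for i in top_k)
    return hits / len(y_true)


# Two made-up consultations with their gold symptoms and made-up model scores.
y_true = [multi_hot({"fever", "cough"}), multi_hot({"headache"})]
scores = [[2.0, -1.0, 0.5, -3.0], [-2.0, 1.0, 0.3, -1.0]]
y_pred = [decode(row) for row in scores]

precision, recall = micro_precision_recall(y_true, y_pred)
print(precision, recall, hit_rate_at_k(y_true, scores, k=1))
```

In a setup like the one described, the scores would come from a BiLSTM over BERT-derived features rather than from hand-written numbers. The per-label sigmoid is what distinguishes multi-label prediction from a single softmax choice.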

Keywords: Deep learning; Machine learning; Multi-classification; Multi-label; Telemedicine.