Entity recognition of railway signal equipment fault information based on RoBERTa-wwm and deep learning integration

Math Biosci Eng. 2024 Jan;21(1):1228-1248. doi: 10.3934/mbe.2024052. Epub 2022 Dec 26.

Abstract

The operation and maintenance of railway signal systems generate a large volume of complex text data about faults. To address the fuzzy entity boundaries and low entity recognition accuracy in the field of railway signal equipment faults, this paper proposes a method for entity recognition of railway signal equipment fault information based on RoBERTa-wwm and deep learning integration. First, the model uses the RoBERTa-wwm pretrained language model to obtain word vectors for the text sequences. Second, a parallel network consisting of a BiLSTM and a CNN extracts contextual feature information and local attention information, respectively. Third, the feature vectors output by the BiLSTM and the CNN are combined and fed into a multi-head attention (MHA) layer, which focuses on key feature information and mines the connections between different features. Finally, a conditional random field (CRF) layer outputs label sequences that satisfy the constraint relationships between labels, completing the entity recognition task. Experiments on railway signal equipment fault text from the past ten years show that the model scores higher than the traditional model on this dataset, with precision, recall and F1 values of 93.25%, 92.45%, and 92.85%, respectively.
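
The three reported metrics can be checked against one another, assuming the F1 value is the usual harmonic mean of precision and recall (the abstract does not state the formula). The sketch below recomputes F1 from the reported precision and recall; the function and variable names are illustrative, not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in the same units)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, in percent.
reported_precision = 93.25
reported_recall = 92.45

# Assumption: F1 is the standard harmonic mean, not a macro or weighted average.
f1 = f1_score(reported_precision, reported_recall)
print(f"Recomputed F1: {f1:.2f}%")
```

Under that assumption, the recomputed value rounds to the reported 92.85%, so the three figures are mutually consistent.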

Keywords: RoBERTa-wwm; deep learning; fault text; knowledge graph; named entity recognition; railway signal equipment.