Categorization of tweets for damages: infrastructure and human damage assessment using fine-tuned BERT model

PeerJ Comput Sci. 2024 Feb 16:10:e1859. doi: 10.7717/peerj-cs.1859. eCollection 2024.

Abstract

Identification of infrastructure and human damage assessment tweets is beneficial to disaster management organizations as well as victims during a disaster. Most of the prior works focused on the detection of informative/situational tweets, and infrastructure damage, only one focused on human damage. This study presents a novel approach for detecting damage assessment tweets involving infrastructure and human damages. We investigated the potential of the Bidirectional Encoder Representations from Transformer (BERT) model to learn universal contextualized representations targeting to demonstrate its effectiveness for binary and multi-class classification of disaster damage assessment tweets. The objective is to exploit a pre-trained BERT as a transfer learning mechanism after fine-tuning important hyper-parameters on the CrisisMMD dataset containing seven disasters. The effectiveness of fine-tuned BERT is compared with five benchmarks and nine comparable models by conducting exhaustive experiments. The findings show that the fine-tuned BERT outperformed all benchmarks and comparable models and achieved state-of-the-art performance by demonstrating up to 95.12% macro-f1-score, and 88% macro-f1-score for binary and multi-class classification. Specifically, the improvement in the classification of human damage is promising.

Keywords: Damage assessment; Disaster; Fine-tuning BERT; Human damage; Language context; Twitter.

Grants and funding

This study was supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2024R104), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.