SCREENER: Streamlined collaborative learning of NER and RE model for discovering gene-disease relations

PLoS One. 2023 Nov 27;18(11):e0294713. doi: 10.1371/journal.pone.0294713. eCollection 2023.

Abstract

Finding relations between genes and diseases is essential in developing a clinical diagnosis, treatment, and drug design for diseases. One successful approach for mining the literature is the document-based relation extraction method. Despite recent advances in document-level extraction of entity-entity, there remains a difficulty in understanding the relations between distant words in a document. To overcome the above limitations, we propose an AI-based text-mining model that learns the document-level relations between genes and diseases using an attention mechanism. Furthermore, we show that including a direct edge (DE) and indirect edges between genetic targets and diseases when training improves the model's performance. Such relation edges can be visualized as graphs, enhancing the interpretability of the model. For the performance, we achieved an F1-score of 0.875, outperforming state-of-the-art document-level extraction models. In summary, the SCREENER identifies biological connections between target genes and diseases with superior performance by leveraging direct and indirect target-disease relations. Furthermore, we developed a web service platform named SCREENER (Streamlined CollaboRativE lEarning of NEr and Re), which extracts the gene-disease relations from the biomedical literature in real-time. We believe this interactive platform will be useful for users to uncover unknown gene-disease relations in the world of fast-paced literature publications, with sufficient interpretation supported by graph visualizations. The interactive website is available at: https://ican.standigm.com.

MeSH terms

  • Data Mining / methods
  • Interdisciplinary Placement*

Grants and funding

The author(s) received no specific funding for this work.