ANDDigest: a new web-based module of ANDSystem for the search of knowledge in the scientific literature

BMC Bioinformatics. 2020 Sep 14;21(Suppl 11):228. doi: 10.1186/s12859-020-03557-8.

Abstract

Background: The rapid growth of scientific literature has rendered the task of finding relevant information one of the critical problems in almost any research. Search engines, like Google Scholar, Web of Knowledge, PubMed, Scopus, and others, are highly effective in document search; however, they do not allow knowledge extraction. In contrast to the search engines, text-mining systems provide extraction of knowledge with representations in the form of semantic networks. Of particular interest are tools performing a full cycle of knowledge management and engineering, including automated retrieval, integration, and representation of knowledge in the form of semantic networks, their visualization, and analysis. STRING, Pathway Studio, MetaCore, and others are well-known examples of such products. Previously, we developed the Associative Network Discovery System (ANDSystem), which also implements such a cycle. However, the drawback of these systems is dependence on the employed ontologies describing the subject area, which limits their functionality in searching information based on user-specified queries.

Results: The ANDDigest system is a new web-based module of the ANDSystem tool, permitting searching within PubMed by using dictionaries from the ANDSystem tool and sets of user-defined keywords. ANDDigest allows performing the search based on complex queries simultaneously, taking into account many types of objects from the ANDSystem's ontology. The system has a user-friendly interface, providing sorting, visualization, and filtering of the found information, including mapping of mentioned objects in text, linking to external databases, sorting of data by publication date, citations number, journal H-indices, etc. The system provides data on trends for identified entities based on dynamics of interest according to the frequency of their mentions in PubMed by years.

Conclusions: The main feature of ANDDigest is its functionality, serving as a specialized search for information about multiple associative relationships of objects from the ANDSystem's ontology vocabularies, taking into account user-specified keywords. The tool can be applied to the interpretation of experimental genetics data, the search for associations between molecular genetics objects, and the preparation of scientific and analytical reviews. It is presently available at https://anddigest.sysbio.ru/ .

Keywords: Associative gene network; Dynamics of interest; Information search; Knowledge retrieval; Text-mining; Trend analysis; Web-based tool.

MeSH terms

  • Data Mining / methods*
  • Databases, Factual
  • Internet*
  • PubMed
  • Software*