Improved biomedical term selection in pseudo relevance feedback

Database (Oxford). 2018 Jan 1:2018:bay056. doi: 10.1093/database/bay056.

Abstract

Biomedical information retrieval systems are becoming more popular and more complex because of the massive, ever-growing volume of biomedical literature. Users are often unable to construct a precise and accurate query that clearly represents the information they intend to find. The query is therefore expanded with terms or features that retrieve more relevant information. Selecting appropriate expansion terms plays a key role in improving retrieval performance. We propose document frequency chi-square, a new variant of chi-square, for term selection in pseudo relevance feedback. The effects of pre-processing on information retrieval performance, specifically in the biomedical domain, are also reported. On average, the proposed algorithm outperformed state-of-the-art term selection algorithms by 88% at pre-defined test points. Our experiments also show that stemming decreases the overall performance of a pseudo relevance feedback based information retrieval system, particularly in the biomedical domain. Database URL: http://biodb.sdau.edu.cn/gan/.
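
The abstract does not give the formula for document frequency chi-square, so the sketch below is only a generic illustration of chi-square-style term selection in pseudo relevance feedback, not the authors' method. It assumes that the top-ranked documents from an initial search are treated as relevant, and that each candidate term is scored as (p_fb - p_col)^2 / p_col, where p_fb and p_col are the fractions of feedback documents and collection documents containing the term. The function name, the toy corpus and the choice to use document frequencies are all hypothetical.

```python
from collections import Counter


def chi_square_expansion_terms(feedback_docs, collection_docs, query_terms, k=3):
    """Rank candidate expansion terms with a chi-square-style score.

    Each document is a list of tokens. The score compares how often a term
    occurs in the pseudo-relevant (feedback) documents with how often it
    occurs in the whole collection. Illustrative only; this is an assumed
    scoring formula, not the one proposed in the paper.
    """
    n_fb = len(feedback_docs)
    n_col = len(collection_docs)

    # Document frequency: number of documents that contain the term at least once.
    df_fb = Counter(term for doc in feedback_docs for term in set(doc))
    df_col = Counter(term for doc in collection_docs for term in set(doc))

    scores = {}
    for term, df in df_fb.items():
        # Skip terms already in the query and terms unseen in the collection.
        if term in query_terms or df_col[term] == 0:
            continue
        p_fb = df / n_fb
        p_col = df_col[term] / n_col
        # Keep only terms that are over-represented in the feedback set.
        if p_fb <= p_col:
            continue
        scores[term] = (p_fb - p_col) ** 2 / p_col

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Toy collection (hypothetical data); the first two documents play the role
# of the top-ranked results returned for the query "breast cancer".
collection = [
    ["brca1", "mutation", "breast", "cancer"],
    ["brca1", "breast", "cancer", "risk"],
    ["cancer", "therapy", "trial"],
    ["diabetes", "insulin", "therapy"],
    ["insulin", "resistance", "trial"],
    ["gene", "expression", "cancer"],
]
feedback = collection[:2]
query = {"breast", "cancer"}

for term, score in chi_square_expansion_terms(feedback, collection, query):
    print(f"{term}\t{score:.3f}")
```

In this toy run the term "brca1" ranks first because it appears in every feedback document but in only a third of the collection. The selected terms would then be appended to the original query before the second retrieval pass.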

MeSH terms

  • Algorithms
  • Biomedical Research*
  • Chi-Square Distribution
  • Databases as Topic
  • Feedback*
  • Probability
  • Search Engine