'Sciencenet'--towards a global search and share engine for all scientific knowledge

Bioinformatics. 2011 Jun 15;27(12):1734-5. doi: 10.1093/bioinformatics/btr181. Epub 2011 Apr 14.

Abstract

Summary: Modern biological experiments create vast amounts of geographically distributed data. These datasets consist of petabytes of raw data and billions of documents. Yet, to the best of our knowledge, no search engine technology exists that searches and cross-links all the different data types in the life sciences. We have developed a prototype distributed scientific search engine technology, 'Sciencenet', which facilitates rapid searching over this large data space. By 'bringing the search engine to the data', we do not require server farms. The platform also allows users to contribute to the search index and to publish their large-scale data in support of e-Science. Furthermore, a community-driven method guarantees that only scientific content is crawled and presented. Our peer-to-peer approach is sufficiently scalable for the science web without a tradeoff in performance or capacity.
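
The idea of 'bringing the search engine to the data' can be illustrated with a minimal sketch. It assumes, purely for illustration, that each peer keeps a small inverted index of its own documents and that a query is sent to every peer and the partial results are merged. The names Peer and federated_search, and the scoring by number of matched terms, are hypothetical and do not describe the actual Sciencenet or YaCy protocol.

```python
from collections import defaultdict


class Peer:
    """Hypothetical peer that indexes only the documents stored next to it."""

    def __init__(self, name):
        self.name = name
        self.index = defaultdict(set)  # term -> set of local document ids

    def add_document(self, doc_id, text):
        # Naive whitespace tokenisation; a real engine would do far more.
        for term in text.lower().split():
            self.index[term].add(doc_id)

    def search(self, query):
        """Return {doc_id: number of distinct query terms matched} locally."""
        hits = defaultdict(int)
        for term in set(query.lower().split()):
            for doc_id in self.index.get(term, ()):
                hits[doc_id] += 1
        return dict(hits)


def federated_search(peers, query):
    """Send the query to every peer and merge the partial result lists."""
    merged = []
    for peer in peers:
        for doc_id, score in peer.search(query).items():
            merged.append((score, peer.name, doc_id))
    # Highest score first; ties broken by peer name, then document id.
    return sorted(merged, key=lambda hit: (-hit[0], hit[1], hit[2]))


# Two illustrative peers, each holding its own data (assumed example content).
lab_a = Peer("lab-a")
lab_a.add_document("a1", "protein folding simulation")
lab_b = Peer("lab-b")
lab_b.add_document("b1", "protein expression data")

results = federated_search([lab_a, lab_b], "protein folding")
```

In this sketch no central server holds the documents: each peer answers from its local index, and only the small result lists travel over the network.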

Availability and implementation: The free-to-use search portal web page and the downloadable client are accessible at http://sciencenet.kit.edu. The web portal for index administration is implemented in ASP.NET, the 'AskMe' experiment publisher is written in Python 2.7, and the backend 'YaCy' search engine is based on Java 1.6.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biological Science Disciplines
  • Internet
  • Search Engine*
  • Software