B-HIT - A Tool for Harvesting and Indexing Biodiversity Data

PLoS One. 2015 Nov 6;10(11):e0142240. doi: 10.1371/journal.pone.0142240. eCollection 2015.

Abstract

With the rapidly growing number of data publishers, the process of harvesting and indexing information to offer advanced search and discovery becomes a critical bottleneck in globally distributed primary biodiversity data infrastructures. The Global Biodiversity Information Facility (GBIF) implemented a Harvesting and Indexing Toolkit (HIT), which largely automates data harvesting activities for hundreds of collection and observational data providers. The team of the Botanic Garden and Botanical Museum Berlin-Dahlem has extended this well-established system with a range of additional functions, including improved processing of multiple taxon identifications, the ability to represent associations between specimen and observation units, new data quality control, and new reporting capabilities. The open-source software B-HIT can be freely installed and used for setting up thematic networks serving the demands of particular user groups.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Abstracting and Indexing
  • Biodiversity*
  • Classification
  • Data Mining
  • Databases, Factual
  • Internet
  • Software*

Grants and funding

The design and implementation of B-HIT were funded by the German Research Foundation (DFG, http://www.dfg.de) project BiNHum (BE 2283/8-1). The analysis of data for GGBN was funded by the DFG project GGBN (GU 1109/5-1) as well as the National Science Foundation (http://www.nsf.gov, NSF DEB 0956426). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.