Use it or lose it: citations predict the continued online availability of published bioinformatics resources

Nucleic Acids Res. 2017 Apr 20;45(7):3627-3633. doi: 10.1093/nar/gkx182.

Abstract

Scientific Data Analysis Resources (SDARs) such as bioinformatics programs, web servers and databases are integral to modern science, but previous studies have shown that the Uniform Resource Locators (URLs) linking to them decay in a time-dependent manner, with ∼27% decayed to date. Because SDARs are overrepresented among science's most cited papers over the past 20 years, loss of widely used SDARs could be particularly disruptive to scientific research. We identified URLs in MEDLINE abstracts and used crowdsourcing to identify which reported the creation of SDARs. We used the Internet Archive's Wayback Machine to approximate 'death dates' and calculate citations/year over each SDAR's lifespan. At first glance, decayed SDARs did not significantly differ from available SDARs in their average citations per year over their lifespan or journal impact factor (JIF). However, the most cited SDARs were 94% likely to have been relocated to another URL, versus only 34% of uncited ones. Taking relocation into account, we find that citations are the strongest predictor of current online availability after time since publication, and that JIF is modestly predictive. This suggests that URL decay is a general, persistent phenomenon affecting all URLs, but that the most useful/recognized SDARs are more likely to persist.
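The citations/year calculation mentioned above can be illustrated with a minimal sketch. The abstract does not specify how the death date was derived from Wayback Machine data, so the code below assumes (hypothetically) that it is the date of the last successful archived capture; the function names and the `(date, ok)` snapshot representation are illustrative and not taken from the paper.

```python
from datetime import date

def approximate_death_date(snapshots):
    """Return the date of the last successful capture, or None.

    `snapshots` is a list of (capture_date, ok) pairs. Treating the last
    successful capture as the 'death date' is an assumption of this sketch.
    """
    ok_dates = [d for d, ok in snapshots if ok]
    return max(ok_dates) if ok_dates else None

def citations_per_year(citations, published, end):
    """Average citations per year between publication and `end`."""
    years = (end - published).days / 365.25
    if years <= 0:
        raise ValueError("end date must be after publication date")
    return citations / years

# Hypothetical resource: published in 2005, last seen alive in mid-2010.
snapshots = [
    (date(2006, 1, 1), True),
    (date(2010, 6, 1), True),
    (date(2012, 1, 1), False),  # capture failed; URL presumed decayed
]
death = approximate_death_date(snapshots)
rate = citations_per_year(100, date(2005, 1, 1), date(2015, 1, 1))
```

For a resource that is still available, the end of the lifespan would presumably be the date of analysis rather than a death date.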

Publication types

  • Review
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Computational Biology*
  • Internet*
  • Journal Impact Factor
  • MEDLINE
  • Periodicals as Topic*