A workflow for expanding DNA barcode reference libraries through 'museum harvesting' of natural history collections

Biodivers Data J. 2023 May 10;11:e100677. doi: 10.3897/BDJ.11.e100677. eCollection 2023.

Abstract

Natural history collections are the physical repositories of our knowledge of species, the entities of biodiversity. Making this knowledge accessible to society - through, for example, digitisation or the construction of a validated, global DNA barcode library - is of crucial importance. To this end, we developed and streamlined a workflow for 'museum harvesting' of authoritatively identified Diptera specimens from the Smithsonian Institution's National Museum of Natural History. Our detailed workflow includes both on-site and off-site processing through specimen selection, labelling, imaging, tissue sampling, databasing and DNA barcoding. This approach was tested by harvesting and DNA barcoding 941 voucher specimens, representing 32 families, 819 genera and 695 identified species collected from 100 countries. We recovered 867 sequences (> 0 base pairs) with a sequencing success of 88.8% (727 of 819 sequenced genera gained a barcode > 300 base pairs). While Sanger-based methods were more effective for recently collected specimens, methods employing next-generation sequencing recovered barcodes for specimens over a century old. The utility of the newly generated reference barcodes is demonstrated by the subsequent taxonomic assignment of nearly 5000 specimen records in the Barcode of Life Data Systems.
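The reported sequencing success can be checked with a short calculation. The sketch below is illustrative only: the function name `sequencing_success` is hypothetical and not part of the published workflow, and the only inputs are the two genus counts stated in the abstract (727 genera with a barcode > 300 base pairs, out of 819 sequenced genera).

```python
def sequencing_success(barcoded: int, attempted: int) -> float:
    """Return the percentage of attempted units that yielded a barcode.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if attempted <= 0:
        raise ValueError("attempted must be a positive count")
    return 100.0 * barcoded / attempted


# Counts as stated in the abstract (genus level, barcode > 300 base pairs).
genera_sequenced = 819
genera_with_barcode = 727

rate = sequencing_success(genera_with_barcode, genera_sequenced)
print(f"Sequencing success: {rate:.1f}%")
```

Rounded to one decimal place, this gives the 88.8% quoted above. The percentage is computed per genus, not per specimen.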

Keywords: COI; Centre for Biodiversity Genomics; DNA barcoding; Diptera; National Museum of Natural History; USNM; arthropods; digitisation; museum harvesting.