Technical considerations in Hi-C scaffolding and evaluation of chromosome-scale genome assemblies

Mol Ecol. 2021 Dec;30(23):5923-5934. doi: 10.1111/mec.16146. Epub 2021 Sep 12.

Abstract

Ecological studies have recently been propelled by the wealth of information available from chromosome-scale genome sequences, even for species for which genetic linkage information is not available. This was enabled mainly by the application of Hi-C, a genome-wide chromosome conformation capture method originally developed to investigate long-range chromatin interactions. Genome scaffolding with Hi-C data is highly resource-demanding and involves elaborate laboratory steps for sample preparation. The workflow starts with building a primary genome sequence assembly as input, followed by computational scaffolding using Hi-C data, and the results require careful validation. This article presents technical considerations for obtaining optimal Hi-C scaffolding results and provides a test case of its application to a reptile species, the Madagascar ground gecko (Paroedura picta). Among the metrics frequently used for evaluating scaffolding results, we investigate the validity of assessing the completeness of chromosome-scale genome assemblies using single-copy reference orthologues.
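As a rough illustration of the kind of evaluation metrics mentioned above, the following minimal Python sketch computes a scaffold N50 and a BUSCO-style completeness summary. It is not taken from the article: the function names, the input counts, and the choice of N50 as the continuity metric are assumptions made for illustration only. The category names (complete single-copy, complete duplicated, fragmented, missing) follow the usual BUSCO reporting convention.

```python
from typing import Dict, List


def n50(lengths: List[int]) -> int:
    """Return the N50 of a set of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest sequence downwards, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


def completeness_summary(single: int, duplicated: int,
                         fragmented: int, missing: int) -> Dict[str, float]:
    """Convert orthologue counts (hypothetical inputs) into percentages."""
    total = single + duplicated + fragmented + missing
    if total == 0:
        raise ValueError("at least one orthologue must be counted")

    def pct(count: int) -> float:
        return 100.0 * count / total

    return {
        "complete": pct(single + duplicated),
        "single_copy": pct(single),
        "duplicated": pct(duplicated),
        "fragmented": pct(fragmented),
        "missing": pct(missing),
    }


if __name__ == "__main__":
    # Made-up scaffold lengths and orthologue counts, for demonstration only.
    print(n50([80, 70, 50, 40, 30, 20]))
    print(completeness_summary(single=90, duplicated=5, fragmented=3, missing=2))
```

Comparing these two numbers before and after scaffolding is one simple way to see why they measure different things: joining contigs into chromosome-scale scaffolds can raise N50 sharply while leaving the completeness percentages largely unchanged, since the underlying sequence content is the same.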

Keywords: BUSCO; Hi-C scaffolding; chromosome-scale genome assembly; completeness assessment; gene space; iconHi-C.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Chromatin
  • Chromosomes* / genetics
  • Genome* / genetics
  • Genomics
  • Madagascar

Substances

  • Chromatin