Structural Refinement by Direct Mapping Reveals Assembly Inconsistencies near Hi-C Junctions

Plants (Basel). 2023 Jan 10;12(2):320. doi: 10.3390/plants12020320.

Abstract

High-throughput chromosome conformation capture (Hi-C) is widely used for scaffolding in de novo assembly because it produces highly contiguous genomes, but its indirect statistical approach can introduce connection errors. We employed optical mapping (Bionano Genomics) as an orthogonal scaffolding technology to assess the structural solidity of Hi-C reconstructed scaffolds. Optical maps were used to assess the correctness of five de novo genome assemblies based on long-read sequencing for contig generation and Hi-C for scaffolding. Hundreds of inconsistencies were found between the reconstructions generated using the Hi-C and optical mapping approaches. Manual inspection, exploiting raw long-read sequencing data and optical maps, confirmed that several of these conflicts were derived from Hi-C joining errors. Such misjoins were widespread, involved the connection of both small and large contigs, and even overlapped annotated genes. We conclude that the integration of optical mapping data after, not before, Hi-C-based scaffolding, improves the quality of the assembly and limits reconstruction errors by highlighting misjoins that can then be subjected to further investigation.

Keywords: Hi-C; assembly refinement; de novo genome assembly; optical mapping.