A Near-Chromosome Level Genome Assembly of Anopheles stephensi

Front Genet. 2020 Nov 16:11:565626. doi: 10.3389/fgene.2020.565626. eCollection 2020.

Abstract

Malaria remains a major healthcare risk to growing economies like India, and a chromosome-level reference genome of Anopheles stephensi is critical for successful vector management and for understanding vector evolution through comparative genomics. We report chromosome-level assemblies of an Indian strain, STE2, and a Pakistani strain, SDA-500, obtained by combining draft genomes of the two strains using a homology-based iterative approach. The resulting assemblies, IndV3/PakV3, with L50 of 9/12 and N50 of 6.3/6.9 Mb, had scaffolds long enough to build 90% of the euchromatic regions of the three chromosomes, IndV3s/PakV3s, using low-resolution physical markers, and enabled the generation of the next version of the genome assemblies, IndV4/PakV4, using HiC data. We validated these assemblies using contact maps against publicly available HiC raw data from two strains, STE2 and another lab strain of An. stephensi from UCI, and compared their quality with that of other assemblies made available as preprints since the submission of the manuscript. We show that the IndV3s and IndV4 assemblies are sensitive enough to identify a homozygous 2Rb inversion in the UCI strain and a 2Rb polymorphism in the STE2 strain. Multiple tandem copies of the CYP6a14, 4c1, and 4c21 genes, implicated in insecticide resistance, lie within this inversion locus. Comparison of the assembled genomes suggests a variation of 1 in 81 positions between the UCI and STE2 lab strains, 1 in 82 between the SDA-500 and UCI strains, and 1 in 113 between the SDA-500 and STE2 strains of An. stephensi, all lower than the variation of 1 in 68 positions among individuals from two other lab strains sequenced and reported here. Based on the developmental transcriptome and the orthology of all 54 olfactory receptors (ORs) to those of other Anopheles species, we identify an OR with the potential for host recognition in the genus Anopheles. A comparative analysis of An. stephensi genomes with the completed genomes of a few other Anopheles species suggests limited inter-chromosomal gene flow and loss of synteny within chromosomal arms even among closely related species.
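
For readers unfamiliar with the contiguity metrics quoted above, the following is a minimal sketch of how N50 and L50 are commonly computed from a list of scaffold lengths. It follows the standard definitions (N50 is the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the assembly size; L50 is the number of scaffolds needed to get there). The function name `n50_l50` and the toy lengths are illustrative assumptions and are not taken from the IndV3/PakV3 assemblies or from the authors' pipeline.

```python
from typing import Iterable, Tuple


def n50_l50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of scaffold lengths.

    Hypothetical helper for illustration only.
    """
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    if not ordered:
        raise ValueError("at least one scaffold length is required")

    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first scaffold that brings the running total to at least
        # half of the assembly size defines both metrics.
        if running >= half_total:
            return length, count

    # Unreachable for positive lengths; kept so the function always returns.
    return ordered[-1], len(ordered)


# Toy scaffold lengths in Mb (made up, not from the paper).
toy_lengths = [8, 6, 4, 2]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50} Mb, L50 = {l50}")
```

Under these definitions, a higher N50 together with a lower L50 indicates a more contiguous assembly, which is how the paired values reported for IndV3 and PakV3 can be read.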

Keywords: comparative genomics; cytochrome P450; developmental transcriptome; gene expression profile; genome browser; homology-based assembly; olfactory receptors; simulated mate-pair.