Genomic microsatellite characteristics analysis of Dysomma anguillare (Anguilliformes, Dysommidae), based on high-throughput sequencing technology

Biodivers Data J. 2023 Apr 7:11:e100068. doi: 10.3897/BDJ.11.e100068. eCollection 2023.

Abstract

Microsatellite loci were screened from the genomic data of Dysomma anguillare and their composition and distribution were analysed bioinformatically for the first time. High-throughput sequencing yielded 4,060,742 scaffolds with a total length of 1,562 Mb, and MISA screening identified 1,160,104 microsatellite loci distributed across 770,294 scaffolds. The occurrence frequency and relative abundance were 28.57% and 743/Mb, respectively. Amongst the six perfect microsatellite types, dinucleotide repeats accounted for the largest proportion (592,234, 51.05%), the highest occurrence frequency (14.58%) and the largest relative abundance (379.27/Mb). A total of 1,488 distinct microsatellite repeat motifs were detected in the genome of D. anguillare, amongst which hexanucleotide repeat motifs were the most numerous (608), followed by pentanucleotide (574), tetranucleotide (232), trinucleotide (59), dinucleotide (11) and mononucleotide (4) repeat motifs. Within each repeat type, the abundance of microsatellites decreased as copy number increased. The predominant motifs for the six repeat types, from mononucleotide to hexanucleotide, were A (191,390, 43.77%), CA (150,240, 25.37%), AAT (13,168, 14.05%), CACG (2,649, 8.14%), TAATG (119, 19.16%) and CCCTAA (190, 7.65%), respectively. The data obtained in this study on the number, distribution and abundance of the different microsatellite types in the genome of D. anguillare lay a foundation for the future development of high-quality microsatellite markers for this species.
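The summary statistics above can be approximately reproduced from the reported counts. The sketch below is a minimal illustration in Python, not the authors' pipeline. It assumes that occurrence frequency is the number of loci divided by the total number of scaffolds and that relative abundance is the number of loci per Mb of assembled sequence; the abstract does not state these definitions, and they are inferred here because they match the reported values. The function and variable names are hypothetical.

```python
# Counts taken from the abstract.
TOTAL_SCAFFOLDS = 4_060_742   # scaffolds obtained by sequencing
TOTAL_LENGTH_MB = 1_562       # total assembly length in Mb (rounded in the abstract)
TOTAL_SSR_LOCI = 1_160_104    # microsatellite loci found by MISA
DINUC_LOCI = 592_234          # dinucleotide repeat loci


def occurrence_frequency(n_loci: int, n_scaffolds: int) -> float:
    """Assumed definition: loci per scaffold, expressed as a percentage."""
    return 100.0 * n_loci / n_scaffolds


def relative_abundance(n_loci: int, length_mb: float) -> float:
    """Assumed definition: loci per Mb of assembled sequence."""
    return n_loci / length_mb


freq_all = occurrence_frequency(TOTAL_SSR_LOCI, TOTAL_SCAFFOLDS)
abund_all = relative_abundance(TOTAL_SSR_LOCI, TOTAL_LENGTH_MB)

freq_dinuc = occurrence_frequency(DINUC_LOCI, TOTAL_SCAFFOLDS)
abund_dinuc = relative_abundance(DINUC_LOCI, TOTAL_LENGTH_MB)
share_dinuc = 100.0 * DINUC_LOCI / TOTAL_SSR_LOCI  # share of all SSR loci

print(f"all SSRs:      {freq_all:.2f}% of scaffolds, {abund_all:.0f}/Mb")
print(f"dinucleotides: {share_dinuc:.2f}% of loci, "
      f"{freq_dinuc:.2f}% of scaffolds, {abund_dinuc:.2f}/Mb")
```

Under these assumed definitions the percentages agree with the abstract to two decimal places. The dinucleotide abundance comes out slightly below the reported 379.27/Mb, presumably because the total length is given only to the nearest Mb.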

Keywords: Dysomma anguillare; genome; high-throughput sequencing; microsatellite.