Context: 46,XY, disorders of sexual development (46,XY, DSD) is a congenital genetic disease with complex pathogenesis and diverse clinical manifestations. Existing molecular research has often focused on single-centre sequencing data rather than on prediction based on big data.
Aims: This work aimed to better understand the pathogenesis of 46,XY, DSD and to summarise the key pathogenic genes.
Methods: First, potential pathogenic genes were identified from public data. Second, bioinformatic analyses, including hub gene analysis, protein-protein interaction (PPI) network analysis and functional enrichment analysis, were used to predict pathogenic genes. Lastly, two unrelated families were recruited, and next-generation sequencing and Sanger sequencing of their genomic DNA were performed to verify the hub genes.
Key results: A total of 161 potential pathogenic genes were selected from the MGI and PubMed gene sets. The resulting PPI network comprised 144 nodes and 194 edges. MCODE 4, the module with the most significant P-value, was selected from the PPI network. The top 15 hub genes were ranked and identified using Cytoscape. Furthermore, genome sequencing identified three variants in the SRD5A2 gene, which was among the predicted hub genes.
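As an illustration of what hub gene ranking involves, the following is a minimal sketch that ranks the nodes of an undirected PPI network by degree, i.e. the number of distinct interaction partners. It rests on assumptions: the abstract does not state which ranking metric was applied in Cytoscape, degree is only one common choice, and the function name `rank_hub_genes`, the placeholder gene labels and the toy edge list are hypothetical and are not data from this study.

```python
from collections import defaultdict


def rank_hub_genes(edges, top_n=15):
    """Rank nodes of an undirected PPI network by degree.

    `edges` is an iterable of (gene_a, gene_b) pairs. Degree is the number
    of distinct partners, so duplicate edges count once and self-interactions
    are ignored. Ties are broken alphabetically for reproducibility.
    """
    neighbours = defaultdict(set)
    for gene_a, gene_b in edges:
        if gene_a == gene_b:
            continue  # skip self-interactions
        neighbours[gene_a].add(gene_b)
        neighbours[gene_b].add(gene_a)

    ranked = sorted(neighbours.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(gene, len(partners)) for gene, partners in ranked[:top_n]]


# Hypothetical toy network (placeholder labels, not the study's 144-node network).
toy_edges = [
    ("GENE_A", "GENE_B"),
    ("GENE_A", "GENE_C"),
    ("GENE_A", "GENE_D"),
    ("GENE_B", "GENE_C"),
    ("GENE_D", "GENE_E"),
]

top_hubs = rank_hub_genes(toy_edges, top_n=3)
print(top_hubs)
```

In practice, the edge list would be exported from the PPI network, and `top_n=15` would correspond to the number of hub genes reported here; other centrality measures could give a different ranking.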
Conclusions: Our results indicate that the occurrence of 46,XY, DSD is attributable to a variety of genes. Bioinformatic analysis can help predict the hub genes and identify the core MCODE module within the network.
Implications: Bioinformatic predictions may provide a novel perspective for better understanding the pathogenesis of 46,XY, DSD.