Evolutionary Landscape of SOX Genes to Inform Genotype-to-Phenotype Relationships

Genes (Basel). 2023 Jan 14;14(1):222. doi: 10.3390/genes14010222.

Abstract

The SOX transcription factor family is pivotal in controlling aspects of development. To identify genotype-phenotype relationships of SOX proteins, we performed an unbiased study of SOX using 1890 open reading frame and 6667 amino acid sequences in combination with structural dynamics to interpret 3999 gnomAD, 485 ClinVar, 1174 Geno2MP, and 4313 COSMIC human variants. Within the HMG (High Mobility Group)-box, we identified twenty-seven amino acids with changes in multiple SOX proteins annotated to clinical pathologies. These sites were screened through Geno2MP medical phenotypes, revealing novel SOX15 R104G associated with musculature abnormality and SOX8 R159G associated with intellectual disability. Within gnomAD, SOX18 E137K (rs201931544), located within the HMG-box and found in ~0.8% of Latinx individuals, is associated with seizures and neurological complications, potentially through blood-brain barrier alterations. A total of 56 highly conserved variants were found at sites outside the HMG-box, including several within the SOX2 HMG-box-flanking region with neurological associations; several in the SOX9 dimerization region associated with Campomelic Dysplasia; SOX14 K88R (rs199932938), flanking the HMG-box and associated with cardiovascular complications within European populations; and SOX7 A379V (rs143587868), within a SOXF-conserved far C-terminal domain, heterozygous in 0.716% of African individuals and associated with eye phenotypes. This SOX data compilation builds a robust genotype-to-phenotype association for a gene family through more comprehensive ortholog data integration.

Keywords: SOX genes; developmental biology; paralog mapping; transcription factor; variant data integration.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Dimerization
  • Genotype
  • High Mobility Group Proteins* / chemistry
  • High Mobility Group Proteins* / genetics
  • High Mobility Group Proteins* / metabolism
  • Humans
  • SOX Transcription Factors* / genetics
  • SOXB2 Transcription Factors / genetics
  • SOXB2 Transcription Factors / metabolism
  • SOXE Transcription Factors / genetics
  • SOXF Transcription Factors / genetics
  • SOXF Transcription Factors / metabolism

Substances

  • High Mobility Group Proteins
  • SOX Transcription Factors
  • SOX7 protein, human
  • SOXF Transcription Factors
  • SOX18 protein, human
  • SOX14 protein, human
  • SOXB2 Transcription Factors
  • SOX8 protein, human
  • SOXE Transcription Factors

Associated data

  • figshare/10.6084/m9.figshare.14544339
  • figshare/10.6084/m9.figshare.14544063
  • figshare/10.6084/m9.figshare.14544219