Experimental determination and data-driven prediction of homotypic transmembrane domain interfaces

Comput Struct Biotechnol J. 2020 Oct 7:18:3230-3242. doi: 10.1016/j.csbj.2020.09.035. eCollection 2020.

Abstract

Interactions between the transmembrane domains (TMDs) of single-pass membrane proteins frequently support their assembly into non-covalent complexes. Yet, the TMD-TMD interactome remains largely uncharted. With a view to predicting homotypic TMD-TMD interfaces from primary structure, we performed a systematic analysis of their physical and evolutionary properties. To this end, we generated a dataset of 50 self-interacting TMDs. This dataset contains interfaces of nine TMDs from bitopic human proteins (Ire1, Armcx6, Tie1, ATP1B1, PTPRO, PTPRU, PTPRG, DDR1, and Siglec7) that were experimentally identified here and combined with literature data. We show that interfacial residues of these homotypic TMD-TMD interfaces tend to be more conserved, coevolved and polar than non-interfacial residues. Further, we suggest for the first time that interface positions are deficient in β-branched residues and are likely to be located deep in the hydrophobic core of the membrane. Overrepresentation of the GxxxG motif at interfaces is strong, but that of (small)xxx(small) motifs is weak. The multiplicity of these features and the individual character of TMD-TMD interfaces, as uncovered here, prompted us to train a machine learning algorithm. The resulting prediction method, THOIPA (www.thoipa.org), excels in the prediction of key interface residues from evolutionary sequence data.

Keywords: Co-evolution; GxxxG; Machine learning; Protein-protein interaction; TMD interactions; Transmembrane.
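
The sequence-level features named in the abstract (GxxxG and (small)xxx(small) motifs, β-branched residue content) can be illustrated with a minimal Python sketch. This is not the THOIPA implementation. The function names, the toy sequence, and the choice of Gly, Ala, and Ser as the "small" residue set are assumptions made for illustration only; Val, Ile, and Thr are the standard β-branched amino acids.

```python
import re

# Assumption: "small" residues taken as Gly, Ala, Ser; the paper's exact set may differ.
SMALL = "GAS"
# Val, Ile, and Thr are the beta-branched amino acids.
BETA_BRANCHED = "VIT"

GXXXG = r"G.{3}G"
SMALL_XXX_SMALL = rf"[{SMALL}].{{3}}[{SMALL}]"


def find_motifs(seq, pattern):
    """Return 0-based (i, i+4) index pairs for every match of a 5-residue motif.

    A zero-width lookahead is used so that overlapping motifs are all reported.
    """
    return [(m.start(), m.start() + 4)
            for m in re.finditer(f"(?={pattern})", seq)]


def beta_branched_fraction(seq, positions):
    """Fraction of the residues at the given 0-based positions that are beta-branched."""
    positions = list(positions)
    if not positions:
        return 0.0
    hits = sum(1 for i in positions if seq[i] in BETA_BRANCHED)
    return hits / len(positions)


# Hypothetical toy sequence, not taken from the dataset described above.
toy_tmd = "LLGALLGVLLSILLA"

gxxxg_hits = find_motifs(toy_tmd, GXXXG)
small_hits = find_motifs(toy_tmd, SMALL_XXX_SMALL)
print("GxxxG:", gxxxg_hits)
print("(small)xxx(small):", small_hits)
print("beta-branched fraction (whole TMD):",
      beta_branched_fraction(toy_tmd, range(len(toy_tmd))))
```

Every GxxxG match is also a (small)xxx(small) match under this definition, so a comparison of the two motif classes would need to count them separately. Features like these would serve only as inputs alongside conservation and coevolution scores, which require a multiple sequence alignment and are not sketched here.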