PLM-GAN: A Large-Scale Protein Loop Modeling Using pix2pix GAN

ACS Omega. 2023 Dec 15;9(1):437-446. doi: 10.1021/acsomega.3c05863. eCollection 2024 Jan 9.

Abstract

Determining the tertiary structure of a protein is important because the structure reveals the protein's properties and functions. These three-dimensional configurations are stabilized by diverse interactions, including ionic interactions, hydrophobic interactions, and disulfide bonds. In some cases a structure has missing regions, so specific segments must be reconstructed; this creates challenges in protein design tasks such as loop modeling, circular permutation, and interface prediction. To address this problem, we present two models: a pix2pix generative adversarial network (GAN) and PLM-GAN. The pix2pix GAN model generates and inpaints distance matrices of protein structures, whereas PLM-GAN builds on the pix2pix GAN model by incorporating residual blocks into the U-Net network of the GAN. To improve the models' performance, we introduce a novel loss function within the GAN framework, named the "missing to real regions loss" (LMTR). We also introduce an approach that pairs two distance matrices: one representing the native protein structure and the other representing the same structure with a missing region that changes in each successive epoch. Moreover, we extend the reconstruction of missing regions to up to 30 amino acids and increase the protein length by 128 amino acids. Evaluation of the pix2pix GAN and PLM-GAN models on a random selection of natural proteins (4ZCB, 3FJB, and 2REZ) gave promising experimental results. These models contribute to addressing challenges in protein structure design and could support advances in protein-protein interactions, drug design, and protein engineering.
Data, code, trained models, examples, and measurements are available at https://github.com/mena01/PLM-GAN-A-Large-Scale-Protein-Loop-Modeling-Using-pix2pix-GAN_.
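
The pairing of a native distance matrix with a copy whose missing region changes every epoch can be illustrated with a minimal sketch. This is not the authors' code: the function names, the choice of blanking whole rows and columns for the missing residues, the fill value of 0, and the toy coordinates are all assumptions made for illustration. The abstract states only that the missing region spans up to 30 amino acids and changes in each successive epoch.

```python
import math
import random


def distance_matrix(coords):
    """Pairwise Euclidean distances between residue coordinates (e.g. C-alpha atoms)."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def mask_region(dmat, start, length, fill=0.0):
    """Return a copy of dmat with rows/columns start..start+length-1 set to fill.

    Blanking full rows and columns is an assumed masking scheme, not one
    confirmed by the paper.
    """
    n = len(dmat)
    missing = set(range(start, start + length))
    return [
        [fill if (i in missing or j in missing) else dmat[i][j] for j in range(n)]
        for i in range(n)
    ]


def random_pair(dmat, max_gap=30, rng=random):
    """Build one (masked, native) pair with a freshly drawn missing region.

    Requires at least two residues so that some of the matrix stays visible.
    """
    n = len(dmat)
    length = rng.randint(1, min(max_gap, n - 1))  # gap of at most max_gap residues
    start = rng.randint(0, n - length)            # gap must fit inside the chain
    return mask_region(dmat, start, length), dmat, (start, length)


# Toy backbone: 8 residues on a helix-like curve (made-up coordinates).
coords = [(math.cos(i), math.sin(i), 1.5 * i) for i in range(8)]
native = distance_matrix(coords)

# A new missing region is drawn for every epoch, while the target stays fixed.
for epoch in range(3):
    masked, target, (start, length) = random_pair(
        native, max_gap=3, rng=random.Random(epoch)
    )
    print(f"epoch {epoch}: residues {start}..{start + length - 1} masked")
```

In a real pipeline the coordinates would come from a structure file and the matrices would typically be stored as arrays of a fixed size; the list-of-lists form here only keeps the sketch dependency-free.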