The evolution of structural genomics

Biophys Rev. 2022 Dec 15;14(6):1247-1253. doi: 10.1007/s12551-022-01031-8. eCollection 2022 Dec.

Abstract

Structural genomics began in the 1990s as a global effort to determine the tertiary structures of all protein families, in response to large-scale genome sequencing projects. The immediate outcome was an influx of tens of thousands of protein structures, many of unknown function. At the time, the value of structural genomics was controversial. However, the structures themselves were only the most obvious output. These newly solved structures also motivated huge data science and infrastructure efforts, which, together with advances in deep learning, have brought about a revolution in computational molecular biology. Here, we review some of the computational research carried out at the Protein Data Bank Japan (PDBj) during the Protein 3000 project under the leadership of Haruki Nakamura, much of which continues to flourish today.

Keywords: Deep learning; Fold space; Protein Data Bank; Protein structure prediction; Structural alignment; Structural genomics.