Sorting protein decoys by machine-learning-to-rank

Xiaoyang Jing; Kai Wang; Ruqian Lu; Qiwen Dong

doi:10.1038/srep31571

Sci Rep. 2016 Aug 17;6:31571. doi: 10.1038/srep31571.

Authors

Xiaoyang Jing¹, Kai Wang², Ruqian Lu¹, Qiwen Dong³

Affiliations

¹ School of Computer Science, Fudan University, Shanghai 200433, People's Republic of China.
² College of Animal Science and Technology, Jilin Agricultural University, Changchun 130118, People's Republic of China.
³ Institute for Data Science and Engineering, East China Normal University, Shanghai 200062, People's Republic of China.

Abstract

Much progress has been made in protein structure prediction over the last few decades. Because predicted models span a broad range of accuracy, reliable quality estimation has become one of the key elements of successful protein structure prediction. Over the past years, a number of methods have been developed to address this issue; they can be roughly divided into three categories: single-model methods, clustering-based methods, and quasi single-model methods. In this study, we first develop a single-model method, MQAPRank, based on a learning-to-rank algorithm, and then implement a quasi single-model method, Quasi-MQAPRank. The proposed methods are benchmarked on the 3DRobot and CASP11 datasets. Five-fold cross-validation on the 3DRobot dataset shows that the proposed single-model method outperforms the other methods whose outputs are taken as its features, and that the quasi single-model method further improves performance. On the CASP11 dataset, the proposed methods also perform well compared with other leading methods in the corresponding categories. In particular, Quasi-MQAPRank performs strongly on the CASP11 Best150 dataset.
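
The abstract does not spell out which learning-to-rank algorithm is used, so the following is only a minimal sketch of the general idea: scores from existing quality-assessment methods serve as features, and a linear model is trained on decoy pairs so that the better decoy of each pair receives the higher score. The pairwise logistic loss, the function names (`train_pairwise_ranker`, `rank_decoys`), the two feature columns, and the toy quality values are all assumptions made for illustration; none of them are taken from the paper.

```python
import math
import random


def train_pairwise_ranker(features, quality, epochs=200, lr=0.1, seed=0):
    """Fit linear weights w so that w.x_i > w.x_j whenever quality[i] > quality[j].

    Assumption: a plain pairwise logistic loss trained by SGD, used here as a
    generic stand-in for a learning-to-rank algorithm.
    """
    rng = random.Random(seed)
    w = [0.0] * len(features[0])
    # Ordered pairs (better, worse). In practice pairs would be formed only
    # among decoys of the same target protein.
    n = len(features)
    pairs = [(i, j) for i in range(n) for j in range(n) if quality[i] > quality[j]]
    for _ in range(epochs):
        rng.shuffle(pairs)
        for i, j in pairs:
            diff = [a - b for a, b in zip(features[i], features[j])]
            margin = sum(wk * dk for wk, dk in zip(w, diff))
            margin = max(-50.0, min(50.0, margin))  # keep exp() in a safe range
            # Negative gradient of log(1 + exp(-margin)) with respect to margin.
            step = 1.0 / (1.0 + math.exp(margin))
            w = [wk + lr * step * dk for wk, dk in zip(w, diff)]
    return w


def rank_decoys(w, features):
    """Return decoy indices sorted from best to worst predicted quality."""
    scores = [sum(wk * xk for wk, xk in zip(w, x)) for x in features]
    return sorted(range(len(features)), key=lambda i: scores[i], reverse=True)


# Toy data (hypothetical): four decoys of one target, each described by the
# outputs of two made-up single-model scoring methods (higher is better).
features = [[0.9, 0.8], [0.7, 0.75], [0.5, 0.4], [0.2, 0.3]]
# Hypothetical true quality of each decoy (for example a similarity score to
# the native structure), used only to define which decoy of a pair is better.
quality = [0.85, 0.70, 0.45, 0.20]

w = train_pairwise_ranker(features, quality)
order = rank_decoys(w, features)
print("learned weights:", w)
print("decoys ranked best to worst:", order)
```

A pairwise formulation fits decoy sorting because the goal is the relative order of models for a target, not an exact prediction of each model's absolute quality.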

MeSH terms

Algorithms
Machine Learning*
Models, Molecular
Protein Transport*
Proteins / chemistry*

Substances

Proteins