Loan default prediction of Chinese P2P market: a machine learning methodology

Junhui Xu; Zekai Lu; Ying Xie

doi:10.1038/s41598-021-98361-6

Sci Rep. 2021 Sep 21;11(1):18759. doi: 10.1038/s41598-021-98361-6.

Authors

Junhui Xu¹, Zekai Lu¹, Ying Xie²

Affiliations

¹ Department of Sociology, School of Public Administration, Guangzhou University, Guangzhou, 510006, China.
² Department of Sociology, School of Public Administration, Guangzhou University, Guangzhou, 510006, China. xysoc@gzhu.edu.cn.

Abstract

Borrowers' repayment failures have greatly affected the sustainable development of the peer-to-peer (P2P) lending industry. Recent literature suggests that existing risk evaluation systems may ignore important signals and risk factors affecting P2P repayment. In this study, we applied four machine learning methods, namely random forest (RF), extreme gradient boosting tree (XGBT), gradient boosting model (GBM), and neural network (NN), to identify important factors affecting repayment, using data from Renrendai.com in China covering January 1, 2015, to June 30, 2015. The results showed that borrowers who had passed video, mobile phone, job, residence or education level verification were more likely to default, whereas those who had passed identity and asset certification were less likely to default. The accuracy and kappa values of all four methods exceeded 90%, and RF outperformed the other classification models. Our findings demonstrate important techniques for borrower screening by P2P companies and risk regulation by regulatory agencies. Our methodology and findings will help regulators, banks and creditors combat the financial disasters caused by the coronavirus disease 2019 (COVID-19) pandemic by addressing various financial risks and translating credit scoring improvements into practice.
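The abstract reports both accuracy and kappa for each classifier. As a minimal sketch of how these two metrics relate, the following Python computes them from true and predicted labels. The function name `accuracy_and_kappa` and the ten example labels (1 = default, 0 = repaid) are hypothetical and are not taken from the study's data or code; kappa here is the standard Cohen's kappa, which the abstract does not spell out.

```python
from collections import Counter


def accuracy_and_kappa(y_true, y_pred):
    """Return (accuracy, Cohen's kappa) for two equal-length label sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)

    # Observed agreement: the share of loans whose label was predicted correctly.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n

    # Chance agreement: for each class, the product of its marginal frequencies
    # in the true labels and in the predictions, summed over classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)

    if expected == 1.0:
        # Both sequences contain a single identical class; kappa is undefined.
        return observed, float("nan")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical, imbalanced toy labels: 1 = default, 0 = repaid.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

acc, kappa = accuracy_and_kappa(y_true, y_pred)
print(f"accuracy = {acc:.3f}, kappa = {kappa:.3f}")
```

On this toy data the accuracy is 0.9 while kappa is about 0.615, because kappa discounts the agreement that the class imbalance alone would produce. This is one reason to report kappa alongside accuracy when defaults are much rarer than repayments.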

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

COVID-19 / epidemiology
COVID-19 / virology
China / epidemiology
Financial Management
Financing, Personal / economics*
Financing, Personal / standards
Humans
Internet
Machine Learning*
Pandemics
Risk Factors
SARS-CoV-2 / isolation & purification

Grants and funding

GD20CGL40/Social Sciences Federation of Guangdong in China