Machine Learning Assisted Prediction of Power Conversion Efficiency of All-Small Molecule Organic Solar Cells: A Data Visualization and Statistical Analysis

Molecules. 2022 Sep 11;27(18):5905. doi: 10.3390/molecules27185905.

Abstract

Organic solar cells are attractive because they can be fabricated by low-cost solution processing. Their industrialization requires the rapid design of efficient materials, which in turn requires testing a large number of candidate materials. Machine learning is a better option than testing alone because it predicts power conversion efficiencies more cheaply. In the present work, machine learning was used to predict power conversion efficiencies. Experimental data were collected from the literature to train the machine learning models. A detailed data visualization analysis was performed to study the trends in the dataset. The relationship between descriptors and power conversion efficiency was quantified using Pearson correlations, and the importance of individual features was determined using feature importance analysis. More than 10 machine learning models were evaluated, and only the two best models (random forest regressor and bagging regressor) were selected for further analysis. Both models showed high predictive ability: the coefficient of determination (R²) values for the random forest regressor and bagging regressor models were 0.892 and 0.887, respectively. The Shapley additive explanation (SHAP) method was used to identify the impact of descriptors on the output of the models.
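As an illustration of the two statistics named in the abstract, the following is a minimal sketch of the Pearson correlation coefficient and the coefficient of determination (R²), written with the Python standard library only. The function names `pearson_r` and `r_squared` and the toy values are assumptions made for illustration; they are not the authors' code or data, and the abstract does not state which software was used.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations.
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1 - ss_res / ss_tot


# Toy values, invented purely for illustration (not taken from the paper).
descriptor = [0.1, 0.4, 0.5, 0.9]
measured_pce = [2.0, 5.0, 6.5, 9.0]
predicted_pce = [2.5, 4.5, 6.0, 9.5]

print("Pearson r (descriptor vs. PCE):", pearson_r(descriptor, measured_pce))
print("R^2 (predicted vs. measured PCE):", r_squared(measured_pce, predicted_pce))
```

An R² of 1 corresponds to predictions that match the measured values exactly, while an R² of 0 corresponds to a model that always predicts the mean of the measured values.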

Keywords: Pearson correlation; machine learning; random forest regressor; small molecule donors.

MeSH terms

  • Data Visualization*
  • Machine Learning*
  • Research Design