Solvent selection for polymers enabled by generalized chemical fingerprinting and machine learning

Phys Chem Chem Phys. 2022 Nov 9;24(43):26547-26555. doi: 10.1039/d2cp03735a.

Abstract

We present machine learning models trained on experimental data to predict room-temperature solubility for any polymer-solvent pair. The new models are a significant advance over past data-driven work in terms of protocol, validity, and versatility. A generalizable fingerprinting method is used for both the polymers and the solvents, making it possible, in principle, to handle any polymer-solvent combination. Our data-driven approach achieves high accuracy when either both the polymer and the solvent, or just the polymer, have been seen during the training phase. Model performance is modest, however, when the solvent in a newly queried polymer-solvent pair is not part of the training set. This is likely because the number of unique solvents in our data set is small (much smaller than the number of polymers). Nevertheless, as the data set grows, and especially as the solvent set becomes more diverse, the overall predictive performance is expected to improve.
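
The distinction drawn above between seen and unseen polymers and solvents can be made concrete with a small sketch. This is only an illustration, not the authors' protocol: the function names `leave_solvents_out` and `classify_query`, the category labels, and the example polymer and solvent names are all hypothetical, and no solubility values are implied for any pair.

```python
from typing import Iterable, List, Set, Tuple

# A data point is identified here only by its (polymer, solvent) names.
Pair = Tuple[str, str]


def leave_solvents_out(pairs: List[Pair], held_out: Set[str]) -> Tuple[List[Pair], List[Pair]]:
    """Split pairs so that every held-out solvent appears only in the test set.

    Hypothetical helper: one simple way to build an "unseen solvent" test set.
    """
    train = [pair for pair in pairs if pair[1] not in held_out]
    test = [pair for pair in pairs if pair[1] in held_out]
    return train, test


def classify_query(pair: Pair, train_pairs: Iterable[Pair]) -> str:
    """Report which components of a queried pair occur in the training pairs."""
    train_list = list(train_pairs)
    seen_polymers = {polymer for polymer, _ in train_list}
    seen_solvents = {solvent for _, solvent in train_list}

    polymer, solvent = pair
    polymer_seen = polymer in seen_polymers
    solvent_seen = solvent in seen_solvents

    if polymer_seen and solvent_seen:
        return "both seen"
    if polymer_seen:
        return "polymer seen, solvent unseen"
    if solvent_seen:
        return "solvent seen, polymer unseen"
    return "both unseen"


# Illustrative names only; these are not records from the paper's data set.
example_pairs: List[Pair] = [
    ("PS", "toluene"),
    ("PS", "water"),
    ("PMMA", "toluene"),
    ("PMMA", "acetone"),
    ("PVA", "water"),
]

train_pairs, test_pairs = leave_solvents_out(example_pairs, {"acetone"})
for query in test_pairs:
    print(query, "->", classify_query(query, train_pairs))
```

Reporting accuracy separately for each category, rather than as one pooled figure, is what allows a statement like the one in the abstract, where performance depends on whether the solvent was present during training.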