Big data collection in pharmaceutical manufacturing and its use for product quality predictions

Sci Data. 2022 Mar 23;9(1):99. doi: 10.1038/s41597-022-01203-x.

Abstract

Advances in data science and digitalization are transforming the world, and the pharmaceutical industry is no exception. Multiple sensor-equipped manufacturing processes and laboratory analyses are the main sources of primary data, which were used to build the presented dataset of 1005 actual production batches of a selected medicine. The dataset includes incoming raw material quality results, compression process time series, and final product quality results for the selected product. The data are highly valuable because they provide insight into the process trajectory at 10-second intervals for each of the 1005 batches, along with product quality results collected over several years. The dataset therefore offers an opportunity to develop advanced analysis models and procedures that would allow current conventional, time-consuming laboratory testing to be omitted. The benefits for both industry and patients are clear: shorter product lead times and lower manufacturing costs.