Approaches to text mining for analyzing treatment plan of quit smoking with free-text medical records: A PRISMA-compliant meta-analysis

Medicine (Baltimore). 2020 Jul 17;99(29):e20999. doi: 10.1097/MD.0000000000020999.

Abstract

Background: Smoking is a complex behavior associated with multiple factors such as personality, environment, genetics, and emotions. Text data are a rich source of information. However, pure text data requires substantial human resources and time to extract and apply the knowledge, resulting in many details not being discovered and used. This study proposes a novel approach that explores a text mining flow to capture the behavior of smokers quitting tobacco from their free-text medical records. More importantly, the paper examines the impact of these changes on smokers. The goal is to help smokers quit smoking. The study population included adult patients that were >20 years old of age who consulted the medical center's smoking cessation outpatient clinic from January to December 2016. A total of 246 patients visited the clinic in the study period. After excluding incomplete medical records or lost follow up, there were 141 patients included in the final analysis. There are 141 valid data points for patients who only treated once and patients with empty medical records. Two independent review authors will make the study selection based on the study eligibility criteria. Our participants are from all the patients that were involved in this study and the staff of Division of Family Medicine, National Taiwan University Hospital. Interventions and study appraisal are not required.

Methods: The paper develops an algorithm for analyzing smoking cessation treatment plans documented in free-text medical records. The approach involves the development of an information extraction flow that uses a combination of data mining techniques, including text mining. It can use not only to help others quit smoking but also for other medical records with similar data elements. The Apriori associations of our algorithm from the text mining revealed several important clinical implications for physicians during smoking cessation. For example, an apparent association between nicotine replacement therapy (NRT) and other medications such as Inderal, Rivotril, Dogmatyl, and Solaxin. Inderal and Rivotril use in patients with anxiety disorders as anxiolytics frequently.

Results: Finally, we find that the rules associating with NRT combination with blood tests may imply that the use of NRT combination therapy in smokers with chronic illness may result in lower abstinence. Further large-scale surveys comparing varenicline or bupropion with NRT combination in smokers with a chronic disease are warranted. The Apriori algorithm suffers from some weaknesses despite being transparent and straightforward. The main limitation is the costly wasting of time to hold a vast number of candidates sets with frequent itemsets, low minimum support, or large itemsets.

Conclusion: In the paper, the most visible areas for the therapeutic application of text mining are the integration and transfer of advances made in basic sciences, as well as a better understanding of the processes involved in smoking cessation. Text mining may also be useful for supporting decision-making processes associated with smoking cessation. Systematic review registration number is not registered.

Publication types

  • Meta-Analysis

MeSH terms

  • Algorithms
  • Data Mining / methods*
  • Electronic Health Records*
  • Humans
  • Retrospective Studies
  • Smoking Cessation*
  • Taiwan
  • Tobacco Use Cessation Devices