Semiautomated text analytics for qualitative data synthesis

Res Synth Methods. 2019 Sep;10(3):452-464. doi: 10.1002/jrsm.1361. Epub 2019 Jul 9.

Abstract

Approaches to synthesizing qualitative data have, to date, largely focused on integrating the findings from published reports. However, developments in text mining software offer the potential for efficient analysis of large pooled primary qualitative datasets. This case study aimed to (a) provide a step-by-step guide to using one software application, Leximancer, and (b) interrogate opportunities and limitations of the software for qualitative data synthesis. We applied Leximancer v4.5 to a pool of five qualitative, UK-based studies on modes of transportation such as walking, cycling, and driving, and displayed the findings of the automated content analysis as intertopic distance maps. Leximancer enabled us to "zoom out" to familiarize ourselves with, and gain a broad perspective of, the pooled data. It indicated which studies clustered around dominant topics such as "people." The software also enabled us to "zoom in" to narrow the perspective to specific subgroups and lines of inquiry. For example, "people" featured in both men's and women's narratives but were talked about differently, with men mentioning "kids" and "old," whereas women mentioned "things" and "stuff." The approach provided us with a fresh lens for the initial inductive step in the analysis process and could guide further exploration. The limitations of using Leximancer were the substantial data preparation time involved and the contextual knowledge required from the researcher to turn lines of inquiry into meaningful insights. In summary, Leximancer is a useful tool for contributing to qualitative data synthesis, facilitating comprehensive and transparent data coding, but it can only inform, not replace, researcher-led interpretive work.
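The "zoom in" comparison described above (which words accompany a topic such as "people" in different subgroups) can be illustrated with a minimal co-occurrence count. This is only a sketch under stated assumptions: it is not Leximancer's algorithm or interface, the function name `cooccurring_terms`, the group labels, and the stopword list are hypothetical, and the toy segments are invented for illustration rather than taken from the pooled studies.

```python
from collections import Counter
import re


def cooccurring_terms(segments, target, stopwords=frozenset()):
    """Count words that appear in the same text segment as `target`.

    Each segment contributes at most one count per distinct word.
    """
    counts = Counter()
    for segment in segments:
        # Lowercase and split into simple word tokens (a crude tokenizer).
        tokens = set(re.findall(r"[a-z']+", segment.lower()))
        if target in tokens:
            counts.update(tokens - {target} - stopwords)
    return counts


# Hypothetical toy segments, invented for illustration only (not study data).
toy_pool = {
    "group_a": ["people with kids drive more", "old people walk"],
    "group_b": ["people carry things", "people have stuff to carry"],
}
STOPWORDS = frozenset({"with", "to", "have", "more"})

# One co-occurrence profile per subgroup for the topic word "people".
by_group = {
    group: cooccurring_terms(segments, "people", STOPWORDS)
    for group, segments in toy_pool.items()
}

for group, counts in by_group.items():
    print(group, counts.most_common(3))
```

Comparing the per-group counters side by side gives a rough analogue of narrowing the view to a subgroup. Deciding what the differences mean remains interpretive work for the researcher.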

Keywords: data pooling; machine learning; qualitative data synthesis; secondary analysis; social practice; text analytics; text mining.

MeSH terms

  • Algorithms
  • Data Accuracy
  • Data Mining / methods*
  • Data Science / methods*
  • Databases, Factual
  • Female
  • Humans
  • Machine Learning
  • Male
  • Normal Distribution
  • Pattern Recognition, Automated*
  • Qualitative Research*
  • Software
  • United Kingdom