A multi-technique approach to bridge electronic case report form design and data standard adoption

J Biomed Inform. 2015 Feb:53:49-57. doi: 10.1016/j.jbi.2014.08.013. Epub 2014 Sep 6.

Abstract

Background and objective: The importance of data standards when integrating clinical research data has been recognized. The common data element (CDE) is a consensus-based data element for data harmonization and sharing between clinical researchers; it can support data standards adoption and mapping. However, the lack of a suitable methodology has become a barrier to data standard adoption. Our aim was to demonstrate an approach that allows clinical researchers to design electronic case report forms (eCRFs) that comply with the data standard.

Methods: We used a multi-technique approach, combining information retrieval, natural language processing and an ontology-based knowledgebase, to facilitate data standard adoption during eCRF design. The approach took research questions as query texts, with the aim of retrieving relevant CDEs and associating them with the research questions.
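The retrieval step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the retrieval model, so TF-IDF weighting with cosine similarity is assumed here as one common choice, and the ontology-based knowledgebase is left out entirely. The CDE names and definitions in `CDES`, and the function names, are hypothetical and invented for illustration only.

```python
import math
import re
from collections import Counter

# Hypothetical CDE catalogue: names and definitions are invented for illustration.
CDES = {
    "AgeVal": "age of the participant in years",
    "TremorInd": "indicator of whether the participant has resting tremor",
    "MedicationName": "name of the medication the participant is taking",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_idf(docs):
    """Compute a smoothed inverse document frequency for every token."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + cnt)) + 1.0 for tok, cnt in df.items()}


def vectorize(tokens, idf):
    """Build a sparse TF-IDF vector, ignoring tokens unseen in the catalogue."""
    tf = Counter(tokens)
    return {tok: count * idf[tok] for tok, count in tf.items() if tok in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(tok, 0.0) for tok, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest_cdes(question, cdes=CDES, top_k=2):
    """Rank CDEs by similarity to the question text and return the best names."""
    docs = {name: tokenize(definition) for name, definition in cdes.items()}
    idf = build_idf(list(docs.values()))
    query = vectorize(tokenize(question), idf)
    scored = [(cosine(query, vectorize(toks, idf)), name) for name, toks in docs.items()]
    scored.sort(reverse=True)
    return [name for score, name in scored[:top_k] if score > 0.0]


suggestions = suggest_cdes("What medication is the patient taking?")
no_match = suggest_cdes("xyz")
```

In this sketch a question that shares no vocabulary with any CDE definition yields an empty list, which loosely corresponds to the case where a form builder offers no suggestion for a question.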

Results: The approach was implemented using a CDE-based eCRF builder, which was evaluated using CDE-related questions from CRFs used in the Parkinson Disease Biomarker Program, as well as CDE-unrelated questions from a technique support website. Our approach had a precision of 0.84, a recall of 0.80, an F-measure of 0.82 and an error of 0.31. For the 303 CDE-related test questions, our approach responded with suggested CDEs for 88.8% (269/303) of the study questions, with an accuracy of 90.3% (243/269). The reasons for missed and failed responses were also analyzed.
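The reported figures can be cross-checked with a short calculation. The sketch below assumes the F-measure is the balanced harmonic mean of precision and recall (the usual F1 definition; the abstract does not state the formula) and recomputes the response rate and accuracy from the counts given. The error value of 0.31 is not recomputed, because the abstract does not define it.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values taken from the Results paragraph.
f1 = f_measure(0.84, 0.80)

# Share of the 303 CDE-related questions that received suggested CDEs.
response_rate = 269 / 303

# Share of the 269 responses that were judged accurate.
accuracy = 243 / 269

print(f"F-measure: {f1:.2f}")
print(f"Response rate: {response_rate:.1%}")
print(f"Accuracy: {accuracy:.1%}")
```

Under that assumption, the recomputed values round to the figures stated in the abstract.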

Conclusion: This study demonstrates an approach that helps to cross the barrier that inhibits data standard adoption in eCRF building, and our evaluation shows that the approach has satisfactory performance. Our CDE-based form builder provides an alternative perspective on data standard-compliant eCRF design.

Keywords: Case report form; Common data elements; Data standard; Natural language processing; Ontology-based knowledgebase.

MeSH terms

  • Algorithms
  • Biomarkers / metabolism
  • Biomedical Research / standards*
  • Computational Biology / standards*
  • Computer Systems
  • Humans
  • Information Storage and Retrieval / methods*
  • Models, Statistical
  • Natural Language Processing*
  • Parkinson Disease / metabolism
  • Reproducibility of Results
  • Research Design
  • Software

Substances

  • Biomarkers