ShExML: improving the usability of heterogeneous data mapping languages for first-time users

PeerJ Comput Sci. 2020 Nov 23;6:e318. doi: 10.7717/peerj-cs.318. eCollection 2020.

Abstract

Integration of heterogeneous data sources in a single representation is an active field with many different tools and techniques. In the case of text-based approaches (those that base the definition of the mappings and the integration on a DSL), there is a lack of usability studies. In this work we conducted a usability experiment (n = 17) on three different languages: ShExML (our own language), YARRRML and SPARQL-Generate. Results show that ShExML users tend to perform better than those of YARRRML and SPARQL-Generate. This study sheds light on usability aspects of these languages' design and points out some areas for improvement.

Keywords: Data integration; Data mapping; SPARQL-Generate; ShExML; Usability; YARRRML.

Grants and funding

This work has been funded by the Principality of Asturias through the Severo Ochoa call (grant BP17-29), by the Ministry of Economy, Industry and Competitiveness under the call of “Programa Estatal de I+D+i Orientada a los Retos de la Sociedad” (project TIN2017-88877-R), the CPER Nord-Pas de Calais/FEDER DATA Advanced data science and technologies 2015–2020, and the ANR project DataCert ANR-15-CE39-0009. There was no additional external funding received for this study. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.