A domain-specific language for managing ETL processes

PeerJ Comput Sci. 2024 Jan 26:10:e1835. doi: 10.7717/peerj-cs.1835. eCollection 2024.

Abstract

Maintenance of Data Warehouse (DW) systems is a critical task because any downtime or data loss can have significant consequences for business applications. Existing DW maintenance solutions mostly rely on concrete technologies and tools that depend on the platform on which the DW system was created; the specific data extraction, transformation, and loading (ETL) tool; and the database language the DW uses. Using different languages for different versions of DW systems makes organizing DW processes difficult, because even minimal changes in the structure require major changes in the application code that manages ETL processes. This article proposes a domain-specific language (DSL) for ETL process management that mitigates these problems by centralizing all program logic and making it independent of any particular platform, which simplifies DW system maintenance. The platform-independent language also makes it easier to create a unified environment for controlling DW processes, regardless of the language, environment, or ETL tool the DW uses.
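
The idea of keeping ETL logic in one platform-independent description and deriving platform-specific code from it can be illustrated with a minimal sketch. This is not the DSL proposed in the article: the `LoadStep` model, the `render` function, and the table and column names are all hypothetical, chosen only to show how a single step definition might be translated for several database platforms that differ in identifier quoting.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LoadStep:
    """Hypothetical platform-independent description of one ETL load step."""
    source: str
    target: str
    columns: Tuple[str, ...]


# Identifier quoting differs by platform; the step definition itself does not.
QUOTES = {
    "postgresql": ('"', '"'),
    "sqlserver": ("[", "]"),
    "mysql": ("`", "`"),
}


def render(step: LoadStep, platform: str) -> str:
    """Translate the platform-independent step into SQL for one platform."""
    open_q, close_q = QUOTES[platform]

    def quote(name: str) -> str:
        return f"{open_q}{name}{close_q}"

    cols = ", ".join(quote(c) for c in step.columns)
    return (
        f"INSERT INTO {quote(step.target)} ({cols}) "
        f"SELECT {cols} FROM {quote(step.source)};"
    )


# One definition of the step (illustrative names), rendered per platform.
step = LoadStep(
    source="staging_sales",
    target="fact_sales",
    columns=("sale_id", "amount"),
)

if __name__ == "__main__":
    for platform in QUOTES:
        print(platform, "->", render(step, platform))
```

In this sketch, a structural change such as an added column is made once in the `LoadStep` definition, and every platform-specific statement follows from it.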

Keywords: Data warehouse; Domain-specific language; Extraction transformation and loading; Model-driven development; Platform-independent models.

Grants and funding

The authors received no funding for this work.