Compositional data for global monitoring: The case of drinking water and sanitation

Sci Total Environ. 2017 Jul 15:590-591:554-565. doi: 10.1016/j.scitotenv.2017.02.220. Epub 2017 Mar 8.

Abstract

Introduction: At a global level, access to safe drinking water and sanitation has been monitored by the Joint Monitoring Programme (JMP) of WHO and UNICEF. The methods employed are based on analysis of data from household surveys and linear regression modelling of these results over time. However, there is evidence of non-linearity in the JMP data, and the compositional nature of these data is not taken into consideration. This article seeks to address both shortcomings in order to produce more accurate estimates.

Methods: We employed an isometric log-ratio (ilr) transformation designed for compositional data. We applied linear and non-linear time regressions to both the original and the transformed data. Specifically, we analysed several modelling alternatives for non-linear trajectories, all of which are based on a generalized additive model (GAM).
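As an illustration of the transformation step, the following is a minimal sketch of an isometric log-ratio transform and its inverse, assuming NumPy and one common choice of orthonormal basis (pivot-style coordinates). The abstract does not say which basis or software the authors used, so the function names (`closure`, `ilr_basis`, `ilr`, `ilr_inverse`) and the three-part example composition are hypothetical. All parts are assumed strictly positive; handling of zeros is not covered here.

```python
import numpy as np

def closure(x):
    """Rescale the parts so that each composition sums to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def ilr_basis(d):
    """Orthonormal basis (d x (d-1)) for pivot-style ilr coordinates.

    This is one common choice of basis, assumed here for illustration.
    """
    v = np.zeros((d, d - 1))
    for i in range(d - 1):
        r = d - i - 1  # number of parts that come after part i
        v[i, i] = np.sqrt(r / (r + 1))
        v[i + 1:, i] = -1.0 / np.sqrt(r * (r + 1))
    return v

def ilr(x):
    """Map a d-part composition to d-1 unconstrained real coordinates."""
    x = closure(x)
    log_x = np.log(x)  # requires strictly positive parts
    clr = log_x - log_x.mean(axis=-1, keepdims=True)  # centred log-ratio
    return clr @ ilr_basis(x.shape[-1])

def ilr_inverse(z):
    """Map ilr coordinates back to a composition that sums to 1."""
    z = np.asarray(z, dtype=float)
    d = z.shape[-1] + 1
    return closure(np.exp(z @ ilr_basis(d).T))

# Hypothetical three-part service ladder (NOT JMP data):
# piped on premises, other improved, unimproved.
shares = np.array([0.55, 0.30, 0.15])
coords = ilr(shares)
recovered = ilr_inverse(coords)
```

Regressions can then be fitted to `coords` over time, and the predictions mapped back with `ilr_inverse`, which always returns positive shares that sum to 1.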

Results and discussion: Non-linear methods, such as GAM, may be used for modelling non-linear trajectories in the JMP data. This projection method is particularly suited for data-rich countries. Moreover, the ilr transformation of compositional data is conceptually sound and fairly simple to implement. It helps improve the performance of both linear and non-linear regression models, particularly when extreme data points occur, i.e. when coverage rates are near either 0% or 100%.
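The behaviour near 100% can be illustrated with a small sketch, assuming NumPy and a two-part composition (covered, not covered), for which the ilr coordinate reduces to a scaled logit. The survey values, the 2030 target year, and the helper names `ilr2` and `ilr2_inverse` are hypothetical and are not taken from the JMP data or from the article.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def ilr2(p):
    """ilr coordinate of the two-part composition (p, 1 - p)."""
    p = np.asarray(p, dtype=float)
    return np.log(p / (1.0 - p)) / SQRT2

def ilr2_inverse(z):
    """Map an ilr coordinate back to a coverage share in (0, 1)."""
    z = np.asarray(z, dtype=float)
    return 1.0 / (1.0 + np.exp(-SQRT2 * z))

# Hypothetical survey estimates (NOT JMP data): coverage share by year.
years = np.array([2000, 2004, 2008, 2012])
coverage = np.array([0.80, 0.86, 0.91, 0.95])
t = years - 2000          # shift the time origin to keep the fit well conditioned
t_target = 2030 - 2000    # hypothetical projection year

# Linear trend fitted directly to the raw shares.
raw_slope, raw_intercept = np.polyfit(t, coverage, 1)
raw_2030 = raw_intercept + raw_slope * t_target

# Linear trend fitted to the ilr coordinate, then back-transformed.
ilr_slope, ilr_intercept = np.polyfit(t, ilr2(coverage), 1)
ilr_2030 = float(ilr2_inverse(ilr_intercept + ilr_slope * t_target))

print(f"raw-space projection for 2030: {raw_2030:.3f}")
print(f"ilr-space projection for 2030: {ilr_2030:.3f}")
```

With these made-up inputs the raw-space line extrapolates above 100% coverage, whereas the back-transformed ilr projection stays strictly between 0 and 1 by construction.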

Keywords: Compositional data; Joint Monitoring Programme (JMP) of WHO and UNICEF; Log transformation; Sanitation and hygiene; Service ladder; Water.