Improved predictive mapping of indoor radon concentrations using ensemble regression trees based on automatic clustering of geological units

J Environ Radioact. 2015 Sep:147:51-62. doi: 10.1016/j.jenvrad.2015.05.006. Epub 2015 May 28.

Abstract

Purpose: According to estimates, around 230 people die as a result of radon exposure in Switzerland. This public health concern calls for reliable indoor radon prediction and mapping methods to improve risk communication to the public. The aim of this study was to develop an automated method to classify lithological units according to their radon characteristics, and to develop mapping and predictive tools that improve local radon prediction.

Method: About 240 000 indoor radon concentration (IRC) measurements in about 150 000 buildings were available for our analysis. The automated classification of lithological units was based on k-medoids clustering of the pairwise Kolmogorov distances between the IRC distributions of lithological units. For IRC mapping and prediction, we used random forests and Bayesian additive regression trees (BART).
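The clustering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the unit names (`unit_A` to `unit_D`), the synthetic log-normal IRC samples, the choice of two clusters, and the simple alternating k-medoids variant with a deterministic farthest-first start are all assumptions made for illustration. The Kolmogorov distance is taken here as the largest absolute gap between two empirical cumulative distribution functions.

```python
import random
from bisect import bisect_right


def kolmogorov_distance(a, b):
    """Largest absolute gap between the empirical CDFs of samples a and b."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    gap = 0.0
    # The gap between two step functions can only change at a data point.
    for x in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def assign(dist, medoids):
    """Label each item with the index of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: dist[i][medoids[c]])
            for i in range(len(dist))]


def k_medoids(dist, k, max_iter=100):
    """Alternating k-medoids on a precomputed distance matrix (assumes k <= n)."""
    n = len(dist)
    # Deterministic start: the most central item, then the item farthest
    # from the medoids chosen so far (an assumption, not from the paper).
    medoids = [min(range(n), key=lambda i: sum(dist[i]))]
    while len(medoids) < k:
        medoids.append(max(range(n), key=lambda i: min(dist[i][m] for m in medoids)))
    for _ in range(max_iter):
        labels = assign(dist, medoids)
        updated = []
        for c, old in enumerate(medoids):
            members = [i for i, lab in enumerate(labels) if lab == c]
            # New medoid: member with the smallest total distance to its cluster.
            # An empty cluster (possible with ties) keeps its old medoid.
            updated.append(min(members, key=lambda m: sum(dist[m][j] for j in members))
                           if members else old)
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(dist, medoids)


# Hypothetical lithological units with synthetic IRC samples (not real data).
rng = random.Random(42)
units = {
    "unit_A": [rng.lognormvariate(5.5, 0.5) for _ in range(200)],
    "unit_B": [rng.lognormvariate(5.5, 0.5) for _ in range(200)],
    "unit_C": [rng.lognormvariate(4.0, 0.5) for _ in range(200)],
    "unit_D": [rng.lognormvariate(4.0, 0.5) for _ in range(200)],
}
names = list(units)
dist = [[kolmogorov_distance(units[p], units[q]) for q in names] for p in names]
medoids, labels = k_medoids(dist, k=2)
clusters = dict(zip(names, labels))
print(clusters)
```

Because k-medoids needs only a distance matrix, it works directly on distances between whole distributions, which is why it pairs naturally with a distribution-level measure such as the Kolmogorov distance.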

Results: The automated classification groups lithological units well in terms of their IRC characteristics. In particular, it clearly reveals the IRC differences within metamorphic rocks such as gneiss. The maps produced by random forests soundly represent the regional differences in IRCs in Switzerland and improve the spatial detail compared to existing approaches. Random forests explained 33% of the variation in the IRC data. Additionally, the variable importance evaluated by random forests shows that building characteristics are less important predictors of IRCs than spatial/geological influences. BART explained 29% of IRC variability and produced maps that indicate the prediction uncertainty.
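To make the "explained variation" figures concrete, the following minimal sketch computes the share of variance explained by a set of predictions. It assumes the reported percentages correspond to the usual coefficient of determination, which the abstract does not state explicitly; the function name and the toy values are hypothetical.

```python
def explained_variance_share(observed, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
r2 = explained_variance_share(observed, predicted)
print(round(r2, 3))
```

Under this reading, a value of 0.33 would mean that the model's predictions account for about one third of the spread in the measured data around its mean.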

Conclusion: Ensemble regression trees are a powerful tool to model and understand the multidimensional influences on IRCs. Automatic clustering of lithological units complements this method by facilitating the interpretation of the radon properties of rock types. This study provides an important element for radon risk communication. Future approaches should consider further variables, such as soil gas radon measurements, as well as more detailed geological information.

Keywords: Bayesian additive regression trees; Indoor radon; Mapping; Predictive modeling; Random forest; k-medoids clustering.

MeSH terms

  • Air Pollutants, Radioactive / analysis*
  • Air Pollution, Indoor / analysis*
  • Air Pollution, Radioactive / analysis*
  • Cluster Analysis
  • Geology
  • Housing
  • Models, Theoretical
  • Radiation Monitoring / methods*
  • Radon / analysis*
  • Regression Analysis
  • Switzerland

Substances

  • Air Pollutants, Radioactive
  • Radon