Topological data analysis (TDA) applied to reveal pedogenetic principles of European topsoil system

Sci Total Environ. 2017 May 15:586:1091-1100. doi: 10.1016/j.scitotenv.2017.02.095. Epub 2017 Feb 16.

Abstract

Recent developments in applied mathematics are bringing new tools that are capable to synthesize knowledge in various disciplines, and help in finding hidden relationships between variables. One such technique is topological data analysis (TDA), a fusion of classical exploration techniques such as principal component analysis (PCA), and a topological point of view applied to clustering of results. Various phenomena have already received new interpretations thanks to TDA, from the proper choice of sport teams to cancer treatments. For the first time, this technique has been applied in soil science, to show the interaction between physical and chemical soil attributes and main soil-forming factors, such as climate and land use. The topsoil data set of the Land Use/Land Cover Area Frame survey (LUCAS) was used as a comprehensive database that consists of approximately 20,000 samples, each described by 12 physical and chemical parameters. After the application of TDA, results obtained were cross-checked against known grouping parameters including five types of land cover, nine types of climate and the organic carbon content of soil. Some of the grouping characteristics observed using standard approaches were confirmed by TDA (e.g., organic carbon content) but novel subtle relationships (e.g., magnitude of anthropogenic effect in soil formation), were discovered as well. The importance of this finding is that TDA is a unique mathematical technique capable of extracting complex relations hidden in soil science data sets, giving the opportunity to see the influence of physicochemical, biotic and abiotic factors on topsoil formation through fresh eyes.

Keywords: Ecology; Land Use/Land Cover Area Frame Survey (LUCAS); Pedology; Soil geography; Soil typology; Topological data analysis (TDA).