Accounting for two-billion tons of stabilized soil carbon

C Wade Ross; Sabine Grunwald; Jason G Vogel; Daniel Markewitz; Eric J Jokela; Timothy A Martin; Rosvel Bracho; Allan R Bacon; Colby W Brungard; Xiong Xiong

doi:10.1016/j.scitotenv.2019.134615

Sci Total Environ. 2020 Feb 10:703:134615. doi: 10.1016/j.scitotenv.2019.134615. Epub 2019 Nov 2.

Affiliations

¹ University of Florida, Soil and Water Sciences Department, 2181 McCarty Hall A, PO Box 110290, Gainesville, FL 32611, USA; New Mexico State University, Department of Plant and Environmental Sciences, MSC 3Q, PO Box 30003, Las Cruces, NM 88003, USA. Electronic address: cwross@nmsu.edu.
² University of Florida, Soil and Water Sciences Department, 2181 McCarty Hall A, PO Box 110290, Gainesville, FL 32611, USA.
³ University of Florida, School of Forest Resources and Conservation, 136 Newins-Ziegler Hall, Gainesville, FL 32611, USA.
⁴ Warnell School of Forestry and Natural Resources, University of Georgia, Athens, GA 30602, USA.
⁵ New Mexico State University, Department of Plant and Environmental Sciences, MSC 3Q, PO Box 30003, Las Cruces, NM 88003, USA.

PMID: 31767338
DOI: 10.1016/j.scitotenv.2019.134615

Abstract

The pedosphere is the largest terrestrial reservoir of organic carbon, yet soil-carbon variability and its representation in Earth system models are a large source of uncertainty for carbon-cycle science and climate projections. Much of this uncertainty is attributed to local- and regional-scale variability, and predicting this variation can be challenging if variable selection is based solely on a priori assumptions, because environmental determinants are scale dependent. Data mining can optimize predictive modeling by allowing machine-learning algorithms to learn from and discover complex patterns in large datasets that may otherwise have gone unnoticed, thus increasing the potential for knowledge discovery. In this analysis, we identify important, regional-scale determinants for top- and subsoil-carbon stabilization in production forestland across the southeastern US. Specifically, we apply recursive feature elimination to a large suite of socio-environmental data to strategically select a parsimonious, yet highly predictive, covariate set. This is achieved by recursively considering smaller and smaller sets of covariates (or features): the estimator is first trained on the full set to obtain feature importance, the least important features are pruned, and the procedure is repeated until a desired number of covariates is identified. We show that although carbon ranges from 0.3 to 8.2 kg m^-2 in the topsoil (0 to 20 cm), and from 0.4 to 17.6 kg m^-2 in the subsoil (20 to 100 cm), this variability is predictably distributed with precipitation, soil moisture, nitrogen and sand content, gamma-ray emissions, mean annual minimum temperature, and elevation. From our spatial predictions, we estimate that 2.6 Pg of soil carbon is currently stabilized in the upper 100 cm of production forestland, which covers 34.7 million ha in the southeastern US.
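The recursive feature elimination loop described in the abstract (train on the full set, rank features by importance, prune the least important, repeat) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not name the estimator, so a plain least-squares linear model stands in for it, with the absolute coefficient used as feature importance. The data are synthetic, and the covariate names, the helper functions `fit_linear` and `recursive_feature_elimination`, and all parameter values are hypothetical.

```python
import random

def fit_linear(X, y, epochs=300, lr=0.05):
    """Least-squares fit by batch gradient descent; returns one weight per column.

    Assumption: a linear model stands in for the (unnamed) estimator.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, target in zip(X, y):
            err = b + sum(wj * xj for wj, xj in zip(w, row)) - target
            grad_b += err
            for j in range(p):
                grad_w[j] += err * row[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w

def recursive_feature_elimination(X, y, names, n_keep, step=1):
    """Refit on the surviving features and prune the weakest until n_keep remain."""
    active = list(range(len(names)))  # column indices still in the model
    while len(active) > n_keep:
        sub = [[row[j] for j in active] for row in X]
        # Importance = |coefficient|; meaningful here because all synthetic
        # features share the same scale (standard normal).
        importance = [abs(v) for v in fit_linear(sub, y)]
        n_drop = min(step, len(active) - n_keep)
        weakest = sorted(range(len(active)), key=lambda k: importance[k])[:n_drop]
        active = [j for k, j in enumerate(active) if k not in weakest]
    return [names[j] for j in active]

# Synthetic data: three informative covariates and three pure-noise ones.
# The names are hypothetical labels, not the study's covariates or data.
random.seed(0)
names = ["precip", "noise_a", "sand", "noise_b", "elevation", "noise_c"]
X = [[random.gauss(0.0, 1.0) for _ in names] for _ in range(200)]
y = [3.0 * r[0] - 2.0 * r[2] + 1.5 * r[4] + random.gauss(0.0, 0.1) for r in X]

selected = recursive_feature_elimination(X, y, names, n_keep=3)
print(selected)
```

Refitting after every pruning step, rather than ranking all features once, lets the importance of the remaining covariates be re-estimated as correlated or redundant ones are removed. With real covariates on different scales, the features would need standardizing before coefficient magnitudes could be compared, or a model with its own importance measure would be used instead.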

Keywords: Carbon cycle; Data mining; Feature selection; Forest soils; Machine learning; Soil carbon.