Improved population mapping for China using remotely sensed and points-of-interest data within a random forests model

Sci Total Environ. 2019 Mar 25:658:936-946. doi: 10.1016/j.scitotenv.2018.12.276. Epub 2018 Dec 19.

Abstract

Remote sensing image products (e.g. brightness of nighttime lights and land cover/land use types) have been widely used to disaggregate census data to produce gridded population maps for large geographic areas. The advent of the geospatial big data revolution has created additional opportunities to map population distributions at fine resolutions with high accuracy. A considerable proportion of the geospatial data contains semantic information that indicates different categories of human activities occurring at exact geographic locations. Such information is often lacking in remote sensing data. In addition, the remarkable progress in machine learning provides toolkits for demographers to model complex nonlinear correlations between population and heterogeneous geographic covariates. In this study, a typical type of geospatial big data, points-of-interest (POIs), was combined with multi-source remote sensing data in a random forests model to disaggregate the 2010 county-level census population data to 100 × 100 m grids. Compared with the WorldPop population dataset, our population map showed higher accuracy. The root mean square error for population estimates in Beijing, Shanghai, Guangzhou, and Chongqing for this method and WorldPop were 27,829 and 34,193, respectively. The large under-allocation of the population in urban areas and over-allocation in rural areas in the WorldPop dataset was greatly reduced in this new population map. Apart from revealing the effectiveness of POIs in improving population mapping, this study promises the potential of geospatial big data for mapping other socioeconomic parameters in the future.

Keywords: China; Nighttime light; Points of interest; Population; Random forests.