Determination of CERES TOA fluxes using Machine learning algorithms. Part I: Classification and retrieval of CERES cloudy and clear scenes

J Atmos Ocean Technol. 2017 Oct 1;34(10):2329-2345. doi: 10.1175/JTECH-D-16-0183.1.

Abstract

Continuous monitoring of the Earth radiation budget (ERB) is critical to our understanding of the Earth's climate and its variability with time. The Clouds and the Earth's Radiant Energy System (CERES) instrument is able to provide a long record of ERB for such scientific studies. This manuscript, the first of a two-part paper, describes a new CERES algorithm for improving the clear/cloudy scene classification without the use of coincident cloud imager data. The algorithm is based on machine learning (ML), a subset of the modern artificial intelligence (AI) paradigm. This paper describes the development and application of the ML algorithm known as Random Forests (RF), which is used to classify CERES broadband footprint measurements into clear and cloudy scenes. Results from the RF analysis carried out using the CERES Single Scanner Footprint (SSF) data for the months of January and July are presented. The daytime RF misclassification rate (MCR) is relatively large (>30%) for snow, sea ice, and bright desert surface types and lower (<10%) for the forest surface type. Nighttime MCR values are generally larger than the daytime values for most surface types. After thin-cloud data are excluded from the analysis, the modified MCR values are lower (<4%) for most surface types. Sensitivity analysis shows that the number of input variables and the number of decision trees used in the RF analysis have a substantial influence on the classification error.
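The per-surface-type MCR and the modified MCR (thin clouds excluded) described above can be illustrated with a minimal sketch. It assumes each footprint carries a reference label, a classifier label, a surface type, and a thin-cloud flag. The record keys, the function name `misclassification_rate`, and the sample records are hypothetical and are not taken from the CERES SSF product or from the paper's results.

```python
from collections import defaultdict


def misclassification_rate(records, exclude_thin=False):
    """Return {surface_type: MCR in percent}.

    Each record is a dict with hypothetical keys (assumed for illustration):
      'surface'   - surface type label, e.g. 'forest' or 'snow'
      'truth'     - reference scene label, 'clear' or 'cloudy'
      'predicted' - label assigned by the classifier
      'thin'      - True if the reference flags the footprint as thin cloud
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for r in records:
        # The "modified" MCR drops thin-cloud footprints before counting.
        if exclude_thin and r["thin"]:
            continue
        totals[r["surface"]] += 1
        if r["predicted"] != r["truth"]:
            errors[r["surface"]] += 1
    return {s: 100.0 * errors[s] / totals[s] for s in totals}


# Made-up footprints, for demonstration only (not CERES data).
records = [
    {"surface": "forest", "truth": "cloudy", "predicted": "cloudy", "thin": False},
    {"surface": "forest", "truth": "clear", "predicted": "clear", "thin": False},
    {"surface": "forest", "truth": "cloudy", "predicted": "clear", "thin": True},
    {"surface": "forest", "truth": "clear", "predicted": "clear", "thin": False},
    {"surface": "snow", "truth": "cloudy", "predicted": "clear", "thin": True},
    {"surface": "snow", "truth": "clear", "predicted": "cloudy", "thin": False},
    {"surface": "snow", "truth": "cloudy", "predicted": "cloudy", "thin": False},
    {"surface": "snow", "truth": "clear", "predicted": "clear", "thin": False},
]

mcr_all = misclassification_rate(records)
mcr_modified = misclassification_rate(records, exclude_thin=True)
print(mcr_all)
print(mcr_modified)
```

Grouping by surface type before dividing keeps a poorly classified surface, such as snow in this toy sample, from being hidden by an easier one in a single global average.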