Random forests for homogeneous and non-homogeneous Poisson processes with excess zeros

Stat Methods Med Res. 2020 Aug;29(8):2217-2237. doi: 10.1177/0962280219888741. Epub 2019 Nov 24.

Abstract

We propose a general hurdle methodology, based on two forests, to model a response from a homogeneous or a non-homogeneous Poisson process with excess zeros. The first forest in the two-part model is used to estimate the probability of having a zero. The second forest is used to estimate the Poisson parameter(s), using only the observations with at least one event. To build the trees in the second forest, we propose specialized splitting criteria derived from the zero-truncated homogeneous and non-homogeneous Poisson likelihoods. The particular case of a homogeneous process is investigated in detail to highlight the advantages of the proposed method over existing ones. Simulation studies show that the proposed methods perform well in hurdle (zero-altered) and zero-inflated settings, for both homogeneous and non-homogeneous processes. We illustrate the use of the new method with real data on the demand for medical care by the elderly.
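
To make the idea of a splitting criterion derived from a zero-truncated Poisson likelihood more concrete, the following is a minimal sketch for the homogeneous case only. It is not the authors' implementation: it assumes unit exposure for every observation, and the function names (ztp_loglik, ztp_mle, split_gain, hurdle_mean) are hypothetical. It scores a candidate split by the gain in maximized zero-truncated Poisson log-likelihood, and combines a zero probability with a Poisson parameter into a hurdle mean.

```python
import math


def ztp_loglik(counts, lam):
    """Log-likelihood of positive counts under a zero-truncated Poisson(lam).

    P(Y = y) = lam**y * exp(-lam) / (y! * (1 - exp(-lam))), for y >= 1.
    Unit exposure is assumed for every observation.
    """
    n = len(counts)
    return (sum(y * math.log(lam) - math.lgamma(y + 1) for y in counts)
            - n * lam
            - n * math.log(-math.expm1(-lam)))


def ztp_mle(counts, tol=1e-10):
    """Maximum likelihood estimate of lam, found by bisection.

    The estimate solves lam / (1 - exp(-lam)) = sample mean.
    """
    mean = sum(counts) / len(counts)
    if mean <= 1.0:
        # All counts equal 1: the likelihood is maximized as lam -> 0,
        # so return a small positive value instead.
        return 1e-8
    lo, hi = 1e-12, mean  # the root always lies below the sample mean
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid / -math.expm1(-mid) < mean:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def split_gain(left, right):
    """Gain in maximized log-likelihood from splitting a node in two."""
    parent = list(left) + list(right)
    return (ztp_loglik(left, ztp_mle(left))
            + ztp_loglik(right, ztp_mle(right))
            - ztp_loglik(parent, ztp_mle(parent)))


def hurdle_mean(p_zero, lam):
    """Expected count under a hurdle model: P(Y > 0) * E[Y | Y > 0]."""
    return (1 - p_zero) * lam / -math.expm1(-lam)


# Hypothetical positive counts falling on each side of a candidate split.
gain = split_gain([1, 1, 2, 1], [5, 6, 7, 8])
```

In a tree-growing loop, the candidate split with the largest gain would be chosen at each node. The zero probability passed to hurdle_mean would come from a separate classifier, such as the first forest described above.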

Keywords: Hurdle model; Poisson process; non-homogeneous Poisson process; random forests; tree-based method; zero-altered Poisson (ZAP); zero-inflated Poisson (ZIP).

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Computer Simulation
  • Humans
  • Models, Statistical*
  • Poisson Distribution
  • Research Design*