Selecting the number of categories of the lymph node ratio in cancer research: A bootstrap-based hypothesis test

Stat Methods Med Res. 2021 Mar;30(3):926-940. doi: 10.1177/0962280220965631. Epub 2020 Nov 9.

Abstract

The lymph node ratio is widely established as a strong prognostic factor in colorectal cancer and has been used as a categorized predictor variable in several studies. However, both the cut-off points and the number of categories differ considerably across the literature. Motivated by the need to obtain the best categorization of the lymph node ratio as a predictor of mortality in colorectal cancer patients, we propose a method to select the best number of categories for a continuous variable in a logistic regression framework. To this end, we propose a bootstrap-based hypothesis test, together with a new estimation algorithm for the optimal location of the cut-off points, called BackAddFor, which is an updated version of the previously proposed AddFor algorithm. The performance of the hypothesis test was evaluated in a simulation study under different scenarios, yielding type I error rates close to the nominal levels and good power whenever a meaningful difference in prediction ability existed. Finally, the proposed methodology was applied to the CCR-CARESS study, where the lymph node ratio was included as a predictor of five-year mortality, resulting in the selection of three categories.

Keywords: Categorization; bootstrap; cut-off point; prediction models.
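
As a rough illustration of the idea described in the abstract, the following Python sketch tests whether moving from k to k+1 categories yields a gain in prediction ability. It is not the authors' procedure. The AUC is assumed here as the measure of prediction ability, the cut points are found by an exhaustive search over a fixed grid (a simple stand-in for BackAddFor), the null distribution comes from a parametric bootstrap under the fitted k-category model, and the data are simulated. All function names, the grid, and the simulated risk levels are hypothetical. The sketch relies on the fact that a logistic model whose only predictor is a categorical variable has fitted probabilities equal to the per-category event proportions, so no optimizer is needed.

```python
import bisect
import itertools
import random


def categorical_auc(x, y, cuts):
    """Apparent AUC of a logistic model whose only predictor is x categorized at `cuts`.

    The fitted probability in each category equals its event proportion,
    so the AUC can be computed from per-category counts alone.
    """
    k = len(cuts) + 1
    events, nonevents = [0] * k, [0] * k
    for xi, yi in zip(x, y):
        c = bisect.bisect_right(cuts, xi)
        if yi:
            events[c] += 1
        else:
            nonevents[c] += 1
    n1, n0 = sum(events), sum(nonevents)
    if n1 == 0 or n0 == 0:
        return 0.5  # AUC undefined; treat as no discrimination
    # Rank non-empty categories by fitted probability (event proportion).
    order = sorted((c for c in range(k) if events[c] + nonevents[c] > 0),
                   key=lambda c: events[c] / (events[c] + nonevents[c]))
    concordant, below = 0.0, 0
    for c in order:
        # Events here beat non-events in lower categories, tie within category.
        concordant += events[c] * (below + 0.5 * nonevents[c])
        below += nonevents[c]
    return concordant / (n1 * n0)


def best_cuts(x, y, n_cuts, grid):
    """Exhaustive grid search (assumed stand-in for BackAddFor)."""
    return max(itertools.combinations(grid, n_cuts),
               key=lambda cuts: categorical_auc(x, y, cuts))


def auc_gain(x, y, k, grid):
    """Return (AUC with k+1 categories - AUC with k categories, cuts for k)."""
    cuts_k = best_cuts(x, y, k - 1, grid)
    cuts_k1 = best_cuts(x, y, k, grid)
    return categorical_auc(x, y, cuts_k1) - categorical_auc(x, y, cuts_k), cuts_k


def bootstrap_test(x, y, k, grid, n_boot=200, seed=0):
    """Parametric bootstrap p-value for H0: k+1 categories give no AUC gain over k."""
    rng = random.Random(seed)
    observed, cuts_k = auc_gain(x, y, k, grid)
    # Null model: event probability per category of the fitted k-category model.
    total, ev = [0] * k, [0] * k
    for xi, yi in zip(x, y):
        c = bisect.bisect_right(cuts_k, xi)
        total[c] += 1
        ev[c] += yi
    p_null = [ev[c] / total[c] if total[c] else 0.0 for c in range(k)]
    exceed = 0
    for _ in range(n_boot):
        xb = [rng.choice(x) for _ in x]  # resample the predictor
        yb = [int(rng.random() < p_null[bisect.bisect_right(cuts_k, xi)])
              for xi in xb]              # simulate outcomes under H0
        exceed += auc_gain(xb, yb, k, grid)[0] >= observed
    return observed, (exceed + 1) / (n_boot + 1)


# Simulated data with three true risk levels (hypothetical values).
rng = random.Random(1)
x = [rng.random() for _ in range(200)]
y = [int(rng.random() < (0.1 if v < 0.3 else 0.4 if v < 0.6 else 0.8)) for v in x]
grid = [i / 10 for i in range(1, 10)]

gain, p_value = bootstrap_test(x, y, k=2, grid=grid, n_boot=100)
print(f"AUC gain (3 vs 2 categories) = {gain:.3f}, p-value = {p_value:.3f}")
```

Because the cut points are re-optimized inside every bootstrap replicate, the null distribution accounts for the optimism introduced by cut-point selection. A small p-value then suggests that the additional category improves discrimination beyond what that selection alone would produce by chance.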

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Logistic Models
  • Lymph Node Ratio*
  • Lymphatic Metastasis
  • Neoplasm Staging
  • Prognosis