FHBF: Federated hybrid boosted forests with dropout rates for supervised learning tasks across highly imbalanced clinical datasets

Patterns (N Y). 2024 Jan 12;5(1):100893. doi: 10.1016/j.patter.2023.100893.

Abstract

Although several studies have deployed gradient boosting trees (GBT) as a robust classifier for federated learning tasks (federated GBT [FGBT]), including variants with dropout rates (federated gradient boosting trees with dropout rate [FDART]), none has investigated the overfitting behavior of FGBT across heterogeneous, highly imbalanced datasets in federated environments, nor the effect of dropouts on the loss function. In this work, we present the federated hybrid boosted forests (FHBF) algorithm, which incorporates a hybrid weight update approach to overcome the ill-posed problems that arise from overfitting during training across highly imbalanced datasets in the cloud. Eight case studies were conducted to stress-test FHBF against existing algorithms in developing robust AI models for predicting lymphoma development across 18 European federated databases. Our results highlight the robustness of FHBF, which yielded an average loss of 0.527, compared with 0.611 for FGBT and 0.584 for FDART, together with improved classification performance (0.938 sensitivity and 0.732 specificity).
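For readers unfamiliar with dropouts in boosting, the sketch below illustrates the general DART idea that FDART and FHBF build on: at each boosting round, a random subset of previously fitted trees is dropped before computing gradients, and the new and dropped trees are rescaled. This is a minimal, centralized (non-federated) illustration under stated assumptions, not the paper's implementation; the function names (dart_boost, predict_proba) and all hyperparameter values are hypothetical.

```python
# Minimal centralized sketch of DART-style dropout in a gradient
# boosting loop; illustrative only, NOT the FHBF/FDART code from the
# paper. Uses binary log-loss gradients and the normalization from
# the original DART algorithm (Rashmi & Gilad-Bachrach, 2015).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def _sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dart_boost(X, y, n_rounds=100, drop_rate=0.1, max_depth=3, seed=0):
    rng = np.random.default_rng(seed)
    trees, weights = [], []
    for _ in range(n_rounds):
        # Drop each previously fitted tree with probability drop_rate.
        keep = [i for i in range(len(trees)) if rng.random() >= drop_rate]
        margin = (sum(weights[i] * trees[i].predict(X) for i in keep)
                  if keep else np.zeros(len(y), dtype=float))
        # Negative gradient of the binary log-loss w.r.t. the margin.
        residual = y - _sigmoid(margin)
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=seed)
        tree.fit(X, residual)
        k = len(trees) - len(keep)  # number of dropped trees this round
        dropped = [i for i in range(len(trees)) if i not in keep]
        trees.append(tree)
        # DART normalization: scale the new tree by 1/(k+1) and the
        # dropped trees by k/(k+1) so the ensemble output stays stable.
        weights.append(1.0 / (k + 1))
        for i in dropped:
            weights[i] *= k / (k + 1)
    return trees, weights

def predict_proba(trees, weights, X):
    margin = sum(w * t.predict(X) for w, t in zip(weights, trees))
    return _sigmoid(margin)
```

In the federated setting, each participating site would compute such gradients on its local data and share only aggregated statistics; the abstract does not specify FHBF's aggregation protocol, so none is sketched here.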

Keywords: class imbalance; dropouts; federated learning; gradient boosting; hybrid loss function; machine learning.
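The abstract attributes FHBF's robustness to a hybrid weight update (the keywords also list "hybrid loss function" and "class imbalance") but does not define the loss itself. Purely as a hypothetical illustration of the general idea, the sketch below blends a class-weighted log-loss gradient with a focal-loss gradient; the mixing coefficient alpha, the focusing parameter gamma, the inverse-frequency class weight, and the function names are all assumptions, not the paper's formulation.

```python
# Hypothetical hybrid gradient for class-imbalanced boosting: a convex
# combination of class-weighted log-loss and focal loss. NOT the hybrid
# loss defined in the FHBF paper (the abstract does not specify it);
# shown only to make the general idea concrete.
import numpy as np

def weighted_logloss_neg_grad(y, p, pos_weight):
    # Negative gradient of class-weighted log-loss w.r.t. the margin.
    w = np.where(y == 1, pos_weight, 1.0)
    return w * (y - p)

def focal_neg_grad(y, p, gamma):
    # Negative gradient of focal loss w.r.t. the logit margin, derived
    # with p = sigmoid(margin) and p_t = p if y == 1 else 1 - p.
    s = np.where(y == 1, 1.0, -1.0)
    p_t = np.clip(np.where(y == 1, p, 1.0 - p), 1e-12, 1.0)
    return s * (1.0 - p_t) ** gamma * ((1.0 - p_t) - gamma * p_t * np.log(p_t))

def hybrid_neg_grad(y, p, alpha=0.5, gamma=2.0, pos_weight=None):
    # Blend the two gradients; alpha, gamma, and pos_weight are
    # hypothetical knobs, not values from the paper.
    if pos_weight is None:
        # Up-weight the minority (positive) class by inverse frequency.
        pos_weight = (len(y) - y.sum()) / max(y.sum(), 1)
    return (alpha * weighted_logloss_neg_grad(y, p, pos_weight)
            + (1.0 - alpha) * focal_neg_grad(y, p, gamma))
```

Substituting hybrid_neg_grad for the plain residual in the earlier DART sketch would yield one (hypothetical) hybrid boosted-forest variant; with gamma = 0 and pos_weight = 1, the gradient reduces to the standard log-loss residual y - p.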