Calibrating variant-scoring methods for clinical decision making

Bioinformatics. 2021 Apr 5;36(24):5709-5711. doi: 10.1093/bioinformatics/btaa943.

Abstract

Summary: Identifying and annotating pathogenic variants is a major challenge in human genetics, especially for non-coding variants. Several tools have been developed to predict the functional effect of genetic variants, but the calibration of their predictions has received little attention. Calibration refers to the idea that if a model assigns a group of variants a probability P of being pathogenic, a fraction P of that group is expected to be truly pathogenic in the observed set. For instance, among the variants to which a well-calibrated classifier assigns a probability close to 0.7, approximately 70% should actually belong to the pathogenic class. Poorly calibrated algorithms can be misleading and potentially harmful for clinical decision making.
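
The notion of calibration described above can be illustrated with a minimal sketch. It is not the procedure used in the paper: the function names (reliability_table, expected_calibration_error), the equal-width binning, and the toy scores are assumptions made only for illustration. The sketch groups predicted probabilities into bins and compares the mean predicted probability in each bin with the observed fraction of pathogenic variants.

```python
from typing import List, Sequence, Tuple


def reliability_table(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> List[Tuple[float, float, int]]:
    """Return (mean predicted probability, observed pathogenic fraction, count)
    for each non-empty equal-width bin. Hypothetical helper, illustration only."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if any(p < 0.0 or p > 1.0 for p in probs):
        raise ValueError("probabilities must lie in [0, 1]")

    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # A probability of exactly 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    table = []
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        table.append((mean_p, observed, len(members)))
    return table


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Count-weighted average of |mean predicted - observed fraction| over bins."""
    n = len(probs)
    if n == 0:
        raise ValueError("at least one prediction is required")
    return sum(
        count / n * abs(mean_p - observed)
        for mean_p, observed, count in reliability_table(probs, labels, n_bins)
    )


# Toy data (made up): ten variants scored 0.7, of which seven are pathogenic.
calibrated_probs = [0.7] * 10
calibrated_labels = [1] * 7 + [0] * 3

# Toy data (made up): ten variants scored 0.9, of which only five are pathogenic.
overconfident_probs = [0.9] * 10
overconfident_labels = [1] * 5 + [0] * 5

ece_calibrated = expected_calibration_error(calibrated_probs, calibrated_labels)
ece_overconfident = expected_calibration_error(overconfident_probs, overconfident_labels)
print(f"calibrated toy set:    ECE = {ece_calibrated:.3f}")
print(f"overconfident toy set: ECE = {ece_overconfident:.3f}")
```

In the first toy set the score of 0.7 matches the observed 70% pathogenic fraction, so the gap is essentially zero. In the second, a score of 0.9 against an observed 50% leaves a gap of about 0.4, which is the kind of miscalibration that could mislead a clinical interpretation.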

Availability and implementation: The dataset used for testing the methods is available at DOI: 10.5281/zenodo.4448197.

Supplementary information: Supplementary data are available at Bioinformatics online.