A Probabilistic Active Learning Algorithm Based on Fisher Information Ratio

Jamshid Sourati; Murat Akcakaya; Deniz Erdogmus; Todd K Leen; Jennifer G Dy

doi:10.1109/TPAMI.2017.2743707

IEEE Trans Pattern Anal Mach Intell. 2018 Aug;40(8):2023-2029. doi: 10.1109/TPAMI.2017.2743707. Epub 2017 Aug 24.

PMID: 28858784

Abstract

Labeling samples is demanding and expensive. Active learning aims to build the smallest possible training data set that still yields a classifier with high performance in the test phase. It usually consists of two steps: selecting a set of queries and requesting their labels. Among the objectives proposed for scoring query sets, information-theoretic measures have become very popular. Of these, measures based on Fisher information (FI) have the advantages of accounting for diversity among the queries and of tractable computation. In this work, we provide a practical algorithm based on the Fisher information ratio to obtain a query distribution in a general framework where, in contrast to previous FI-based querying methods, we make no assumptions about the test distribution. Empirical results on synthetic and real-world data sets indicate that this algorithm gives competitive results.
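
As a rough illustration of the general idea of FI-based query scoring, the sketch below computes a Fisher information ratio of the form tr(I_q^{-1} I_p) for a binary logistic regression model and selects queries greedily from an unlabeled pool. It is not the algorithm of this paper: the abstract does not specify the model or the optimization, and the sketch uses the pool itself as a stand-in for the test distribution, which is exactly the kind of assumption the paper says it avoids. The function names, the ridge term, and the greedy search are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fisher_information(X, theta, weights=None):
    """Weighted Fisher information of a logistic model over the rows of X.

    For logistic regression, one sample x contributes p(1 - p) * x x^T,
    where p = sigmoid(x . theta). Weights default to uniform.
    """
    p = sigmoid(X @ theta)
    s = p * (1.0 - p)
    if weights is None:
        weights = np.full(len(X), 1.0 / len(X))
    return (X * (weights * s)[:, None]).T @ X


def fisher_information_ratio(X, theta, q, ridge=1e-6):
    """tr(I_q^{-1} I_p), with I_p taken over the pool (an assumption).

    The small ridge term (hypothetical) keeps I_q invertible when q
    puts mass on only a few points.
    """
    d = X.shape[1]
    I_q = fisher_information(X, theta, q) + ridge * np.eye(d)
    I_p = fisher_information(X, theta)
    return float(np.trace(np.linalg.solve(I_q, I_p)))


def greedy_queries(X, theta, k, ridge=1e-6):
    """Greedily pick k pool indices that minimize the ratio above."""
    chosen = []
    for _ in range(k):
        best, best_score = None, np.inf
        for i in range(len(X)):
            if i in chosen:
                continue
            idx = chosen + [i]
            q = np.zeros(len(X))
            q[idx] = 1.0 / len(idx)  # uniform mass on the candidate set
            score = fisher_information_ratio(X, theta, q, ridge)
            if score < best_score:
                best, best_score = i, score
        chosen.append(best)
    return chosen


# Synthetic pool and an arbitrary current parameter estimate (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
theta = np.array([0.5, -1.0, 0.25])
chosen = greedy_queries(X, theta, k=5)
```

When q is uniform over the pool, I_q and I_p coincide up to the ridge term, so the ratio is approximately the number of parameters; a query set that scores close to that value covers the pool's information about as well as labeling everything would.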

Publication types

Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Algorithms*
Computer Simulation
Databases, Factual / statistics & numerical data
Humans
Models, Statistical
Monte Carlo Method

Grants and funding

R01 HL089856/HL/NHLBI NIH HHS/United States