Training Classifiers with Natural Language Explanations

Braden Hancock; Martin Bringmann; Paroma Varma; Percy Liang; Stephanie Wang; Christopher Ré

Proc Conf Assoc Comput Linguist Meet. 2018 Jul;2018:1884-1895.

Authors

Braden Hancock¹, Martin Bringmann², Paroma Varma³, Percy Liang¹, Stephanie Wang¹, Christopher Ré¹

Affiliations

¹ Computer Science Dept., Stanford University.
² OccamzRazor, San Francisco, CA.
³ Electrical Engineering Dept., Stanford University.

PMID: 31130772
PMCID: PMC6534135

Abstract

Training accurate classifiers requires many labels, but each label provides only limited information (one bit for binary classification). In this work, we propose BabbleLabble, a framework for training classifiers in which an annotator provides a natural language explanation for each labeling decision. A semantic parser converts these explanations into programmatic labeling functions that generate noisy labels for an arbitrary amount of unlabeled data; the resulting noisily labeled data is then used to train a classifier. On three relation extraction tasks, we find that users are able to train classifiers with comparable F1 scores 5-100× faster by providing explanations instead of just labels. Furthermore, given the inherent imperfection of labeling functions, we find that a simple rule-based semantic parser suffices.
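
The pipeline described above (explanation, then labeling function, then noisy labels) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from BabbleLabble itself: the single explanation template in `PATTERN`, the helper names `parse_explanation` and `noisy_label`, the example sentences, and the use of an unweighted majority vote to combine labeling functions (the abstract does not say how votes are aggregated).

```python
import re

# Hypothetical explanation template; a real semantic parser would cover
# a much richer grammar than this single pattern.
PATTERN = re.compile(
    r"label (true|false) because the word '([^']+)' appears in the sentence",
    re.IGNORECASE,
)


def parse_explanation(explanation):
    """Turn one explanation into a labeling function, or None if unparseable."""
    match = PATTERN.fullmatch(explanation.strip())
    if match is None:
        return None
    label = 1 if match.group(1).lower() == "true" else -1
    keyword = match.group(2).lower()

    def labeling_function(sentence):
        # Vote `label` when the keyword is present; abstain (0) otherwise.
        return label if keyword in sentence.lower().split() else 0

    return labeling_function


def noisy_label(sentence, labeling_functions):
    """Combine votes by unweighted majority (an assumption); 0 means undecided."""
    total = sum(lf(sentence) for lf in labeling_functions)
    return (total > 0) - (total < 0)


# Made-up explanations and unlabeled sentences for a toy "spouse" relation.
explanations = [
    "Label true because the word 'wife' appears in the sentence",
    "Label false because the word 'brother' appears in the sentence",
]
lfs = [lf for lf in map(parse_explanation, explanations) if lf is not None]

unlabeled = [
    "Tom and his wife Ann arrived",
    "Tom and his brother Bob arrived",
    "Tom arrived alone",
]
noisy_labels = [noisy_label(s, lfs) for s in unlabeled]
```

In this sketch two explanations are enough to label any number of unlabeled sentences, which is the leverage the abstract refers to; sentences that no labeling function covers receive 0 and would be left out of classifier training.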

Grants and funding

U54 EB020405/EB/NIBIB NIH HHS/United States