PABI: A Unified PAC-Bayesian Informativeness Measure for Incidental Supervision Signals

1 Jan 2021 · Hangfeng He, Mingyuan Zhang, Qiang Ning, Dan Roth

Real-world applications often require making use of a range of incidental supervision signals. However, we currently lack a principled way to measure the benefit that an incidental training dataset can bring, and the common practice for evaluating indirect, weaker signals is to run exhaustive experiments with various models and hyper-parameters. This paper studies whether we can, in a single framework, quantify the benefit of various types of incidental signals for one's target task without going through combinatorial experiments. We propose PABI, a unified informativeness measure backed by PAC-Bayesian theory, which characterizes the reduction in uncertainty that indirect, weak signals provide. We demonstrate PABI's use in quantifying various types of incidental signals, including partial labels, noisy labels, constraints, cross-domain signals, and combinations of these. Experiments with various setups on two natural language processing (NLP) tasks, named entity recognition (NER) and question answering (QA), show that PABI correlates well with learning performance, providing a promising way to determine, ahead of learning, which supervision signals would be beneficial.
