Bandits for Learning to Explain from Explanations
We introduce Explearn, an online algorithm that learns to jointly output predictions and explanations for those predictions. Explearn leverages Gaussian Processes (GP)-based contextual bandits. This brings two key benefits. First, GPs naturally capture different kinds of explanations and enable the system designer to control how explanations generalize across the space by virtue of choosing a suitable kernel. Second, Explearn builds on recent results in contextual bandits which guarantee convergence with high probability. Our initial experiments hint at the promise of the approach.
PDF Abstract