Search Results for author: Jacob Pfau

Found 5 papers, 3 papers with code

Self-Consistency of Large Language Models under Ambiguity

1 code implementation • 20 Oct 2023 • Henning Bartsch, Ole Jorgensen, Domenic Rosati, Jason Hoelscher-Obermaier, Jacob Pfau

Using this test, we find that despite increases in self-consistency, models usually place significant weight on alternative, inconsistent answers.

Question Answering

Goal Misgeneralization in Deep Reinforcement Learning

4 code implementations • 28 May 2021 • Lauro Langosco, Jack Koch, Lee Sharkey, Jacob Pfau, Laurent Orseau, David Krueger

We study goal misgeneralization, a type of out-of-distribution generalization failure in reinforcement learning (RL).

Navigate, Out-of-Distribution Generalization, +2

Robust Semantic Interpretability: Revisiting Concept Activation Vectors

1 code implementation • 6 Apr 2021 • Jacob Pfau, Albert T. Young, Jerome Wei, Maria L. Wei, Michael J. Keiser

Our proposed Robust Concept Activation Vectors (RCAV) quantifies the effects of semantic concepts on individual model predictions and on model behavior as a whole.

Benchmarking, counterfactual, +1

Global Saliency: Aggregating Saliency Maps to Assess Dataset Artefact Bias

no code implementations • 16 Oct 2019 • Jacob Pfau, Albert T. Young, Maria L. Wei, Michael J. Keiser

In high-stakes applications of machine learning models, interpretability methods provide guarantees that models are right for the right reasons.

Semantic Segmentation
