no code implementations • 13 Feb 2024 • Daniel D. Johnson, Daniel Tarlow, David Duvenaud, Chris J. Maddison
Identifying how much a model ${\widehat{p}}_{\theta}(Y|X)$ knows about the stochastic real-world process $p(Y|X)$ it was trained on is important to ensure it avoids producing incorrect or "hallucinated" answers or taking unsafe actions.
1 code implementation • 23 Nov 2023 • Vincent Dumoulin, Daniel D. Johnson, Pablo Samuel Castro, Hugo Larochelle, Yann Dauphin
Learning from human feedback (LHF) -- and in particular learning from pairwise preferences -- has recently become a crucial ingredient in training large language models (LLMs), and has been the subject of much research.
1 code implementation • 1 Mar 2023 • Daniel D. Johnson, Daniel Tarlow, Christian Walder
Large language models show impressive results at predicting structured text such as code, but also commonly introduce errors and hallucinations in their output.
no code implementations • 4 Oct 2022 • Daniel D. Johnson, Ayoub El Hanchi, Chris J. Maddison
We give generalization bounds for downstream linear prediction using our Kernel PCA representation, and show empirically on a set of synthetic tasks that applying Kernel PCA to contrastive learning models can indeed approximately recover the Markov chain eigenfunctions, although the accuracy depends on both the kernel parameterization and the augmentation strength.
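As a rough illustration of the Kernel PCA step described above, here is a minimal numpy sketch that centers a precomputed kernel matrix in feature space and extracts the top component scores. The function names and the RBF toy data are illustrative assumptions, not the paper's actual code.

```python
import numpy as np

def kernel_pca(K, num_components):
    """Kernel PCA from a precomputed kernel matrix K (n x n).

    Returns the top component scores for each of the n points."""
    n = K.shape[0]
    # Center the kernel matrix in feature space.
    one_n = np.full((n, n), 1.0 / n)
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # eigh returns eigenvalues in ascending order, so reverse.
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    top = np.argsort(eigvals)[::-1][:num_components]
    # Scale eigenvectors so projections have the proper norm.
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))

# Toy example: RBF kernel over 1-D points.
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 1))
K = np.exp(-((x - x.T) ** 2) / 2.0)
scores = kernel_pca(K, num_components=2)
print(scores.shape)  # (50, 2)
```

In the paper's setting, the kernel would come from a trained contrastive model rather than an RBF, and the recovered components would approximate Markov chain eigenfunctions.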
1 code implementation • NeurIPS 2021 • Guy Lorberbom, Daniel D. Johnson, Chris J. Maddison, Daniel Tarlow, Tamir Hazan
To perform counterfactual reasoning in Structural Causal Models (SCMs), one needs to know the causal mechanisms, which provide factorizations of conditional distributions into noise sources and deterministic functions mapping realizations of noise to samples.
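The abduction-action-prediction recipe for counterfactuals in an SCM can be sketched on a toy additive-noise model; the mechanisms below (X = U_x, Y = 2X + U_y) are an assumed example, not one from the paper.

```python
# Toy additive-noise SCM: X = U_x, Y = 2*X + U_y.
# Counterfactual query: given observed (x, y), what would Y have
# been had X instead been set to x_new?

def abduct(x_obs, y_obs):
    # Abduction: invert the deterministic mechanisms to recover
    # the noise realizations consistent with the observation.
    u_x = x_obs
    u_y = y_obs - 2.0 * x_obs
    return u_x, u_y

def predict_counterfactual(x_new, u_y):
    # Action + prediction: re-run Y's mechanism with the
    # intervened X and the abducted noise.
    return 2.0 * x_new + u_y

u_x, u_y = abduct(x_obs=1.0, y_obs=3.0)        # u_y = 1.0
y_cf = predict_counterfactual(x_new=2.0, u_y=u_y)
print(y_cf)  # 5.0
```

This only works because the mechanism is invertible in its noise; the need to know (or learn) such factorizations is exactly the requirement the abstract highlights.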
no code implementations • ICML Workshop INNF 2021 • Daniel D. Johnson, Jacob Austin, Rianne van den Berg, Daniel Tarlow
Denoising diffusion probabilistic models (DDPMs) have shown impressive results on sequence generation by iteratively corrupting each example and then learning to map corrupted versions back to the original.
3 code implementations • NeurIPS 2021 • Jacob Austin, Daniel D. Johnson, Jonathan Ho, Daniel Tarlow, Rianne van den Berg
Here, we introduce Discrete Denoising Diffusion Probabilistic Models (D3PMs), diffusion-like generative models for discrete data that generalize the multinomial diffusion model of Hoogeboom et al. (2021) by going beyond corruption processes with uniform transition probabilities.
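A minimal sketch of the kind of transition matrices involved: a uniform-replacement matrix and an absorbing-state ([MASK]) matrix, the latter being one of the non-uniform corruption processes D3PMs enable. The exact parameterizations in the paper may differ; this is illustrative.

```python
import numpy as np

def uniform_transition(vocab_size, beta):
    """Uniform corruption: with probability beta, jump to a
    uniformly random token; otherwise stay put."""
    Q = np.full((vocab_size, vocab_size), beta / vocab_size)
    Q += (1.0 - beta) * np.eye(vocab_size)
    return Q

def absorbing_transition(vocab_size, beta, mask_id):
    """Absorbing-state corruption: with probability beta, move to
    the mask token, which never transitions away."""
    Q = (1.0 - beta) * np.eye(vocab_size)
    Q[:, mask_id] += beta
    Q[mask_id] = 0.0
    Q[mask_id, mask_id] = 1.0
    return Q

Qu = uniform_transition(5, beta=0.1)
Qa = absorbing_transition(5, beta=0.1, mask_id=4)
# Every row is a valid distribution over next states.
print(np.allclose(Qu.sum(axis=1), 1.0), np.allclose(Qa.sum(axis=1), 1.0))
```

Running the forward process then amounts to repeatedly sampling each token's next state from its row of such a matrix.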
1 code implementation • NeurIPS 2020 • Daniel D. Johnson, Hugo Larochelle, Daniel Tarlow
In practice, edges are used both to represent intrinsic structure (e.g., abstract syntax trees of programs) and more abstract relations that aid reasoning for a downstream task (e.g., results of relevant program analyses).