no code implementations • 17 Jun 2022 • Joel Lehman, Jonathan Gordon, Shawn Jain, Kamal Ndousse, Cathy Yeh, Kenneth O. Stanley
This paper pursues the insight that large language models (LLMs) trained to generate code can vastly improve the effectiveness of mutation operators applied to programs in genetic programming (GP).
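As a rough illustration of the idea, the sketch below uses a hypothetical `complete(prompt)` helper standing in for any code-generation LLM; the GP loop and prompt format are illustrative, not the authors' implementation.

```python
import random

def llm_mutate(program: str, complete) -> str:
    """Use a code LLM as the GP mutation operator: ask the model for a
    plausible variant of the program instead of a random syntactic edit.
    `complete` is a hypothetical prompt-completion function."""
    prompt = (
        "# Original program:\n"
        f"{program}\n"
        "# A slightly modified version of the program above:\n"
    )
    return complete(prompt)

def evolve(population, fitness, complete, generations=100):
    """Toy steady-state GP loop with LLM-driven mutation."""
    for _ in range(generations):
        parent = max(random.sample(population, 3), key=fitness)   # tournament selection
        child = llm_mutate(parent, complete)
        weakest = min(range(len(population)), key=lambda i: fitness(population[i]))
        population[weakest] = child                               # replace the least fit
    return max(population, key=fitness)
```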
1 code implementation • AABI Symposium 2021 • Wessel P. Bruinsma, James Requeima, Andrew Y. K. Foong, Jonathan Gordon, Richard E. Turner
Neural Processes (NPs; Garnelo et al., 2018a, b) are a rich class of models for meta-learning that map data sets directly to predictive stochastic processes.
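A minimal PyTorch sketch of that mapping, with a deep-set encoder pooled over the context set; the architecture here is illustrative, not one from the papers.

```python
import torch
import torch.nn as nn

class TinyCNP(nn.Module):
    """Maps a context set (x_c, y_c) directly to a predictive
    Gaussian at target inputs x_t, in a single forward pass."""
    def __init__(self, dim=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(2, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.decoder = nn.Sequential(nn.Linear(dim + 1, dim), nn.ReLU(), nn.Linear(dim, 2))

    def forward(self, x_c, y_c, x_t):
        # Permutation-invariant summary of the context set (deep-set mean pooling).
        r = self.encoder(torch.cat([x_c, y_c], dim=-1)).mean(dim=0)
        h = torch.cat([x_t, r.expand(x_t.shape[0], -1)], dim=-1)
        mean, log_sigma = self.decoder(h).chunk(2, dim=-1)
        # Predictive mean and standard deviation at each target input.
        return mean, 0.01 + torch.nn.functional.softplus(log_sigma)
```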
2 code implementations • NeurIPS 2020 • Andrew Y. K. Foong, Wessel P. Bruinsma, Jonathan Gordon, Yann Dubois, James Requeima, Richard E. Turner
Stationary stochastic processes (SPs) are a key component of many probabilistic models, such as those for off-the-grid spatio-temporal data.
no code implementations • 18 Jun 2020 • Eric Nalisnick, Jonathan Gordon, José Miguel Hernández-Lobato
For this reason, we propose predictive complexity priors: functional priors defined by comparing the model's predictions to those of a reference model.
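A hedged sketch of the construction: treat the divergence between the model's predictions and a reference model's predictions on a set of probe inputs as a negative log prior. The KL-based penalty and the probe set below are illustrative simplifications, not the paper's exact formulation.

```python
import torch

def predictive_complexity_log_prior(model, reference, x_probe, scale=1.0):
    """Functional prior: penalise parameter settings whose predictions
    diverge from a simple reference model on probe inputs.
    `model` and `reference` are classifiers returning logits."""
    log_p_model = torch.log_softmax(model(x_probe), dim=-1)
    log_p_ref = torch.log_softmax(reference(x_probe), dim=-1)
    kl = (log_p_model.exp() * (log_p_model - log_p_ref)).sum(-1).mean()
    return -scale * kl  # larger divergence => lower prior density
```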
1 code implementation • ICLR 2020 • Jonathan Gordon, David Lopez-Paz, Marco Baroni, Diane Bouchacourt
Humans understand novel sentences by composing meanings and roles of core language components.
2 code implementations • ICML 2020 • John Bronskill, Jonathan Gordon, James Requeima, Sebastian Nowozin, Richard E. Turner
Modern meta-learning approaches for image classification rely on increasingly deep networks to achieve state-of-the-art performance, making batch normalization an essential component of meta-learning pipelines.
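One way to picture a meta-learning-friendly alternative is to blend support-set statistics with instance statistics, so that normalization no longer depends on how tasks happen to be batched together. The blending below is an illustrative simplification, not the paper's exact scheme.

```python
import torch

def task_norm(x, support, alpha, eps=1e-5):
    """Normalise activations x with a convex combination of statistics
    from the task's support set and from x itself; `alpha` in [0, 1]
    would be learned in practice."""
    mu_s, var_s = support.mean(dim=0), support.var(dim=0, unbiased=False)
    mu_i, var_i = x.mean(dim=0), x.var(dim=0, unbiased=False)
    mu = alpha * mu_s + (1 - alpha) * mu_i
    var = alpha * var_s + (1 - alpha) * var_i
    return (x - mu) / torch.sqrt(var + eps)
```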
no code implementations • IJCNLP 2019 • James Mullenbach, Jonathan Gordon, Nanyun Peng, Jonathan May
This provides evidence that the amount of commonsense knowledge encoded in these language models does not extend far beyond that already baked into the word embeddings.
2 code implementations • ICLR 2020 • Jonathan Gordon, Wessel P. Bruinsma, Andrew Y. K. Foong, James Requeima, Yann Dubois, Richard E. Turner
We introduce the Convolutional Conditional Neural Process (ConvCNP), a new member of the Neural Process family that models translation equivariance in the data.
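A minimal 1-D sketch of the construction: smear the context set onto a uniform grid with an RBF kernel (a SetConv), run a CNN over the grid, and read predictions off at the target locations. All sizes, kernels, and layers here are placeholders.

```python
import torch
import torch.nn as nn

def rbf(dists, lengthscale=0.1):
    return torch.exp(-0.5 * (dists / lengthscale) ** 2)

class TinyConvCNP(nn.Module):
    def __init__(self, grid_size=128, channels=16):
        super().__init__()
        self.grid = torch.linspace(-2, 2, grid_size)
        self.cnn = nn.Sequential(
            nn.Conv1d(2, channels, 5, padding=2), nn.ReLU(),
            nn.Conv1d(channels, 2, 5, padding=2),
        )

    def forward(self, x_c, y_c, x_t):
        # SetConv: project the context set onto the grid (density + signal channels).
        w = rbf(self.grid[None, :] - x_c[:, None])           # (N, G)
        density = w.sum(dim=0)                                # (G,)
        signal = (w * y_c[:, None]).sum(dim=0) / (density + 1e-8)
        h = torch.stack([density, signal])[None]              # (1, 2, G)
        f = self.cnn(h)[0]                                    # translation-equivariant map
        # Read off at target locations with the same kernel.
        w_t = rbf(x_t[:, None] - self.grid[None, :])          # (M, G)
        w_t = w_t / (w_t.sum(dim=-1, keepdim=True) + 1e-8)
        mean, log_sigma = (w_t @ f.T).chunk(2, dim=-1)
        return mean, torch.nn.functional.softplus(log_sigma)
```

Because the CNN commutes with shifts of the grid, translating the context set translates the predictions, which is the equivariance the paper builds in.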
no code implementations • 25 Sep 2019 • Marton Havasi, Jasper Snoek, Dustin Tran, Jonathan Gordon, José Miguel Hernández-Lobato
Variational inference (VI) is a popular approach for approximate Bayesian inference that is particularly promising for highly parameterized models such as deep neural networks.
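For reference, a minimal sketch of mean-field Gaussian VI with the reparameterization trick, generic rather than specific to this paper.

```python
import torch

def elbo(log_joint, mu, log_sigma, num_samples=8):
    """Monte Carlo estimate of the ELBO for a factorised Gaussian q(w).
    `log_joint(w)` must return log p(data, w) for a weight vector w."""
    sigma = log_sigma.exp()
    eps = torch.randn(num_samples, *mu.shape)
    w = mu + sigma * eps  # reparameterised samples from q
    log_q = torch.distributions.Normal(mu, sigma).log_prob(w).sum(dim=-1)
    return (torch.stack([log_joint(wi) for wi in w]) - log_q).mean()
```

Maximizing this estimate by gradient ascent on `mu` and `log_sigma` fits the approximate posterior.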
1 code implementation • NeurIPS 2019 • Robert Pinsler, Jonathan Gordon, Eric Nalisnick, José Miguel Hernández-Lobato
Leveraging the wealth of unlabeled data produced in recent years provides great potential for improving supervised models.
1 code implementation • NeurIPS 2019 • James Requeima, Jonathan Gordon, John Bronskill, Sebastian Nowozin, Richard E. Turner
We introduce a conditional neural process-based approach to the multi-task classification setting, and establish connections to the meta-learning and few-shot learning literature.
Ranked #6 on Few-Shot Image Classification on Meta-Dataset
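A toy sketch of context-conditional classification in this spirit: build per-class representations from the context set and classify targets against them in a single forward pass. This is a prototypical-network-style simplification, not the paper's adaptation networks; `embed` is a hypothetical feature extractor.

```python
import torch

def conditional_classify(embed, x_context, y_context, x_target, num_classes):
    """Map a labelled context set directly to target class probabilities:
    no per-task gradient steps, just one forward pass. Assumes every
    class appears at least once in the context set."""
    z_c, z_t = embed(x_context), embed(x_target)
    prototypes = torch.stack(
        [z_c[y_context == k].mean(dim=0) for k in range(num_classes)]
    )
    logits = -torch.cdist(z_t, prototypes)  # closer prototype => higher logit
    return logits.softmax(dim=-1)
```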
no code implementations • 13 Feb 2019 • Francesco Paolo Casale, Jonathan Gordon, Nicolo Fusi
We showcase the advantages of our approach in applications to CIFAR-10 and ImageNet, where our approach outperforms methods with double its computational cost and matches the performance of methods with costs that are three orders of magnitude larger.
1 code implementation • ICLR 2019 • Jonathan Gordon, John Bronskill, Matthias Bauer, Sebastian Nowozin, Richard E. Turner
We introduce VERSA, an instance of the framework employing a flexible and versatile amortization network that takes few-shot learning datasets as inputs, with arbitrary numbers of shots, and outputs a distribution over task-specific parameters in a single forward pass.
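A minimal sketch of such an amortization network, assuming pre-computed context features and a placeholder Gaussian head; per-class mean-pooling is what accommodates arbitrary numbers of shots.

```python
import torch
import torch.nn as nn

class WeightAmortiser(nn.Module):
    """Maps pooled per-class context features to a distribution over
    that class's linear-classifier weights, in a single forward pass."""
    def __init__(self, dim):
        super().__init__()
        self.mu = nn.Linear(dim, dim)
        self.log_sigma = nn.Linear(dim, dim)

    def forward(self, z_context, y_context, num_classes):
        # Pool features per class: works for any number of shots.
        pooled = torch.stack(
            [z_context[y_context == k].mean(dim=0) for k in range(num_classes)]
        )
        dist = torch.distributions.Normal(self.mu(pooled), self.log_sigma(pooled).exp())
        w = dist.rsample()  # (num_classes, dim) sampled classifier weights
        return w, dist
```

Target logits can then be formed as `z_target @ w.T`, and averaging predictions over several weight samples approximates the predictive distribution.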
no code implementations • WS 2017 • Jonathan Gordon, Stephen Aguilar, Emily Sheng, Gully Burns
Learners need to find suitable documents to read and prioritize them in an appropriate order.
no code implementations • WS 2017 • Emily Sheng, Prem Natarajan, Jonathan Gordon, Gully Burns
We refer to this learning utility as the "pedagogical value" of the document to the learner.
no code implementations • 29 Jun 2017 • Jonathan Gordon, José Miguel Hernández-Lobato
However, these techniques a) cannot account for model uncertainty in the estimation of the model's discriminative component and b) lack flexibility to capture complex stochastic patterns in the label generation process.