1 code implementation • ICML 2020 • Xiao Shi Huang, Felipe Perez, Jimmy Ba, Maksims Volkovs
As Transformer models are becoming larger and more expensive to train, recent research has focused on understanding and improving optimization in these models.
1 code implementation • ICML 2020 • Xiao Shi Huang, Felipe Perez, Jimmy Ba, Maksims Volkovs
As Transformer models are becoming larger and more expensive to train, recent research has focused on understanding and improving optimization in these models.
no code implementations • 1 Sep 2024 • Blair Yang, Fuyang Cui, Keiran Paster, Jimmy Ba, Pashootan Vaezipoor, Silviu Pitis, Michael R. Zhang
The rapid development and dynamic nature of large language models (LLMs) make it difficult for conventional quantitative benchmarks to accurately assess their capabilities.
1 code implementation • 30 Jul 2024 • Brandon Jaipersaud, Paul Zhang, Jimmy Ba, Andrew Petersen, Lisa Zhang, Michael R. Zhang
We propose and evaluate a question-answering system that uses decomposed prompting to classify and answer student questions on a course discussion board.
1 code implementation • 5 Mar 2024 • Nathaniel Li, Alexander Pan, Anjali Gopal, Summer Yue, Daniel Berrios, Alice Gatti, Justin D. Li, Ann-Kathrin Dombrowski, Shashwat Goel, Long Phan, Gabriel Mukobi, Nathan Helm-Burger, Rassin Lababidi, Lennart Justen, Andrew B. Liu, Michael Chen, Isabelle Barrass, Oliver Zhang, Xiaoyuan Zhu, Rishub Tamirisa, Bhrugu Bharathi, Adam Khoja, Zhenqi Zhao, Ariel Herbert-Voss, Cort B. Breuer, Samuel Marks, Oam Patel, Andy Zou, Mantas Mazeika, Zifan Wang, Palash Oswal, Weiran Lin, Adam A. Hunt, Justin Tienken-Harder, Kevin Y. Shih, Kemper Talley, John Guan, Russell Kaplan, Ian Steneker, David Campbell, Brad Jokubaitis, Alex Levinson, Jean Wang, William Qian, Kallol Krishna Karmakar, Steven Basart, Stephen Fitz, Mindy Levine, Ponnurangam Kumaraguru, Uday Tupakula, Vijay Varadharajan, Ruoyu Wang, Yan Shoshitaishvili, Jimmy Ba, Kevin M. Esvelt, Alexandr Wang, Dan Hendrycks
To measure these risks of malicious use, government institutions and major AI labs are developing evaluations for hazardous capabilities in LLMs.
no code implementations • 7 Dec 2023 • Michael R. Zhang, Nishkrit Desai, Juhan Bae, Jonathan Lorraine, Jimmy Ba
This paper studies using foundational large language models (LLMs) to make decisions during hyperparameter optimization (HPO).
2 code implementations • 10 Oct 2023 • Keiran Paster, Marco Dos Santos, Zhangir Azerbayev, Jimmy Ba
We hope that our dataset, openly released on the Hugging Face Hub, will help spur advances in the reasoning abilities of large language models.
1 code implementation • 25 Sep 2023 • Yangjun Ruan, Honghua Dong, Andrew Wang, Silviu Pitis, Yongchao Zhou, Jimmy Ba, Yann Dubois, Chris J. Maddison, Tatsunori Hashimoto
Alongside the emulator, we develop an LM-based automatic safety evaluator that examines agent failures and quantifies associated risks.
2 code implementations • NeurIPS 2023 • Shalev Lifshitz, Keiran Paster, Harris Chan, Jimmy Ba, Sheila Mcilraith
Constructing AI models that respond to text instructions is challenging, especially for sequential decision-making tasks.
1 code implementation • 24 May 2023 • Yongchao Zhou, Hshmat Sahak, Jimmy Ba
In this paper, we present Diffusion Inversion, a simple yet effective method that leverages the pre-trained generative model, Stable Diffusion, to generate diverse, high-quality training data for image classification.
2 code implementations • NeurIPS 2023 • Yann Dubois, Xuechen Li, Rohan Taori, Tianyi Zhang, Ishaan Gulrajani, Jimmy Ba, Carlos Guestrin, Percy Liang, Tatsunori B. Hashimoto
As a demonstration of the research possible in AlpacaFarm, we find that methods that use a reward model can substantially improve over supervised fine-tuning and that our reference PPO implementation leads to a +10% improvement in win-rate against Davinci003.
2 code implementations • 19 May 2023 • Augustin Toma, Patrick R. Lawler, Jimmy Ba, Rahul G. Krishnan, Barry B. Rubin, Bo wang
We present Clinical Camel, an open large language model (LLM) explicitly tailored for clinical research.
1 code implementation • 6 May 2023 • Anastasia Razdaibiedina, Yuning Mao, Rui Hou, Madian Khabsa, Mike Lewis, Jimmy Ba, Amjad Almahairi
In this work, we introduce Residual Prompt Tuning - a simple and efficient method that significantly improves the performance and stability of prompt tuning.
2 code implementations • 26 Apr 2023 • Zhaoyan Liu, Noel Vouitsis, Satya Krishna Gorti, Jimmy Ba, Gabriel Loaiza-Ganem
We propose TR0N, a highly general framework to turn pre-trained unconditional generative models, such as GANs and VAEs, into conditional models.
Ranked #30 on Text-to-Image Generation on MS COCO
1 code implementation • 12 Apr 2023 • Silviu Pitis, Michael R. Zhang, Andrew Wang, Jimmy Ba
Methods such as chain-of-thought prompting and self-consistency have pushed the frontier of language model reasoning performance with no additional training.
7 code implementations • 10 Jan 2023 • Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, Timothy Lillicrap
Developing a general algorithm that learns to solve tasks across a wide range of applications has been a fundamental challenge in artificial intelligence.
Ranked #6 on Atari Games 100k on Atari 100k
no code implementations • 7 Dec 2022 • Juhan Bae, Michael R. Zhang, Michael Ruan, Eric Wang, So Hasegawa, Jimmy Ba, Roger Grosse
Variational autoencoders (VAEs) are powerful tools for learning latent representations of data used in a wide range of applications.
4 code implementations • 3 Nov 2022 • Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, Jimmy Ba
By conditioning on natural language instructions, large language models (LLMs) have displayed impressive capabilities as general-purpose computers.
no code implementations • 27 Sep 2022 • Siddhartha Rao Kamalakara, Acyr Locatelli, Bharat Venkitesh, Jimmy Ba, Yarin Gal, Aidan N. Gomez
Training deep neural networks in low rank, i. e. with factorised layers, is of particular interest to the community: it offers efficiency over unfactorised training in terms of both memory consumption and training time.
2 code implementations • 1 Jun 2022 • Yongchao Zhou, Ehsan Nezhadarya, Jimmy Ba
Dataset distillation can be formulated as a bi-level meta-learning problem where the outer loop optimizes the meta-dataset and the inner loop trains a model on the distilled data.
Ranked #4 on Dataset Distillation - 1IPC on TinyImageNet
no code implementations • 31 May 2022 • Keiran Paster, Sheila Mcilraith, Jimmy Ba
In all tested domains, ESPER achieves significantly better alignment between the target return and achieved return than simply conditioning on returns.
no code implementations • 3 May 2022 • Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang
We study the first gradient descent step on the first-layer parameters $\boldsymbol{W}$ in a two-layer neural network: $f(\boldsymbol{x}) = \frac{1}{\sqrt{N}}\boldsymbol{a}^\top\sigma(\boldsymbol{W}^\top\boldsymbol{x})$, where $\boldsymbol{W}\in\mathbb{R}^{d\times N}, \boldsymbol{a}\in\mathbb{R}^{N}$ are randomly initialized, and the training objective is the empirical MSE loss: $\frac{1}{n}\sum_{i=1}^n (f(\boldsymbol{x}_i)-y_i)^2$.
1 code implementation • NeurIPS 2021 • Beining Han, Chongyi Zheng, Harris Chan, Keiran Paster, Michael R. Zhang, Jimmy Ba
These changes are often spurious and unrelated to the underlying problem, such as background shifts for visual input agents.
no code implementations • ICLR 2022 • Jimmy Ba, Murat A Erdogdu, Marzyeh Ghassemi, Shengyang Sun, Taiji Suzuki, Denny Wu, Tianzong Zhang
Stein variational gradient descent (SVGD) is a deterministic inference algorithm that evolves a set of particles to fit a target distribution.
2 code implementations • NeurIPS 2021 • Vaibhav Saxena, Jimmy Ba, Danijar Hafner
We introduce the Clockwork VAE (CW-VAE), a video prediction model that leverages a hierarchy of latent sequences, where higher levels tick at slower intervals.
1 code implementation • 15 Jan 2021 • Yuhuai Wu, Markus Rabe, Wenda Li, Jimmy Ba, Roger Grosse, Christian Szegedy
While designing inductive bias in neural architectures has been widely studied, we hypothesize that transformer networks are flexible enough to learn inductive bias from suitable generic tasks.
no code implementations • 1 Jan 2021 • Vaibhav Saxena, Jimmy Ba, Danijar Hafner
Deep learning has shown promise for accurately predicting high-dimensional video sequences.
no code implementations • NeurIPS 2021 • Jingling Li, Mozhi Zhang, Keyulu Xu, John P. Dickerson, Jimmy Ba
Our framework measures a network's robustness via the predictive power in its representations -- the test performance of a linear model trained on the learned representations using a small set of clean labels.
1 code implementation • 21 Dec 2020 • Brendon Matusch, Jimmy Ba, Danijar Hafner
Moreover, input entropy and information gain correlate more strongly with human similarity than task reward does, suggesting the use of intrinsic objectives for designing agents that behave similarly to human players.
no code implementations • ICLR 2021 • Keiran Paster, Sheila A. McIlraith, Jimmy Ba
Learning task-agnostic dynamics models in high-dimensional observation spaces can be challenging for model-based RL agents.
9 code implementations • ICLR 2021 • Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba
The world model uses discrete representations and is trained separately from the policy.
Ranked #3 on Atari Games on Atari 2600 Skiing (using extra training data)
1 code implementation • 3 Sep 2020 • Danijar Hafner, Pedro A. Ortega, Jimmy Ba, Thomas Parr, Karl Friston, Nicolas Heess
While the narrow objectives correspond to domain-specific rewards as typical in reinforcement learning, the general objectives maximize information with the environment through latent variable models of input sequences.
1 code implementation • 9 Jul 2020 • Fartash Faghri, David Duvenaud, David J. Fleet, Jimmy Ba
We introduce a method, Gradient Clustering, to minimize the variance of average mini-batch gradient with stratified sampling.
3 code implementations • 8 Jul 2020 • Yuhuai Wu, Honghua Dong, Roger Grosse, Jimmy Ba
In this work, we focus on an analogical reasoning task that contains rich compositional structures, Raven's Progressive Matrices (RPM).
1 code implementation • ICLR 2021 • Yuhuai Wu, Albert Qiaochu Jiang, Jimmy Ba, Roger Grosse
In learning-assisted theorem proving, one of the most critical challenges is to generalize to theorems unlike those seen at training time.
2 code implementations • ICML 2020 • Silviu Pitis, Harris Chan, Stephen Zhao, Bradly Stadie, Jimmy Ba
What goals should a multi-goal reinforcement learning agent pursue during training in long-horizon tasks?
no code implementations • ICLR 2021 • Shun-ichi Amari, Jimmy Ba, Roger Grosse, Xuechen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu
While second order optimizers such as natural gradient descent (NGD) often speed up optimization, their effect on generalization has been called into question.
no code implementations • ICLR 2020 • Jimmy Ba, Murat Erdogdu, Taiji Suzuki, Denny Wu, Tianzong Zhang
This paper investigates the generalization properties of two-layer neural networks in high-dimensions, i. e. when the number of samples $n$, features $d$, and neurons $h$ tend to infinity at the same rate.
5 code implementations • ICLR 2020 • Yeming Wen, Dustin Tran, Jimmy Ba
We also apply BatchEnsemble to lifelong learning, where on Split-CIFAR-100, BatchEnsemble yields comparable performance to progressive neural networks while having a much lower computational and memory costs.
2 code implementations • ICLR 2020 • Silviu Pitis, Harris Chan, Kiarash Jamali, Jimmy Ba
When defining distances, the triangle inequality has proven to be a useful constraint, both theoretically--to prove convergence and optimality guarantees--and empirically--as an inductive bias.
21 code implementations • ICLR 2020 • Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi
Learned world models summarize an agent's experience to facilitate learning complex behaviors.
no code implementations • pproximateinference AABI Symposium 2019 • Jimmy Ba, Murat A. Erdogdu, Marzyeh Ghassemi, Taiji Suzuki, Shengyang Sun, Denny Wu, Tianzong Zhang
Particle-based inference algorithm is a promising method to efficiently generate samples for an intractable target distribution by iteratively updating a set of particles.
no code implementations • ICLR 2020 • Yuanhao Wang, Guodong Zhang, Jimmy Ba
Many tasks in modern machine learning can be formulated as finding equilibria in \emph{sequential} games.
no code implementations • 25 Sep 2019 • Qingru Zhang, Yuhuai Wu, Fartash Faghri, Tianzong Zhang, Jimmy Ba
In this paper, we present a non-asymptotic analysis of SVRG under a noisy least squares regression problem.
19 code implementations • NeurIPS 2019 • Michael R. Zhang, James Lucas, Geoffrey Hinton, Jimmy Ba
The vast majority of successful deep neural networks are trained using variants of stochastic gradient descent (SGD) algorithms.
2 code implementations • 3 Jul 2019 • Tingwu Wang, Xuchan Bao, Ignasi Clavera, Jerrick Hoang, Yeming Wen, Eric Langlois, Shunshi Zhang, Guodong Zhang, Pieter Abbeel, Jimmy Ba
Model-based reinforcement learning (MBRL) is widely seen as having the potential to be significantly more sample efficient than model-free RL.
1 code implementation • ICLR 2020 • Tingwu Wang, Jimmy Ba
Model-based reinforcement learning (MBRL) with model-predictive control or online planning has shown great potential for locomotion control tasks in terms of both sample efficiency and asymptotic performance.
1 code implementation • 12 Jun 2019 • Tingwu Wang, Yuhao Zhou, Sanja Fidler, Jimmy Ba
To address the two challenges, we formulate automatic robot design as a graph search problem and perform evolution search in graph space.
1 code implementation • NeurIPS 2019 • Jenny Liu, Aviral Kumar, Jimmy Ba, Jamie Kiros, Kevin Swersky
We introduce graph normalizing flows: a new, reversible graph neural network model for prediction and generation.
no code implementations • ICLR 2019 • Tingwu Wang, Yuhao Zhou, Sanja Fidler, Jimmy Ba
To address the two challenges, we formulate automatic robot design as a graph search problem and perform evolution search in graph space.
no code implementations • ICLR 2019 • Yuhuai Wu, Harris Chan, Jamie Kiros, Sanja Fidler, Jimmy Ba
Sparse reward is one of the most challenging problems in reinforcement learning (RL).
no code implementations • 21 Feb 2019 • Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba
We demonstrate that the learning performance of our method is more accurately captured by the structure of the covariance matrix of the noise rather than by the variance of gradients.
1 code implementation • ICLR 2019 • Sheng Jia, Jamie Kiros, Jimmy Ba
Building agents to interact with the web would allow for significant improvements in knowledge understanding and representation learning.
no code implementations • 12 Feb 2019 • Harris Chan, Yuhuai Wu, Jamie Kiros, Sanja Fidler, Jimmy Ba
We first analyze the differences among goal representation, and show that ACTRCE can efficiently solve difficult reinforcement learning problems in challenging 3D navigation tasks, whereas HER with non-language goal representation failed to learn.
1 code implementation • NeurIPS 2018 • Matthew MacKay, Paul Vicol, Jimmy Ba, Roger Grosse
Reversible RNNs---RNNs for which the hidden-to-hidden transition can be reversed---offer a path to reduce the memory requirements of training, as hidden states need not be stored and instead can be recomputed during backpropagation.
no code implementations • 27 Sep 2018 • Yeming Wen, Kevin Luk, Maxime Gazeau, Guodong Zhang, Harris Chan, Jimmy Ba
Unfortunately, a major drawback is the so-called generalization gap: large-batch training typically leads to a degradation in generalization performance of the model as compared to small-batch training.
3 code implementations • ICLR 2018 • Yeming Wen, Paul Vicol, Jimmy Ba, Dustin Tran, Roger Grosse
Stochastic neural net weights are used in a variety of contexts, including regularization, Bayesian neural nets, exploration in reinforcement learning, and evolution strategies.
no code implementations • NeurIPS 2018 • Maziar Sanjabi, Jimmy Ba, Meisam Razaviyayn, Jason D. Lee
A popular GAN formulation is based on the use of Wasserstein distance as a metric between probability distributions.
1 code implementation • ICLR 2018 • Tingwu Wang, Renjie Liao, Jimmy Ba, Sanja Fidler
We address the problem of learning structured policies for continuous control.
no code implementations • ICLR 2018 • James Martens, Jimmy Ba, Matt Johnson
Kronecker-factor Approximate Curvature (Martens & Grosse, 2015) (K-FAC) is a 2nd-order optimization method which has been shown to give state-of-the-art performance on large-scale neural network optimization tasks (Ba et al., 2017).
8 code implementations • NeurIPS 2017 • Yuhuai Wu, Elman Mansimov, Shun Liao, Roger Grosse, Jimmy Ba
In this work, we propose to apply trust region optimization to deep reinforcement learning using a recently proposed Kronecker-factored approximation to the curvature.
3 code implementations • NeurIPS 2016 • Jimmy Ba, Geoffrey Hinton, Volodymyr Mnih, Joel Z. Leibo, Catalin Ionescu
Until recently, research on artificial neural networks was largely restricted to systems with only two types of variable: Neural activities that represent the current or recent input and weights that learn to capture regularities among inputs, outputs and payoffs.
no code implementations • NeurIPS 2015 • Jimmy Ba, Roger Grosse, Ruslan Salakhutdinov, Brendan Frey
Despite their success, convolutional neural networks are computationally expensive because they must examine all image locations.
no code implementations • ICCV 2015 • Jimmy Ba, Kevin Swersky, Sanja Fidler, Ruslan Salakhutdinov
One of the main challenges in Zero-Shot Learning of visual categories is gathering semantic attributes to accompany images.
90 code implementations • 10 Feb 2015 • Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard Zemel, Yoshua Bengio
Inspired by recent work in machine translation and object detection, we introduce an attention based model that automatically learns to describe the content of images.
5 code implementations • 24 Dec 2014 • Jimmy Ba, Volodymyr Mnih, Koray Kavukcuoglu
We present an attention-based model for recognizing multiple objects in images.
85 code implementations • 22 Dec 2014 • Diederik P. Kingma, Jimmy Ba
We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments.
no code implementations • NeurIPS 2013 • Jimmy Ba, Brendan Frey
For example, our model achieves 5. 8% error on the NORB test set, which is better than state-of-the-art results obtained using convolutional architectures. "