no code implementations • ICML 2020 • Alexander Vezhnevets, Yuhuai Wu, Maria Eckstein, Rémi Leblond, Joel Z. Leibo
This paper investigates generalisation in multi-agent games, where the generality of the agent can be evaluated by playing against opponents it hasn't seen during training.
no code implementations • 25 May 2022 • Yuhuai Wu, Albert Q. Jiang, Wenda Li, Markus N. Rabe, Charles Staats, Mateja Jamnik, Christian Szegedy
Autoformalization is the process of automatically translating from natural language mathematics to formal specifications and proofs.
Ranked #1 on Automated Theorem Proving on miniF2F-test (using extra training data)
no code implementations • 22 May 2022 • Albert Q. Jiang, Wenda Li, Szymon Tworkowski, Konrad Czechowski, Tomasz Odrzygóźdź, Piotr Miłoś, Yuhuai Wu, Mateja Jamnik
Thor increases a language model's success rate on the PISA dataset from $39\%$ to $57\%$, while solving $8.2\%$ of problems neither language models nor automated theorem provers are able to solve on their own.
Ranked #2 on Automated Theorem Proving on miniF2F-test
no code implementations • 28 Mar 2022 • Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah D. Goodman
We show that STaR significantly improves performance on multiple datasets compared to a model fine-tuned to directly predict final answers, and performs comparably to fine-tuning a 30$\times$ larger state-of-the-art language model on CommonsenseQA.
Ranked #9 on Common Sense Reasoning on CommonsenseQA
1 code implementation • ICLR 2022 • Yuhuai Wu, Markus N. Rabe, DeLesley Hutchins, Christian Szegedy
Language models typically need to be trained or finetuned in order to acquire new knowledge, which involves updating their weights.
no code implementations • 11 Mar 2022 • DeLesley Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur
It is merely a transformer layer: it uses self-attention and cross-attention to efficiently compute a recurrent function over a large set of state vectors and tokens.
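As a rough illustration of such a layer, here is a minimal sketch (PyTorch; the module sizes and the plain residual state update are my assumptions, since the paper uses a gated state update): tokens in the current block self-attend, then cross-attend to a set of recurrent state vectors, and the state is updated by attending back to the block.

```python
import torch
import torch.nn as nn

class BlockRecurrentCellSketch(nn.Module):
    # Illustrative only: not the paper's exact architecture.
    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.state_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                nn.Linear(4 * d_model, d_model))

    def forward(self, x, state):
        # x: (batch, block_len, d_model) tokens of the current block
        # state: (batch, n_state, d_model) recurrent state carried across blocks
        x = x + self.self_attn(x, x, x, need_weights=False)[0]           # tokens attend to tokens
        x = x + self.cross_attn(x, state, state, need_weights=False)[0]  # tokens read the state
        x = x + self.ff(x)
        state = state + self.state_attn(state, x, x, need_weights=False)[0]  # state reads the block
        return x, state
```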
3 code implementations • 26 Oct 2021 • Piotr Nawrot, Szymon Tworkowski, Michał Tyrolski, Łukasz Kaiser, Yuhuai Wu, Christian Szegedy, Henryk Michalewski
Transformer models yield impressive results on many NLP and sequence modeling tasks.
Ranked #4 on Image Generation on ImageNet 32x32 (bpd metric)
no code implementations • ICLR 2022 • Chaochao Lu, Yuhuai Wu, José Miguel Hernández-Lobato, Bernhard Schölkopf
Extensive experiments on both synthetic and real-world datasets show that our approach outperforms a variety of baseline methods.
no code implementations • 29 Sep 2021 • Jin Peng Zhou, Yuhuai Wu, Qiyang Li, Roger Baker Grosse
With newly extracted theorems, we show that the existing proofs in the MetaMath database can be refactored.
no code implementations • 27 Aug 2021 • Cem Anil, Guodong Zhang, Yuhuai Wu, Roger Grosse
We develop instantiations of the PVG for two algorithmic tasks, and show that in practice, the verifier learns a robust decision rule that is able to receive useful and reliable information from an untrusted prover.
1 code implementation • NeurIPS 2021 • Konrad Czechowski, Tomasz Odrzygóźdź, Marek Zbysiński, Michał Zawalski, Krzysztof Olejnik, Yuhuai Wu, Łukasz Kuciński, Piotr Miłoś
In this paper, we implement kSubS using a transformer-based subgoal module coupled with the classical best-first search framework.
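As a concrete picture of that coupling, below is a small sketch of best-first search driven by a learned subgoal generator (Python; `subgoal_model`, `value`, and `is_solved` are hypothetical placeholders, and the low-level policy that actually reaches each subgoal is omitted).

```python
import heapq

def subgoal_best_first_search(start, is_solved, subgoal_model, value, budget=1000):
    # Best-first search whose successors are subgoals proposed by a learned model.
    frontier = [(-value(start), 0, start)]
    tie_break = 0
    for _ in range(budget):
        if not frontier:
            break
        _, _, state = heapq.heappop(frontier)
        if is_solved(state):
            return state
        for subgoal in subgoal_model(state):  # candidate subgoals a few steps ahead
            tie_break += 1
            heapq.heappush(frontier, (-value(subgoal), tie_break, subgoal))
    return None
```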
no code implementations • 16 Aug 2021 • Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Kohd, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, aditi raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.
no code implementations • 24 Feb 2021 • Chaochao Lu, Yuhuai Wu, José Miguel Hernández-Lobato, Bernhard Schölkopf
Finally, in the discussion, we further explore the aforementioned assumption and propose a more general hypothesis, called the Agnostic Hypothesis: there exists a set of hidden causal factors affecting both inputs and outcomes.
3 code implementations • ICLR 2022 • Jesse Michael Han, Jason Rute, Yuhuai Wu, Edward W. Ayers, Stanislas Polu
Labeled data for imitation learning of theorem proving in large libraries of formalized mathematics is scarce, as such libraries require years of concentrated effort by human specialists to build.
1 code implementation • 15 Jan 2021 • Yuhuai Wu, Markus Rabe, Wenda Li, Jimmy Ba, Roger Grosse, Christian Szegedy
While designing inductive bias in neural architectures has been widely studied, we hypothesize that transformer networks are flexible enough to learn inductive bias from suitable generic tasks.
no code implementations • 1 Jan 2021 • Chaochao Lu, Yuhuai Wu, José Miguel Hernández-Lobato, Bernhard Schölkopf
As an alternative, we propose Invariant Causal Representation Learning (ICRL), a learning paradigm that enables out-of-distribution generalization in the nonlinear setting (i.e., nonlinear representations and nonlinear classifiers).
3 code implementations • 8 Jul 2020 • Yuhuai Wu, Honghua Dong, Roger Grosse, Jimmy Ba
In this work, we focus on an analogical reasoning task that contains rich compositional structures, Raven's Progressive Matrices (RPM).
no code implementations • 7 Jul 2020 • Pashootan Vaezipoor, Gil Lederman, Yuhuai Wu, Chris J. Maddison, Roger Grosse, Edward Lee, Sanjit A. Seshia, Fahiem Bacchus
Propositional model counting, or #SAT, is the problem of computing the number of satisfying assignments of a Boolean formula; many discrete probabilistic inference problems can be translated into model counting problems to be solved by #SAT solvers.
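For concreteness, a brute-force model counter is easy to write; this only illustrates the definition, not how practical #SAT solvers work.

```python
from itertools import product

def count_models(n_vars, clauses):
    # clauses: CNF as lists of signed integers, e.g. -2 means "not x2".
    count = 0
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause) for clause in clauses):
            count += 1
    return count

# (x1 or x2) and (not x1 or x3): 4 of the 8 assignments are satisfying.
print(count_models(3, [[1, 2], [-1, 3]]))  # -> 4
```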
1 code implementation • ICLR 2021 • Yuhuai Wu, Albert Qiaochu Jiang, Jimmy Ba, Roger Grosse
In learning-assisted theorem proving, one of the most critical challenges is to generalize to theorems unlike those seen at training time.
1 code implementation • ICLR 2021 • Wenda Li, Lei Yu, Yuhuai Wu, Lawrence C. Paulson
In this paper, we present a benchmark for high-level mathematical reasoning and study the reasoning capabilities of neural sequence-to-sequence models.
no code implementations • 25 Sep 2019 • Qingru Zhang, Yuhuai Wu, Fartash Faghri, Tianzong Zhang, Jimmy Ba
In this paper, we present a non-asymptotic analysis of SVRG under a noisy least squares regression problem.
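For reference, a minimal sketch of the SVRG iteration that such an analysis concerns, specialized to least squares (the step size, epoch length, and notation are my own choices, not the paper's):

```python
import numpy as np

def svrg_least_squares(X, y, lr=0.01, epochs=10, inner_steps=None):
    n, d = X.shape
    inner_steps = inner_steps or n
    w = np.zeros(d)
    grad_i = lambda w, i: X[i] * (X[i] @ w - y[i])   # per-example gradient
    for _ in range(epochs):
        w_snap = w.copy()
        full_grad = X.T @ (X @ w_snap - y) / n       # full gradient at the snapshot
        for _ in range(inner_steps):
            i = np.random.randint(n)
            # variance-reduced stochastic gradient
            w -= lr * (grad_i(w, i) - grad_i(w_snap, i) + full_grad)
    return w
```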
1 code implementation • 4 Jun 2019 • Alexander Sasha Vezhnevets, Yuhuai Wu, Remi Leblond, Joel Z. Leibo
This paper investigates generalisation in multi-agent games, where the generality of the agent can be evaluated by playing against opponents it hasn't seen during training.
no code implementations • ICLR 2019 • Yuhuai Wu, Harris Chan, Jamie Kiros, Sanja Fidler, Jimmy Ba
Sparse reward is one of the most challenging problems in reinforcement learning (RL).
1 code implementation • 7 Mar 2019 • Emilio Parisotto, Soham Ghosh, Sai Bhargav Yalamanchi, Varsha Chinnaobireddy, Yuhuai Wu, Ruslan Salakhutdinov
In this multi-agent setting, a set of parallel agents are executed in the same environment, and each of these "rollout" agents is given the means to communicate with the others.
Ranked #1 on Meta Reinforcement Learning on 10-Monty-Hall
no code implementations • 12 Feb 2019 • Harris Chan, Yuhuai Wu, Jamie Kiros, Sanja Fidler, Jimmy Ba
We first analyze the differences among goal representations, and show that ACTRCE can efficiently solve difficult reinforcement learning problems in challenging 3D navigation tasks, whereas HER with non-language goal representations failed to learn.
no code implementations • NeurIPS 2018 • Bradly Stadie, Ge Yang, Rein Houthooft, Peter Chen, Yan Duan, Yuhuai Wu, Pieter Abbeel, Ilya Sutskever
Results are presented on a new environment we call `Krazy World': a difficult high-dimensional gridworld which is designed to highlight the importance of correctly differentiating through sampling distributions in meta-reinforcement learning.
1 code implementation • ICLR 2018 • Yuhuai Wu, Mengye Ren, Renjie Liao, Roger Grosse
Careful tuning of the learning rate, or even schedules thereof, can be crucial to effective neural net training.
7 code implementations • ICLR 2018 • Bradly C. Stadie, Ge Yang, Rein Houthooft, Xi Chen, Yan Duan, Yuhuai Wu, Pieter Abbeel, Ilya Sutskever
We consider the problem of exploration in meta reinforcement learning.
no code implementations • 17 Jan 2018 • Jiaming Song, Yuhuai Wu
In this technical report, we consider an approach that combines the PPO objective and K-FAC natural gradient optimization, which we call PPOKFAC.
7 code implementations • ICLR 2018 • Will Grathwohl, Dami Choi, Yuhuai Wu, Geoffrey Roeder, David Duvenaud
Gradient-based optimization is the foundation of deep learning and reinforcement learning.
8 code implementations • NeurIPS 2017 • Yuhuai Wu, Elman Mansimov, Shun Liao, Roger Grosse, Jimmy Ba
In this work, we propose to apply trust region optimization to deep reinforcement learning using a recently proposed Kronecker-factored approximation to the curvature.
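A rough sketch of the Kronecker-factored piece for one fully connected layer is below (NumPy; the damping value is my choice, and the trust-region step-size control is omitted): the layer's Fisher block is approximated as a Kronecker product of an input-activation covariance A and a pre-activation-gradient covariance S, so the approximate natural-gradient step reduces to S^{-1} G A^{-1}, requiring only two small matrix inverses.

```python
import numpy as np

def kfac_natural_gradient(grad_W, acts, pre_act_grads, damping=1e-3):
    # grad_W: (out, in) weight gradient; acts: (batch, in); pre_act_grads: (batch, out)
    batch = acts.shape[0]
    A = acts.T @ acts / batch                      # input-side Kronecker factor
    S = pre_act_grads.T @ pre_act_grads / batch    # output-side Kronecker factor
    A += damping * np.eye(A.shape[0])
    S += damping * np.eye(S.shape[0])
    # (A kron S)^{-1} vec(G) corresponds to S^{-1} G A^{-1}
    return np.linalg.solve(S, grad_W) @ np.linalg.inv(A)
```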
1 code implementation • NeurIPS 2017 • Geoffrey Roeder, Yuhuai Wu, David Duvenaud
We propose a simple and general variant of the standard reparameterized gradient estimator for the variational evidence lower bound.
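One plausible reading of that variant, assuming a diagonal Gaussian variational posterior, is sketched below (PyTorch): the ELBO is computed from a reparameterized sample, but the variational parameters are detached inside log q, so the score-function term drops out of the gradient. The exact estimator is defined in the paper; this is only an illustration of the mechanism.

```python
import math
import torch

def elbo_path_derivative(mu, log_sigma, log_joint):
    # Reparameterized sample z = mu + sigma * eps.
    eps = torch.randn_like(mu)
    z = mu + log_sigma.exp() * eps
    # Detach the variational parameters inside log q(z): gradients then flow
    # only through the sampling path z, not through the density's parameters.
    mu_d, log_sigma_d = mu.detach(), log_sigma.detach()
    log_q = (-0.5 * ((z - mu_d) / log_sigma_d.exp()) ** 2
             - log_sigma_d - 0.5 * math.log(2 * math.pi)).sum()
    return log_joint(z) - log_q  # maximize this; differentiate w.r.t. mu, log_sigma
```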
2 code implementations • 14 Nov 2016 • Yuhuai Wu, Yuri Burda, Ruslan Salakhutdinov, Roger Grosse
The past several years have seen remarkable progress in generative models which produce convincing samples of images and other modalities.
no code implementations • NeurIPS 2016 • Yuhuai Wu, Saizheng Zhang, Ying Zhang, Yoshua Bengio, Ruslan Salakhutdinov
We introduce a general and simple structural design called Multiplicative Integration (MI) to improve recurrent neural networks (RNNs).
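In its simplest form (my paraphrase, not the paper's code), MI replaces the additive combination Wx + Uh inside the nonlinearity with a Hadamard product; the paper's general form adds extra per-term gating and bias vectors.

```python
import numpy as np

def vanilla_rnn_step(x, h, W, U, b):
    return np.tanh(W @ x + U @ h + b)          # additive integration

def mi_rnn_step(x, h, W, U, b):
    return np.tanh((W @ x) * (U @ h) + b)      # simplest multiplicative integration
```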
no code implementations • NeurIPS 2016 • Behnam Neyshabur, Yuhuai Wu, Ruslan Salakhutdinov, Nathan Srebro
We investigate the parameter-space geometry of recurrent neural networks (RNNs), and develop an adaptation of path-SGD optimization method, attuned to this geometry, that can learn plain RNNs with ReLU activations.
no code implementations • NeurIPS 2016 • Saizheng Zhang, Yuhuai Wu, Tong Che, Zhouhan Lin, Roland Memisevic, Ruslan Salakhutdinov, Yoshua Bengio
In this paper, we systematically analyze the connecting architectures of recurrent neural networks (RNNs).
Ranked #21 on Language Modelling on Text8
no code implementations • 19 Sep 2015 • Yoshua Bengio, Thomas Mesnard, Asja Fischer, Saizheng Zhang, Yuhuai Wu
We introduce a weight update formula that is expressed only in terms of firing rates and their derivatives and that results in changes consistent with those associated with spike-timing dependent plasticity (STDP) rules and biological observations, even though the explicit timing of spikes is not needed.
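Reading that description literally, the update is of the form "weight change proportional to presynaptic rate times the rate of change of postsynaptic activity"; the toy sketch below follows that reading, with the learning rate, discretization, and exact functional form being my assumptions rather than the paper's.

```python
def weight_update(pre_rate, post_rate_prev, post_rate_now, dt=1.0, lr=1e-3):
    # Finite-difference estimate of the postsynaptic rate derivative.
    d_post = (post_rate_now - post_rate_prev) / dt
    # Hedged sketch of a rate-based, STDP-consistent rule: delta_w proportional to pre * d(post)/dt.
    return lr * pre_rate * d_post
```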