Search Results for author: Yuhuai Wu

Found 50 papers, 28 papers with code

OPtions as REsponses: Grounding behavioural hierarchies in multi-agent reinforcement learning

no code implementations ICML 2020 Alexander Vezhnevets, Yuhuai Wu, Maria Eckstein, Rémi Leblond, Joel Z. Leibo

This paper investigates generalisation in multi-agent games, where the generality of the agent can be evaluated by playing against opponents it hasn't seen during training.

Multi-agent Reinforcement Learning · reinforcement-learning +1

REFACTOR: Learning to Extract Theorems from Proofs

1 code implementation 26 Feb 2024 Jin Peng Zhou, Yuhuai Wu, Qiyang Li, Roger Grosse

With newly extracted theorems, we show that the existing proofs in the MetaMath database can be refactored.

Automated Theorem Proving

Length Generalization in Arithmetic Transformers

no code implementations 27 Jun 2023 Samy Jelassi, Stéphane d'Ascoli, Carles Domingo-Enrich, Yuhuai Wu, Yuanzhi Li, François Charton

We find that relative position embeddings enable length generalization for simple tasks, such as addition: models trained on $5$-digit numbers can perform $15$-digit sums.

Position
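The relative-position mechanism this entry credits for length generalization can be illustrated with a minimal sketch (an illustrative toy under generic assumptions, not the paper's implementation): attention logits receive a learned bias indexed only by the clipped offset $i - j$, so the same parameters apply to sequences longer than those seen in training.

```python
import numpy as np

def attention_with_relative_bias(q, k, v, rel_bias, max_offset):
    """Single-head attention whose logits get a bias that depends only on
    the token offset (i - j), not on absolute positions.

    q, k, v  : (seq_len, d) arrays
    rel_bias : (2 * max_offset + 1,) learned bias table, one entry per clipped offset
    """
    seq_len, d = q.shape
    logits = q @ k.T / np.sqrt(d)
    offsets = np.arange(seq_len)[:, None] - np.arange(seq_len)[None, :]
    offsets = np.clip(offsets, -max_offset, max_offset) + max_offset
    logits = logits + rel_bias[offsets]            # bias indexed by relative offset only
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Because the bias table depends on the offset alone, it can be reused on
# sequences longer than any seen in training, which is the property the entry
# associates with length generalization on addition.
rng = np.random.default_rng(0)
q = k = v = rng.normal(size=(20, 16))
out = attention_with_relative_bias(q, k, v, rel_bias=np.zeros(17), max_offset=8)
print(out.shape)  # (20, 16)
```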

PaLM 2 Technical Report

1 code implementation17 May 2023 Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernandez Abrego, Junwhan Ahn, Jacob Austin, Paul Barham, Jan Botha, James Bradbury, Siddhartha Brahma, Kevin Brooks, Michele Catasta, Yong Cheng, Colin Cherry, Christopher A. Choquette-Choo, Aakanksha Chowdhery, Clément Crepy, Shachi Dave, Mostafa Dehghani, Sunipa Dev, Jacob Devlin, Mark Díaz, Nan Du, Ethan Dyer, Vlad Feinberg, Fangxiaoyu Feng, Vlad Fienber, Markus Freitag, Xavier Garcia, Sebastian Gehrmann, Lucas Gonzalez, Guy Gur-Ari, Steven Hand, Hadi Hashemi, Le Hou, Joshua Howland, Andrea Hu, Jeffrey Hui, Jeremy Hurwitz, Michael Isard, Abe Ittycheriah, Matthew Jagielski, Wenhao Jia, Kathleen Kenealy, Maxim Krikun, Sneha Kudugunta, Chang Lan, Katherine Lee, Benjamin Lee, Eric Li, Music Li, Wei Li, Yaguang Li, Jian Li, Hyeontaek Lim, Hanzhao Lin, Zhongtao Liu, Frederick Liu, Marcello Maggioni, Aroma Mahendru, Joshua Maynez, Vedant Misra, Maysam Moussalem, Zachary Nado, John Nham, Eric Ni, Andrew Nystrom, Alicia Parrish, Marie Pellat, Martin Polacek, Alex Polozov, Reiner Pope, Siyuan Qiao, Emily Reif, Bryan Richter, Parker Riley, Alex Castro Ros, Aurko Roy, Brennan Saeta, Rajkumar Samuel, Renee Shelby, Ambrose Slone, Daniel Smilkov, David R. So, Daniel Sohn, Simon Tokumine, Dasha Valter, Vijay Vasudevan, Kiran Vodrahalli, Xuezhi Wang, Pidong Wang, ZiRui Wang, Tao Wang, John Wieting, Yuhuai Wu, Kelvin Xu, Yunhan Xu, Linting Xue, Pengcheng Yin, Jiahui Yu, Qiao Zhang, Steven Zheng, Ce Zheng, Weikang Zhou, Denny Zhou, Slav Petrov, Yonghui Wu

Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM.

Code Generation · Common Sense Reasoning +6

Path Independent Equilibrium Models Can Better Exploit Test-Time Computation

no code implementations 18 Nov 2022 Cem Anil, Ashwini Pokle, Kaiqu Liang, Johannes Treutlein, Yuhuai Wu, Shaojie Bai, Zico Kolter, Roger Grosse

Designing networks capable of attaining better performance with an increased inference budget is important to facilitate generalization to harder problem instances.

Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs

3 code implementations 21 Oct 2022 Albert Q. Jiang, Sean Welleck, Jin Peng Zhou, Wenda Li, Jiacheng Liu, Mateja Jamnik, Timothée Lacroix, Yuhuai Wu, Guillaume Lample

In this work, we introduce Draft, Sketch, and Prove (DSP), a method that maps informal proofs to formal proof sketches, and uses the sketches to guide an automated prover by directing its search to easier sub-problems.

Ranked #3 on Automated Theorem Proving on miniF2F-valid (Pass@100 metric)

Automated Theorem Proving · Language Modelling
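A hedged sketch of the control flow this entry describes; the helper callables (`draft_informal_proof`, `informal_to_formal_sketch`, `close_gap_with_prover`) are hypothetical stand-ins for an LLM and an automated prover, not the authors' API.

```python
from typing import Callable, Iterable, List, Optional

def draft_sketch_prove(
    statement: str,
    draft_informal_proof: Callable[[str], str],                      # e.g. sample an informal proof from an LLM
    informal_to_formal_sketch: Callable[[str, str], Iterable[str]],  # returns the sketch's open sub-goals
    close_gap_with_prover: Callable[[str], Optional[str]],           # off-the-shelf automated prover
) -> Optional[List[str]]:
    """Draft an informal proof, map it to a formal sketch whose gaps are easier
    sub-problems, then ask an automated prover to close each gap."""
    informal_proof = draft_informal_proof(statement)
    open_gaps = informal_to_formal_sketch(statement, informal_proof)
    closed = []
    for gap in open_gaps:
        proof = close_gap_with_prover(gap)
        if proof is None:                 # a single unclosed gap invalidates the sketch
            return None
        closed.append(proof)
    return closed

# Toy usage with trivial stubs, just to show the plumbing.
print(draft_sketch_prove(
    "n + 0 = n",
    draft_informal_proof=lambda s: "by induction on n",
    informal_to_formal_sketch=lambda s, p: [f"base case of {s}", f"inductive step of {s}"],
    close_gap_with_prover=lambda g: f"<proof of {g}>",
))
```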

Exploring Length Generalization in Large Language Models

no code implementations 11 Jul 2022 Cem Anil, Yuhuai Wu, Anders Andreassen, Aitor Lewkowycz, Vedant Misra, Vinay Ramasesh, Ambrose Slone, Guy Gur-Ari, Ethan Dyer, Behnam Neyshabur

The ability to extrapolate from short problem instances to longer ones is an important form of out-of-distribution generalization in reasoning tasks, and is crucial when learning from datasets where longer problem instances are rare.

Automated Theorem Proving · In-Context Learning +1

Insights into Pre-training via Simpler Synthetic Tasks

1 code implementation 21 Jun 2022 Yuhuai Wu, Felix Li, Percy Liang

Second, to our surprise, we find that pre-training on a simple and generic synthetic task defined by the Set function achieves $65\%$ of the benefits, almost matching LIME.

Autoformalization with Large Language Models

no code implementations 25 May 2022 Yuhuai Wu, Albert Q. Jiang, Wenda Li, Markus N. Rabe, Charles Staats, Mateja Jamnik, Christian Szegedy

Autoformalization is the process of automatically translating from natural language mathematics to formal specifications and proofs.

 Ranked #1 on Automated Theorem Proving on miniF2F-test (using extra training data)

Automated Theorem Proving · Program Synthesis

Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers

no code implementations 22 May 2022 Albert Q. Jiang, Wenda Li, Szymon Tworkowski, Konrad Czechowski, Tomasz Odrzygóźdź, Piotr Miłoś, Yuhuai Wu, Mateja Jamnik

Thor increases a language model's success rate on the PISA dataset from $39\%$ to $57\%$, while solving $8.2\%$ of problems neither language models nor automated theorem provers are able to solve on their own.

Automated Theorem Proving

STaR: Bootstrapping Reasoning With Reasoning

1 code implementation 28 Mar 2022 Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah D. Goodman

We show that STaR significantly improves performance on multiple datasets compared to a model fine-tuned to directly predict final answers, and performs comparably to fine-tuning a 30$\times$ larger state-of-the-art language model on CommonsenseQA.

Common Sense Reasoning · Language Modelling +1

Memorizing Transformers

3 code implementations ICLR 2022 Yuhuai Wu, Markus N. Rabe, DeLesley Hutchins, Christian Szegedy

Language models typically need to be trained or finetuned in order to acquire new knowledge, which involves updating their weights.

Language Modelling · Math

Block-Recurrent Transformers

3 code implementations 11 Mar 2022 DeLesley Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur

The recurrent cell is merely a transformer layer: it uses self-attention and cross-attention to efficiently compute a recurrent function over a large set of state vectors and tokens.

Language Modelling
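A much-simplified sketch of the mechanism this entry describes (single head, no gating, no learned projections; an illustration under those assumptions, not the paper's layer): each block of tokens self-attends and cross-attends to a small set of state vectors, and the states are updated from the block before being carried forward.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attend(queries, keys, values):
    """Plain single-head scaled dot-product attention."""
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])
    return softmax(scores) @ values

def block_recurrent_pass(tokens, state, block_size):
    """Process a long sequence block by block, carrying a fixed-size state.

    tokens : (seq_len, d) token embeddings
    state  : (num_state, d) recurrent state vectors
    """
    outputs = []
    for start in range(0, len(tokens), block_size):
        block = tokens[start:start + block_size]
        # Tokens self-attend and cross-attend to the current state.
        block = block + attend(block, block, block) + attend(block, state, state)
        # The state cross-attends to the block and is carried to the next step
        # (the "recurrent" part of the computation).
        state = state + attend(state, block, block)
        outputs.append(block)
    return np.concatenate(outputs, axis=0), state

rng = np.random.default_rng(0)
out, final_state = block_recurrent_pass(rng.normal(size=(64, 32)),
                                        rng.normal(size=(8, 32)),
                                        block_size=16)
print(out.shape, final_state.shape)  # (64, 32) (8, 32)
```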

Learning to Give Checkable Answers with Prover-Verifier Games

no code implementations 27 Aug 2021 Cem Anil, Guodong Zhang, Yuhuai Wu, Roger Grosse

We develop instantiations of the PVG for two algorithmic tasks, and show that in practice, the verifier learns a robust decision rule that is able to receive useful and reliable information from an untrusted prover.

Subgoal Search For Complex Reasoning Tasks

1 code implementation NeurIPS 2021 Konrad Czechowski, Tomasz Odrzygóźdź, Marek Zbysiński, Michał Zawalski, Krzysztof Olejnik, Yuhuai Wu, Łukasz Kuciński, Piotr Miłoś

In this paper, we implement kSubS using a transformer-based subgoal module coupled with the classical best-first search framework.

Rubik's Cube
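The coupling this entry names (a learned subgoal generator inside classical best-first search) can be sketched as below; `propose_subgoals` and `value` stand in for the paper's transformer-based subgoal module and value function, and the whole skeleton is illustrative rather than the authors' code.

```python
import heapq
import itertools
from typing import Callable, Iterable, List, Optional, Tuple

def subgoal_best_first_search(
    start,
    is_solved: Callable[[object], bool],
    propose_subgoals: Callable[[object], Iterable[object]],  # learned generator (stand-in)
    value: Callable[[object], float],                        # higher = judged closer to the goal
    budget: int = 10_000,
) -> Optional[List[object]]:
    """Best-first search whose successors are generated subgoals rather than
    primitive actions."""
    counter = itertools.count()          # heap tie-breaker so states are never compared
    frontier: List[Tuple[float, int, object, List[object]]] = [
        (-value(start), next(counter), start, [start])
    ]
    seen = {repr(start)}
    for _ in range(budget):
        if not frontier:
            break
        _, _, state, path = heapq.heappop(frontier)
        if is_solved(state):
            return path
        for subgoal in propose_subgoals(state):
            if repr(subgoal) not in seen:
                seen.add(repr(subgoal))
                heapq.heappush(frontier, (-value(subgoal), next(counter), subgoal, path + [subgoal]))
    return None

# Toy usage: states are integers, the goal is reaching 10, subgoals jump a few steps ahead.
print(subgoal_best_first_search(
    0,
    is_solved=lambda s: s >= 10,
    propose_subgoals=lambda s: [s + 2, s + 3],
    value=lambda s: float(s),
))
```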

On the Opportunities and Risks of Foundation Models

2 code implementations16 Aug 2021 Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, aditi raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.

Transfer Learning

Nonlinear Invariant Risk Minimization: A Causal Approach

no code implementations 24 Feb 2021 Chaochao Lu, Yuhuai Wu, José Miguel Hernández-Lobato, Bernhard Schölkopf

Finally, in the discussion, we further explore the aforementioned assumption and propose a more general hypothesis, called the Agnostic Hypothesis: there exists a set of hidden causal factors affecting both inputs and outcomes.

BIG-bench Machine Learning · Representation Learning

Proof Artifact Co-training for Theorem Proving with Language Models

4 code implementations ICLR 2022 Jesse Michael Han, Jason Rute, Yuhuai Wu, Edward W. Ayers, Stanislas Polu

Labeled data for imitation learning of theorem proving in large libraries of formalized mathematics is scarce as such libraries require years of concentrated effort by human specialists to be built.

Automated Theorem Proving · Imitation Learning +1

LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning

1 code implementation 15 Jan 2021 Yuhuai Wu, Markus Rabe, Wenda Li, Jimmy Ba, Roger Grosse, Christian Szegedy

While designing inductive bias in neural architectures has been widely studied, we hypothesize that transformer networks are flexible enough to learn inductive bias from suitable generic tasks.

Inductive Bias · Mathematical Reasoning

Invariant Causal Representation Learning

no code implementations 1 Jan 2021 Chaochao Lu, Yuhuai Wu, José Miguel Hernández-Lobato, Bernhard Schölkopf

As an alternative, we propose Invariant Causal Representation Learning (ICRL), a learning paradigm that enables out-of-distribution generalization in the nonlinear setting (i.e., nonlinear representations and nonlinear classifiers).

Out-of-Distribution Generalization · Representation Learning

The Scattering Compositional Learner: Discovering Objects, Attributes, Relationships in Analogical Reasoning

3 code implementations 8 Jul 2020 Yuhuai Wu, Honghua Dong, Roger Grosse, Jimmy Ba

In this work, we focus on an analogical reasoning task that contains rich compositional structures, Raven's Progressive Matrices (RPM).

Zero-shot Generalization

Learning Branching Heuristics for Propositional Model Counting

no code implementations 7 Jul 2020 Pashootan Vaezipoor, Gil Lederman, Yuhuai Wu, Chris J. Maddison, Roger Grosse, Sanjit A. Seshia, Fahiem Bacchus

In addition to step count improvements, Neuro# can also achieve orders of magnitude wall-clock speedups over the vanilla solver on larger instances in some problem families, despite the runtime overhead of querying the model.

INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving

1 code implementation ICLR 2021 Yuhuai Wu, Albert Qiaochu Jiang, Jimmy Ba, Roger Grosse

In learning-assisted theorem proving, one of the most critical challenges is to generalize to theorems unlike those seen at training time.

Automated Theorem Proving

IsarStep: a Benchmark for High-level Mathematical Reasoning

2 code implementations ICLR 2021 Wenda Li, Lei Yu, Yuhuai Wu, Lawrence C. Paulson

In this paper, we present a benchmark for high-level mathematical reasoning and study the reasoning capabilities of neural sequence-to-sequence models.

Mathematical Proofs · Mathematical Reasoning +1

Options as responses: Grounding behavioural hierarchies in multi-agent RL

no code implementations 4 Jun 2019 Alexander Sasha Vezhnevets, Yuhuai Wu, Remi Leblond, Joel Z. Leibo

This paper investigates generalisation in multi-agent games, where the generality of the agent can be evaluated by playing against opponents it hasn't seen during training.

Multi-agent Reinforcement Learning · Reinforcement Learning (RL)

Concurrent Meta Reinforcement Learning

1 code implementation 7 Mar 2019 Emilio Parisotto, Soham Ghosh, Sai Bhargav Yalamanchi, Varsha Chinnaobireddy, Yuhuai Wu, Ruslan Salakhutdinov

In this multi-agent setting, a set of parallel agents is executed in the same environment, and these "rollout" agents are given the means to communicate with each other.

Efficient Exploration · Meta-Learning +4

ACTRCE: Augmenting Experience via Teacher's Advice For Multi-Goal Reinforcement Learning

no code implementations 12 Feb 2019 Harris Chan, Yuhuai Wu, Jamie Kiros, Sanja Fidler, Jimmy Ba

We first analyze the differences among goal representations, and show that ACTRCE can efficiently solve difficult reinforcement learning problems in challenging 3D navigation tasks, whereas HER with non-language goal representation failed to learn.

Multi-Goal Reinforcement Learning · reinforcement-learning +1

The Importance of Sampling in Meta-Reinforcement Learning

no code implementations NeurIPS 2018 Bradly Stadie, Ge Yang, Rein Houthooft, Peter Chen, Yan Duan, Yuhuai Wu, Pieter Abbeel, Ilya Sutskever

Results are presented on a new environment we call "Krazy World": a difficult high-dimensional gridworld which is designed to highlight the importance of correctly differentiating through sampling distributions in meta-reinforcement learning.

Meta Reinforcement Learning · reinforcement-learning +1

Understanding Short-Horizon Bias in Stochastic Meta-Optimization

1 code implementation ICLR 2018 Yuhuai Wu, Mengye Ren, Renjie Liao, Roger Grosse

Careful tuning of the learning rate, or even schedules thereof, can be crucial to effective neural net training.

An Empirical Analysis of Proximal Policy Optimization with Kronecker-factored Natural Gradients

no code implementations 17 Jan 2018 Jiaming Song, Yuhuai Wu

In this technical report, we consider an approach that combines the PPO objective and K-FAC natural gradient optimization, which we call PPOKFAC.

Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation

8 code implementations NeurIPS 2017 Yuhuai Wu, Elman Mansimov, Shun Liao, Roger Grosse, Jimmy Ba

In this work, we propose to apply trust region optimization to deep reinforcement learning using a recently proposed Kronecker-factored approximation to the curvature.

Atari Games · Continuous Control +2
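For context on the curvature approximation this entry mentions, the generic Kronecker-factored (K-FAC) form of a fully connected layer's Fisher block is a standard identity, stated here in general terms rather than as this paper's specific implementation:

```latex
% Per-layer Fisher block approximated as a Kronecker product of two small matrices:
% A is the covariance of the layer's input activations, S the covariance of the
% gradients with respect to its pre-activations.
\[
F_\ell \;\approx\; A_{\ell-1} \otimes S_\ell,
\qquad
A_{\ell-1} = \mathbb{E}\!\left[ a_{\ell-1}\, a_{\ell-1}^{\top} \right],
\qquad
S_\ell = \mathbb{E}\!\left[ g_\ell\, g_\ell^{\top} \right].
\]
% Inverting the two small factors instead of the full Fisher makes the
% approximate natural-gradient step cheap:
\[
\Delta W_\ell \;\propto\; S_\ell^{-1} \, \bigl(\nabla_{W_\ell} J\bigr) \, A_{\ell-1}^{-1}.
\]
```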

Sticking the Landing: Simple, Lower-Variance Gradient Estimators for Variational Inference

1 code implementation NeurIPS 2017 Geoffrey Roeder, Yuhuai Wu, David Duvenaud

We propose a simple and general variant of the standard reparameterized gradient estimator for the variational evidence lower bound.

Variational Inference
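The variant this entry refers to is commonly summarized as: keep the reparameterization (path) term of the ELBO gradient and drop the score-function term, whose expectation is zero but whose variance is not. A minimal PyTorch-style sketch of that idea under standard Gaussian assumptions (an illustration, not the authors' code):

```python
import math
import torch

def gaussian_log_prob(z, mu, log_sigma):
    # log N(z; mu, diag(exp(log_sigma))^2), summed over the last dimension
    return (-0.5 * ((z - mu) / log_sigma.exp()) ** 2
            - log_sigma
            - 0.5 * math.log(2 * math.pi)).sum(-1)

def elbo_sticking_the_landing(x, mu, log_sigma, log_joint):
    """Single-sample ELBO whose gradient w.r.t. (mu, log_sigma) keeps only the
    reparameterization term; log_joint(x, z) is a user-supplied log p(x, z)."""
    eps = torch.randn_like(mu)
    z = mu + log_sigma.exp() * eps                     # reparameterized sample
    # Detaching the variational parameters inside log q removes the
    # score-function term from the gradient without changing the ELBO value.
    log_q = gaussian_log_prob(z, mu.detach(), log_sigma.detach())
    return log_joint(x, z) - log_q

# Toy usage: standard-normal prior and a unit-variance Gaussian likelihood.
mu = torch.zeros(2, requires_grad=True)
log_sigma = torch.zeros(2, requires_grad=True)
x = torch.tensor([0.5, -1.0])
log_joint = lambda x, z: (gaussian_log_prob(z, torch.zeros_like(z), torch.zeros_like(z))
                          + gaussian_log_prob(x, z, torch.zeros_like(z)))
elbo_sticking_the_landing(x, mu, log_sigma, log_joint).backward()
print(mu.grad, log_sigma.grad)
```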

On the Quantitative Analysis of Decoder-Based Generative Models

2 code implementations 14 Nov 2016 Yuhuai Wu, Yuri Burda, Ruslan Salakhutdinov, Roger Grosse

The past several years have seen remarkable progress in generative models which produce convincing samples of images and other modalities.

On Multiplicative Integration with Recurrent Neural Networks

no code implementations NeurIPS 2016 Yuhuai Wu, Saizheng Zhang, Ying Zhang, Yoshua Bengio, Ruslan Salakhutdinov

We introduce a general and simple structural design called Multiplicative Integration (MI) to improve recurrent neural networks (RNNs).

Language Modelling
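The design this entry summarizes replaces the additive combination of input and hidden contributions in a vanilla RNN step with a gated elementwise product. A toy numpy sketch following the general MI form as commonly presented (parameter names here are illustrative):

```python
import numpy as np

def vanilla_rnn_step(x, h, W, U, b):
    # Standard additive integration: phi(Wx + Uh + b)
    return np.tanh(W @ x + U @ h + b)

def mi_rnn_step(x, h, W, U, b, alpha, beta1, beta2):
    # Multiplicative Integration: the two information flows interact through an
    # elementwise (Hadamard) product rather than a plain sum.
    wx, uh = W @ x, U @ h
    return np.tanh(alpha * wx * uh + beta1 * wx + beta2 * uh + b)

rng = np.random.default_rng(0)
d_in, d_h = 8, 16
x, h = rng.normal(size=d_in), rng.normal(size=d_h) * 0.1
W, U = rng.normal(size=(d_h, d_in)) * 0.1, rng.normal(size=(d_h, d_h)) * 0.1
b = np.zeros(d_h)
print(vanilla_rnn_step(x, h, W, U, b).shape,
      mi_rnn_step(x, h, W, U, b, np.ones(d_h), np.ones(d_h), np.ones(d_h)).shape)
```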

Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations

no code implementations NeurIPS 2016 Behnam Neyshabur, Yuhuai Wu, Ruslan Salakhutdinov, Nathan Srebro

We investigate the parameter-space geometry of recurrent neural networks (RNNs), and develop an adaptation of path-SGD optimization method, attuned to this geometry, that can learn plain RNNs with ReLU activations.

STDP as presynaptic activity times rate of change of postsynaptic activity

no code implementations 19 Sep 2015 Yoshua Bengio, Thomas Mesnard, Asja Fischer, Saizheng Zhang, Yuhuai Wu

We introduce a weight update formula that is expressed only in terms of firing rates and their derivatives and that results in changes consistent with those associated with spike-timing dependent plasticity (STDP) rules and biological observations, even though the explicit timing of spikes is not needed.
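The rule in the title amounts to a one-line update; the sketch below is just that formula in code, with the learning rate and the finite-difference derivative as illustrative choices.

```python
import numpy as np

def stdp_like_update(W, pre_rate, post_rate, post_rate_prev, dt=1.0, lr=0.01):
    """Weight change proportional to presynaptic activity times the rate of
    change of postsynaptic activity:  dW[i, j] ~ pre_rate[j] * d(post_rate[i]) / dt."""
    d_post = (post_rate - post_rate_prev) / dt      # finite-difference rate of change
    return W + lr * np.outer(d_post, pre_rate)

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3)) * 0.1
W_new = stdp_like_update(W, pre_rate=rng.random(3),
                         post_rate=rng.random(4), post_rate_prev=rng.random(4))
print(W_new.shape)  # (4, 3)
```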
