Search Results for author: Wojciech Zaremba

Found 32 papers, 21 papers with code

B-tests: Low Variance Kernel Two-Sample Tests

1 code implementation • 8 Jul 2013 • Wojciech Zaremba, Arthur Gretton, Matthew Blaschko

A family of maximum mean discrepancy (MMD) kernel two-sample tests is introduced.

Two-sample testing Vocal Bursts Valence Prediction

B-test: A Non-parametric, Low Variance Kernel Two-sample Test

no code implementations • NeurIPS 2013 • Wojciech Zaremba, Arthur Gretton, Matthew Blaschko

We propose a family of maximum mean discrepancy (MMD) kernel two-sample tests that have low sample complexity and are consistent.

Vocal Bursts Valence Prediction

Intriguing properties of neural networks

12 code implementations • 21 Dec 2013 • Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, Rob Fergus

Deep neural networks are highly expressive models that have recently achieved state of the art performance on speech and visual recognition tasks.

655

Spectral Networks and Locally Connected Networks on Graphs

4 code implementations • 21 Dec 2013 • Joan Bruna, Wojciech Zaremba, Arthur Szlam, Yann Lecun

Convolutional Neural Networks are extremely efficient architectures in image and audio recognition tasks, thanks to their ability to exploit the local translational invariance of signal classes over their domain.

Clustering Translation

1,318

Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation

no code implementations • NeurIPS 2014 • Emily Denton, Wojciech Zaremba, Joan Bruna, Yann Lecun, Rob Fergus

We present techniques for speeding up the test-time evaluation of large convolutional networks, designed for object recognition tasks.

Object Recognition

Learning to Discover Efficient Mathematical Identities

1 code implementation • NeurIPS 2014 • Wojciech Zaremba, Karol Kurach, Rob Fergus

In this paper we explore how machine learning techniques can be applied to the discovery of efficient mathematical identities.

Attribute

Recurrent Neural Network Regularization

21 code implementations • 8 Sep 2014 • Wojciech Zaremba, Ilya Sutskever, Oriol Vinyals

We present a simple regularization technique for Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units.

Ranked #36 on Language Modelling on Penn Treebank (Word Level)

Caption Generation Image Captioning +4

659

Learning to Execute

6 code implementations • 17 Oct 2014 • Wojciech Zaremba, Ilya Sutskever

Recurrent Neural Networks (RNNs) with Long Short-Term Memory units (LSTM) are widely used because they are expressive and are easy to train.

Learning to Execute

479

Addressing the Rare Word Problem in Neural Machine Translation

5 code implementations • IJCNLP 2015 • Minh-Thang Luong, Ilya Sutskever, Quoc V. Le, Oriol Vinyals, Wojciech Zaremba

Our experiments on the WMT14 English to French translation task show that this method provides a substantial improvement of up to 2.8 BLEU points over an equivalent NMT system that does not use this technique.

Ranked #40 on Machine Translation on WMT2014 English-French

Machine Translation NMT +3

1,220

Reinforcement Learning Neural Turing Machines - Revised

1 code implementation • 4 May 2015 • Wojciech Zaremba, Ilya Sutskever

The capabilities of a model can be extended by providing it with proper Interfaces that interact with the world.

reinforcement-learning Reinforcement Learning (RL)

151

Convolutional networks and learning invariant to homogeneous multiplicative scalings

no code implementations • 26 Jun 2015 • Mark Tygert, Arthur Szlam, Soumith Chintala, Marc'Aurelio Ranzato, Yuandong Tian, Wojciech Zaremba

The conventional classification schemes -- notably multinomial logistic regression -- used in conjunction with convolutional networks (convnets) are classical in statistics, designed without consideration for the usual coupling with convnets, stochastic gradient descent, and backpropagation.

Classification General Classification +1

Sequence Level Training with Recurrent Neural Networks

5 code implementations • 20 Nov 2015 • Marc'Aurelio Ranzato, Sumit Chopra, Michael Auli, Wojciech Zaremba

Many natural language processing applications use language models to generate text.

Ranked #14 on Machine Translation on IWSLT2015 German-English

Machine Translation

387

Learning Simple Algorithms from Examples

1 code implementation • 23 Nov 2015 • Wojciech Zaremba, Tomas Mikolov, Armand Joulin, Rob Fergus

We present an approach for learning simple algorithms such as copying, multi-digit addition and single digit multiplication directly from examples.

Q-Learning

180

OpenAI Gym

45 code implementations • 5 Jun 2016 • Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, Wojciech Zaremba

OpenAI Gym is a toolkit for reinforcement learning research.

reinforcement-learning Reinforcement Learning (RL)

33,848

Improved Techniques for Training GANs

45 code implementations • NeurIPS 2016 • Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, Xi Chen

We present a variety of new architectural features and training procedures that we apply to the generative adversarial networks (GANs) framework.

Ranked #14 on Conditional Image Generation on CIFAR-10 (Inception score metric)

Conditional Image Generation Semi-Supervised Image Classification

61,324

Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model

no code implementations • 11 Oct 2016 • Paul Christiano, Zain Shah, Igor Mordatch, Jonas Schneider, Trevor Blackwell, Joshua Tobin, Pieter Abbeel, Wojciech Zaremba

Nevertheless, often the overall gist of what the policy does in simulation remains valid in the real world.

Friction

Extensions and Limitations of the Neural GPU

1 code implementation • 2 Nov 2016 • Eric Price, Wojciech Zaremba, Ilya Sutskever

We find that these techniques increase the set of algorithmic problems that can be solved by the Neural GPU: we have been able to learn to perform all the arithmetic operations (and generalize to arbitrarily long numbers) when the arguments are given in the decimal representation (which, surprisingly, has not been possible before).

131

Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World

6 code implementations • 20 Mar 2017 • Josh Tobin, Rachel Fong, Alex Ray, Jonas Schneider, Wojciech Zaremba, Pieter Abbeel

Bridging the 'reality gap' that separates simulated robotics from experiments on hardware could accelerate robotic research through improved data availability.

Object Localization

457

One-Shot Imitation Learning

no code implementations • NeurIPS 2017 • Yan Duan, Marcin Andrychowicz, Bradly C. Stadie, Jonathan Ho, Jonas Schneider, Ilya Sutskever, Pieter Abbeel, Wojciech Zaremba

A neural net is trained that takes as input one demonstration and the current state (which initially is the initial state of the other demonstration of the pair), and outputs an action with the goal that the resulting sequence of states and actions matches as closely as possible with the second demonstration.

Feature Engineering Imitation Learning +1

Hindsight Experience Replay

26 code implementations • NeurIPS 2017 • Marcin Andrychowicz, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel, Wojciech Zaremba

Dealing with sparse rewards is one of the biggest challenges in Reinforcement Learning (RL).

Reinforcement Learning (RL)

7,867

Overcoming Exploration in Reinforcement Learning with Demonstrations

3 code implementations • 28 Sep 2017 • Ashvin Nair, Bob McGrew, Marcin Andrychowicz, Wojciech Zaremba, Pieter Abbeel

Exploration in environments with sparse rewards has been a persistent problem in reinforcement learning (RL).

Continuous Control Reinforcement Learning (RL)

813

Domain Randomization and Generative Models for Robotic Grasping

no code implementations • 17 Oct 2017 • Joshua Tobin, Lukas Biewald, Rocky Duan, Marcin Andrychowicz, Ankur Handa, Vikash Kumar, Bob McGrew, Jonas Schneider, Peter Welinder, Wojciech Zaremba, Pieter Abbeel

In this work, we explore a novel data generation pipeline for training a deep neural network to perform grasp planning that applies the idea of domain randomization to object synthesis.

Object Robotic Grasping

Asymmetric Actor Critic for Image-Based Robot Learning

no code implementations • 18 Oct 2017 • Lerrel Pinto, Marcin Andrychowicz, Peter Welinder, Wojciech Zaremba, Pieter Abbeel

While several recent works have shown promising results in transferring policies trained in simulation to the real world, they often do not fully utilize the advantage of working with a simulator.

Decision Making Reinforcement Learning (RL)

Sim-to-Real Transfer of Robotic Control with Dynamics Randomization

no code implementations • 18 Oct 2017 • Xue Bin Peng, Marcin Andrychowicz, Wojciech Zaremba, Pieter Abbeel

By randomizing the dynamics of the simulator during training, we are able to develop policies that are capable of adapting to very different dynamics, including ones that differ significantly from the dynamics on which the policies were trained.

Robotics Systems and Control

Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research

30 code implementations • 26 Feb 2018 • Matthias Plappert, Marcin Andrychowicz, Alex Ray, Bob McGrew, Bowen Baker, Glenn Powell, Jonas Schneider, Josh Tobin, Maciek Chociej, Peter Welinder, Vikash Kumar, Wojciech Zaremba

The purpose of this technical report is two-fold.

Continuous Control Multi-Goal Reinforcement Learning +3

141

Learning Dexterous In-Hand Manipulation

no code implementations • 1 Aug 2018 • OpenAI, Marcin Andrychowicz, Bowen Baker, Maciek Chociej, Rafal Jozefowicz, Bob McGrew, Jakub Pachocki, Arthur Petron, Matthias Plappert, Glenn Powell, Alex Ray, Jonas Schneider, Szymon Sidor, Josh Tobin, Peter Welinder, Lilian Weng, Wojciech Zaremba

We use reinforcement learning (RL) to learn dexterous in-hand manipulation policies which can perform vision-based object reorientation on a physical Shadow Dexterous Hand.

Friction reinforcement-learning +1

Solving Rubik's Cube with a Robot Hand

2 code implementations • 16 Oct 2019 • OpenAI, Ilge Akkaya, Marcin Andrychowicz, Maciek Chociej, Mateusz Litwin, Bob McGrew, Arthur Petron, Alex Paino, Matthias Plappert, Glenn Powell, Raphael Ribas, Jonas Schneider, Nikolas Tezak, Jerry Tworek, Peter Welinder, Lilian Weng, Qiming Yuan, Wojciech Zaremba, Lei Zhang

We demonstrate that models trained only in simulation can be used to solve a manipulation problem of unprecedented complexity on a real robot.

Meta-Learning Rubik's Cube

Predicting Sim-to-Real Transfer with Probabilistic Dynamics Models

no code implementations • 27 Sep 2020 • Lei M. Zhang, Matthias Plappert, Wojciech Zaremba

We further show that the transfer metric can predict the effect of training setups on policy transfer performance.

Asymmetric self-play for automatic goal discovery in robotic manipulation

no code implementations • 13 Jan 2021 • OpenAI, Matthias Plappert, Raul Sampedro, Tao Xu, Ilge Akkaya, Vineet Kosaraju, Peter Welinder, Ruben D'Sa, Arthur Petron, Henrique Ponde de Oliveira Pinto, Alex Paino, Hyeonwoo Noh, Lilian Weng, Qiming Yuan, Casey Chu, Wojciech Zaremba

We train a single, goal-conditioned policy that can solve many robotic manipulation tasks, including tasks with previously unseen goals and objects.

A Generalizable Approach to Learning Optimizers

1 code implementation • 2 Jun 2021 • Diogo Almeida, Clemens Winter, Jie Tang, Wojciech Zaremba

A core issue with learning to optimize neural networks has been the lack of generalization to real world problems.

Language Modelling

Evaluating Large Language Models Trained on Code

13 code implementations • 7 Jul 2021 • Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, Clemens Winter, Philippe Tillet, Felipe Petroski Such, Dave Cummings, Matthias Plappert, Fotios Chantzis, Elizabeth Barnes, Ariel Herbert-Voss, William Hebgen Guss, Alex Nichol, Alex Paino, Nikolas Tezak, Jie Tang, Igor Babuschkin, Suchir Balaji, Shantanu Jain, William Saunders, Christopher Hesse, Andrew N. Carr, Jan Leike, Josh Achiam, Vedant Misra, Evan Morikawa, Alec Radford, Matthew Knight, Miles Brundage, Mira Murati, Katie Mayer, Peter Welinder, Bob McGrew, Dario Amodei, Sam McCandlish, Ilya Sutskever, Wojciech Zaremba

We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities.

Ranked #1 on Multi-task Language Understanding on BBH-alg

Code Generation Language Modelling +1

7,744

GPT-4 Technical Report

9 code implementations • Preprint 2023 • OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, Madelaine Boyd, Anna-Luisa Brakman, Greg Brockman, Tim Brooks, Miles Brundage, Kevin Button, Trevor Cai, Rosie Campbell, Andrew Cann, Brittany Carey, Chelsea Carlson, Rory Carmichael, Brooke Chan, Che Chang, Fotis Chantzis, Derek Chen, Sully Chen, Ruby Chen, Jason Chen, Mark Chen, Ben Chess, Chester Cho, Casey Chu, Hyung Won Chung, Dave Cummings, Jeremiah Currier, Yunxing Dai, Cory Decareaux, Thomas Degry, Noah Deutsch, Damien Deville, Arka Dhar, David Dohan, Steve Dowling, Sheila Dunning, Adrien Ecoffet, Atty Eleti, Tyna Eloundou, David Farhi, Liam Fedus, Niko Felix, Simón Posada Fishman, Juston Forte, Isabella Fulford, Leo Gao, Elie Georges, Christian Gibson, Vik Goel, Tarun Gogineni, Gabriel Goh, Rapha Gontijo-Lopes, Jonathan Gordon, Morgan Grafstein, Scott Gray, Ryan Greene, Joshua Gross, Shixiang Shane Gu, Yufei Guo, Chris Hallacy, Jesse Han, Jeff Harris, Yuchen He, Mike Heaton, Johannes Heidecke, Chris Hesse, Alan Hickey, Wade Hickey, Peter Hoeschele, Brandon Houghton, Kenny Hsu, Shengli Hu, Xin Hu, Joost Huizinga, Shantanu Jain, Shawn Jain, Joanne Jang, Angela Jiang, Roger Jiang, Haozhun Jin, Denny Jin, Shino Jomoto, Billie Jonn, Heewoo Jun, Tomer Kaftan, Łukasz Kaiser, Ali Kamali, Ingmar Kanitscheider, Nitish Shirish Keskar, Tabarak Khan, Logan Kilpatrick, Jong Wook Kim, Christina Kim, Yongjik Kim, Jan Hendrik Kirchner, Jamie Kiros, Matt Knight, Daniel Kokotajlo, Łukasz Kondraciuk, Andrew Kondrich, Aris Konstantinidis, Kyle Kosic, Gretchen Krueger, Vishal Kuo, Michael Lampe, Ikai Lan, Teddy Lee, Jan Leike, Jade Leung, Daniel Levy, Chak Ming Li, Rachel Lim, Molly Lin, Stephanie Lin, Mateusz Litwin, Theresa Lopez, Ryan Lowe, Patricia Lue, Anna Makanju, Kim Malfacini, Sam Manning, Todor Markov, Yaniv Markovski, Bianca Martin, Katie Mayer, Andrew Mayne, Bob McGrew, Scott Mayer McKinney, Christine McLeavey, Paul McMillan, Jake McNeil, David Medina, Aalok Mehta, Jacob Menick, Luke Metz, Andrey Mishchenko, Pamela Mishkin, Vinnie Monaco, Evan Morikawa, Daniel Mossing, Tong Mu, Mira Murati, Oleg Murk, David Mély, Ashvin Nair, Reiichiro Nakano, Rajeev Nayak, Arvind Neelakantan, Richard Ngo, Hyeonwoo Noh, Long Ouyang, Cullen O'Keefe, Jakub Pachocki, Alex Paino, Joe Palermo, Ashley Pantuliano, Giambattista Parascandolo, Joel Parish, Emy Parparita, Alex Passos, Mikhail Pavlov, Andrew Peng, Adam Perelman, Filipe de Avila Belbute Peres, Michael Petrov, Henrique Ponde de Oliveira Pinto, Michael Pokorny, Michelle Pokrass, Vitchyr H. Pong, Tolly Powell, Alethea Power, Boris Power, Elizabeth Proehl, Raul Puri, Alec Radford, Jack Rae, Aditya Ramesh, Cameron Raymond, Francis Real, Kendra Rimbach, Carl Ross, Bob Rotsted, Henri Roussez, Nick Ryder, Mario Saltarelli, Ted Sanders, Shibani Santurkar, Girish Sastry, Heather Schmidt, David Schnurr, John Schulman, Daniel Selsam, Kyla Sheppard, Toki Sherbakov, Jessica Shieh, Sarah Shoker, Pranav Shyam, Szymon Sidor, Eric Sigler, Maddie Simens, Jordan Sitkin, Katarina Slama, Ian Sohl, Benjamin Sokolowsky, Yang Song, Natalie Staudacher, Felipe Petroski Such, Natalie Summers, Ilya Sutskever, Jie Tang, Nikolas Tezak, Madeleine B. Thompson, Phil Tillet, Amin Tootoonchian, Elizabeth Tseng, Preston Tuggle, Nick Turley, Jerry Tworek, Juan Felipe Cerón Uribe, Andrea Vallone, Arun Vijayvergiya, Chelsea Voss, Carroll Wainwright, Justin Jay Wang, Alvin Wang, Ben Wang, Jonathan Ward, Jason Wei, CJ Weinmann, Akila Welihinda, Peter Welinder, Jiayi Weng, Lilian Weng, Matt Wiethoff, Dave Willner, Clemens Winter, Samuel Wolrich, Hannah Wong, Lauren Workman, Sherwin Wu, Jeff Wu, Michael Wu, Kai Xiao, Tao Xu, Sarah Yoo, Kevin Yu, Qiming Yuan, Wojciech Zaremba, Rowan Zellers, Chong Zhang, Marvin Zhang, Shengjia Zhao, Tianhao Zheng, Juntang Zhuang, William Zhuk, Barret Zoph

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs.

Ranked #1 on Long-Context Understanding on Ada-LEval (BestAnswer)

Arithmetic Reasoning Bug fixing +10

13,825
