Search Results for author: Alec Radford

Found 24 papers, 17 papers with code

Evaluating CLIP: Towards Characterization of Broader Capabilities and Downstream Implications

no code implementations · 5 Aug 2021 · Sandhini Agarwal, Gretchen Krueger, Jack Clark, Alec Radford, Jong Wook Kim, Miles Brundage

Recently, there have been breakthroughs in computer vision ("CV") models that are more generalizable with the advent of models such as CLIP and ALIGN.

Image Classification

Learning to summarize with human feedback

1 code implementation · NeurIPS 2020 · Nisan Stiennon, Long Ouyang, Jeffrey Wu, Daniel Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, Paul F. Christiano

We collect a large, high-quality dataset of human comparisons between summaries, train a model to predict the human-preferred summary, and use that model as a reward function to fine-tune a summarization policy using reinforcement learning.
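This is the standard reward-modeling recipe: the model is trained on human comparisons with a pairwise logistic (Bradley-Terry) loss. A minimal sketch, assuming a PyTorch-style reward model that assigns a scalar score to each (post, summary) pair; the names here are illustrative:

```python
import torch.nn.functional as F

def reward_model_loss(r_preferred, r_rejected):
    """Pairwise comparison loss: push the reward model to score the
    human-preferred summary above the rejected one in each pair.
    r_preferred, r_rejected: (batch,) scores from the reward model."""
    return -F.logsigmoid(r_preferred - r_rejected).mean()
```

The trained scorer then stands in for the human rater as the reward signal when the summarization policy is fine-tuned with RL.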

Generative Pretraining from Pixels

4 code implementations · ICML 2020 · Mark Chen, Alec Radford, Rewon Child, Jeff Wu, Heewoo Jun, Prafulla Dhariwal, David Luan, Ilya Sutskever

Inspired by progress in unsupervised representation learning for natural language, we examine whether similar models can learn useful representations for images.

Ranked #12 on Image Classification on STL-10 (using extra training data)

Representation Learning · Self-Supervised Image Classification
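The method transfers the language-modeling recipe directly to images: discretize pixel colors (the paper clusters them into a 9-bit palette), flatten each image into a 1-D sequence, and train a Transformer to predict the next pixel token. A minimal sketch of that objective, assuming `model` is any autoregressive sequence model over the pixel vocabulary (names are illustrative):

```python
import torch.nn.functional as F

def next_pixel_loss(model, images):
    """Autoregressive pretraining on pixels: flatten each image into a
    1-D token sequence and predict every position from the ones before.
    images: (batch, H, W) LongTensor of discretized pixel values;
    `model` is assumed to map token ids to (batch, len, vocab) logits."""
    seq = images.flatten(1)                         # (batch, H*W)
    logits = model(seq[:, :-1])                     # (batch, N-1, vocab)
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           seq[:, 1:].reshape(-1))  # shifted targets
```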

Fine-Tuning Language Models from Human Preferences

3 code implementations · 18 Sep 2019 · Daniel M. Ziegler, Nisan Stiennon, Jeffrey Wu, Tom B. Brown, Alec Radford, Dario Amodei, Paul Christiano, Geoffrey Irving

Most work on reward learning has used simulated environments, but complex information about values is often expressed in natural language, and we believe reward learning for language is a key to making RL practical and safe for real-world tasks.

Language Modelling
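The abstract above is mostly motivation; the method itself fine-tunes a language model with RL against a learned reward, shaped by a KL penalty that keeps the policy pi close to the pretrained model rho (the paper's R(x, y) = r(x, y) - beta * log[pi(y|x) / rho(y|x)]). A minimal sketch, with illustrative names:

```python
def kl_shaped_reward(r, logp_policy, logp_pretrained, beta=0.1):
    """Reward used during RL fine-tuning: the learned reward minus a
    KL penalty keeping the policy near the pretrained language model.
    Arguments are per-sample scalars (or tensors of equal shape)."""
    return r - beta * (logp_policy - logp_pretrained)
```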

Language Models are Unsupervised Multitask Learners

9 code implementations · Preprint 2019 · Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever

Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets.

Ranked #1 on Language Modelling on enwik8 (using extra training data)

Common Sense Reasoning · Data-to-Text Generation · +7
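The paper's central claim is that a large language model performs many of these tasks zero-shot when conditioned on the right prompt; for summarization it appends a "TL;DR:" cue and samples with top-k = 2. A rough reproduction using the Hugging Face `transformers` port of GPT-2 (an interface assumption; the paper predates that library):

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

article = "..."  # any article text (placeholder)
# Zero-shot summarization: condition on the article plus the TL;DR: cue.
inputs = tokenizer(article + "\nTL;DR:", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=60, do_sample=True, top_k=2)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:]))
```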

Improving Language Understanding by Generative Pre-Training

5 code implementations · Preprint 2018 · Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever

We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task.

Cloze Test · Document Classification · +6
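In the fine-tuning stage the paper keeps language modeling as an auxiliary objective, optimizing the supervised task loss plus a weighted LM loss (lambda = 0.5 in the paper). A minimal sketch, with illustrative tensor names:

```python
import torch.nn.functional as F

def finetune_loss(task_logits, task_labels, lm_logits, lm_targets, lam=0.5):
    """Discriminative fine-tuning with an auxiliary LM objective:
    L = L_task + lam * L_LM, as in the paper's combined loss."""
    task = F.cross_entropy(task_logits, task_labels)
    lm = F.cross_entropy(lm_logits.reshape(-1, lm_logits.size(-1)),
                         lm_targets.reshape(-1))
    return task + lam * lm
```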

Improving GANs Using Optimal Transport

2 code implementations · ICLR 2018 · Tim Salimans, Han Zhang, Alec Radford, Dimitris Metaxas

We present Optimal Transport GAN (OT-GAN), a variant of generative adversarial nets minimizing a new metric measuring the distance between the generator distribution and the data distribution.

Image Generation
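The new metric is a minibatch energy distance built from entropy-regularized optimal transport (Sinkhorn) distances computed in a learned critic's feature space. The core Sinkhorn computation, as a generic sketch rather than the paper's exact loss:

```python
import torch

def sinkhorn_cost(cost, eps=0.1, n_iters=100):
    """Entropy-regularized OT cost between two minibatches via Sinkhorn
    iterations. cost: (n, m) pairwise cost matrix (e.g. cosine cost
    between critic features of generated and real samples)."""
    n, m = cost.shape
    K = torch.exp(-cost / eps)          # Gibbs kernel
    a = torch.full((n,), 1.0 / n)       # uniform marginal, batch 1
    b = torch.full((m,), 1.0 / m)       # uniform marginal, batch 2
    u = torch.ones(n)
    for _ in range(n_iters):            # alternating scaling updates
        v = b / (K.t() @ u)
        u = a / (K @ v)
    plan = u[:, None] * K * v[None, :]  # approximate transport plan
    return (plan * cost).sum()          # expected transport cost
```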

Proximal Policy Optimization Algorithms

145 code implementations · 20 Jul 2017 · John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov

We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective function using stochastic gradient ascent.

Dota 2 · Policy Gradient Methods · +2
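The surrogate in question is PPO's clipped objective: the probability ratio between the new and old policies is clipped so that a single update cannot move the policy too far from the one that collected the data. A minimal sketch in PyTorch (tensor names are illustrative):

```python
import torch

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """PPO clipped surrogate objective (negated for gradient descent).
    All inputs are (batch,) tensors gathered from sampled actions."""
    ratio = torch.exp(logp_new - logp_old)  # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()
```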
