Search Results for author: Oriol Vinyals

Found 111 papers, 68 papers with code

Optimizing Memory Mapping Using Deep Reinforcement Learning

no code implementations 11 May 2023 Pengming Wang, Mikita Sazanovich, Berkin Ilbeyi, Phitchaya Mangpo Phothilimthana, Manish Purohit, Han Yang Tay, Ngân Vũ, Miaosen Wang, Cosmin Paduraru, Edouard Leurent, Anton Zhernov, Julian Schrittwieser, Thomas Hubert, Robert Tung, Paula Kurylowicz, Kieran Milan, Oriol Vinyals, Daniel J. Mankowitz

We also introduce a Reinforcement Learning agent, mallocMuZero, and show that it is capable of playing this game to discover new and improved memory mapping solutions that lead to faster execution times on real ML workloads on ML accelerators.

Decision Making reinforcement-learning +2

GraphCast: Learning skillful medium-range global weather forecasting

no code implementations 24 Dec 2022 Remi Lam, Alvaro Sanchez-Gonzalez, Matthew Willson, Peter Wirnsberger, Meire Fortunato, Alexander Pritzel, Suman Ravuri, Timo Ewalds, Ferran Alet, Zach Eaton-Rosen, Weihua Hu, Alexander Merose, Stephan Hoyer, George Holland, Jacklynn Stott, Oriol Vinyals, Shakir Mohamed, Peter Battaglia

We introduce a machine-learning (ML)-based weather simulator--called "GraphCast"--which outperforms the most accurate deterministic operational medium-range weather forecasting system in the world, as well as all previous ML baselines.

Weather Forecasting

Integrating Language Guidance into Vision-based Deep Metric Learning

1 code implementation CVPR 2022 Karsten Roth, Oriol Vinyals, Zeynep Akata

This causes learned embedding spaces to encode incomplete semantic context and misrepresent the semantic relation between classes, impacting the generalizability of the learned metric space.

Ranked #6 on Metric Learning on CARS196 (using extra training data)

Metric Learning

Non-isotropy Regularization for Proxy-based Deep Metric Learning

1 code implementation CVPR 2022 Karsten Roth, Oriol Vinyals, Zeynep Akata

Deep Metric Learning (DML) aims to learn representation spaces on which semantic relations can simply be expressed through predefined distance metrics.

Ranked #9 on Metric Learning on CUB-200-2011 (using extra training data)

Metric Learning

HiP: Hierarchical Perceiver

2 code implementations 22 Feb 2022 Joao Carreira, Skanda Koppula, Daniel Zoran, Adria Recasens, Catalin Ionescu, Olivier Henaff, Evan Shelhamer, Relja Arandjelovic, Matt Botvinick, Oriol Vinyals, Karen Simonyan, Andrew Zisserman, Andrew Jaegle

This, however, hinders them from scaling up to the input sizes required to process raw high-resolution images or video.

MuZero with Self-competition for Rate Control in VP9 Video Compression

no code implementations 14 Feb 2022 Amol Mandhane, Anton Zhernov, Maribeth Rauh, Chenjie Gu, Miaosen Wang, Flora Xue, Wendy Shang, Derek Pang, Rene Claus, Ching-Han Chiang, Cheng Chen, Jingning Han, Angie Chen, Daniel J. Mankowitz, Jackson Broshear, Julian Schrittwieser, Thomas Hubert, Oriol Vinyals, Timothy Mann

Specifically, we target the problem of learning a rate control policy to select the quantization parameters (QP) in the encoding process of libvpx, an open source VP9 video compression library widely used by popular video-on-demand (VOD) services.

Decision Making Quantization +1

Scaling Language Models: Methods, Analysis & Insights from Training Gopher

no code implementations NA 2021 Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, Irina Higgins, Antonia Creswell, Nat McAleese, Amy Wu, Erich Elsen, Siddhant Jayakumar, Elena Buchatskaya, David Budden, Esme Sutherland, Karen Simonyan, Michela Paganini, Laurent Sifre, Lena Martens, Xiang Lorraine Li, Adhiguna Kuncoro, Aida Nematzadeh, Elena Gribovskaya, Domenic Donato, Angeliki Lazaridou, Arthur Mensch, Jean-Baptiste Lespiau, Maria Tsimpoukelli, Nikolai Grigorev, Doug Fritz, Thibault Sottiaux, Mantas Pajarskas, Toby Pohlen, Zhitao Gong, Daniel Toyama, Cyprien de Masson d'Autume, Yujia Li, Tayfun Terzi, Vladimir Mikulik, Igor Babuschkin, Aidan Clark, Diego de Las Casas, Aurelia Guy, Chris Jones, James Bradbury, Matthew Johnson, Blake Hechtman, Laura Weidinger, Iason Gabriel, William Isaac, Ed Lockhart, Simon Osindero, Laura Rimell, Chris Dyer, Oriol Vinyals, Kareem Ayoub, Jeff Stanway, Lorrayne Bennett, Demis Hassabis, Koray Kavukcuoglu, Geoffrey Irving

Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world.

Abstract Algebra Anachronisms +133

WikiGraphs: A Wikipedia Text - Knowledge Graph Paired Dataset

1 code implementation NAACL (TextGraphs) 2021 Luyu Wang, Yujia Li, Ozlem Aslan, Oriol Vinyals

We present a new dataset of Wikipedia articles each paired with a knowledge graph, to facilitate the research in conditional text generation, graph generation and graph representation learning.

Conditional Text Generation Graph Generation +4

The Benchmark Lottery

no code implementations 14 Jul 2021 Mostafa Dehghani, Yi Tay, Alexey A. Gritsenko, Zhe Zhao, Neil Houlsby, Fernando Diaz, Donald Metzler, Oriol Vinyals

The world of empirical machine learning (ML) strongly relies on benchmarks in order to determine the relative effectiveness of different algorithms and methods.

Benchmarking BIG-bench Machine Learning +3

Multimodal Few-Shot Learning with Frozen Language Models

no code implementations NeurIPS 2021 Maria Tsimpoukelli, Jacob Menick, Serkan Cabi, S. M. Ali Eslami, Oriol Vinyals, Felix Hill

When trained at sufficient scale, auto-regressive language models exhibit the notable ability to learn a new language task after being prompted with just a few examples.

Few-Shot Learning Language Modelling +2

Vector Quantized Models for Planning

no code implementations 8 Jun 2021 Sherjil Ozair, Yazhe Li, Ali Razavi, Ioannis Antonoglou, Aäron van den Oord, Oriol Vinyals

Our key insight is to use discrete autoencoders to capture the multiple possible effects of an action in a stochastic environment.

Perceiver: General Perception with Iterative Attention

10 code implementations 4 Mar 2021 Andrew Jaegle, Felix Gimeno, Andrew Brock, Andrew Zisserman, Oriol Vinyals, Joao Carreira

The perception models used in deep learning on the other hand are designed for individual modalities, often relying on domain-specific assumptions such as the local grid structures exploited by virtually all existing vision models.

3D Point Cloud Classification Audio Classification +1

The MineRL 2020 Competition on Sample Efficient Reinforcement Learning using Human Priors

no code implementations 26 Jan 2021 William H. Guss, Mario Ynocente Castro, Sam Devlin, Brandon Houghton, Noboru Sean Kuno, Crissman Loomis, Stephanie Milani, Sharada Mohanty, Keisuke Nakata, Ruslan Salakhutdinov, John Schulman, Shinya Shiroshita, Nicholay Topin, Avinash Ummadisingu, Oriol Vinyals

Although deep reinforcement learning has led to breakthroughs in many difficult domains, these successes have required an ever-increasing number of samples, affording only a shrinking segment of the AI community access to their development.

Decision Making Efficient Exploration +2

Strong Generalization and Efficiency in Neural Programs

1 code implementation 7 Jul 2020 Yujia Li, Felix Gimeno, Pushmeet Kohli, Oriol Vinyals

We study the problem of learning efficient algorithms that strongly generalize in the framework of neural program induction.

Program induction

Pointer Graph Networks

no code implementations NeurIPS 2020 Petar Veličković, Lars Buesing, Matthew C. Overlan, Razvan Pascanu, Oriol Vinyals, Charles Blundell

This static input structure is often informed purely by insight of the machine learning practitioner, and might not be optimal for the actual task the GNN is solving.

Retrospective Analysis of the 2019 MineRL Competition on Sample Efficient Reinforcement Learning

no code implementations 10 Mar 2020 Stephanie Milani, Nicholay Topin, Brandon Houghton, William H. Guss, Sharada P. Mohanty, Keisuke Nakata, Oriol Vinyals, Noboru Sean Kuno

To facilitate research in the direction of sample efficient reinforcement learning, we held the MineRL Competition on Sample Efficient Reinforcement Learning Using Human Priors at the Thirty-third Conference on Neural Information Processing Systems (NeurIPS 2019).

Imitation Learning reinforcement-learning +1

Low Bit-Rate Speech Coding with VQ-VAE and a WaveNet Decoder

no code implementations 14 Oct 2019 Cristina Gârbacea, Aäron van den Oord, Yazhe Li, Felicia S. C. Lim, Alejandro Luebs, Oriol Vinyals, Thomas C. Walters

In order to efficiently transmit and store speech signals, speech codecs create a minimally redundant representation of the input signal which is then decoded at the receiver with the best possible perceptual quality.

Classification Accuracy Score for Conditional Generative Models

no code implementations NeurIPS 2019 Suman Ravuri, Oriol Vinyals

Deep generative models (DGMs) of images are now sufficiently mature that they produce nearly photorealistic samples and obtain scores similar to the data distribution on heuristics such as Frechet Inception Distance (FID).

Classification General Classification

Deep reinforcement learning with relational inductive biases

no code implementations ICLR 2019 Vinicius Zambaldi, David Raposo, Adam Santoro, Victor Bapst, Yujia Li, Igor Babuschkin, Karl Tuyls, David Reichert, Timothy Lillicrap, Edward Lockhart, Murray Shanahan, Victoria Langston, Razvan Pascanu, Matthew Botvinick, Oriol Vinyals, Peter Battaglia

We introduce an approach for augmenting model-free deep reinforcement learning agents with a mechanism for relational reasoning over structured representations, which improves performance, learning efficiency, generalization, and interpretability.

reinforcement-learning Reinforcement Learning (RL) +3

Visual Imitation with a Minimal Adversary

no code implementations ICLR 2019 Scott Reed, Yusuf Aytar, Ziyu Wang, Tom Paine, Aäron van den Oord, Tobias Pfaff, Sergio Gomez, Alexander Novikov, David Budden, Oriol Vinyals

The proposed agent can solve a challenging robot manipulation task of block stacking from only video demonstrations and sparse reward, in which the non-imitating agents fail to learn completely.

Imitation Learning Robot Manipulation

Graph Matching Networks for Learning the Similarity of Graph Structured Objects

3 code implementations ICLR 2019 Yujia Li, Chenjie Gu, Thomas Dullien, Oriol Vinyals, Pushmeet Kohli

This paper addresses the challenging problem of retrieval and matching of graph structured objects, and makes two key contributions.

Graph Attention Graph Matching +1

Seeing is Not Necessarily Believing: Limitations of BigGANs for Data Augmentation

no code implementations ICLR Workshop LLD 2019 Suman Ravuri, Oriol Vinyals

In fact, for one model in particular, BigGAN, metrics such as Inception Score or Frechet Inception Distance nearly match those of the dataset, suggesting that these models are close to matching the distribution of the training set.

Data Augmentation

Attentive Neural Processes

6 code implementations ICLR 2019 Hyunjik Kim, Andriy Mnih, Jonathan Schwarz, Marta Garnelo, Ali Eslami, Dan Rosenbaum, Oriol Vinyals, Yee Whye Teh

Neural Processes (NPs) (Garnelo et al., 2018a;b) approach regression by learning to map a context set of observed input-output pairs to a distribution over regression functions.

regression

Preventing Posterior Collapse with delta-VAEs

no code implementations ICLR 2019 Ali Razavi, Aäron van den Oord, Ben Poole, Oriol Vinyals

Due to the phenomenon of "posterior collapse," current latent variable generative models pose a challenging design choice that either weakens the capacity of the decoder or requires augmenting the objective so it does not only maximize the likelihood of the data.

Ranked #7 on Image Generation on ImageNet 32x32 (bpd metric)

Image Generation Representation Learning

Generating Diverse Programs with Instruction Conditioned Reinforced Adversarial Learning

no code implementations 3 Dec 2018 Aishwarya Agrawal, Mateusz Malinowski, Felix Hill, Ali Eslami, Oriol Vinyals, Tejas Kulkarni

In this work, we study the setting in which an agent must learn to generate programs for diverse scenes conditioned on a given symbolic instruction.

Meta-Learning with Latent Embedding Optimization

5 code implementations ICLR 2019 Andrei A. Rusu, Dushyant Rao, Jakub Sygnowski, Oriol Vinyals, Razvan Pascanu, Simon Osindero, Raia Hadsell

We show that it is possible to bypass these limitations by learning a data-dependent latent generative representation of model parameters, and performing gradient-based meta-learning in this low-dimensional latent space.

Few-Shot Learning

Universal Transformers

7 code implementations ICLR 2019 Mostafa Dehghani, Stephan Gouws, Oriol Vinyals, Jakob Uszkoreit, Łukasz Kaiser

Feed-forward and convolutional architectures have recently been shown to achieve superior results on some sequence modeling tasks such as machine translation, with the added advantage that they concurrently process all inputs in the sequence, leading to easy parallelization and faster training times.

Inductive Bias LAMBADA +4

Representation Learning with Contrastive Predictive Coding

27 code implementations 10 Jul 2018 Aaron van den Oord, Yazhe Li, Oriol Vinyals

The key insight of our model is to learn such representations by predicting the future in latent space by using powerful autoregressive models.

Representation Learning Self-Supervised Image Classification +1

Learning Implicit Generative Models with the Method of Learned Moments

1 code implementation ICML 2018 Suman Ravuri, Shakir Mohamed, Mihaela Rosca, Oriol Vinyals

We propose a method of moments (MoM) algorithm for training large-scale implicit generative models.

Relational Deep Reinforcement Learning

7 code implementations 5 Jun 2018 Vinicius Zambaldi, David Raposo, Adam Santoro, Victor Bapst, Yujia Li, Igor Babuschkin, Karl Tuyls, David Reichert, Timothy Lillicrap, Edward Lockhart, Murray Shanahan, Victoria Langston, Razvan Pascanu, Matthew Botvinick, Oriol Vinyals, Peter Battaglia

We introduce an approach for deep reinforcement learning (RL) that improves upon the efficiency, generalization capacity, and interpretability of conventional approaches through structured perception and relational reasoning.

reinforcement-learning Reinforcement Learning (RL) +3

A Study on Overfitting in Deep Reinforcement Learning

1 code implementation 18 Apr 2018 Chiyuan Zhang, Oriol Vinyals, Remi Munos, Samy Bengio

We conclude with a general discussion on overfitting in RL and a study of the generalization behaviors from the perspective of inductive bias.

Inductive Bias reinforcement-learning +1

Learning Deep Generative Models of Graphs

no code implementations ICLR 2018 Yujia Li, Oriol Vinyals, Chris Dyer, Razvan Pascanu, Peter Battaglia

Graphs are fundamental data structures which concisely capture the relational structure in many important real-world domains, such as knowledge graphs, physical and social interactions, language, and chemistry.

Graph Generation Knowledge Graphs

Learning to Search with MCTSnets

2 code implementations ICML 2018 Arthur Guez, Théophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver

They are most typically solved by tree search algorithms that simulate ahead into the future, evaluate future states, and back-up those evaluations to the root of a search tree.

Revisiting Bayes by Backprop

no code implementations ICLR 2018 Meire Fortunato, Charles Blundell, Oriol Vinyals

We also empirically demonstrate how Bayesian RNNs are superior to traditional RNNs on a language modelling benchmark and an image captioning task, as well as showing how each of these methods improve our model over a variety of other schemes for training them.

Image Captioning Language Modelling

Population Based Training of Neural Networks

8 code implementations 27 Nov 2017 Max Jaderberg, Valentin Dalibard, Simon Osindero, Wojciech M. Czarnecki, Jeff Donahue, Ali Razavi, Oriol Vinyals, Tim Green, Iain Dunning, Karen Simonyan, Chrisantha Fernando, Koray Kavukcuoglu

Neural networks dominate the modern machine learning landscape, but their training and success still suffer from sensitivity to empirical choices of hyperparameters such as model architecture, loss function, and optimisation algorithm.

Machine Translation Model Selection

Neural Discrete Representation Learning

43 code implementations NeurIPS 2017 Aaron van den Oord, Oriol Vinyals, Koray Kavukcuoglu

Learning useful representations without supervision remains a key challenge in machine learning.

Representation Learning

Hierarchical Representations for Efficient Architecture Search

1 code implementation ICLR 2018 Hanxiao Liu, Karen Simonyan, Oriol Vinyals, Chrisantha Fernando, Koray Kavukcuoglu

We explore efficient neural architecture search methods and show that a simple yet powerful evolutionary algorithm can discover new architectures with excellent performance.

General Classification Image Classification +1

Learning model-based planning from scratch

2 code implementations 19 Jul 2017 Razvan Pascanu, Yujia Li, Oriol Vinyals, Nicolas Heess, Lars Buesing, Sebastien Racanière, David Reichert, Théophane Weber, Daan Wierstra, Peter Battaglia

Here we introduce the "Imagination-based Planner", the first model-based, sequential decision-making agent that can learn to construct, evaluate, and execute plans.

Continuous Control Decision Making

Metacontrol for Adaptive Imagination-Based Optimization

1 code implementation 7 May 2017 Jessica B. Hamrick, Andrew J. Ballard, Razvan Pascanu, Oriol Vinyals, Nicolas Heess, Peter W. Battaglia

The metacontroller component is a model-free reinforcement learning agent, which decides both how many iterations of the optimization procedure to run, as well as which model to consult on each iteration.

Decision Making

Bayesian Recurrent Neural Networks

4 code implementations 10 Apr 2017 Meire Fortunato, Charles Blundell, Oriol Vinyals

We also empirically demonstrate how Bayesian RNNs are superior to traditional RNNs on a language modelling benchmark and an image captioning task, as well as showing how each of these methods improve our model over a variety of other schemes for training them.

Image Captioning Language Modelling

Neural Message Passing for Quantum Chemistry

17 code implementations ICML 2017 Justin Gilmer, Samuel S. Schoenholz, Patrick F. Riley, Oriol Vinyals, George E. Dahl

Supervised learning on molecules has incredible potential to be useful in chemistry, drug discovery, and materials science.

Drug Discovery Formation Energy +3

Understanding Synthetic Gradients and Decoupled Neural Interfaces

1 code implementation ICML 2017 Wojciech Marian Czarnecki, Grzegorz Świrszcz, Max Jaderberg, Simon Osindero, Oriol Vinyals, Koray Kavukcuoglu

When training neural networks, the use of Synthetic Gradients (SG) allows layers or modules to be trained without update locking - without waiting for a true error gradient to be backpropagated - resulting in Decoupled Neural Interfaces (DNIs).

Machine learning prediction errors better than DFT accuracy

no code implementations J. Chem. Theory Comput. 2017 Felix A. Faber, Luke Hutchison, Bing Huang, Justin Gilmer, Samuel S. Schoenholz, George E. Dahl, Oriol Vinyals, Steven Kearnes, Patrick F. Riley, O. Anatole von Lilienfeld

We investigate the impact of choosing regressors and molecular representations for the construction of fast machine learning (ML) models of thirteen electronic ground-state properties of organic molecules.

BIG-bench Machine Learning Drug Discovery +2

Adversarial Evaluation of Dialogue Models

no code implementations 27 Jan 2017 Anjuli Kannan, Oriol Vinyals

The recent application of RNN encoder-decoder models has resulted in substantial progress in fully data-driven dialogue systems, but evaluation remains a challenge.

An Online Sequence-to-Sequence Model Using Partial Conditioning

no code implementations NeurIPS 2016 Navdeep Jaitly, Quoc V. Le, Oriol Vinyals, Ilya Sutskever, David Sussillo, Samy Bengio

However, they are unsuitable for tasks that require incremental predictions to be made as more data arrives or tasks that have long input sequences and output sequences.

Lip Reading Sentences in the Wild

no code implementations CVPR 2017 Joon Son Chung, Andrew Senior, Oriol Vinyals, Andrew Zisserman

The goal of this work is to recognise phrases and sentences being spoken by a talking face, with or without the audio.

Ranked #4 on Lipreading on GRID corpus (mixed-speech) (using extra training data)

Lipreading Lip Reading +2

Understanding deep learning requires rethinking generalization

7 code implementations 10 Nov 2016 Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals

Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small difference between training and test performance.

Image Classification

Connecting Generative Adversarial Networks and Actor-Critic Methods

no code implementations 6 Oct 2016 David Pfau, Oriol Vinyals

Both generative adversarial networks (GAN) in unsupervised learning and actor-critic methods in reinforcement learning (RL) have gained a reputation for being difficult to optimize.

Reinforcement Learning (RL)

Video Pixel Networks

1 code implementation ICML 2017 Nal Kalchbrenner, Aaron van den Oord, Karen Simonyan, Ivo Danihelka, Oriol Vinyals, Alex Graves, Koray Kavukcuoglu

The VPN approaches the best possible performance on the Moving MNIST benchmark, a leap over the previous state of the art, and the generated videos show only minor deviations from the ground truth.

Ranked #1 on Video Prediction on KTH (Cond metric)

Video Prediction

Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge

19 code implementations 21 Sep 2016 Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan

Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing.

Image Captioning Translation

Decoupled Neural Interfaces using Synthetic Gradients

5 code implementations ICML 2017 Max Jaderberg, Wojciech Marian Czarnecki, Simon Osindero, Oriol Vinyals, Alex Graves, David Silver, Koray Kavukcuoglu

Training directed neural networks typically requires forward-propagating data through a computation graph, followed by backpropagating error signal, to produce weight updates.

Strategic Attentive Writer for Learning Macro-Actions

no code implementations NeurIPS 2016 Alexander Vezhnevets, Volodymyr Mnih, John Agapiou, Simon Osindero, Alex Graves, Oriol Vinyals, Koray Kavukcuoglu

We present a novel deep recurrent neural network architecture that learns to build implicit plans in an end-to-end manner purely by interacting with an environment in a reinforcement learning setting.

Atari Games

Contextual LSTM (CLSTM) models for Large scale NLP tasks

no code implementations 19 Feb 2016 Shalini Ghosh, Oriol Vinyals, Brian Strope, Scott Roy, Tom Dean, Larry Heck

We evaluate CLSTM on three specific NLP tasks: word prediction, next sentence selection, and sentence topic prediction.

Paraphrase Generation Question Answering +1

Exploring the Limits of Language Modeling

10 code implementations 7 Feb 2016 Rafal Jozefowicz, Oriol Vinyals, Mike Schuster, Noam Shazeer, Yonghui Wu

In this work we explore recent advances in Recurrent Neural Networks for large scale Language Modeling, a task central to language understanding.

Language Modelling

Multilingual Language Processing From Bytes

no code implementations NAACL 2016 Dan Gillick, Cliff Brunk, Oriol Vinyals, Amarnag Subramanya

We describe an LSTM-based model which we call Byte-to-Span (BTS) that reads text as bytes and outputs span annotations of the form [start, length, label] where start positions, lengths, and labels are separate entries in our vocabulary.

named-entity-recognition Named Entity Recognition +2

Order Matters: Sequence to sequence for sets

6 code implementations 19 Nov 2015 Oriol Vinyals, Samy Bengio, Manjunath Kudlur

Sequences have become first class citizens in supervised learning thanks to the resurgence of recurrent neural networks.

Language Modelling

Multi-task Sequence to Sequence Learning

no code implementations 19 Nov 2015 Minh-Thang Luong, Quoc V. Le, Ilya Sutskever, Oriol Vinyals, Lukasz Kaiser

This paper examines three multi-task learning (MTL) settings for sequence to sequence models: (a) the one-to-many setting - where the encoder is shared between several tasks such as machine translation and syntactic parsing, (b) the many-to-one setting - useful when only the decoder can be shared, as in the case of translation and image caption generation, and (c) the many-to-many setting - where multiple encoders and decoders are shared, which is the case with unsupervised objectives and translation.

Machine Translation Multi-Task Learning +1

Towards Principled Unsupervised Learning

no code implementations 19 Nov 2015 Ilya Sutskever, Rafal Jozefowicz, Karol Gregor, Danilo Rezende, Tim Lillicrap, Oriol Vinyals

Supervised learning is successful because it can be solved by the minimization of the training error cost function.

Domain Adaptation

Generating Sentences from a Continuous Space

17 code implementations CONLL 2016 Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz, Samy Bengio

The standard recurrent neural network language model (RNNLM) generates sentences one word at a time and does not work from an explicit global sentence representation.

Language Modelling

A Neural Transducer

no code implementations 16 Nov 2015 Navdeep Jaitly, David Sussillo, Quoc V. Le, Oriol Vinyals, Ilya Sutskever, Samy Bengio

However, they are unsuitable for tasks that require incremental predictions to be made as more data arrives or tasks that have long input sequences and output sequences.

Listen, Attend and Spell

40 code implementations 5 Aug 2015 William Chan, Navdeep Jaitly, Quoc V. Le, Oriol Vinyals

Unlike traditional DNN-HMM models, this model learns all the components of a speech recognizer jointly.

Language Modelling Reading Comprehension +1

A Neural Conversational Model

19 code implementations 19 Jun 2015 Oriol Vinyals, Quoc Le

We find that this straightforward model can generate simple conversations given a large conversational training dataset.

Common Sense Reasoning Natural Language Understanding

Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks

9 code implementations NeurIPS 2015 Samy Bengio, Oriol Vinyals, Navdeep Jaitly, Noam Shazeer

Recurrent Neural Networks can be trained to produce sequences of tokens given some input, as exemplified by recent results in machine translation and image captioning.

Constituency Parsing Image Captioning +2

Pointer Networks

18 code implementations NeurIPS 2015 Oriol Vinyals, Meire Fortunato, Navdeep Jaitly

It differs from the previous attention attempts in that, instead of using attention to blend hidden units of an encoder to a context vector at each decoder step, it uses attention as a pointer to select a member of the input sequence as the output.

Ranked #6 on Point Cloud Completion on ShapeNet (using extra training data)

Combinatorial Optimization Point Cloud Completion

Beyond Short Snippets: Deep Networks for Video Classification

1 code implementation CVPR 2015 Joe Yue-Hei Ng, Matthew Hausknecht, Sudheendra Vijayanarasimhan, Oriol Vinyals, Rajat Monga, George Toderici

Convolutional neural networks (CNNs) have been extensively applied for image recognition problems giving state-of-the-art results on recognition, detection, segmentation and retrieval.

Action Recognition Classification +4

Distilling the Knowledge in a Neural Network

59 code implementations 9 Mar 2015 Geoffrey Hinton, Oriol Vinyals, Jeff Dean

A very simple way to improve the performance of almost any machine learning algorithm is to train many different models on the same data and then to average their predictions.

Knowledge Distillation

Grammar as a Foreign Language

8 code implementations NeurIPS 2015 Oriol Vinyals, Lukasz Kaiser, Terry Koo, Slav Petrov, Ilya Sutskever, Geoffrey Hinton

Syntactic constituency parsing is a fundamental problem in natural language processing and has been the subject of intensive research and engineering for decades.

Constituency Parsing

Qualitatively characterizing neural network optimization problems

1 code implementation 19 Dec 2014 Ian J. Goodfellow, Oriol Vinyals, Andrew M. Saxe

Training neural networks involves solving large-scale non-convex optimization problems.

Addressing the Rare Word Problem in Neural Machine Translation

5 code implementations IJCNLP 2015 Minh-Thang Luong, Ilya Sutskever, Quoc V. Le, Oriol Vinyals, Wojciech Zaremba

Our experiments on the WMT14 English to French translation task show that this method provides a substantial improvement of up to 2.8 BLEU points over an equivalent NMT system that does not use this technique.

Machine Translation NMT +2

Sequence to Sequence Learning with Neural Networks

68 code implementations NeurIPS 2014 Ilya Sutskever, Oriol Vinyals, Quoc V. Le

Our method uses a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.

Ranked #5 on Traffic Prediction on PeMS-M (using extra training data)

Machine Translation Time Series Forecasting +1

Recurrent Neural Network Regularization

20 code implementations 8 Sep 2014 Wojciech Zaremba, Ilya Sutskever, Oriol Vinyals

We present a simple regularization technique for Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units.

Image Captioning Language Modelling +3

DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition

8 code implementations 6 Oct 2013 Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, Trevor Darrell

We evaluate whether features extracted from the activation of a deep convolutional network trained in a fully supervised fashion on a large, fixed set of object recognition tasks can be re-purposed to novel generic tasks.

Domain Adaptation Object Recognition +2

Why Size Matters: Feature Coding as Nystrom Sampling

no code implementations 15 Jan 2013 Oriol Vinyals, Yangqing Jia, Trevor Darrell

Recently, the computer vision and machine learning community has been in favor of feature extraction pipelines that rely on a coding step followed by a linear classifier, due to their overall simplicity, well understood properties of linear classifiers, and their computational efficiency.

Learning with Recursive Perceptual Representations

no code implementations NeurIPS 2012 Oriol Vinyals, Yangqing Jia, Li Deng, Trevor Darrell

The use of random projections is key to our method, as we show in the experiments section, in which we observe a consistent improvement over previous --often more complicated-- methods on several vision and speech benchmarks.

Image Classification Object Recognition
