ICLR 2018

Parameter Space Noise for Exploration

ICLR 2018 tensorflow/models

Combining parameter noise with traditional RL methods yields the best of both worlds.

CONTINUOUS CONTROL
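The idea is to inject noise into the policy's parameters (rather than its actions) and keep the perturbation fixed for a whole episode, adapting the noise scale so the perturbed policy stays close to the unperturbed one. A minimal numpy sketch of that loop, assuming a hypothetical linear policy and toy values:

```python
import numpy as np

rng = np.random.default_rng(0)

def perturb(params, sigma):
    """Add spherical Gaussian noise to every parameter tensor."""
    return [p + sigma * rng.standard_normal(p.shape) for p in params]

def policy_action(params, obs):
    """Hypothetical linear policy: a = W @ obs + b."""
    W, b = params
    return W @ obs + b

def adapt_sigma(sigma, distance, target, factor=1.01):
    """Shrink/grow sigma so the perturbed policy stays near a target action distance."""
    return sigma / factor if distance > target else sigma * factor

params = [rng.standard_normal((2, 4)), np.zeros(2)]
sigma, target = 0.1, 0.2
for episode in range(3):
    noisy = perturb(params, sigma)           # one perturbation per episode
    obs = rng.standard_normal(4)
    a_noisy = policy_action(noisy, obs)
    a_clean = policy_action(params, obs)
    sigma = adapt_sigma(sigma, np.linalg.norm(a_noisy - a_clean), target)
```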

Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling

ICLR 2018 tensorflow/models

At the same time, advances in approximate Bayesian methods have made posterior approximation for flexible neural network models practical.

DECISION MAKING MULTI-ARMED BANDITS
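The paper benchmarks posterior approximations for Thompson sampling with neural models; the decision rule itself is simple and is sketched below on a Bernoulli bandit with exact Beta posteriors (standing in for the approximate neural posteriors the paper compares, with hypothetical arm probabilities):

```python
import numpy as np

rng = np.random.default_rng(0)
true_probs = np.array([0.2, 0.5, 0.7])   # hypothetical arm reward probabilities
alpha = np.ones(3)                       # Beta posterior: successes + 1
beta = np.ones(3)                        # Beta posterior: failures + 1

for t in range(1000):
    theta = rng.beta(alpha, beta)        # sample one plausible model from the posterior
    arm = int(np.argmax(theta))          # act greedily under the sampled model
    reward = rng.random() < true_probs[arm]
    alpha[arm] += reward                 # posterior update for the pulled arm
    beta[arm] += 1 - reward
```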

Scalable Private Learning with PATE

ICLR 2018 tensorflow/models

To address those concerns, one promising approach is Private Aggregation of Teacher Ensembles, or PATE, which transfers to a "student" model the knowledge of an ensemble of "teacher" models, with intuitive privacy provided by training teachers on disjoint data and strong privacy guaranteed by noisy aggregation of teachers' answers.
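A minimal sketch of the noisy aggregation the excerpt refers to: each teacher votes for a label, noise is added to the vote histogram, and the student trains on the noisy winner. The Laplace scale and teacher counts below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_aggregate(teacher_votes, num_classes, scale=2.0):
    """Return the label with the highest noise-perturbed vote count.

    teacher_votes: per-teacher predicted labels for one student query.
    scale: noise scale; larger scale -> stronger privacy, noisier answers.
    """
    counts = np.bincount(teacher_votes, minlength=num_classes).astype(float)
    counts += rng.laplace(scale=scale, size=num_classes)
    return int(np.argmax(counts))

# Hypothetical query: 250 teachers, 10 classes, most teachers agree on class 3.
votes = np.concatenate([np.full(180, 3), rng.integers(0, 10, size=70)])
student_label = noisy_aggregate(votes, num_classes=10)
```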

Generating Wikipedia by Summarizing Long Sequences

ICLR 2018 tensorflow/tensor2tensor

We show that generating English Wikipedia articles can be approached as a multi-document summarization of source documents.

DOCUMENT SUMMARIZATION MULTI-DOCUMENT SUMMARIZATION
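The approach pairs an extractive first stage over the source documents with an abstractive sequence model. Below is a crude stand-in for that extractive stage, assuming a simple title-overlap ranking (the paper describes tf-idf-style ranking) and a hypothetical token budget:

```python
def extract(title, paragraphs, budget=500):
    """Rank source paragraphs by word overlap with the title and keep the
    top-ranked ones until a token budget is filled; the concatenation is then
    fed to an abstractive summarizer as one long input."""
    title_words = set(title.lower().split())
    ranked = sorted(paragraphs,
                    key=lambda p: -len(title_words & set(p.lower().split())))
    kept, used = [], 0
    for p in ranked:
        n = len(p.split())
        if used + n > budget:
            break
        kept.append(p)
        used += n
    return " ".join(kept)

docs = ["The dodo was a flightless bird endemic to Mauritius.",
        "Unrelated paragraph about something else entirely.",
        "Dodo extinction is attributed to hunting and introduced species."]
source = extract("Dodo bird Mauritius", docs, budget=60)
```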

Discrete Autoencoders for Sequence Models

ICLR 2018 tensorflow/tensor2tensor

We propose to improve the representation in sequence models by augmenting current approaches with an autoencoder that is forced to compress the sequence through an intermediate discrete latent space.

LANGUAGE MODELLING MACHINE TRANSLATION
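The bottleneck forces the sequence through a small discrete code. A forward-only numpy sketch of such a discretization bottleneck, assuming a hypothetical tiny encoder/decoder (during training, a straight-through-style estimator would be used to pass gradients through the binarization):

```python
import numpy as np

rng = np.random.default_rng(0)

def discretize(z):
    """Binarize a real-valued bottleneck into a discrete latent code.
    Forward pass only; training would route gradients around the threshold."""
    return (z > 0).astype(np.float32)

# Hypothetical tiny autoencoder over a length-8 sequence of 4-dim embeddings.
enc_W = rng.standard_normal((8 * 4, 16)) * 0.1   # compress the sequence into 16 bits
dec_W = rng.standard_normal((16, 8 * 4)) * 0.1

seq = rng.standard_normal((8, 4))
code = discretize(seq.reshape(-1) @ enc_W)       # discrete summary of the sequence
recon = (code @ dec_W).reshape(8, 4)             # decoder reconstructs from the code
```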

Depthwise Separable Convolutions for Neural Machine Translation

ICLR 2018 tensorflow/tensor2tensor

In this work, we study how depthwise separable convolutions can be applied to neural machine translation.

MACHINE TRANSLATION
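A depthwise separable convolution filters each channel independently and then mixes channels with a 1x1 pointwise convolution, cutting parameters from k·c·c_out to k·c + c·c_out. A minimal 1-D numpy sketch with hypothetical shapes:

```python
import numpy as np

def depthwise_separable_conv1d(x, depth_kernels, point_weights):
    """x: (length, channels); depth_kernels: (k, channels); point_weights: (channels, out_channels)."""
    length, channels = x.shape
    k = depth_kernels.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    depth_out = np.empty_like(x)
    for t in range(length):
        depth_out[t] = np.sum(xp[t:t + k] * depth_kernels, axis=0)  # per-channel filtering
    return depth_out @ point_weights                                # 1x1 channel mixing

rng = np.random.default_rng(0)
x = rng.standard_normal((10, 8))                       # 10 positions, 8 channels
y = depthwise_separable_conv1d(x, rng.standard_normal((3, 8)),
                               rng.standard_normal((8, 16)))
```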

Word Translation Without Parallel Data

ICLR 2018 facebookresearch/MUSE

We finally describe experiments on the English-Esperanto low-resource language pair, for which only a limited amount of parallel data exists, to show the potential impact of our method on fully unsupervised machine translation.

UNSUPERVISED MACHINE TRANSLATION WORD EMBEDDINGS
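The method's refinement step solves an orthogonal Procrustes problem: given paired source/target word vectors, the best orthogonal mapping has a closed form via SVD. A numpy sketch on synthetic data (in the paper the pairs come from an adversarially induced dictionary; here they are assumed given):

```python
import numpy as np

def procrustes_align(X, Y):
    """Closed-form orthogonal map W minimizing ||W X - Y||_F.
    X, Y: (dim, n) matrices of paired source / target word vectors."""
    U, _, Vt = np.linalg.svd(Y @ X.T)
    return U @ Vt

rng = np.random.default_rng(0)
dim, n = 50, 200
true_W, _ = np.linalg.qr(rng.standard_normal((dim, dim)))   # hidden rotation
X = rng.standard_normal((dim, n))                           # "source language" vectors
Y = true_W @ X + 0.01 * rng.standard_normal((dim, n))       # noisy "target" counterparts
W = procrustes_align(X, Y)                                  # recovered map, close to true_W
```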

Model compression via distillation and quantization

ICLR 2018 NervanaSystems/distiller

Deep neural networks (DNNs) continue to make significant advances, solving tasks from image classification to translation or reinforcement learning.

MODEL COMPRESSION QUANTIZATION
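The two ingredients named in the title are easy to illustrate in isolation: uniform quantization of a weight tensor and a distillation loss against softened teacher outputs. A forward-only numpy sketch with hypothetical bit-width and temperature (not the paper's exact training procedure):

```python
import numpy as np

def uniform_quantize(w, bits=4):
    """Uniformly quantize a weight tensor to 2**bits levels over its range."""
    levels = 2 ** bits - 1
    lo, hi = w.min(), w.max()
    scale = (hi - lo) / levels
    return np.round((w - lo) / scale) * scale + lo

def softmax(z, T=1.0):
    z = z / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Cross-entropy against the teacher's temperature-softened distribution."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return -np.sum(p_t * np.log(p_s + 1e-12))

rng = np.random.default_rng(0)
w = rng.standard_normal((16, 10))
wq = uniform_quantize(w, bits=4)          # quantized student weights
x = rng.standard_normal(16)
loss = distillation_loss(x @ wq, x @ w)   # teacher = full-precision model here
```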

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning

ICLR 2018 facebookresearch/InferSent

In this work, we present a simple, effective multi-task learning framework for sentence representations that combines the inductive biases of diverse training objectives in a single model.

MULTI-TASK LEARNING NATURAL LANGUAGE INFERENCE PARAPHRASE IDENTIFICATION SEMANTIC TEXTUAL SIMILARITY
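The framework shares one sentence encoder across objectives and attaches a lightweight head per task, sampling a task for each training step. A forward-only numpy sketch with hypothetical task names and sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_rep = 32, 64

# One shared sentence encoder ...
shared_W = rng.standard_normal((d_in, d_rep)) * 0.1

# ... and a separate head per training objective (tasks and sizes hypothetical).
heads = {
    "nli":         rng.standard_normal((d_rep, 3)) * 0.1,    # entail / neutral / contradict
    "paraphrase":  rng.standard_normal((d_rep, 2)) * 0.1,
    "translation": rng.standard_normal((d_rep, 100)) * 0.1,  # toy target vocabulary
}

def encode(x):
    return np.tanh(x @ shared_W)          # shared representation reused by every task

for step in range(5):
    task = rng.choice(list(heads))        # pick one objective per step
    x = rng.standard_normal(d_in)         # stand-in for an encoded sentence
    logits = encode(x) @ heads[task]      # only this task's head is used this step
```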