ICLR 2019

Learning Unsupervised Learning Rules

ICLR 2019 tensorflow/models

Here, our desired task (meta-objective) is the performance of the representation on semi-supervised classification, and we meta-learn an algorithm -- an unsupervised weight update rule -- that produces representations that perform well under this meta-objective.

META-LEARNING
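A minimal sketch of the meta-learning setup the abstract describes, under strong simplifying assumptions: a small learned update rule (here an MLP over each unit's weights and a Hebbian-style correlation, an illustrative choice, not the authors' rule) is applied in an unrolled unsupervised inner loop, and its meta-parameters are trained so the resulting features score well on a few labeled examples. This is not the tensorflow/models implementation.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d_in, d_feat, n_classes = 20, 16, 5

# Meta-parameters of a hypothetical learned update rule: it maps each feature
# unit's weight vector plus an input/activation correlation to a weight delta.
rule = torch.nn.Sequential(
    torch.nn.Linear(2 * d_in, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, d_in),
)
meta_opt = torch.optim.Adam(rule.parameters(), lr=1e-3)

def inner_update(W, x):
    """One unsupervised update of the base network's weights W (d_feat x d_in)."""
    h = torch.tanh(x @ W.t())                         # (batch, d_feat) activations
    corr = h.t() @ x / x.shape[0]                     # (d_feat, d_in) unit/input correlations
    delta = rule(torch.cat([W, corr], dim=1))         # learned per-unit weight delta
    return W + 0.1 * delta

def meta_loss(W, x_tr, y_tr, x_va, y_va):
    """Semi-supervised meta-objective: ridge-regress a linear readout on a few
    labeled features, then score it on held-out labeled points."""
    phi = torch.tanh(x_tr @ W.t())
    Y = F.one_hot(y_tr, n_classes).float()
    A = phi.t() @ phi + 1e-2 * torch.eye(d_feat)
    readout = torch.linalg.solve(A, phi.t() @ Y)      # (d_feat, n_classes)
    logits = torch.tanh(x_va @ W.t()) @ readout
    return F.cross_entropy(logits, y_va)

for step in range(200):
    proto = torch.randn(n_classes, d_in)              # toy task: class prototypes
    def sample(n):
        y = torch.randint(0, n_classes, (n,))
        return proto[y] + 0.5 * torch.randn(n, d_in), y

    W = 0.1 * torch.randn(d_feat, d_in)               # fresh base network each episode
    x_unlab, _ = sample(128)
    for _ in range(5):                                # unrolled unsupervised inner loop
        W = inner_update(W, x_unlab)

    (x_tr, y_tr), (x_va, y_va) = sample(20), sample(20)
    loss = meta_loss(W, x_tr, y_tr, x_va, y_va)       # meta-objective on labeled data
    meta_opt.zero_grad(); loss.backward(); meta_opt.step()
```

Gradients flow from the meta-objective back through the unrolled inner updates into the rule's parameters, which is the essential structure of learning a weight update rule.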

Pay Less Attention with Lightweight and Dynamic Convolutions

ICLR 2019 pytorch/fairseq

We predict separate convolution kernels based solely on the current time-step in order to determine the importance of context elements.

ABSTRACTIVE TEXT SUMMARIZATION LANGUAGE MODELLING MACHINE TRANSLATION
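A hedged, single-head sketch of the dynamic-convolution idea in that sentence: a small linear layer predicts a softmax-normalized kernel from the current time step alone, and that kernel weights the surrounding context window. This is an illustration of the mechanism, not the pytorch/fairseq LightConv/DynamicConv modules.

```python
import torch
import torch.nn.functional as F

class SimpleDynamicConv(torch.nn.Module):
    """Single-head dynamic convolution: the kernel mixing a local window of
    context is predicted from the current time step only (simplified)."""
    def __init__(self, dim, kernel_size=3):
        super().__init__()
        self.kernel_size = kernel_size
        self.kernel_proj = torch.nn.Linear(dim, kernel_size)  # x_t -> kernel weights

    def forward(self, x):                                      # x: (batch, time, dim)
        weights = F.softmax(self.kernel_proj(x), dim=-1)       # (B, T, K) per-step kernels
        pad = self.kernel_size // 2
        xp = F.pad(x, (0, 0, pad, pad))                        # pad along the time axis
        # Gather the local window around each position and mix it with the kernel.
        windows = xp.unfold(dimension=1, size=self.kernel_size, step=1)  # (B, T, D, K)
        return torch.einsum('btdk,btk->btd', windows, weights)

x = torch.randn(2, 10, 8)
print(SimpleDynamicConv(dim=8)(x).shape)                       # torch.Size([2, 10, 8])
```

Unlike self-attention, the number of weights each position computes is fixed by the kernel size rather than the sequence length, which is where the "pay less attention" efficiency argument comes from.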

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

ICLR 2019 kimiyoung/transformer-xl

Transformer networks have the potential to learn longer-term dependencies, but are limited by a fixed-length context in the setting of language modeling.

LANGUAGE MODELLING

Quaternion Recurrent Neural Networks

ICLR 2019 mravanelli/pytorch-kaldi

Recurrent neural networks (RNNs) are powerful architectures for modeling sequential data, owing to their ability to learn short- and long-term dependencies between the basic elements of a sequence.

Large-Scale Study of Curiosity-Driven Learning

ICLR 2019 openai/large-scale-curiosity

However, annotating each environment with hand-designed, dense rewards is not scalable, motivating the need for reward functions that are intrinsic to the agent.

ATARI GAMES SNES GAMES
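A minimal sketch of one common intrinsic reward in this line of work: curiosity as the prediction error of a forward dynamics model in feature space. The module and names below are illustrative assumptions, not the openai/large-scale-curiosity code; here the encoder is left untrained, mirroring the random-features variant studied in such work.

```python
import torch

class ForwardDynamicsCuriosity(torch.nn.Module):
    """Intrinsic reward = error of a forward model predicting next-state features
    from current features and the action (a simplified curiosity signal)."""
    def __init__(self, obs_dim, act_dim, feat_dim=32):
        super().__init__()
        self.encoder = torch.nn.Sequential(
            torch.nn.Linear(obs_dim, 64), torch.nn.ReLU(),
            torch.nn.Linear(64, feat_dim),
        )
        self.forward_model = torch.nn.Sequential(
            torch.nn.Linear(feat_dim + act_dim, 64), torch.nn.ReLU(),
            torch.nn.Linear(64, feat_dim),
        )

    def forward(self, obs, action, next_obs):
        # The encoder acts as fixed random features here, so only the forward
        # model is trained by the prediction-error loss below.
        with torch.no_grad():
            phi, phi_next = self.encoder(obs), self.encoder(next_obs)
        pred = self.forward_model(torch.cat([phi, action], dim=-1))
        return 0.5 * (pred - phi_next).pow(2).mean(dim=-1)   # per-transition error

curiosity = ForwardDynamicsCuriosity(obs_dim=16, act_dim=4)
obs, act, nxt = torch.randn(8, 16), torch.randn(8, 4), torch.randn(8, 16)
intrinsic_reward = curiosity(obs, act, nxt)   # (8,) -- stands in for the env reward
intrinsic_reward.mean().backward()            # the same error trains the forward model
```

The agent is then trained purely on this intrinsic signal, with no environment reward, which is exactly the regime the large-scale study evaluates.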

Instance-aware Image-to-Image Translation

ICLR 2019 sangwoomo/instagan

Unsupervised image-to-image translation has gained considerable attention due to the recent impressive progress based on generative adversarial networks (GANs).

SEMANTIC SEGMENTATION UNSUPERVISED IMAGE-TO-IMAGE TRANSLATION

Rethinking the Value of Network Pruning

ICLR 2019 Eric-mingjie/rethinking-network-pruning

Our results have several implications: 1) training a large, over-parameterized model is not necessary to obtain an efficient final model, 2) learned "important" weights of the large model are not necessarily useful for the small pruned model, 3) the pruned architecture itself, rather than a set of inherited "important" weights, is what leads to the efficiency benefit in the final model, which suggests that some pruning algorithms could be seen as performing network architecture search.

ARCHITECTURE SEARCH NETWORK PRUNING
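A hedged sketch of the comparison those implications rest on: prune a trained network by weight magnitude to obtain a smaller architecture, then either fine-tune the inherited weights (the conventional pipeline) or reinitialize and train the same pruned architecture from scratch. Layer sizes and the toy training loop are placeholders, not the Eric-mingjie/rethinking-network-pruning code.

```python
import torch

def train(model, data, targets, steps=100):
    """Placeholder training loop on toy regression data."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(steps):
        loss = torch.nn.functional.mse_loss(model(data), targets)
        opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

x, y = torch.randn(256, 10), torch.randn(256, 1)

# 1) Train a large, over-parameterized model.
big = torch.nn.Sequential(torch.nn.Linear(10, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1))
train(big, x, y)

# 2) Structured pruning: rank hidden units by the L1 norm of their incoming
#    weights and keep the top quarter -- this defines the pruned *architecture*.
keep = big[0].weight.abs().sum(dim=1).topk(16).indices

# 3a) Conventional pipeline: the small model inherits the "important" weights
#     and is then fine-tuned.
inherited = torch.nn.Sequential(torch.nn.Linear(10, 16), torch.nn.ReLU(), torch.nn.Linear(16, 1))
with torch.no_grad():
    inherited[0].weight.copy_(big[0].weight[keep]); inherited[0].bias.copy_(big[0].bias[keep])
    inherited[2].weight.copy_(big[2].weight[:, keep]); inherited[2].bias.copy_(big[2].bias)
loss_inherited = train(inherited, x, y)

# 3b) The paper's comparison: the same pruned architecture, freshly initialized
#     and trained from scratch -- its finding is that this often matches or
#     beats fine-tuning the inherited weights.
scratch = torch.nn.Sequential(torch.nn.Linear(10, 16), torch.nn.ReLU(), torch.nn.Linear(16, 1))
loss_scratch = train(scratch, x, y)
print(loss_inherited, loss_scratch)
```

Viewed this way, step 2 is doing architecture search: the useful output of pruning is the smaller architecture, not the particular surviving weights.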

Convolutional CRFs for Semantic Segmentation

ICLR 2019 MarvinTeichmann/ConvCRF

For the challenging semantic image segmentation task, the most efficient models have traditionally combined the structured modelling capabilities of Conditional Random Fields (CRFs) with the feature extraction power of CNNs.

SEMANTIC SEGMENTATION

Trellis Networks for Sequence Modeling

ICLR 2019 locuslab/trellisnet

On the other hand, we show that truncated recurrent networks are equivalent to trellis networks with special sparsity structure in their weight matrices.

LANGUAGE MODELLING

Looking for ELMo's Friends: Sentence-Level Pretraining Beyond Language Modeling

ICLR 2019 jsalt18-sentence-repl/jiant

Work on the problem of contextualized word representation---the development of reusable neural network components for sentence understanding---has recently seen a surge of progress centered on the unsupervised pretraining task of language modeling with methods like ELMo.

LANGUAGE MODELLING