Learned Optimizers that Scale and Generalize

ICML 2017 tensorflow/models

Two of the primary barriers to the adoption of learned optimizers are an inability to scale to larger problems and a limited ability to generalize to new tasks.

Convolutional Sequence to Sequence Learning

ICML 2017 facebookresearch/fairseq-py

The prevalent approach to sequence to sequence learning maps an input sequence to a variable-length output sequence via recurrent neural networks.


ProtoNN: Compressed and Accurate kNN for Resource-scarce Devices

ICML 2017 Microsoft/EdgeML

Resource-scarce devices demand prediction models with small storage and computational complexity that do not significantly compromise accuracy.
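ProtoNN makes kNN compact by replacing the training set with a few learned prototypes in a learned low-dimensional projection of the input. A minimal sketch of that prediction path, with illustrative shapes and variable names (not the paper's implementation, which also learns sparse `W`, `B`, and `Z` jointly):

```python
import numpy as np

def protonn_predict(x, W, B, Z, gamma=1.0):
    """ProtoNN-style prediction sketch: project the input with W,
    score RBF similarity to prototypes B, mix their label vectors Z."""
    z = W @ x                                         # low-dim projection
    sims = np.exp(-gamma * np.sum((B - z) ** 2, axis=1))
    scores = sims @ Z                                 # similarity-weighted labels
    return int(np.argmax(scores))

# Toy setup: project 4-dim inputs to 2 dims, two prototypes, two classes.
x = np.array([0.1, -0.1, 9.0, 9.0])
W = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
B = np.array([[0.0, 0.0], [5.0, 5.0]])   # prototypes in projected space
Z = np.array([[1.0, 0.0], [0.0, 1.0]])   # one-hot label vectors
pred = protonn_predict(x, W, B, Z)        # nearest prototype wins: class 0
```

Storage is dominated by `W`, `B`, and `Z`, so the model size is set by the projection dimension and prototype count rather than by the training-set size.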


Resource-efficient Machine Learning in 2 KB RAM for the Internet of Things

ICML 2017 Microsoft/EdgeML

This paper develops a novel tree-based algorithm, called Bonsai, for efficient prediction on IoT devices – such as those based on the Arduino Uno board, which has an 8-bit ATmega328P microcontroller operating at 16 MHz with no native floating point support, 2 KB of RAM, and 32 KB of read-only flash.


Image-to-Markup Generation with Coarse-to-Fine Attention

ICML 2017 harvardnlp/im2markup

We present a neural encoder-decoder model to convert images into presentational markup based on a scalable coarse-to-fine attention mechanism.


Neural Message Passing for Quantum Chemistry

ICML 2017 Microsoft/gated-graph-neural-network-samples

Supervised learning on molecules has incredible potential to be useful in chemistry, drug discovery, and materials science.
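The message passing framework the paper abstracts over alternates two phases on a molecular graph: each node aggregates messages computed from its neighbors' states, then updates its own state. A minimal one-round sketch with linear messages and a tanh update (illustrative choices; the paper surveys several message and update functions):

```python
import numpy as np

def mpnn_step(h, adj, W_msg, W_upd):
    """One round of neural message passing: sum transformed neighbor
    states along graph edges, then update each node's state."""
    msgs = adj @ (h @ W_msg)            # aggregate messages from neighbors
    return np.tanh(h @ W_upd + msgs)    # simple node-update function

# Toy molecule: 3 atoms in a chain, 4-dim node states.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
h = rng.standard_normal((3, 4))
W_msg = rng.standard_normal((4, 4)) * 0.1
W_upd = rng.standard_normal((4, 4)) * 0.1
h = mpnn_step(h, adj, W_msg, W_upd)
```

Running several such rounds lets information propagate along bonds; a readout function over the final node states then produces a molecule-level prediction.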


Axiomatic Attribution for Deep Networks

ICML 2017 PAIR-code/saliency

We study the problem of attributing the prediction of a deep network to its input features, a problem previously studied by several other works.
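The method this axiomatic treatment leads to, integrated gradients, attributes a prediction by accumulating gradients along the straight-line path from a baseline input to the actual input. A minimal Riemann-sum sketch on a toy model (the `grad_fn` and step count are illustrative):

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate integrated gradients along the straight-line path
    from `baseline` to `x` via a midpoint Riemann sum."""
    alphas = (np.arange(steps) + 0.5) / steps
    total = np.zeros_like(x, dtype=float)
    for a in alphas:
        total += grad_fn(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

# Toy model: f(x) = sum(x**2), so grad f(x) = 2x.
grad_fn = lambda x: 2.0 * x
x = np.array([1.0, 2.0])
baseline = np.zeros(2)
attr = integrated_gradients(grad_fn, x, baseline)
# Completeness axiom: attributions sum to f(x) - f(baseline) = 5.
```

For this quadratic toy model the attributions are exactly `x**2` per coordinate, and their sum recovers the output difference, which is the completeness property the axiomatic analysis demands.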

Recurrent Highway Networks

ICML 2017 julian121266/RecurrentHighwayNetworks

We introduce a novel theoretical analysis of recurrent networks based on Geršgorin's circle theorem that illuminates several modeling and optimization issues and improves our understanding of the LSTM cell.


Efficient softmax approximation for GPUs

ICML 2017 facebookresearch/adaptive-softmax

We propose an approximate strategy to efficiently train neural network based language models over very large vocabularies.
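The adaptive softmax exploits the skewed unigram distribution: a small "head" cluster of frequent words is always evaluated, while rarer "tail" clusters are evaluated only when needed. A back-of-the-envelope cost model for the simplest head/tail split (function name and cost units are illustrative; the paper optimizes multi-cluster splits against actual GPU matrix-multiply timings):

```python
import numpy as np

def two_cluster_cost(probs, cut, d):
    """Expected per-token multiply-add cost of a head/tail softmax split
    at index `cut`, given hidden size `d` and unigram `probs` sorted
    most-frequent first. The head holds `cut` words plus one tail token."""
    p_tail = probs[cut:].sum()
    head_cost = d * (cut + 1)                     # always evaluated
    tail_cost = p_tail * d * (len(probs) - cut)   # evaluated with prob p_tail
    return head_cost + tail_cost

# Zipfian vocabulary of 10k words, hidden size 64, head of 1k words.
probs = 1.0 / np.arange(1, 10001)
probs /= probs.sum()
d = 64
cost = two_cluster_cost(probs, 1000, d)
full_cost = d * len(probs)   # plain softmax touches every word
```

Under this toy Zipf distribution the split already cuts the expected cost to a fraction of the full softmax, and the gap widens with vocabulary size.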

Developing Bug-Free Machine Learning Systems With Formal Mathematics

ICML 2017 dselsam/certigrad

As a case study, we implement a new system, Certigrad, for optimizing over stochastic computation graphs, and we generate a formal (i.e., machine-checkable) proof that the gradients sampled by the system are unbiased estimates of the true mathematical gradients.