NeurIPS 2019

PyTorch: An Imperative Style, High-Performance Deep Learning Library

NeurIPS 2019 pytorch/pytorch

Deep learning frameworks have often focused on either usability or speed, but not both.

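A minimal sketch of the imperative, define-by-run style the paper argues for; the model, dummy data, and hyperparameters below are illustrative, not taken from the paper.

```python
import torch
import torch.nn as nn

# A small model defined imperatively; the forward pass is ordinary Python,
# so it can be stepped through and debugged like any other code.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = TinyNet()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(32, 784)          # dummy batch
y = torch.randint(0, 10, (32,))   # dummy labels

loss = loss_fn(model(x), y)       # the graph is recorded eagerly as the code runs
optimizer.zero_grad()
loss.backward()                   # autograd replays the recorded operations
optimizer.step()
```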

Cross-lingual Language Model Pretraining

NeurIPS 2019 huggingface/transformers

On unsupervised machine translation, we obtain 34.3 BLEU on WMT'16 German-English, improving the previous state of the art by more than 9 BLEU.

LANGUAGE MODELLING, UNSUPERVISED MACHINE TRANSLATION
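
The pretrained cross-lingual models behind these results are distributed through the listed repository; below is a minimal loading sketch, assuming the English-German masked-LM checkpoint name published with huggingface/transformers.

```python
# Load a pretrained XLM checkpoint via huggingface/transformers.
# "xlm-mlm-ende-1024" is assumed to be the English-German masked-LM
# checkpoint shipped with the library; swap in the checkpoint you need.
from transformers import XLMTokenizer, XLMWithLMHeadModel

tokenizer = XLMTokenizer.from_pretrained("xlm-mlm-ende-1024")
model = XLMWithLMHeadModel.from_pretrained("xlm-mlm-ende-1024")

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch, sequence_length, vocab_size)
```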

XLNet: Generalized Autoregressive Pretraining for Language Understanding

NeurIPS 2019 huggingface/transformers

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling.

DOCUMENT RANKING, LANGUAGE MODELLING, NATURAL LANGUAGE INFERENCE, QUESTION ANSWERING, READING COMPREHENSION, SEMANTIC TEXTUAL SIMILARITY, SENTIMENT ANALYSIS, TEXT CLASSIFICATION
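
A short sketch of running the released XLNet base checkpoint through huggingface/transformers; the example sentence is illustrative.

```python
import torch
# "xlnet-base-cased" is the published base checkpoint in the library.
from transformers import XLNetTokenizer, XLNetLMHeadModel

tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
model = XLNetLMHeadModel.from_pretrained("xlnet-base-cased")

inputs = tokenizer("XLNet is an autoregressive pretraining method.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.logits.shape)  # per-token vocabulary logits
```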

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

NeurIPS 2019 huggingface/transformers

As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large models on the edge and/or under constrained computational training or inference budgets remains challenging.

LANGUAGE MODELLING, LINGUISTIC ACCEPTABILITY, NATURAL LANGUAGE INFERENCE, QUESTION ANSWERING, SEMANTIC TEXTUAL SIMILARITY, SENTIMENT ANALYSIS, TRANSFER LEARNING
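
A minimal sketch of extracting features with the released distilled checkpoint via huggingface/transformers; the input sentence is illustrative.

```python
import torch
# "distilbert-base-uncased" is the released distilled checkpoint.
from transformers import DistilBertTokenizer, DistilBertModel

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
model = DistilBertModel.from_pretrained("distilbert-base-uncased")

inputs = tokenizer("DistilBERT keeps most of BERT's accuracy at a fraction of the size.",
                   return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # (batch, tokens, hidden_size)
print(hidden.shape)
```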

Unsupervised learning of object structure and dynamics from videos

NeurIPS 2019 google-research/google-research

Extracting and predicting object structure and dynamics from videos without supervision is a major challenge in machine learning.

CONTINUOUS CONTROL, OBJECT TRACKING, VIDEO PREDICTION

A Benchmark for Interpretability Methods in Deep Neural Networks

NeurIPS 2019 google-research/google-research

We propose an empirical measure of the approximate accuracy of feature importance estimates in deep neural networks.

FEATURE IMPORTANCE, IMAGE CLASSIFICATION
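
One way such a measure can be operationalized (a sketch under the assumption that accuracy is re-measured after the features an estimator ranks as most important are removed; `train_model` and `evaluate` are placeholder hooks, not the paper's code):

```python
import numpy as np

def degrade_inputs(images, importance, fraction=0.5):
    """Replace the `fraction` most-important pixels, according to the
    estimator's ranking, with the per-image mean value."""
    flat = importance.reshape(len(images), -1)
    k = int(flat.shape[1] * fraction)
    top = np.argsort(-flat, axis=1)[:, :k]          # indices of top pixels
    degraded = images.reshape(len(images), -1).astype(float).copy()
    fill = degraded.mean(axis=1, keepdims=True)
    np.put_along_axis(degraded, top, fill, axis=1)  # blank out important pixels
    return degraded.reshape(images.shape)

# Placeholder training/evaluation hooks -- substitute your own pipeline:
# acc_base = evaluate(train_model(train_x, train_y), test_x, test_y)
# acc_degraded = evaluate(train_model(degrade_inputs(train_x, train_imp), train_y),
#                         degrade_inputs(test_x, test_imp), test_y)
# A large drop (acc_base - acc_degraded) suggests the estimator found
# genuinely informative features.
```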

Memory Efficient Adaptive Optimization

NeurIPS 2019 google-research/google-research

Adaptive gradient-based optimizers such as Adagrad and Adam are crucial for achieving state-of-the-art performance in machine translation and language modeling.

LANGUAGE MODELLING, MACHINE TRANSLATION
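
To illustrate the memory cost these optimizers raise (a sketch of one memory-saving idea, not a transcription of the paper's algorithm): the full per-parameter second-moment accumulator of Adagrad for an m x n weight matrix can be approximated with an m-vector row accumulator and an n-vector column accumulator.

```python
import numpy as np

def compressed_adagrad_step(param, grad, row_acc, col_acc, lr=0.1, eps=1e-8):
    """One update in the spirit of memory-efficient Adagrad variants:
    an (m, n) accumulator is replaced by an (m,) and an (n,) accumulator."""
    # Per-entry second-moment estimate from the cheap row/column covers.
    nu = np.minimum(row_acc[:, None], col_acc[None, :]) + grad ** 2
    # Tighten the covers so they upper-bound the per-entry statistics.
    row_acc[:] = np.maximum(row_acc, nu.max(axis=1))
    col_acc[:] = np.maximum(col_acc, nu.max(axis=0))
    param -= lr * grad / (np.sqrt(nu) + eps)
    return param

m, n = 4, 3
param = np.zeros((m, n))
row_acc, col_acc = np.zeros(m), np.zeros(n)
grad = np.random.randn(m, n)
param = compressed_adagrad_step(param, grad, row_acc, col_acc)
```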

Stand-Alone Self-Attention in Vision Models

NeurIPS 2019 google-research/google-research

The natural question that arises is whether attention can be a stand-alone primitive for vision models instead of serving as just an augmentation on top of convolutions.

OBJECT DETECTION
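
A simplified, single-head sketch of a local self-attention layer that could stand in for a k x k convolution; the class name is illustrative, and the paper's layer additionally uses multiple heads and relative position embeddings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalSelfAttention2d(nn.Module):
    """Each output pixel attends over its k x k spatial neighborhood."""
    def __init__(self, channels, k=3):
        super().__init__()
        self.k, self.pad = k, k // 2
        self.q = nn.Conv2d(channels, channels, 1)       # query projection
        self.kv = nn.Conv2d(channels, 2 * channels, 1)  # key/value projections

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x)                                   # (b, c, h, w)
        kv = self.kv(x)                                 # (b, 2c, h, w)
        # Gather k*k neighborhoods for keys and values.
        kv = F.unfold(kv, self.k, padding=self.pad)     # (b, 2c*k*k, h*w)
        kv = kv.view(b, 2, c, self.k * self.k, h * w)
        keys, values = kv[:, 0], kv[:, 1]               # (b, c, k*k, h*w)
        q = q.view(b, c, 1, h * w)
        attn = (q * keys).sum(dim=1, keepdim=True) / c ** 0.5
        attn = attn.softmax(dim=2)                      # over the neighborhood
        out = (attn * values).sum(dim=2)                # (b, c, h*w)
        return out.view(b, c, h, w)

layer = LocalSelfAttention2d(channels=8, k=3)
print(layer(torch.randn(2, 8, 16, 16)).shape)  # torch.Size([2, 8, 16, 16])
```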

Momentum-Based Variance Reduction in Non-Convex SGD

NeurIPS 2019 google-research/google-research

Variance reduction has emerged in recent years as a strong competitor to stochastic gradient descent in non-convex problems, providing the first algorithms to improve upon the convergence rate of stochastic gradient descent for finding first-order critical points.
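
A toy sketch of a momentum-based variance-reduced update on a quadratic objective, assuming the estimator blends a fresh stochastic gradient with a same-sample correction at the previous iterate; the step size, correction weight, and objective are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy objective f(x) = 0.5 * ||x||^2, so the true gradient at x is x.
x = np.ones(10)
lr, a = 0.1, 0.3                              # step size and correction weight
d = x + 0.5 * rng.standard_normal(x.shape)    # initial stochastic gradient estimate

for _ in range(200):
    x_prev = x.copy()
    x = x - lr * d
    noise = 0.5 * rng.standard_normal(x.shape)  # one sample reused at both points
    g_new = x + noise           # stochastic gradient at the new iterate
    g_old = x_prev + noise      # same-sample gradient at the previous iterate
    # Momentum plus a correction that cancels most of the shared noise.
    d = g_new + (1 - a) * (d - g_old)

print(np.linalg.norm(x))        # close to 0 on this toy problem
```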