Search Results for author: Stephen Merity

Found 8 papers, 6 papers with code

Single Headed Attention RNN: Stop Thinking With Your Head

5 code implementations 26 Nov 2019 Stephen Merity

The leading approaches in language modeling are all obsessed with TV shows of my youth - namely Transformers and Sesame Street.

Hyperparameter Optimization, Language Modelling

An Analysis of Neural Language Modeling at Multiple Scales

12 code implementations 22 Mar 2018 Stephen Merity, Nitish Shirish Keskar, Richard Socher

Many of the leading approaches in language modeling introduce novel, complex and specialized architectures.

Language Modelling

Regularizing and Optimizing LSTM Language Models

47 code implementations ICLR 2018 Stephen Merity, Nitish Shirish Keskar, Richard Socher

Recurrent neural networks (RNNs), such as long short-term memory networks (LSTMs), serve as a fundamental building block for many sequence learning tasks, including machine translation, language modeling, and question answering.

Language Modelling, Translation

Revisiting Activation Regularization for Language RNNs

no code implementations 3 Aug 2017 Stephen Merity, Bryan McCann, Richard Socher

Both of these techniques require minimal modification to existing RNN architectures and result in performance improvements comparable or superior to more complicated regularization techniques or custom cell architectures.

L2 Regularization, Language Modelling

Quasi-Recurrent Neural Networks

8 code implementations 5 Nov 2016 James Bradbury, Stephen Merity, Caiming Xiong, Richard Socher

Recurrent neural networks are a powerful tool for modeling sequential data, but the dependence of each timestep's computation on the previous timestep's output limits parallelism and makes RNNs unwieldy for very long sequences.

Language Modelling, Machine Translation (+4 more)

Pointer Sentinel Mixture Models

9 code implementations 26 Sep 2016 Stephen Merity, Caiming Xiong, James Bradbury, Richard Socher

Recent neural network sequence models with softmax classifiers have achieved their best language modeling performance only with very large hidden states and large vocabularies.

Language Modelling

Dynamic Memory Networks for Visual and Textual Question Answering

11 code implementations 4 Mar 2016 Caiming Xiong, Stephen Merity, Richard Socher

Neural network architectures with memory and attention mechanisms exhibit certain reasoning capabilities required for question answering.

Question Answering, Visual Question Answering
