5 code implementations • 26 Nov 2019 • Stephen Merity
The leading approaches in language modeling are all obsessed with TV shows of my youth - namely Transformers and Sesame Street.
Ranked #27 on Language Modelling on enwik8
12 code implementations • 22 Mar 2018 • Stephen Merity, Nitish Shirish Keskar, Richard Socher
Many of the leading approaches in language modeling introduce novel, complex and specialized architectures.
no code implementations • ICLR 2018 • Martin Schrimpf, Stephen Merity, James Bradbury, Richard Socher
The process of designing neural architectures requires expert knowledge and extensive trial and error.
47 code implementations • ICLR 2018 • Stephen Merity, Nitish Shirish Keskar, Richard Socher
Recurrent neural networks (RNNs), such as long short-term memory networks (LSTMs), serve as a fundamental building block for many sequence learning tasks, including machine translation, language modeling, and question answering.
Ranked #17 on Language Modelling on Penn Treebank (Word Level)
no code implementations • 3 Aug 2017 • Stephen Merity, Bryan McCann, Richard Socher
Both techniques revisited here, L2 regularization on RNN activations and slowness regularization over successive hidden states, require minimal modification to existing RNN architectures and yield performance improvements comparable or superior to more complicated regularization techniques or custom cell architectures.
8 code implementations • 5 Nov 2016 • James Bradbury, Stephen Merity, Caiming Xiong, Richard Socher
Recurrent neural networks are a powerful tool for modeling sequential data, but the dependence of each timestep's computation on the previous timestep's output limits parallelism and makes RNNs unwieldy for very long sequences.
Ranked #15 on Machine Translation on IWSLT2015 German-English
9 code implementations • 26 Sep 2016 • Stephen Merity, Caiming Xiong, James Bradbury, Richard Socher
Recent neural network sequence models with softmax classifiers have achieved their best language modeling performance only with very large hidden states and large vocabularies.
11 code implementations • 4 Mar 2016 • Caiming Xiong, Stephen Merity, Richard Socher
Neural network architectures with memory and attention mechanisms exhibit certain reasoning capabilities required for question answering.
Ranked #4 on Visual Question Answering (VQA) on VQA v1 test-std