Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on taskspecific datasets.
We propose an alternative generator architecture for generative adversarial networks, borrowing from style transfer literature.
The AlphaGo, AlphaGo Zero, and AlphaZero series of algorithms are a remarkable demonstration of deep reinforcement learning's capabilities, achieving superhuman performance in the complex game of Go with progressively increasing autonomy.
On the negative side, we observe in our study that well-disentangled models seemingly cannot be identified without access to ground-truth labels even if we are allowed to transfer hyperparameters across data sets.
Much of the recent progress made in image classification research can be credited to training procedure refinements, such as changes in data augmentations and optimization methods.
Transformer networks have a potential of learning longer-term dependency, but are limited by a fixed-length context in the setting of language modeling.
SOTA for Language Modelling on Hutter Prize
A very simple framework for state-of-the-art Natural Language Processing (NLP)