no code implementations • 3 Nov 2022 • Gábor Melis
That some purely recurrent models are hard to optimize and run inefficiently on today's hardware does not necessarily make them bad models of language.
no code implementations • 26 Sep 2022 • Gábor Melis
In practice, with a finite number of optimization steps and a learning rate that cannot be annealed to zero, Tail Averaging can get much closer to a local minimum point of the training loss than either the individual iterates or the Polyak average.
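The contrast is easy to state in code. Below is a minimal NumPy sketch of plain tail averaging against the full Polyak average; the function names and the `tail_fraction` parameter are illustrative choices, not the paper's API.

```python
import numpy as np

def polyak_average(iterates):
    # Polyak-Ruppert averaging: the mean of all optimizer iterates,
    # including the early ones that are still far from the optimum.
    return np.mean(iterates, axis=0)

def tail_average(iterates, tail_fraction=0.1):
    # Average only the last `tail_fraction` of the iterates, discarding
    # the early transient so the estimate hugs the local minimum more
    # closely when the learning rate is never annealed to zero.
    start = int(len(iterates) * (1.0 - tail_fraction))
    return np.mean(iterates[start:], axis=0)
```

With a constant learning rate, the early iterates bias the full average away from the minimum; dropping them is the whole trick.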
no code implementations • 1 Dec 2020 • Gábor Melis, András György, Phil Blunsom
A common failure mode of density models trained as variational autoencoders is to model the data without relying on their latent variables, rendering these variables useless.
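As standard background (not the paper's specific contribution), the failure mode is visible directly in the evidence lower bound that such models maximize:

```latex
\mathcal{L}(x) \;=\; \mathbb{E}_{q_\phi(z \mid x)}\!\bigl[\log p_\theta(x \mid z)\bigr]
\;-\; \mathrm{KL}\bigl(q_\phi(z \mid x) \,\|\, p(z)\bigr)
```

A sufficiently powerful decoder can drive the reconstruction term up while setting the approximate posterior equal to the prior for every input, so the KL term vanishes and the latent variable carries no information about the data.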
no code implementations • CODI 2021 • Elman Mansimov, Gábor Melis, Lei Yu
Neural machine translation (NMT) has arguably achieved human-level parity when trained and evaluated at the sentence level.
1 code implementation • 20 Sep 2019 • Chris Dyer, Gábor Melis, Phil Blunsom
A series of recent papers has used a parsing algorithm due to Shen et al. (2018) to recover phrase-structure trees based on proxies for "syntactic depth."
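For reference, the algorithm under analysis is a greedy top-down splitter over per-gap scores. A minimal sketch, with illustrative variable names:

```python
def greedy_tree(words, depths):
    # Recover an unlabeled binary tree from "syntactic depth" scores,
    # one score per gap between adjacent words (len(depths) ==
    # len(words) - 1), by always splitting at the largest score first.
    if len(words) == 1:
        return words[0]
    split = max(range(len(depths)), key=lambda i: depths[i])
    left = greedy_tree(words[:split + 1], depths[:split])
    right = greedy_tree(words[split + 1:], depths[split + 1:])
    return (left, right)

# greedy_tree(["the", "cat", "sat"], [0.2, 0.9])
# => (("the", "cat"), "sat")
```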
3 code implementations • ICLR 2020 • Gábor Melis, Tomáš Kočiský, Phil Blunsom
Many advances in Natural Language Processing have been based upon more expressive models for how inputs interact with the context in which they occur.
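This entry appears to be the Mogrifier LSTM (Melis et al., ICLR 2020); assuming so, the core idea is to let the input and the previous hidden state gate each other a few times before entering the LSTM cell proper. A minimal NumPy sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mogrify(x, h_prev, Q, R, rounds=5):
    # Alternately modulate the input x by the previous state h_prev and
    # vice versa, before both enter an ordinary LSTM cell. Q and R are
    # lists of learned projection matrices (shapes (d_x, d_h) and
    # (d_h, d_x)); the factor 2 keeps the expected scale of the gated
    # vectors unchanged.
    for i in range(1, rounds + 1):
        if i % 2 == 1:
            x = 2 * sigmoid(Q[i // 2] @ h_prev) * x
        else:
            h_prev = 2 * sigmoid(R[i // 2 - 1] @ x) * h_prev
    return x, h_prev
```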
1 code implementation • NAACL 2019 • Yoon Kim, Alexander M. Rush, Lei Yu, Adhiguna Kuncoro, Chris Dyer, Gábor Melis
On language modeling, unsupervised RNNGs perform as well as their supervised counterparts on benchmarks in English and Chinese.
Ranked #8 on Constituency Grammar Induction on Penn Treebank (Max F1 (WSJ) metric)
1 code implementation • 4 Jul 2018 • Tiago Ramalho, Tomáš Kočiský, Frederic Besse, S. M. Ali Eslami, Gábor Melis, Fabio Viola, Phil Blunsom, Karl Moritz Hermann
Natural language processing has made significant inroads into learning the semantics of words through distributional approaches; however, representations learnt via these methods fail to capture certain kinds of information implicit in the real world.
1 code implementation • ICLR 2019 • Gábor Melis, Charles Blundell, Tomáš Kočiský, Karl Moritz Hermann, Chris Dyer, Phil Blunsom
We show that dropout training is best understood as performing MAP estimation concurrently for a family of conditional models whose objectives are themselves lower bounded by the original dropout objective.
Ranked #24 on Language Modelling on Penn Treebank (Word Level)
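One instance of such a bound follows from Jensen's inequality alone: the usual dropout objective (an expectation of log-likelihoods over masks m) lower-bounds the log-likelihood of the mixture-over-masks model. This is a standard illustration of the kind of bound the abstract refers to, not the paper's full construction:

```latex
\underbrace{\mathbb{E}_{m \sim p(m)}\bigl[\log p(y \mid x, \theta \odot m)\bigr]}_{\text{dropout objective}}
\;\le\;
\log \mathbb{E}_{m \sim p(m)}\bigl[p(y \mid x, \theta \odot m)\bigr]
```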
2 code implementations • TACL 2018 • Tomáš Kočiský, Jonathan Schwarz, Phil Blunsom, Chris Dyer, Karl Moritz Hermann, Gábor Melis, Edward Grefenstette
Reading comprehension (RC), in contrast to information retrieval, requires integrating information and reasoning about events, entities, and their relations across a full document.
Ranked #9 on Question Answering on NarrativeQA (BLEU-1 metric)
1 code implementation • ICLR 2018 • Gábor Melis, Chris Dyer, Phil Blunsom
Ongoing innovations in recurrent neural network architectures have provided a steady influx of apparently state-of-the-art results on language modelling benchmarks.
Ranked #32 on Language Modelling on WikiText-2
no code implementations • EMNLP 2016 • Tomáš Kočiský, Gábor Melis, Edward Grefenstette, Chris Dyer, Wang Ling, Phil Blunsom, Karl Moritz Hermann
We present a novel semi-supervised approach for sequence transduction and apply it to semantic parsing.