no code implementations • WMT (EMNLP) 2020 • Lei Yu, Laurent Sartran, Po-Sen Huang, Wojciech Stokowiec, Domenic Donato, Srivatsan Srinivasan, Alek Andreev, Wang Ling, Sona Mokra, Agustin Dal Lago, Yotam Doron, Susannah Young, Phil Blunsom, Chris Dyer
This paper describes the DeepMind submission to the Chinese→English constrained data track of the WMT2020 Shared Task on News Translation.
no code implementations • 28 Nov 2022 • Sander Dieleman, Laurent Sartran, Arman Roshannai, Nikolay Savinov, Yaroslav Ganin, Pierre H. Richemond, Arnaud Doucet, Robin Strudel, Chris Dyer, Conor Durkan, Curtis Hawthorne, Rémi Leblond, Will Grathwohl, Jonas Adler
Diffusion models have quickly become the go-to paradigm for generative modelling of perceptual signals (such as images and sound) through iterative refinement.
no code implementations • 18 Jul 2022 • Domenic Donato, Lei Yu, Wang Ling, Chris Dyer
We introduce a new distributed policy gradient algorithm and show that it outperforms existing reward-aware training procedures such as REINFORCE, minimum risk training (MRT) and proximal policy optimization (PPO) in terms of training stability and generalization performance when optimizing machine translation models.
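For context on the REINFORCE baseline named in this entry, the standard score-function estimator for expected-reward training of a translation model looks like this (general formulation, not this paper's new algorithm):

```latex
\nabla_\theta \, \mathbb{E}_{y \sim p_\theta(\cdot \mid x)}\big[ r(y) \big]
  \;=\; \mathbb{E}_{y \sim p_\theta(\cdot \mid x)}\big[ r(y) \, \nabla_\theta \log p_\theta(y \mid x) \big]
```

In practice the expectation is estimated from sampled translations, usually with a baseline subtracted from r(y) to reduce variance; the paper's distributed algorithm is compared against this estimator as well as MRT and PPO.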
no code implementations • 1 Mar 2022 • Laurent Sartran, Samuel Barrett, Adhiguna Kuncoro, Miloš Stanojević, Phil Blunsom, Chris Dyer
We find that TGs outperform various strong baselines on sentence-level language modeling perplexity, as well as on multiple syntax-sensitive language modeling evaluation metrics.
no code implementations • ICLR 2022 • Wang Ling, Wojciech Stokowiec, Domenic Donato, Laurent Sartran, Lei Yu, Austin Matthews, Chris Dyer
When applied to autoregressive models, our algorithm has different biases than beam search has, which enables a new analysis of the role of decoding bias in autoregressive models.
no code implementations • NA 2021 • Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, Irina Higgins, Antonia Creswell, Nat McAleese, Amy Wu, Erich Elsen, Siddhant Jayakumar, Elena Buchatskaya, David Budden, Esme Sutherland, Karen Simonyan, Michela Paganini, Laurent Sifre, Lena Martens, Xiang Lorraine Li, Adhiguna Kuncoro, Aida Nematzadeh, Elena Gribovskaya, Domenic Donato, Angeliki Lazaridou, Arthur Mensch, Jean-Baptiste Lespiau, Maria Tsimpoukelli, Nikolai Grigorev, Doug Fritz, Thibault Sottiaux, Mantas Pajarskas, Toby Pohlen, Zhitao Gong, Daniel Toyama, Cyprien de Masson d'Autume, Yujia Li, Tayfun Terzi, Vladimir Mikulik, Igor Babuschkin, Aidan Clark, Diego de Las Casas, Aurelia Guy, Chris Jones, James Bradbury, Matthew Johnson, Blake Hechtman, Laura Weidinger, Iason Gabriel, William Isaac, Ed Lockhart, Simon Osindero, Laura Rimell, Chris Dyer, Oriol Vinyals, Kareem Ayoub, Jeff Stanway, Lorrayne Bennett, Demis Hassabis, Koray Kavukcuoglu, Geoffrey Irving
Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world.
Ranked #1 on College Mathematics on BIG-bench (using extra training data)
2 code implementations • NeurIPS 2021 • Devendra Singh Sachan, Siva Reddy, William Hamilton, Chris Dyer, Dani Yogatama
We model retrieval decisions as latent variables over sets of relevant documents.
no code implementations • ACL 2021 • Domenic Donato, Lei Yu, Chris Dyer
We propose a new architecture for adapting a sentence-level sequence-to-sequence transformer by incorporating multiple pretrained document context signals and assess the impact on translation performance of (1) different pretraining approaches for generating these signals, (2) the quantity of parallel data for which document context is available, and (3) conditioning on source, target, or source and target contexts.
no code implementations • ICLR 2022 • Kartik Goyal, Chris Dyer, Taylor Berg-Kirkpatrick
While recent work has shown that scores from models trained by the ubiquitous masked language modeling (MLM) objective effectively discriminate probable from improbable sequences, it is still an open question if these MLMs specify a principled probability distribution over the space of possible sequences.
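To make the open question concrete: an MLM defines per-position conditionals, and the usual way of scoring a whole sequence with them is a pseudo-log-likelihood, which need not correspond to any normalized joint distribution over sequences (standard notation, not taken from this paper):

```latex
\mathrm{PLL}(x) \;=\; \sum_{i=1}^{|x|} \log p_{\mathrm{MLM}}\!\left(x_i \mid x_{\setminus i}\right)
```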
no code implementations • NAACL 2021 • Roma Patel, Marta Garnelo, Ian Gemp, Chris Dyer, Yoram Bachrach
We propose a vocabulary selection method that views words as members of a team trying to maximize the model's performance.
no code implementations • 27 May 2020 • Adhiguna Kuncoro, Lingpeng Kong, Daniel Fried, Dani Yogatama, Laura Rimell, Chris Dyer, Phil Blunsom
Textual representation learners trained on large amounts of data have achieved notable success on downstream tasks; intriguingly, they have also performed well on challenging tests of syntactic competence.
1 code implementation • ACL 2020 • Daniel Fried, Jean-Baptiste Alayrac, Phil Blunsom, Chris Dyer, Stephen Clark, Aida Nematzadeh
We apply a generative segmental model of task structure, guided by narration, to action segmentation in video.
no code implementations • ACL 2020 • Kartik Goyal, Chris Dyer, Christopher Warren, Max G'Sell, Taylor Berg-Kirkpatrick
We show that our approach outperforms rigid interpretable clustering baselines (Ocular) and overly-flexible deep generative models (VAE) alike on the task of completely unsupervised discovery of typefaces in mixed-font documents.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Kazuya Kawakami, Luyu Wang, Chris Dyer, Phil Blunsom, Aaron van den Oord
Unsupervised speech representation learning has shown remarkable success at finding representations that correlate with phonetic structures and improve downstream speech recognition performance.
no code implementations • 22 Jan 2020 • Rahul Radhakrishnan Iyer, Miguel Ballesteros, Chris Dyer, Robert Frederking
Syntactic parsing using dependency structures has become a standard technique in natural language processing with many different parsing models, in particular data-driven models that can be trained on syntactically annotated corpora.
no code implementations • CONLL 2019 • Austin Matthews, Graham Neubig, Chris Dyer
Recurrent neural network grammars generate sentences using phrase-structure syntax and perform very well on both parsing and language modeling.
no code implementations • IJCNLP 2019 • John Hale, Adhiguna Kuncoro, Keith Hall, Chris Dyer, Jonathan Brennan
Domain-specific training typically makes NLP systems work better.
no code implementations • TACL 2020 • Lei Yu, Laurent Sartran, Wojciech Stokowiec, Wang Ling, Lingpeng Kong, Phil Blunsom, Chris Dyer
We show that Bayes' rule provides an effective mechanism for creating document translation models that can be learned from only parallel sentences and monolingual documents, a compelling benefit as parallel documents are not always available.
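Written out, the Bayes' rule decomposition behind this entry factors document translation into two independently trainable pieces, neither of which needs parallel documents (notation mine):

```latex
\hat{y} \;=\; \arg\max_{y} \; p(y \mid x)
       \;=\; \arg\max_{y} \; \underbrace{p(y)}_{\text{monolingual document LM}} \;\cdot\; \underbrace{p(x \mid y)}_{\text{sentence-level translation (channel) model}}
```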
no code implementations • 25 Sep 2019 • Wang Ling, Chris Dyer, Lei Yu, Lingpeng Kong, Dani Yogatama, Susannah Young
In natural images, transitions between adjacent pixels tend to be smooth and gradual, a fact that has long been exploited in image compression models based on predictive coding.
no code implementations • 25 Sep 2019 • Kazuya Kawakami, Luyu Wang, Chris Dyer, Phil Blunsom, Aaron van den Oord
We present an unsupervised method for learning speech representations based on a bidirectional contrastive predictive coding that implicitly discovers phonetic structure from large-scale corpora of unlabelled raw audio signals.
no code implementations • 25 Sep 2019 • Lei Yu, Laurent Sartran, Wojciech Stokowiec, Wang Ling, Lingpeng Kong, Phil Blunsom, Chris Dyer
We show that Bayes' rule provides a compelling mechanism for controlling unconditional document language models, using the long-standing challenge of effectively leveraging document context in machine translation.
1 code implementation • 20 Sep 2019 • Chris Dyer, Gábor Melis, Phil Blunsom
A series of recent papers has used a parsing algorithm due to Shen et al. (2018) to recover phrase-structure trees based on proxies for "syntactic depth."
1 code implementation • IJCNLP 2019 • Po-Sen Huang, Robert Stanforth, Johannes Welbl, Chris Dyer, Dani Yogatama, Sven Gowal, Krishnamurthy Dvijotham, Pushmeet Kohli
Neural networks are part of many contemporary NLP systems, yet their empirical successes come at the price of vulnerability to adversarial attacks.
no code implementations • 29 Aug 2019 • Swabha Swayamdipta, Matthew Peters, Brendan Roof, Chris Dyer, Noah A. Smith
Shallow syntax provides an approximation of phrase-syntactic structure of sentences; it can be produced with high accuracy, and is computationally cheap to obtain.
2 code implementations • ACL 2019 • Yoon Kim, Chris Dyer, Alexander M. Rush
We study a formalization of the grammar induction problem that models sentences as being generated by a compound probabilistic context-free grammar.
Ranked #6 on Constituency Grammar Induction on PTB
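At a high level, the compound generative process in this entry draws a continuous per-sentence latent variable and then generates a tree from a PCFG whose rule probabilities depend on it; marginalizing gives the sentence probability (a sketch, omitting the paper's specific parameterization):

```latex
z \sim p(z), \qquad t \mid z \sim \mathrm{PCFG}\big(\pi_\theta(z)\big), \qquad
p_\theta(x) \;=\; \int p(z) \sum_{t \in \mathcal{T}(x)} p_\theta(t \mid z)\, dz
```

Here T(x) denotes the set of parse trees whose yield is the sentence x.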
no code implementations • ACL 2019 • Adhiguna Kuncoro, Chris Dyer, Laura Rimell, Stephen Clark, Phil Blunsom
Prior work has shown that, on small amounts of training data, syntactic neural language models learn structurally sensitive generalisations more successfully than sequential language models.
no code implementations • NAACL 2019 • Kartik Goyal, Chris Dyer, Taylor Berg-Kirkpatrick
Globally normalized neural sequence models are considered superior to their locally normalized equivalents because they may ameliorate the effects of label bias.
1 code implementation • NAACL 2019 • Yoon Kim, Alexander M. Rush, Lei Yu, Adhiguna Kuncoro, Chris Dyer, Gábor Melis
On language modeling, unsupervised RNNGs perform as well as their supervised counterparts on benchmarks in English and Chinese.
Ranked #6 on Constituency Grammar Induction on PTB (Max F1 (WSJ) metric)
no code implementations • 31 Jan 2019 • Dani Yogatama, Cyprien de Masson d'Autume, Jerome Connor, Tomas Kocisky, Mike Chrzanowski, Lingpeng Kong, Angeliki Lazaridou, Wang Ling, Lei Yu, Chris Dyer, Phil Blunsom
We define general linguistic intelligence as the ability to reuse previously acquired knowledge about a language's lexicon, syntax, semantics, and pragmatic conventions to adapt to new tasks quickly.
no code implementations • 26 Nov 2018 • Lei Yu, Cyprien de Masson d'Autume, Chris Dyer, Phil Blunsom, Lingpeng Kong, Wang Ling
The meaning of a sentence is a function of the relations that hold between its words.
no code implementations • ACL 2019 • Kazuya Kawakami, Chris Dyer, Phil Blunsom
We propose a segmental neural language model that combines the generalization power of neural networks with the ability to discover word-like units that are latent in unsegmented character sequences.
no code implementations • 27 Sep 2018 • Kazuya Kawakami, Chris Dyer, Phil Blunsom
We propose a segmental neural language model that combines the representational power of neural networks and the structure learning mechanism of Bayesian nonparametrics, and show that it learns to discover semantically meaningful units (e.g., morphemes and words) from unsegmented character sequences.
1 code implementation • EMNLP 2018 • Swabha Swayamdipta, Sam Thomson, Kenton Lee, Luke Zettlemoyer, Chris Dyer, Noah A. Smith
We introduce the syntactic scaffold, an approach to incorporating syntactic information into semantic tasks.
22 code implementations • NeurIPS 2018 • Andrew Trask, Felix Hill, Scott Reed, Jack Rae, Chris Dyer, Phil Blunsom
Neural networks can learn to represent and manipulate numerical information, but they seldom generalize well outside of the range of numerical values encountered during training.
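A minimal NumPy sketch of the NALU cell from this paper, which gates between an additive path and a log-space multiplicative path; the shapes, initialization, and example inputs below are illustrative assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nalu_forward(x, W_hat, M_hat, G, eps=1e-7):
    """x: (in_dim,) input; W_hat, M_hat, G: (out_dim, in_dim) parameters."""
    W = np.tanh(W_hat) * sigmoid(M_hat)        # weights biased toward {-1, 0, 1}
    a = W @ x                                  # additive (NAC) path
    m = np.exp(W @ np.log(np.abs(x) + eps))    # multiplicative path, computed in log space
    g = sigmoid(G @ x)                         # learned gate between the two paths
    return g * a + (1.0 - g) * m

rng = np.random.default_rng(0)
x = np.array([3.0, 5.0])
W_hat, M_hat, G = (rng.normal(scale=0.5, size=(1, 2)) for _ in range(3))
print(nalu_forward(x, W_hat, M_hat, G))        # one output unit over the two inputs
```

Constraining the effective weights toward {-1, 0, 1} is what lets the unit represent exact addition and subtraction, and hence extrapolate beyond the training range.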
no code implementations • ACL 2018 • Adhiguna Kuncoro, Chris Dyer, John Hale, Dani Yogatama, Stephen Clark, Phil Blunsom
Language exhibits hierarchical structure, but recent work using a subject-verb agreement diagnostic argued that state-of-the-art language models, LSTMs, fail to learn long-range syntax sensitive dependencies.
no code implementations • ACL 2018 • John Hale, Chris Dyer, Adhiguna Kuncoro, Jonathan R. Brennan
Model comparisons attribute the early peak to syntactic composition within the RNNG.
28 code implementations • 4 Jun 2018 • Peter W. Battaglia, Jessica B. Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Vinicius Zambaldi, Mateusz Malinowski, Andrea Tacchetti, David Raposo, Adam Santoro, Ryan Faulkner, Caglar Gulcehre, Francis Song, Andrew Ballard, Justin Gilmer, George Dahl, Ashish Vaswani, Kelsey Allen, Charles Nash, Victoria Langston, Chris Dyer, Nicolas Heess, Daan Wierstra, Pushmeet Kohli, Matt Botvinick, Oriol Vinyals, Yujia Li, Razvan Pascanu
As a companion to this paper, we have released an open-source software library for building graph networks, with demonstrations of how to use them in practice.
no code implementations • NAACL 2018 • Austin Matthews, Graham Neubig, Chris Dyer
Languages with productive morphology pose problems for language models that generate words from a fixed vocabulary.
1 code implementation • NeurIPS 2018 • Zichao Yang, Zhiting Hu, Chris Dyer, Eric P. Xing, Taylor Berg-Kirkpatrick
Binary classifiers are often employed as discriminators in GAN-based unsupervised style transfer systems to ensure that transferred sentences are similar to sentences in the target domain.
1 code implementation • ICLR 2019 • Gábor Melis, Charles Blundell, Tomáš Kočiský, Karl Moritz Hermann, Chris Dyer, Phil Blunsom
We show that dropout training is best understood as performing MAP estimation concurrently for a family of conditional models whose objectives are themselves lower bounded by the original dropout objective.
Ranked #23 on Language Modelling on Penn Treebank (Word Level)
no code implementations • ICML 2018 • Jack W. Rae, Chris Dyer, Peter Dayan, Timothy P. Lillicrap
Neural networks trained with backpropagation often struggle to identify classes that have been observed a small number of times.
Ranked #49 on Language Modelling on WikiText-103
no code implementations • ICLR 2018 • Yujia Li, Oriol Vinyals, Chris Dyer, Razvan Pascanu, Peter Battaglia
Graphs are fundamental data structures which concisely capture the relational structure in many important real-world domains, such as knowledge graphs, physical and social interactions, language, and chemistry.
no code implementations • 31 Jan 2018 • Avneesh Saluja, Chris Dyer, Jean-David Ruvini
Compositional vector space models of meaning promise new solutions to stubborn language understanding problems.
no code implementations • ICLR 2018 • Dani Yogatama, Yishu Miao, Gabor Melis, Wang Ling, Adhiguna Kuncoro, Chris Dyer, Phil Blunsom
We compare and analyze sequential, random access, and stack memory architectures for recurrent neural network language models.
1 code implementation • TACL 2018 • Tomáš Kočiský, Jonathan Schwarz, Phil Blunsom, Chris Dyer, Karl Moritz Hermann, Gábor Melis, Edward Grefenstette
Reading comprehension (RC), in contrast to information retrieval, requires integrating information and reasoning about events, entities, and their relations across a full document.
Ranked #9 on Question Answering on NarrativeQA (BLEU-1 metric)
no code implementations • CONLL 2017 • Chris Dyer
On the generation front, I introduce recurrent neural network grammars (RNNGs), a joint, generative model of phrase-structure trees and sentences.
no code implementations • 1 Aug 2017 • Hao Tang, Liang Lu, Lingpeng Kong, Kevin Gimpel, Karen Livescu, Chris Dyer, Noah A. Smith, Steve Renals
Segmental models are an alternative to frame-based models for sequence prediction, where hypothesized path weights are based on entire segment scores rather than a single frame at a time.
no code implementations • 1 Aug 2017 • Kartik Goyal, Graham Neubig, Chris Dyer, Taylor Berg-Kirkpatrick
In experiments, we show that optimizing this new training objective yields substantially better results on two sequence tasks (Named Entity Recognition and CCG Supertagging) when compared with both cross entropy trained greedy decoding and cross entropy trained beam decoding baselines.
Ranked #3 on Motion Segmentation on Hopkins155
1 code implementation • ICLR 2018 • Gábor Melis, Chris Dyer, Phil Blunsom
Ongoing innovations in recurrent neural network architectures have provided a steady influx of apparently state-of-the-art results on language modelling benchmarks.
Ranked #31 on Language Modelling on WikiText-2
no code implementations • ACL 2017 • Wang Ling, Dani Yogatama, Chris Dyer, Phil Blunsom
Solving algebraic word problems requires executing a series of arithmetic operations (a program) to obtain a final answer.
10 code implementations • 29 Jun 2017 • Swabha Swayamdipta, Sam Thomson, Chris Dyer, Noah A. Smith
We present a new, efficient frame-semantic parser that labels semantic arguments to FrameNet predicates.
no code implementations • ICLR 2018 • Dirk Weissenborn, Tomáš Kočiský, Chris Dyer
Common-sense and background knowledge is required to understand natural language, but in most neural natural language understanding (NLU) systems, this knowledge must be acquired from training corpora during learning, and then it is static at test time.
Ranked #27 on Question Answering on TriviaQA
no code implementations • CL 2017 • Miguel Ballesteros, Chris Dyer, Yoav Goldberg, Noah A. Smith
During training, dynamic oracles alternate between sampling parser states from the training data and from the model as it is being learned, making the model more robust to the kinds of errors that will be made at test time.
2 code implementations • NeurIPS 2017 • Graham Neubig, Yoav Goldberg, Chris Dyer
Dynamic neural network toolkits such as PyTorch, DyNet, and Chainer offer more flexibility for implementing models that cope with data of varying dimensions and structure, relative to toolkits that operate on statically declared computations (e.g., TensorFlow, CNTK, and Theano).
1 code implementation • 11 May 2017 • Wang Ling, Dani Yogatama, Chris Dyer, Phil Blunsom
Solving algebraic word problems requires executing a series of arithmetic operations (a program) to obtain a final answer.
1 code implementation • ACL 2017 • Pradeep Dasigi, Waleed Ammar, Chris Dyer, Eduard Hovy
Type-level word embeddings use the same set of parameters to represent all instances of a word regardless of its context, ignoring the inherent lexical ambiguity in language.
no code implementations • ACL 2017 • Kazuya Kawakami, Chris Dyer, Phil Blunsom
Fixed-vocabulary language models fail to account for one of the most characteristic statistical facts of natural language: the frequent creation and reuse of new word types.
no code implementations • ACL 2017 • Kartik Goyal, Chris Dyer, Taylor Berg-Kirkpatrick
We demonstrate that a continuous relaxation of the argmax operation can be used to create a differentiable approximation to greedy decoding for sequence-to-sequence (seq2seq) models.
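A small NumPy sketch of the core idea in this entry: replace the non-differentiable argmax in greedy decoding with a temperature-controlled softmax, so the embedding fed to the next decoder step stays differentiable with respect to the scores (names and dimensions are illustrative, not the paper's code):

```python
import numpy as np

def soft_greedy_embedding(scores, embedding_table, temperature=0.1):
    """scores: (V,) logits over the vocabulary; embedding_table: (V, d)."""
    z = scores / temperature
    z = z - z.max()                       # for numerical stability
    p = np.exp(z) / np.exp(z).sum()       # approaches a one-hot vector as temperature -> 0
    return p @ embedding_table            # "soft" embedding of the greedily decoded token

vocab_embeddings = np.random.randn(100, 16)
logits = np.random.randn(100)
next_input = soft_greedy_embedding(logits, vocab_embeddings)
print(next_input.shape)                   # (16,)
```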
2 code implementations • 6 Mar 2017 • Dani Yogatama, Chris Dyer, Wang Ling, Phil Blunsom
We empirically characterize the performance of discriminative and generative LSTM models for text classification.
no code implementations • 21 Feb 2017 • Liang Lu, Lingpeng Kong, Chris Dyer, Noah A. Smith
Segmental conditional random fields (SCRFs) and connectionist temporal classification (CTC) are two sequence labeling methods used for end-to-end training of speech recognition models.
4 code implementations • 15 Jan 2017 • Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng Ji, Lingpeng Kong, Adhiguna Kuncoro, Gaurav Kumar, Chaitanya Malaviya, Paul Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra, Swabha Swayamdipta, Pengcheng Yin
In the static declaration strategy that is used in toolkits like Theano, CNTK, and TensorFlow, the user first defines a computation graph (a symbolic representation of the computation), and then examples are fed into an engine that executes this computation and computes its derivatives.
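By contrast, under dynamic declaration a fresh graph is built for every example, so structures and sizes can vary per input. A minimal sketch in the spirit of DyNet's Python API follows; the exact calls and signatures here are my assumptions from the DyNet documentation, not code from the paper:

```python
import dynet as dy

model = dy.ParameterCollection()
W = model.add_parameters((8, 4))              # toy classifier weights
b = model.add_parameters(8)
trainer = dy.SimpleSGDTrainer(model)

data = [([0.1, 0.2, 0.3, 0.4], 2), ([1.0, 0.0, -1.0, 0.5], 5)]
for features, label in data:
    dy.renew_cg()                             # new computation graph per example
    x = dy.inputVector(features)
    scores = dy.parameter(W) * x + dy.parameter(b)
    loss = dy.pickneglogsoftmax(scores, label)
    loss.value()                              # run the forward pass
    loss.backward()                           # then backprop and update
    trainer.update()
```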
no code implementations • COLING 2016 • Qinlan Shen, Daniel Clothiaux, Emily Tagtow, Patrick Littell, Chris Dyer
While morphological analyzers can reduce this sparsity by providing morpheme-level analyses for words, they will often introduce ambiguity by returning multiple analyses for the same surface form.
1 code implementation • COLING 2016 • David R. Mortensen, Patrick Littell, Akash Bharadwaj, Kartik Goyal, Chris Dyer, Lori Levin
This paper contributes to a growing body of evidence that, when coupled with appropriate machine-learning techniques, linguistically motivated, information-rich representations can outperform one-hot encodings of linguistic data.
no code implementations • COLING 2016 • Patrick Littell, Kartik Goyal, David R. Mortensen, Alexa Little, Chris Dyer, Lori Levin
This paper describes our construction of named-entity recognition (NER) systems in two Western Iranian languages, Sorani Kurdish and Tajik, as a part of a pilot study of "Linguistic Rapid Response" to potential emergency humanitarian relief situations.
no code implementations • 28 Nov 2016 • Dani Yogatama, Phil Blunsom, Chris Dyer, Edward Grefenstette, Wang Ling
We use reinforcement learning to learn tree-structured neural networks for computing representations of natural language sentences.
1 code implementation • EACL 2017 • Adhiguna Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, Graham Neubig, Noah A. Smith
We investigate what information they learn, from a linguistic perspective, through various ablations to the model and the data, and by augmenting the model with an attention mechanism (GA-RNNG) to enable closer inspection.
Ranked #20 on Constituency Parsing on Penn Treebank
no code implementations • 8 Nov 2016 • Lei Yu, Phil Blunsom, Chris Dyer, Edward Grefenstette, Tomas Kocisky
We formulate sequence to sequence transduction as a noisy channel decoding problem and use recurrent neural networks to parameterise the source and channel models.
no code implementations • EMNLP 2017 • Zichao Yang, Phil Blunsom, Chris Dyer, Wang Ling
We propose a general class of language models that treat reference as an explicit stochastic latent variable.
Ranked #1 on Recipe Generation on allrecipes.com
no code implementations • EMNLP 2016 • Tomáš Kočiský, Gábor Melis, Edward Grefenstette, Chris Dyer, Wang Ling, Phil Blunsom, Karl Moritz Hermann
We present a novel semi-supervised approach for sequence transduction and apply it to semantic parsing.
1 code implementation • EMNLP 2016 • Adhiguna Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, Noah A. Smith
We introduce two first-order graph-based dependency parsers achieving a new state of the art.
Ranked #18 on Dependency Parsing on Penn Treebank
no code implementations • EACL 2017 • Zichao Yang, Zhiting Hu, Yuntian Deng, Chris Dyer, Alex Smola
Knowing which words have been attended to in previous time steps while generating a translation is a rich source of information for predicting what words will be attended to in the future.
1 code implementation • CONLL 2016 • Swabha Swayamdipta, Miguel Ballesteros, Chris Dyer, Noah A. Smith
We present a transition-based parser that jointly produces syntactic and semantic dependencies.
no code implementations • WS 2016 • Yulia Tsvetkov, Manaal Faruqui, Chris Dyer
We introduce QVEC-CCA, an intrinsic evaluation metric for word vector representations based on correlations of learned vectors with features extracted from linguistic resources.
1 code implementation • EMNLP 2016 • Graham Neubig, Chris Dyer
Language models (LMs) are statistical models that calculate probabilities over sequences of words or other discrete symbols.
no code implementations • NAACL 2016 • Yulia Tsvetkov, Sunayana Sitaram, Manaal Faruqui, Guillaume Lample, Patrick Littell, David Mortensen, Alan W. Black, Lori Levin, Chris Dyer
We introduce polyglot language models, recurrent neural network models trained to predict symbol sequences in many different languages using shared representations of symbols and conditioning on typological information about the language to be predicted.
no code implementations • ACL 2016 • Yulia Tsvetkov, Manaal Faruqui, Wang Ling, Brian MacWhinney, Chris Dyer
We use Bayesian optimization to learn curricula for word representation learning, optimizing performance on downstream tasks that depend on the learned representations as features.
1 code implementation • WS 2016 • Manaal Faruqui, Yulia Tsvetkov, Pushpendre Rastogi, Chris Dyer
Our study suggests that the use of word similarity tasks for evaluation of word vectors is not sustainable and calls for further research on evaluation methods.
no code implementations • LREC 2016 • Patrick Littell, David R. Mortensen, Kartik Goyal, Chris Dyer, Lori Levin
In Sorani Kurdish, one of the most useful orthographic features in named-entity recognition, capitalization, is absent, as the language's Perso-Arabic script does not make a distinction between uppercase and lowercase letters.
1 code implementation • ACL 2016 • Shyam Upadhyay, Manaal Faruqui, Chris Dyer, Dan Roth
Despite interest in using cross-lingual knowledge to learn word embeddings for various tasks, a systematic comparison of the possible approaches is lacking in the literature.
no code implementations • 11 Mar 2016 • Miguel Ballesteros, Yoav Goldberg, Chris Dyer, Noah A. Smith
We adapt the greedy Stack-LSTM dependency parser of Dyer et al. (2015) to support a training-with-exploration procedure using dynamic oracles (Goldberg and Nivre, 2013) instead of cross-entropy minimization.
Ranked #2 on Chinese Dependency Parsing on Chinese Pennbank
43 code implementations • NAACL 2016 • Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, Chris Dyer
State-of-the-art named entity recognition systems rely heavily on hand-crafted features and domain-specific knowledge in order to learn effectively from the small, supervised training corpora that are available.
Ranked #8 on Named Entity Recognition (NER) on CoNLL++
no code implementations • 1 Mar 2016 • Liang Lu, Lingpeng Kong, Chris Dyer, Noah A. Smith, Steve Renals
This model connects the segmental conditional random field (CRF) with a recurrent neural network (RNN) used for feature extraction.
Ranked #16 on Speech Recognition on TIMIT
6 code implementations • NAACL 2016 • Chris Dyer, Adhiguna Kuncoro, Miguel Ballesteros, Noah A. Smith
We introduce recurrent neural network grammars, probabilistic models of sentences with explicit phrase structure.
Ranked #25 on Constituency Parsing on Penn Treebank
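To make the RNNG entry above concrete: a generative derivation is a sequence of NT(X), GEN(w), and REDUCE actions that jointly build the sentence and its phrase-structure tree. The toy replay below just turns such an action sequence into a bracketing (illustrative code, not the paper's implementation):

```python
def replay_rnng_actions(actions):
    stack = []
    for act in actions:
        if act[0] == "NT":                    # open a new nonterminal constituent
            stack.append(["(" + act[1]])
        elif act[0] == "GEN":                 # generate a terminal word
            stack[-1].append(act[1])
        else:                                 # REDUCE: close the most recent open constituent
            finished = stack.pop()
            closed = " ".join(finished) + ")"
            if stack:
                stack[-1].append(closed)
            else:
                return closed
    return " ".join(" ".join(s) for s in stack)

acts = [("NT", "S"), ("NT", "NP"), ("GEN", "the"), ("GEN", "hungry"), ("GEN", "cat"),
        ("REDUCE",), ("NT", "VP"), ("GEN", "meows"), ("REDUCE",), ("REDUCE",)]
print(replay_rnng_actions(acts))   # (S (NP the hungry cat) (VP meows))
```

In the actual model, each action is scored by recurrent networks over the stack, the generated words, and the action history, giving a joint probability over trees and sentences.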
no code implementations • 13 Feb 2016 • Abhinav Maurya, Kenton Murray, Yandong Liu, Chris Dyer, William W. Cohen, Daniel B. Neill
Many methods have been proposed for detecting emerging events in text streams using topic modeling.
1 code implementation • 5 Feb 2016 • Waleed Ammar, George Mulcaire, Yulia Tsvetkov, Guillaume Lample, Chris Dyer, Noah A. Smith
We introduce new methods for estimating and evaluating embeddings of words in more than fifty languages in a single shared embedding space.
1 code implementation • TACL 2016 • Waleed Ammar, George Mulcaire, Miguel Ballesteros, Chris Dyer, Noah A. Smith
We train one multilingual model for dependency parsing and use it to parse sentences in several languages.
no code implementations • NAACL 2016 • Trevor Cohn, Cong Duy Vu Hoang, Ekaterina Vymolova, Kaisheng Yao, Chris Dyer, Gholamreza Haffari
Neural encoder-decoder models of machine translation have achieved impressive results, rivalling traditional translation models.
1 code implementation • NAACL 2016 • Manaal Faruqui, Yulia Tsvetkov, Graham Neubig, Chris Dyer
Morphological inflection generation is the task of generating the inflected form of a given lemma corresponding to a particular linguistic transformation.
no code implementations • 30 Nov 2015 • Snigdha Chaturvedi, Shashank Srivastava, Hal Daume III, Chris Dyer
Studying characters plays a vital role in computationally representing and interpreting narratives.
2 code implementations • 18 Nov 2015 • Lingpeng Kong, Chris Dyer, Noah A. Smith
Representations of the input segments (i.e., contiguous subsequences of the input) are computed by encoding their constituent tokens using bidirectional recurrent neural nets, and these "segment embeddings" are used to define compatibility scores with output labels.
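A small PyTorch sketch of the segment-embedding idea described here: run a bidirectional LSTM over the tokens of a candidate segment and score the resulting embedding against output labels (dimensions and the scoring head are illustrative assumptions):

```python
import torch
import torch.nn as nn

class SegmentScorer(nn.Module):
    def __init__(self, emb_dim=32, hidden=64, num_labels=5):
        super().__init__()
        self.birnn = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, num_labels)

    def forward(self, token_embs, i, j):
        """token_embs: (seq_len, emb_dim); scores the segment token_embs[i:j]."""
        seg = token_embs[i:j].unsqueeze(0)          # (1, j-i, emb_dim)
        _, (h, _) = self.birnn(seg)                 # h: (2, 1, hidden)
        seg_emb = torch.cat([h[0, 0], h[1, 0]])     # concat forward/backward final states
        return self.out(seg_emb)                    # compatibility score per output label

scorer = SegmentScorer()
x = torch.randn(10, 32)                             # 10 tokens of a sentence
print(scorer(x, 2, 6))                              # scores for the segment [2, 6)
```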
no code implementations • 14 Nov 2015 • Kazuya Kawakami, Chris Dyer
We present a neural network architecture based on bidirectional LSTMs to compute representations of words in the sentential contexts.
no code implementations • 14 Nov 2015 • Wang Ling, Isabel Trancoso, Chris Dyer, Alan W. Black
We introduce a neural machine translation model that views the input and output sentences as sequences of characters rather than words.
1 code implementation • 12 Nov 2015 • Yangfeng Ji, Trevor Cohn, Lingpeng Kong, Chris Dyer, Jacob Eisenstein
Text documents are structured on multiple levels of detail: individual words are related by syntax, but larger units of text are related by discourse structure.
no code implementations • 16 Aug 2015 • Kaisheng Yao, Trevor Cohn, Katerina Vylomova, Kevin Duh, Chris Dyer
This gate is a function of the lower layer's memory cell, the input to this layer, and this layer's past memory cell.
1 code implementation • EMNLP 2015 • Wang Ling, Tiago Luís, Luís Marujo, Ramón Fernandez Astudillo, Silvio Amir, Chris Dyer, Alan W. Black, Isabel Trancoso
We introduce a model for constructing vector representations of words by composing characters using bidirectional LSTMs.
Ranked #4 on Part-Of-Speech Tagging on Penn Treebank
1 code implementation • EMNLP 2015 • Miguel Ballesteros, Chris Dyer, Noah A. Smith
We present extensions to a continuous-state dependency parsing method that makes it applicable to morphologically rich languages.
1 code implementation • IJCNLP 2015 • Manaal Faruqui, Chris Dyer
Data-driven representation learning for words is a technique of central importance in NLP.
3 code implementations • IJCNLP 2015 • Manaal Faruqui, Yulia Tsvetkov, Dani Yogatama, Chris Dyer, Noah Smith
Current distributed representations of words show little resemblance to theories of lexical semantics.
7 code implementations • IJCNLP 2015 • Chris Dyer, Miguel Ballesteros, Wang Ling, Austin Matthews, Noah A. Smith
We propose a technique for learning representations of parser states in transition-based dependency parsers.
no code implementations • HLT 2015 • Chu-Cheng Lin, Waleed Ammar, Chris Dyer, Lori Levin
Unsupervised word embeddings have been shown to be valuable as features in supervised learning problems; however, their role in unsupervised problems has been less thoroughly explored.
2 code implementations • HLT 2015 • Manaal Faruqui, Jesse Dodge, Sujay K. Jauhar, Chris Dyer, Eduard Hovy, Noah A. Smith
Vector space word representations are learned from distributional information of words in large corpora.
1 code implementation • NeurIPS 2014 • Waleed Ammar, Chris Dyer, Noah A. Smith
We introduce a framework for unsupervised learning of structured predictors with overlapping, global features.
3 code implementations • 30 Oct 2014 • Chris Dyer
Estimating the parameters of probabilistic models of language such as maxent models and probabilistic neural models is computationally difficult since it involves evaluating partition functions by summing over an entire vocabulary, which may be millions of word types in size.
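The bottleneck described here, in symbols, alongside the binary-classification posterior that noise contrastive estimation uses to avoid it (standard NCE formulation for language models, consistent with what this note surveys): the normalized model requires a sum over the whole vocabulary V, whereas NCE only needs to discriminate each observed word from k samples drawn from a noise distribution q:

```latex
p_\theta(w \mid h) \;=\; \frac{\exp s_\theta(w, h)}{\sum_{w' \in V} \exp s_\theta(w', h)},
\qquad
p(\text{data} \mid w, h) \;=\; \frac{p_\theta(w \mid h)}{p_\theta(w \mid h) + k\, q(w)}
```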
no code implementations • 8 Jun 2014 • Dani Yogatama, Manaal Faruqui, Chris Dyer, Noah A. Smith
We propose a new method for learning word representations using hierarchical regularization in sparse coding inspired by the linguistic study of word meanings.
1 code implementation • LREC 2014 • Yulia Tsvetkov, Nathan Schneider, Dirk Hovy, Archna Bhatia, Manaal Faruqui, Chris Dyer
We develop a supersense taxonomy for adjectives, based on that of GermaNet, and apply it to English adjectives in WordNet using human annotation and supervised classification.
no code implementations • LREC 2014 • Shikun Zhang, Wang Ling, Chris Dyer
In this paper, we leverage the existence of dual subtitles as a source of parallel data.
no code implementations • LREC 2014 • Archna Bhatia, Mandy Simons, Lori Levin, Yulia Tsvetkov, Chris Dyer, Jordan Bender
We present a definiteness annotation scheme that captures the semantic, pragmatic, and discourse information, which we call communicative functions, associated with linguistic descriptions such as "a story about my speech", "the story", "every time I give it", "this slideshow".
no code implementations • TACL 2014 • Nathan Schneider, Emily Danchik, Chris Dyer, Noah A. Smith
We present a novel representation, evaluation measure, and supervised models for the task of identifying the multiword expressions (MWEs) in a sentence, resulting in a lexical semantic segmentation.
no code implementations • TACL 2014 • Jonathan H. Clark, Chris Dyer, Alon Lavie
Linear models, which support efficient learning and inference, are the workhorses of statistical machine translation; however, linear decision rules are less attractive from a modeling perspective.
no code implementations • EMNLP 2014 • Ankur P. Parikh, Avneesh Saluja, Chris Dyer, Eric P. Xing
We present power low rank ensembles (PLRE), a flexible framework for n-gram language modeling where ensembles of low rank matrices and tensors are used to obtain smoothed probability estimates of words in context.
no code implementations • 25 Oct 2013 • Shiladitya Sinha, Chris Dyer, Kevin Gimpel, Noah A. Smith
We study the relationship between social media output and National Football League (NFL) games, using a dataset containing messages from Twitter and NFL game statistics.
no code implementations • 13 Jul 2013 • Chris Dyer
We describe the line search used in the minimum error rate training algorithm MERT as the "inside score" of a weighted proof forest under a semiring defined in terms of well-understood operations from computational geometry.
1 code implementation • WS 2013 • Nathan Schneider, Brendan O'Connor, Naomi Saphra, David Bamman, Manaal Faruqui, Noah A. Smith, Chris Dyer, Jason Baldridge
We introduce a framework for lightweight dependency syntax annotation.