Search Results for author: Chris Dyer

Found 174 papers, 50 papers with code

Continuous diffusion for categorical data

no code implementations 28 Nov 2022 Sander Dieleman, Laurent Sartran, Arman Roshannai, Nikolay Savinov, Yaroslav Ganin, Pierre H. Richemond, Arnaud Doucet, Robin Strudel, Chris Dyer, Conor Durkan, Curtis Hawthorne, Rémi Leblond, Will Grathwohl, Jonas Adler

Diffusion models have quickly become the go-to paradigm for generative modelling of perceptual signals (such as images and sound) through iterative refinement.

Language Modelling

MAD for Robust Reinforcement Learning in Machine Translation

no code implementations 18 Jul 2022 Domenic Donato, Lei Yu, Wang Ling, Chris Dyer

We introduce a new distributed policy gradient algorithm and show that it outperforms existing reward-aware training procedures such as REINFORCE, minimum risk training (MRT) and proximal policy optimization (PPO) in terms of training stability and generalization performance when optimizing machine translation models.

Machine Translation reinforcement-learning +2

Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale

no code implementations 1 Mar 2022 Laurent Sartran, Samuel Barrett, Adhiguna Kuncoro, Miloš Stanojević, Phil Blunsom, Chris Dyer

We find that TGs outperform various strong baselines on sentence-level language modeling perplexity, as well as on multiple syntax-sensitive language modeling evaluation metrics.

Inductive Bias Language Modelling

Enabling arbitrary translation objectives with Adaptive Tree Search

no code implementations ICLR 2022 Wang Ling, Wojciech Stokowiec, Domenic Donato, Laurent Sartran, Lei Yu, Austin Matthews, Chris Dyer

When applied to autoregressive models, our algorithm has different biases than beam search has, which enables a new analysis of the role of decoding bias in autoregressive models.


Scaling Language Models: Methods, Analysis & Insights from Training Gopher

no code implementations NA 2021 Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, Irina Higgins, Antonia Creswell, Nat McAleese, Amy Wu, Erich Elsen, Siddhant Jayakumar, Elena Buchatskaya, David Budden, Esme Sutherland, Karen Simonyan, Michela Paganini, Laurent Sifre, Lena Martens, Xiang Lorraine Li, Adhiguna Kuncoro, Aida Nematzadeh, Elena Gribovskaya, Domenic Donato, Angeliki Lazaridou, Arthur Mensch, Jean-Baptiste Lespiau, Maria Tsimpoukelli, Nikolai Grigorev, Doug Fritz, Thibault Sottiaux, Mantas Pajarskas, Toby Pohlen, Zhitao Gong, Daniel Toyama, Cyprien de Masson d'Autume, Yujia Li, Tayfun Terzi, Vladimir Mikulik, Igor Babuschkin, Aidan Clark, Diego de Las Casas, Aurelia Guy, Chris Jones, James Bradbury, Matthew Johnson, Blake Hechtman, Laura Weidinger, Iason Gabriel, William Isaac, Ed Lockhart, Simon Osindero, Laura Rimell, Chris Dyer, Oriol Vinyals, Kareem Ayoub, Jeff Stanway, Lorrayne Bennett, Demis Hassabis, Koray Kavukcuoglu, Geoffrey Irving

Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world.

Ranked #1 on College Mathematics on BIG-bench (using extra training data)

Abstract Algebra Anachronisms +133

Diverse Pretrained Context Encodings Improve Document Translation

no code implementations ACL 2021 Domenic Donato, Lei Yu, Chris Dyer

We propose a new architecture for adapting a sentence-level sequence-to-sequence transformer by incorporating multiple pretrained document context signals and assess the impact on translation performance of (1) different pretraining approaches for generating these signals, (2) the quantity of parallel data for which document context is available, and (3) conditioning on source, target, or source and target contexts.

Document Translation Translation

Exposing the Implicit Energy Networks behind Masked Language Models via Metropolis--Hastings

no code implementations ICLR 2022 Kartik Goyal, Chris Dyer, Taylor Berg-Kirkpatrick

While recent work has shown that scores from models trained by the ubiquitous masked language modeling (MLM) objective effectively discriminate probable from improbable sequences, it is still an open question if these MLMs specify a principled probability distribution over the space of possible sequences.

Language Modelling Machine Translation +2

Syntactic Structure Distillation Pretraining For Bidirectional Encoders

no code implementations 27 May 2020 Adhiguna Kuncoro, Lingpeng Kong, Daniel Fried, Dani Yogatama, Laura Rimell, Chris Dyer, Phil Blunsom

Textual representation learners trained on large amounts of data have achieved notable success on downstream tasks; intriguingly, they have also performed well on challenging tests of syntactic competence.

Knowledge Distillation Language Modelling +2

A Probabilistic Generative Model for Typographical Analysis of Early Modern Printing

no code implementations ACL 2020 Kartik Goyal, Chris Dyer, Christopher Warren, Max G'Sell, Taylor Berg-Kirkpatrick

We show that our approach outperforms rigid interpretable clustering baselines (Ocular) and overly-flexible deep generative models (VAE) alike on the task of completely unsupervised discovery of typefaces in mixed-font documents.

Learning Robust and Multilingual Speech Representations

no code implementations Findings of the Association for Computational Linguistics 2020 Kazuya Kawakami, Luyu Wang, Chris Dyer, Phil Blunsom, Aaron van den Oord

Unsupervised speech representation learning has shown remarkable success at finding representations that correlate with phonetic structures and improve downstream speech recognition performance.

Representation Learning speech-recognition +1

Transition-Based Dependency Parsing using Perceptron Learner

no code implementations 22 Jan 2020 Rahul Radhakrishnan Iyer, Miguel Ballesteros, Chris Dyer, Robert Frederking

Syntactic parsing using dependency structures has become a standard technique in natural language processing with many different parsing models, in particular data-driven models that can be trained on syntactically annotated corpora.

Transition-Based Dependency Parsing

Comparing Top-Down and Bottom-Up Neural Generative Dependency Models

no code implementations CONLL 2019 Austin Matthews, Graham Neubig, Chris Dyer

Recurrent neural network grammars generate sentences using phrase-structure syntax and perform very well on both parsing and language modeling.

Language Modelling

Better Document-Level Machine Translation with Bayes' Rule

no code implementations TACL 2020 Lei Yu, Laurent Sartran, Wojciech Stokowiec, Wang Ling, Lingpeng Kong, Phil Blunsom, Chris Dyer

We show that Bayes' rule provides an effective mechanism for creating document translation models that can be learned from only parallel sentences and monolingual documents---a compelling benefit as parallel documents are not always available.

Document Level Machine Translation Document Translation +3

Relative Pixel Prediction For Autoregressive Image Generation

no code implementations 25 Sep 2019 Wang Ling, Chris Dyer, Lei Yu, Lingpeng Kong, Dani Yogatama, Susannah Young

In natural images, transitions between adjacent pixels tend to be smooth and gradual, a fact that has long been exploited in image compression models based on predictive coding.

Colorization Image Colorization +4

Unsupervised Learning of Efficient and Robust Speech Representations

no code implementations 25 Sep 2019 Kazuya Kawakami, Luyu Wang, Chris Dyer, Phil Blunsom, Aaron van den Oord

We present an unsupervised method for learning speech representations based on a bidirectional contrastive predictive coding that implicitly discovers phonetic structure from large-scale corpora of unlabelled raw audio signals.

speech-recognition Speech Recognition

Putting Machine Translation in Context with the Noisy Channel Model

no code implementations 25 Sep 2019 Lei Yu, Laurent Sartran, Wojciech Stokowiec, Wang Ling, Lingpeng Kong, Phil Blunsom, Chris Dyer

We show that Bayes' rule provides a compelling mechanism for controlling unconditional document language models, using the long-standing challenge of effectively leveraging document context in machine translation.

Document Translation Language Modelling +2

A Critical Analysis of Biased Parsers in Unsupervised Parsing

1 code implementation 20 Sep 2019 Chris Dyer, Gábor Melis, Phil Blunsom

A series of recent papers has used a parsing algorithm due to Shen et al. (2018) to recover phrase-structure trees based on proxies for "syntactic depth."

Language Modelling

Shallow Syntax in Deep Water

no code implementations 29 Aug 2019 Swabha Swayamdipta, Matthew Peters, Brendan Roof, Chris Dyer, Noah A. Smith

Shallow syntax provides an approximation of phrase-syntactic structure of sentences; it can be produced with high accuracy, and is computationally cheap to obtain.

Compound Probabilistic Context-Free Grammars for Grammar Induction

2 code implementations ACL 2019 Yoon Kim, Chris Dyer, Alexander M. Rush

We study a formalization of the grammar induction problem that models sentences as being generated by a compound probabilistic context-free grammar.

Constituency Grammar Induction Variational Inference

Scalable Syntax-Aware Language Models Using Knowledge Distillation

no code implementations ACL 2019 Adhiguna Kuncoro, Chris Dyer, Laura Rimell, Stephen Clark, Phil Blunsom

Prior work has shown that, on small amounts of training data, syntactic neural language models learn structurally sensitive generalisations more successfully than sequential language models.

Knowledge Distillation Language Modelling

Unsupervised Recurrent Neural Network Grammars

1 code implementation NAACL 2019 Yoon Kim, Alexander M. Rush, Lei Yu, Adhiguna Kuncoro, Chris Dyer, Gábor Melis

On language modeling, unsupervised RNNGs perform as well as their supervised counterparts on benchmarks in English and Chinese.

Ranked #6 on Constituency Grammar Induction on PTB (Max F1 (WSJ) metric)

Constituency Grammar Induction Language Modelling +1

Learning and Evaluating General Linguistic Intelligence

no code implementations 31 Jan 2019 Dani Yogatama, Cyprien de Masson d'Autume, Jerome Connor, Tomas Kocisky, Mike Chrzanowski, Lingpeng Kong, Angeliki Lazaridou, Wang Ling, Lei Yu, Chris Dyer, Phil Blunsom

We define general linguistic intelligence as the ability to reuse previously acquired knowledge about a language's lexicon, syntax, semantics, and pragmatic conventions to adapt to new tasks quickly.

Natural Language Understanding Question Answering

Learning to Discover, Ground and Use Words with Segmental Neural Language Models

no code implementations ACL 2019 Kazuya Kawakami, Chris Dyer, Phil Blunsom

We propose a segmental neural language model that combines the generalization power of neural networks with the ability to discover word-like units that are latent in unsegmented character sequences.

Language Modelling

Unsupervised Word Discovery with Segmental Neural Language Models

no code implementations 27 Sep 2018 Kazuya Kawakami, Chris Dyer, Phil Blunsom

We propose a segmental neural language model that combines the representational power of neural networks and the structure learning mechanism of Bayesian nonparametrics, and show that it learns to discover semantically meaningful units (e.g., morphemes and words) from unsegmented character sequences.

Language Modelling

Neural Arithmetic Logic Units

22 code implementations NeurIPS 2018 Andrew Trask, Felix Hill, Scott Reed, Jack Rae, Chris Dyer, Phil Blunsom

Neural networks can learn to represent and manipulate numerical information, but they seldom generalize well outside of the range of numerical values encountered during training.

LSTMs Can Learn Syntax-Sensitive Dependencies Well, But Modeling Structure Makes Them Better

no code implementations ACL 2018 Adhiguna Kuncoro, Chris Dyer, John Hale, Dani Yogatama, Stephen Clark, Phil Blunsom

Language exhibits hierarchical structure, but recent work using a subject-verb agreement diagnostic argued that state-of-the-art language models, LSTMs, fail to learn long-range syntax-sensitive dependencies.

Language Modelling Machine Translation +1

Unsupervised Text Style Transfer using Language Models as Discriminators

1 code implementation NeurIPS 2018 Zichao Yang, Zhiting Hu, Chris Dyer, Eric P. Xing, Taylor Berg-Kirkpatrick

Binary classifiers are often employed as discriminators in GAN-based unsupervised style transfer systems to ensure that transferred sentences are similar to sentences in the target domain.

Decipherment Language Modelling +4

Pushing the bounds of dropout

1 code implementation ICLR 2019 Gábor Melis, Charles Blundell, Tomáš Kočiský, Karl Moritz Hermann, Chris Dyer, Phil Blunsom

We show that dropout training is best understood as performing MAP estimation concurrently for a family of conditional models whose objectives are themselves lower bounded by the original dropout objective.

Language Modelling

Learning Deep Generative Models of Graphs

no code implementations ICLR 2018 Yujia Li, Oriol Vinyals, Chris Dyer, Razvan Pascanu, Peter Battaglia

Graphs are fundamental data structures which concisely capture the relational structure in many important real-world domains, such as knowledge graphs, physical and social interactions, language, and chemistry.

Graph Generation Knowledge Graphs

Paraphrase-Supervised Models of Compositionality

no code implementations 31 Jan 2018 Avneesh Saluja, Chris Dyer, Jean-David Ruvini

Compositional vector space models of meaning promise new solutions to stubborn language understanding problems.

Machine Translation Translation

Memory Architectures in Recurrent Neural Network Language Models

no code implementations ICLR 2018 Dani Yogatama, Yishu Miao, Gabor Melis, Wang Ling, Adhiguna Kuncoro, Chris Dyer, Phil Blunsom

We compare and analyze sequential, random access, and stack memory architectures for recurrent neural network language models.

The NarrativeQA Reading Comprehension Challenge

1 code implementation TACL 2018 Tomáš Kočiský, Jonathan Schwarz, Phil Blunsom, Chris Dyer, Karl Moritz Hermann, Gábor Melis, Edward Grefenstette

Reading comprehension (RC)---in contrast to information retrieval---requires integrating information and reasoning about events, entities, and their relations across a full document.

Ranked #9 on Question Answering on NarrativeQA (BLEU-1 metric)

Information Retrieval Question Answering +2

Should Neural Network Architecture Reflect Linguistic Structure?

no code implementations CONLL 2017 Chris Dyer

On the generation front, I introduce recurrent neural network grammars (RNNGs), a joint, generative model of phrase-structure trees and sentences.

End-to-End Neural Segmental Models for Speech Recognition

no code implementations 1 Aug 2017 Hao Tang, Liang Lu, Lingpeng Kong, Kevin Gimpel, Karen Livescu, Chris Dyer, Noah A. Smith, Steve Renals

Segmental models are an alternative to frame-based models for sequence prediction, where hypothesized path weights are based on entire segment scores rather than a single frame at a time.

speech-recognition Speech Recognition

A Continuous Relaxation of Beam Search for End-to-end Training of Neural Sequence Models

no code implementations 1 Aug 2017 Kartik Goyal, Graham Neubig, Chris Dyer, Taylor Berg-Kirkpatrick

In experiments, we show that optimizing this new training objective yields substantially better results on two sequence tasks (Named Entity Recognition and CCG Supertagging) when compared with both cross entropy trained greedy decoding and cross entropy trained beam decoding baselines.

CCG Supertagging Motion Segmentation +3

On the State of the Art of Evaluation in Neural Language Models

1 code implementation ICLR 2018 Gábor Melis, Chris Dyer, Phil Blunsom

Ongoing innovations in recurrent neural network architectures have provided a steady influx of apparently state-of-the-art results on language modelling benchmarks.

Language Modelling

Frame-Semantic Parsing with Softmax-Margin Segmental RNNs and a Syntactic Scaffold

10 code implementations 29 Jun 2017 Swabha Swayamdipta, Sam Thomson, Chris Dyer, Noah A. Smith

We present a new, efficient frame-semantic parser that labels semantic arguments to FrameNet predicates.

Semantic Parsing

Dynamic Integration of Background Knowledge in Neural NLU Systems

no code implementations ICLR 2018 Dirk Weissenborn, Tomáš Kočiský, Chris Dyer

Common-sense and background knowledge is required to understand natural language, but in most neural natural language understanding (NLU) systems, this knowledge must be acquired from training corpora during learning, and then it is static at test time.

Common Sense Reasoning Natural Language Inference +3

Greedy Transition-Based Dependency Parsing with Stack LSTMs

no code implementations CL 2017 Miguel Ballesteros, Chris Dyer, Yoav Goldberg, Noah A. Smith

During training, dynamic oracles alternate between sampling parser states from the training data and from the model as it is being learned, making the model more robust to the kinds of errors that will be made at test time.

Transition-Based Dependency Parsing

On-the-fly Operation Batching in Dynamic Computation Graphs

2 code implementations NeurIPS 2017 Graham Neubig, Yoav Goldberg, Chris Dyer

Dynamic neural network toolkits such as PyTorch, DyNet, and Chainer offer more flexibility for implementing models that cope with data of varying dimensions and structure, relative to toolkits that operate on statically declared computations (e.g., TensorFlow, CNTK, and Theano).

Program Induction by Rationale Generation : Learning to Solve and Explain Algebraic Word Problems

1 code implementation 11 May 2017 Wang Ling, Dani Yogatama, Chris Dyer, Phil Blunsom

Solving algebraic word problems requires executing a series of arithmetic operations---a program---to obtain a final answer.

Program induction

Ontology-Aware Token Embeddings for Prepositional Phrase Attachment

1 code implementation ACL 2017 Pradeep Dasigi, Waleed Ammar, Chris Dyer, Eduard Hovy

Type-level word embeddings use the same set of parameters to represent all instances of a word regardless of its context, ignoring the inherent lexical ambiguity in language.

Prepositional Phrase Attachment Word Embeddings

Learning to Create and Reuse Words in Open-Vocabulary Neural Language Modeling

no code implementations ACL 2017 Kazuya Kawakami, Chris Dyer, Phil Blunsom

Fixed-vocabulary language models fail to account for one of the most characteristic statistical facts of natural language: the frequent creation and reuse of new word types.

Language Modelling

Differentiable Scheduled Sampling for Credit Assignment

no code implementations ACL 2017 Kartik Goyal, Chris Dyer, Taylor Berg-Kirkpatrick

We demonstrate that a continuous relaxation of the argmax operation can be used to create a differentiable approximation to greedy decoding for sequence-to-sequence (seq2seq) models.

Machine Translation named-entity-recognition +3

Generative and Discriminative Text Classification with Recurrent Neural Networks

2 code implementations 6 Mar 2017 Dani Yogatama, Chris Dyer, Wang Ling, Phil Blunsom

We empirically characterize the performance of discriminative and generative LSTM models for text classification.

Continual Learning General Classification +2

Multitask Learning with CTC and Segmental CRF for Speech Recognition

no code implementations 21 Feb 2017 Liang Lu, Lingpeng Kong, Chris Dyer, Noah A. Smith

Segmental conditional random fields (SCRFs) and connectionist temporal classification (CTC) are two sequence labeling methods used for end-to-end training of speech recognition models.

speech-recognition Speech Recognition

DyNet: The Dynamic Neural Network Toolkit

4 code implementations 15 Jan 2017 Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng Ji, Lingpeng Kong, Adhiguna Kuncoro, Gaurav Kumar, Chaitanya Malaviya, Paul Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra, Swabha Swayamdipta, Pengcheng Yin

In the static declaration strategy that is used in toolkits like Theano, CNTK, and TensorFlow, the user first defines a computation graph (a symbolic representation of the computation), and then examples are fed into an engine that executes this computation and computes its derivatives.

graph construction

The Role of Context in Neural Morphological Disambiguation

no code implementations COLING 2016 Qinlan Shen, Daniel Clothiaux, Emily Tagtow, Patrick Littell, Chris Dyer

While morphological analyzers can reduce this sparsity by providing morpheme-level analyses for words, they will often introduce ambiguity by returning multiple analyses for the same surface form.

Morphological Disambiguation

PanPhon: A Resource for Mapping IPA Segments to Articulatory Feature Vectors

1 code implementation COLING 2016 David R. Mortensen, Patrick Littell, Akash Bharadwaj, Kartik Goyal, Chris Dyer, Lori Levin

This paper contributes to a growing body of evidence that---when coupled with appropriate machine-learning techniques---linguistically motivated, information-rich representations can outperform one-hot encodings of linguistic data.


Named Entity Recognition for Linguistic Rapid Response in Low-Resource Languages: Sorani Kurdish and Tajik

no code implementations COLING 2016 Patrick Littell, Kartik Goyal, David R. Mortensen, Alexa Little, Chris Dyer, Lori Levin

This paper describes our construction of named-entity recognition (NER) systems in two Western Iranian languages, Sorani Kurdish and Tajik, as a part of a pilot study of "Linguistic Rapid Response" to potential emergency humanitarian relief situations.

Humanitarian named-entity-recognition +2

Learning to Compose Words into Sentences with Reinforcement Learning

no code implementations 28 Nov 2016 Dani Yogatama, Phil Blunsom, Chris Dyer, Edward Grefenstette, Wang Ling

We use reinforcement learning to learn tree-structured neural networks for computing representations of natural language sentences.

reinforcement-learning Reinforcement Learning (RL)

What Do Recurrent Neural Network Grammars Learn About Syntax?

1 code implementation EACL 2017 Adhiguna Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, Graham Neubig, Noah A. Smith

We investigate what information they learn, from a linguistic perspective, through various ablations to the model and the data, and by augmenting the model with an attention mechanism (GA-RNNG) to enable closer inspection.

Constituency Parsing Dependency Parsing +1

The Neural Noisy Channel

no code implementations 8 Nov 2016 Lei Yu, Phil Blunsom, Chris Dyer, Edward Grefenstette, Tomas Kocisky

We formulate sequence to sequence transduction as a noisy channel decoding problem and use recurrent neural networks to parameterise the source and channel models.

Machine Translation Morphological Inflection +1

Reference-Aware Language Models

no code implementations EMNLP 2017 Zichao Yang, Phil Blunsom, Chris Dyer, Wang Ling

We propose a general class of language models that treat reference as an explicit stochastic latent variable.

Dialogue Generation Recipe Generation

Neural Machine Translation with Recurrent Attention Modeling

no code implementations EACL 2017 Zichao Yang, Zhiting Hu, Yuntian Deng, Chris Dyer, Alex Smola

Knowing which words have been attended to in previous time steps while generating a translation is a rich source of information for predicting what words will be attended to in the future.

Machine Translation Translation

Correlation-based Intrinsic Evaluation of Word Vector Representations

no code implementations WS 2016 Yulia Tsvetkov, Manaal Faruqui, Chris Dyer

We introduce QVEC-CCA, an intrinsic evaluation metric for word vector representations based on correlations of learned vectors with features extracted from linguistic resources.

Word Similarity

Generalizing and Hybridizing Count-based and Neural Language Models

1 code implementation EMNLP 2016 Graham Neubig, Chris Dyer

Language models (LMs) are statistical models that calculate probabilities over sequences of words or other discrete symbols.

Language Modelling

Polyglot Neural Language Models: A Case Study in Cross-Lingual Phonetic Representation Learning

no code implementations NAACL 2016 Yulia Tsvetkov, Sunayana Sitaram, Manaal Faruqui, Guillaume Lample, Patrick Littell, David Mortensen, Alan W. Black, Lori Levin, Chris Dyer

We introduce polyglot language models, recurrent neural network models trained to predict symbol sequences in many different languages using shared representations of symbols and conditioning on typological information about the language to be predicted.

Representation Learning

Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning

no code implementations ACL 2016 Yulia Tsvetkov, Manaal Faruqui, Wang Ling, Brian MacWhinney, Chris Dyer

We use Bayesian optimization to learn curricula for word representation learning, optimizing performance on downstream tasks that depend on the learned representations as features.

Bayesian Optimization Representation Learning

Problems With Evaluation of Word Embeddings Using Word Similarity Tasks

1 code implementation WS 2016 Manaal Faruqui, Yulia Tsvetkov, Pushpendre Rastogi, Chris Dyer

Our study suggests that the use of word similarity tasks for evaluation of word vectors is not sustainable and calls for further research on evaluation methods.

Semantic Similarity Semantic Textual Similarity +2

Bridge-Language Capitalization Inference in Western Iranian: Sorani, Kurmanji, Zazaki, and Tajik

no code implementations LREC 2016 Patrick Littell, David R. Mortensen, Kartik Goyal, Chris Dyer, Lori Levin

In Sorani Kurdish, one of the most useful orthographic features in named-entity recognition, capitalization, is absent, as the language's Perso-Arabic script does not make a distinction between uppercase and lowercase letters.

named-entity-recognition Named Entity Recognition +1

Cross-lingual Models of Word Embeddings: An Empirical Comparison

1 code implementation ACL 2016 Shyam Upadhyay, Manaal Faruqui, Chris Dyer, Dan Roth

Despite interest in using cross-lingual knowledge to learn word embeddings for various tasks, a systematic comparison of the possible approaches is lacking in the literature.

Word Embeddings

Training with Exploration Improves a Greedy Stack-LSTM Parser

no code implementations 11 Mar 2016 Miguel Ballesteros, Yoav Goldberg, Chris Dyer, Noah A. Smith

We adapt the greedy Stack-LSTM dependency parser of Dyer et al. (2015) to support a training-with-exploration procedure using dynamic oracles (Goldberg and Nivre, 2013) instead of cross-entropy minimization.

Chinese Dependency Parsing Dependency Parsing

Neural Architectures for Named Entity Recognition

43 code implementations NAACL 2016 Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, Chris Dyer

State-of-the-art named entity recognition systems rely heavily on hand-crafted features and domain-specific knowledge in order to learn effectively from the small, supervised training corpora that are available.

Named Entity Recognition

Segmental Recurrent Neural Networks for End-to-end Speech Recognition

no code implementations 1 Mar 2016 Liang Lu, Lingpeng Kong, Chris Dyer, Noah A. Smith, Steve Renals

This model connects the segmental conditional random field (CRF) with a recurrent neural network (RNN) used for feature extraction.

Acoustic Modelling Language Modelling +2

Massively Multilingual Word Embeddings

1 code implementation 5 Feb 2016 Waleed Ammar, George Mulcaire, Yulia Tsvetkov, Guillaume Lample, Chris Dyer, Noah A. Smith

We introduce new methods for estimating and evaluating embeddings of words in more than fifty languages in a single shared embedding space.

Multilingual Word Embeddings Text Categorization

Morphological Inflection Generation Using Character Sequence to Sequence Learning

1 code implementation NAACL 2016 Manaal Faruqui, Yulia Tsvetkov, Graham Neubig, Chris Dyer

Morphological inflection generation is the task of generating the inflected form of a given lemma corresponding to a particular linguistic transformation.

LEMMA Morphological Inflection

Modeling Dynamic Relationships Between Characters in Literary Novels

no code implementations 30 Nov 2015 Snigdha Chaturvedi, Shashank Srivastava, Hal Daume III, Chris Dyer

Studying characters plays a vital role in computationally representing and interpreting narratives.

Structured Prediction

Segmental Recurrent Neural Networks

2 code implementations 18 Nov 2015 Lingpeng Kong, Chris Dyer, Noah A. Smith

Representations of the input segments (i.e., contiguous subsequences of the input) are computed by encoding their constituent tokens using bidirectional recurrent neural nets, and these "segment embeddings" are used to define compatibility scores with output labels.

Chinese Word Segmentation Handwriting Recognition +1

Learning to Represent Words in Context with Multilingual Supervision

no code implementations 14 Nov 2015 Kazuya Kawakami, Chris Dyer

We present a neural network architecture based on bidirectional LSTMs to compute representations of words in their sentential contexts.

Machine Translation Translation

Character-based Neural Machine Translation

no code implementations 14 Nov 2015 Wang Ling, Isabel Trancoso, Chris Dyer, Alan W. Black

We introduce a neural machine translation model that views the input and output sentences as sequences of characters rather than words.

Machine Translation Translation

Document Context Language Models

1 code implementation 12 Nov 2015 Yangfeng Ji, Trevor Cohn, Lingpeng Kong, Chris Dyer, Jacob Eisenstein

Text documents are structured on multiple levels of detail: individual words are related by syntax, but larger units of text are related by discourse structure.

Depth-Gated LSTM

no code implementations 16 Aug 2015 Kaisheng Yao, Trevor Cohn, Katerina Vylomova, Kevin Duh, Chris Dyer

This gate is a function of the lower layer memory cell, the input to and the past memory cell of this layer.

Language Modelling Machine Translation +1

Improved Transition-Based Parsing by Modeling Characters instead of Words with LSTMs

1 code implementation EMNLP 2015 Miguel Ballesteros, Chris Dyer, Noah A. Smith

We present extensions to a continuous-state dependency parsing method that make it applicable to morphologically rich languages.

Dependency Parsing

Non-distributional Word Vector Representations

1 code implementation IJCNLP 2015 Manaal Faruqui, Chris Dyer

Data-driven representation learning for words is a technique of central importance in NLP.

Representation Learning

Sparse Overcomplete Word Vector Representations

3 code implementations IJCNLP 2015 Manaal Faruqui, Yulia Tsvetkov, Dani Yogatama, Chris Dyer, Noah Smith

Current distributed representations of words show little resemblance to theories of lexical semantics.

Unsupervised POS Induction with Word Embeddings

no code implementations HLT 2015 Chu-Cheng Lin, Waleed Ammar, Chris Dyer, Lori Levin

Unsupervised word embeddings have been shown to be valuable as features in supervised learning problems; however, their role in unsupervised problems has been less thoroughly explored.

POS Word Embeddings

Retrofitting Word Vectors to Semantic Lexicons

2 code implementations HLT 2015 Manaal Faruqui, Jesse Dodge, Sujay K. Jauhar, Chris Dyer, Eduard Hovy, Noah A. Smith

Vector space word representations are learned from distributional information of words in large corpora.

Notes on Noise Contrastive Estimation and Negative Sampling

3 code implementations 30 Oct 2014 Chris Dyer

Estimating the parameters of probabilistic models of language such as maxent models and probabilistic neural models is computationally difficult since it involves evaluating partition functions by summing over an entire vocabulary, which may be millions of word types in size.

Binary Classification

Learning Word Representations with Hierarchical Sparse Coding

no code implementations 8 Jun 2014 Dani Yogatama, Manaal Faruqui, Chris Dyer, Noah A. Smith

We propose a new method for learning word representations using hierarchical regularization in sparse coding inspired by the linguistic study of word meanings.

Sentence Completion Sentiment Analysis +1

Augmenting English Adjective Senses with Supersenses

1 code implementation LREC 2014 Yulia Tsvetkov, Nathan Schneider, Dirk Hovy, Archna Bhatia, Manaal Faruqui, Chris Dyer

We develop a supersense taxonomy for adjectives, based on that of GermaNet, and apply it to English adjectives in WordNet using human annotation and supervised classification.

Classification General Classification

A Unified Annotation Scheme for the Semantic/Pragmatic Components of Definiteness

no code implementations LREC 2014 Archna Bhatia, Mandy Simons, Lori Levin, Yulia Tsvetkov, Chris Dyer, Jordan Bender

We present a definiteness annotation scheme that captures the semantic, pragmatic, and discourse information, which we call communicative functions, associated with linguistic descriptions such as "a story about my speech", "the story", "every time I give it", "this slideshow".

Machine Translation Specificity

Discriminative Lexical Semantic Segmentation with Gaps: Running the MWE Gamut

no code implementations TACL 2014 Nathan Schneider, Emily Danchik, Chris Dyer, Noah A. Smith

We present a novel representation, evaluation measure, and supervised models for the task of identifying the multiword expressions (MWEs) in a sentence, resulting in a lexical semantic segmentation.

Chunking Semantic Segmentation

Locally Non-Linear Learning for Statistical Machine Translation via Discretization and Structured Regularization

no code implementations TACL 2014 Jonathan H. Clark, Chris Dyer, Alon Lavie

Linear models, which support efficient learning and inference, are the workhorses of statistical machine translation; however, linear decision rules are less attractive from a modeling perspective.

Feature Engineering Language Modelling +3

Language Modeling with Power Low Rank Ensembles

no code implementations EMNLP 2014 Ankur P. Parikh, Avneesh Saluja, Chris Dyer, Eric P. Xing

We present power low rank ensembles (PLRE), a flexible framework for n-gram language modeling where ensembles of low rank matrices and tensors are used to obtain smoothed probability estimates of words in context.

Language Modelling Machine Translation +1

Predicting the NFL using Twitter

no code implementations 25 Oct 2013 Shiladitya Sinha, Chris Dyer, Kevin Gimpel, Noah A. Smith

We study the relationship between social media output and National Football League (NFL) games, using a dataset containing messages from Twitter and NFL game statistics.

Minimum Error Rate Training and the Convex Hull Semiring

no code implementations 13 Jul 2013 Chris Dyer

We describe the line search used in the minimum error rate training algorithm MERT as the "inside score" of a weighted proof forest under a semiring defined in terms of well-understood operations from computational geometry.