Search Results for author: Chris Dyer

Found 175 papers, 51 papers with code

Neural Architectures for Named Entity Recognition

43 code implementations NAACL 2016 Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, Chris Dyer

State-of-the-art named entity recognition systems rely heavily on hand-crafted features and domain-specific knowledge in order to learn effectively from the small, supervised training corpora that are available.

Named Entity Recognition

DyNet: The Dynamic Neural Network Toolkit

4 code implementations 15 Jan 2017 Graham Neubig, Chris Dyer, Yoav Goldberg, Austin Matthews, Waleed Ammar, Antonios Anastasopoulos, Miguel Ballesteros, David Chiang, Daniel Clothiaux, Trevor Cohn, Kevin Duh, Manaal Faruqui, Cynthia Gan, Dan Garrette, Yangfeng Ji, Lingpeng Kong, Adhiguna Kuncoro, Gaurav Kumar, Chaitanya Malaviya, Paul Michel, Yusuke Oda, Matthew Richardson, Naomi Saphra, Swabha Swayamdipta, Pengcheng Yin

In the static declaration strategy that is used in toolkits like Theano, CNTK, and TensorFlow, the user first defines a computation graph (a symbolic representation of the computation), and then examples are fed into an engine that executes this computation and computes its derivatives.

graph construction
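
The entry above contrasts static declaration (define one graph, then feed examples into it) with DyNet's dynamic "define-by-run" declaration, where the graph is rebuilt per example. Below is a minimal, library-free sketch of the dynamic strategy, assuming made-up parameters and a toy tree-shaped input; it illustrates the idea, not DyNet's API.

```python
# Dynamic declaration: the computation is re-declared per example, so its
# shape can follow the input's structure (here, nested tuples as a tree).
import numpy as np

rng = np.random.default_rng(0)
DIM = 4
W = rng.standard_normal((DIM, 2 * DIM))            # composition weights
E = {w: rng.standard_normal(DIM) for w in "abcd"}  # toy embeddings

def encode(tree):
    """Recursively build the computation for one example's tree."""
    if isinstance(tree, str):                      # leaf: look up embedding
        return E[tree]
    left, right = tree
    h = np.concatenate([encode(left), encode(right)])
    return np.tanh(W @ h)                          # compose the children

# Two examples with different shapes induce two different graphs:
print(encode(("a", ("b", "c"))))
print(encode((("a", "b"), ("c", "d"))))
```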

Unsupervised Text Style Transfer using Language Models as Discriminators

1 code implementation NeurIPS 2018 Zichao Yang, Zhiting Hu, Chris Dyer, Eric P. Xing, Taylor Berg-Kirkpatrick

Binary classifiers are often employed as discriminators in GAN-based unsupervised style transfer systems to ensure that transferred sentences are similar to sentences in the target domain.

Decipherment Language Modelling +4

Scaling Language Models: Methods, Analysis & Insights from Training Gopher

2 code implementations NA 2021 Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, Irina Higgins, Antonia Creswell, Nat McAleese, Amy Wu, Erich Elsen, Siddhant Jayakumar, Elena Buchatskaya, David Budden, Esme Sutherland, Karen Simonyan, Michela Paganini, Laurent Sifre, Lena Martens, Xiang Lorraine Li, Adhiguna Kuncoro, Aida Nematzadeh, Elena Gribovskaya, Domenic Donato, Angeliki Lazaridou, Arthur Mensch, Jean-Baptiste Lespiau, Maria Tsimpoukelli, Nikolai Grigorev, Doug Fritz, Thibault Sottiaux, Mantas Pajarskas, Toby Pohlen, Zhitao Gong, Daniel Toyama, Cyprien de Masson d'Autume, Yujia Li, Tayfun Terzi, Vladimir Mikulik, Igor Babuschkin, Aidan Clark, Diego de Las Casas, Aurelia Guy, Chris Jones, James Bradbury, Matthew Johnson, Blake Hechtman, Laura Weidinger, Iason Gabriel, William Isaac, Ed Lockhart, Simon Osindero, Laura Rimell, Chris Dyer, Oriol Vinyals, Kareem Ayoub, Jeff Stanway, Lorrayne Bennett, Demis Hassabis, Koray Kavukcuoglu, Geoffrey Irving

Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world.

Abstract Algebra Anachronisms +133

The NarrativeQA Reading Comprehension Challenge

2 code implementations TACL 2018 Tomáš Kočiský, Jonathan Schwarz, Phil Blunsom, Chris Dyer, Karl Moritz Hermann, Gábor Melis, Edward Grefenstette

Reading comprehension (RC)---in contrast to information retrieval---requires integrating information and reasoning about events, entities, and their relations across a full document.

Ranked #9 on Question Answering on NarrativeQA (BLEU-1 metric)

Information Retrieval Question Answering +2

Notes on Noise Contrastive Estimation and Negative Sampling

3 code implementations 30 Oct 2014 Chris Dyer

Estimating the parameters of probabilistic models of language such as maxent models and probabilistic neural models is computationally difficult since it involves evaluating partition functions by summing over an entire vocabulary, which may be millions of word types in size.

Binary Classification
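
A hedged numpy sketch of the objective the note analyzes: NCE reduces density estimation to a binary classification between one observed word and k samples from a known noise distribution q, so no sum over the vocabulary is required. The scores stand in for unnormalized model log-probabilities; all names are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nce_loss(s_data, s_noise, q_data, q_noise, k):
    """s_*: model scores (unnormalized log-probs); q_*: noise probabilities."""
    pos = np.log(sigmoid(s_data - np.log(k * q_data)))          # observed word
    neg = np.log(1.0 - sigmoid(s_noise - np.log(k * q_noise)))  # noise words
    return -(pos + neg.sum())

# One observed word scored against k=2 noise samples:
print(nce_loss(s_data=2.1, s_noise=np.array([0.3, -1.2]),
               q_data=0.01, q_noise=np.array([0.05, 0.2]), k=2))
```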

Retrofitting Word Vectors to Semantic Lexicons

2 code implementations HLT 2015 Manaal Faruqui, Jesse Dodge, Sujay K. Jauhar, Chris Dyer, Eduard Hovy, Noah A. Smith

Vector space word representations are learned from distributional information of words in large corpora.
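
The method's usual statement is an iterative update that pulls each word vector toward its lexicon neighbours while staying close to the original distributional estimate. A small numpy sketch under that reading, with alpha = 1 and beta = 1/degree as commonly cited (treat the exact weighting as an assumption rather than the paper's verbatim recipe):

```python
import numpy as np

def retrofit(Q_hat, edges, iters=10):
    """Q_hat: {word: vector}; edges: {word: set of lexicon neighbours}."""
    Q = {w: v.copy() for w, v in Q_hat.items()}
    for _ in range(iters):
        for w, nbrs in edges.items():
            nbrs = [n for n in nbrs if n in Q]
            if not nbrs:
                continue
            beta = 1.0 / len(nbrs)
            # pulled toward neighbours, anchored to the original estimate;
            # denominator is alpha + beta * len(nbrs) = 2 here
            Q[w] = (Q_hat[w] + beta * sum(Q[n] for n in nbrs)) / 2.0
    return Q

Q_hat = {"happy": np.array([1.0, 0.0]), "glad": np.array([0.0, 1.0])}
print(retrofit(Q_hat, {"happy": {"glad"}, "glad": {"happy"}})["happy"])
```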

Program Induction by Rationale Generation : Learning to Solve and Explain Algebraic Word Problems

1 code implementation 11 May 2017 Wang Ling, Dani Yogatama, Chris Dyer, Phil Blunsom

Solving algebraic word problems requires executing a series of arithmetic operations---a program---to obtain a final answer.

Program induction
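
The "program" here is just an operation sequence whose execution yields the answer. A toy, hypothetical interpreter for such programs, applied to "each item costs 3, you buy 4 and pay with 20; how much change?":

```python
def run(program, memory):
    ops = {"add": lambda x, y: x + y, "sub": lambda x, y: x - y,
           "mul": lambda x, y: x * y, "div": lambda x, y: x / y}
    for op, a, b, out in program:
        memory[out] = ops[op](memory[a], memory[b])
    return memory

mem = {"price": 3, "count": 4, "paid": 20}
prog = [("mul", "price", "count", "total"),   # 3 * 4 = 12
        ("sub", "paid", "total", "change")]   # 20 - 12 = 8
print(run(prog, mem)["change"])
```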

Frame-Semantic Parsing with Softmax-Margin Segmental RNNs and a Syntactic Scaffold

10 code implementations 29 Jun 2017 Swabha Swayamdipta, Sam Thomson, Chris Dyer, Noah A. Smith

We present a new, efficient frame-semantic parser that labels semantic arguments to FrameNet predicates.

Semantic Parsing

Generative and Discriminative Text Classification with Recurrent Neural Networks

2 code implementations 6 Mar 2017 Dani Yogatama, Chris Dyer, Wang Ling, Phil Blunsom

We empirically characterize the performance of discriminative and generative LSTM models for text classification.

Continual Learning General Classification +2
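
To make the comparison concrete, here is a hedged sketch of the generative side: fit one language model per class and classify by Bayes' rule, picking the class maximizing log p(x|y) + log p(y); a discriminative model instead parameterizes p(y|x) directly. Unigram LMs stand in for the paper's LSTMs, and all names are illustrative.

```python
import math

def generative_classify(tokens, class_lms, priors):
    """class_lms: {label: {word: prob}}; priors: {label: prob}."""
    scores = {}
    for y, lm in class_lms.items():
        scores[y] = math.log(priors[y]) + sum(
            math.log(lm.get(t, 1e-6)) for t in tokens)  # smoothed lookup
    return max(scores, key=scores.get)

lms = {"pos": {"great": 0.5, "film": 0.5},
       "neg": {"dull": 0.5, "film": 0.5}}
print(generative_classify(["great", "film"], lms,
                          {"pos": 0.5, "neg": 0.5}))  # -> "pos"
```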

Improved Transition-Based Parsing by Modeling Characters instead of Words with LSTMs

1 code implementation EMNLP 2015 Miguel Ballesteros, Chris Dyer, Noah A. Smith

We present extensions to a continuous-state dependency parsing method that makes it applicable to morphologically rich languages.

Dependency Parsing

PanPhon: A Resource for Mapping IPA Segments to Articulatory Feature Vectors

1 code implementation COLING 2016 David R. Mortensen, Patrick Littell, Akash Bharadwaj, Kartik Goyal, Chris Dyer, Lori Levin

This paper contributes to a growing body of evidence that---when coupled with appropriate machine-learning techniques---linguistically motivated, information-rich representations can outperform one-hot encodings of linguistic data.

NER
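
The resource's central object is a table from IPA segments to vectors of articulatory feature values. A toy stand-in with a deliberately tiny, made-up inventory (the real PanPhon tables cover many more segments and features):

```python
# Feature values in {+1, -1} per articulatory feature; real entries also
# use 0 for "unspecified". Downstream models consume these vectors
# instead of one-hot segment IDs.
FEATURES = ["syllabic", "voiced", "nasal"]
TABLE = {"p": [-1, -1, -1], "b": [-1, +1, -1],
         "m": [-1, +1, +1], "a": [+1, +1, -1]}

def word_to_vectors(segments):
    return [TABLE[s] for s in segments]

print(word_to_vectors(["m", "a", "p"]))
```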

What Do Recurrent Neural Network Grammars Learn About Syntax?

1 code implementation EACL 2017 Adhiguna Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, Graham Neubig, Noah A. Smith

We investigate what information they learn, from a linguistic perspective, through various ablations to the model and the data, and by augmenting the model with an attention mechanism (GA-RNNG) to enable closer inspection.

Constituency Parsing Dependency Parsing +1

Pushing the bounds of dropout

1 code implementation ICLR 2019 Gábor Melis, Charles Blundell, Tomáš Kočiský, Karl Moritz Hermann, Chris Dyer, Phil Blunsom

We show that dropout training is best understood as performing MAP estimation concurrently for a family of conditional models whose objectives are themselves lower bounded by the original dropout objective.

Language Modelling
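
One concrete instance of the bound the abstract refers to, stated for completeness: for a mask distribution r(z), Jensen's inequality makes the expected per-mask log-likelihood (the dropout training objective) a lower bound on the log-likelihood of the conditional model that marginalizes out the mask.

```latex
\log p(y \mid x)
  \;=\; \log \mathbb{E}_{z \sim r}\bigl[\, p(y \mid x, z) \,\bigr]
  \;\ge\; \mathbb{E}_{z \sim r}\bigl[\, \log p(y \mid x, z) \,\bigr]
```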

On the State of the Art of Evaluation in Neural Language Models

1 code implementation ICLR 2018 Gábor Melis, Chris Dyer, Phil Blunsom

Ongoing innovations in recurrent neural network architectures have provided a steady influx of apparently state-of-the-art results on language modelling benchmarks.

Language Modelling

Compound Probabilistic Context-Free Grammars for Grammar Induction

2 code implementations ACL 2019 Yoon Kim, Chris Dyer, Alexander M. Rush

We study a formalization of the grammar induction problem that models sentences as being generated by a compound probabilistic context-free grammar.

Constituency Grammar Induction Sentence +1

On-the-fly Operation Batching in Dynamic Computation Graphs

2 code implementations NeurIPS 2017 Graham Neubig, Yoav Goldberg, Chris Dyer

Dynamic neural network toolkits such as PyTorch, DyNet, and Chainer offer more flexibility for implementing models that cope with data of varying dimensions and structure, relative to toolkits that operate on statically declared computations (e.g., TensorFlow, CNTK, and Theano).
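
A bare-bones sketch of the on-the-fly batching idea, under simplifying assumptions: buffer the operations a user's code requests, group those with the same signature (operation type and shape), and run each group as one vectorized call. The real algorithm must also respect dependencies between operations, which this toy version ignores.

```python
from collections import defaultdict
import numpy as np

pending = [("tanh", np.array([0.1, 0.2])),
           ("tanh", np.array([0.3, 0.4])),
           ("exp",  np.array([1.0, 2.0]))]

groups = defaultdict(list)
for op, arg in pending:
    groups[(op, arg.shape)].append(arg)    # batchable signature

for (op, shape), args in groups.items():
    batch = np.stack(args)                 # one kernel call per group
    out = {"tanh": np.tanh, "exp": np.exp}[op](batch)
    print(op, shape, out)
```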

Massively Multilingual Word Embeddings

1 code implementation 5 Feb 2016 Waleed Ammar, George Mulcaire, Yulia Tsvetkov, Guillaume Lample, Chris Dyer, Noah A. Smith

We introduce new methods for estimating and evaluating embeddings of words in more than fifty languages in a single shared embedding space.

Multilingual Word Embeddings Text Categorization

Non-distributional Word Vector Representations

1 code implementation IJCNLP 2015 Manaal Faruqui, Chris Dyer

Data-driven representation learning for words is a technique of central importance in NLP.

Representation Learning

Sparse Overcomplete Word Vector Representations

3 code implementations IJCNLP 2015 Manaal Faruqui, Yulia Tsvetkov, Dani Yogatama, Chris Dyer, Noah Smith

Current distributed representations of words show little resemblance to theories of lexical semantics.

Syntactic Scaffolds for Semantic Structures

1 code implementation EMNLP 2018 Swabha Swayamdipta, Sam Thomson, Kenton Lee, Luke Zettlemoyer, Chris Dyer, Noah A. Smith

We introduce the syntactic scaffold, an approach to incorporating syntactic information into semantic tasks.

coreference-resolution

Ontology-Aware Token Embeddings for Prepositional Phrase Attachment

1 code implementation ACL 2017 Pradeep Dasigi, Waleed Ammar, Chris Dyer, Eduard Hovy

Type-level word embeddings use the same set of parameters to represent all instances of a word regardless of its context, ignoring the inherent lexical ambiguity in language.

Prepositional Phrase Attachment Word Embeddings

Neural Arithmetic Logic Units

21 code implementations NeurIPS 2018 Andrew Trask, Felix Hill, Scott Reed, Jack Rae, Chris Dyer, Phil Blunsom

Neural networks can learn to represent and manipulate numerical information, but they seldom generalize well outside of the range of numerical values encountered during training.
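
A numpy sketch of the paper's two modules as they are commonly described (parameter shapes and the epsilon are our choices): a neural accumulator (NAC) whose effective weights are biased toward {-1, 0, 1}, so outputs are additions and subtractions of inputs, and a NALU that gates between that additive path and a log-space multiplicative path.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nac(x, W_hat, M_hat):
    W = np.tanh(W_hat) * sigmoid(M_hat)     # effective weights near -1, 0, 1
    return x @ W.T

def nalu(x, W_hat, M_hat, G, eps=1e-8):
    a = nac(x, W_hat, M_hat)                               # add / subtract
    m = np.exp(nac(np.log(np.abs(x) + eps), W_hat, M_hat)) # multiply / divide
    g = sigmoid(x @ G.T)                                   # learned gate
    return g * a + (1.0 - g) * m

rng = np.random.default_rng(0)
x = np.array([[2.0, 3.0]])
W_hat, M_hat, G = (rng.standard_normal((1, 2)) for _ in range(3))
print(nalu(x, W_hat, M_hat, G))
```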

A Critical Analysis of Biased Parsers in Unsupervised Parsing

1 code implementation 20 Sep 2019 Chris Dyer, Gábor Melis, Phil Blunsom

A series of recent papers has used a parsing algorithm due to Shen et al. (2018) to recover phrase-structure trees based on proxies for "syntactic depth."

Language Modelling

Morphological Inflection Generation Using Character Sequence to Sequence Learning

1 code implementation NAACL 2016 Manaal Faruqui, Yulia Tsvetkov, Graham Neubig, Chris Dyer

Morphological inflection generation is the task of generating the inflected form of a given lemma corresponding to a particular linguistic transformation.

LEMMA Morphological Inflection

Generalizing and Hybridizing Count-based and Neural Language Models

1 code implementation EMNLP 2016 Graham Neubig, Chris Dyer

Language models (LMs) are statistical models that calculate probabilities over sequences of words or other discrete symbols.

Language Modelling

Document Context Language Models

1 code implementation 12 Nov 2015 Yangfeng Ji, Trevor Cohn, Lingpeng Kong, Chris Dyer, Jacob Eisenstein

Text documents are structured on multiple levels of detail: individual words are related by syntax, but larger units of text are related by discourse structure.

Sentence

Cross-lingual Models of Word Embeddings: An Empirical Comparison

1 code implementation ACL 2016 Shyam Upadhyay, Manaal Faruqui, Chris Dyer, Dan Roth

Despite interest in using cross-lingual knowledge to learn word embeddings for various tasks, a systematic comparison of the possible approaches is lacking in the literature.

Word Embeddings

Problems With Evaluation of Word Embeddings Using Word Similarity Tasks

1 code implementation WS 2016 Manaal Faruqui, Yulia Tsvetkov, Pushpendre Rastogi, Chris Dyer

Our study suggests that the use of word similarity tasks for evaluation of word vectors is not sustainable and calls for further research on evaluation methods.

Semantic Similarity Semantic Textual Similarity +2

Augmenting English Adjective Senses with Supersenses

1 code implementation LREC 2014 Yulia Tsvetkov, Nathan Schneider, Dirk Hovy, Archna Bhatia, Manaal Faruqui, Chris Dyer

We develop a supersense taxonomy for adjectives, based on that of GermaNet, and apply it to English adjectives in WordNet using human annotation and supervised classification.

Classification General Classification

Learning Deep Generative Models of Graphs

no code implementations ICLR 2018 Yujia Li, Oriol Vinyals, Chris Dyer, Razvan Pascanu, Peter Battaglia

Graphs are fundamental data structures which concisely capture the relational structure in many important real-world domains, such as knowledge graphs, physical and social interactions, language, and chemistry.

Graph Generation Knowledge Graphs

Paraphrase-Supervised Models of Compositionality

no code implementations 31 Jan 2018 Avneesh Saluja, Chris Dyer, Jean-David Ruvini

Compositional vector space models of meaning promise new solutions to stubborn language understanding problems.

Machine Translation Translation

Dynamic Integration of Background Knowledge in Neural NLU Systems

no code implementations ICLR 2018 Dirk Weissenborn, Tomáš Kočiský, Chris Dyer

Common-sense and background knowledge is required to understand natural language, but in most neural natural language understanding (NLU) systems, this knowledge must be acquired from training corpora during learning, and then it is static at test time.

Common Sense Reasoning Natural Language Inference +3

A Continuous Relaxation of Beam Search for End-to-end Training of Neural Sequence Models

no code implementations 1 Aug 2017 Kartik Goyal, Graham Neubig, Chris Dyer, Taylor Berg-Kirkpatrick

In experiments, we show that optimizing this new training objective yields substantially better results on two sequence tasks (Named Entity Recognition and CCG Supertagging) when compared with both cross entropy trained greedy decoding and cross entropy trained beam decoding baselines.

CCG Supertagging Motion Segmentation +3

End-to-End Neural Segmental Models for Speech Recognition

no code implementations 1 Aug 2017 Hao Tang, Liang Lu, Lingpeng Kong, Kevin Gimpel, Karen Livescu, Chris Dyer, Noah A. Smith, Steve Renals

Segmental models are an alternative to frame-based models for sequence prediction, where hypothesized path weights are based on entire segment scores rather than a single frame at a time.

speech-recognition Speech Recognition

Reference-Aware Language Models

no code implementations EMNLP 2017 Zichao Yang, Phil Blunsom, Chris Dyer, Wang Ling

We propose a general class of language models that treat reference as an explicit stochastic latent variable.

Dialogue Generation Recipe Generation

Multitask Learning with CTC and Segmental CRF for Speech Recognition

no code implementations 21 Feb 2017 Liang Lu, Lingpeng Kong, Chris Dyer, Noah A. Smith

Segmental conditional random fields (SCRFs) and connectionist temporal classification (CTC) are two sequence labeling methods used for end-to-end training of speech recognition models.

speech-recognition Speech Recognition

Learning to Create and Reuse Words in Open-Vocabulary Neural Language Modeling

no code implementations ACL 2017 Kazuya Kawakami, Chris Dyer, Phil Blunsom

Fixed-vocabulary language models fail to account for one of the most characteristic statistical facts of natural language: the frequent creation and reuse of new word types.

Language Modelling

Differentiable Scheduled Sampling for Credit Assignment

no code implementations ACL 2017 Kartik Goyal, Chris Dyer, Taylor Berg-Kirkpatrick

We demonstrate that a continuous relaxation of the argmax operation can be used to create a differentiable approximation to greedy decoding for sequence-to-sequence (seq2seq) models.

Machine Translation named-entity-recognition +3
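
A tiny numpy sketch of the relaxation being described, with an illustrative temperature: a sharpened softmax over the output logits produces a "soft" prediction, a convex combination of embedding rows, that can feed the next decoder step while remaining differentiable.

```python
import numpy as np

def soft_argmax_embedding(logits, E, alpha=10.0):
    z = alpha * logits
    p = np.exp(z - z.max())
    p /= p.sum()               # near one-hot for large alpha
    return p @ E               # soft selection of an embedding row

E = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(soft_argmax_embedding(np.array([2.0, 0.5, 0.1]), E))
```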

The Neural Noisy Channel

no code implementations 8 Nov 2016 Lei Yu, Phil Blunsom, Chris Dyer, Edward Grefenstette, Tomas Kocisky

We formulate sequence to sequence transduction as a noisy channel decoding problem and use recurrent neural networks to parameterise the source and channel models.

Machine Translation Morphological Inflection +2
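
The factorization behind the noisy channel formulation, stated for completeness: decoding combines a channel model p(x|y), which asks how well the candidate output explains the observed input, with a prior p(y) over outputs.

```latex
\hat{y} \;=\; \operatorname*{arg\,max}_{y} \; p(y \mid x)
      \;=\; \operatorname*{arg\,max}_{y} \; p(x \mid y)\, p(y)
```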

Learning to Compose Words into Sentences with Reinforcement Learning

no code implementations 28 Nov 2016 Dani Yogatama, Phil Blunsom, Chris Dyer, Edward Grefenstette, Wang Ling

We use reinforcement learning to learn tree-structured neural networks for computing representations of natural language sentences.

reinforcement-learning Reinforcement Learning (RL)

Training with Exploration Improves a Greedy Stack-LSTM Parser

no code implementations 11 Mar 2016 Miguel Ballesteros, Yoav Goldberg, Chris Dyer, Noah A. Smith

We adapt the greedy Stack-LSTM dependency parser of Dyer et al. (2015) to support a training-with-exploration procedure using dynamic oracles (Goldberg and Nivre, 2013) instead of cross-entropy minimization.

Chinese Dependency Parsing Dependency Parsing

Neural Machine Translation with Recurrent Attention Modeling

no code implementations EACL 2017 Zichao Yang, Zhiting Hu, Yuntian Deng, Chris Dyer, Alex Smola

Knowing which words have been attended to in previous time steps while generating a translation is a rich source of information for predicting what words will be attended to in the future.

Machine Translation Translation

Correlation-based Intrinsic Evaluation of Word Vector Representations

no code implementations WS 2016 Yulia Tsvetkov, Manaal Faruqui, Chris Dyer

We introduce QVEC-CCA--an intrinsic evaluation metric for word vector representations based on correlations of learned vectors with features extracted from linguistic resources.

Word Similarity

Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning

no code implementations ACL 2016 Yulia Tsvetkov, Manaal Faruqui, Wang Ling, Brian MacWhinney, Chris Dyer

We use Bayesian optimization to learn curricula for word representation learning, optimizing performance on downstream tasks that depend on the learned representations as features.

Bayesian Optimization Representation Learning

Segmental Recurrent Neural Networks for End-to-end Speech Recognition

no code implementations 1 Mar 2016 Liang Lu, Lingpeng Kong, Chris Dyer, Noah A. Smith, Steve Renals

This model connects the segmental conditional random field (CRF) with a recurrent neural network (RNN) used for feature extraction.

Acoustic Modelling Language Modelling +2

Polyglot Neural Language Models: A Case Study in Cross-Lingual Phonetic Representation Learning

no code implementations NAACL 2016 Yulia Tsvetkov, Sunayana Sitaram, Manaal Faruqui, Guillaume Lample, Patrick Littell, David Mortensen, Alan W. Black, Lori Levin, Chris Dyer

We introduce polyglot language models, recurrent neural network models trained to predict symbol sequences in many different languages using shared representations of symbols and conditioning on typological information about the language to be predicted.

Representation Learning

Segmental Recurrent Neural Networks

2 code implementations 18 Nov 2015 Lingpeng Kong, Chris Dyer, Noah A. Smith

Representations of the input segments (i.e., contiguous subsequences of the input) are computed by encoding their constituent tokens using bidirectional recurrent neural nets, and these "segment embeddings" are used to define compatibility scores with output labels.

Chinese Word Segmentation Handwriting Recognition +2

Learning to Represent Words in Context with Multilingual Supervision

no code implementations 14 Nov 2015 Kazuya Kawakami, Chris Dyer

We present a neural network architecture based on bidirectional LSTMs to compute representations of words in the sentential contexts.

Machine Translation Translation

Character-based Neural Machine Translation

no code implementations 14 Nov 2015 Wang Ling, Isabel Trancoso, Chris Dyer, Alan W. Black

We introduce a neural machine translation model that views the input and output sentences as sequences of characters rather than words.

Machine Translation Translation

Depth-Gated LSTM

no code implementations 16 Aug 2015 Kaisheng Yao, Trevor Cohn, Katerina Vylomova, Kevin Duh, Chris Dyer

This gate is a function of the lower layer's memory cell, the input to this layer, and this layer's past memory cell.

Language Modelling Machine Translation +1

Unsupervised POS Induction with Word Embeddings

no code implementations HLT 2015 Chu-Cheng Lin, Waleed Ammar, Chris Dyer, Lori Levin

Unsupervised word embeddings have been shown to be valuable as features in supervised learning problems; however, their role in unsupervised problems has been less thoroughly explored.

POS Word Embeddings

Learning Word Representations with Hierarchical Sparse Coding

no code implementations 8 Jun 2014 Dani Yogatama, Manaal Faruqui, Chris Dyer, Noah A. Smith

We propose a new method for learning word representations using hierarchical regularization in sparse coding inspired by the linguistic study of word meanings.

Sentence Sentence Completion +2

Language Modeling with Power Low Rank Ensembles

no code implementations EMNLP 2014 Ankur P. Parikh, Avneesh Saluja, Chris Dyer, Eric P. Xing

We present power low rank ensembles (PLRE), a flexible framework for n-gram language modeling where ensembles of low rank matrices and tensors are used to obtain smoothed probability estimates of words in context.

Language Modelling Machine Translation +1

Predicting the NFL using Twitter

no code implementations 25 Oct 2013 Shiladitya Sinha, Chris Dyer, Kevin Gimpel, Noah A. Smith

We study the relationship between social media output and National Football League (NFL) games, using a dataset containing messages from Twitter and NFL game statistics.

Minimum Error Rate Training and the Convex Hull Semiring

no code implementations 13 Jul 2013 Chris Dyer

We describe the line search used in the minimum error rate training algorithm MERT as the "inside score" of a weighted proof forest under a semiring defined in terms of well-understood operations from computational geometry.
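
On one common reading of this construction (our paraphrase, so treat the details as an assumption rather than the paper's exact definitions): semiring elements are convex hulls of points encoding lines by slope and intercept, addition is the hull of a union, and multiplication is a Minkowski sum, mirroring how error surfaces combine under alternatives and concatenation in the forest.

```latex
a \oplus b \;=\; \operatorname{conv}(a \cup b), \qquad
a \otimes b \;=\; \{\, u + v \;:\; u \in a,\; v \in b \,\}
```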

Learning to Discover, Ground and Use Words with Segmental Neural Language Models

no code implementations ACL 2019 Kazuya Kawakami, Chris Dyer, Phil Blunsom

We propose a segmental neural language model that combines the generalization power of neural networks with the ability to discover word-like units that are latent in unsegmented character sequences.

Language Modelling Segmentation

Greedy Transition-Based Dependency Parsing with Stack LSTMs

no code implementations CL 2017 Miguel Ballesteros, Chris Dyer, Yoav Goldberg, Noah A. Smith

During training, dynamic oracles alternate between sampling parser states from the training data and from the model as it is being learned, making the model more robust to the kinds of errors that will be made at test time.

Transition-Based Dependency Parsing

LSTMs Can Learn Syntax-Sensitive Dependencies Well, But Modeling Structure Makes Them Better

no code implementations ACL 2018 Adhiguna Kuncoro, Chris Dyer, John Hale, Dani Yogatama, Stephen Clark, Phil Blunsom

Language exhibits hierarchical structure, but recent work using a subject-verb agreement diagnostic argued that state-of-the-art language models, LSTMs, fail to learn long-range syntax-sensitive dependencies.

Language Modelling Machine Translation +1

Should Neural Network Architecture Reflect Linguistic Structure?

no code implementations CONLL 2017 Chris Dyer

On the generation front, I introduce recurrent neural network grammars (RNNGs), a joint, generative model of phrase-structure trees and sentences.

The Role of Context in Neural Morphological Disambiguation

no code implementations COLING 2016 Qinlan Shen, Daniel Clothiaux, Emily Tagtow, Patrick Littell, Chris Dyer

While morphological analyzers can reduce this sparsity by providing morpheme-level analyses for words, they will often introduce ambiguity by returning multiple analyses for the same surface form.

Morphological Disambiguation

Named Entity Recognition for Linguistic Rapid Response in Low-Resource Languages: Sorani Kurdish and Tajik

no code implementations COLING 2016 Patrick Littell, Kartik Goyal, David R. Mortensen, Alexa Little, Chris Dyer, Lori Levin

This paper describes our construction of named-entity recognition (NER) systems in two Western Iranian languages, Sorani Kurdish and Tajik, as a part of a pilot study of "Linguistic Rapid Response" to potential emergency humanitarian relief situations.

Humanitarian named-entity-recognition +2

Memory Architectures in Recurrent Neural Network Language Models

no code implementations ICLR 2018 Dani Yogatama, Yishu Miao, Gabor Melis, Wang Ling, Adhiguna Kuncoro, Chris Dyer, Phil Blunsom

We compare and analyze sequential, random access, and stack memory architectures for recurrent neural network language models.

Learning and Evaluating General Linguistic Intelligence

no code implementations 31 Jan 2019 Dani Yogatama, Cyprien de Masson d'Autume, Jerome Connor, Tomas Kocisky, Mike Chrzanowski, Lingpeng Kong, Angeliki Lazaridou, Wang Ling, Lei Yu, Chris Dyer, Phil Blunsom

We define general linguistic intelligence as the ability to reuse previously acquired knowledge about a language's lexicon, syntax, semantics, and pragmatic conventions to adapt to new tasks quickly.

Natural Language Understanding Question Answering

Discriminative Lexical Semantic Segmentation with Gaps: Running the MWE Gamut

no code implementations TACL 2014 Nathan Schneider, Emily Danchik, Chris Dyer, Noah A. Smith

We present a novel representation, evaluation measure, and supervised models for the task of identifying the multiword expressions (MWEs) in a sentence, resulting in a lexical semantic segmentation.

Chunking Segmentation +2

Locally Non-Linear Learning for Statistical Machine Translation via Discretization and Structured Regularization

no code implementations TACL 2014 Jonathan H. Clark, Chris Dyer, Alon Lavie

Linear models, which support efficient learning and inference, are the workhorses of statistical machine translation; however, linear decision rules are less attractive from a modeling perspective.

Feature Engineering Language Modelling +3

A Unified Annotation Scheme for the Semantic/Pragmatic Components of Definiteness

no code implementations LREC 2014 Archna Bhatia, Mandy Simons, Lori Levin, Yulia Tsvetkov, Chris Dyer, Jordan Bender

We present a definiteness annotation scheme that captures the semantic, pragmatic, and discourse information, which we call communicative functions, associated with linguistic descriptions such as "a story about my speech", "the story", "every time I give it", "this slideshow".

Machine Translation Specificity

Bridge-Language Capitalization Inference in Western Iranian: Sorani, Kurmanji, Zazaki, and Tajik

no code implementations LREC 2016 Patrick Littell, David R. Mortensen, Kartik Goyal, Chris Dyer, Lori Levin

In Sorani Kurdish, one of the most useful orthographic features in named-entity recognition---capitalization---is absent, as the language's Perso-Arabic script does not make a distinction between uppercase and lowercase letters.

named-entity-recognition Named Entity Recognition +1

Scalable Syntax-Aware Language Models Using Knowledge Distillation

no code implementations ACL 2019 Adhiguna Kuncoro, Chris Dyer, Laura Rimell, Stephen Clark, Phil Blunsom

Prior work has shown that, on small amounts of training data, syntactic neural language models learn structurally sensitive generalisations more successfully than sequential language models.

Knowledge Distillation Language Modelling +1

Shallow Syntax in Deep Water

no code implementations 29 Aug 2019 Swabha Swayamdipta, Matthew Peters, Brendan Roof, Chris Dyer, Noah A. Smith

Shallow syntax provides an approximation of phrase-syntactic structure of sentences; it can be produced with high accuracy, and is computationally cheap to obtain.

Better Document-Level Machine Translation with Bayes' Rule

no code implementations TACL 2020 Lei Yu, Laurent Sartran, Wojciech Stokowiec, Wang Ling, Lingpeng Kong, Phil Blunsom, Chris Dyer

We show that Bayes' rule provides an effective mechanism for creating document translation models that can be learned from only parallel sentences and monolingual documents---a compelling benefit as parallel documents are not always available.

Document Level Machine Translation Document Translation +4

Comparing Top-Down and Bottom-Up Neural Generative Dependency Models

no code implementations CONLL 2019 Austin Matthews, Graham Neubig, Chris Dyer

Recurrent neural network grammars generate sentences using phrase-structure syntax and perform very well on both parsing and language modeling.

Language Modelling

Transition-Based Dependency Parsing using Perceptron Learner

no code implementations 22 Jan 2020 Rahul Radhakrishnan Iyer, Miguel Ballesteros, Chris Dyer, Robert Frederking

Syntactic parsing using dependency structures has become a standard technique in natural language processing with many different parsing models, in particular data-driven models that can be trained on syntactically annotated corpora.

Transition-Based Dependency Parsing

Learning Robust and Multilingual Speech Representations

no code implementations Findings of the Association for Computational Linguistics 2020 Kazuya Kawakami, Luyu Wang, Chris Dyer, Phil Blunsom, Aaron van den Oord

Unsupervised speech representation learning has shown remarkable success at finding representations that correlate with phonetic structures and improve downstream speech recognition performance.

Representation Learning speech-recognition +1

A Probabilistic Generative Model for Typographical Analysis of Early Modern Printing

no code implementations ACL 2020 Kartik Goyal, Chris Dyer, Christopher Warren, Max G'Sell, Taylor Berg-Kirkpatrick

We show that our approach outperforms rigid interpretable clustering baselines (Ocular) and overly-flexible deep generative models (VAE) alike on the task of completely unsupervised discovery of typefaces in mixed-font documents.

Clustering

Syntactic Structure Distillation Pretraining For Bidirectional Encoders

no code implementations 27 May 2020 Adhiguna Kuncoro, Lingpeng Kong, Daniel Fried, Dani Yogatama, Laura Rimell, Chris Dyer, Phil Blunsom

Textual representation learners trained on large amounts of data have achieved notable success on downstream tasks; intriguingly, they have also performed well on challenging tests of syntactic competence.

Knowledge Distillation Language Modelling +3

Exposing the Implicit Energy Networks behind Masked Language Models via Metropolis--Hastings

no code implementations ICLR 2022 Kartik Goyal, Chris Dyer, Taylor Berg-Kirkpatrick

While recent work has shown that scores from models trained by the ubiquitous masked language modeling (MLM) objective effectively discriminate probable from improbable sequences, it is still an open question if these MLMs specify a principled probability distribution over the space of possible sequences.

Language Modelling Machine Translation +3
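
A generic Metropolis-Hastings step of the kind the title alludes to, with placeholder energy and proposal functions (not the paper's parameterization): a proposed sequence x' is accepted with the standard MH probability computed from the energies and proposal densities.

```python
import math, random

def mh_step(x, propose, energy, q_prob):
    x_new = propose(x)
    log_accept = (energy(x) - energy(x_new)              # -E(x') + E(x)
                  + math.log(q_prob(x, x_new))           # q(x | x')
                  - math.log(q_prob(x_new, x)))          # q(x' | x)
    return x_new if math.log(random.random()) < min(0.0, log_accept) else x

# Toy target: low energy for sequences summing to zero; symmetric proposal.
energy = lambda x: abs(sum(x))
propose = lambda x: [xi + random.choice([-1, 1]) for xi in x]
q_prob = lambda a, b: 1.0      # symmetric, so the q terms cancel
print(mh_step([3, -1], propose, energy, q_prob))
```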

Diverse Pretrained Context Encodings Improve Document Translation

no code implementations ACL 2021 Domenic Donato, Lei Yu, Chris Dyer

We propose a new architecture for adapting a sentence-level sequence-to-sequence transformer by incorporating multiple pretrained document context signals and assess the impact on translation performance of (1) different pretraining approaches for generating these signals, (2) the quantity of parallel data for which document context is available, and (3) conditioning on source, target, or source and target contexts.

Document Translation Sentence +1

Unsupervised Word Discovery with Segmental Neural Language Models

no code implementations 27 Sep 2018 Kazuya Kawakami, Chris Dyer, Phil Blunsom

We propose a segmental neural language model that combines the representational power of neural networks and the structure learning mechanism of Bayesian nonparametrics, and show that it learns to discover semantically meaningful units (e.g., morphemes and words) from unsegmented character sequences.

Language Modelling

Putting Machine Translation in Context with the Noisy Channel Model

no code implementations 25 Sep 2019 Lei Yu, Laurent Sartran, Wojciech Stokowiec, Wang Ling, Lingpeng Kong, Phil Blunsom, Chris Dyer

We show that Bayes' rule provides a compelling mechanism for controlling unconditional document language models, using the long-standing challenge of effectively leveraging document context in machine translation.

Document Translation Language Modelling +3

Unsupervised Learning of Efficient and Robust Speech Representations

no code implementations 25 Sep 2019 Kazuya Kawakami, Luyu Wang, Chris Dyer, Phil Blunsom, Aaron van den Oord

We present an unsupervised method for learning speech representations based on a bidirectional contrastive predictive coding that implicitly discovers phonetic structure from large-scale corpora of unlabelled raw audio signals.

speech-recognition Speech Recognition
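
A minimal InfoNCE-style loss of the kind contrastive predictive coding builds on (a generic sketch, not the paper's bidirectional model): the context vector should score its true future representation above negatives drawn from elsewhere in the corpus.

```python
import numpy as np

def info_nce(context, positive, negatives):
    cands = np.vstack([positive[None, :], negatives])  # true item first
    scores = cands @ context                           # dot-product scores
    scores -= scores.max()                             # numerical stability
    return -np.log(np.exp(scores[0]) / np.exp(scores).sum())

rng = np.random.default_rng(0)
c = rng.standard_normal(8)
pos = c + 0.1 * rng.standard_normal(8)      # correlated "future" latent
negs = rng.standard_normal((5, 8))
print(info_nce(c, pos, negs))
```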

Relative Pixel Prediction For Autoregressive Image Generation

no code implementations 25 Sep 2019 Wang Ling, Chris Dyer, Lei Yu, Lingpeng Kong, Dani Yogatama, Susannah Young

In natural images, transitions between adjacent pixels tend to be smooth and gradual, a fact that has long been exploited in image compression models based on predictive coding.

Colorization Image Colorization +4

Enabling arbitrary translation objectives with Adaptive Tree Search

no code implementations ICLR 2022 Wang Ling, Wojciech Stokowiec, Domenic Donato, Laurent Sartran, Lei Yu, Austin Matthews, Chris Dyer

When applied to autoregressive models, our algorithm has different biases than beam search has, which enables a new analysis of the role of decoding bias in autoregressive models.

Translation

Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale

no code implementations 1 Mar 2022 Laurent Sartran, Samuel Barrett, Adhiguna Kuncoro, Miloš Stanojević, Phil Blunsom, Chris Dyer

We find that TGs outperform various strong baselines on sentence-level language modeling perplexity, as well as on multiple syntax-sensitive language modeling evaluation metrics.

Inductive Bias Language Modelling +1

MAD for Robust Reinforcement Learning in Machine Translation

no code implementations 18 Jul 2022 Domenic Donato, Lei Yu, Wang Ling, Chris Dyer

We introduce a new distributed policy gradient algorithm and show that it outperforms existing reward-aware training procedures such as REINFORCE, minimum risk training (MRT) and proximal policy optimization (PPO) in terms of training stability and generalization performance when optimizing machine translation models.

Machine Translation reinforcement-learning +3

Continuous diffusion for categorical data

no code implementations 28 Nov 2022 Sander Dieleman, Laurent Sartran, Arman Roshannai, Nikolay Savinov, Yaroslav Ganin, Pierre H. Richemond, Arnaud Doucet, Robin Strudel, Chris Dyer, Conor Durkan, Curtis Hawthorne, Rémi Leblond, Will Grathwohl, Jonas Adler

Diffusion models have quickly become the go-to paradigm for generative modelling of perceptual signals (such as images and sound) through iterative refinement.

Language Modelling

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

no code implementations8 Mar 2024 Machel Reid, Nikolay Savinov, Denis Teplyashin, Dmitry Lepikhin, Timothy Lillicrap, Jean-Baptiste Alayrac, Radu Soricut, Angeliki Lazaridou, Orhan Firat, Julian Schrittwieser, Ioannis Antonoglou, Rohan Anil, Sebastian Borgeaud, Andrew Dai, Katie Millican, Ethan Dyer, Mia Glaese, Thibault Sottiaux, Benjamin Lee, Fabio Viola, Malcolm Reynolds, Yuanzhong Xu, James Molloy, Jilin Chen, Michael Isard, Paul Barham, Tom Hennigan, Ross Mcilroy, Melvin Johnson, Johan Schalkwyk, Eli Collins, Eliza Rutherford, Erica Moreira, Kareem Ayoub, Megha Goel, Clemens Meyer, Gregory Thornton, Zhen Yang, Henryk Michalewski, Zaheer Abbas, Nathan Schucher, Ankesh Anand, Richard Ives, James Keeling, Karel Lenc, Salem Haykal, Siamak Shakeri, Pranav Shyam, Aakanksha Chowdhery, Roman Ring, Stephen Spencer, Eren Sezener, Luke Vilnis, Oscar Chang, Nobuyuki Morioka, George Tucker, Ce Zheng, Oliver Woodman, Nithya Attaluri, Tomas Kocisky, Evgenii Eltyshev, Xi Chen, Timothy Chung, Vittorio Selo, Siddhartha Brahma, Petko Georgiev, Ambrose Slone, Zhenkai Zhu, James Lottes, Siyuan Qiao, Ben Caine, Sebastian Riedel, Alex Tomala, Martin Chadwick, Juliette Love, Peter Choy, Sid Mittal, Neil Houlsby, Yunhao Tang, Matthew Lamm, Libin Bai, Qiao Zhang, Luheng He, Yong Cheng, Peter Humphreys, Yujia Li, Sergey Brin, Albin Cassirer, Yingjie Miao, Lukas Zilka, Taylor Tobin, Kelvin Xu, Lev Proleev, Daniel Sohn, Alberto Magni, Lisa Anne Hendricks, Isabel Gao, Santiago Ontañón, Oskar Bunyan, Nathan Byrd, Abhanshu Sharma, Biao Zhang, Mario Pinto, Rishika Sinha, Harsh Mehta, Dawei Jia, Sergi Caelles, Albert Webson, Alex Morris, Becca Roelofs, Yifan Ding, Robin Strudel, Xuehan Xiong, Marvin Ritter, Mostafa Dehghani, Rahma Chaabouni, Abhijit Karmarkar, Guangda Lai, Fabian Mentzer, Bibo Xu, Yaguang Li, Yujing Zhang, Tom Le Paine, Alex Goldin, Behnam Neyshabur, Kate Baumli, Anselm Levskaya, Michael Laskin, Wenhao Jia, Jack W. 
Rae, Kefan Xiao, Antoine He, Skye Giordano, Lakshman Yagati, Jean-Baptiste Lespiau, Paul Natsev, Sanjay Ganapathy, Fangyu Liu, Danilo Martins, Nanxin Chen, Yunhan Xu, Megan Barnes, Rhys May, Arpi Vezer, Junhyuk Oh, Ken Franko, Sophie Bridgers, Ruizhe Zhao, Boxi Wu, Basil Mustafa, Sean Sechrist, Emilio Parisotto, Thanumalayan Sankaranarayana Pillai, Chris Larkin, Chenjie Gu, Christina Sorokin, Maxim Krikun, Alexey Guseynov, Jessica Landon, Romina Datta, Alexander Pritzel, Phoebe Thacker, Fan Yang, Kevin Hui, Anja Hauth, Chih-Kuan Yeh, David Barker, Justin Mao-Jones, Sophia Austin, Hannah Sheahan, Parker Schuh, James Svensson, Rohan Jain, Vinay Ramasesh, Anton Briukhov, Da-Woon Chung, Tamara von Glehn, Christina Butterfield, Priya Jhakra, Matthew Wiethoff, Justin Frye, Jordan Grimstad, Beer Changpinyo, Charline Le Lan, Anna Bortsova, Yonghui Wu, Paul Voigtlaender, Tara Sainath, Charlotte Smith, Will Hawkins, Kris Cao, James Besley, Srivatsan Srinivasan, Mark Omernick, Colin Gaffney, Gabriela Surita, Ryan Burnell, Bogdan Damoc, Junwhan Ahn, Andrew Brock, Mantas Pajarskas, Anastasia Petrushkina, Seb Noury, Lorenzo Blanco, Kevin Swersky, Arun Ahuja, Thi Avrahami, Vedant Misra, Raoul de Liedekerke, Mariko Iinuma, Alex Polozov, Sarah York, George van den Driessche, Paul Michel, Justin Chiu, Rory Blevins, Zach Gleicher, Adrià Recasens, Alban Rrustemi, Elena Gribovskaya, Aurko Roy, Wiktor Gworek, Séb Arnold, Lisa Lee, James Lee-Thorp, Marcello Maggioni, Enrique Piqueras, Kartikeya Badola, Sharad Vikram, Lucas Gonzalez, Anirudh Baddepudi, Evan Senter, Jacob Devlin, James Qin, Michael Azzam, Maja Trebacz, Martin Polacek, Kashyap Krishnakumar, Shuo-Yiin Chang, Matthew Tung, Ivo Penchev, Rishabh Joshi, Kate Olszewska, Carrie Muir, Mateo Wirth, Ale Jakse Hartman, Josh Newlan, Sheleem Kashem, Vijay Bolina, Elahe Dabir, Joost van Amersfoort, Zafarali Ahmed, James Cobon-Kerr, Aishwarya Kamath, Arnar Mar Hrafnkelsson, Le Hou, Ian Mackinnon, Alexandre Frechette, Eric Noland, Xiance Si, Emanuel Taropa, Dong Li, Phil Crone, Anmol Gulati, Sébastien Cevey, Jonas Adler, Ada Ma, David Silver, Simon Tokumine, Richard Powell, Stephan Lee, Michael Chang, Samer Hassan, Diana Mincu, Antoine Yang, Nir Levine, Jenny Brennan, Mingqiu Wang, Sarah Hodkinson, Jeffrey Zhao, Josh Lipschultz, Aedan Pope, Michael B. 
Chang, Cheng Li, Laurent El Shafey, Michela Paganini, Sholto Douglas, Bernd Bohnet, Fabio Pardo, Seth Odoom, Mihaela Rosca, Cicero Nogueira dos santos, Kedar Soparkar, Arthur Guez, Tom Hudson, Steven Hansen, Chulayuth Asawaroengchai, Ravi Addanki, Tianhe Yu, Wojciech Stokowiec, Mina Khan, Justin Gilmer, Jaehoon Lee, Carrie Grimes Bostock, Keran Rong, Jonathan Caton, Pedram Pejman, Filip Pavetic, Geoff Brown, Vivek Sharma, Mario Lučić, Rajkumar Samuel, Josip Djolonga, Amol Mandhane, Lars Lowe Sjösund, Elena Buchatskaya, Elspeth White, Natalie Clay, Jiepu Jiang, Hyeontaek Lim, Ross Hemsley, Jane Labanowski, Nicola De Cao, David Steiner, Sayed Hadi Hashemi, Jacob Austin, Anita Gergely, Tim Blyth, Joe Stanton, Kaushik Shivakumar, Aditya Siddhant, Anders Andreassen, Carlos Araya, Nikhil Sethi, Rakesh Shivanna, Steven Hand, Ankur Bapna, Ali Khodaei, Antoine Miech, Garrett Tanzer, Andy Swing, Shantanu Thakoor, Zhufeng Pan, Zachary Nado, Stephanie Winkler, Dian Yu, Mohammad Saleh, Loren Maggiore, Iain Barr, Minh Giang, Thais Kagohara, Ivo Danihelka, Amit Marathe, Vladimir Feinberg, Nimesh Ghelani, Dan Horgan, Helen Miller, Lexi Walker, Richard Tanburn, Mukarram Tariq, Disha Shrivastava, Fei Xia, Chung-Cheng Chiu, Khuslen Baatarsukh, Sina Samangooei, Fred Alcober, Axel Stjerngren, Paul Komarek, Katerina Tsihlas, Anudhyan Boral, Ramona Comanescu, Jeremy Chen, Ruibo Liu, Dawn Bloxwich, Charlie Chen, Yanhua Sun, Fangxiaoyu Feng, Matthew Mauger, Xerxes Dotiwalla, Vincent Hellendoorn, Michael Sharman, Ivy Zheng, Krishna Haridasan, Gabe Barth-Maron, Craig Swanson, Dominika Rogozińska, Alek Andreev, Paul Kishan Rubenstein, Ruoxin Sang, Dan Hurt, Gamaleldin Elsayed, Renshen Wang, Dave Lacey, Anastasija Ilić, Yao Zhao, Lora Aroyo, Chimezie Iwuanyanwu, Vitaly Nikolaev, Balaji Lakshminarayanan, Sadegh Jazayeri, Raphaël Lopez Kaufman, Mani Varadarajan, Chetan Tekur, Doug Fritz, Misha Khalman, David Reitter, Kingshuk Dasgupta, Shourya Sarcar, Tina Ornduff, Javier Snaider, Fantine Huot, Johnson Jia, Rupert Kemp, Nejc Trdin, Anitha Vijayakumar, Lucy Kim, Christof Angermueller, Li Lao, Tianqi Liu, Haibin Zhang, David Engel, Somer Greene, Anaïs White, Jessica Austin, Lilly Taylor, Shereen Ashraf, Dangyi Liu, Maria Georgaki, Irene Cai, Yana Kulizhskaya, Sonam Goenka, Brennan Saeta, Kiran Vodrahalli, Christian Frank, Dario de Cesare, Brona Robenek, Harry Richardson, Mahmoud Alnahlawi, Christopher Yew, Priya Ponnapalli, Marco Tagliasacchi, Alex Korchemniy, Yelin Kim, Dinghua Li, Bill Rosgen, Zoe Ashwood, Kyle Levin, Jeremy Wiesner, Praseem Banzal, Praveen Srinivasan, Hongkun Yu, Çağlar Ünlü, David Reid, Zora Tung, Daniel Finchelstein, Ravin Kumar, Andre Elisseeff, Jin Huang, Ming Zhang, Rui Zhu, Ricardo Aguilar, Mai Giménez, Jiawei Xia, Olivier Dousse, Willi Gierke, Soheil Hassas Yeganeh, Damion Yates, Komal Jalan, Lu Li, Eri Latorre-Chimoto, Duc Dung Nguyen, Ken Durden, Praveen Kallakuri, Yaxin Liu, Matthew Johnson, Tomy Tsai, Alice Talbert, Jasmine Liu, Alexander Neitz, Chen Elkind, Marco Selvi, Mimi Jasarevic, Livio Baldini Soares, Albert Cui, Pidong Wang, Alek Wenjiao Wang, Xinyu Ye, Krystal Kallarackal, Lucia Loher, Hoi Lam, Josef Broder, Dan Holtmann-Rice, Nina Martin, Bramandia Ramadhana, Daniel Toyama, Mrinal Shukla, Sujoy Basu, Abhi Mohan, Nick Fernando, Noah Fiedel, Kim Paterson, Hui Li, Ankush Garg, Jane Park, DongHyun Choi, Diane Wu, Sankalp Singh, Zhishuai Zhang, Amir Globerson, Lily Yu, John Carpenter, Félix de Chaumont Quitry, Carey Radebaugh, Chu-Cheng Lin, Alex Tudor, Prakash Shroff, Drew Garmon, 
Dayou Du, Neera Vats, Han Lu, Shariq Iqbal, Alex Yakubovich, Nilesh Tripuraneni, James Manyika, Haroon Qureshi, Nan Hua, Christel Ngani, Maria Abi Raad, Hannah Forbes, Anna Bulanova, Jeff Stanway, Mukund Sundararajan, Victor Ungureanu, Colton Bishop, Yunjie Li, Balaji Venkatraman, Bo Li, Chloe Thornton, Salvatore Scellato, Nishesh Gupta, Yicheng Wang, Ian Tenney, Xihui Wu, Ashish Shenoy, Gabriel Carvajal, Diana Gage Wright, Ben Bariach, Zhuyun Xiao, Peter Hawkins, Sid Dalmia, Clement Farabet, Pedro Valenzuela, Quan Yuan, Chris Welty, Ananth Agarwal, Mia Chen, Wooyeol Kim, Brice Hulse, Nandita Dukkipati, Adam Paszke, Andrew Bolt, Elnaz Davoodi, Kiam Choo, Jennifer Beattie, Jennifer Prendki, Harsha Vashisht, Rebeca Santamaria-Fernandez, Luis C. Cobo, Jarek Wilkiewicz, David Madras, Ali Elqursh, Grant Uy, Kevin Ramirez, Matt Harvey, Tyler Liechty, Heiga Zen, Jeff Seibert, Clara Huiyi Hu, Mohamed Elhawaty, Andrey Khorlin, Maigo Le, Asaf Aharoni, Megan Li, Lily Wang, Sandeep Kumar, Alejandro Lince, Norman Casagrande, Jay Hoover, Dalia El Badawy, David Soergel, Denis Vnukov, Matt Miecnikowski, Jiri Simsa, Anna Koop, Praveen Kumar, Thibault Sellam, Daniel Vlasic, Samira Daruki, Nir Shabat, John Zhang, Guolong Su, Jiageng Zhang, Jeremiah Liu, Yi Sun, Evan Palmer, Alireza Ghaffarkhah, Xi Xiong, Victor Cotruta, Michael Fink, Lucas Dixon, Ashwin Sreevatsa, Adrian Goedeckemeyer, Alek Dimitriev, Mohsen Jafari, Remi Crocker, Nicholas FitzGerald, Aviral Kumar, Sanjay Ghemawat, Ivan Philips, Frederick Liu, Yannie Liang, Rachel Sterneck, Alena Repina, Marcus Wu, Laura Knight, Marin Georgiev, Hyo Lee, Harry Askham, Abhishek Chakladar, Annie Louis, Carl Crous, Hardie Cate, Dessie Petrova, MICHAEL QUINN, Denese Owusu-Afriyie, Achintya Singhal, Nan Wei, Solomon Kim, Damien Vincent, Milad Nasr, Christopher A. Choquette-Choo, Reiko Tojo, Shawn Lu, Diego de Las Casas, Yuchung Cheng, Tolga Bolukbasi, Katherine Lee, Saaber Fatehi, Rajagopal Ananthanarayanan, Miteyan Patel, Charbel Kaed, Jing Li, Jakub Sygnowski, Shreyas Rammohan Belle, Zhe Chen, Jaclyn Konzelmann, Siim Põder, Roopal Garg, Vinod Koverkathu, Adam Brown, Chris Dyer, Rosanne Liu, Azade Nova, Jun Xu, Slav Petrov, Demis Hassabis, Koray Kavukcuoglu, Jeffrey Dean, Oriol Vinyals

In this report, we present the latest model of the Gemini family, Gemini 1.5 Pro, a highly compute-efficient multimodal mixture-of-experts model capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio.

Code Generation Retrieval
