Search Results for author: Michael Collins

Found 52 papers, 15 papers with code

Investigating the Effect of Background Knowledge on Natural Questions

no code implementations NAACL (DeeLIO) 2021 Vidhisha Balachandran, Bhuwan Dhingra, Haitian Sun, Michael Collins, William Cohen

We create a subset of the NQ data, Factual Questions (FQ), where the questions have evidence in the KB in the form of paths that link question entities to answer entities but still must be answered using text, to facilitate further research into KB integration methods.

Unsupervised Cross-Lingual Part-of-Speech Tagging for Truly Low-Resource Scenarios

no code implementations EMNLP 2020 Ramy Eskander, Smaranda Muresan, Michael Collins

Our approach innovates in three ways: 1) a robust approach of selecting training instances via cross-lingual annotation projection that exploits best practices of unsupervised type and token constraints, word-alignment confidence and density of projected POS, 2) a Bi-LSTM architecture that uses contextualized word embeddings, affix embeddings and hierarchical Brown clusters, and 3) an evaluation on 12 diverse languages in terms of language family and morphological typology.

Cross-Lingual Transfer Part-Of-Speech Tagging +3

A Well-Composed Text is Half Done! Composition Sampling for Diverse Conditional Generation

1 code implementation ACL 2022 Shashi Narayan, Gonçalo Simões, Yao Zhao, Joshua Maynez, Dipanjan Das, Michael Collins, Mirella Lapata

We propose Composition Sampling, a simple but effective method to generate diverse outputs for conditional generation of higher quality compared to previous stochastic decoding strategies.

Question Generation

Measuring Attribution in Natural Language Generation Models

no code implementations 23 Dec 2021 Hannah Rashkin, Vitaly Nikolaev, Matthew Lamm, Michael Collins, Dipanjan Das, Slav Petrov, Gaurav Singh Tomar, Iulia Turc, David Reitter

With recent improvements in natural language generation (NLG) models for various applications, it has become imperative to have the means to identify and evaluate whether NLG output shares only verifiable information about the external world.

Text Generation

Partially Supervised Named Entity Recognition via the Expected Entity Ratio Loss

1 code implementation 16 Aug 2021 Thomas Effland, Michael Collins

We study learning named entity recognizers in the presence of missing entity annotations.

Named Entity Recognition

On planetary systems as ordered sequences

no code implementations 20 May 2021 Emily Sandford, David Kipping, Michael Collins

A planetary system consists of a host star and one or more planets, arranged into a particular configuration.

Selection bias Unsupervised Part-Of-Speech Tagging

Decontextualization: Making Sentences Stand-Alone

no code implementations 9 Feb 2021 Eunsol Choi, Jennimaria Palomaki, Matthew Lamm, Tom Kwiatkowski, Dipanjan Das, Michael Collins

Models for question answering, dialogue agents, and summarization often interpret the meaning of a sentence in a rich context and use that meaning in a new context.

Question Answering

Evaluating Explanations: How much do explanations from the teacher aid students?

1 code implementation 1 Dec 2020 Danish Pruthi, Rachit Bansal, Bhuwan Dhingra, Livio Baldini Soares, Michael Collins, Zachary C. Lipton, Graham Neubig, William W. Cohen

While many methods purport to explain predictions by highlighting salient features, what aims these explanations serve and how they ought to be evaluated often go unstated.

Question Answering Text Classification

QED: A Framework and Dataset for Explanations in Question Answering

1 code implementation 8 Sep 2020 Matthew Lamm, Jennimaria Palomaki, Chris Alberti, Daniel Andor, Eunsol Choi, Livio Baldini Soares, Michael Collins

A question answering system that in addition to providing an answer provides an explanation of the reasoning that leads to that answer has potential advantages in terms of debuggability, extensibility and trust.

Explanation Generation Question Answering

Sparse, Dense, and Attentional Representations for Text Retrieval

1 code implementation 1 May 2020 Yi Luan, Jacob Eisenstein, Kristina Toutanova, Michael Collins

Dual encoders perform retrieval by encoding documents and queries into dense low-dimensional vectors, scoring each document by its inner product with the query.

Open-Domain Question Answering

Fusion of Detected Objects in Text for Visual Question Answering

1 code implementation IJCNLP 2019 Chris Alberti, Jeffrey Ling, Michael Collins, David Reitter

To advance models of multimodal context, we introduce a simple yet powerful neural architecture for data that combines vision and natural language.

Question Answering Visual Commonsense Reasoning +1

Synthetic QA Corpora Generation with Roundtrip Consistency

3 code implementations ACL 2019 Chris Alberti, Daniel Andor, Emily Pitler, Jacob Devlin, Michael Collins

We introduce a novel method of generating synthetic question answering corpora by combining models of question generation and answer extraction, and by filtering the results to ensure roundtrip consistency.

Question Answering Question Generation +1

BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions

1 code implementation NAACL 2019 Christopher Clark, Kenton Lee, Ming-Wei Chang, Tom Kwiatkowski, Michael Collins, Kristina Toutanova

In this paper we study yes/no questions that are naturally occurring, meaning that they are generated in unprompted and unconstrained settings.

Reading Comprehension Transfer Learning

Low-Resource Syntactic Transfer with Unsupervised Source Reordering

no code implementations NAACL 2019 Mohammad Sadegh Rasooli, Michael Collins

We describe a cross-lingual transfer method for dependency parsing that takes into account the problem of word order differences between source and target languages.

Cross-Lingual Transfer Dependency Parsing

Improving Span-based Question Answering Systems with Coarsely Labeled Data

no code implementations 5 Nov 2018 Hao Cheng, Ming-Wei Chang, Kenton Lee, Ankur Parikh, Michael Collins, Kristina Toutanova

We study approaches to improve fine-grained short answer Question Answering models by integrating coarse-grained data annotated for paragraph-level relevance and show that coarsely annotated data can bring significant performance gains.

Multi-Task Learning Question Answering

Noise Contrastive Estimation and Negative Sampling for Conditional Models: Consistency and Statistical Efficiency

no code implementations EMNLP 2018 Zhuang Ma, Michael Collins

Noise Contrastive Estimation (NCE) is a powerful parameter estimation method for log-linear models, which avoids calculation of the partition function or its derivatives at each training step, a computationally demanding step in many cases.

Classification General Classification +2

Source-Side Left-to-Right or Target-Side Left-to-Right? An Empirical Comparison of Two Phrase-Based Decoding Algorithms

no code implementations EMNLP 2017 Yin-Wen Chang, Michael Collins

The algorithm produces a translation by processing the source-language sentence in strictly left-to-right order, differing from commonly used approaches that build the target-language sentence in left-to-right order.

Machine Translation Translation

Kernel Approximation Methods for Speech Recognition

no code implementations 13 Jan 2017 Avner May, Alireza Bagheri Garakani, Zhiyun Lu, Dong Guo, Kuan Liu, Aurélien Bellet, Linxi Fan, Michael Collins, Daniel Hsu, Brian Kingsbury, Michael Picheny, Fei Sha

First, in order to reduce the number of random features required by kernel models, we propose a simple but effective method for feature selection.

Frame Speech Recognition

A Polynomial-Time Dynamic Programming Algorithm for Phrase-Based Decoding with a Fixed Distortion Limit

no code implementations TACL 2017 Yin-Wen Chang, Michael Collins

Decoding of phrase-based translation models in the general case is known to be NP-complete, by a reduction from the traveling salesman problem (Knight, 1999).

Machine Translation Translation +1

Cross-Lingual Syntactic Transfer with Limited Resources

1 code implementation TACL 2017 Mohammad Sadegh Rasooli, Michael Collins

We describe a simple but effective method for cross-lingual syntactic transfer of dependency parsers, in the scenario where a large amount of translation data is not available.


Globally Normalized Transition-Based Neural Networks

1 code implementation ACL 2016 Daniel Andor, Chris Alberti, David Weiss, Aliaksei Severyn, Alessandro Presta, Kuzman Ganchev, Slav Petrov, Michael Collins

Our model is a simple feed-forward neural network that operates on a task-specific transition system, yet achieves comparable or better accuracies than recurrent models.

Dependency Parsing Part-Of-Speech Tagging +1

Unsupervised Part-Of-Speech Tagging with Anchor Hidden Markov Models

1 code implementation TACL 2016 Karl Stratos, Michael Collins, Daniel Hsu

We tackle unsupervised part-of-speech (POS) tagging by learning hidden Markov models (HMMs) that are particularly well-suited for the problem.


Transforming Dependency Structures to Logical Forms for Semantic Parsing

1 code implementation TACL 2016 Siva Reddy, Oscar Täckström, Michael Collins, Tom Kwiatkowski, Dipanjan Das, Mark Steedman, Mirella Lapata

In contrast, partly due to the lack of a strong type system, dependency structures are easy to annotate and have become a widely used form of syntactic analysis for many languages.

Question Answering Semantic Parsing +1

Learning Dictionaries for Named Entity Recognition using Minimal Supervision

no code implementations EACL 2014 Arvind Neelakantan, Michael Collins

This paper describes an approach for automatic construction of dictionaries for Named Entity Recognition (NER) using large amounts of unlabeled data and a few seed examples.

Named Entity Recognition NER +1

A Tutorial on Dual Decomposition and Lagrangian Relaxation for Inference in Natural Language Processing

no code implementations 23 Jan 2014 Alexander M. Rush, Michael Collins

Dual decomposition, and more generally Lagrangian relaxation, is a classical method for combinatorial optimization; it has recently been applied to several inference problems in natural language processing (NLP).

Combinatorial Optimization

Tensor Decomposition for Fast Parsing with Latent-Variable PCFGs

no code implementations NeurIPS 2012 Michael Collins, Shay B. Cohen

We describe an approach to speed up inference with latent-variable PCFGs, which have been shown to be highly effective for natural language parsing.

Tensor Decomposition

Learning Label Embeddings for Nearest-Neighbor Multi-class Classification with an Application to Speech Recognition

no code implementations NeurIPS 2009 Natasha Singh-Miller, Michael Collins

We consider the problem of using nearest neighbor methods to provide a conditional probability estimate, P(y|a), when the number of labels y is large and the labels share some underlying structure.

General Classification Multi-class Classification +1
