no code implementations • ACL 2022 • Jiawei Zhou, Jason Eisner, Michael Newman, Emmanouil Antonios Platanios, Sam Thomson
Standard conversational semantic parsing maps a complete user utterance into an executable program, after which the program is executed to respond to the user.
no code implementations • 20 Jun 2024 • Yunmo Chen, Tongfei Chen, Harsh Jhamtani, Patrick Xia, Richard Shin, Jason Eisner, Benjamin Van Durme
We introduce iterative retrieval, a novel framework that empowers retrievers to make iterative decisions through policy optimization.
1 code implementation • 7 Mar 2024 • Boshi Wang, Hao Fang, Jason Eisner, Benjamin Van Durme, Yu Su
We find that existing LLMs, including GPT-4 and open-source LLMs specifically fine-tuned for tool use, only reach a correctness rate in the range of 30% to 60%, far from reliable use in practice.
no code implementations • 29 Dec 2023 • Li Du, Afra Amini, Lucas Torroba Hennigen, Xinyan Velocity Yu, Jason Eisner, Holden Lee, Ryan Cotterell
Recent papers have demonstrated the possibility of energy-based text generation by adapting gradient-based sampling algorithms, a paradigm of MCMC algorithms that promises fast convergence.
no code implementations • 28 Dec 2023 • Sky CH-Wang, Benjamin Van Durme, Jason Eisner, Chris Kedzie
We design probes trained on the internal representations of a transformer language model to predict its hallucinatory behavior on three grounded generation tasks.
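Probing of this kind is often implemented as a lightweight classifier over cached hidden states. The sketch below is only an illustration of that idea, with mean pooling, a logistic-regression probe, and synthetic stand-in data; it is not claimed to match the paper's exact setup.

```python
# Hypothetical sketch of a hallucination probe: a linear classifier trained on
# pooled transformer hidden states to predict a binary "hallucinated" label.
# The pooling, layer choice, and classifier are illustrative assumptions, and
# the random data below stands in for real cached activations and labels.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_examples, hidden_dim = 200, 768

# Stand-ins for mean-pooled hidden states and human hallucination judgments.
X = rng.normal(size=(n_examples, hidden_dim))
y = rng.integers(0, 2, size=n_examples)

probe = LogisticRegression(max_iter=1000).fit(X, y)
print("training accuracy of the probe:", probe.score(X, y))
```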
no code implementations • 21 Dec 2023 • Weiting Tan, Chu-Cheng Lin, Jason Eisner
In this paper, we focus on the resulting challenge of imputing the latent alignment path that explains a given pair of input and output strings (e.g., during training).
1 code implementation • 4 Dec 2023 • Giovanni Monea, Maxime Peyrard, Martin Josifoski, Vishrav Chaudhary, Jason Eisner, Emre Kiciman, Hamid Palangi, Barun Patra, Robert West
We present a novel method to study grounding abilities using Fakepedia, a novel dataset of counterfactual texts constructed to clash with a model's internal parametric knowledge.
1 code implementation • 16 Nov 2023 • Nikita Moghe, Patrick Xia, Jacob Andreas, Jason Eisner, Benjamin Van Durme, Harsh Jhamtani
Users of natural language interfaces, generally powered by Large Language Models (LLMs), often must repeat their preferences each time they make a similar request.
1 code implementation • 20 Sep 2023 • Kumar Shridhar, Harsh Jhamtani, Hao Fang, Benjamin Van Durme, Jason Eisner, Patrick Xia
To enable exploration in this space, we present SCREWS, a modular framework for reasoning with revisions.
no code implementations • 8 Jul 2023 • Belinda Z. Li, Jason Eisner, Adam Pauls, Sam Thomson
Voice dictation is an increasingly important text input modality.
1 code implementation • 6 Jul 2023 • Andreas Opedal, Ran Zmigrod, Tim Vieira, Ryan Cotterell, Jason Eisner
This paper provides a reference description, in the form of a deduction system, of Earley's (1970) context-free parsing algorithm with various speed-ups.
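As a reminder of what a deduction-system presentation looks like, the classic (unoptimized) Earley items and inference rules can be written roughly as follows; the paper's exact item format and speed-ups may differ.

```latex
% Classic Earley deduction system (a standard textbook rendering; the paper's
% item format and optimizations may differ). An item [i, j, A -> alpha . beta]
% asserts that alpha derives the input span w_{i+1..j}.
\begin{align*}
\textsc{Axiom:}    &\quad [0,\,0,\; S' \to {\bullet}\, S] \\[4pt]
\textsc{Predict:}  &\quad \frac{[i,\,j,\; A \to \alpha \,{\bullet}\, B\beta] \qquad B \to \gamma \in \mathcal{G}}
                          {[j,\,j,\; B \to {\bullet}\,\gamma]} \\[4pt]
\textsc{Scan:}     &\quad \frac{[i,\,j,\; A \to \alpha \,{\bullet}\, a\beta] \qquad w_{j+1} = a}
                          {[i,\,j{+}1,\; A \to \alpha a \,{\bullet}\, \beta]} \\[4pt]
\textsc{Complete:} &\quad \frac{[i,\,k,\; A \to \alpha \,{\bullet}\, B\beta] \qquad [k,\,j,\; B \to \gamma\,{\bullet}]}
                          {[i,\,j,\; A \to \alpha B \,{\bullet}\, \beta]} \\[4pt]
\textsc{Goal:}     &\quad [0,\,n,\; S' \to S\,{\bullet}]
\end{align*}
```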
1 code implementation • 31 May 2023 • Jessy Lin, Nicholas Tomlin, Jacob Andreas, Jason Eisner
We describe a class of tasks called decision-oriented dialogues, in which AI assistants such as large language models (LMs) must collaborate with one or more humans via natural language to help them make complex decisions.
no code implementations • 20 May 2023 • Li Du, Hongyuan Mei, Jason Eisner
To predict the next token, autoregressive models ordinarily examine the past.
1 code implementation • 17 Jan 2023 • Anej Svete, Benjamin Dayan, Tim Vieira, Ryan Cotterell, Jason Eisner
The pathsum in ordinary acyclic WFSAs is efficiently computed by the backward algorithm in time $O(|E|)$, where $E$ is the set of transitions.
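For concreteness, here is a minimal sketch of that backward pass over an acyclic WFSA in the real semiring; the transition format and toy automaton are illustrative assumptions, not the paper's code.

```python
# Minimal sketch of the backward algorithm for an *acyclic* WFSA over the real
# semiring, running in O(|E|) once states are in topological order.
from collections import defaultdict

def pathsum(transitions, initial, final, states_in_topological_order):
    """transitions: list of (source, weight, target); initial/final: dicts
    mapping state -> weight. Returns the total weight of all accepting paths."""
    out = defaultdict(list)
    for q, w, r in transitions:
        out[q].append((w, r))

    beta = {}
    for q in reversed(states_in_topological_order):
        # beta[q] = total weight of all paths from q to a final state
        beta[q] = final.get(q, 0.0) + sum(w * beta[r] for w, r in out[q])
    return sum(lam * beta[q] for q, lam in initial.items())

# Toy automaton: two parallel arcs 0 -> 1, then one arc 1 -> 2.
print(pathsum(
    transitions=[(0, 0.5, 1), (0, 0.25, 1), (1, 2.0, 2)],
    initial={0: 1.0},
    final={2: 1.0},
    states_in_topological_order=[0, 1, 2],
))  # 1.0 * (0.5 + 0.25) * 2.0 = 1.5
```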
no code implementations • 20 Dec 2022 • Li Du, Lucas Torroba Hennigen, Tiago Pimentel, Clara Meister, Jason Eisner, Ryan Cotterell
Language modeling, a central task in natural language processing, involves estimating a probability distribution over strings.
no code implementations • 20 Dec 2022 • FatemehSadat Mireshghallah, Yu Su, Tatsunori Hashimoto, Jason Eisner, Richard Shin
Task-oriented dialogue systems often assist users with personal or confidential matters.
2 code implementations • 27 Oct 2022 • Xiang Lisa Li, Ari Holtzman, Daniel Fried, Percy Liang, Jason Eisner, Tatsunori Hashimoto, Luke Zettlemoyer, Mike Lewis
We propose contrastive decoding (CD), a reliable decoding approach that optimizes a contrastive objective subject to a plausibility constraint.
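Roughly, a single CD step scores each candidate token by the difference of expert and amateur log-probabilities, restricted to tokens the expert itself finds plausible. The sketch below illustrates this with toy distributions and an assumed threshold alpha; it omits the paper's full search procedure.

```python
# Rough sketch of one contrastive-decoding step over next-token distributions
# from a large "expert" LM and a small "amateur" LM. The alpha threshold and
# toy probabilities are illustrative, not the paper's exact settings.
import numpy as np

def contrastive_step(p_expert, p_amateur, alpha=0.1):
    """Pick the next token by expert-minus-amateur log-probability, restricted
    to tokens the expert itself considers plausible."""
    plausible = p_expert >= alpha * p_expert.max()
    scores = np.where(plausible,
                      np.log(p_expert) - np.log(p_amateur),
                      -np.inf)
    return int(scores.argmax())

p_expert = np.array([0.50, 0.30, 0.15, 0.05])   # large LM's next-token probs
p_amateur = np.array([0.45, 0.10, 0.30, 0.15])  # small LM's next-token probs
print(contrastive_step(p_expert, p_amateur))    # picks token 1: likely under
                                                # the expert, not the amateur
```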
1 code implementation • 16 Sep 2022 • Hao Fang, Anusha Balakrishnan, Harsh Jhamtani, John Bufe, Jean Crawford, Jayant Krishnamurthy, Adam Pauls, Jason Eisner, Jacob Andreas, Dan Klein
Satisfying these constraints simultaneously is difficult for the two predominant paradigms in language generation: neural language modeling and rule-based generation.
1 code implementation • 14 Sep 2022 • Clemente Pasti, Andreas Opedal, Tiago Pimentel, Tim Vieira, Jason Eisner, Ryan Cotterell
It shows, by a simple construction, that the intersection of a context-free language and a regular language is itself context-free.
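The textbook version of that construction, for a grammar in Chomsky normal form and an FSA with states $Q$, start state $q_0$, and final states $F$, builds "triple" nonterminals as sketched below (the paper studies a more efficient variant):

```latex
% Classic Bar-Hillel triple construction for a CNF grammar intersected with an
% FSA (textbook form; the paper's construction is more efficient).
\begin{align*}
\langle p, A, r\rangle &\to \langle p, B, q\rangle\,\langle q, C, r\rangle
  && \text{for each } A \to B\,C \text{ and } p, q, r \in Q,\\
\langle p, A, q\rangle &\to a
  && \text{for each } A \to a \text{ and transition } p \xrightarrow{a} q,\\
S' &\to \langle q_0, S, q_f\rangle
  && \text{for each } q_f \in F.
\end{align*}
```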
1 code implementation • NeurIPS 2023 • Subhro Roy, Sam Thomson, Tongfei Chen, Richard Shin, Adam Pauls, Jason Eisner, Benjamin Van Durme
We introduce BenchCLAMP, a Benchmark to evaluate Constrained LAnguage Model Parsing, which includes context-free grammars for seven semantic parsing datasets and two syntactic parsing datasets with varied output representations, as well as a constrained decoding interface to generate only valid outputs covered by these grammars.
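As a rough illustration of grammar-constrained decoding (not BenchCLAMP's actual interface), one can mask out any next token that would take the output off a valid prefix of the grammar. The toy language, vocabulary, and scores below are assumptions.

```python
# Illustrative sketch of constrained greedy decoding: at each step, only
# tokens that keep the output a valid prefix of some string in the "grammar"
# are allowed. The toy finite language stands in for a real CFG, and the fake
# scores stand in for an LM's logits.
import numpy as np

LANGUAGE = {"( x )", "( x x )"}          # a toy finite "grammar"
VOCAB = ["(", ")", "x", "<eos>"]

def is_valid_prefix(tokens):
    s = " ".join(tokens)
    return any(t == s or t.startswith(s + " ") for t in LANGUAGE)

def allowed(tok, output):
    if tok == "<eos>":
        return " ".join(output) in LANGUAGE
    return is_valid_prefix(output + [tok])

def constrained_greedy_decode(logits_per_step):
    output = []
    for logits in logits_per_step:
        masked = np.array([logit if allowed(tok, output) else -np.inf
                           for tok, logit in zip(VOCAB, logits)])
        tok = VOCAB[int(masked.argmax())]
        if tok == "<eos>":
            break
        output.append(tok)
    return output

# Fake "model scores" that would otherwise prefer the ungrammatical token ")".
steps = [np.array([0.1, 2.0, 0.5, 0.0])] * 5
print(constrained_greedy_decode(steps))   # ['(', 'x', ')'] despite the scores
```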
1 code implementation • 25 May 2022 • Ruiqi Zhong, Charlie Snell, Dan Klein, Jason Eisner
We introduce APEL, a framework in which non-programmers select among candidate programs generated by a seed semantic parser (e.g., Codex).
1 code implementation • 24 May 2022 • Elias Stengel-Eskin, Emmanouil Antonios Platanios, Adam Pauls, Sam Thomson, Hao Fang, Benjamin Van Durme, Jason Eisner, Yu Su
Rejecting class imbalance as the sole culprit, we reveal that the trend is closely associated with an effect we call source signal dilution, where strong lexical cues for the new symbol become diluted as the training dataset grows.
1 code implementation • ICLR 2022 • Chenghao Yang, Hongyuan Mei, Jason Eisner
The neural Hawkes process (Mei & Eisner, 2017) is a generative model of irregularly spaced sequences of discrete events.
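For reference, the model defines each event type's intensity from a continuous-time LSTM hidden state through a scaled softplus, roughly as follows (notation may differ slightly from the paper):

```latex
% Neural Hawkes process intensity (after Mei & Eisner, 2017, up to minor
% notational differences): h(t) is a continuous-time LSTM hidden state that
% decays between events, and the scale s_k > 0 keeps each intensity positive.
\begin{align*}
\lambda_k(t) &= f_k\!\big(\mathbf{w}_k^{\top}\,\mathbf{h}(t)\big), \\
f_k(x) &= s_k \log\!\big(1 + \exp(x / s_k)\big).
\end{align*}
```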
no code implementations • Findings (EMNLP) 2021 • Tim Vieira, Ryan Cotterell, Jason Eisner
To this end, we describe a set of program transformations, a simple metric for assessing the efficiency of a transformed program, and a heuristic search procedure to improve this metric.
1 code implementation • EMNLP 2021 • Richard Shin, Christopher H. Lin, Sam Thomson, Charles Chen, Subhro Roy, Emmanouil Antonios Platanios, Adam Pauls, Dan Klein, Jason Eisner, Benjamin Van Durme
We explore the use of large pretrained language models as few-shot semantic parsers.
2 code implementations • NAACL 2021 • Guanghui Qin, Jason Eisner
We explore the idea of learning prompts by gradient descent, either fine-tuning prompts taken from previous work or starting from random initialization.
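A minimal sketch of this kind of soft-prompt tuning is given below: a few continuous prompt vectors are prepended to the input embeddings and optimized by gradient descent while the model itself stays frozen. The tiny random "LM", the pooling, and the hyperparameters are stand-ins, not the paper's configuration.

```python
# Minimal sketch of soft-prompt tuning. A handful of continuous "prompt"
# embeddings are trained while the (toy, randomly initialized) LM stays frozen.
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab_size, d_model, prompt_len = 100, 32, 5

embed = nn.Embedding(vocab_size, d_model)           # frozen "LM" pieces
lm_head = nn.Linear(d_model, vocab_size)
for p in list(embed.parameters()) + list(lm_head.parameters()):
    p.requires_grad_(False)

soft_prompt = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)
optimizer = torch.optim.Adam([soft_prompt], lr=1e-2)

input_ids = torch.tensor([3, 17, 42])                # toy input tokens
target_id = torch.tensor([7])                        # toy "correct" next token

for step in range(100):
    x = torch.cat([soft_prompt, embed(input_ids)], dim=0)  # prepend the prompt
    logits = lm_head(x.mean(dim=0, keepdim=True))           # crude pooling
    loss = nn.functional.cross_entropy(logits, target_id)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("final loss:", float(loss))   # only the soft prompt was updated
```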
2 code implementations • NeurIPS 2020 • Hongyuan Mei, Tom Wan, Jason Eisner
The log-likelihood of a generative model often involves both positive and negative terms.
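In the point-process setting studied here, for example, the log-likelihood of an event sequence on $[0, T]$ pairs a positive sum over observed events with a negative integral over the whole interval, and the integral term is what makes exact training expensive:

```latex
% Standard log-likelihood of a multivariate point process (e.g., a neural
% Hawkes process) for events (t_1,k_1),...,(t_N,k_N) observed on [0,T]; the
% negative integral term is typically approximated by Monte Carlo sampling.
\begin{equation*}
\log p\big((t_1,k_1),\ldots,(t_N,k_N)\big)
  \;=\; \sum_{i=1}^{N} \log \lambda_{k_i}(t_i)
  \;-\; \int_{0}^{T} \sum_{k} \lambda_k(t)\, dt .
\end{equation*}
```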
no code implementations • NAACL 2021 • Chu-Cheng Lin, Aaron Jaech, Xin Li, Matthew R. Gormley, Jason Eisner
Standard autoregressive language models perform only polynomial-time computation to compute the probability of the next symbol.
1 code implementation • 20 Oct 2020 • Matthew Francis-Landau, Tim Vieira, Jason Eisner
We present a scheme for translating logic programs, which may use aggregation and arithmetic, into algebraic expressions that denote bag relations over ground terms of the Herbrand universe.
1 code implementation • 24 Sep 2020 • Semantic Machines, Jacob Andreas, John Bufe, David Burkett, Charles Chen, Josh Clausman, Jean Crawford, Kate Crim, Jordan DeLoach, Leah Dorner, Jason Eisner, Hao Fang, Alan Guo, David Hall, Kristin Hayes, Kellie Hill, Diana Ho, Wendy Iwaszuk, Smriti Jha, Dan Klein, Jayant Krishnamurthy, Theo Lanman, Percy Liang, Christopher H Lin, Ilya Lintsbakh, Andy McGovern, Aleksandr Nisnevich, Adam Pauls, Dmitrij Petters, Brent Read, Dan Roth, Subhro Roy, Jesse Rusak, Beth Short, Div Slomin, Ben Snyder, Stephon Striplin, Yu Su, Zachary Tellman, Sam Thomson, Andrei Vorobev, Izabela Witoszko, Jason Wolfe, Abby Wray, Yuchen Zhang, Alexander Zotov
We describe an approach to task-oriented dialogue in which dialogue state is represented as a dataflow graph.
1 code implementation • ICML 2020 • Hongyuan Mei, Guanghui Qin, Minjie Xu, Jason Eisner
Learning how to predict future events from patterns of past events is difficult when the set of possible event types is large.
no code implementations • ACL 2020 • Elizabeth Salesky, Eleanor Chodroff, Tiago Pimentel, Matthew Wiesner, Ryan Cotterell, Alan W. Black, Jason Eisner
A major hurdle in data-driven research on typology is having sufficient data in many languages to draw meaningful conclusions.
no code implementations • IJCNLP 2019 • Adithya Renduchintala, Philipp Koehn, Jason Eisner
We present a machine foreign-language teacher that modifies text in a student's native language (L1) by replacing some word tokens with glosses in a foreign language (L2), in such a way that the student can acquire L2 vocabulary simply by reading the resulting macaronic text.
1 code implementation • IJCNLP 2019 • Xiang Lisa Li, Jason Eisner
Pre-trained word embeddings like ELMo and BERT contain rich syntactic and semantic information, resulting in state-of-the-art performance on various tasks.
no code implementations • 25 Sep 2019 • Hongyuan Mei, Guanghui Qin, Minjie Xu, Jason Eisner
Consider a world in which events occur that involve various entities.
no code implementations • WS 2019 • Adithya Renduchintala, Philipp Koehn, Jason Eisner
We accomplish this by modifying a cloze language model to incrementally learn new vocabulary items, and use this language model as a proxy for the word guessing and learning ability of real students.
no code implementations • TACL 2019 • Xiang Lisa Li, Dingquan Wang, Jason Eisner
When the tree's yield is rendered as a written sentence, a string rewriting mechanism transduces the underlying marks into "surface" marks, which are part of the observed (surface) string but should not be regarded as part of the tree.
no code implementations • ACL 2019 • Sabrina J. Mielke, Ryan Cotterell, Kyle Gorman, Brian Roark, Jason Eisner
Trying to answer the question of what features difficult languages have in common, we try and fail to reproduce our earlier (Cotterell et al., 2018) observation about morphological complexity and instead reveal far simpler statistics of the data that seem to drive complexity in a much larger sample.
no code implementations • NAACL 2019 • Chu-Cheng Lin, Hao Zhu, Matthew R. Gormley, Jason Eisner
We introduce neural finite state transducers (NFSTs), a family of string transduction models defining joint and conditional probability distributions over pairs of strings.
2 code implementations • 14 May 2019 • Hongyuan Mei, Guanghui Qin, Jason Eisner
On held-out incomplete sequences, our method is effective at inferring the ground-truth unobserved events, with particle smoothing consistently improving upon particle filtering.
no code implementations • NAACL 2019 • Ekaterina Vylomova, Ryan Cotterell, Timothy Baldwin, Trevor Cohn, Jason Eisner
Critical to natural language generation is the production of correctly inflected text.
3 code implementations • LREC 2018 • Christo Kirov, Ryan Cotterell, John Sylak-Glassman, Géraldine Walther, Ekaterina Vylomova, Patrick Xia, Manaal Faruqui, Sabrina J. Mielke, Arya D. McCarthy, Sandra Kübler, David Yarowsky, Jason Eisner, Mans Hulden
The Universal Morphology UniMorph project is a collaborative effort to improve how NLP handles complex morphology across the world's languages.
no code implementations • CONLL 2018 • Ryan Cotterell, Christo Kirov, John Sylak-Glassman, Géraldine Walther, Ekaterina Vylomova, Arya D. McCarthy, Katharina Kann, Sabrina J. Mielke, Garrett Nicolai, Miikka Silfverberg, David Yarowsky, Jason Eisner, Mans Hulden
Apart from extending the number of languages involved in earlier supervised tasks of generating inflected forms, this year the shared task also featured a new second task which asked participants to inflect words in sentential context, similar to a cloze task.
1 code implementation • EMNLP 2018 • Dingquan Wang, Jason Eisner
To approximately parse an unfamiliar language, it helps to have a treebank of a similar language.
no code implementations • 27 Sep 2018 • Hongyuan Mei, Guanghui Qin, Jason Eisner
Particle smoothing is an extension of particle filtering in which proposed events are conditioned on the future as well as the past.
no code implementations • TACL 2019 • Ryan Cotterell, Christo Kirov, Mans Hulden, Jason Eisner
We quantify the linguistic complexity of different languages' morphological systems.
no code implementations • NAACL 2018 • Ryan Cotterell, Jason Eisner
What makes some types of languages more probable than others?
no code implementations • NAACL 2018 • Ryan Cotterell, Christo Kirov, Sabrina J. Mielke, Jason Eisner
Lexical ambiguity makes it difficult to compute various useful statistics of a corpus.
no code implementations • NAACL 2018 • Ryan Cotterell, Sabrina J. Mielke, Jason Eisner, Brian Roark
For general modeling methods applied to diverse languages, a natural question is: how well should we expect our models to work on languages with differing typological profiles?
no code implementations • NAACL 2018 • Chu-Cheng Lin, Jason Eisner
We introduce neural particle smoothing, a sequential Monte Carlo method for sampling annotations of an input string from a given probability model.
no code implementations • 23 Apr 2018 • Ryan Cotterell, Christo Kirov, Mans Hulden, Jason Eisner
Many languages' inflectional morphological systems are replete with irregulars, i.e., words that do not seem to follow standard inflectional rules.
1 code implementation • 23 Apr 2018 • Sabrina J. Mielke, Jason Eisner
By invoking the second RNN to generate spellings for novel words in context, we obtain an open-vocabulary language model.
no code implementations • TACL 2018 • Dingquan Wang, Jason Eisner
We show experimentally across multiple languages: (1) Features computed from the unparsed corpus improve parsing accuracy.
no code implementations • TACL 2017 • Dingquan Wang, Jason Eisner
We show how to predict the basic word-order facts of a novel language given only a corpus of part-of-speech (POS) sequences.
1 code implementation • TACL 2016 • Dingquan Wang, Jason Eisner
We release Galactic Dependencies 1.0, a large set of synthetic languages not found on Earth, but annotated in Universal Dependencies format.
no code implementations • CONLL 2017 • Adithya Renduchintala, Philipp Koehn, Jason Eisner
We present a feature-rich knowledge tracing method that captures a student's acquisition and retention of knowledge during a foreign language phrase learning task.
no code implementations • ACL 2017 • Nicholas Andrews, Mark Dredze, Benjamin Van Durme, Jason Eisner
Practically, this means that we may treat the lexical resources as observations under the proposed generative model.
no code implementations • CONLL 2017 • Ryan Cotterell, Christo Kirov, John Sylak-Glassman, Géraldine Walther, Ekaterina Vylomova, Patrick Xia, Manaal Faruqui, Sandra Kübler, David Yarowsky, Jason Eisner, Mans Hulden
In sub-task 2, systems were given a lemma and some of its specific inflected forms, and asked to complete the inflectional paradigm by predicting all of the remaining inflected forms.
no code implementations • ACL 2017 • Ryan Cotterell, Jason Eisner
Linguistic typology studies the range of structures present in human language.
no code implementations • EACL 2017 • Ryan Cotterell, Adam Poliak, Benjamin Van Durme, Jason Eisner
The popular skip-gram model induces word embeddings by exploiting the signal from word-context co-occurrence.
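For reference, the full-softmax form of the skip-gram objective referred to here can be written as follows, with separate "input" and "output" embedding vectors; the paper's exponential-family generalization builds on this.

```latex
% Standard skip-gram objective (full-softmax form): each word w predicts the
% context words c observed around it in the corpus pairs D, using input
% vectors v_w and output (context) vectors u_c.
\begin{equation*}
\max_{\{v_w\},\,\{u_c\}} \;\sum_{(w,\,c)\,\in\,\mathcal{D}} \log
  \frac{\exp\!\big(u_c^{\top} v_w\big)}
       {\sum_{c'} \exp\!\big(u_{c'}^{\top} v_w\big)}
\end{equation*}
```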
no code implementations • TACL 2017 • Tim Vieira, Jason Eisner
Pruning hypotheses during dynamic programming is commonly used to speed up inference in settings such as parsing.
8 code implementations • NeurIPS 2017 • Hongyuan Mei, Jason Eisner
Many events occur in the world.
no code implementations • TACL 2015 • Matthew R. Gormley, Mark Dredze, Jason Eisner
We show how to adjust the model parameters to compensate for the errors introduced by this approximation, by following the gradient of the actual loss on training data.
no code implementations • TACL 2015 • Ryan Cotterell, Nanyun Peng, Jason Eisner
Given some surface word types of a concatenative language along with the abstract morpheme sequences that they express, we show how to recover consistent underlying forms for these morphemes, together with the (stochastic) phonology that maps each concatenation of underlying forms to a surface form.
no code implementations • NeurIPS 2012 • Jiarong Jiang, Adam Teichert, Jason Eisner, Hal Daume
Users want natural language processing (NLP) systems to be both fast and accurate, but quality often comes at the cost of speed.
no code implementations • NeurIPS 2012 • He He, Jason Eisner, Hal Daume
However, it is important to note that these guarantees depend on how well the policy we found can imitate the oracle on the training data.