no code implementations • EMNLP (DeeLIO) 2020 • Dan Iter, Xiao Yu, Fangtao Li
Entity-attribute relations are a fundamental component for building large-scale knowledge bases, which are widely employed in modern search engines.
no code implementations • 24 May 2023 • Dan Iter, Reid Pryzant, Ruochen Xu, Shuohang Wang, Yang Liu, Yichong Xu, Chenguang Zhu
Our method is based on the observation that the effectiveness of in-context demonstrations negatively correlates with the perplexity of the test example by a language model that was finetuned on that demonstration.
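As a rough illustration of that selection rule, the sketch below scores each candidate demonstration by the perplexity its finetuned LM assigns to the test example and keeps the lowest; the GPT-2 backbone and the `finetuned_models` mapping are illustrative assumptions, not the paper's exact setup.

```python
# A minimal sketch of perplexity-based demonstration selection, assuming one
# finetuned causal LM per candidate demonstration (hypothetical placeholders).
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")

def perplexity(model, text):
    """Perplexity of `text` under a causal LM (lower = better fit)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean token-level cross-entropy
    return torch.exp(loss).item()

def select_demonstration(finetuned_models, test_example):
    """Pick the demonstration whose finetuned LM gives the test example the
    lowest perplexity (effectiveness correlates negatively with perplexity)."""
    scores = {demo: perplexity(m, test_example) for demo, m in finetuned_models.items()}
    return min(scores, key=scores.get)
```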
no code implementations • 22 May 2023 • Ruochen Xu, Song Wang, Yang Liu, Shuohang Wang, Yichong Xu, Dan Iter, Chenguang Zhu, Michael Zeng
We hypothesize that there is a hidden query for each summary sentence in a generic summarization annotation, and we utilize a large-scale pretrained language model to recover it.
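A minimal sketch of that recovery step, assuming a generic `generate(prompt) -> str` wrapper around the pretrained LM and illustrative prompt wording of my own rather than the paper's exact prompt:

```python
# Sketch: recover a hidden query for each summary sentence with a pretrained
# LM. The prompt text and the `generate` helper are assumptions.
def recover_queries(document, summary_sentences, generate):
    """`generate(prompt) -> str` wraps any large pretrained language model."""
    queries = []
    for sent in summary_sentences:
        prompt = (
            "Article:\n" + document + "\n\n"
            "Summary sentence:\n" + sent + "\n\n"
            "Write the question this sentence answers about the article:"
        )
        queries.append(generate(prompt).strip())
    return queries
```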
no code implementations • 22 May 2023 • Yichong Xu, Ruochen Xu, Dan Iter, Yang Liu, Shuohang Wang, Chenguang Zhu, Michael Zeng
While large models such as GPT-3 demonstrate exceptional performance in zero-shot and few-shot summarization tasks, their extensive serving and fine-tuning costs hinder their utilization in various applications.
no code implementations • 4 May 2023 • Reid Pryzant, Dan Iter, Jerry Li, Yin Tat Lee, Chenguang Zhu, Michael Zeng
Large Language Models (LLMs) have shown impressive performance as general-purpose agents, but their abilities remain highly dependent on prompts that are hand-written with onerous trial-and-error effort.
no code implementations • 29 Mar 2023 • Yang Liu, Dan Iter, Yichong Xu, Shuohang Wang, Ruochen Xu, Chenguang Zhu
In this work, we present G-Eval, a framework that uses large language models with chain-of-thought (CoT) reasoning and a form-filling paradigm to assess the quality of NLG outputs.
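The loop below is a minimal sketch of that scoring recipe, not the released implementation; the criterion (coherence), the evaluation steps, and the `llm` callable are assumptions made for illustration.

```python
# Sketch of the G-Eval idea: the LLM is given chain-of-thought evaluation
# steps, then fills in a numeric score for one quality dimension.
EVAL_STEPS = (
    "1. Read the source document carefully.\n"
    "2. Compare the summary against the document for coherence.\n"
    "3. Assign a coherence score from 1 to 5."
)

def g_eval_score(llm, document, summary):
    """`llm(prompt) -> str` wraps any instruction-following language model."""
    prompt = (
        "You will rate the coherence of a summary.\n\n"
        f"Evaluation steps:\n{EVAL_STEPS}\n\n"
        f"Document:\n{document}\n\nSummary:\n{summary}\n\n"
        "Coherence (1-5):"
    )
    return float(llm(prompt).strip())
```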
no code implementations • 22 Feb 2023 • Simeng Sun, Yang Liu, Dan Iter, Chenguang Zhu, Mohit Iyyer
This motivates the use of parameter-efficient adaptation methods such as prompt tuning (PT), which adds a small number of tunable embeddings to an otherwise frozen model, and in-context learning (ICL), in which demonstrations of the task are provided to the model in natural language without any additional training.
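A minimal sketch of prompt tuning on a frozen causal LM, assuming a GPT-2 backbone and a 20-token soft prompt purely for illustration:

```python
# Sketch: prepend trainable "soft prompt" embeddings to the input while the
# backbone stays frozen; only the soft prompt receives gradient updates.
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM

class SoftPromptLM(nn.Module):
    def __init__(self, model_name="gpt2", prompt_len=20):
        super().__init__()
        self.lm = AutoModelForCausalLM.from_pretrained(model_name)
        for p in self.lm.parameters():          # freeze the backbone
            p.requires_grad = False
        dim = self.lm.get_input_embeddings().embedding_dim
        self.soft_prompt = nn.Parameter(torch.randn(prompt_len, dim) * 0.02)

    def forward(self, input_ids, labels=None):
        tok_emb = self.lm.get_input_embeddings()(input_ids)
        prompt = self.soft_prompt.unsqueeze(0).expand(input_ids.size(0), -1, -1)
        inputs_embeds = torch.cat([prompt, tok_emb], dim=1)
        if labels is not None:
            # ignore the loss on the soft-prompt positions
            pad = torch.full(prompt.shape[:2], -100,
                             dtype=labels.dtype, device=labels.device)
            labels = torch.cat([pad, labels], dim=1)
        return self.lm(inputs_embeds=inputs_embeds, labels=labels)
```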
1 code implementation • 21 Sep 2022 • Wenhao Yu, Dan Iter, Shuohang Wang, Yichong Xu, Mingxuan Ju, Soumya Sanyal, Chenguang Zhu, Michael Zeng, Meng Jiang
We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer.
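A minimal sketch of that generate-then-read loop, assuming a generic `llm(prompt) -> str` wrapper and illustrative prompt wording rather than the paper's exact prompts:

```python
# Sketch: ask the LLM to write background documents for the question, then
# answer conditioned on the generated documents instead of retrieved ones.
def generate_then_read(llm, question, n_docs=3):
    docs = [
        llm(f"Generate a background document to answer the question:\n{question}\n")
        for _ in range(n_docs)
    ]
    context = "\n\n".join(docs)
    answer = llm(
        "Refer to the passages below and answer the question.\n\n"
        f"Passages:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return answer.strip()
```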
1 code implementation • EMNLP 2021 • William Held, Dan Iter, Dan Jurafsky
We model the entities/events in a reader's focus as a neighborhood within a learned latent embedding space which minimizes the distance between mentions and the centroids of their gold coreference clusters.
Ranked #1 on Event Coreference Resolution on Gun Violence Corpus
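A minimal sketch of one plausible form of the objective described above, squared distance from each mention embedding to the centroid of its gold coreference cluster; the paper's exact loss may differ.

```python
# Sketch: pull mention embeddings toward the centroid of their gold cluster.
import torch

def centroid_loss(mention_emb, cluster_ids):
    """mention_emb: (N, d) mention embeddings; cluster_ids: (N,) gold labels."""
    loss = 0.0
    clusters = cluster_ids.unique()
    for c in clusters:
        members = mention_emb[cluster_ids == c]
        centroid = members.mean(dim=0, keepdim=True)
        loss = loss + ((members - centroid) ** 2).sum(dim=1).mean()
    return loss / clusters.numel()
```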
no code implementations • ACL 2022 • David Grangier, Dan Iter
This work connects language model adaptation with concepts of machine learning theory.
no code implementations • 15 Sep 2021 • Dan Iter, David Grangier
Domain adaptation of neural networks commonly relies on three training phases: pretraining, selected-data training, and then fine-tuning.
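A minimal sketch of that three-phase recipe, with the `train` routine and the three corpora left as placeholders:

```python
# Sketch of the standard adaptation pipeline the paper studies: generic
# pretraining, continued training on selected data, then target fine-tuning.
def adapt(model, train, general_corpus, selected_corpus, target_corpus):
    train(model, general_corpus)    # 1. pretraining on broad data
    train(model, selected_corpus)   # 2. continued training on selected data
    train(model, target_corpus)     # 3. fine-tuning on the target domain
    return model
```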
1 code implementation • ACL 2020 • Dan Iter, Kelvin Guu, Larry Lansing, Dan Jurafsky
Recent models for unsupervised representation learning of text have employed a number of techniques to improve contextual word representations but have put little focus on discourse-level representations.
no code implementations • WS 2018 • Dan Iter, Alon Halevy, Wang-Chiew Tan
A common need of NLP applications is to extract structured data from text corpora in order to perform analytics or trigger an appropriate action.
no code implementations • WS 2018 • Dan Iter, Jong Yoon, Dan Jurafsky
Here, we present the first benchmark comparison of previously proposed coherence models for detecting symptoms of schizophrenia and evaluate their performance on a new dataset of recorded interviews between subjects and clinicians.
no code implementations • 25 Oct 2016 • Paroma Varma, Bryan He, Dan Iter, Peng Xu, Rose Yu, Christopher De Sa, Christopher Ré
Prior work has explored learning accuracies for these sources even without ground truth labels, but these approaches assume that a single accuracy parameter is sufficient to model the behavior of these sources over the entire training set.
1 code implementation • 14 Jun 2016 • Stefan Hadjis, Ce Zhang, Ioannis Mitliagkas, Dan Iter, Christopher Ré
Given a specification of a convolutional neural network, our goal is to minimize the time to train this model on a cluster of commodity CPUs and GPUs.