no code implementations • NAACL (ACL) 2022 • Kasturi Bhattacharjee, Rashmi Gangadharaiah, Kathleen McKeown, Dan Roth
Users often leave feedback on a myriad of aspects of a product; leveraged successfully, this feedback can yield useful insights that lead to further improvements down the line.
1 code implementation • ACL 2022 • Xingyu Fu, Ben Zhou, Ishaan Chandratreya, Carl Vondrick, Dan Roth
To human eyes, images often signify more than their pixels alone, as we can infer, associate, and reason with contextual information from other sources to establish a more complete picture.
no code implementations • NAACL 2022 • Elior Sulem, Jamaal Hay, Dan Roth
For example, given the context “She married a lawyer from New-York.”, we don’t know whether the answer to the question “Did she marry in New York?” is “Yes” or “No”.
1 code implementation • NAACL (ACL) 2022 • Xinya Du, Zixuan Zhang, Sha Li, Pengfei Yu, Hongwei Wang, Tuan Lai, Xudong Lin, Ziqi Wang, Iris Liu, Ben Zhou, Haoyang Wen, Manling Li, Darryl Hannan, Jie Lei, Hyounghun Kim, Rotem Dror, Haoyu Wang, Michael Regan, Qi Zeng, Qing Lyu, Charles Yu, Carl Edwards, Xiaomeng Jin, Yizhu Jiao, Ghazaleh Kazeminejad, Zhenhailong Wang, Chris Callison-Burch, Mohit Bansal, Carl Vondrick, Jiawei Han, Dan Roth, Shih-Fu Chang, Martha Palmer, Heng Ji
We introduce RESIN-11, a new schema-guided event extraction and prediction framework that can be applied to a large variety of newsworthy scenarios.
no code implementations • NAACL (ACL) 2022 • Muhao Chen, Lifu Huang, Manling Li, Ben Zhou, Heng Ji, Dan Roth
This tutorial targets researchers and practitioners who are interested in AI and ML technologies for structural information extraction (IE) from unstructured textual sources.
no code implementations • *SEM (NAACL) 2022 • Zheng Qi, Elior Sulem, Haoyu Wang, Xiaodong Yu, Dan Roth
We address this task as a pipeline, first predicting whether two granular events mentioned in the text belong to the same complex event, independently of their position in the text, and then using this to cluster them into complex events.
no code implementations • EMNLP 2020 • Annie Louis, Dan Roth, Filip Radlinski
We revisit a pragmatic inference problem in dialog: Understanding indirect responses to questions.
no code implementations • Findings (EMNLP) 2021 • Soham Dan, Osbert Bastani, Dan Roth
This way the concept learning problem is naturally a program synthesis problem and our algorithm learns from a few examples to synthesize a program representing the novel concept.
no code implementations • Findings (EMNLP) 2021 • Soham Dan, Xinran Han, Dan Roth
Executing natural language instructions in a physically grounded domain requires a model that understands both spatial concepts such as “left of” and “above”, and the compositional language used to identify landmarks and articulate instructions relative to them.
no code implementations • Findings (EMNLP) 2021 • Soham Dan, Dan Roth
To reduce the cost of training such large models, prior work has developed smaller, more compact models which achieve a significant speedup in training time while maintaining accuracy competitive with the original model on downstream tasks.
no code implementations • Findings (EMNLP) 2021 • Elior Sulem, Jamaal Hay, Dan Roth
Understanding when a text snippet does not provide sought-after information is an essential part of natural language understanding.
no code implementations • Findings (NAACL) 2022 • Ritam Dutt, Kasturi Bhattacharjee, Rashmi Gangadharaiah, Dan Roth, Carolyn Rose
The above concerns motivate our question answering setting over personalized knowledge graphs (PERKGQA), where each user has restricted access to their KG.
no code implementations • CoNLL (EMNLP) 2021 • Daniel Deutsch, Dan Roth
Reference-based metrics such as ROUGE or BERTScore evaluate the content quality of a summary by comparing the summary to a reference.
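For intuition, ROUGE-1 can be reduced to clipped unigram overlap between the two texts; the sketch below is a minimal illustration only (real implementations add stemming, higher-order n-grams, and significance testing):

    from collections import Counter

    def rouge1_f1(summary: str, reference: str) -> float:
        """Unigram-overlap F1 between a summary and a reference (ROUGE-1 style)."""
        sum_counts = Counter(summary.lower().split())
        ref_counts = Counter(reference.lower().split())
        # Clipped overlap: each reference token can be matched at most once.
        overlap = sum((sum_counts & ref_counts).values())
        if overlap == 0:
            return 0.0
        precision = overlap / sum(sum_counts.values())
        recall = overlap / sum(ref_counts.values())
        return 2 * precision * recall / (precision + recall)

    print(rouge1_f1("the cat sat on the mat", "a cat was sitting on the mat"))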
no code implementations • CoNLL (EMNLP) 2021 • Philip A. Huebner, Elior Sulem, Fisher Cynthia, Dan Roth
Transformer-based language models have taken the NLP world by storm.
no code implementations • EMNLP 2021 • Rujun Han, I-Hung Hsu, Jiao Sun, Julia Baylon, Qiang Ning, Dan Roth, Nanyun Peng
While these tasks partially evaluate machines' ability to understand narratives, human-like reading comprehension requires the capability to process event-based information beyond arguments and temporal reasoning.
no code implementations • NAACL (DaSH) 2021 • Tatiana Tsygankova, Francesca Marini, Stephen Mayhew, Dan Roth
In low-resource natural language processing (NLP), the key problems are a lack of target language training data, and a lack of native speakers to create it.
no code implementations • BioNLP (ACL) 2022 • Kevin Xie, Brian Litt, Dan Roth, Colin A. Ellis
A wealth of important clinical information lies untouched in the Electronic Health Record, often in the form of unstructured textual documents.
no code implementations • 30 Jan 2025 • Peter Baile Chen, Yi Zhang, Michael Cafarella, Dan Roth
However, LLMs' decomposition of questions is unaware of what data is available and how it is organized, often leading to sub-optimal retrieval performance.
no code implementations • 9 Jan 2025 • Xingyu Fu, Minqian Liu, Zhengyuan Yang, John Corring, Yijuan Lu, Jianwei Yang, Dan Roth, Dinei Florencio, Cha Zhang
ReFocus largely improves performance on all tasks over GPT-4o without visual editing, yielding an average gain of 11.0% on table tasks and 6.8% on chart tasks.
no code implementations • 17 Dec 2024 • Karan Wanchoo, Xiaoye Zuo, Hannah Gonzalez, Soham Dan, Georgios Georgakis, Dan Roth, Kostas Daniilidis, Eleni Miltsakaki
We present NAVCON, a large-scale annotated Vision-Language Navigation (VLN) corpus built on top of two popular datasets (R2R and RxR).
no code implementations • 12 Dec 2024 • Yu Feng, Phu Mon Htut, Zheng Qi, Wei Xiao, Manuel Mager, Nikolaos Pappas, Kishaloy Halder, Yang Li, Yassine Benajiba, Dan Roth
In this paper, we propose a novel method, DiverseAgentEntropy, for evaluating a model's uncertainty using multi-agent interaction under the assumption that if a model is certain, it should consistently recall the answer to the original query across a diverse collection of questions about the same original query.
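The paper's full agent-interaction protocol is more involved, but the core idea, scoring certainty by the entropy of answers aggregated across diverse probes of the same query, can be sketched in a few lines (the collected answers below are hypothetical):

    import math
    from collections import Counter

    def answer_entropy(answers: list[str]) -> float:
        """Shannon entropy of the empirical answer distribution; 0.0 means
        the model answered consistently (i.e., it appears certain)."""
        counts = Counter(a.strip().lower() for a in answers)
        total = sum(counts.values())
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    # Hypothetical answers collected across paraphrased probes of one query.
    answers = ["paris", "Paris", "paris", "lyon"]
    print(f"entropy = {answer_entropy(answers):.3f} bits")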
no code implementations • 11 Nov 2024 • Chaitanya Malaviya, Joseph Chee Chang, Dan Roth, Mohit Iyyer, Mark Yatskar, Kyle Lo
would depend on the user's preferences, and a good response to an open-ended query like "How do antibiotics work against bacteria?"
no code implementations • 29 Oct 2024 • Yahan Yang, Soham Dan, Dan Roth, Insup Lee
With the ubiquity of Large Language Models (LLMs), guardrails have become crucial to detect and defend against toxic content.
no code implementations • 24 Oct 2024 • Xiaodong Yu, Ben Zhou, Hao Cheng, Dan Roth
Existing math datasets evaluate the reasoning abilities of large language models (LLMs) by either using the final answer or the intermediate reasoning steps derived from static examples.
no code implementations • 16 Oct 2024 • Siyi Liu, Qiang Ning, Kishaloy Halder, Wei Xiao, Zheng Qi, Phu Mon Htut, Yi Zhang, Neha Anna John, Bonan Min, Yassine Benajiba, Dan Roth
Open domain question answering systems frequently rely on information retrieved from large collections of text (such as the Web) to answer questions.
no code implementations • 11 Oct 2024 • Jiashu He, Mingyu Derek Ma, Jinxuan Fan, Dan Roth, Wei Wang, Alejandro Ribeiro
Existing retrieval-based reasoning approaches for large language models (LLMs) heavily rely on the density and quality of the non-parametric knowledge source to provide domain knowledge and explicit reasoning chain.
1 code implementation • 3 Oct 2024 • Aparna Elangovan, Lei Xu, Jongwoo Ko, Mahsa Elyasi, Ling Liu, Sravan Bodapati, Dan Roth
Specifically, we demonstrate that when the proportion of samples with variation or uncertainty in human assigned labels is relatively high, machine labels (generated by automatic evaluation methods) may superficially appear to have similar or better correlation with the human majority label compared to the human-to-human (HH) correlation.
no code implementations • 24 Sep 2024 • Tianyue Ou, Frank F. Xu, Aman Madaan, Jiarui Liu, Robert Lo, Abishek Sridhar, Sudipta Sengupta, Dan Roth, Graham Neubig, Shuyan Zhou
LLMs can now act as autonomous agents that interact with digital environments and complete specific objectives (e.g., arranging an online meeting).
no code implementations • 16 Sep 2024 • Qingru Zhang, Xiaodong Yu, Chandan Singh, Xiaodong Liu, Liyuan Liu, Jianfeng Gao, Tuo Zhao, Dan Roth, Hao Cheng
However, they often struggle to fully comprehend and effectively utilize their input contexts, resulting in responses that are unfaithful or hallucinated.
no code implementations • 30 Aug 2024 • Srija Mukhopadhyay, Abhishek Rajgaria, Prerana Khatiwada, Vivek Gupta, Dan Roth
Vision-language models (VLMs) excel at tasks requiring joint understanding of visual and linguistic information.
no code implementations • 25 Aug 2024 • Suyash Vardhan Mathur, Jainit Sushil Bafna, Kunal Kartik, Harshita Khandelwal, Manish Shrivastava, Vivek Gupta, Mohit Bansal, Dan Roth
With the evolution of AI models capable of multimodal reasoning, it is pertinent to assess their efficacy in handling such structured data.
no code implementations • 22 Jul 2024 • Irwin Deng, Kushagra Dixit, Vivek Gupta, Dan Roth
We provide critical insights for improving LLM performance in temporal reasoning tasks with tabular data.
no code implementations • 15 Jul 2024 • Pranshu Pandya, Vatsal Gupta, Agney S Talwarr, Tushar Kataria, Dan Roth, Vivek Gupta
Cognitive textual and visual reasoning tasks, including puzzles, series, and analogies, demand the ability to quickly reason, decipher, and evaluate patterns both textually and spatially.
no code implementations • 15 Jul 2024 • Srija Mukhopadhyay, Adnan Qidwai, Aparna Garimella, Pritika Ramu, Vivek Gupta, Dan Roth
However, the robustness and consistency of current Visual Language Models (VLMs) in this field remain under-explored.
no code implementations • 13 Jul 2024 • Kaifu Wang, Efthymia Tsamoura, Dan Roth
At the same time, the supervision signal is generated by a function $\sigma$ over the (hidden) gold labels of $\mathbf{x}$.
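As a concrete illustration of such a $\sigma$ (assumed here for exposition, not taken from the paper's experiments), consider the classic setting where two hidden digit labels are supervised only through their sum:

    from itertools import product

    def sigma(y1: int, y2: int) -> int:
        """Supervision function over the hidden gold labels: here, their sum."""
        return y1 + y2

    def consistent_label_pairs(signal: int) -> list[tuple[int, int]]:
        """All hidden label pairs compatible with the observed weak signal."""
        return [(a, b) for a, b in product(range(10), repeat=2)
                if sigma(a, b) == signal]

    # Observing sigma = 3 leaves four candidate gold-label pairs.
    print(consistent_label_pairs(3))  # [(0, 3), (1, 2), (2, 1), (3, 0)]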
1 code implementation • 29 Jun 2024 • Nikhil Abhyankar, Vivek Gupta, Dan Roth, Chandan K. Reddy
Tabular reasoning involves interpreting natural language queries about tabular data, which presents a unique challenge of combining language understanding with structured data analysis.
no code implementations • 27 Jun 2024 • Shubhankar Singh, Purvi Chaurasia, Yerram Varun, Pranshu Pandya, Vatsal Gupta, Vivek Gupta, Dan Roth
Existing benchmarks for visual question answering lack in visual grounding and complexity, particularly in evaluating spatial reasoning skills.
no code implementations • 17 Jun 2024 • Bangzheng Li, Ben Zhou, Xingyu Fu, Fei Wang, Dan Roth, Muhao Chen
One popular approach is using perplexity as a way to measure models' familiarity with the prompt.
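For reference, perplexity is the exponentiated mean negative log-likelihood of a text under the model; a minimal sketch with Hugging Face transformers, using gpt2 purely as an example checkpoint:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    def perplexity(text: str, model, tokenizer) -> float:
        """exp(mean negative log-likelihood) of the text under a causal LM."""
        enc = tokenizer(text, return_tensors="pt")
        with torch.no_grad():
            # Passing labels makes the model return the mean token NLL as .loss.
            loss = model(**enc, labels=enc["input_ids"]).loss
        return torch.exp(loss).item()

    tok = AutoTokenizer.from_pretrained("gpt2")
    lm = AutoModelForCausalLM.from_pretrained("gpt2")
    print(perplexity("The quick brown fox jumps over the lazy dog.", lm, tok))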
1 code implementation • 16 Jun 2024 • Bowen Jiang, Yangxinyu Xie, Zhuoqun Hao, Xiaomeng Wang, Tanwi Mallick, Weijie J. Su, Camillo J. Taylor, Dan Roth
This study introduces a hypothesis-testing framework to assess whether large language models (LLMs) possess genuine reasoning abilities or primarily depend on token bias.
no code implementations • 13 Jun 2024 • Yushi Hu, Weijia Shi, Xingyu Fu, Dan Roth, Mari Ostendorf, Luke Zettlemoyer, Noah A Smith, Ranjay Krishna
In this work, we introduce Sketchpad, a framework that gives multimodal LMs a visual sketchpad and tools to draw on the sketchpad.
1 code implementation • 13 Jun 2024 • Fei Wang, Xingyu Fu, James Y. Huang, Zekun Li, Qin Liu, Xiaogeng Liu, Mingyu Derek Ma, Nan Xu, Wenxuan Zhou, Kai Zhang, Tianyi Lorena Yan, Wenjie Jacky Mo, Hsiang-Hui Liu, Pan Lu, Chunyuan Li, Chaowei Xiao, Kai-Wei Chang, Dan Roth, Sheng Zhang, Hoifung Poon, Muhao Chen
We introduce MuirBench, a comprehensive benchmark that focuses on robust multi-image understanding capabilities of multimodal LLMs.
no code implementations • 11 Jun 2024 • Xingyu Fu, Muyu He, Yujie Lu, William Yang Wang, Dan Roth
We present a novel task and benchmark for evaluating the ability of text-to-image (T2I) generation models to produce images that align with commonsense in real life, which we call Commonsense-T2I.
no code implementations • 28 May 2024 • Aparna Elangovan, Ling Liu, Lei Xu, Sravan Bodapati, Dan Roth
In this position paper, we argue that human evaluation of generative large language models (LLMs) should be a multidisciplinary undertaking that draws upon insights from disciplines such as user experience research and human behavioral psychology to ensure that the experimental design and results are reliable.
no code implementations • 25 May 2024 • Haoyu Wang, Tao Li, Zhiwei Deng, Dan Roth, Yang Li
The experimental results suggest that our introspection-driven approach not only enhances the agent's ability to navigate unanticipated challenges through a robust mechanism of plan execution, but also improves efficiency, reducing the number of trials and plan revisions needed to achieve a task by 45%.
no code implementations • 18 Apr 2024 • Xingyu Fu, Yushi Hu, Bangzheng Li, Yu Feng, Haoyu Wang, Xudong Lin, Dan Roth, Noah A. Smith, Wei-Chiu Ma, Ranjay Krishna
We introduce Blink, a new benchmark for multimodal language models (LLMs) that focuses on core visual perception abilities not found in other evaluations.
no code implementations • 18 Apr 2024 • Yu Feng, Ben Zhou, Weidong Lin, Dan Roth
Predictive models often need to work with incomplete information in real-world tasks.
no code implementations • 16 Apr 2024 • Hantian Ding, Zijian Wang, Giovanni Paolini, Varun Kumar, Anoop Deoras, Dan Roth, Stefano Soatto
In large language model training, input documents are typically concatenated together and then split into sequences of equal length to avoid padding tokens.
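The packing scheme the abstract refers to can be sketched in a few lines (token IDs below are toy integers standing in for tokenizer output):

    def pack_documents(docs: list[list[int]], seq_len: int, eos_id: int) -> list[list[int]]:
        """Concatenate tokenized documents (separated by EOS) and split the
        stream into fixed-length training sequences, discarding the remainder."""
        stream: list[int] = []
        for doc in docs:
            stream.extend(doc)
            stream.append(eos_id)  # document boundary marker
        return [stream[i:i + seq_len]
                for i in range(0, len(stream) - seq_len + 1, seq_len)]

    docs = [[5, 6, 7], [8, 9], [10, 11, 12, 13]]
    print(pack_documents(docs, seq_len=4, eos_id=0))
    # [[5, 6, 7, 0], [8, 9, 0, 10], [11, 12, 13, 0]]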
no code implementations • 15 Apr 2024 • Peter Baile Chen, Yi Zhang, Dan Roth
Retrieving relevant tables containing the necessary information to accurately answer a given question over tables is critical to open-domain question-answering (QA) systems.
no code implementations • 30 Mar 2024 • Ben Zhou, Hongming Zhang, Sihao Chen, Dian Yu, Hongwei Wang, Baolin Peng, Dan Roth, Dong Yu
Conceptual reasoning, the ability to reason in abstract and high-level perspectives, is key to generalization in human cognition.
1 code implementation • 21 Mar 2024 • Bowen Jiang, Zhijun Zhuang, Shreyas S. Shivakumar, Dan Roth, Camillo J. Taylor
This work explores the zero-shot capabilities of foundation models in Visual Question Answering (VQA) tasks.
no code implementations • 10 Mar 2024 • Fei Wang, Chao Shang, Sarthak Jain, Shuai Wang, Qiang Ning, Bonan Min, Vittorio Castelli, Yassine Benajiba, Dan Roth
We investigate common constraints in NLP tasks, categorize them into three classes based on the types of their arguments, and propose a unified framework, ACT (Aligning to ConsTraints), to automatically produce supervision signals for user alignment with constraints.
no code implementations • 17 Feb 2024 • Pragya Srivastava, Manuj Malik, Vivek Gupta, Tanuja Ganu, Dan Roth
Large Language Models (LLMs) excel in natural language understanding, but their capability for complex mathematical reasoning over an amalgamation of structured tables and unstructured text is uncertain.
no code implementations • 5 Feb 2024 • James Y. Huang, Sailik Sengupta, Daniele Bonadiman, Yi-An Lai, Arshit Gupta, Nikolaos Pappas, Saab Mansour, Katrin Kirchhoff, Dan Roth
Current work focuses on alignment at model training time, through techniques such as Reinforcement Learning with Human Feedback (RLHF).
no code implementations • 2 Feb 2024 • Dejiao Zhang, Wasi Ahmad, Ming Tan, Hantian Ding, Ramesh Nallapati, Dan Roth, Xiaofei Ma, Bing Xiang
Recent studies have shown that code language models at scale demonstrate significant performance gains on downstream tasks, i.e., code generation.
1 code implementation • 16 Nov 2023 • Bangzheng Li, Ben Zhou, Fei Wang, Xingyu Fu, Dan Roth, Muhao Chen
During the construction of the evidence, we purposefully replace semantic clues (entities) that may lead to the correct answer with distractor clues (evidence) that will not directly lead to the correct answer but require a chain-like reasoning process.
2 code implementations • 16 Nov 2023 • Chaitanya Malaviya, Subin Lee, Dan Roth, Mark Yatskar
A rationale outlines the approach followed by the model to answer the question.
no code implementations • 15 Nov 2023 • Yahan Yang, Soham Dan, Dan Roth, Insup Lee
We also conduct several ablation experiments to study the effect of language distances, language corpus size, and model size on calibration, and how multilingual models compare with their monolingual counterparts for diverse tasks and languages.
no code implementations • 15 Nov 2023 • Vatsal Gupta, Pranshu Pandya, Tushar Kataria, Vivek Gupta, Dan Roth
In this study, we introduce a methodology designed to examine how input perturbations affect language models across various scales, including pre-trained models and large language models (LLMs).
1 code implementation • 7 Nov 2023 • Sihao Chen, Hongming Zhang, Tong Chen, Ben Zhou, Wenhao Yu, Dian Yu, Baolin Peng, Hongwei Wang, Dan Roth, Dong Yu
We introduce sub-sentence encoder, a contrastively-learned contextual embedding model for fine-grained semantic representation of text.
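The encoder is trained contrastively; while the architecture itself is not reproduced here, an in-batch-negatives InfoNCE objective of the kind typically used for such embedding models looks like the following (a sketch, not the paper's exact loss):

    import torch
    import torch.nn.functional as F

    def info_nce(anchors: torch.Tensor, positives: torch.Tensor, tau: float = 0.05) -> torch.Tensor:
        """InfoNCE loss: each anchor should match its own positive against
        all other positives in the batch (in-batch negatives)."""
        a = F.normalize(anchors, dim=-1)
        p = F.normalize(positives, dim=-1)
        logits = a @ p.T / tau                  # cosine similarities, scaled
        targets = torch.arange(a.size(0))       # i-th anchor matches i-th positive
        return F.cross_entropy(logits, targets)

    loss = info_nce(torch.randn(8, 256), torch.randn(8, 256))
    print(loss.item())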
no code implementations • 19 Oct 2023 • Xiaodong Yu, Hao Cheng, Xiaodong Liu, Dan Roth, Jianfeng Gao
Specifically, given the potential for data contamination (e.g., leading to memorization), good static benchmark performance does not ensure that a model can reliably use the provided evidence when responding, which is essential to avoid hallucination when the required knowledge is new or private.
1 code implementation • 29 Sep 2023 • Hangfeng He, Hongming Zhang, Dan Roth
Existing reference-free reasoning evaluation metrics eliminate the need for human-crafted reasoning chains as references, but they often require fine-tuning with human-derived chains before evaluation, which complicates the process and calls into question their adaptability to other datasets.
4 code implementations • 14 Sep 2023 • Chaitanya Malaviya, Subin Lee, Sihao Chen, Elizabeth Sieber, Mark Yatskar, Dan Roth
In this work, we conduct human evaluation of responses from a few representative systems along various axes of attribution and factuality, by bringing domain experts in the loop.
no code implementations • 10 Aug 2023 • Alexander Hanbo Li, Mingyue Shang, Evangelia Spiliopoulou, Jie Ma, Patrick Ng, Zhiguo Wang, Bonan Min, William Wang, Kathleen McKeown, Vittorio Castelli, Dan Roth, Bing Xiang
We present a novel approach for structured data-to-text generation that addresses the limitations of existing methods that primarily focus on specific types of structured data.
no code implementations • 9 Aug 2023 • Xiaodong Yu, Ben Zhou, Dan Roth
Information retrieval (IR), or knowledge retrieval, is a critical component of many downstream tasks such as open-domain question answering (QA).
no code implementations • 8 Jul 2023 • Kaifu Wang, Hangfeng He, Tin D. Nguyen, Piyush Kumar, Dan Roth
Prior knowledge and symbolic rules in machine learning are often expressed in the form of label constraints, especially in structured prediction problems.
no code implementations • 30 Jun 2023 • Vivek Srikumar, Dan Roth
At the end, we will see two worked examples to illustrate the use of these recipes.
no code implementations • NAACL (ACL) 2022 • Hantian Ding, Jinrui Yang, Yuqian Deng, Hongming Zhang, Dan Roth
We introduce an open-domain topic classification system that accepts a user-defined taxonomy in real time.
1 code implementation • 24 Jun 2023 • Alyssa Hwang, Bryan Li, Zhaoyi Hou, Dan Roth
With their remarkably improved text generation and prompting capabilities, large language models can adapt existing written information into forms that are easier to use and understand.
no code implementations • NeurIPS 2023 • Kaifu Wang, Efthymia Tsamoura, Dan Roth
This condition non-trivially generalizes and relaxes the existing small ambiguity degree in the PLL literature, since we allow the transition to be deterministic.
no code implementations • 5 Jun 2023 • Hantian Ding, Varun Kumar, Yuchen Tian, Zijian Wang, Rob Kwiatkowski, Xiaopeng Li, Murali Krishna Ramanathan, Baishakhi Ray, Parminder Bhatia, Sudipta Sengupta, Dan Roth, Bing Xiang
Large language models trained on code have shown great potential to increase productivity of software developers.
no code implementations • 30 May 2023 • Xingyu Fu, Sheng Zhang, Gukyeong Kwon, Pramuditha Perera, Henghui Zhu, Yuhao Zhang, Alexander Hanbo Li, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Dan Roth, Bing Xiang
The open-ended Visual Question Answering (VQA) task requires AI models to jointly reason over visual and natural language inputs using world knowledge.
1 code implementation • 26 May 2023 • Tyler A. Chang, Kishaloy Halder, Neha Anna John, Yogarshi Vyas, Yassine Benajiba, Miguel Ballesteros, Dan Roth
In this paper, we propose three dimensions of linguistic dataset drift: vocabulary, structural, and semantic drift.
no code implementations • 24 May 2023 • Xingyu Fu, Ben Zhou, Sihao Chen, Mark Yatskar, Dan Roth
We propose the Dynamic Clue Bottleneck Model (DCLUB), a method that is designed towards an inherently interpretable VQA system.
no code implementations • 22 May 2023 • Karthikeyan K, Yogarshi Vyas, Jie Ma, Giovanni Paolini, Neha Anna John, Shuai Wang, Yassine Benajiba, Vittorio Castelli, Dan Roth, Miguel Ballesteros
We experiment with 6 diverse datasets and show that PLM consistently performs better than most other approaches (0.5-2.5 F1), including in novel settings for taxonomy expansion not considered in prior work.
no code implementations • 22 May 2023 • Siyi Liu, Hongming Zhang, Hongwei Wang, Kaiqiang Song, Dan Roth, Dong Yu
However, none of the existing methods have explicitly addressed the issue of framing bias that is inherent in news articles.
no code implementations • 18 May 2023 • Sharon Levy, Neha Anna John, Ling Liu, Yogarshi Vyas, Jie Ma, Yoshinari Fujinuma, Miguel Ballesteros, Vittorio Castelli, Dan Roth
As a result, it is critical to examine biases within each language and attribute.
1 code implementation • 20 Apr 2023 • Iker García-Ferrero, Jon Ander Campos, Oscar Sainz, Ander Salaberria, Dan Roth
Named Entity Recognition (NER) is a core natural language processing task in which pre-trained language models have shown remarkable performance.
Multilingual Named Entity Recognition
named-entity-recognition
+4
no code implementations • 6 Apr 2023 • Sihao Chen, William Bruno, Dan Roth
To facilitate research in this domain, we propose and study a conceptual framework in which we compare how sources typically mention certain controversial entities, and use such mentions as indicators of the sources' content selection preferences.
1 code implementation • 16 Feb 2023 • Hossein Rajaby Faghihi, Aliakbar Nafar, Chen Zheng, Roshanak Mirzaee, Yue Zhang, Andrzej Uszok, Alexander Wan, Tanawan Premsri, Dan Roth, Parisa Kordjamshidi
Recent research has shown that integrating domain knowledge into deep learning architectures is effective -- it helps reduce the amount of required data, improves the accuracy of the models' decisions, and improves the interpretability of models.
no code implementations • 16 Feb 2023 • Shamik Roy, Raphael Shu, Nikolaos Pappas, Elman Mansimov, Yi Zhang, Saab Mansour, Dan Roth
Conventional text style transfer approaches focus on sentence-level style transfer without considering contextual information, and the style is described with attributes (e.g., formality).
no code implementations • 13 Feb 2023 • Danilo Ribeiro, Shen Wang, Xiaofei Ma, Henry Zhu, Rui Dong, Deguang Kong, Juliette Burger, Anjelica Ramos, William Wang, Zhiheng Huang, George Karypis, Bing Xiang, Dan Roth
We introduce STREET, a unified multi-task and multi-domain natural language reasoning and explanation benchmark.
1 code implementation • 31 Dec 2022 • Hangfeng He, Hongming Zhang, Dan Roth
To address this issue, we propose a novel post-processing approach, rethinking with retrieval (RR), which retrieves relevant external knowledge based on the decomposed reasoning steps obtained from the chain-of-thought (CoT) prompting.
Ranked #2 on Question Answering on StrategyQA
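A toy sketch of the post-processing shape described here: score each decomposed reasoning step against retrieved evidence and flag steps the knowledge source does not support. The corpus and the lexical-overlap scorer below are illustrative stand-ins for real retrieval and entailment scoring:

    STOPWORDS = {"the", "is", "in", "of", "a", "an"}

    def support(step: str, evidence: str) -> float:
        """Fraction of the step's content words found in the evidence (a crude
        stand-in for the entailment-style scoring used in practice)."""
        sw = {w for w in step.lower().split() if w not in STOPWORDS}
        ew = {w for w in evidence.lower().split() if w not in STOPWORDS}
        return len(sw & ew) / max(len(sw), 1)

    def rethink_with_retrieval(steps, corpus, threshold=0.8):
        """Retrieve the best evidence for each decomposed reasoning step and
        flag steps that the knowledge source does not support."""
        out = []
        for step in steps:
            evidence = max(corpus, key=lambda doc: support(step, doc))
            score = support(step, evidence)
            out.append((step, evidence, round(score, 2), score >= threshold))
        return out

    corpus = ["the eiffel tower is in paris", "paris is the capital of france"]
    steps = ["the eiffel tower is in paris", "paris is in germany"]
    for row in rethink_with_retrieval(steps, corpus):
        print(row)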
no code implementations • 21 Dec 2022 • Sihao Chen, Senaka Buthpitiya, Alex Fabrikant, Dan Roth, Tal Schuster
As these propositions can carry different truth values in the context of a given premise, we argue for the need to recognize the textual entailment relation of each proposition in a sentence individually.
no code implementations • 20 Dec 2022 • Raphael Shu, Elman Mansimov, Tamer Alkhouli, Nikolaos Pappas, Salvatore Romeo, Arshit Gupta, Saab Mansour, Yi Zhang, Dan Roth
The conversational model interacts with the environment by generating and executing programs triggering a set of pre-defined APIs.
1 code implementation • 20 Dec 2022 • Yangruibo Ding, Zijian Wang, Wasi Uddin Ahmad, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, Bing Xiang
While pre-trained language models (LM) for code have achieved great success in code completion, they generate code conditioned only on the contents within the file, i.e., in-file context, but ignore the rich semantics in other files within the same project, i.e., cross-file context, a critical source of information that is especially useful in modern modular software development.
no code implementations • 20 Dec 2022 • Yu Feng, Ben Zhou, Haoyu Wang, Helen Jin, Dan Roth
Temporal reasoning is the task of predicting temporal relations of event pairs.
no code implementations • 20 Dec 2022 • Yahan Yang, Soham Dan, Dan Roth, Insup Lee
Recently it has been shown that state-of-the-art NLP models are vulnerable to adversarial attacks, where the predictions of a model can be drastically altered by slight modifications to the input (such as synonym substitutions).
2 code implementations • 20 Dec 2022 • Shiqi Wang, Zheng Li, Haifeng Qian, Chenghao Yang, Zijian Wang, Mingyue Shang, Varun Kumar, Samson Tan, Baishakhi Ray, Parminder Bhatia, Ramesh Nallapati, Murali Krishna Ramanathan, Dan Roth, Bing Xiang
Most existing works on robustness in text or code tasks have focused on classification, while robustness in generation tasks is an uncharted area and to date there is no comprehensive benchmark for robustness in code generation.
no code implementations • 19 Dec 2022 • Vinayshekhar Bannihatti Kumar, Rashmi Gangadharaiah, Dan Roth
In several real-world industry applications that use Machine Learning to build models on user data, such mandates require significant effort both in terms of data cleansing and model retraining, while ensuring the models do not deteriorate in prediction quality due to the removal of data.
1 code implementation • 18 Dec 2022 • Hritik Bansal, Karthik Gopalakrishnan, Saket Dingliwal, Sravan Bodapati, Katrin Kirchhoff, Dan Roth
Using a 66 billion parameter language model (OPT-66B) across a diverse set of 14 downstream tasks, we find this is indeed the case: $\sim$70% of attention heads and $\sim$20% of feed-forward networks can be removed with minimal decline in task performance.
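A self-contained illustration of what head ablation means mechanically: zeroing one head's output in a multi-head attention layer and measuring how the output changes (studies like this one do the analogous ablation inside the full pretrained model and re-run the downstream tasks):

    import numpy as np

    def multi_head_attention(x, wq, wk, wv, n_heads, head_mask):
        """Single attention layer where head_mask[h] = 0 ablates head h."""
        d = x.shape[-1]
        hd = d // n_heads
        q, k, v = x @ wq, x @ wk, x @ wv
        outs = []
        for h in range(n_heads):
            sl = slice(h * hd, (h + 1) * hd)
            scores = q[:, sl] @ k[:, sl].T / np.sqrt(hd)
            attn = np.exp(scores - scores.max(-1, keepdims=True))
            attn /= attn.sum(-1, keepdims=True)
            outs.append(head_mask[h] * (attn @ v[:, sl]))  # zero = head removed
        return np.concatenate(outs, axis=-1)

    rng = np.random.default_rng(0)
    x = rng.normal(size=(5, 8))                     # 5 tokens, model dim 8
    wq, wk, wv = (rng.normal(size=(8, 8)) for _ in range(3))
    full = multi_head_attention(x, wq, wk, wv, n_heads=2, head_mask=[1, 1])
    ablated = multi_head_attention(x, wq, wk, wv, n_heads=2, head_mask=[1, 0])
    print(np.abs(full - ablated).mean())            # effect of removing head 1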
1 code implementation • 6 Dec 2022 • William Bruno, Dan Roth
Premises are long and multigranular.
1 code implementation • 7 Nov 2022 • Jiayao Zhang, Hongming Zhang, Zhun Deng, Dan Roth
We distill several insights from our analysis of the peer review process, conducted with the help of large LMs.
no code implementations • 30 Oct 2022 • Ben Zhou, Kyle Richardson, Xiaodong Yu, Dan Roth
Explicit decomposition modeling, which involves breaking down complex tasks into more straightforward and often more interpretable sub-tasks, has long been a central theme in developing robust and interpretable NLU systems.
2 code implementations • 26 Oct 2022 • Ben Athiwaratkun, Sanjay Krishna Gouda, Zijian Wang, Xiaopeng Li, Yuchen Tian, Ming Tan, Wasi Uddin Ahmad, Shiqi Wang, Qing Sun, Mingyue Shang, Sujan Kumar Gonugondla, Hantian Ding, Varun Kumar, Nathan Fulton, Arash Farahani, Siddhartha Jain, Robert Giaquinto, Haifeng Qian, Murali Krishna Ramanathan, Ramesh Nallapati, Baishakhi Ray, Parminder Bhatia, Sudipta Sengupta, Dan Roth, Bing Xiang
Using these benchmarks, we are able to assess the performance of code generation models in a multi-lingual fashion, and we discover the generalization ability of language models on out-of-domain languages, the advantages of multi-lingual models over mono-lingual ones, the ability of few-shot prompting to teach the model new languages, and zero-shot translation abilities even in mono-lingual settings.
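Execution-based code benchmarks in this family typically score correctness with pass@k; assuming the standard unbiased estimator from the Codex work (the abstract does not spell out the metric), it can be computed as:

    from math import comb

    def pass_at_k(n: int, c: int, k: int) -> float:
        """Unbiased pass@k: probability that at least one of k samples drawn
        from n generations (of which c pass the tests) is correct."""
        if n - c < k:
            return 1.0
        return 1.0 - comb(n - c, k) / comb(n, k)

    # 200 samples per problem, 37 of which pass the unit tests:
    print(f"pass@1  = {pass_at_k(200, 37, 1):.3f}")   # 0.185
    print(f"pass@10 = {pass_at_k(200, 37, 10):.3f}")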
no code implementations • 22 Oct 2022 • Daniel Deutsch, Rotem Dror, Dan Roth
There is significant interest in developing evaluation metrics which accurately estimate the quality of generated text without the aid of a human-written reference text, which can be time-consuming and expensive to collect or entirely unavailable in online applications.
no code implementations • 12 Oct 2022 • Rotem Dror, Haoyu Wang, Dan Roth
The answers to these questions can be found by collecting many documents on the complex event of interest, extracting relevant information, and analyzing it.
no code implementations • 12 Oct 2022 • Hongming Zhang, Yintong Huo, Yanai Elazar, Yangqiu Song, Yoav Goldberg, Dan Roth
We first align commonsense tasks with relevant knowledge from commonsense knowledge bases and ask humans to annotate whether the knowledge is enough or not.
1 code implementation • 12 Oct 2022 • Siddharth Varia, Shuai Wang, Kishaloy Halder, Robert Vacareanu, Miguel Ballesteros, Yassine Benajiba, Neha Anna John, Rishita Anubhai, Smaranda Muresan, Dan Roth
Aspect-based Sentiment Analysis (ABSA) is a fine-grained sentiment analysis task which involves four elements from user-generated texts: aspect term, aspect category, opinion term, and sentiment polarity.
1 code implementation • 11 Oct 2022 • Ben Zhou, Dian Yu, Dong Yu, Dan Roth
Speaker identification, determining which character said each utterance in literary text, benefits many downstream tasks.
no code implementations • 10 Oct 2022 • Haoyu Wang, Hongming Zhang, Yuqian Deng, Jacob R. Gardner, Dan Roth, Muhao Chen
In this paper, we seek to improve the faithfulness of TempRel extraction models from two perspectives.
Ranked #3 on Temporal Relation Classification on MATRES
no code implementations • 8 Oct 2022 • Haoyu Wang, Hongming Zhang, Yueguan Wang, Yuqian Deng, Muhao Chen, Dan Roth
In this paper, we address this gap by examining the extent to which current models comprehend the essentiality of step events in relation to a goal event.
no code implementations • 7 Oct 2022 • Vinayshekhar Bannihatti Kumar, Rashmi Gangadharaiah, Dan Roth
Research has shown that personality is a key driver to improve engagement and user experience in conversational systems.
no code implementations • 19 Jul 2022 • Harsha Kokel, Mayukh Das, Rakibul Islam, Julia Bonn, Jon Cai, Soham Dan, Anjali Narayan-Chen, Prashant Jayannavar, Janardhan Rao Doppa, Julia Hockenmaier, Sriraam Natarajan, Martha Palmer, Dan Roth
We consider the problem of human-machine collaborative problem solving as a planning task coupled with natural language communication.
4 code implementations • 9 Jun 2022 • Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ambrose Slone, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew Dai, Andrew La, Andrew Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakaş, B. Ryan Roberts, Bao Sheng Loe, Barret Zoph, Bartłomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, Benjamin Inden, Benno Stein, Berk Ekmekci, Bill Yuchen Lin, Blake Howald, Bryan Orinion, Cameron Diao, Cameron Dour, Catherine Stinson, Cedrick Argueta, César Ferri Ramírez, Chandan Singh, Charles Rathkopf, Chenlin Meng, Chitta Baral, Chiyu Wu, Chris Callison-Burch, Chris Waites, Christian Voigt, Christopher D. Manning, Christopher Potts, Cindy Ramirez, Clara E. Rivera, Clemencia Siro, Colin Raffel, Courtney Ashcraft, Cristina Garbacea, Damien Sileo, Dan Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Daniel Freeman, Daniel Khashabi, Daniel Levy, Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, Danqi Chen, Daphne Ippolito, Dar Gilboa, David Dohan, David Drakard, David Jurgens, Debajyoti Datta, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Derek Tam, Dieuwke Hupkes, Diganta Misra, Dilyar Buzan, Dimitri Coelho Mollo, Diyi Yang, Dong-Ho Lee, Dylan Schrader, Ekaterina Shutova, Ekin Dogus Cubuk, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Ellie Pavlick, Emanuele Rodola, Emma Lam, Eric Chu, Eric Tang, Erkut Erdem, Ernie Chang, Ethan A. Chi, Ethan Dyer, Ethan Jerzak, Ethan Kim, Eunice Engefu Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fatemeh Siar, Fernando Martínez-Plumed, Francesca Happé, Francois Chollet, Frieda Rong, Gaurav Mishra, Genta Indra Winata, Gerard de Melo, Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, Gloria Wang, Gonzalo Jaimovitch-López, Gregor Betz, Guy Gur-Ari, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hannaneh Hajishirzi, Harsh Mehta, Hayden Bogar, Henry Shevlin, Hinrich Schütze, Hiromu Yakura, Hongming Zhang, Hugh Mee Wong, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, Jacob Hilton, Jaehoon Lee, Jaime Fernández Fisac, James B. Simon, James Koppel, James Zheng, James Zou, Jan Kocoń, Jana Thompson, Janelle Wingfield, Jared Kaplan, Jarema Radom, Jascha Sohl-Dickstein, Jason Phang, Jason Wei, Jason Yosinski, Jekaterina Novikova, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesse Engel, Jesujoba Alabi, Jiacheng Xu, Jiaming Song, Jillian Tang, Joan Waweru, John Burden, John Miller, John U. Balis, Jonathan Batchelder, Jonathan Berant, Jörg Frohberg, Jos Rozen, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Guerr, Joseph Jones, Joshua B. Tenenbaum, Joshua S. Rule, Joyce Chua, Kamil Kanclerz, Karen Livescu, Karl Krauth, Karthik Gopalakrishnan, Katerina Ignatyeva, Katja Markert, Kaustubh D. 
Dhole, Kevin Gimpel, Kevin Omondi, Kory Mathewson, Kristen Chiafullo, Ksenia Shkaruta, Kumar Shridhar, Kyle McDonell, Kyle Richardson, Laria Reynolds, Leo Gao, Li Zhang, Liam Dugan, Lianhui Qin, Lidia Contreras-Ochando, Louis-Philippe Morency, Luca Moschella, Lucas Lam, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros Colón, Luke Metz, Lütfi Kerem Şenel, Maarten Bosma, Maarten Sap, Maartje ter Hoeve, Maheen Farooqi, Manaal Faruqui, Mantas Mazeika, Marco Baturan, Marco Marelli, Marco Maru, Maria Jose Ramírez Quintana, Marie Tolkiehn, Mario Giulianelli, Martha Lewis, Martin Potthast, Matthew L. Leavitt, Matthias Hagen, Mátyás Schubert, Medina Orduna Baitemirova, Melody Arnaud, Melvin McElrath, Michael A. Yee, Michael Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michał Swędrowski, Michele Bevilacqua, Michihiro Yasunaga, Mihir Kale, Mike Cain, Mimee Xu, Mirac Suzgun, Mitch Walker, Mo Tiwari, Mohit Bansal, Moin Aminnaseri, Mor Geva, Mozhdeh Gheini, Mukund Varma T, Nanyun Peng, Nathan A. Chi, Nayeon Lee, Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, Nick Doiron, Nicole Martinez, Nikita Nangia, Niklas Deckers, Niklas Muennighoff, Nitish Shirish Keskar, Niveditha S. Iyer, Noah Constant, Noah Fiedel, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Omer Levy, Owain Evans, Pablo Antonio Moreno Casares, Parth Doshi, Pascale Fung, Paul Pu Liang, Paul Vicol, Pegah Alipoormolabashi, Peiyuan Liao, Percy Liang, Peter Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Pouya Pezeshkpour, Priti Oli, Qiaozhu Mei, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Etta Rudolph, Raefer Gabriel, Rahel Habacker, Ramon Risco, Raphaël Millière, Rhythm Garg, Richard Barnes, Rif A. Saurous, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Roman Sitelew, Ronan LeBras, Rosanne Liu, Rowan Jacobs, Rui Zhang, Ruslan Salakhutdinov, Ryan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Rylan Yang, Sahib Singh, Saif M. Mohammad, Sajant Anand, Sam Dillavou, Sam Shleifer, Sam Wiseman, Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, Sanghyun Han, Sanjeev Kwatra, Sarah A. Rous, Sarik Ghazarian, Sayan Ghosh, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, Shadi Hamdan, Sharon Zhou, Shashank Srivastava, Sherry Shi, Shikhar Singh, Shima Asaadi, Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, Shyam Upadhyay, Shyamolima, Debnath, Siamak Shakeri, Simon Thormeyer, Simone Melzi, Siva Reddy, Sneha Priscilla Makini, Soo-Hwan Lee, Spencer Torene, Sriharsha Hatwar, Stanislas Dehaene, Stefan Divic, Stefano Ermon, Stella Biderman, Stephanie Lin, Stephen Prasad, Steven T. Piantadosi, Stuart M. 
Shieber, Summer Misherghi, Svetlana Kiritchenko, Swaroop Mishra, Tal Linzen, Tal Schuster, Tao Li, Tao Yu, Tariq Ali, Tatsu Hashimoto, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Tiberius Nkinyili, Timo Schick, Timofei Kornev, Titus Tunduny, Tobias Gerstenberg, Trenton Chang, Trishala Neeraj, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Vera Demberg, Victoria Nyamai, Vikas Raunak, Vinay Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, Vivek Srikumar, William Fedus, William Saunders, William Zhang, Wout Vossen, Xiang Ren, Xiaoyu Tong, Xinran Zhao, Xinyi Wu, Xudong Shen, Yadollah Yaghoobzadeh, Yair Lakretz, Yangqiu Song, Yasaman Bahri, Yejin Choi, Yichi Yang, Yiding Hao, Yifu Chen, Yonatan Belinkov, Yu Hou, Yufang Hou, Yuntao Bai, Zachary Seid, Zhuoye Zhao, Zijian Wang, Zijie J. Wang, ZiRui Wang, Ziyi Wu
BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models.
1 code implementation • Findings (NAACL) 2022 • Danilo Ribeiro, Shen Wang, Xiaofei Ma, Rui Dong, Xiaokai Wei, Henry Zhu, Xinchi Chen, Zhiheng Huang, Peng Xu, Andrew Arnold, Dan Roth
Our model is able to explain a given hypothesis by systematically generating a step-by-step explanation from textual premises.
1 code implementation • 29 Apr 2022 • Daniel Deutsch, Dan Roth
We introduce Repro, an open-source library which aims at improving the reproducibility and usability of research code.
no code implementations • NAACL 2022 • Daniel Deutsch, Rotem Dror, Dan Roth
How reliably an automatic summarization evaluation metric replicates human judgments of summary quality is quantified by system-level correlations.
no code implementations • Findings (ACL) 2022 • Daniel Deutsch, Dan Roth
Question answering-based summarization evaluation metrics must automatically determine whether the QA model's prediction is correct or not, a task known as answer verification.
1 code implementation • ACL 2022 • Aaron Mueller, Jason Krone, Salvatore Romeo, Saab Mansour, Elman Mansimov, Yi Zhang, Dan Roth
Label semantic aware systems have leveraged this information for improved text classification performance during fine-tuning and prediction.
2 code implementations • ACL 2022 • Zheng Li, Zijian Wang, Ming Tan, Ramesh Nallapati, Parminder Bhatia, Andrew Arnold, Bing Xiang, Dan Roth
Empirical analyses show that, despite the challenging nature of generative tasks, we were able to achieve a 16.5x model footprint compression ratio with little performance drop relative to the full-precision counterparts on multiple summarization and QA datasets.
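Much of a quantized model's footprint saving comes from storing weights at low precision; a minimal symmetric int8 weight quantizer (illustrative only, not the paper's distillation-aware scheme) looks like:

    import numpy as np

    def quantize_int8(w: np.ndarray):
        """Symmetric per-tensor int8 quantization: w is approximated by scale * q."""
        scale = np.abs(w).max() / 127.0
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
        return q.astype(np.float32) * scale

    w = np.random.default_rng(0).normal(size=(4, 4)).astype(np.float32)
    q, scale = quantize_int8(w)
    print("max abs error:", np.abs(w - dequantize(q, scale)).max())
    print("bytes: fp32 =", w.nbytes, "| int8 =", q.nbytes)  # 4x smaller per tensor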
1 code implementation • Findings (ACL) 2022 • Jie Ma, Miguel Ballesteros, Srikanth Doss, Rishita Anubhai, Sunil Mallya, Yaser Al-Onaizan, Dan Roth
We study the problem of few-shot learning for named entity recognition.
1 code implementation • CVPR 2022 • Georgios Georgakis, Karl Schmeckpeper, Karan Wanchoo, Soham Dan, Eleni Miltsakaki, Dan Roth, Kostas Daniilidis
We consider the problem of Vision-and-Language Navigation (VLN).
1 code implementation • 1 Mar 2022 • Xingyu Fu, Ben Zhou, Ishaan Preetam Chandratreya, Carl Vondrick, Dan Roth
For example, in Figure 1, we can find a way to identify the news articles related to the picture through segment-wise understandings of the signs, the buildings, the crowds, and more.
no code implementations • 20 Feb 2022 • Soham Dan, Osbert Bastani, Dan Roth
Currently, deep neural networks struggle to generalize robustly to such shifts in the data distribution.
1 code implementation • 31 Jan 2022 • Jiayao Zhang, Hongming Zhang, Weijie J. Su, Dan Roth
Commonsense causality reasoning (CCR) aims at identifying plausible causes and effects in natural language descriptions that are deemed reasonable by an average person.
2 code implementations • 28 Jan 2022 • Uri Alon, Frank F. Xu, Junxian He, Sudipta Sengupta, Dan Roth, Graham Neubig
Retrieval-based language models (R-LM) model the probability of natural language text by combining a standard language model (LM) with examples retrieved from an external datastore at test time.
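Most models in this family follow the kNN-LM recipe of interpolating the parametric LM's next-token distribution with a retrieval-induced distribution; a minimal sketch with toy distributions:

    import numpy as np

    def knn_lm_interpolate(p_lm: np.ndarray, p_knn: np.ndarray, lam: float) -> np.ndarray:
        """p(w | context) = lam * p_kNN + (1 - lam) * p_LM  (Khandelwal et al.)."""
        return lam * p_knn + (1.0 - lam) * p_lm

    p_lm = np.array([0.5, 0.3, 0.2])    # parametric LM next-token distribution
    p_knn = np.array([0.1, 0.8, 0.1])   # distribution from retrieved neighbors
    mixed = knn_lm_interpolate(p_lm, p_knn, lam=0.25)
    print(mixed, mixed.sum())           # still a valid probability distribution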
1 code implementation • 15 Dec 2021 • Xiaodong Yu, Wenpeng Yin, Nitish Gupta, Dan Roth
Third, we retrain and evaluate two state-of-the-art (SOTA) entity linking models, showing the challenges of event linking, and we propose an event-specific linking system EVELINK to set a competitive result for the new task.
1 code implementation • Findings (NAACL) 2022 • Sihao Chen, Siyi Liu, Xander Uyttendaele, Yi Zhang, William Bruno, Dan Roth
Naturally, identifying such responses within a document is a natural language understanding task.
no code implementations • 15 Nov 2021 • Daniel Deutsch, Dan Roth
In this work, we propose a method for incorporating question-answering (QA) signals into a summarization model.
no code implementations • 1 Nov 2021 • Bonan Min, Hayley Ross, Elior Sulem, Amir Pouran Ben Veyseh, Thien Huu Nguyen, Oscar Sainz, Eneko Agirre, Ilana Heinz, Dan Roth
Large, pre-trained transformer-based language models such as BERT have drastically changed the Natural Language Processing (NLP) field.
no code implementations • EMNLP 2021 • Haoyu Wang, Hongming Zhang, Muhao Chen, Dan Roth
The task of subevent detection aims to resolve this granularity issue, recognizing the membership of multi-granular events in event complexes.
no code implementations • ACL 2021 • Yi Zhang, Zachary Ives, Dan Roth
We experiment with a newly created evaluation dataset, Politi-Prov, based on fact-checking articles from \url{www.politifact.com}; our experimental results show that our solution leads to a significant improvement over baselines.
no code implementations • ACL 2021 • Qing Lyu, Hongming Zhang, Elior Sulem, Dan Roth
Event extraction has long been a challenging task, addressed mostly with supervised methods that require expensive annotation and are not extensible to new event ontologies.
no code implementations • ACL 2021 • Muhao Chen, Hongming Zhang, Qiang Ning, Manling Li, Heng Ji, Kathleen McKeown, Dan Roth
This tutorial targets researchers and practitioners who are interested in AI technologies that help machines understand natural language text, particularly real-world events described in the text.
1 code implementation • NAACL 2021 • Siyi Liu, Sihao Chen, Xander Uyttendaele, Dan Roth
We propose MultiOpEd, an open-domain news editorial corpus that supports various tasks pertaining to the argumentation structure in news editorials, focusing on automatic perspective discovery.
1 code implementation • NAACL 2021 • Haoyang Wen, Yanru Qu, Heng Ji, Qiang Ning, Jiawei Han, Avi Sil, Hanghang Tong, Dan Roth
Grounding events into a precise timeline is important for natural language understanding but has received limited attention in recent work.
1 code implementation • NAACL 2021 • Yi Zhang, Sujay Kumar Jauhar, Julia Kiseleva, Ryen White, Dan Roth
Both components of our graph induction solution are evaluated in experiments, demonstrating that our models outperform a state-of-the-art text generator significantly.
no code implementations • NAACL 2021 • Soham Dan, Michael Zhou, Dan Roth
Understanding and executing natural language instructions in a grounded domain is one of the hallmarks of artificial intelligence.
1 code implementation • NAACL 2021 • Haoyang Wen, Ying Lin, Tuan Lai, Xiaoman Pan, Sha Li, Xudong Lin, Ben Zhou, Manling Li, Haoyu Wang, Hongming Zhang, Xiaodong Yu, Alexander Dong, Zhenhailong Wang, Yi Fung, Piyush Mishra, Qing Lyu, Dídac Surís, Brian Chen, Susan Windisch Brown, Martha Palmer, Chris Callison-Burch, Carl Vondrick, Jiawei Han, Dan Roth, Shih-Fu Chang, Heng Ji
We present a new information extraction system that can automatically construct temporal event graphs from a collection of news documents from multiple sources, multiple languages (English and Spanish for our experiment), and multiple data modalities (speech, text, image and video).
1 code implementation • ICLR 2022 • Shuxiao Chen, Koby Crammer, Hangfeng He, Dan Roth, Weijie J. Su
In this paper, we introduce Target-Aware Weighted Training (TAWT), a weighted training algorithm for cross-task learning based on minimizing a representation-based task distance between the source and target tasks.
no code implementations • 26 Apr 2021 • Celine Lee, Justin Gottschlich, Dan Roth
With the growth of natural language processing techniques and demand for improved software engineering efficiency, there is an emerging interest in translating intention from human languages to programming languages.
no code implementations • NAACL 2021 • Sihao Chen, Fan Zhang, Kazoo Sone, Dan Roth
Despite significant progress in neural abstractive summarization, recent studies have shown that the current models are prone to generating summaries that are unfaithful to the original context.
no code implementations • EMNLP 2021 • Yanai Elazar, Hongming Zhang, Yoav Goldberg, Dan Roth
To support this claim, we first show that the current evaluation method of WS is sub-optimal and propose a modification that uses twin sentences for evaluation.
Ranked #24 on Coreference Resolution on Winograd Schema Challenge