no code implementations • COLING 2022 • Revanth Gangi Reddy, Vikas Yadav, Md Arafat Sultan, Martin Franz, Vittorio Castelli, Heng Ji, Avirup Sil
Research on neural IR has so far been focused primarily on standard supervised learning settings, where it outperforms traditional term matching baselines.
no code implementations • 10 Mar 2024 • Fei Wang, Chao Shang, Sarthak Jain, Shuai Wang, Qiang Ning, Bonan Min, Vittorio Castelli, Yassine Benajiba, Dan Roth
We investigate common constraints in NLP tasks, categorize them into three classes based on the types of their arguments, and propose a unified framework, ACT (Aligning to ConsTraints), to automatically produce supervision signals for user alignment with constraints.
no code implementations • 28 Feb 2024 • Alyssa Hwang, Kalpit Dixit, Miguel Ballesteros, Yassine Benajiba, Vittorio Castelli, Markus Dreyer, Mohit Bansal, Kathleen McKeown
We present NewsQs (news-cues), a dataset that provides question-answer pairs for multiple news documents.
no code implementations • 10 Aug 2023 • Alexander Hanbo Li, Mingyue Shang, Evangelia Spiliopoulou, Jie Ma, Patrick Ng, Zhiguo Wang, Bonan Min, William Wang, Kathleen McKeown, Vittorio Castelli, Dan Roth, Bing Xiang
We present a novel approach for structured data-to-text generation that addresses the limitations of existing methods that primarily focus on specific types of structured data.
no code implementations • 30 May 2023 • Xingyu Fu, Sheng Zhang, Gukyeong Kwon, Pramuditha Perera, Henghui Zhu, Yuhao Zhang, Alexander Hanbo Li, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Dan Roth, Bing Xiang
The open-ended Visual Question Answering (VQA) task requires AI models to jointly reason over visual and natural language inputs using world knowledge.
no code implementations • 27 May 2023 • Sijia Wang, Alexander Hanbo Li, Henry Zhu, Sheng Zhang, Chung-Wei Hang, Pramuditha Perera, Jie Ma, William Wang, Zhiguo Wang, Vittorio Castelli, Bing Xiang, Patrick Ng
Entities can be expressed in diverse formats, such as texts, images, or column names and cell values in tables.
1 code implementation • 25 May 2023 • Wuwei Lan, Zhiguo Wang, Anuj Chauhan, Henghui Zhu, Alexander Li, Jiang Guo, Sheng Zhang, Chung-Wei Hang, Joseph Lilien, Yiqun Hu, Lin Pan, Mingwen Dong, Jun Wang, Jiarong Jiang, Stephen Ash, Vittorio Castelli, Patrick Ng, Bing Xiang
A practical text-to-SQL system should generalize well on a wide variety of natural language questions, unseen database schemas, and novel SQL query structures.
1 code implementation • 24 May 2023 • Mujeen Sung, James Gung, Elman Mansimov, Nikolaos Pappas, Raphael Shu, Salvatore Romeo, Yi Zhang, Vittorio Castelli
Intent classification (IC) plays an important role in task-oriented dialogue systems.
no code implementations • 22 May 2023 • Karthikeyan K, Yogarshi Vyas, Jie Ma, Giovanni Paolini, Neha Anna John, Shuai Wang, Yassine Benajiba, Vittorio Castelli, Dan Roth, Miguel Ballesteros
We experiment with 6 diverse datasets and show that PLM consistently performs better than most other approaches (0. 5 - 2. 5 F1), including in novel settings for taxonomy expansion not considered in prior work.
no code implementations • 18 May 2023 • Sharon Levy, Neha Anna John, Ling Liu, Yogarshi Vyas, Jie Ma, Yoshinari Fujinuma, Miguel Ballesteros, Vittorio Castelli, Dan Roth
As a result, it is critical to examine biases within each language and attribute.
2 code implementations • 21 Jan 2023 • Shuaichen Chang, Jun Wang, Mingwen Dong, Lin Pan, Henghui Zhu, Alexander Hanbo Li, Wuwei Lan, Sheng Zhang, Jiarong Jiang, Joseph Lilien, Steve Ash, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Bing Xiang
Neural text-to-SQL models have achieved remarkable performance in translating natural language questions into SQL queries.
no code implementations • 17 Dec 2022 • Yiyun Zhao, Jiarong Jiang, Yiqun Hu, Wuwei Lan, Henry Zhu, Anuj Chauhan, Alexander Li, Lin Pan, Jun Wang, Chung-Wei Hang, Sheng Zhang, Marvin Dong, Joe Lilien, Patrick Ng, Zhiguo Wang, Vittorio Castelli, Bing Xiang
In this paper, we first examined the existing synthesized datasets and discovered that state-of-the-art text-to-SQL algorithms did not further improve on popular benchmarks when trained with augmented synthetic data.
no code implementations • 9 Nov 2022 • Hardy Hardy, Miguel Ballesteros, Faisal Ladhak, Muhammad Khalifa, Vittorio Castelli, Kathleen McKeown
Summarizing novel chapters is a difficult task due to the input length and the fact that sentences that appear in the desired summaries draw content from multiple places throughout the chapter.
no code implementations • 20 Apr 2022 • Revanth Gangi Reddy, Bhavani Iyer, Md Arafat Sultan, Rong Zhang, Avirup Sil, Vittorio Castelli, Radu Florian, Salim Roukos
Neural passage retrieval is a new and promising approach in open retrieval question answering.
no code implementations • 15 Apr 2021 • Revanth Gangi Reddy, Vikas Yadav, Md Arafat Sultan, Martin Franz, Vittorio Castelli, Heng Ji, Avirup Sil
Recent work has shown that commonly available machine reading comprehension (MRC) datasets can be used to train high-performance neural information retrieval (IR) systems.
no code implementations • 2 Dec 2020 • Revanth Gangi Reddy, Bhavani Iyer, Md Arafat Sultan, Rong Zhang, Avi Sil, Vittorio Castelli, Radu Florian, Salim Roukos
End-to-end question answering (QA) requires both information retrieval (IR) over a large document collection and machine reading comprehension (MRC) on the retrieved passages.
no code implementations • COLING 2020 • Yousef El-Kurdi, Hiroshi Kanayama, Efsun Sarioglu Kayi, Vittorio Castelli, Todd Ward, Radu Florian
We present scalable Universal Dependency (UD) treebank synthesis techniques that exploit advances in language representation modeling which leverage vast amounts of unlabeled general-purpose multilingual text.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Revanth Gangi Reddy, Md Arafat Sultan, Efsun Sarioglu Kayi, Rong Zhang, Vittorio Castelli, Avirup Sil
Answer validation in machine reading comprehension (MRC) consists of verifying an extracted answer against an input context and question pair.
no code implementations • 24 Oct 2020 • Yanda Chen, Md Arafat Sultan, Vittorio Castelli
Automatically generated synthetic training examples have been shown to improve performance in machine reading comprehension (MRC).
no code implementations • EMNLP 2020 • Rong Zhang, Revanth Gangi Reddy, Md Arafat Sultan, Vittorio Castelli, Anthony Ferritto, Radu Florian, Efsun Sarioglu Kayi, Salim Roukos, Avirup Sil, Todd Ward
Transfer learning techniques are particularly useful in NLP tasks where a sizable amount of high-quality annotated data is difficult to obtain.
no code implementations • ACL 2020 • Md. Arafat Sultan, Ch, Shubham el, Fern, Ram{\'o}n ez Astudillo, Vittorio Castelli
Automatic question generation (QG) has shown promise as a source of synthetic training data for question answering (QA).
2 code implementations • ACL 2020 • Vittorio Castelli, Rishav Chakravarti, Saswati Dana, Anthony Ferritto, Radu Florian, Martin Franz, Dinesh Garg, Dinesh Khandelwal, Scott McCarley, Mike McCawley, Mohamed Nasr, Lin Pan, Cezar Pendus, John Pitrelli, Saurabh Pujar, Salim Roukos, Andrzej Sakrajda, Avirup Sil, Rosario Uceda-Sosa, Todd Ward, Rong Zhang
We introduce TechQA, a domain-adaptation question answering dataset for the technical support domain.
no code implementations • WS 2019 • Elozino Egonmwan, Vittorio Castelli, Md. Arafat Sultan
We demonstrate the viability of knowledge transfer between two related tasks: machine reading comprehension (MRC) and query-based text summarization.
no code implementations • IJCNLP 2019 • Rishav Chakravarti, Cezar Pendus, Andrzej Sakrajda, Anthony Ferritto, Lin Pan, Michael Glass, Vittorio Castelli, J. William Murdock, Radu Florian, Salim Roukos, Avirup Sil
This paper introduces a novel orchestration framework, called CFO (COMPUTATION FLOW ORCHESTRATOR), for building, experimenting with, and deploying interactive NLP (Natural Language Processing) and IR (Information Retrieval) systems to production environments.
no code implementations • CONLL 2018 • Hui Wan, Tahira Naseem, Young-suk Lee, Vittorio Castelli, Miguel Ballesteros
This paper presents the IBM Research AI submission to the CoNLL 2018 Shared Task on Parsing Universal Dependencies.
no code implementations • ACL 2013 • Lu Wang, Hema Raghavan, Vittorio Castelli, Radu Florian, Claire Cardie
We consider the problem of using sentence compression techniques to facilitate query-focused multi-document summarization.
no code implementations • COLING 2014 • Lu Wang, Hema Raghavan, Claire Cardie, Vittorio Castelli
We present a submodular function-based framework for query-focused opinion summarization.
no code implementations • TACL 2016 • Md. Arafat Sultan, Vittorio Castelli, Radu Florian
Answer sentence ranking and answer extraction are two key challenges in question answering that have traditionally been treated in isolation, i. e., as independent tasks.