Search Results for author: Vittorio Castelli

Found 29 papers, 4 papers with code

Towards Robust Neural Retrieval with Source Domain Synthetic Pre-Finetuning

no code implementations • COLING 2022 • Revanth Gangi Reddy, Vikas Yadav, Md Arafat Sultan, Martin Franz, Vittorio Castelli, Heng Ji, Avirup Sil

Research on neural IR has so far been focused primarily on standard supervised learning settings, where it outperforms traditional term matching baselines.

Data Augmentation Domain Generalization +2

From Instructions to Constraints: Language Model Alignment with Automatic Constraint Verification

no code implementations • 10 Mar 2024 • Fei Wang, Chao Shang, Sarthak Jain, Shuai Wang, Qiang Ning, Bonan Min, Vittorio Castelli, Yassine Benajiba, Dan Roth

We investigate common constraints in NLP tasks, categorize them into three classes based on the types of their arguments, and propose a unified framework, ACT (Aligning to ConsTraints), to automatically produce supervision signals for user alignment with constraints.

Abstractive Text Summarization Entity Typing +2

NewsQs: Multi-Source Question Generation for the Inquiring Mind

no code implementations • 28 Feb 2024 • Alyssa Hwang, Kalpit Dixit, Miguel Ballesteros, Yassine Benajiba, Vittorio Castelli, Markus Dreyer, Mohit Bansal, Kathleen McKeown

We present NewsQs (news-cues), a dataset that provides question-answer pairs for multiple news documents.

Document Summarization Multi-Document Summarization +3

Few-Shot Data-to-Text Generation via Unified Representation and Multi-Source Learning

no code implementations • 10 Aug 2023 • Alexander Hanbo Li, Mingyue Shang, Evangelia Spiliopoulou, Jie Ma, Patrick Ng, Zhiguo Wang, Bonan Min, William Wang, Kathleen McKeown, Vittorio Castelli, Dan Roth, Bing Xiang

We present a novel approach for structured data-to-text generation that addresses the limitations of existing methods that primarily focus on specific types of structured data.

Data-to-Text Generation

Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge

no code implementations • 30 May 2023 • Xingyu Fu, Sheng Zhang, Gukyeong Kwon, Pramuditha Perera, Henghui Zhu, Yuhao Zhang, Alexander Hanbo Li, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Dan Roth, Bing Xiang

The open-ended Visual Question Answering (VQA) task requires AI models to jointly reason over visual and natural language inputs using world knowledge.

Answer Selection Visual Question Answering +1

Benchmarking Diverse-Modal Entity Linking with Generative Models

no code implementations • 27 May 2023 • Sijia Wang, Alexander Hanbo Li, Henry Zhu, Sheng Zhang, Chung-Wei Hang, Pramuditha Perera, Jie Ma, William Wang, Zhiguo Wang, Vittorio Castelli, Bing Xiang, Patrick Ng

Entities can be expressed in diverse formats, such as texts, images, or column names and cell values in tables.

Benchmarking Entity Linking +1

UNITE: A Unified Benchmark for Text-to-SQL Evaluation

1 code implementation • 25 May 2023 • Wuwei Lan, Zhiguo Wang, Anuj Chauhan, Henghui Zhu, Alexander Li, Jiang Guo, Sheng Zhang, Chung-Wei Hang, Joseph Lilien, Yiqun Hu, Lin Pan, Mingwen Dong, Jun Wang, Jiarong Jiang, Stephen Ash, Vittorio Castelli, Patrick Ng, Bing Xiang

A practical text-to-SQL system should generalize well on a wide variety of natural language questions, unseen database schemas, and novel SQL query structures.

Text-To-SQL

Pre-training Intent-Aware Encoders for Zero- and Few-Shot Intent Classification

1 code implementation • 24 May 2023 • Mujeen Sung, James Gung, Elman Mansimov, Nikolaos Pappas, Raphael Shu, Salvatore Romeo, Yi Zhang, Vittorio Castelli

Intent classification (IC) plays an important role in task-oriented dialogue systems.

Contrastive Learning intent-classification +2

Taxonomy Expansion for Named Entity Recognition

no code implementations • 22 May 2023 • Karthikeyan K, Yogarshi Vyas, Jie Ma, Giovanni Paolini, Neha Anna John, Shuai Wang, Yassine Benajiba, Vittorio Castelli, Dan Roth, Miguel Ballesteros

We experiment with 6 diverse datasets and show that PLM consistently performs better than most other approaches (0.5 - 2.5 F1), including in novel settings for taxonomy expansion not considered in prior work.

named-entity-recognition Named Entity Recognition +2

Comparing Biases and the Impact of Multilingual Training across Multiple Languages

no code implementations • 18 May 2023 • Sharon Levy, Neha Anna John, Ling Liu, Yogarshi Vyas, Jie Ma, Yoshinari Fujinuma, Miguel Ballesteros, Vittorio Castelli, Dan Roth

As a result, it is critical to examine biases within each language and attribute.

Attribute Fairness +1

Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness

2 code implementations • 21 Jan 2023 • Shuaichen Chang, Jun Wang, Mingwen Dong, Lin Pan, Henghui Zhu, Alexander Hanbo Li, Wuwei Lan, Sheng Zhang, Jiarong Jiang, Joseph Lilien, Steve Ash, William Yang Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Bing Xiang

Neural text-to-SQL models have achieved remarkable performance in translating natural language questions into SQL queries.

Natural Questions Text-To-SQL

Importance of Synthesizing High-quality Data for Text-to-SQL Parsing

no code implementations • 17 Dec 2022 • Yiyun Zhao, Jiarong Jiang, Yiqun Hu, Wuwei Lan, Henry Zhu, Anuj Chauhan, Alexander Li, Lin Pan, Jun Wang, Chung-Wei Hang, Sheng Zhang, Marvin Dong, Joe Lilien, Patrick Ng, Zhiguo Wang, Vittorio Castelli, Bing Xiang

In this paper, we first examined the existing synthesized datasets and discovered that state-of-the-art text-to-SQL algorithms did not further improve on popular benchmarks when trained with augmented synthetic data.

SQL Parsing SQL-to-Text +2

Novel Chapter Abstractive Summarization using Spinal Tree Aware Sub-Sentential Content Selection

no code implementations • 9 Nov 2022 • Hardy Hardy, Miguel Ballesteros, Faisal Ladhak, Muhammad Khalifa, Vittorio Castelli, Kathleen McKeown

Summarizing novel chapters is a difficult task due to the input length and the fact that sentences that appear in the desired summaries draw content from multiple places throughout the chapter.

Abstractive Text Summarization Extractive Summarization

Synthetic Target Domain Supervision for Open Retrieval QA

no code implementations • 20 Apr 2022 • Revanth Gangi Reddy, Bhavani Iyer, Md Arafat Sultan, Rong Zhang, Avirup Sil, Vittorio Castelli, Radu Florian, Salim Roukos

Neural passage retrieval is a new and promising approach in open retrieval question answering.

Passage Retrieval Question Answering +1

Towards Robust Neural Retrieval Models with Synthetic Pre-Training

no code implementations • 15 Apr 2021 • Revanth Gangi Reddy, Vikas Yadav, Md Arafat Sultan, Martin Franz, Vittorio Castelli, Heng Ji, Avirup Sil

Recent work has shown that commonly available machine reading comprehension (MRC) datasets can be used to train high-performance neural information retrieval (IR) systems.

Information Retrieval Machine Reading Comprehension +1

End-to-End QA on COVID-19: Domain Adaptation with Synthetic Training

no code implementations • 2 Dec 2020 • Revanth Gangi Reddy, Bhavani Iyer, Md Arafat Sultan, Rong Zhang, Avi Sil, Vittorio Castelli, Radu Florian, Salim Roukos

End-to-end question answering (QA) requires both information retrieval (IR) over a large document collection and machine reading comprehension (MRC) on the retrieved passages.

Domain Adaptation Information Retrieval +3

Scalable Cross-lingual Treebank Synthesis for Improved Production Dependency Parsers

no code implementations • COLING 2020 • Yousef El-Kurdi, Hiroshi Kanayama, Efsun Sarioglu Kayi, Vittorio Castelli, Todd Ward, Radu Florian

We present scalable Universal Dependency (UD) treebank synthesis techniques that exploit advances in language representation modeling which leverage vast amounts of unlabeled general-purpose multilingual text.

Data Augmentation

Answer Span Correction in Machine Reading Comprehension

no code implementations • Findings of the Association for Computational Linguistics 2020 • Revanth Gangi Reddy, Md Arafat Sultan, Efsun Sarioglu Kayi, Rong Zhang, Vittorio Castelli, Avirup Sil

Answer validation in machine reading comprehension (MRC) consists of verifying an extracted answer against an input context and question pair.

Machine Reading Comprehension

Improved Synthetic Training for Reading Comprehension

no code implementations • 24 Oct 2020 • Yanda Chen, Md Arafat Sultan, Vittorio Castelli

Automatically generated synthetic training examples have been shown to improve performance in machine reading comprehension (MRC).

Knowledge Distillation Machine Reading Comprehension

Multi-Stage Pre-training for Low-Resource Domain Adaptation

no code implementations • EMNLP 2020 • Rong Zhang, Revanth Gangi Reddy, Md Arafat Sultan, Vittorio Castelli, Anthony Ferritto, Radu Florian, Efsun Sarioglu Kayi, Salim Roukos, Avirup Sil, Todd Ward

Transfer learning techniques are particularly useful in NLP tasks where a sizable amount of high-quality annotated data is difficult to obtain.

Document Ranking Domain Adaptation +3

On the Importance of Diversity in Question Generation for QA

no code implementations • ACL 2020 • Md. Arafat Sultan, Shubham Chandel, Ramón Fernandez Astudillo, Vittorio Castelli

Automatic question generation (QG) has shown promise as a source of synthetic training data for question answering (QA).

Question Answering Question Generation +1

The TechQA Dataset

2 code implementations • ACL 2020 • Vittorio Castelli, Rishav Chakravarti, Saswati Dana, Anthony Ferritto, Radu Florian, Martin Franz, Dinesh Garg, Dinesh Khandelwal, Scott McCarley, Mike McCawley, Mohamed Nasr, Lin Pan, Cezar Pendus, John Pitrelli, Saurabh Pujar, Salim Roukos, Andrzej Sakrajda, Avirup Sil, Rosario Uceda-Sosa, Todd Ward, Rong Zhang

We introduce TechQA, a domain-adaptation question answering dataset for the technical support domain.

Domain Adaptation Question Answering

Cross-Task Knowledge Transfer for Query-Based Text Summarization

no code implementations • WS 2019 • Elozino Egonmwan, Vittorio Castelli, Md. Arafat Sultan

We demonstrate the viability of knowledge transfer between two related tasks: machine reading comprehension (MRC) and query-based text summarization.

Machine Reading Comprehension Machine Translation +4

CFO: A Framework for Building Production NLP Systems

no code implementations • IJCNLP 2019 • Rishav Chakravarti, Cezar Pendus, Andrzej Sakrajda, Anthony Ferritto, Lin Pan, Michael Glass, Vittorio Castelli, J. William Murdock, Radu Florian, Salim Roukos, Avirup Sil

This paper introduces a novel orchestration framework, called CFO (COMPUTATION FLOW ORCHESTRATOR), for building, experimenting with, and deploying interactive NLP (Natural Language Processing) and IR (Information Retrieval) systems to production environments.

Information Retrieval Machine Reading Comprehension +2

IBM Research at the CoNLL 2018 Shared Task on Multilingual Parsing

no code implementations • CONLL 2018 • Hui Wan, Tahira Naseem, Young-suk Lee, Vittorio Castelli, Miguel Ballesteros

This paper presents the IBM Research AI submission to the CoNLL 2018 Shared Task on Parsing Universal Dependencies.

Dependency Parsing Morphological Tagging +3

A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization

no code implementations • ACL 2013 • Lu Wang, Hema Raghavan, Vittorio Castelli, Radu Florian, Claire Cardie

We consider the problem of using sentence compression techniques to facilitate query-focused multi-document summarization.

Document Summarization Multi-Document Summarization +2

Query-Focused Opinion Summarization for User-Generated Content

no code implementations • COLING 2014 • Lu Wang, Hema Raghavan, Claire Cardie, Vittorio Castelli

We present a submodular function-based framework for query-focused opinion summarization.

Opinion Summarization text similarity

A Joint Model for Answer Sentence Ranking and Answer Extraction

no code implementations • TACL 2016 • Md. Arafat Sultan, Vittorio Castelli, Radu Florian

Answer sentence ranking and answer extraction are two key challenges in question answering that have traditionally been treated in isolation, i.e., as independent tasks.

Information Retrieval Question Answering +2

Finding What Matters in Questions

no code implementations • NAACL 2013 • Xiaoqiang Luo, Hema Raghavan, Vittorio Castelli, Sameer Maskey, Radu Florian

Information Retrieval Question Answering
