Search Results for author: Dragomir Radev

Found 119 papers, 70 papers with code

Testing Cross-Database Semantic Parsers With Canonical Utterances

1 code implementation • EMNLP (Eval4NLP) 2021 • Heather Lent, Semih Yavuz, Tao Yu, Tong Niu, Yingbo Zhou, Dragomir Radev, Xi Victoria Lin

Paper
Code

You reap what you sow: On the Challenges of Bias Evaluation Under Multilingual Settings

no code implementations • BigScience (ACL) 2022 • Zeerak Talat, Aurélie Névéol, Stella Biderman, Miruna Clinciu, Manan Dey, Shayne Longpre, Sasha Luccioni, Maraim Masoud, Margaret Mitchell, Dragomir Radev, Shanya Sharma, Arjun Subramonian, Jaesung Tae, Samson Tan, Deepak Tunuguntla, Oskar van der Wal

Evaluating bias, fairness, and social impact in monolingual language models is a difficult task.

Fairness Language Modelling

Paper
Add Code

Improving Cross-lingual Text Classification with Zero-shot Instance-Weighting

no code implementations • ACL (RepL4NLP) 2021 • Irene Li, Prithviraj Sen, Huaiyu Zhu, Yunyao Li, Dragomir Radev

In this paper, we propose zero-shot instance-weighting, a general model-agnostic zero-shot learning framework for improving CLTC by leveraging source instance weighting.

text-classification Text Classification +1

Paper
Add Code

An Exploratory Study on Long Dialogue Summarization: What Works and What’s Next

1 code implementation • Findings (EMNLP) 2021 • Yusen Zhang, Ansong Ni, Tao Yu, Rui Zhang, Chenguang Zhu, Budhaditya Deb, Asli Celikyilmaz, Ahmed Hassan Awadallah, Dragomir Radev

Dialogue summarization helps readers capture salient information from long conversations in meetings, interviews, and TV series.

Retrieval

Paper
Code

Ascle: A Python Natural Language Processing Toolkit for Medical Text Generation

3 code implementations • 28 Nov 2023 • Rui Yang, Qingcheng Zeng, Keen You, Yujie Qiao, Lucas Huang, Chia-Chun Hsieh, Benjamin Rosand, Jeremy Goldwasser, Amisha D Dave, Tiarnan D. L. Keenan, Emily Y Chew, Dragomir Radev, Zhiyong Lu, Hua Xu, Qingyu Chen, Irene Li

This study introduces Ascle, a pioneering natural language processing (NLP) toolkit designed for medical text generation.

Machine Translation Question Answering +5

Paper
Code

Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization

1 code implementation • 15 Nov 2023 • Yixin Liu, Alexander R. Fabbri, Jiawen Chen, Yilun Zhao, Simeng Han, Shafiq Joty, PengFei Liu, Dragomir Radev, Chien-Sheng Wu, Arman Cohan

Our study reveals that instruction controllable text summarization remains a challenging task for LLMs, since (1) all LLMs evaluated still make factual and other types of errors in their summaries; (2) all LLM-based evaluation methods cannot achieve a strong alignment with human annotators when judging the quality of candidate summaries; (3) different LLMs show large performance gaps in summary generation and evaluation.

Benchmarking Text Summarization

Paper
Code

Fair Abstractive Summarization of Diverse Perspectives

1 code implementation • 14 Nov 2023 • Yusen Zhang, Nan Zhang, Yixin Liu, Alexander Fabbri, Junru Liu, Ryo Kamoi, Xiaoxin Lu, Caiming Xiong, Jieyu Zhao, Dragomir Radev, Kathleen McKeown, Rui Zhang

However, current work in summarization metrics and Large Language Models (LLMs) evaluation has not explored fair abstractive summarization.

Abstractive Text Summarization Fairness

Paper
Code

L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models

no code implementations • 29 Sep 2023 • Ansong Ni, Pengcheng Yin, Yilun Zhao, Martin Riddell, Troy Feng, Rui Shen, Stephen Yin, Ye Liu, Semih Yavuz, Caiming Xiong, Shafiq Joty, Yingbo Zhou, Dragomir Radev, Arman Cohan

Recently, large language models (LLMs), especially those that are pretrained on code, have demonstrated strong capabilities in generating programs from natural language inputs in a few-shot or even zero-shot manner.

Code Generation Math +1

Paper
Add Code

RobuT: A Systematic Study of Table QA Robustness Against Human-Annotated Adversarial Perturbations

1 code implementation • 25 Jun 2023 • Yilun Zhao, Chen Zhao, Linyong Nan, Zhenting Qi, Wenlin Zhang, Xiangru Tang, Boyu Mi, Dragomir Radev

Despite significant progress having been made in question answering on tabular data (Table QA), it's unclear whether, and to what extent existing Table QA models are robust to task-specific perturbations, e. g., replacing key question entities or shuffling table columns.

Few-Shot Learning Question Answering

Paper
Code

bgGLUE: A Bulgarian General Language Understanding Evaluation Benchmark

2 code implementations • 4 Jun 2023 • Momchil Hardalov, Pepa Atanasova, Todor Mihaylov, Galia Angelova, Kiril Simov, Petya Osenova, Ves Stoyanov, Ivan Koychev, Preslav Nakov, Dragomir Radev

We run the first systematic evaluation of pre-trained language models for Bulgarian, comparing and contrasting results across the nine tasks in the benchmark.

Fact Checking named-entity-recognition +5

Paper
Code

On Learning to Summarize with Large Language Models as References

1 code implementation • 23 May 2023 • Yixin Liu, Kejian Shi, Katherine S He, Longtian Ye, Alexander R. Fabbri, PengFei Liu, Dragomir Radev, Arman Cohan

Meanwhile, we perform a meta-analysis on this new learning setting that reveals a discrepancy between human and LLM-based evaluation, highlighting the benefits and risks of this LLM-as-reference setting we investigated.

Contrastive Learning Text Summarization

Paper
Code

QTSumm: Query-Focused Summarization over Tabular Data

2 code implementations • 23 May 2023 • Yilun Zhao, Zhenting Qi, Linyong Nan, Boyu Mi, Yixin Liu, Weijin Zou, Simeng Han, Ruizhe Chen, Xiangru Tang, Yumo Xu, Dragomir Radev, Arman Cohan

Motivated by this, we define a new query-focused table summarization task, where text generation models have to perform human-like reasoning and analysis over the given table to generate a tailored summary.

Query-focused Summarization Table-to-Text Generation

Paper
Code

Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies

no code implementations • 21 May 2023 • Linyong Nan, Yilun Zhao, Weijin Zou, Narutatsu Ri, Jaesung Tae, Ellen Zhang, Arman Cohan, Dragomir Radev

In-context learning (ICL) has emerged as a new approach to various natural language processing tasks, utilizing large language models (LLMs) to make predictions based on context that has been supplemented with a few examples or task-specific instructions.

In-Context Learning Question Answering +1

Paper
Add Code

HPE:Answering Complex Questions over Text by Hybrid Question Parsing and Execution

no code implementations • 12 May 2023 • Ye Liu, Semih Yavuz, Rui Meng, Dragomir Radev, Caiming Xiong, Yingbo Zhou

It comprises two central pillars: (1) We parse the question of varying complexity into an intermediate representation, named H-expression, which is composed of simple questions as the primitives and symbolic operations representing the relationships among them; (2) To execute the resulting H-expressions, we design a hybrid executor, which integrates the deterministic rules to translate the symbolic operations with a drop-in neural reader network to answer each decomposed simple question.

Knowledge Graphs Question Answering +1

Paper
Add Code

HiPool: Modeling Long Documents Using Graph Neural Networks

1 code implementation • 5 May 2023 • Irene Li, Aosong Feng, Dragomir Radev, Rex Ying

Encoding long sequences in Natural Language Processing (NLP) is a challenging problem.

Document Classification Sentence

Paper
Code

Evaluating GPT-4 and ChatGPT on Japanese Medical Licensing Examinations

1 code implementation • 31 Mar 2023 • Jungo Kasai, Yuhei Kasai, Keisuke Sakaguchi, Yutaro Yamada, Dragomir Radev

In this work, we evaluate LLM APIs (ChatGPT, GPT-3, and GPT-4) on the Japanese national medical licensing examinations from the past five years, including the current year.

Paper
Code

Towards Interpretable and Efficient Automatic Reference-Based Summarization Evaluation

1 code implementation • 7 Mar 2023 • Yixin Liu, Alexander R. Fabbri, Yilun Zhao, PengFei Liu, Shafiq Joty, Chien-Sheng Wu, Caiming Xiong, Dragomir Radev

Interpretability and efficiency are two important considerations for the adoption of neural automatic metrics.

Paper
Code

ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics

2 code implementations • 24 Feb 2023 • Zhangir Azerbayev, Bartosz Piotrowski, Hailey Schoelkopf, Edward W. Ayers, Dragomir Radev, Jeremy Avigad

We introduce ProofNet, a benchmark for autoformalization and formal proving of undergraduate-level mathematics.

Abstract Algebra Automated Theorem Proving +2

6,556

Paper
Code

LEVER: Learning to Verify Language-to-Code Generation with Execution

1 code implementation • 16 Feb 2023 • Ansong Ni, Srini Iyer, Dragomir Radev, Ves Stoyanov, Wen-tau Yih, Sida I. Wang, Xi Victoria Lin

The advent of large language models trained on code (code LLMs) has led to significant progress in language-to-code generation.

Ranked #2 on Semantic Parsing on spider

Arithmetic Reasoning Code Generation +3

Paper
Code

LoFT: Enhancing Faithfulness and Diversity for Table-to-Text Generation via Logic Form Control

1 code implementation • 6 Feb 2023 • Yilun Zhao, Zhenting Qi, Linyong Nan, Lorenzo Jaime Yu Flores, Dragomir Radev

Logical Table-to-Text (LT2T) generation is tasked with generating logically faithful sentences from tables.

Table-to-Text Generation

Paper
Code

Rethinking Explaining Graph Neural Networks via Non-parametric Subgraph Matching

1 code implementation • 7 Jan 2023 • Fang Wu, Siyuan Li, Xurui Jin, Yinghui Jiang, Dragomir Radev, Zhangming Niu, Stan Z. Li

It takes advantage of MatchExplainer to fix the most informative portion of the graph and merely operates graph augmentations on the rest less informative part.

Graph Sampling

Paper
Code

STRUDEL: Structured Dialogue Summarization for Dialogue Comprehension

no code implementations • 24 Dec 2022 • Borui Wang, Chengcheng Feng, Arjun Nair, Madelyn Mao, Jai Desai, Asli Celikyilmaz, Haoran Li, Yashar Mehdad, Dragomir Radev

Abstractive dialogue summarization has long been viewed as an important standalone task in natural language processing, but no previous work has explored the possibility of whether abstractive dialogue summarization can also be used as a means to boost an NLP system's performance on other important dialogue comprehension tasks.

Abstractive Dialogue Summarization Question Answering

Paper
Add Code

On Improving Summarization Factual Consistency from Natural Language Feedback

1 code implementation • 20 Dec 2022 • Yixin Liu, Budhaditya Deb, Milagro Teruel, Aaron Halfaker, Dragomir Radev, Ahmed H. Awadallah

We collect a high-quality dataset, DeFacto, containing human demonstrations and informational natural language feedback consisting of corrective instructions, edited summaries, and explanations with respect to the factual consistency of the summary.

Text Generation Zero-Shot Learning

Paper
Code

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting

1 code implementation • 19 Dec 2022 • Zheng-Xin Yong, Hailey Schoelkopf, Niklas Muennighoff, Alham Fikri Aji, David Ifeoluwa Adelani, Khalid Almubarak, M Saiful Bari, Lintang Sutawika, Jungo Kasai, Ahmed Baruwa, Genta Indra Winata, Stella Biderman, Edward Raff, Dragomir Radev, Vassilina Nikoulina

We find language adaptation to be effective at improving zero-shot performance in new languages.

Language Modelling Zero-Shot Learning

Paper
Code

Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation

2 code implementations • 15 Dec 2022 • Yixin Liu, Alexander R. Fabbri, PengFei Liu, Yilun Zhao, Linyong Nan, Ruilin Han, Simeng Han, Shafiq Joty, Chien-Sheng Wu, Caiming Xiong, Dragomir Radev

Human evaluation is the foundation upon which the evaluation of both summarization systems and automatic metrics rests.

Paper
Code

Integration of Pre-trained Protein Language Models into Geometric Deep Learning Networks

1 code implementation • 7 Dec 2022 • Fang Wu, Lirong Wu, Dragomir Radev, Jinbo Xu, Stan Z. Li

Geometric deep learning has recently achieved great success in non-Euclidean domains, and learning on 3D structures of large biomolecules is emerging as a distinct research area.

Protein Interface Prediction Representation Learning

Paper
Code

Explicit Knowledge Transfer for Weakly-Supervised Code Generation

no code implementations • 30 Nov 2022 • Zhangir Azerbayev, Ansong Ni, Hailey Schoelkopf, Dragomir Radev

More specifically, we propose explicit knowledge transfer (EKT), which uses the few-shot capabilities of a teacher LLM to create NL-code pairs that we then filter for correctness and fine-tune the student on.

Code Generation Few-Shot Learning +4

Paper
Add Code

CREATIVESUMM: Shared Task on Automatic Summarization for Creative Writing

no code implementations • COLING (CreativeSumm) 2022 • Divyansh Agarwal, Alexander R. Fabbri, Simeng Han, Wojciech Kryściński, Faisal Ladhak, Bryan Li, Kathleen McKeown, Dragomir Radev, Tianyi Zhang, Sam Wiseman

We detail the process of curating these datasets for the task, as well as the metrics used for the evaluation of the submissions.

Text Summarization

Paper
Add Code

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

6 code implementations • 9 Nov 2022 • BigScience Workshop, :, Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, Iz Beltagy, Huu Nguyen, Lucile Saulnier, Samson Tan, Pedro Ortiz Suarez, Victor Sanh, Hugo Laurençon, Yacine Jernite, Julien Launay, Margaret Mitchell, Colin Raffel, Aaron Gokaslan, Adi Simhi, Aitor Soroa, Alham Fikri Aji, Amit Alfassy, Anna Rogers, Ariel Kreisberg Nitzav, Canwen Xu, Chenghao Mou, Chris Emezue, Christopher Klamm, Colin Leong, Daniel van Strien, David Ifeoluwa Adelani, Dragomir Radev, Eduardo González Ponferrada, Efrat Levkovizh, Ethan Kim, Eyal Bar Natan, Francesco De Toni, Gérard Dupont, Germán Kruszewski, Giada Pistilli, Hady Elsahar, Hamza Benyamina, Hieu Tran, Ian Yu, Idris Abdulmumin, Isaac Johnson, Itziar Gonzalez-Dios, Javier de la Rosa, Jenny Chim, Jesse Dodge, Jian Zhu, Jonathan Chang, Jörg Frohberg, Joseph Tobing, Joydeep Bhattacharjee, Khalid Almubarak, Kimbo Chen, Kyle Lo, Leandro von Werra, Leon Weber, Long Phan, Loubna Ben allal, Ludovic Tanguy, Manan Dey, Manuel Romero Muñoz, Maraim Masoud, María Grandury, Mario Šaško, Max Huang, Maximin Coavoux, Mayank Singh, Mike Tian-Jian Jiang, Minh Chien Vu, Mohammad A. Jauhar, Mustafa Ghaleb, Nishant Subramani, Nora Kassner, Nurulaqilla Khamis, Olivier Nguyen, Omar Espejel, Ona de Gibert, Paulo Villegas, Peter Henderson, Pierre Colombo, Priscilla Amuok, Quentin Lhoest, Rheza Harliman, Rishi Bommasani, Roberto Luis López, Rui Ribeiro, Salomey Osei, Sampo Pyysalo, Sebastian Nagel, Shamik Bose, Shamsuddeen Hassan Muhammad, Shanya Sharma, Shayne Longpre, Somaieh Nikpoor, Stanislav Silberberg, Suhas Pai, Sydney Zink, Tiago Timponi Torrent, Timo Schick, Tristan Thrush, Valentin Danchev, Vassilina Nikoulina, Veronika Laippala, Violette Lepercq, Vrinda Prabhu, Zaid Alyafeai, Zeerak Talat, Arun Raja, Benjamin Heinzerling, Chenglei Si, Davut Emre Taşar, Elizabeth Salesky, Sabrina J. Mielke, Wilson Y. Lee, Abheesht Sharma, Andrea Santilli, Antoine Chaffin, Arnaud Stiegler, Debajyoti Datta, Eliza Szczechla, Gunjan Chhablani, Han Wang, Harshit Pandey, Hendrik Strobelt, Jason Alan Fries, Jos Rozen, Leo Gao, Lintang Sutawika, M Saiful Bari, Maged S. Al-shaibani, Matteo Manica, Nihal Nayak, Ryan Teehan, Samuel Albanie, Sheng Shen, Srulik Ben-David, Stephen H. Bach, Taewoon Kim, Tali Bers, Thibault Fevry, Trishala Neeraj, Urmish Thakker, Vikas Raunak, Xiangru Tang, Zheng-Xin Yong, Zhiqing Sun, Shaked Brody, Yallow Uri, Hadar Tojarieh, Adam Roberts, Hyung Won Chung, Jaesung Tae, Jason Phang, Ofir Press, Conglong Li, Deepak Narayanan, Hatim Bourfoune, Jared Casper, Jeff Rasley, Max Ryabinin, Mayank Mishra, Minjia Zhang, Mohammad Shoeybi, Myriam Peyrounette, Nicolas Patry, Nouamane Tazi, Omar Sanseviero, Patrick von Platen, Pierre Cornette, Pierre François Lavallée, Rémi Lacroix, Samyam Rajbhandari, Sanchit Gandhi, Shaden Smith, Stéphane Requena, Suraj Patil, Tim Dettmers, Ahmed Baruwa, Amanpreet Singh, Anastasia Cheveleva, Anne-Laure Ligozat, Arjun Subramonian, Aurélie Névéol, Charles Lovering, Dan Garrette, Deepak Tunuguntla, Ehud Reiter, Ekaterina Taktasheva, Ekaterina Voloshina, Eli Bogdanov, Genta Indra Winata, Hailey Schoelkopf, Jan-Christoph Kalo, Jekaterina Novikova, Jessica Zosa Forde, Jordan Clive, Jungo Kasai, Ken Kawamura, Liam Hazan, Marine Carpuat, Miruna Clinciu, Najoung Kim, Newton Cheng, Oleg Serikov, Omer Antverg, Oskar van der Wal, Rui Zhang, Ruochen Zhang, Sebastian Gehrmann, Shachar Mirkin, Shani Pais, Tatiana Shavrina, Thomas Scialom, Tian Yun, Tomasz Limisiewicz, Verena Rieser, Vitaly Protasov, Vladislav Mikhailov, Yada Pruksachatkun, Yonatan Belinkov, Zachary Bamberger, Zdeněk Kasner, Alice Rueda, Amanda Pestana, Amir Feizpour, Ammar Khan, Amy Faranak, Ana Santos, Anthony Hevia, Antigona Unldreaj, Arash Aghagol, Arezoo Abdollahi, Aycha Tammour, Azadeh HajiHosseini, Bahareh Behroozi, Benjamin Ajibade, Bharat Saxena, Carlos Muñoz Ferrandis, Daniel McDuff, Danish Contractor, David Lansky, Davis David, Douwe Kiela, Duong A. Nguyen, Edward Tan, Emi Baylor, Ezinwanne Ozoani, Fatima Mirza, Frankline Ononiwu, Habib Rezanejad, Hessie Jones, Indrani Bhattacharya, Irene Solaiman, Irina Sedenko, Isar Nejadgholi, Jesse Passmore, Josh Seltzer, Julio Bonis Sanz, Livia Dutra, Mairon Samagaio, Maraim Elbadri, Margot Mieskes, Marissa Gerchick, Martha Akinlolu, Michael McKenna, Mike Qiu, Muhammed Ghauri, Mykola Burynok, Nafis Abrar, Nazneen Rajani, Nour Elkott, Nour Fahmy, Olanrewaju Samuel, Ran An, Rasmus Kromann, Ryan Hao, Samira Alizadeh, Sarmad Shubber, Silas Wang, Sourav Roy, Sylvain Viguier, Thanh Le, Tobi Oyebade, Trieu Le, Yoyo Yang, Zach Nguyen, Abhinav Ramesh Kashyap, Alfredo Palasciano, Alison Callahan, Anima Shukla, Antonio Miranda-Escalada, Ayush Singh, Benjamin Beilharz, Bo wang, Caio Brito, Chenxi Zhou, Chirag Jain, Chuxin Xu, Clémentine Fourrier, Daniel León Periñán, Daniel Molano, Dian Yu, Enrique Manjavacas, Fabio Barth, Florian Fuhrimann, Gabriel Altay, Giyaseddin Bayrak, Gully Burns, Helena U. Vrabec, Imane Bello, Ishani Dash, Jihyun Kang, John Giorgi, Jonas Golde, Jose David Posada, Karthik Rangasai Sivaraman, Lokesh Bulchandani, Lu Liu, Luisa Shinzato, Madeleine Hahn de Bykhovetz, Maiko Takeuchi, Marc Pàmies, Maria A Castillo, Marianna Nezhurina, Mario Sänger, Matthias Samwald, Michael Cullan, Michael Weinberg, Michiel De Wolf, Mina Mihaljcic, Minna Liu, Moritz Freidank, Myungsun Kang, Natasha Seelam, Nathan Dahlberg, Nicholas Michio Broad, Nikolaus Muellner, Pascale Fung, Patrick Haller, Ramya Chandrasekhar, Renata Eisenberg, Robert Martin, Rodrigo Canalli, Rosaline Su, Ruisi Su, Samuel Cahyawijaya, Samuele Garda, Shlok S Deshmukh, Shubhanshu Mishra, Sid Kiblawi, Simon Ott, Sinee Sang-aroonsiri, Srishti Kumar, Stefan Schweter, Sushil Bharati, Tanmay Laud, Théo Gigant, Tomoya Kainuma, Wojciech Kusa, Yanis Labrak, Yash Shailesh Bajaj, Yash Venkatraman, Yifan Xu, Yingxin Xu, Yu Xu, Zhe Tan, Zhongli Xie, Zifan Ye, Mathilde Bras, Younes Belkada, Thomas Wolf

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions.

Language Modelling Multilingual NLP

2,181

Paper
Code

MACSum: Controllable Summarization with Mixed Attributes

1 code implementation • 9 Nov 2022 • Yusen Zhang, Yang Liu, ZiYi Yang, Yuwei Fang, Yulong Chen, Dragomir Radev, Chenguang Zhu, Michael Zeng, Rui Zhang

We propose two simple and effective parameter-efficient approaches for the new task of mixed controllable summarization based on hard prompt tuning and soft prefix tuning.

Attribute Specificity

Paper
Code

Uni-Parser: Unified Semantic Parser for Question Answering on Knowledge Base and Database

no code implementations • 9 Nov 2022 • Ye Liu, Semih Yavuz, Rui Meng, Dragomir Radev, Caiming Xiong, Yingbo Zhou

Parsing natural language questions into executable logical forms is a useful and interpretable way to perform question answering on structured data such as knowledge bases (KB) or databases (DB).

Question Answering Semantic Parsing

Paper
Add Code

Crosslingual Generalization through Multitask Finetuning

1 code implementation • 3 Nov 2022 • Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, Stella Biderman, Teven Le Scao, M Saiful Bari, Sheng Shen, Zheng-Xin Yong, Hailey Schoelkopf, Xiangru Tang, Dragomir Radev, Alham Fikri Aji, Khalid Almubarak, Samuel Albanie, Zaid Alyafeai, Albert Webson, Edward Raff, Colin Raffel

We find finetuning large multilingual language models on English tasks with English prompts allows for task generalization to non-English languages that appear only in the pretraining corpus.

Ranked #1 on Question Answering on StoryCloze

Coreference Resolution Cross-Lingual Transfer +4

493

Paper
Code

ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples

1 code implementation • 22 Oct 2022 • Yilun Zhao, Linyong Nan, Zhenting Qi, Rui Zhang, Dragomir Radev

Reasoning over tabular data requires both table structure understanding and a broad set of table reasoning skills.

Ranked #3 on Semantic Parsing on WikiSQL (Denotation accuracy (test) metric)

Fact Verification Question Answering +3

Paper
Code

Binding Language Models in Symbolic Languages

1 code implementation • 6 Oct 2022 • Zhoujun Cheng, Tianbao Xie, Peng Shi, Chengzu Li, Rahul Nadkarni, Yushi Hu, Caiming Xiong, Dragomir Radev, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Tao Yu

We propose Binder, a training-free neural-symbolic framework that maps the task input to a program, which (1) allows binding a unified API of language model (LM) functionalities to a programming language (e. g., SQL, Python) to extend its grammar coverage and thus tackle more diverse questions, (2) adopts an LM as both the program parser and the underlying model called by the API during execution, and (3) requires only a few in-context exemplar annotations.

Ranked #4 on Table-based Fact Verification on TabFact

Language Modelling Semantic Parsing +1

275

Paper
Code

Look Ma, Only 400 Samples! Revisiting the Effectiveness of Automatic N-Gram Rule Generation for Spelling Normalization in Filipino

1 code implementation • 6 Oct 2022 • Lorenzo Jaime Yu Flores, Dragomir Radev

With 84. 75 million Filipinos online, the ability for models to process online text is crucial for developing Filipino NLP applications.

Spelling Correction

Paper
Code

FOLIO: Natural Language Reasoning with First-Order Logic

1 code implementation • 2 Sep 2022 • Simeng Han, Hailey Schoelkopf, Yilun Zhao, Zhenting Qi, Martin Riddell, Luke Benson, Lucy Sun, Ekaterina Zubova, Yujie Qiao, Matthew Burtell, David Peng, Jonathan Fan, Yixin Liu, Brian Wong, Malcolm Sailor, Ansong Ni, Linyong Nan, Jungo Kasai, Tao Yu, Rui Zhang, Shafiq Joty, Alexander R. Fabbri, Wojciech Kryscinski, Xi Victoria Lin, Caiming Xiong, Dragomir Radev

We present FOLIO, a human-annotated, open-domain, and logically complex and diverse dataset for reasoning in natural language (NL), equipped with first order logic (FOL) annotations.

Language Modelling Large Language Model +1

Paper
Code

RealTime QA: What's the Answer Right Now?

1 code implementation • NeurIPS 2023 • Jungo Kasai, Keisuke Sakaguchi, Yoichi Takahashi, Ronan Le Bras, Akari Asai, Xinyan Yu, Dragomir Radev, Noah A. Smith, Yejin Choi, Kentaro Inui

We introduce REALTIME QA, a dynamic question answering (QA) platform that announces questions and evaluates systems on a regular basis (weekly in this version).

Information Retrieval Question Answering +1

Paper
Code

GEMv2: Multilingual NLG Benchmarking in a Single Line of Code

no code implementations • 22 Jun 2022 • Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Hiroaki Hayashi, Jekaterina Novikova, Jenna Kanerva, Jenny Chim, Jiawei Zhou, Jordan Clive, Joshua Maynez, João Sedoc, Juraj Juraska, Kaustubh Dhole, Khyathi Raghavi Chandu, Laura Perez-Beltrachini, Leonardo F. R. Ribeiro, Lewis Tunstall, Li Zhang, Mahima Pushkarna, Mathias Creutz, Michael White, Mihir Sanjay Kale, Moussa Kamal Eddine, Nico Daheim, Nishant Subramani, Ondrej Dusek, Paul Pu Liang, Pawan Sasanka Ammanamanchi, Qi Zhu, Ratish Puduppully, Reno Kriz, Rifat Shahriyar, Ronald Cardenas, Saad Mahamood, Salomey Osei, Samuel Cahyawijaya, Sanja Štajner, Sebastien Montella, Shailza, Shailza Jolly, Simon Mille, Tahmid Hasan, Tianhao Shen, Tosin Adewumi, Vikas Raunak, Vipul Raheja, Vitaly Nikolaev, Vivian Tsai, Yacine Jernite, Ying Xu, Yisi Sang, Yixin Liu, Yufang Hou

This problem is especially pertinent in natural language generation which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims.

Benchmarking Text Generation

Paper
Add Code

Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions

1 code implementation • 28 May 2022 • Ansong Ni, Jeevana Priya Inala, Chenglong Wang, Oleksandr Polozov, Christopher Meek, Dragomir Radev, Jianfeng Gao

We show that our use of self-sampled correct and partially-correct solutions can benefit learning and help guide the sampling process, leading to more efficient exploration of the solution space.

Ranked #138 on Arithmetic Reasoning on GSM8K

Arithmetic Reasoning Efficient Exploration +3

Paper
Code

Leveraging Locality in Abstractive Text Summarization

1 code implementation • 25 May 2022 • Yixin Liu, Ansong Ni, Linyong Nan, Budhaditya Deb, Chenguang Zhu, Ahmed H. Awadallah, Dragomir Radev

Our experimental results show that our model has a better performance compared with strong baselines with efficient attention modules, and our analysis provides further insights into our locality-aware modeling strategy.

Abstractive Text Summarization Text Generation

Paper
Code

R2D2: Robust Data-to-Text with Replacement Detection

1 code implementation • 25 May 2022 • Linyong Nan, Lorenzo Jaime Yu Flores, Yilun Zhao, Yixin Liu, Luke Benson, Weijin Zou, Dragomir Radev

Unfaithful text generation is a common problem for text generation systems.

Data-to-Text Generation Entity Retrieval +2

Paper
Code

Twist Decoding: Diverse Generators Guide Each Other

1 code implementation • 19 May 2022 • Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Hao Peng, Ximing Lu, Dragomir Radev, Yejin Choi, Noah A. Smith

Our extensive evaluations on machine translation and scientific paper summarization demonstrate that Twist decoding substantially outperforms each model decoded in isolation over various scenarios, including cases where domain-specific and general-purpose models are both available.

Machine Translation Text Generation +1

Paper
Code

Discovering and Explaining the Representation Bottleneck of Graph Neural Networks from Multi-order Interactions

1 code implementation • 15 May 2022 • Fang Wu, Siyuan Li, Lirong Wu, Dragomir Radev, Stan Z. Li

Graph neural networks (GNNs) mainly rely on the message-passing paradigm to propagate node features and build interactions, and different graph learning tasks require different ranges of node interactions.

graph construction Graph Learning +2

Paper
Code

Data Governance in the Age of Large-Scale Data-Driven Language Technology

no code implementations • 4 May 2022 • Yacine Jernite, Huu Nguyen, Stella Biderman, Anna Rogers, Maraim Masoud, Valentin Danchev, Samson Tan, Alexandra Sasha Luccioni, Nishant Subramani, Gérard Dupont, Jesse Dodge, Kyle Lo, Zeerak Talat, Isaac Johnson, Dragomir Radev, Somaieh Nikpoor, Jörg Frohberg, Aaron Gokaslan, Peter Henderson, Rishi Bommasani, Margaret Mitchell

The recent emergence and adoption of Machine Learning technology, and specifically of Large Language Models, has drawn attention to the need for systematic and transparent management of language data.

Management

Paper
Add Code

EHRKit: A Python Natural Language Processing Toolkit for Electronic Health Record Texts

no code implementations • 13 Apr 2022 • Irene Li, Keen You, Yujie Qiao, Lucas Huang, Chia-Chun Hsieh, Benjamin Rosand, Jeremy Goldwasser, Dragomir Radev

The Electronic Health Record (EHR) is an essential part of the modern medical system and impacts healthcare delivery, operations, and research.

Information Retrieval Machine Translation +5

Paper
Add Code

A Call for Clarity in Beam Search: How It Works and When It Stops

1 code implementation • 11 Apr 2022 • Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Dragomir Radev, Yejin Choi, Noah A. Smith

Based on this finding, we introduce a patience factor, a simple modification to this beam decoding implementation, that generalizes the stopping criterion and provides flexibility to the depth of search.

Machine Translation Text Generation +2

Paper
Code

BRIO: Bringing Order to Abstractive Summarization

3 code implementations • ACL 2022 • Yixin Liu, PengFei Liu, Dragomir Radev, Graham Neubig

Abstractive summarization models are commonly trained using maximum likelihood estimation, which assumes a deterministic (one-point) target distribution in which an ideal model will assign all the probability mass to the reference summary.

Ranked #2 on Text Summarization on X-Sum

Abstractive Text Summarization

319

Paper
Code

PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts

1 code implementation • ACL 2022 • Stephen H. Bach, Victor Sanh, Zheng-Xin Yong, Albert Webson, Colin Raffel, Nihal V. Nayak, Abheesht Sharma, Taewoon Kim, M Saiful Bari, Thibault Fevry, Zaid Alyafeai, Manan Dey, Andrea Santilli, Zhiqing Sun, Srulik Ben-David, Canwen Xu, Gunjan Chhablani, Han Wang, Jason Alan Fries, Maged S. Al-shaibani, Shanya Sharma, Urmish Thakker, Khalid Almubarak, Xiangru Tang, Dragomir Radev, Mike Tian-Jian Jiang, Alexander M. Rush

PromptSource is a system for creating, sharing, and using natural language prompts.

2,477

Paper
Code

UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models

1 code implementation • 16 Jan 2022 • Tianbao Xie, Chen Henry Wu, Peng Shi, Ruiqi Zhong, Torsten Scholak, Michihiro Yasunaga, Chien-Sheng Wu, Ming Zhong, Pengcheng Yin, Sida I. Wang, Victor Zhong, Bailin Wang, Chengzu Li, Connor Boyle, Ansong Ni, Ziyu Yao, Dragomir Radev, Caiming Xiong, Lingpeng Kong, Rui Zhang, Noah A. Smith, Luke Zettlemoyer, Tao Yu

Structured knowledge grounding (SKG) leverages structured knowledge to complete user requests, such as semantic parsing over databases and question answering over knowledge bases.

Ranked #1 on Task-Oriented Dialogue Systems on KVRET

Few-Shot Learning Question Answering +3

530

Paper
Code

A Transfer Learning Pipeline for Educational Resource Discovery with Application in Leading Paragraph Generation

no code implementations • 7 Jan 2022 • Irene Li, Thomas George, Alexander Fabbri, Tammy Liao, Benjamin Chen, Rina Kawamura, Richard Zhou, Vanessa Yan, Swapnil Hingmire, Dragomir Radev

In this paper, we propose the educational resource discovery (ERD) pipeline that automates web resource discovery for novel domains.

Information Retrieval Language Modelling +4

Paper
Add Code

CLICKER: A Computational LInguistics Classification Scheme for Educational Resources

1 code implementation • 16 Dec 2021 • Swapnil Hingmire, Irene Li, Rena Kawamura, Benjamin Chen, Alexander Fabbri, Xiangru Tang, Yixin Liu, Thomas George, Tammy Liao, Wai Pan Wong, Vanessa Yan, Richard Zhou, Girish K. Palshikar, Dragomir Radev

We propose a classification scheme -- CLICKER for CL/NLP based on the analysis of online lectures from 77 university courses on this subject.

Classification Retrieval

125

Paper
Code

CONFIT: Toward Faithful Dialogue Summarization with Linguistically-Informed Contrastive Fine-tuning

no code implementations • NAACL 2022 • Xiangru Tang, Arjun Nair, Borui Wang, Bingyao Wang, Jai Desai, Aaron Wade, Haoran Li, Asli Celikyilmaz, Yashar Mehdad, Dragomir Radev

Using human evaluation and automatic faithfulness metrics, we show that our model significantly reduces all kinds of factual errors on the dialogue summarization, SAMSum corpus.

Abstractive Dialogue Summarization Meeting Summarization +1

Paper
Add Code

Surfer100: Generating Surveys From Web Resources, Wikipedia-style

no code implementations • LREC 2022 • Irene Li, Alexander Fabbri, Rina Kawamura, Yixin Liu, Xiangru Tang, Jaesung Tae, Chang Shen, Sally Ma, Tomoe Mizutani, Dragomir Radev

Fast-developing fields such as Artificial Intelligence (AI) often outpace the efforts of encyclopedic sources such as Wikipedia, which either do not completely cover recently-introduced topics or lack such content entirely.

Language Modelling

Paper
Add Code

Summ^N: A Multi-Stage Summarization Framework for Long Input Dialogues and Documents

2 code implementations • ACL 2022 • Yusen Zhang, Ansong Ni, Ziming Mao, Chen Henry Wu, Chenguang Zhu, Budhaditya Deb, Ahmed H. Awadallah, Dragomir Radev, Rui Zhang

To the best of our knowledge, Summ$^N$ is the first multi-stage split-then-summarize framework for long input summarization.

Abstractive Text Summarization Document Summarization +1

Paper
Code

DYLE: Dynamic Latent Extraction for Abstractive Long-Input Summarization

1 code implementation • ACL 2022 • Ziming Mao, Chen Henry Wu, Ansong Ni, Yusen Zhang, Rui Zhang, Tao Yu, Budhaditya Deb, Chenguang Zhu, Ahmed H. Awadallah, Dragomir Radev

Transformer-based models have achieved state-of-the-art performance on short-input summarization.

Abstractive Text Summarization

Paper
Code

Molformer: Motif-based Transformer on 3D Heterogeneous Molecular Graphs

2 code implementations • 4 Oct 2021 • Fang Wu, Dragomir Radev, Stan Z. Li

Then HMGs are constructed with both atom-level and motif-level nodes.

graph construction Translation

Paper
Code

Investigating Crowdsourcing Protocols for Evaluating the Factual Consistency of Summaries

no code implementations • NAACL 2022 • Xiangru Tang, Alexander Fabbri, Haoran Li, Ziming Mao, Griffin Thomas Adams, Borui Wang, Asli Celikyilmaz, Yashar Mehdad, Dragomir Radev

Current pre-trained models applied to summarization are prone to factual inconsistencies which either misrepresent the source text or introduce extraneous information.

Paper
Add Code

Efficient Variational Graph Autoencoders for Unsupervised Cross-domain Prerequisite Chains

no code implementations • 17 Sep 2021 • Irene Li, Vanessa Yan, Dragomir Radev

Our novel model consists of a variational graph autoencoder (VGAE) and a domain discriminator.

Link Prediction

Paper
Add Code

An Exploratory Study on Long Dialogue Summarization: What Works and What's Next

1 code implementation • 10 Sep 2021 • Yusen Zhang, Ansong Ni, Tao Yu, Rui Zhang, Chenguang Zhu, Budhaditya Deb, Asli Celikyilmaz, Ahmed Hassan Awadallah, Dragomir Radev

Dialogue summarization helps readers capture salient information from long conversations in meetings, interviews, and TV series.

Retrieval

Paper
Code

SummerTime: Text Summarization Toolkit for Non-experts

1 code implementation • EMNLP (ACL) 2021 • Ansong Ni, Zhangir Azerbayev, Mutethia Mutuma, Troy Feng, Yusen Zhang, Tao Yu, Ahmed Hassan Awadallah, Dragomir Radev

We also provide explanations for models and evaluation metrics to help users understand the model behaviors and select models that best suit their needs.

Document Summarization Multi-Document Summarization

260

Paper
Code

Neural Natural Language Processing for Unstructured Data in Electronic Health Records: a Review

no code implementations • 7 Jul 2021 • Irene Li, Jessica Pan, Jeremy Goldwasser, Neha Verma, Wai Pan Wong, Muhammed Yavuz Nuzumlali, Benjamin Rosand, Yixin Li, Matthew Zhang, David Chang, R. Andrew Taylor, Harlan M. Krumholz, Dragomir Radev

Electronic health records (EHRs), digital collections of patient healthcare events and observations, are ubiquitous in medicine and critical to healthcare delivery, operations, and research.

Knowledge Graphs Question Answering +1

Paper
Add Code

DocNLI: A Large-scale Dataset for Document-level Natural Language Inference

1 code implementation • Findings (ACL) 2021 • Wenpeng Yin, Dragomir Radev, Caiming Xiong

It has been studied intensively in the past few years thanks to the availability of large-scale labeled datasets.

Natural Language Inference Question Answering +2

Paper
Code

ConvoSumm: Conversation Summarization Benchmark and Improved Abstractive Summarization with Argument Mining

1 code implementation • ACL 2021 • Alexander R. Fabbri, Faiaz Rahman, Imad Rizvi, Borui Wang, Haoran Li, Yashar Mehdad, Dragomir Radev

While online conversations can cover a vast amount of information in many different formats, abstractive text summarization has primarily focused on modeling solely news articles.

Abstractive Text Summarization Argument Mining +2

Paper
Code

BookSum: A Collection of Datasets for Long-form Narrative Summarization

2 code implementations • 18 May 2021 • Wojciech Kryściński, Nazneen Rajani, Divyansh Agarwal, Caiming Xiong, Dragomir Radev

The majority of available text summarization datasets include short-form source documents that lack long-range causal and temporal dependencies, and often contain strong layout and stylistic biases.

Abstractive Text Summarization

173

Paper
Code

Unsupervised Cross-Domain Prerequisite Chain Learning using Variational Graph Autoencoders

no code implementations • ACL 2021 • Irene Li, Vanessa Yan, Tianxiao Li, Rihao Qu, Dragomir Radev

For example, one may be an expert in the natural language processing (NLP) domain but want to determine the best order to learn new concepts in an unfamiliar Computer Vision domain (CV).

Paper
Add Code

QMSum: A New Benchmark for Query-based Multi-domain Meeting Summarization

1 code implementation • NAACL 2021 • Ming Zhong, Da Yin, Tao Yu, Ahmad Zaidi, Mutethia Mutuma, Rahul Jha, Ahmed Hassan Awadallah, Asli Celikyilmaz, Yang Liu, Xipeng Qiu, Dragomir Radev

As increasing numbers of meetings are recorded and transcribed, meeting summaries have become essential to remind those who may or may not have attended the meetings about the key decisions made and the tasks to be completed.

Meeting Summarization

100

Paper
Code

FeTaQA: Free-form Table Question Answering

1 code implementation • 1 Apr 2021 • Linyong Nan, Chiachun Hsieh, Ziming Mao, Xi Victoria Lin, Neha Verma, Rui Zhang, Wojciech Kryściński, Nick Schoelkopf, Riley Kong, Xiangru Tang, Murori Mutuma, Ben Rosand, Isabel Trindade, Renusree Bandaru, Jacob Cunningham, Caiming Xiong, Dragomir Radev

Existing table question answering datasets contain abundant factual questions that primarily evaluate the query and schema comprehension capability of a system, but they fail to include questions that require complex reasoning and integration of information due to the constraint of the associated short-form answers.

Question Answering Retrieval +2

Paper
Code

Improving Zero and Few-Shot Abstractive Summarization with Intermediate Fine-tuning and Data Augmentation

no code implementations • NAACL 2021 • Alexander R. Fabbri, Simeng Han, Haoyuan Li, Haoran Li, Marjan Ghazvininejad, Shafiq Joty, Dragomir Radev, Yashar Mehdad

Models pretrained with self-supervised objectives on large text corpora achieve state-of-the-art performance on English text summarization tasks.

Abstractive Text Summarization Data Augmentation

Paper
Add Code

Universal Natural Language Processing with Limited Annotations: Try Few-shot Textual Entailment as a Start

1 code implementation • EMNLP 2020 • Wenpeng Yin, Nazneen Fatema Rajani, Dragomir Radev, Richard Socher, Caiming Xiong

We demonstrate that this framework enables a pretrained entailment model to work well on new entailment domains in a few-shot setting, and show its effectiveness as a unified solver for several downstream NLP tasks such as question answering and coreference resolution when the end-task annotations are limited.

coreference-resolution Natural Language Inference +1

Paper
Code

GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing

1 code implementation • ICLR 2021 • Tao Yu, Chien-Sheng Wu, Xi Victoria Lin, Bailin Wang, Yi Chern Tan, Xinyi Yang, Dragomir Radev, Richard Socher, Caiming Xiong

We present GraPPa, an effective pre-training approach for table semantic parsing that learns a compositional inductive bias in the joint representations of textual and tabular data.

Ranked #8 on Semantic Parsing on spider

Inductive Bias Language Modelling +3

Paper
Code

SummEval: Re-evaluating Summarization Evaluation

5 code implementations • 24 Jul 2020 • Alexander R. Fabbri, Wojciech Kryściński, Bryan McCann, Caiming Xiong, Richard Socher, Dragomir Radev

The scarcity of comprehensive up-to-date studies on evaluation metrics for text summarization and the lack of consensus regarding evaluation protocols continue to inhibit progress.

Text Summarization

341

Paper
Code

DART: Open-Domain Structured Data Record to Text Generation

2 code implementations • NAACL 2021 • Linyong Nan, Dragomir Radev, Rui Zhang, Amrit Rau, Abhinand Sivaprasad, Chiachun Hsieh, Xiangru Tang, Aadit Vyas, Neha Verma, Pranav Krishna, Yangxiaokang Liu, Nadia Irwanto, Jessica Pan, Faiaz Rahman, Ahmad Zaidi, Mutethia Mutuma, Yasin Tarabar, Ankit Gupta, Tao Yu, Yi Chern Tan, Xi Victoria Lin, Caiming Xiong, Richard Socher, Nazneen Fatema Rajani

Data-to-Text annotations can be a costly process, especially when dealing with tables which are the major source of structured data and contain nontrivial structures.

Domain Generalization Semantic Parsing +2

142

Paper
Code

CO-Search: COVID-19 Information Retrieval with Semantic Search, Question Answering, and Abstractive Summarization

no code implementations • 17 Jun 2020 • Andre Esteva, Anuprit Kale, Romain Paulus, Kazuma Hashimoto, Wenpeng Yin, Dragomir Radev, Richard Socher

The COVID-19 global pandemic has resulted in international efforts to understand, track, and mitigate the disease, yielding a significant corpus of COVID-19 and SARS-CoV-2-related publications across scientific disciplines.

Abstractive Text Summarization Information Retrieval +3

Paper
Add Code

ESPRIT: Explaining Solutions to Physical Reasoning Tasks

2 code implementations • ACL 2020 • Nazneen Fatema Rajani, Rui Zhang, Yi Chern Tan, Stephan Zheng, Jeremy Weiss, Aadit Vyas, Abhijit Gupta, Caiming Xiong, Richard Socher, Dragomir Radev

Our framework learns to generate explanations of how the physical simulation will causally evolve so that an agent or a human can easily reason about a solution using those interpretable descriptions.

425

Paper
Code

R-VGAE: Relational-variational Graph Autoencoder for Unsupervised Prerequisite Chain Learning

1 code implementation • COLING 2020 • Irene Li, Alexander Fabbri, Swapnil Hingmire, Dragomir Radev

The task of concept prerequisite chain learning is to automatically determine the existence of prerequisite relationships among concept pairs.

125

Paper
Code

A Neural Topic-Attention Model for Medical Term Abbreviation Disambiguation

1 code implementation • 30 Oct 2019 • Irene Li, Michihiro Yasunaga, Muhammed Yavuz Nuzumlali, Cesar Caraballo, Shiwani Mahajan, Harlan Krumholz, Dragomir Radev

Specifically, a neural topic-attention model is applied to learn improved contextualized sentence representations for medical term abbreviation disambiguation.

Few-Shot Learning Sentence

Paper
Code

CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases

3 code implementations • IJCNLP 2019 • Tao Yu, Rui Zhang, He Yang Er, Suyi Li, Eric Xue, Bo Pang, Xi Victoria Lin, Yi Chern Tan, Tianze Shi, Zihan Li, Youxuan Jiang, Michihiro Yasunaga, Sungrok Shim, Tao Chen, Alexander Fabbri, Zifan Li, Luyao Chen, Yuwen Zhang, Shreya Dixit, Vincent Zhang, Caiming Xiong, Richard Socher, Walter S. Lasecki, Dragomir Radev

We present CoSQL, a corpus for building cross-domain, general-purpose database (DB) querying dialogue systems.

Ranked #8 on Dialogue State Tracking on CoSQL

Dialogue State Tracking Response Generation +1

203

Paper
Code

The CL-SciSumm Shared Task 2018: Results and Key Insights

1 code implementation • 2 Sep 2019 • Kokil Jaidka, Michihiro Yasunaga, Muthu Kumar Chandrasekaran, Dragomir Radev, Min-Yen Kan

This overview describes the official results of the CL-SciSumm Shared Task 2018 -- the first medium-scale shared task on scientific document summarization in the computational linguistics (CL) domain.

Document Summarization Information Retrieval +2

210

Paper
Code

Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions

3 code implementations • IJCNLP 2019 • Rui Zhang, Tao Yu, He Yang Er, Sungrok Shim, Eric Xue, Xi Victoria Lin, Tianze Shi, Caiming Xiong, Richard Socher, Dragomir Radev

We focus on the cross-domain context-dependent text-to-SQL generation task.

Ranked #5 on Text-To-SQL on SParC

Dialogue State Tracking Text-To-SQL

203

Paper
Code

Overview and Results: CL-SciSumm Shared Task 2019

1 code implementation • 23 Jul 2019 • Muthu Kumar Chandrasekaran, Michihiro Yasunaga, Dragomir Radev, Dayne Freitag, Min-Yen Kan

All papers are from the open access research papers in the CL domain.

Document Summarization Information Retrieval +2

210

Paper
Code

Creating A Neural Pedagogical Agent by Jointly Learning to Review and Assess

2 code implementations • 26 Jun 2019 • Youngnam Lee, Youngduck Choi, Junghyun Cho, Alexander R. Fabbri, HyunBin Loh, Chanyou Hwang, Yongku Lee, Sang-Wook Kim, Dragomir Radev

Our model outperforms existing approaches over several metrics in predicting user response correctness, notably out-performing other methods on new users without large question-response histories.

Machine Translation TAG

Paper
Code

Improving Low-Resource Cross-lingual Document Retrieval by Reranking with Deep Bilingual Representations

no code implementations • ACL 2019 • Rui Zhang, Caitlin Westerfield, Sungrok Shim, Garrett Bingham, Alexander Fabbri, Neha Verma, William Hu, Dragomir Radev

In this paper, we propose to boost low-resource cross-lingual document retrieval performance with deep bilingual query-document representations.

Cross-Lingual Information Retrieval Cross-Lingual Word Embeddings +3

Paper
Add Code

SParC: Cross-Domain Semantic Parsing in Context

4 code implementations • ACL 2019 • Tao Yu, Rui Zhang, Michihiro Yasunaga, Yi Chern Tan, Xi Victoria Lin, Suyi Li, Heyang Er, Irene Li, Bo Pang, Tao Chen, Emily Ji, Shreya Dixit, David Proctor, Sungrok Shim, Jonathan Kraft, Vincent Zhang, Caiming Xiong, Richard Socher, Dragomir Radev

The best model obtains an exact match accuracy of 20. 2% over all questions and less than10% over all interaction sequences, indicating that the cross-domain setting and the con-textual phenomena of the dataset present significant challenges for future research.

Semantic Parsing Text-To-SQL

203

Paper
Code

Syntax-aware Neural Semantic Role Labeling with Supertags

1 code implementation • NAACL 2019 • Jungo Kasai, Dan Friedman, Robert Frank, Dragomir Radev, Owen Rambow

We introduce a new syntax-aware model for dependency-based semantic role labeling that outperforms syntax-agnostic models for English and Spanish.

Semantic Role Labeling TAG

Paper
Code

SyntaxSQLNet: Syntax Tree Networks for Complex and Cross-DomainText-to-SQL Task

2 code implementations • 11 Oct 2018 • Tao Yu, Michihiro Yasunaga, Kai Yang, Rui Zhang, Dongxu Wang, Zifan Li, Dragomir Radev

In this paper we propose SyntaxSQLNet, a syntax tree network to address the complex and cross-domain text-to-SQL generation task.

Ranked #7 on Text-To-SQL on SParC

Semantic Parsing Text-To-SQL

131

Paper
Code

SyntaxSQLNet: Syntax Tree Networks for Complex and Cross-Domain Text-to-SQL Task

no code implementations • EMNLP 2018 • Tao Yu, Michihiro Yasunaga, Kai Yang, Rui Zhang, Dongxu Wang, Zifan Li, Dragomir Radev

In this paper we propose SyntaxSQLNet, a syntax tree network to address the complex and cross-domain text-to-SQL generation task.

Semantic Parsing Text-To-SQL

Paper
Add Code

Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task

5 code implementations • EMNLP 2018 • Tao Yu, Rui Zhang, Kai Yang, Michihiro Yasunaga, Dongxu Wang, Zifan Li, James Ma, Irene Li, Qingning Yao, Shanelle Roman, Zilin Zhang, Dragomir Radev

We define a new complex and cross-domain semantic parsing and text-to-SQL task where different complex SQL queries and databases appear in train and test sets.

Ranked #10 on Semantic Parsing on spider

Semantic Parsing Text-To-SQL

701

Paper
Code

Improving Text-to-SQL Evaluation Methodology

1 code implementation • ACL 2018 • Catherine Finegan-Dollak, Jonathan K. Kummerfeld, Li Zhang, Karthik Ramanathan, Sesh Sadasivam, Rui Zhang, Dragomir Radev

Second, we show that the current division of data into training and test sets measures robustness to variations in the way questions are asked, but only partially tests how well systems generalize to new queries; therefore, we propose a complementary dataset split for evaluation of future work.

Ranked #1 on SQL Parsing on IMDb

SQL Parsing Text-To-SQL

500

Paper
Code

Neural Coreference Resolution with Deep Biaffine Attention by Joint Mention Detection and Mention Clustering

no code implementations • ACL 2018 • Rui Zhang, Cicero Nogueira dos santos, Michihiro Yasunaga, Bing Xiang, Dragomir Radev

Coreference resolution aims to identify in a text all mentions that refer to the same real-world entity.

Clustering coreference-resolution

Paper
Add Code

TypeSQL: Knowledge-based Type-Aware Neural Text-to-SQL Generation

1 code implementation • NAACL 2018 • Tao Yu, Zifan Li, Zilin Zhang, Rui Zhang, Dragomir Radev

Interacting with relational databases through natural language helps users of any background easily query and analyze a vast amount of data.

Ranked #2 on Code Generation on WikiSQL

slot-filling Slot Filling +2

111

Paper
Code

Robust Multilingual Part-of-Speech Tagging via Adversarial Training

1 code implementation • NAACL 2018 • Michihiro Yasunaga, Jungo Kasai, Dragomir Radev

Adversarial training (AT) is a powerful regularization method for neural networks, aiming to achieve robustness to input perturbations.

Ranked #2 on Part-Of-Speech Tagging on UD

Chunking Dependency Parsing +4

Paper
Code

Addressee and Response Selection in Multi-Party Conversations with Speaker Interaction RNNs

1 code implementation • 12 Sep 2017 • Rui Zhang, Honglak Lee, Lazaros Polymenakos, Dragomir Radev

In this paper, we study the problem of addressee and response selection in multi-party conversations.

Paper
Code

Graph-based Neural Multi-Document Summarization

no code implementations • CONLL 2017 • Michihiro Yasunaga, Rui Zhang, Kshitijh Meelu, Ayush Pareek, Krishnan Srinivasan, Dragomir Radev

We propose a neural multi-document summarization (MDS) system that incorporates sentence relation graphs.

Ranked #1 on Multi-Document Summarization on DUC 2004

Document Summarization Multi-Document Summarization +3

Paper
Add Code

Cruciform: Solving Crosswords with Natural Language Processing

no code implementations • 8 Nov 2016 • Dragomir Radev, Rui Zhang, Steve Wilson, Derek Van Assche, Henrique Spyra Gubert, Alisa Krivokapic, MeiXing Dong, Chongruo wu, Spruce Bondera, Luke Brandl, Jeremy Dohmann

We describe the results of several of our experiments with the system.

Paper
Add Code

Dependency Sensitive Convolutional Neural Networks for Modeling Sentences and Documents

3 code implementations • NAACL 2016 • Rui Zhang, Honglak Lee, Dragomir Radev

Moreover, unlike other CNN-based models that analyze sentences locally by sliding windows, our system captures both the dependency information within each sentence and relationships across sentences in the same document.

Classification General Classification +4

Paper
Code

Sentence Ordering and Coherence Modeling using Recurrent Neural Networks

2 code implementations • 8 Nov 2016 • Lajanugen Logeswaran, Honglak Lee, Dragomir Radev

Modeling the structure of coherent texts is a key NLP problem.

Sentence Sentence Ordering +1

Paper
Code

Nested Propositions in Open Information Extraction

no code implementations • EMNLP 2016 • Nikita Bhutani, H. V. Jagadish, Dragomir Radev

Open Information Extraction Question Answering +1

Paper
Add Code

Effects of Creativity and Cluster Tightness on Short Text Clustering Performance

no code implementations • ACL 2016 • Catherine Finegan-Dollak, Reed Coke, Rui Zhang, Xiangyi Ye, Dragomir Radev

Clustering Semantic Textual Similarity +2

Paper
Add Code

Extractive Summarization under Strict Length Constraints

no code implementations • LREC 2016 • Yashar Mehdad, Am Stent, a, Kapil Thadani, Dragomir Radev, Youssef Billawala, Karolina Buchner

In this paper we report a comparison of various techniques for single-document extractive summarization under strict length budgets, which is a common commercial use case (e. g. summarization of news articles by news aggregators).

Extractive Summarization

Paper
Add Code

Sentence Similarity based on Dependency Tree Kernels for Multi-document Summarization

no code implementations • LREC 2016 • {\c{S}}aziye Bet{\"u}l {\"O}zate{\c{s}}, Arzucan {\"O}zg{\"u}r, Dragomir Radev

We introduce an approach based on using the dependency grammar representations of sentences to compute sentence similarity for extractive multi-document summarization.

Document Summarization Multi-Document Summarization +3

Paper
Add Code

Classifying Syntactic Regularities for Hundreds of Languages

no code implementations • 25 Mar 2016 • Reed Coke, Ben King, Dragomir Radev

This paper presents a comparison of classification methods for linguistic typology for the purpose of expanding an extensive, but sparse language resource: the World Atlas of Language Structures (WALS) (Dryer and Haspelmath, 2013).

General Classification regression