no code implementations • INLG (ACL) 2020 • David M. Howcroft, Anya Belz, Miruna-Adriana Clinciu, Dimitra Gkatzia, Sadid A. Hasan, Saad Mahamood, Simon Mille, Emiel van Miltenburg, Sashank Santhanam, Verena Rieser
Human assessment remains the most trusted form of evaluation in NLG, but highly diverse approaches and a proliferation of different quality criteria used by researchers make it difficult to compare results and draw conclusions across papers, with adverse implications for meta-evaluation and reproducibility.
1 code implementation • 11 Oct 2021 • Sashank Santhanam, Behnam Hedayatnia, Spandana Gella, Aishwarya Padmakumar, Seokhwan Kim, Yang Liu, Dilek Hakkani-Tur
We demonstrate the benefit of our Conv-FEVER dataset by showing, through evaluation on our human-annotated data, that models trained on it perform reasonably well at detecting responses that are factually inconsistent with the provided knowledge.
no code implementations • ACL (GEM) 2021 • Sebastian Gehrmann, Tosin Adewumi, Karmanya Aggarwal, Pawan Sasanka Ammanamanchi, Aremu Anuoluwapo, Antoine Bosselut, Khyathi Raghavi Chandu, Miruna Clinciu, Dipanjan Das, Kaustubh D. Dhole, Wanyu Du, Esin Durmus, Ondřej Dušek, Chris Emezue, Varun Gangal, Cristina Garbacea, Tatsunori Hashimoto, Yufang Hou, Yacine Jernite, Harsh Jhamtani, Yangfeng Ji, Shailza Jolly, Mihir Kale, Dhruv Kumar, Faisal Ladhak, Aman Madaan, Mounica Maddela, Khyati Mahajan, Saad Mahamood, Bodhisattwa Prasad Majumder, Pedro Henrique Martins, Angelina McMillan-Major, Simon Mille, Emiel van Miltenburg, Moin Nadeem, Shashi Narayan, Vitaly Nikolaev, Rubungo Andre Niyongabo, Salomey Osei, Ankur Parikh, Laura Perez-Beltrachini, Niranjan Ramesh Rao, Vikas Raunak, Juan Diego Rodriguez, Sashank Santhanam, João Sedoc, Thibault Sellam, Samira Shaikh, Anastasia Shimorina, Marco Antonio Sobrevilla Cabezudo, Hendrik Strobelt, Nishant Subramani, Wei Xu, Diyi Yang, Akhila Yerukola, Jiawei Zhou
We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics.
Ranked #1 on Extreme Summarization on GEM-XSum
1 code implementation • 20 Oct 2020 • Sashank Santhanam, Wei Ping, Raul Puri, Mohammad Shoeybi, Mostofa Patwary, Bryan Catanzaro
State-of-the-art conversational agents have advanced significantly in conjunction with the use of large transformer-based language models.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Sashank Santhanam, Zhuo Cheng, Brodie Mather, Bonnie Dorr, Archna Bhatia, Bryanna Hebenstreit, Alan Zemel, Adam Dalton, Tomek Strzalkowski, Samira Shaikh
Achieving true human-like ability to conduct a conversation remains an elusive goal for open-ended dialogue systems.
no code implementations • WS 2020 • Sashank Santhanam, Samira Shaikh
Evaluation of output from natural language generation (NLG) systems is typically conducted via crowdsourced human judgments.
no code implementations • LREC 2020 • Adam Dalton, Ehsan Aghaei, Ehab Al-Shaer, Archna Bhatia, Esteban Castillo, Zhuo Cheng, Sreekar Dhaduvai, Qi Duan, Bryanna Hebenstreit, Md Mazharul Islam, Younes Karimi, Amir Masoumzadeh, Brodie Mather, Sashank Santhanam, Samira Shaikh, Alan Zemel, Tomek Strzalkowski, Bonnie J. Dorr
We describe a system that supports natural language processing (NLP) components for active defenses against social engineering attacks.
no code implementations • 20 Apr 2020 • Adam Dalton, Ehsan Aghaei, Ehab Al-Shaer, Archna Bhatia, Esteban Castillo, Zhuo Cheng, Sreekar Dhaduvai, Qi Duan, Md Mazharul Islam, Younes Karimi, Amir Masoumzadeh, Brodie Mather, Sashank Santhanam, Samira Shaikh, Tomek Strzalkowski, Bonnie J. Dorr
We describe Panacea, a system that supports natural language processing (NLP) components for active defenses against social engineering attacks.
no code implementations • LREC 2020 • Archna Bhatia, Adam Dalton, Brodie Mather, Sashank Santhanam, Samira Shaikh, Alan Zemel, Tomek Strzalkowski, Bonnie J. Dorr
We present a paradigm for extensible lexicon development based on Lexical Conceptual Structure to support social engineering detection and response generation.
no code implementations • 25 Feb 2020 • Bonnie J. Dorr, Archna Bhatia, Adam Dalton, Brodie Mather, Bryanna Hebenstreit, Sashank Santhanam, Zhuo Cheng, Samira Shaikh, Alan Zemel, Tomek Strzalkowski
Social engineers attempt to manipulate users into undertaking actions such as downloading malware by clicking links or providing access to money or sensitive information.
no code implementations • 18 Feb 2020 • Sashank Santhanam, Alireza Karduni, Samira Shaikh
To investigate, we conducted a between-subjects study with 77 crowdsourced workers to understand the role of cognitive biases, specifically anchoring bias, when humans are asked to evaluate the output of conversational agents.
1 code implementation • 26 Nov 2019 • Vidhushini Srinivasan, Sashank Santhanam, Samira Shaikh
We propose an approach towards natural language generation using a bidirectional encoder-decoder which incorporates external rewards through reinforcement learning (RL).
1 code implementation • CCNLG (ACL) 2019 • Sashank Santhanam, Samira Shaikh
Emotional language generation is one of the keys to human-like artificial intelligence.
1 code implementation • WS 2019 • Sashank Santhanam, Samira Shaikh
To overcome the limitations of automated metrics (e.g., BLEU, METEOR) for evaluating dialogue systems, researchers typically use human judgments to provide convergent evidence.
1 code implementation • 19 Jul 2019 • Sashank Santhanam, Vidhushini Srinivasan, Shaina Glass, Samira Shaikh
We study how emojis are used to express solidarity in social media in the context of two major crisis events: a natural disaster, Hurricane Irma in 2017, and the terrorist attacks that occurred in Paris in November 2015.
no code implementations • 2 Jun 2019 • Sashank Santhanam, Samira Shaikh
We provide a comprehensive review of work towards building open-domain dialogue systems, an important application of natural language generation.