Search Results for author: Rebecca Qian

Found 7 papers, 3 papers with code

Human Evaluation of Conversations is an Open Problem: comparing the sensitivity of various methods for evaluating dialogue agents

no code implementations • NLP4ConvAI (ACL) 2022 • Eric Michael Smith, Orion Hsu, Rebecca Qian, Stephen Roller, Y-Lan Boureau, Jason Weston

At the heart of improving conversational AI is the open problem of how to evaluate conversations.

Dialogue Evaluation


Many Episode Learning in a Modular Embodied Agent via End-to-End Interaction

no code implementations • 19 Apr 2022 • Yuxuan Sun, Ethan Carlson, Rebecca Qian, Kavya Srinet, Arthur Szlam

In this work we give a case study of an embodied machine-learning (ML) powered agent that improves itself via interactions with crowd-workers.


Step by Step to Fairness: Attributing Societal Bias in Task-oriented Dialogue Systems

no code implementations • 11 Nov 2023 • Hsuan Su, Rebecca Qian, Chinnadhurai Sankar, Shahin Shayandeh, Shang-Tse Chen, Hung-Yi Lee, Daniel M. Bikel

In this paper, we propose a diagnosis method to attribute bias to each component of a TOD system.

Attribute, Fairness +2


SimpleSafetyTests: a Test Suite for Identifying Critical Safety Risks in Large Language Models

no code implementations • 14 Nov 2023 • Bertie Vidgen, Nino Scherrer, Hannah Rose Kirk, Rebecca Qian, Anand Kannappan, Scott A. Hale, Paul Röttger

While some of the models do not give a single unsafe response, most give unsafe responses to more than 20% of the prompts, with over 50% unsafe responses in the extreme.


FinanceBench: A New Benchmark for Financial Question Answering

1 code implementation • 20 Nov 2023 • Pranab Islam, Anand Kannappan, Douwe Kiela, Rebecca Qian, Nino Scherrer, Bertie Vidgen

We test 16 state-of-the-art model configurations (including GPT-4-Turbo, Llama2 and Claude2, with vector stores and long context prompts) on a sample of 150 cases from FinanceBench, and manually review their answers (n=2,400).

Question Answering, Retrieval +1


Perturbation Augmentation for Fairer NLP

1 code implementation • 25 May 2022 • Rebecca Qian, Candace Ross, Jude Fernandes, Eric Smith, Douwe Kiela, Adina Williams

Unwanted and often harmful social biases are becoming ever more salient in NLP research, affecting both models and datasets.

Fairness


droidlet: modular, heterogenous, multi-modal agents

1 code implementation • 25 Jan 2021 • Anurag Pratik, Soumith Chintala, Kavya Srinet, Dhiraj Gandhi, Rebecca Qian, Yuxuan Sun, Ryan Drew, Sara Elkafrawy, Anoushka Tiwari, Tucker Hart, Mary Williamson, Abhinav Gupta, Arthur Szlam

In recent years, there have been significant advances in building end-to-end Machine Learning (ML) systems that learn at scale.

