Using Interactive Feedback to Improve the Accuracy and Explainability of Question Answering Systems Post-Deployment

Most research on question answering focuses on the pre-deployment stage, i.e., building an accurate model for deployment. In this paper, we ask the question: Can we improve QA systems further \emph{post-}deployment based on user interactions? We focus on two kinds of improvements: 1) improving the QA system's performance itself, and 2) providing the model with the ability to explain the correctness or incorrectness of an answer. We collect a retrieval-based QA dataset, FeedbackQA, which contains interactive feedback from users. We collect this dataset by deploying a base QA system to crowdworkers, who then engage with the system and provide feedback on the quality of its answers. The feedback contains both structured ratings and unstructured natural language explanations. We train a neural model with this feedback data that can generate explanations and re-score answer candidates. We show that feedback data not only improves the accuracy of the deployed QA system but also that of other, stronger non-deployed systems. The generated explanations also help users make informed decisions about the correctness of answers. Project page: https://mcgill-nlp.github.io/feedbackqa/
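The re-scoring step described above can be pictured as a feedback-trained reranker applied to retrieved answer candidates. The sketch below is a minimal, hypothetical illustration rather than the paper's released code: it assumes a cross-encoder fine-tuned to predict feedback ratings for a (question, candidate answer) pair and re-ranks candidates by their expected rating. The encoder name, the 4-level rating scale, and the expected-rating scoring rule are illustrative assumptions.

```python
# Hypothetical sketch: re-scoring retrieved answer candidates with a
# reranker fine-tuned on structured feedback ratings.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # placeholder; any encoder checkpoint works
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# Assumption: the classification head has been fine-tuned to predict
# feedback ratings (e.g. bad / could be improved / acceptable / excellent).
reranker = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=4)
reranker.eval()

def rerank(question: str, candidates: list[str]) -> list[tuple[str, float]]:
    """Score each candidate by its expected feedback rating and sort descending."""
    scored = []
    for answer in candidates:
        inputs = tokenizer(question, answer, truncation=True, return_tensors="pt")
        with torch.no_grad():
            logits = reranker(**inputs).logits.squeeze(0)
        probs = torch.softmax(logits, dim=-1)
        # Expected rating: weight each rating level (0..3) by its probability.
        expected_rating = (probs * torch.arange(probs.numel(), dtype=torch.float)).sum()
        scored.append((answer, expected_rating.item()))
    return sorted(scored, key=lambda x: x[1], reverse=True)
```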


Datasets

FeedbackQA

Results from the Paper


Task            Dataset     Model                         Metric     Value   Global Rank
Overall - Test  FeedbackQA  BERT RQA + CombinedReranker   Accuracy   67.97   #1
Overall - Test  FeedbackQA  BERT RQA + FeedbackReranker   Accuracy   66.59   #2
Overall - Test  FeedbackQA  BERT RQA + VanillaReranker    Accuracy   65.98   #3
Overall - Test  FeedbackQA  BERT RQA                      Accuracy   64.75   #4
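
The "CombinedReranker" rows suggest mixing the base retriever's score with the feedback reranker's score. Below is a minimal sketch of one such combination rule; the linear interpolation, the weight value, and the absence of score normalization are assumptions for illustration, not the paper's exact formulation.

```python
# Hypothetical sketch: combine the base QA system's retrieval score with a
# reranker score via simple linear interpolation.
def combined_score(base_score: float, reranker_score: float, alpha: float = 0.5) -> float:
    """Interpolate between the base retriever score and the reranker score."""
    return alpha * base_score + (1.0 - alpha) * reranker_score

# Usage: rank candidates by the combined score.
# Each tuple is (answer text, base retriever score, reranker score) -- made-up values.
candidates = [("answer A", 0.82, 2.7), ("answer B", 0.75, 3.1)]
ranked = sorted(candidates, key=lambda c: combined_score(c[1], c[2]), reverse=True)
print([c[0] for c in ranked])
```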

Methods