Question Answering
2883 papers with code • 143 benchmarks • 360 datasets
Question Answering is the task of answering questions (typically reading comprehension questions), while abstaining when a question cannot be answered from the provided context.
Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics like EM (exact match) and F1. Some recent top-performing models are T5 and XLNet.
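The EM and F1 metrics mentioned above follow the SQuAD evaluation convention: answers are normalized (lowercased, punctuation and articles stripped) before comparison, EM checks for an exact string match, and F1 measures token overlap between prediction and gold answer. A minimal sketch of that convention (function names here are illustrative, not from any specific library):

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """SQuAD-style normalization: lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """EM: 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

In benchmark reporting, both scores are averaged over all questions (and, for datasets with multiple gold answers per question, the maximum over the gold answers is taken first).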
(Image credit: SQuAD)
Libraries
Use these libraries to find Question Answering models and implementations.
Subtasks
- Open-Ended Question Answering
- Open-Domain Question Answering
- Conversational Question Answering
- Answer Selection
- Knowledge Base Question Answering
- Community Question Answering
- Zero-Shot Video Question Answer
- Multiple Choice Question Answering (MCQA)
- Long Form Question Answering
- Science Question Answering
- Generative Question Answering
- Cross-Lingual Question Answering
- Mathematical Question Answering
- Temporal/Causal QA
- Logical Reasoning Question Answering
- Multilingual Machine Comprehension in English and Hindi
- True or False Question Answering
- Question Quality Assessment
Latest papers with no code
Boter: Bootstrapping Knowledge Selection and Question Answering for Knowledge-based VQA
Knowledge-based Visual Question Answering (VQA) requires models to incorporate external knowledge to respond to questions about visual content.
Tree of Reviews: A Tree-based Dynamic Iterative Retrieval Framework for Multi-hop Question Answering
Compared to related work, we introduce a tree structure to handle each retrieved paragraph separately, alleviating the misleading effect of irrelevant paragraphs on the reasoning path; the diversity of reasoning path extension reduces the impact of a single reasoning error on the whole.
WangLab at MEDIQA-M3G 2024: Multimodal Medical Answer Generation using Large Language Models
This paper outlines our submission to the MEDIQA2024 Multilingual and Multimodal Medical Answer Generation (M3G) shared task.
WangLab at MEDIQA-CORR 2024: Optimized LLM-based Programs for Medical Error Detection and Correction
Our results demonstrate the effectiveness of LLM based programs for medical error correction.
Exploring Diverse Methods in Visual Question Answering
This study explores innovative methods for improving Visual Question Answering (VQA) using Generative Adversarial Networks (GANs), autoencoders, and attention mechanisms.
Listen Then See: Video Alignment with Speaker Attention
Our approach exhibits an improved ability to leverage the video modality by using the audio modality as a bridge with the language modality.
FakeBench: Uncover the Achilles' Heels of Fake Images with Large Multimodal Models
Therefore, we propose FakeBench, a first-of-its-kind benchmark for transparent defake, consisting of fake images paired with human-language descriptions of forgery signs.
Predicting Question Quality on StackOverflow with Neural Networks
The wealth of information available through the Internet and social media is unprecedented.
Eyes Can Deceive: Benchmarking Counterfactual Reasoning Abilities of Multi-modal Large Language Models
Counterfactual reasoning, as a crucial manifestation of human intelligence, refers to making presuppositions based on established facts and extrapolating potential outcomes.
MM-PhyRLHF: Reinforcement Learning Framework for Multimodal Physics Question-Answering
We employ the LLaVA open-source model to answer multimodal physics MCQs and compare the performance with and without using RLHF.