Audio Question Answering

5 papers with code • 2 benchmarks • 1 dataset

Benchmarks

These leaderboards are used to track progress in Audio Question Answering

Dataset	Best Model
DAQA	MALiMo (6 Blocks)
RoadTracer	XLNet

Datasets

RoadTracer

Latest papers

Multi-Scale Attention for Audio Question Answering

gewu-lab/mwafm • 29 May 2023

Audio question answering (AQA), a widely used proxy task for exploring scene understanding, has attracted growing attention.

Pengi: An Audio Language Model for Audio Tasks

microsoft/pengi • NeurIPS 2023

We introduce Pengi, a novel Audio Language Model that leverages Transfer Learning by framing all audio tasks as text-generation tasks.

247

19 May 2023

ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

modelscope/modelscope • 18 May 2023

In this work, we explore a scalable way for building a general representation model toward unlimited modalities.

6,055

Temporal Reasoning via Audio Question Answering

facebookresearch/daqa • 21 Nov 2019

In this paper, we use the task of Audio Question Answering (AQA) to study the temporal reasoning abilities of machine learning models.

XLNet: Generalized Autoregressive Pretraining for Language Understanding

huggingface/transformers • NeurIPS 2019

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling.

124,984

19 Jun 2019

