Audio Question Answering

5 papers with code • 2 benchmarks • 1 dataset

Multi-Scale Attention for Audio Question Answering

gewu-lab/mwafm 29 May 2023

Audio question answering (AQA), a widely used proxy task for exploring scene understanding, has attracted growing attention.

23
29 May 2023

Pengi: An Audio Language Model for Audio Tasks

microsoft/pengi NeurIPS 2023

We introduce Pengi, a novel Audio Language Model that leverages Transfer Learning by framing all audio tasks as text-generation tasks.

247
19 May 2023

ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

modelscope/modelscope 18 May 2023

In this work, we explore a scalable way for building a general representation model toward unlimited modalities.

6,055
18 May 2023

Temporal Reasoning via Audio Question Answering

facebookresearch/daqa 21 Nov 2019

In this paper, we use the task of Audio Question Answering (AQA) to study the temporal reasoning abilities of machine learning models.

20
21 Nov 2019

XLNet: Generalized Autoregressive Pretraining for Language Understanding

huggingface/transformers NeurIPS 2019

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling.

124,984
19 Jun 2019