Audio Question Answering
5 papers with code • 2 benchmarks • 1 dataset
Most implemented papers
XLNet: Generalized Autoregressive Pretraining for Language Understanding
With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling.
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
In this work, we explore a scalable way for building a general representation model toward unlimited modalities.
Temporal Reasoning via Audio Question Answering
In this paper, we use the task of Audio Question Answering (AQA) to study the temporal reasoning abilities of machine learning models.
Pengi: An Audio Language Model for Audio Tasks
We introduce Pengi, a novel Audio Language Model that leverages Transfer Learning by framing all audio tasks as text-generation tasks.
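The idea of casting every audio task as text generation can be sketched as follows. This is a minimal, hypothetical illustration (not Pengi's actual code): a stand-in text tag replaces the audio-encoder prefix, and a toy lookup stands in for the language-model decoder.

```python
# Hypothetical sketch of framing audio question answering (AQA) as
# text generation. All names and the toy "model" are assumptions.

def frame_as_text_generation(audio_tag, question):
    # In a real system an audio encoder would produce a prefix of
    # embeddings; here a text tag stands in for that prefix.
    return f"[AUDIO:{audio_tag}] question: {question} answer:"

# Toy lookup standing in for a causal language model's decoder.
TOY_ANSWERS = {
    "[AUDIO:dog_bark] question: what animal is heard? answer:": "a dog",
}

def answer(audio_tag, question):
    prompt = frame_as_text_generation(audio_tag, question)
    return TOY_ANSWERS.get(prompt, "unknown")
```

Because the output is always text, the same prompt-and-generate interface covers captioning, classification, and question answering without task-specific heads.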
Multi-Scale Attention for Audio Question Answering
Audio question answering (AQA), widely used as a proxy task for exploring scene understanding, has received increasing attention.