Text to Audio/Video Retrieval
2 papers with code • 1 benchmark • 1 dataset
Most implemented papers
MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration
Altogether, MUGEN can help advance research on many tasks in multimodal understanding and generation.
Audio Retrieval with Natural Language Queries
We consider the task of retrieving audio using free-form natural language queries.
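As a rough illustration of what this retrieval task involves, the sketch below ranks audio clips against a text query by cosine similarity in a shared embedding space and scores the result with recall@k. It is a minimal sketch under stated assumptions, not the method of either listed paper: the embeddings are hand-written toy vectors standing in for the outputs of some text and audio encoders, and the names `cosine_similarity`, `rank_clips`, and `recall_at_k` are hypothetical helpers introduced here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_clips(query_embedding, clip_embeddings):
    """Return clip ids sorted from most to least similar to the query."""
    scores = {
        clip_id: cosine_similarity(query_embedding, embedding)
        for clip_id, embedding in clip_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


def recall_at_k(ranked_ids, relevant_id, k):
    """1.0 if the relevant clip appears in the top k results, else 0.0."""
    return 1.0 if relevant_id in ranked_ids[:k] else 0.0


# Toy data (assumption): in practice these vectors would come from
# trained audio and text encoders that map into a shared space.
clip_embeddings = {
    "dog_bark": [0.9, 0.1, 0.0],
    "rain": [0.0, 0.2, 0.9],
    "piano": [0.1, 0.9, 0.1],
}
query_embedding = [0.8, 0.2, 0.1]  # stands in for the query "a dog barking"

ranking = rank_clips(query_embedding, clip_embeddings)
print(ranking)
print(recall_at_k(ranking, "dog_bark", k=1))
```

Averaging `recall_at_k` over many query and clip pairs gives the kind of recall@k figure that retrieval benchmarks commonly report.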