Audio-Visual Video Captioning
Latest papers with no code
Knowledge Distillation for Efficient Audio-Visual Video Captioning
Automatically describing audio-visual content in text, known as video captioning, has received significant attention due to its potential applications across diverse fields.
An Attempt towards Interpretable Audio-Visual Video Captioning
To achieve this, we propose a multimodal convolutional neural network-based audio-visual video captioning framework and introduce a modality-aware module for exploring modality selection during sentence generation.