Zero-Shot Long Video Question Answering
2 papers with code • 0 benchmarks • 1 dataset
Benchmarks
These leaderboards are used to track progress in zero-shot long video question answering
No evaluation results yet.
Most implemented papers
MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
Integrating video foundation models with large language models to build a video understanding system can overcome the limitations of specific pre-defined vision tasks.
Understanding Long Videos in One Multimodal Language Model Pass
In addition to faster inference, we find that the resulting models yield surprisingly good accuracy on long-video tasks, even with no video-specific information.