Zero-Shot Long Video Breakpoint-Mode Question Answering

2 papers with code • 1 benchmark • 0 datasets

Most implemented papers

MovieChat: From Dense Token to Sparse Memory for Long Video Understanding

rese1f/MovieChat • CVPR 2024

Recent work integrates video foundation models with large language models to build video understanding systems that overcome the limitations of specific pre-defined vision tasks.

HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics

joslefaure/HERMES • 30 Aug 2024

Existing research often treats long-form videos as extended short videos, leading to several limitations: inadequate capture of long-range dependencies, inefficient processing of redundant information, and failure to extract high-level semantic concepts.