1 code implementation • 30 Aug 2024 • Gueter Josmy Faure, Jia-Fong Yeh, Min-Hung Chen, Hung-Ting Su, Shang-Hong Lai, Winston H. Hsu
Existing research often treats long-form videos as extended short videos, leading to several limitations: inadequate capture of long-range dependencies, inefficient processing of redundant information, and failure to extract high-level semantic concepts.
Video Classification • Zero-Shot Long Video Breakpoint-Mode Question Answering
1 code implementation • 10 Apr 2023 • Wei-Jhe Huang, Jheng-Hsien Yeh, Min-Hung Chen, Gueter Josmy Faure, Shang-Hong Lai
Finally, we calculate the similarity between the interaction feature and the text feature for each label to determine the action category.
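The final matching step described above can be illustrated with a minimal sketch. It assumes cosine similarity as the similarity measure and a single best-scoring label as the prediction; the snippet states neither choice, and the feature vectors, label names, and function names below are hypothetical placeholders rather than the authors' implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def classify_action(interaction_feature, text_features):
    """Score the interaction feature against every label's text feature.

    Returns the highest-scoring label and the full score table.
    Picking a single top label is an assumption of this sketch.
    """
    scores = {
        label: cosine_similarity(interaction_feature, feature)
        for label, feature in text_features.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Hypothetical toy features: one text embedding per action label.
text_features = {
    "run": [1.0, 0.0, 0.0],
    "jump": [0.0, 1.0, 0.0],
    "throw": [0.0, 0.0, 1.0],
}
# Hypothetical interaction feature for one detected person.
interaction_feature = [0.2, 0.9, 0.1]

predicted_label, scores = classify_action(interaction_feature, text_features)
print(predicted_label)
```

In a real pipeline the text features would come from a text encoder applied to each label name and the interaction feature from the visual model; here both are hard-coded to keep the sketch self-contained.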
1 code implementation • 23 Oct 2022 • Gueter Josmy Faure, Min-Hung Chen, Shang-Hong Lai
Actions are about how we interact with the environment, including other people, objects, and ourselves.
Ranked #1 on Action Detection on MultiSports