no code implementations • 10 Dec 2024 • Thong Thanh Nguyen, Xiaobao Wu, Yi Bin, Cong-Duy T Nguyen, See-Kiong Ng, Anh Tuan Luu
To overcome this limitation, we introduce a contrastive representation learning framework that focuses on motion patterns for temporal scene graph generation.
no code implementations • 10 Dec 2024 • Thong Thanh Nguyen, Yi Bin, Xiaobao Wu, Zhiyuan Hu, Cong-Duy T Nguyen, See-Kiong Ng, Anh Tuan Luu
To resolve this problem, we propose a contrastive learning framework to capture salient semantics among video moments.
1 code implementation • 30 May 2024 • Thong Thanh Nguyen, Zhiyuan Hu, Xiaobao Wu, Cong-Duy T Nguyen, See-Kiong Ng, Anh Tuan Luu
Effectively seeking answers in long videos is essential for building video question answering (videoQA) systems.