no code implementations • 11 Mar 2024 • Lan Wang, Vishnu Boddeti, SerNam Lim
While existing video editing tasks are limited to changes in attributes, backgrounds, and styles, our method aims to predict open-ended human action changes in video.
no code implementations • 28 Feb 2024 • Zhuoling Li, Xiaogang Xu, SerNam Lim, Hengshuang Zhao
To address these challenges, we build a detector on the bird's-eye-view (BEV) detection paradigm, whose explicit feature projection helps resolve the geometric learning ambiguity that arises when detectors are trained on data from multiple scenarios.
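A minimal sketch of what explicit BEV projection means in general, assuming a simple pinhole camera model: ground-plane grid cells are projected into the image via the camera intrinsics so that image features can be gathered per BEV cell. The function name, intrinsics, and grid values below are illustrative assumptions, not details from the paper.

```python
import numpy as np

def bev_grid_to_image(bev_xy, height, K, img_hw):
    """Project BEV ground-plane cell centers (camera coordinates:
    x right, y down, z forward) into pixel coordinates via intrinsics K.
    Returns pixel coords (n, 2) and a validity mask for in-image points."""
    n = bev_xy.shape[0]
    # Lift each (x, z) ground cell to a 3D point at the assumed ground height.
    pts = np.stack([bev_xy[:, 0], np.full(n, height), bev_xy[:, 1]], axis=1)
    uvw = pts @ K.T                      # pinhole projection
    uv = uvw[:, :2] / uvw[:, 2:3]        # perspective divide
    h, w = img_hw
    valid = (uvw[:, 2] > 0) & (uv[:, 0] >= 0) & (uv[:, 0] < w) \
            & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return uv, valid

# Illustrative intrinsics: 500 px focal length, principal point (320, 240).
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
# A 3x3 BEV grid of ground points 5-15 m ahead of the camera.
xs, zs = np.meshgrid(np.linspace(-2, 2, 3), np.linspace(5, 15, 3))
bev_xy = np.stack([xs.ravel(), zs.ravel()], axis=1)
uv, valid = bev_grid_to_image(bev_xy, height=1.5, K=K, img_hw=(480, 640))
```

In a full detector, the pixel locations `uv` would be used to sample image feature maps (e.g. by bilinear interpolation) into the BEV grid, which is what makes the geometry explicit rather than learned implicitly.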
no code implementations • 20 Sep 2023 • Mohamed Afham, Satya Narayan Shukla, Omid Poursaeed, Pengchuan Zhang, Ashish Shah, SerNam Lim
While most modern video understanding models operate on short-range clips, real-world videos are often several minutes long with semantically consistent segments of variable length.
1 code implementation • 3 Apr 2023 • Zhuoling Li, Chuanrui Zhang, Wei-Chiu Ma, Yipin Zhou, Linyan Huang, Haoqian Wang, SerNam Lim, Hengshuang Zhao
In recent years, transformer-based detectors have demonstrated remarkable performance in 2D visual perception tasks.
no code implementations • ICCV 2021 • Omid Poursaeed, Tianxing Jiang, Harry Yang, Serge Belongie, SerNam Lim
Adversarial training with these examples enables the model to withstand a wide range of attacks by observing a variety of input alterations during training.
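As a point of reference, a generic adversarial-perturbation step can be sketched with the well-known fast gradient sign method (FGSM); this is a standard technique, not necessarily the perturbation scheme used in the paper, and the toy model below is purely illustrative.

```python
import numpy as np

def fgsm_perturb(x, grad, epsilon=0.1):
    """Fast Gradient Sign Method: move the input in the direction that
    increases the loss, bounded by an L-infinity budget of epsilon."""
    return x + epsilon * np.sign(grad)

def loss_grad(w, x, y):
    """Gradient of binary cross-entropy w.r.t. the input x for a
    toy logistic-regression model with weights w."""
    p = 1.0 / (1.0 + np.exp(-np.dot(w, x)))
    return (p - y) * w

rng = np.random.default_rng(0)
w = rng.normal(size=4)   # toy model weights
x = rng.normal(size=4)   # clean input
y = 1.0                  # ground-truth label

x_adv = fgsm_perturb(x, loss_grad(w, x, y), epsilon=0.1)
# During adversarial training, the model is updated on the perturbed
# pair (x_adv, y) alongside the clean pair (x, y), so it sees a
# variety of input alterations during training.
```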