no code implementations • 4 Dec 2023 • YuChao Gu, Yipin Zhou, Bichen Wu, Licheng Yu, Jia-Wei Liu, Rui Zhao, Jay Zhangjie Wu, David Junhao Zhang, Mike Zheng Shou, Kevin Tang
In contrast to previous methods that rely on dense correspondences, we introduce the VideoSwap framework that exploits semantic point correspondences, inspired by our observation that only a small number of semantic points are necessary to align the subject's motion trajectory and modify its shape.
1 code implementation • 3 Apr 2023 • Zhuoling Li, Chuanrui Zhang, Wei-Chiu Ma, Yipin Zhou, Linyan Huang, Haoqian Wang, Ser-Nam Lim, Hengshuang Zhao
In recent years, transformer-based detectors have demonstrated remarkable performance in 2D visual perception tasks.
no code implementations • 20 Nov 2022 • Peirong Liu, Rui Wang, Pengchuan Zhang, Omid Poursaeed, Yipin Zhou, Xuefei Cao, Sreya Dutta Roy, Ashish Shah, Ser-Nam Lim
We propose TrIVD (Tracking and Image-Video Detection), the first framework that unifies image OD, video OD, and MOT within one end-to-end model.
no code implementations • 9 Oct 2021 • Peirong Liu, Rui Wang, Xuefei Cao, Yipin Zhou, Ashish Shah, Ser-Nam Lim
Key findings are twofold: (1) by capturing the motion transfer with an ordinary differential equation (ODE), it helps to regularize the motion field, and (2) by utilizing the source image itself, we are able to inpaint occluded/missing regions arising from large motion changes.
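The first finding above can be illustrated with a toy sketch: instead of applying one large displacement to a set of points, the motion is modeled as an ODE dx/dt = v(x, t) and integrated in small steps, which keeps the resulting trajectories smooth. This is a minimal illustration of that general idea, not the paper's method; the Euler integrator and the `drift` velocity field below are hypothetical stand-ins for a learned motion network.

```python
import numpy as np

def integrate_motion_ode(points, velocity_fn, t1=1.0, steps=100):
    """Euler integration of the motion ODE dx/dt = v(x, t).

    Splitting a large motion into many small steps regularizes the
    trajectory: each intermediate displacement stays small and smooth.
    `velocity_fn` is an illustrative stand-in for a learned motion field.
    """
    dt = t1 / steps
    x = points.astype(float).copy()
    for i in range(steps):
        t = i * dt
        x += dt * velocity_fn(x, t)  # one small Euler step
    return x

# Toy velocity field: rightward drift whose speed grows linearly with time.
drift = lambda x, t: np.tile([1.0 + t, 0.0], (x.shape[0], 1))

start = np.array([[0.0, 0.0], [2.0, 3.0]])
end = integrate_motion_ode(start, drift, t1=1.0, steps=100)
# Each point moves right by roughly the integral of (1 + t) over [0, 1],
# i.e. about 1.5 units, while its vertical coordinate is unchanged.
```

With 100 Euler steps the horizontal displacement comes out close to the analytic value 1.5; fewer, larger steps would drift further from it, which is why the fine-grained ODE view acts as a regularizer.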
no code implementations • 29 Sep 2021 • Ze Wang, Yipin Zhou, Rui Wang, Tsung-Yu Lin, Ashish Shah, Ser-Nam Lim
Anything outside of a given normal population is by definition an anomaly.
no code implementations • ICCV 2021 • Yipin Zhou, Ser-Nam Lim
Deepfakes ("deep learning" + "fake") are synthetically generated videos produced by AI algorithms.
Ranked #6 on DeepFake Detection on FakeAVCeleb
no code implementations • 30 Mar 2019 • Yipin Zhou, Zhaowen Wang, Chen Fang, Trung Bui, Tamara L. Berg
This work presents computational methods for transferring body movements from one person to another with videos collected in the wild.
no code implementations • 27 Jan 2018 • Yipin Zhou, Yale Song, Tamara L. Berg
Given a still photograph, one can imagine how dynamic objects might move against a static background.
3 code implementations • CVPR 2018 • Yipin Zhou, Zhaowen Wang, Chen Fang, Trung Bui, Tamara L. Berg
As two of the five traditional human senses (sight, hearing, taste, smell, and touch), vision and sound are basic sources through which humans understand the world.
no code implementations • 27 Aug 2016 • Yipin Zhou, Tamara L. Berg
Based on lifelong observations of physical, chemical, and biological phenomena in the natural world, humans can often easily picture in their minds what an object will look like in the future.
no code implementations • ICCV 2015 • Yipin Zhou, Tamara L. Berg
Given a video of an activity, can we predict what will happen next?