no code implementations • 7 Jan 2025 • Guozhen Zhang, Yuhan Zhu, Yutao Cui, Xiaotong Zhao, Kai Ma, LiMin Wang
Generative frame interpolation, empowered by large-scale pre-trained video generation models, has demonstrated remarkable advantages in complex scenes.
1 code implementation • 3 Dec 2024 • Weijie Kong, Qi Tian, Zijian Zhang, Rox Min, Zuozhuo Dai, Jin Zhou, Jiangfeng Xiong, Xin Li, Bo Wu, Jianwei Zhang, Kathrina Wu, Qin Lin, Junkun Yuan, Yanxin Long, Aladdin Wang, Andong Wang, Changlin Li, Duojun Huang, Fang Yang, Hao Tan, Hongmei Wang, Jacob Song, Jiawang Bai, Jianbing Wu, Jinbao Xue, Joey Wang, Kai Wang, Mengyang Liu, Pengyu Li, Shuai Li, Weiyan Wang, Wenqing Yu, Xinchi Deng, Yang Li, Yi Chen, Yutao Cui, Yuanbo Peng, Zhentao Yu, Zhiyu He, Zhiyong Xu, Zixiang Zhou, Zunnan Xu, Yangyu Tao, Qinglin Lu, Songtao Liu, Dax Zhou, Hongfa Wang, Yong Yang, Di Wang, Yuhong Liu, Jie Jiang, Caesar Zhong
In this report, we introduce HunyuanVideo, an innovative open-source video foundation model whose video generation performance is comparable to, or even surpasses, that of leading closed-source models.
1 code implementation • 2 Jul 2024 • Guozhen Zhang, Chunxu Liu, Yutao Cui, Xiaotong Zhao, Kai Ma, LiMin Wang
In this paper, we propose VFIMamba, a novel frame interpolation method for efficient and dynamic inter-frame modeling by harnessing the S6 model.
Ranked #1 on Video Frame Interpolation on X4K1000FPS-2K
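The S6 model behind VFIMamba processes token sequences with a linear state-space recurrence rather than attention, which is what makes its inter-frame modeling efficient. As a rough illustration only (not the paper's implementation), a plain, non-selective state-space scan over a short token sequence can be sketched as follows; all dimensions and matrices here are invented for the example:

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Minimal linear state-space scan: h_t = A h_{t-1} + B x_t, y_t = C h_t.

    Illustrative only: real S6 layers (as used by VFIMamba) make the
    parameters input-dependent ("selective") and use a learned step size.
    """
    seq_len, _ = x.shape
    h = np.zeros(A.shape[0])
    ys = []
    for t in range(seq_len):
        h = A @ h + B @ x[t]   # fold the current token into the hidden state
        ys.append(C @ h)       # read out an output token
    return np.stack(ys)

# Toy example: 6 "frame tokens" of dimension 4, hidden state of size 8.
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))
A = 0.9 * np.eye(8)            # stable decay of the past state
B = rng.normal(size=(8, 4)) * 0.1
C = rng.normal(size=(4, 8)) * 0.1
y = ssm_scan(x, A, B, C)
print(y.shape)  # (6, 4): one output token per input token
```

The scan runs in time linear in sequence length, in contrast to the quadratic cost of full attention, which is the efficiency argument for state-space models in dense video tasks.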
no code implementations • 7 Mar 2024 • Yutao Cui, Xiaotong Zhao, Guozhen Zhang, Shengming Cao, Kai Ma, LiMin Wang
Point-based image editing has attracted remarkable attention since the emergence of DragGAN.
no code implementations • 25 Aug 2023 • Jiaming Zhang, Yutao Cui, Gangshan Wu, LiMin Wang
To overcome these issues, we propose a unified VOS framework, coined JointFormer, for jointly modeling the three elements of features, correspondence, and a compressed memory.
1 code implementation • NeurIPS 2023 • Yutao Cui, Tianhui Song, Gangshan Wu, LiMin Wang
Our key design is to introduce four special prediction tokens and concatenate them with the tokens from target template and search areas.
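The prediction-token design can be sketched in a few lines: learnable tokens are appended to the template and search tokens, mixed with them by attention, and then read back out for the prediction heads. The following numpy sketch is illustrative only (single head, identity projections, made-up dimensions), not the paper's implementation:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, d_k):
    # Single-head attention with identity projections, for illustration only.
    scores = tokens @ tokens.T / np.sqrt(d_k)
    return softmax(scores) @ tokens

rng = np.random.default_rng(0)
d = 16
template = rng.normal(size=(49, d))    # e.g. 7x7 template patch tokens
search = rng.normal(size=(196, d))     # e.g. 14x14 search-region tokens
pred_tokens = rng.normal(size=(4, d))  # the four learnable prediction tokens

# Concatenate everything; the prediction tokens attend to all other tokens.
seq = np.concatenate([template, search, pred_tokens], axis=0)
out = self_attention(seq, d)
pred_out = out[-4:]                    # read the prediction tokens back out
print(pred_out.shape)  # (4, 16): fed to small heads for box and score
```

The appeal of this design is that target localization is distilled into a few dedicated tokens instead of a dense correlation map, so the prediction heads can stay very small.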
1 code implementation • ICCV 2023 • Yutao Cui, Chenkai Zeng, Xiaoyu Zhao, Yichun Yang, Gangshan Wu, LiMin Wang
We expect SportsMOT to encourage MOT trackers to improve in both motion-based association and appearance-based association.
Ranked #8 on Multi-Object Tracking on SportsMOT (using extra training data)
1 code implementation • 6 Feb 2023 • Yutao Cui, Cheng Jiang, Gangshan Wu, LiMin Wang
Our core design is to exploit the flexibility of attention operations and propose a Mixed Attention Module (MAM) for simultaneous feature extraction and target information integration.
Ranked #3 on Visual Object Tracking on TrackingNet
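The mixed-attention idea can be illustrated with a small numpy sketch: template and search tokens are concatenated into one sequence, so a single attention pass extracts features within each set and integrates target information across the two sets at once. This is a hedged sketch under invented dimensions, not the MixFormer implementation:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def mixed_attention(template, search, Wq, Wk, Wv):
    """Sketch of one mixed-attention step over joined template/search tokens."""
    tokens = np.concatenate([template, search], axis=0)
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    out = attn @ v
    n_t = template.shape[0]
    return out[:n_t], out[n_t:]   # updated template / search tokens

rng = np.random.default_rng(0)
d = 16
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
t_out, s_out = mixed_attention(rng.normal(size=(49, d)),
                               rng.normal(size=(196, d)), Wq, Wk, Wv)
print(t_out.shape, s_out.shape)  # (49, 16) (196, 16)
```

A real module would use multi-head attention with per-layer learned projections and stack many such layers; the sketch only shows the token-mixing idea that lets one operation serve both feature extraction and target integration.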
1 code implementation • CVPR 2022 • Yutao Cui, Cheng Jiang, LiMin Wang, Gangshan Wu
Our core design is to exploit the flexibility of attention operations and propose a Mixed Attention Module (MAM) for simultaneous feature extraction and target information integration.
Ranked #3 on Visual Object Tracking on AVisT
Semi-Supervised Video Object Segmentation • Video Object Tracking (+1 more)
1 code implementation • 1 Apr 2021 • Yutao Cui, Cheng Jiang, LiMin Wang, Gangshan Wu
Accurate tracking is still a challenging task due to appearance variations, pose and view changes, and geometric deformations of the target in videos.
Ranked #1 on Visual Object Tracking on VOT2019
2 code implementations • 15 Apr 2020 • Yutao Cui, Cheng Jiang, Li-Min Wang, Gangshan Wu
To tackle this issue, we present the fully convolutional online tracking framework, coined as FCOT, and focus on enabling online learning for both classification and regression branches by using a target filter based tracking paradigm.
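The target-filter paradigm that FCOT builds on can be shown with a toy numpy example: a small filter representing the target's appearance is slid over the search-region feature map, and the peak of the resulting response map localizes the target. This is a deliberately simplified, single-channel sketch with synthetic data, not the FCOT implementation (which learns such filters online for both its classification and regression branches):

```python
import numpy as np

def correlate2d_valid(feat, filt):
    """Valid-mode 2D cross-correlation of a single-channel map with a filter."""
    H, W = feat.shape
    h, w = filt.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(feat[i:i + h, j:j + w] * filt)
    return out

# Synthetic search features with a "target" patch at position (8, 8),
# and a filter tuned to that target's appearance.
search_feat = np.zeros((20, 20))
search_feat[8:12, 8:12] = 1.0
target_filter = np.ones((4, 4))

response = correlate2d_valid(search_feat, target_filter)
cy, cx = np.unravel_index(response.argmax(), response.shape)
print(cy, cx)  # 8 8: the response peak recovers the target position
```

In a real tracker the features are deep multi-channel maps and the correlation is implemented as a convolution on the GPU, but the localization principle — slide a target-specific filter and take the peak response — is the same.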