1 code implementation • 22 Jul 2024 • Xin Ma, Yaohui Wang, Gengyun Jia, Xinyuan Chen, Yuan-Fang Li, Cunjian Chen, Yu Qiao
Diffusion models have achieved great progress in image animation due to their powerful generative capabilities.
3 code implementations • 5 Jan 2024 • Xin Ma, Yaohui Wang, Gengyun Jia, Xinyuan Chen, Ziwei Liu, Yuan-Fang Li, Cunjian Chen, Yu Qiao
We propose a novel Latent Diffusion Transformer, namely Latte, for video generation.
1 code implementation • ICCV 2023 • Yuting Xu, Jian Liang, Gengyun Jia, Ziming Yang, Yanhao Zhang, Ran He
This paper introduces a simple yet effective strategy named Thumbnail Layout (TALL), which transforms a video clip into a pre-defined layout to preserve spatial and temporal dependencies.
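As a rough illustration of what "transforming a video clip into a pre-defined layout" could look like, the sketch below tiles consecutive frames into a single grid image. It is not the authors' implementation: the function name `frames_to_thumbnail_layout`, the row-major ordering, and the plain nested-list frame format are all assumptions made for illustration, and details the paper may use (frame resizing, masking, layout size) are left out.

```python
def frames_to_thumbnail_layout(frames, rows, cols):
    """Tile rows * cols equally sized frames into one grid image.

    Each frame is a list of pixel rows (H lists of W values). Frames are
    placed in row-major order, so temporal order reads left to right and
    then top to bottom. Hypothetical helper, for illustration only.
    """
    if len(frames) != rows * cols:
        raise ValueError("expected exactly rows * cols frames")
    height = len(frames[0])
    layout = []
    for r in range(rows):
        # Frames that share one row of the grid.
        row_frames = frames[r * cols:(r + 1) * cols]
        for y in range(height):
            line = []
            for frame in row_frames:
                # Concatenate pixel row y of each frame side by side.
                line.extend(frame[y])
            layout.append(line)
    return layout


# Four 2x3 frames, each filled with its own frame index.
clip = [[[i] * 3 for _ in range(2)] for i in range(4)]
layout = frames_to_thumbnail_layout(clip, rows=2, cols=2)
```

The resulting grid is an ordinary image, so a standard image backbone could consume it while neighbouring tiles still carry the clip's temporal order.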
no code implementations • CVPR 2022 • Gengyun Jia, Huaibo Huang, Chaoyou Fu, Ran He
In this paper, we regard image cropping as a set prediction problem.
no code implementations • 20 Dec 2021 • Xin Ma, Xiaoqiang Zhou, Huaibo Huang, Gengyun Jia, Zhenhua Chai, Xiaolin Wei
This multi-scale architecture helps the decoder translate the discriminative representations learned by the encoders into images.
no code implementations • 4 Aug 2019 • Gengyun Jia, Pei-Pei Li, Ran He
RoM pooling pools image features and discards extra padded features to eliminate the side effects of padding.