Search Results for author: Liuhan Chen

Found 5 papers, 5 papers with code

Next Patch Prediction for Autoregressive Visual Generation

1 code implementation • 19 Dec 2024 • Yatian Pang, Peng Jin, Shuo Yang, Bin Lin, Bin Zhu, Zhenyu Tang, Liuhan Chen, Francis E. H. Tay, Ser-Nam Lim, Harry Yang, Li Yuan

Autoregressive models, built on the Next Token Prediction (NTP) paradigm, show great potential for developing a unified framework that integrates both language and vision tasks.

Image Generation
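For context, next-token prediction trains a model to predict token t+1 from tokens 1..t with a cross-entropy loss; the title suggests the paper generalizes the predicted unit from a single token to a patch. Below is a minimal, generic sketch of the NTP objective only; the toy transformer, tensor shapes, and hyperparameters are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

# Illustrative next-token prediction loss (generic sketch, not the paper's model).
vocab_size, d_model, seq_len, batch = 1000, 64, 16, 2

embed = nn.Embedding(vocab_size, d_model)
backbone = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
head = nn.Linear(d_model, vocab_size)

tokens = torch.randint(0, vocab_size, (batch, seq_len))        # token ids
causal_mask = nn.Transformer.generate_square_subsequent_mask(seq_len)

hidden = backbone(embed(tokens), src_mask=causal_mask)         # causal attention
logits = head(hidden)                                          # (batch, seq_len, vocab)

# Shift by one position: the model at position t predicts token t+1.
loss = nn.functional.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),
    tokens[:, 1:].reshape(-1),
)
print(loss.item())
```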

Open-Sora Plan: Open-Source Large Video Generation Model

6 code implementations • 28 Nov 2024 • Bin Lin, Yunyang Ge, Xinhua Cheng, Zongjian Li, Bin Zhu, Shaodong Wang, Xianyi He, Yang Ye, Shenghai Yuan, Liuhan Chen, Tanghui Jia, Junwu Zhang, Zhenyu Tang, Yatian Pang, Bin She, Cen Yan, Zhiheng Hu, Xiaoyi Dong, Lin Chen, Zhang Pan, Xing Zhou, Shaoling Dong, Yonghong Tian, Li Yuan

We introduce Open-Sora Plan, an open-source project that aims to provide a large generation model for producing high-resolution, long-duration videos from various user inputs.

Video Generation

Identity-Preserving Text-to-Video Generation by Frequency Decomposition

1 code implementation • 26 Nov 2024 • Shenghai Yuan, Jinfa Huang, Xianyi He, Yunyuan Ge, Yujun Shi, Liuhan Chen, Jiebo Luo, Li Yuan

We propose a hierarchical training strategy that leverages frequency information for identity preservation, transforming a vanilla pre-trained video generation model into an identity-preserving text-to-video (IPT2V) model.

Image to Video Generation • Text-to-Video Generation
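The frequency decomposition in the title can be illustrated with a generic low/high frequency split of an image. The FFT-based split below is only a sketch of the general idea, not the paper's actual decomposition or training pipeline; the function name and cutoff radius are assumptions.

```python
import torch

# Generic low/high frequency split of an image via FFT (illustrative only;
# how the paper decomposes frequencies and uses them for identity features may differ).
def frequency_split(image: torch.Tensor, radius: int = 8):
    """image: (C, H, W) -> (low_freq, high_freq), both (C, H, W)."""
    C, H, W = image.shape
    spectrum = torch.fft.fftshift(torch.fft.fft2(image), dim=(-2, -1))

    # Circular low-pass mask around the centered spectrum.
    ys = torch.arange(H).view(-1, 1) - H // 2
    xs = torch.arange(W).view(1, -1) - W // 2
    mask = ((ys ** 2 + xs ** 2) <= radius ** 2).to(spectrum.dtype)

    low = torch.fft.ifft2(torch.fft.ifftshift(spectrum * mask, dim=(-2, -1))).real
    high = image - low
    return low, high

low, high = frequency_split(torch.rand(3, 64, 64))
print(low.shape, high.shape)
```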

WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model

2 code implementations • 26 Nov 2024 • Zongjian Li, Bin Lin, Yang Ye, Liuhan Chen, Xinhua Cheng, Shenghai Yuan, Li Yuan

However, as the resolution and duration of generated videos increase, the encoding cost of Video VAEs becomes a limiting bottleneck in training LVDMs.

OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model

1 code implementation • 2 Sep 2024 • Liuhan Chen, Zongjian Li, Bin Lin, Bin Zhu, Qian Wang, Shenghai Yuan, Xing Zhou, Xinhua Cheng, Li Yuan

At the same reconstruction quality, the more thoroughly the VAE compresses videos, the more efficient the LVDMs become.

Video Generation • Video Reconstruction
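To make that compression-efficiency trade-off concrete, a small back-of-the-envelope calculation shows how much an omni-dimensional (spatial plus temporal) VAE shrinks the tensor a latent diffusion model must process. The 4x temporal and 8x spatial factors and the channel counts below are common choices used purely for illustration, not necessarily OD-VAE's configuration.

```python
# Illustrative arithmetic: element counts before and after spatio-temporal VAE compression.
T, H, W, C = 65, 512, 512, 3          # example input video (frames, height, width, channels)
ft, fs, latent_c = 4, 8, 4            # assumed temporal / spatial factors and latent channels

pixels = T * H * W * C
latents = (1 + (T - 1) // ft) * (H // fs) * (W // fs) * latent_c

print(f"input elements : {pixels:,}")
print(f"latent elements: {latents:,}")
print(f"reduction      : {pixels / latents:.0f}x")
```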
