no code implementations • 26 Jun 2025 • Hongbo Liu, Jingwen He, Yi Jin, Dian Zheng, Yuhao Dong, Fan Zhang, Ziqi Huang, Yinan He, Yangguang Li, WeiChao Chen, Yu Qiao, Wanli Ouyang, Shengjie Zhao, Ziwei Liu
We open-source our models, data, and code to foster rapid progress in this crucial area of AI-driven cinematic understanding and generation.
1 code implementation • 27 Mar 2025 • Dian Zheng, Ziqi Huang, Hongbo Liu, Kai Zou, Yinan He, Fan Zhang, Yuanhan Zhang, Jingwen He, Wei-Shi Zheng, Yu Qiao, Ziwei Liu
To bridge this gap, we introduce VBench-2. 0, a next-generation benchmark designed to automatically evaluate video generative models for their intrinsic faithfulness.
1 code implementation • 14 Jan 2025 • Weichen Fan, Chenyang Si, Junhao Song, Zhenyu Yang, Yinan He, Long Zhuo, Ziqi Huang, Ziyue Dong, Jingwen He, Dongwei Pan, Yi Wang, Yuming Jiang, Yaohui Wang, Peng Gao, Xinyuan Chen, Hengjie Li, Dahua Lin, Yu Qiao, Ziwei Liu
We present Vchitect-2. 0, a parallel transformer architecture designed to scale up video diffusion models for large-scale text-to-video generation.
no code implementations • CVPR 2025 • Shuwei Shi, Biao Gong, Xi Chen, Dandan Zheng, Shuai Tan, Zizheng Yang, Yuyuan Li, Jingwen He, Kecheng Zheng, Jingdong Chen, Ming Yang, Yinqiang Zheng
We then present a new I2V model, named MotionStone, developed with the decoupled motion estimator.
no code implementations • 4 Dec 2024 • Jing Tan, Shuai Yang, Tong Wu, Jingwen He, Yuwei Guo, Ziwei Liu, Dahua Lin
$360^\circ$ videos offer a hyper-immersive experience that allows the viewers to explore a dynamic scene from full 360 degrees.
no code implementations • 10 Jul 2024 • Jingwen He, Tianfan Xue, Dongyang Liu, Xinqi Lin, Peng Gao, Dahua Lin, Yu Qiao, Wanli Ouyang, Ziwei Liu
Given a generated low-quality video, our approach can increase its spatial and temporal resolution simultaneously with arbitrary up-sampling space and time scales through a unified video diffusion model.
no code implementations • 24 Jun 2024 • Shuwei Shi, Wenbo Li, Yuechen Zhang, Jingwen He, Biao Gong, Yinqiang Zheng
Diffusion models excel at producing high-quality images; however, scaling to higher resolutions, such as 4K, often results in over-smoothed content, structural distortions, and repetitive patterns.
2 code implementations • 9 May 2024 • Peng Gao, Le Zhuo, Dongyang Liu, Ruoyi Du, Xu Luo, Longtian Qiu, Yuhang Zhang, Chen Lin, Rongjie Huang, Shijie Geng, Renrui Zhang, Junlin Xi, Wenqi Shao, Zhengkai Jiang, Tianshuo Yang, Weicai Ye, He Tong, Jingwen He, Yu Qiao, Hongsheng Li
Sora unveils the potential of scaling Diffusion Transformer for generating photorealistic images and videos at arbitrary resolutions, aspect ratios, and durations, yet it still lacks sufficient implementation details.
no code implementations • 30 Apr 2024 • Ziyan Chen, Jingwen He, Xinqi Lin, Yu Qiao, Chao Dong
Blind face restoration (BFR) on images has significantly progressed over the last several years, while real-world video face restoration (VFR), which is more challenging for more complex face motions such as moving gaze directions and facial orientations involved, remains unsolved.
no code implementations • CVPR 2024 • Fanghua Yu, Jinjin Gu, Zheyuan Li, JinFan Hu, Xiangtao Kong, Xintao Wang, Jingwen He, Yu Qiao, Chao Dong
We introduce SUPIR (Scaling-UP Image Restoration), a groundbreaking image restoration method that harnesses generative prior and the power of model scaling up.
2 code implementations • 8 Sep 2023 • Xiangyu Chen, Zheyuan Li, Zhengwen Zhang, Jimmy S. Ren, Yihao Liu, Jingwen He, Yu Qiao, Jiantao Zhou, Chao Dong
However, most resources are still in standard dynamic range (SDR).
2 code implementations • 29 Aug 2023 • Xinqi Lin, Jingwen He, Ziyan Chen, Zhaoyang Lyu, Bo Dai, Fanghua Yu, Wanli Ouyang, Yu Qiao, Chao Dong
We present DiffBIR, a general restoration pipeline that could handle different blind image restoration tasks in a unified framework.
Ranked #1 on
Blind Face Restoration
on LFW
1 code implementation • CVPR 2023 • Yihao Liu, Jingwen He, Jinjin Gu, Xiangtao Kong, Yu Qiao, Chao Dong
However, we argue that pretraining is more significant for high-cost tasks, where data acquisition is more challenging.
1 code implementation • CVPR 2023 • Bin Xia, Jingwen He, Yulun Zhang, Yitong Wang, Yapeng Tian, Wenming Yang, Luc van Gool
In SSL, we design pruning schemes for several key components in VSR models, including residual blocks, recurrent networks, and upsampling networks.
2 code implementations • 16 May 2022 • Fangyuan Kong, Mingxi Li, Songwei Liu, Ding Liu, Jingwen He, Yang Bai, Fangmin Chen, Lean Fu
Moreover, we revisit the popular contrastive loss and observe that the selection of intermediate features of its feature extractor has great influence on the performance.
no code implementations • CVPR 2022 • Jingwen He, Wu Shi, Kai Chen, Lean Fu, Chao Dong
The style modulation aims to generate realistic face details and the feature modulation dynamically fuses the multi-level encoded features and the generated ones conditioned on the upscaling factor.
no code implementations • 7 May 2021 • Haoming Cai, Jingwen He, Qiao Yu, Chao Dong
The base networks comprise a generator and a discriminator.
no code implementations • 13 Apr 2021 • Yihao Liu, Jingwen He, Xiangyu Chen, Zhengwen Zhang, Hengyuan Zhao, Chao Dong, Yu Qiao
In practice, photo retouching can be accomplished by a series of image processing operations.
1 code implementation • 2 Oct 2020 • Hengyuan Zhao, Xiangtao Kong, Jingwen He, Yu Qiao, Chao Dong
Pixel attention (PA) is similar as channel attention and spatial attention in formulation.
1 code implementation • ECCV 2020 • Jingwen He, Yihao Liu, Yu Qiao, Chao Dong
The base network acts like an MLP that processes each pixel independently and the condition network extracts the global features of the input image to generate a condition vector.
3 code implementations • 15 Sep 2020 • Kai Zhang, Martin Danelljan, Yawei Li, Radu Timofte, Jie Liu, Jie Tang, Gangshan Wu, Yu Zhu, Xiangyu He, Wenjie Xu, Chenghua Li, Cong Leng, Jian Cheng, Guangyang Wu, Wenyi Wang, Xiaohong Liu, Hengyuan Zhao, Xiangtao Kong, Jingwen He, Yu Qiao, Chao Dong, Maitreya Suin, Kuldeep Purohit, A. N. Rajagopalan, Xiaochuan Li, Zhiqiang Lang, Jiangtao Nie, Wei Wei, Lei Zhang, Abdul Muqeet, Jiwon Hwang, Subin Yang, JungHeum Kang, Sung-Ho Bae, Yongwoo Kim, Geun-Woo Jeon, Jun-Ho Choi, Jun-Hyuk Kim, Jong-Seok Lee, Steven Marty, Eric Marty, Dongliang Xiong, Siang Chen, Lin Zha, Jiande Jiang, Xinbo Gao, Wen Lu, Haicheng Wang, Vineeth Bhaskara, Alex Levinshtein, Stavros Tsogkas, Allan Jepson, Xiangzhen Kong, Tongtong Zhao, Shanshan Zhao, Hrishikesh P. S, Densen Puthussery, Jiji C. V, Nan Nan, Shuai Liu, Jie Cai, Zibo Meng, Jiaming Ding, Chiu Man Ho, Xuehui Wang, Qiong Yan, Yuzhi Zhao, Long Chen, Jiangtao Zhang, Xiaotong Luo, Liang Chen, Yanyun Qu, Long Sun, Wenhao Wang, Zhenbing Liu, Rushi Lan, Rao Muhammad Umer, Christian Micheloni
This paper reviews the AIM 2020 challenge on efficient single image super-resolution with focus on the proposed solutions and results.
1 code implementation • ECCV 2020 • Jingwen He, Chao Dong, Yu Qiao
To make a step forward, this paper presents a new problem setup, called multi-dimension (MD) modulation, which aims at modulating output effects across multiple degradation types and levels.
1 code implementation • CVPR 2019 • Jingwen He, Chao Dong, Yu Qiao
In image restoration tasks, like denoising and super resolution, continual modulation of restoration levels is of great importance for real-world applications, but has failed most of existing deep learning based image restoration methods.
Ranked #2 on
Color Image Denoising
on CBSD68 sigma75