no code implementations • 14 Jan 2025 • Jiwen Yu, Yiran Qin, Xintao Wang, Pengfei Wan, Di Zhang, Xihui Liu
In this paper, we present GameFactory, a framework focused on exploring scene generalization in game video generation.
no code implementations • 23 Oct 2024 • Yiran Qin, Zhelun Shi, Jiwen Yu, Xijun Wang, Enshen Zhou, Lijun Li, Zhenfei Yin, Xihui Liu, Lu Sheng, Jing Shao, Lei Bai, Wanli Ouyang, Ruimao Zhang
WorldSimBench includes Explicit Perceptual Evaluation and Implicit Manipulative Evaluation, encompassing human preference assessments from the visual perspective and action-level evaluations in embodied tasks, covering three representative embodied scenarios: Open-Ended Embodied Environment, Autonomous, Driving, and Robot Manipulation.
no code implementations • 12 Aug 2024 • Yinhuai Wang, Qihan Zhao, Runyi Yu, Ailing Zeng, Jing Lin, Zhengyi Luo, Hok Wai Tsui, Jiwen Yu, Xiu Li, Qifeng Chen, Jian Zhang, Lei Zhang, Ping Tan
SkillMimic employs a unified configuration to learn diverse skills from human-ball motion datasets, with skill diversity and generalization improving as the dataset grows.
no code implementations • 19 May 2024 • Youmin Xu, Xuanyu Zhang, Jiwen Yu, Chong Mou, Xiandong Meng, Jian Zhang
This paper introduces Hierarchical Image Steganography, a novel method that enhances the security and capacity of embedding multiple images into a single container using diffusion models.
no code implementations • 25 Apr 2024 • Xuanyu Zhang, Youmin Xu, Runyi Li, Jiwen Yu, Weiqi Li, Zhipei Xu, Jian Zhang
Meanwhile, we introduce a sample-level audio localization method and a cross-modal copyright extraction mechanism to couple the information of audio and video frames.
1 code implementation • 25 Mar 2024 • Bin Chen, Zhenyu Zhang, Weiqi Li, Chen Zhao, Jiwen Yu, Shijie Zhao, Jie Chen, Jian Zhang
To enable such memory-intensive end-to-end fine-tuning, we propose a novel two-level invertible design to transform both (1) multi-step sampling process and (2) noise estimation U-Net in each step into invertible networks.
no code implementations • 12 Dec 2023 • Shuzhou Yang, Chong Mou, Jiwen Yu, YuHan Wang, Xiandong Meng, Jian Zhang
Specifically, we construct a neural video field, powered by tri-plane and sparse grid, to enable encoding long videos with hundreds of frames in a memory-efficient manner.
no code implementations • CVPR 2024 • Xuanyu Zhang, Runyi Li, Jiwen Yu, Youmin Xu, Weiqi Li, Jian Zhang
In the era where AI-generated content (AIGC) models can produce stunning and lifelike images, the lingering shadow of unauthorized reproductions and malicious tampering poses imminent threats to copyright integrity and information security.
1 code implementation • 6 Dec 2023 • Jiwen Yu, Xiaodong Cun, Chenyang Qi, Yong Zhang, Xintao Wang, Ying Shan, Jian Zhang
For appearance control, we borrow intermediate latents and their features from the text-to-image (T2I) generation for ensuring the generated first frame is equal to the given generated image.
no code implementations • 18 Aug 2023 • Shuzhou Yang, Xuanyu Zhang, Yinhuai Wang, Jiwen Yu, YuHan Wang, Jian Zhang
Specifically, we adopt a naive unsupervised enhancement algorithm to realize preliminary restoration and design two zero-shot plug-and-play modules based on diffusion model to improve generalization and effectiveness.
1 code implementation • NeurIPS 2023 • Jiwen Yu, Xuanyu Zhang, Youmin Xu, Jian Zhang
Current image steganography techniques are mainly focused on cover-based methods, which commonly have the risk of leaking secret images and poor robustness against degraded container images.
1 code implementation • ICCV 2023 • Jiwen Yu, Yinhuai Wang, Chen Zhao, Bernard Ghanem, Jian Zhang
In this work, we propose a training-Free conditional Diffusion Model (FreeDoM) used for various conditions.
1 code implementation • 1 Mar 2023 • Yinhuai Wang, Jiwen Yu, Runyi Yu, Jian Zhang
Our simple, parameter-free approaches can be used not only for image restoration but also for image generation of unlimited sizes, with the potential to be a general tool for diffusion models.
4 code implementations • 1 Dec 2022 • Yinhuai Wang, Jiwen Yu, Jian Zhang
Most existing Image Restoration (IR) models are task-specific, which can not be generalized to different degradation operators.
Ranked #1 on
Image Compressed Sensing
on CelebA
1 code implementation • 24 Nov 2022 • Yinhuai Wang, Yujie Hu, Jiwen Yu, Jian Zhang
Consistency and realness have always been the two critical issues of image super-resolution.