no code implementations • 8 Mar 2025 • Shaobin Zhuang, Zhipeng Huang, Binxin Yang, Ying Zhang, Fangyikang Wang, Canmiao Fu, Chong Sun, Zheng-Jun Zha, Chen Li, Yali Wang
Video editing increasingly demands the ability to incorporate specific real-world instances into existing footage, yet current approaches fundamentally fail to capture the unique visual characteristics of particular subjects and to ensure natural instance/scene interactions.
no code implementations • 3 Mar 2025 • Zhipeng Huang, Shaobin Zhuang, Canmiao Fu, Binxin Yang, Ying Zhang, Chong Sun, Zhizheng Zhang, Yali Wang, Chen Li, Zheng-Jun Zha
In this work, we introduce WeGen, a model that unifies multimodal generation and understanding, and promotes their interplay in iterative generation.
1 code implementation • CVPR 2024 • Zigang Geng, Binxin Yang, Tiankai Hang, Chen Li, Shuyang Gu, Ting Zhang, Jianmin Bao, Zheng Zhang, Han Hu, Dong Chen, Baining Guo
We present InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions.
2 code implementations • CVPR 2023 • Binxin Yang, Shuyang Gu, Bo Zhang, Ting Zhang, Xuejin Chen, Xiaoyan Sun, Dong Chen, Fang Wen
Language-guided image editing has recently achieved great success.
no code implementations • 23 Nov 2022 • Binxin Yang, Xuejin Chen, Chaoqun Wang, Chi Zhang, Zihan Chen, Xiaoyan Sun
With a semantic feature-matching loss for effective semantic supervision, our sketch embedding precisely conveys the semantics of the input sketches to the synthesized images.
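The abstract does not spell out the loss itself; as a rough illustration, below is a minimal sketch of a generic feature-matching loss of the kind commonly used for this sort of semantic supervision, computed as a mean L1 distance between intermediate features of a generated and a reference image. The list-of-features interface is an assumption for illustration, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def feature_matching_loss(feats_generated, feats_reference):
    """Generic feature-matching loss: mean L1 distance between
    intermediate features of a generated image and a reference image.

    Both arguments are lists of tensors, one per chosen layer of a
    pretrained feature extractor (a hypothetical setup; the layer
    choice and extractor are not specified by the paper).
    """
    assert len(feats_generated) == len(feats_reference)
    loss = 0.0
    for fg, fr in zip(feats_generated, feats_reference):
        loss = loss + F.l1_loss(fg, fr)
    return loss / len(feats_generated)
```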
1 code implementation • 12 Sep 2022 • Junshu Tang, Bo Zhang, Binxin Yang, Ting Zhang, Dong Chen, Lizhuang Ma, Fang Wen
In contrast to the traditional avatar creation pipeline, which is a costly process, contemporary generative approaches learn the data distribution directly from photographs.
1 code implementation • 31 Aug 2020 • Yuhang Li, Xuejin Chen, Binxin Yang, Zihan Chen, Zhihua Cheng, Zheng-Jun Zha
In this paper, we explore the task of generating photo-realistic face images from hand-drawn sketches.