no code implementations • 19 Dec 2024 • Kunpeng Song, Tingbo Hou, Zecheng He, Haoyu Ma, Jialiang Wang, Animesh Sinha, Sam Tsai, Yaqiao Luo, Xiaoliang Dai, Li Chen, Xide Xia, Peizhao Zhang, Peter Vajda, Ahmed Elgammal, Felix Juefei-Xu
In this paper, we introduce DirectorLLM, a novel video generation model that employs a large language model (LLM) to orchestrate human poses within videos.
1 code implementation • 8 Apr 2024 • Kunpeng Song, Yizhe Zhu, Bingchen Liu, Qing Yan, Ahmed Elgammal, Xiao Yang
This approach effectively synergizes reference image and text prompt information to produce valuable image features, facilitating an image diffusion model.
1 code implementation • 8 Jun 2023 • Ligong Han, Song Wen, Qi Chen, Zhixing Zhang, Kunpeng Song, Mengwei Ren, Ruijiang Gao, Anastasis Stathopoulos, Xiaoxiao He, Yuxiao Chen, Di Liu, Qilong Zhangli, Jindong Jiang, Zhaoyang Xia, Akash Srivastava, Dimitris Metaxas
Null-text inversion (NTI) optimizes null embeddings to align the reconstruction and inversion trajectories with larger CFG scales, enabling real image editing with cross-attention control.
no code implementations • 8 Dec 2022 • Kunpeng Song, Ligong Han, Bingchen Liu, Dimitris Metaxas, Ahmed Elgammal
Can a text-to-image diffusion model be used as a training objective for adapting a GAN generator to another domain?
7 code implementations • ICLR 2021 • Bingchen Liu, Yizhe Zhu, Kunpeng Song, Ahmed Elgammal
Training Generative Adversarial Networks (GANs) on high-fidelity images usually requires large-scale GPU clusters and a vast number of training images.
Ranked #2 on Image Generation on ADE-Indoor
1 code implementation • 16 Dec 2020 • Bingchen Liu, Yizhe Zhu, Kunpeng Song, Ahmed Elgammal
Moreover, with the proposed sketch generator, the model shows promising performance on style mixing and style transfer, which require synthesized images to be both style-consistent and semantically meaningful.
no code implementations • 27 May 2020 • Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard de Melo, Ahmed Elgammal
Focusing on text-to-image (T2I) generation, we propose Text and Image Mutual-Translation Adversarial Networks (TIME), a lightweight but effective model that jointly learns a T2I generator G and an image captioning discriminator D under the Generative Adversarial Network framework.
1 code implementation • 26 Feb 2020 • Bingchen Liu, Kunpeng Song, Ahmed Elgammal
We propose a new approach for synthesizing fully detailed art-stylized images from sketches.
1 code implementation • 10 Oct 2017 • Yuqing Zhu, Jianxun Liu, Mengying Guo, Yungang Bao, Wenlong Ma, Zhuoyue Liu, Kunpeng Song, Yingchun Yang
To help users tap the performance potential of systems, we present BestConfig, a system for automatically finding a best configuration setting within a resource limit for a deployed system under a given application workload.
Performance • Databases • Distributed, Parallel, and Cluster Computing • Software Engineering