no code implementations • 30 Nov 2023 • Wilson Yan, Andrew Brown, Pieter Abbeel, Rohit Girdhar, Samaneh Azadi
We introduce MoCA, a Motion-Conditioned Image Animation approach for video editing.
no code implementations • 17 Nov 2023 • Rohit Girdhar, Mannat Singh, Andrew Brown, Quentin Duval, Samaneh Azadi, Sai Saketh Rambhatla, Akbar Shah, Xi Yin, Devi Parikh, Ishan Misra
We present Emu Video, a text-to-video generation model that factorizes the generation into two steps: first generating an image conditioned on the text, and then generating a video conditioned on the text and the generated image.
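The factorized pipeline needs only a small amount of glue code. Below is a minimal sketch of that control flow, assuming hypothetical `TextToImageModel` and `ImageToVideoModel` interfaces; it illustrates the two-step factorization, not Emu Video's actual API.

```python
# Hypothetical stand-in models; only the control flow mirrors the
# two-step factorization described in the abstract.
from typing import List

import numpy as np


class TextToImageModel:
    def generate(self, prompt: str) -> np.ndarray:
        # Step 1: sample an image conditioned on the text alone.
        return np.zeros((64, 64, 3))  # placeholder image


class ImageToVideoModel:
    def generate(self, prompt: str, first_frame: np.ndarray,
                 num_frames: int) -> List[np.ndarray]:
        # Step 2: sample a video conditioned on the text AND the image.
        return [first_frame.copy() for _ in range(num_frames)]


def factorized_text_to_video(prompt: str, num_frames: int = 16) -> List[np.ndarray]:
    image = TextToImageModel().generate(prompt)
    return ImageToVideoModel().generate(prompt, image, num_frames)


frames = factorized_text_to_video("a corgi surfing a wave")
```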
no code implementations • ICCV 2023 • Samaneh Azadi, Akbar Shah, Thomas Hayes, Devi Parikh, Sonal Gupta
However, existing approaches are limited by their reliance on relatively small-scale motion capture data, leading to poor performance on more diverse, in-the-wild prompts.
Ranked #25 on Motion Synthesis on HumanML3D
no code implementations • 14 Apr 2023 • Samaneh Azadi, Thomas Hayes, Akbar Shah, Guan Pang, Devi Parikh, Sonal Gupta
Recent large-scale text-to-image generation models have made significant improvements in the quality, realism, and diversity of synthesized images, enabling users to control the created content through language.
no code implementations • 1 Dec 2022 • Dong Huk Park, Grace Luo, Clayton Toste, Samaneh Azadi, Xihui Liu, Maka Karalashvili, Anna Rohrbach, Trevor Darrell
We introduce precise object silhouette as a new form of user control in text-to-image diffusion models, which we dub Shape-Guided Diffusion.
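As a rough picture of what silhouette control means in practice, the sketch below pins every pixel outside a binary mask to a re-noised copy of the source image during sampling, so only the region inside the silhouette is edited. This is generic mask guidance with assumed `denoise_step` and `renoise` stand-ins, not necessarily the paper's exact mechanism.

```python
import numpy as np


def masked_edit(source, mask, denoise_step, renoise, num_steps=50, seed=0):
    # mask == 1 inside the silhouette (editable), 0 outside (preserved).
    x = np.random.default_rng(seed).standard_normal(source.shape)
    for t in reversed(range(num_steps)):
        x = denoise_step(x, t)                          # one reverse-diffusion step
        x = mask * x + (1 - mask) * renoise(source, t)  # pin the background
    return x


# Trivial stubs so the sketch runs end to end.
denoise_step = lambda x, t: 0.98 * x
renoise = lambda img, t: img + 0.01 * t * np.random.default_rng(t).standard_normal(img.shape)
edited = masked_edit(np.zeros((8, 8)), np.ones((8, 8)), denoise_step, renoise)
```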
no code implementations • 10 Dec 2021 • Xihui Liu, Dong Huk Park, Samaneh Azadi, Gong Zhang, Arman Chopikyan, Yuxiao Hu, Humphrey Shi, Anna Rohrbach, Trevor Darrell
We investigate fine-grained, continuous control of this model class, and introduce a novel unified framework for semantic diffusion guidance, which allows either language or image guidance, or both.
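The unified framework can be pictured as one extra term per sampling step: the gradient of a similarity score between the current sample and the guidance signal, with language and image guidance sharing the same form. The sketch below assumes hypothetical `denoise_step` and gradient callables and is not the paper's implementation.

```python
import numpy as np


def guided_step(x, t, denoise_step, grad_text_sim, grad_image_sim,
                text_weight=1.0, image_weight=0.0):
    x = denoise_step(x, t)  # ordinary denoising update
    # Language and image guidance take the same form, so either one, or a
    # weighted mix of both, can steer the sample.
    return x + text_weight * grad_text_sim(x, t) + image_weight * grad_image_sim(x, t)


# Zero-gradient stubs keep the sketch self-contained and runnable.
zero_grad = lambda x, t: np.zeros_like(x)
x = guided_step(np.zeros((4, 4)), t=10, denoise_step=lambda x, t: x,
                grad_text_sim=zero_grad, grad_image_sim=zero_grad)
```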
no code implementations • 1 Jan 2021 • Samaneh Azadi, Michael Tschannen, Eric Tzeng, Sylvain Gelly, Trevor Darrell, Mario Lucic
Coupling the high-fidelity generation capabilities of label-conditional image synthesis methods with the flexibility of unconditional generative models, we propose a semantic bottleneck GAN model for unconditional synthesis of complex scenes.
2 code implementations • 26 Nov 2019 • Samaneh Azadi, Michael Tschannen, Eric Tzeng, Sylvain Gelly, Trevor Darrell, Mario Lucic
For the former, we use an unconditional progressive segmentation generation network that captures the distribution of realistic semantic scene layouts.
Ranked #1 on Image Generation on Cityscapes-5K 256x512
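This entry describes a two-stage pipeline: an unconditional generator samples a semantic layout, then a conditional generator renders an image from it. Here is a toy sketch of that structure; both classes are illustrative placeholders rather than the released code.

```python
import numpy as np


class LayoutGenerator:
    """Unconditional stage: latent vector -> per-pixel class map (the bottleneck)."""
    def sample(self, z: np.ndarray) -> np.ndarray:
        logits = np.outer(z, z)               # toy stand-in for a real network
        return (logits > 0).astype(np.int64)  # discrete semantic labels


class ConditionalImageGenerator:
    """Conditional stage: semantic layout -> RGB image (e.g., SPADE-style)."""
    def render(self, layout: np.ndarray) -> np.ndarray:
        return np.stack([layout, 1 - layout, layout], axis=-1).astype(float)


z = np.random.default_rng(0).standard_normal(16)
image = ConditionalImageGenerator().render(LayoutGenerator().sample(z))  # (16, 16, 3)
```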
no code implementations • ICLR Workshop DeepGenStruct 2019 • Samaneh Azadi, Deepak Pathak, Sayna Ebrahimi, Trevor Darrell
Generative Adversarial Networks (GANs) can produce images of surprising complexity and realism, but they are generally structured to sample from a single latent source, ignoring the explicit spatial interactions among the multiple entities that may be present in a scene.
1 code implementation • ICLR 2019 • Samaneh Azadi, Catherine Olsson, Trevor Darrell, Ian Goodfellow, Augustus Odena
We propose a rejection sampling scheme using the discriminator of a GAN to approximately correct errors in the GAN generator distribution.
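For a near-optimal discriminator, the logit D(x) approximates log(p_data(x)/p_g(x)), so a generated sample can be accepted with probability exp(D(x) - D_max). The sketch below shows only this core loop and omits the paper's practical adjustments, such as the tunable shift inside the acceptance sigmoid.

```python
import math
import random


def discriminator_rejection_sample(generator, disc_logit, d_max, rng=random.Random(0)):
    """Draw from the generator until a sample passes the rejection test."""
    while True:
        x = generator()
        # exp(D(x) - d_max) <= 1 approximates p_data(x) / (M * p_g(x)).
        if rng.random() < math.exp(min(disc_logit(x) - d_max, 0.0)):
            return x


# Toy usage: a Gaussian "generator" and a logit that prefers samples near 0.
gen_rng = random.Random(1)
sample = discriminator_rejection_sample(lambda: gen_rng.gauss(0.0, 1.0),
                                        lambda x: -abs(x), d_max=0.0)
```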
1 code implementation • 19 Jul 2018 • Samaneh Azadi, Deepak Pathak, Sayna Ebrahimi, Trevor Darrell
Generative Adversarial Networks (GANs) can produce images of remarkable complexity and realism, but they are generally structured to sample from a single latent source, ignoring the explicit spatial interactions among the multiple entities that may be present in a scene.
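A toy sketch of the compositional alternative: give each entity its own latent source and an alpha mask, then composite the entities explicitly instead of drawing the whole scene from one latent vector. Every component here is illustrative, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)


def entity_generator(z: np.ndarray, canvas: int = 32):
    # Hypothetical per-entity generator: an RGB patch plus an alpha mask.
    img = np.clip(np.outer(np.abs(z[:canvas]), np.abs(z[:canvas])), 0, 1)
    mask = (img > img.mean()).astype(float)
    return np.stack([img] * 3, axis=-1), mask


def compose(entities):
    # Alpha-composite back to front so spatial overlap is modeled explicitly.
    scene = np.zeros_like(entities[0][0])
    for img, mask in entities:
        m = mask[..., None]
        scene = m * img + (1 - m) * scene
    return scene


scene = compose([entity_generator(rng.standard_normal(64)) for _ in range(2)])
```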
6 code implementations • CVPR 2018 • Samaneh Azadi, Matthew Fisher, Vladimir Kim, Zhaowen Wang, Eli Shechtman, Trevor Darrell
In this work, we focus on the challenge of taking partial observations of highly stylized text and generalizing the observations to generate unobserved glyphs in the ornamented typeface.
1 code implementation • CVPR 2017 • Samaneh Azadi, Jiashi Feng, Trevor Darrell
To predict a set of diverse and informative proposals with enriched representations, this paper introduces a differentiable Determinantal Point Process (DPP) layer that can augment object detection architectures.
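To make the quality-versus-diversity trade-off concrete, the sketch below builds the standard DPP kernel L = diag(q) S diag(q) from per-proposal quality scores q and a similarity matrix S, then selects proposals by greedy MAP inference. The paper's layer is differentiable and trained end to end; this greedy version only illustrates what the kernel encodes.

```python
import numpy as np


def greedy_dpp(quality: np.ndarray, similarity: np.ndarray, k: int):
    # L couples per-item quality with pairwise similarity; subsets of
    # high-quality, mutually dissimilar items get the largest det(L_Y).
    L = np.diag(quality) @ similarity @ np.diag(quality)
    selected = []
    for _ in range(k):
        candidates = [i for i in range(len(quality)) if i not in selected]
        gains = [np.linalg.slogdet(L[np.ix_(selected + [i], selected + [i])])[1]
                 for i in candidates]
        selected.append(candidates[int(np.argmax(gains))])
    return selected


q = np.array([0.90, 0.85, 0.30, 0.80])              # detection scores
S = np.array([[1.00, 0.95, 0.10, 0.20],             # e.g., box-overlap similarity
              [0.95, 1.00, 0.10, 0.20],
              [0.10, 0.10, 1.00, 0.30],
              [0.20, 0.20, 0.30, 1.00]])
print(greedy_dpp(q, S, k=2))  # [0, 3]: skips box 1, which overlaps box 0
```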
no code implementations • 22 Nov 2015 • Samaneh Azadi, Jiashi Feng, Stefanie Jegelka, Trevor Darrell
Precisely labeled datasets with a sufficient number of samples are essential for training deep convolutional neural networks (CNNs).