Search Results for author: Chitwan Saharia

Found 16 papers, 8 papers with code

TryOnDiffusion: A Tale of Two UNets

1 code implementation CVPR 2023 Luyang Zhu, Dawei Yang, Tyler Zhu, Fitsum Reda, William Chan, Chitwan Saharia, Mohammad Norouzi, Ira Kemelmacher-Shlizerman

Given two images depicting a person and a garment worn by another person, our goal is to generate a visualization of how the garment might look on the input person.

Virtual Try-on

Synthetic Data from Diffusion Models Improves ImageNet Classification

no code implementations 17 Apr 2023 Shekoofeh Azizi, Simon Kornblith, Chitwan Saharia, Mohammad Norouzi, David J. Fleet

Deep generative models are becoming increasingly powerful, now generating diverse high fidelity photo-realistic samples given text prompts.

Classification Data Augmentation

Denoising Diffusion Probabilistic Models for Robust Image Super-Resolution in the Wild

no code implementations 15 Feb 2023 Hshmat Sahak, Daniel Watson, Chitwan Saharia, David Fleet

Diffusion models have shown promising results on single-image super-resolution and other image-to-image translation tasks.

Blind Super-Resolution Denoising +2

Character-Aware Models Improve Visual Text Rendering

1 code implementation 20 Dec 2022 Rosanne Liu, Dan Garrette, Chitwan Saharia, William Chan, Adam Roberts, Sharan Narang, Irina Blok, RJ Mical, Mohammad Norouzi, Noah Constant

In the text-only domain, we find that character-aware models provide large gains on a novel spelling task (WikiSpell).

Image Generation

Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting

no code implementations CVPR 2023 Su Wang, Chitwan Saharia, Ceslee Montgomery, Jordi Pont-Tuset, Shai Noy, Stefano Pellegrini, Yasumasa Onoe, Sarah Laszlo, David J. Fleet, Radu Soricut, Jason Baldridge, Mohammad Norouzi, Peter Anderson, William Chan

Through extensive human evaluation on EditBench, we find that object-masking during training leads to across-the-board improvements in text-image alignment -- such that Imagen Editor is preferred over DALL-E 2 and Stable Diffusion -- and, as a cohort, these models are better at object-rendering than text-rendering, and handle material/color/size attributes better than count/shape attributes.

Image Inpainting Object +1

Re-Imagen: Retrieval-Augmented Text-to-Image Generator

no code implementations 29 Sep 2022 Wenhu Chen, Hexiang Hu, Chitwan Saharia, William W. Cohen

To further evaluate the capabilities of the model, we introduce EntityDrawBench, a new benchmark that evaluates image generation for diverse entities, from frequent to rare, across multiple object categories including dogs, foods, landmarks, birds, and characters.

Retrieval Text Retrieval +1

Deblurring via Stochastic Refinement

no code implementations CVPR 2022 Jay Whang, Mauricio Delbracio, Hossein Talebi, Chitwan Saharia, Alexandros G. Dimakis, Peyman Milanfar

Unlike existing techniques, we train a stochastic sampler that refines the output of a deterministic predictor and is capable of producing a diverse set of plausible reconstructions for a given input.

Deblurring Image Deblurring

Palette: Image-to-Image Diffusion Models

4 code implementations 10 Nov 2021 Chitwan Saharia, William Chan, Huiwen Chang, Chris A. Lee, Jonathan Ho, Tim Salimans, David J. Fleet, Mohammad Norouzi

We expect this standardized evaluation protocol to play a role in advancing image-to-image translation research.

Colorization Denoising +5

Cascaded Diffusion Models for High Fidelity Image Generation

no code implementations 30 May 2021 Jonathan Ho, Chitwan Saharia, William Chan, David J. Fleet, Mohammad Norouzi, Tim Salimans

We show that cascaded diffusion models are capable of generating high fidelity images on the class-conditional ImageNet generation benchmark, without any assistance from auxiliary image classifiers to boost sample quality.

Data Augmentation Image Generation +2

Non-Autoregressive Machine Translation with Latent Alignments

2 code implementations EMNLP 2020 Chitwan Saharia, William Chan, Saurabh Saxena, Mohammad Norouzi

In addition, we adapt the Imputer model for non-autoregressive machine translation and demonstrate that Imputer with just 4 generation steps can match the performance of an autoregressive Transformer baseline.

Machine Translation Translation

Combating False Negatives in Adversarial Imitation Learning

no code implementations 2 Feb 2020 Konrad Zolna, Chitwan Saharia, Leonard Boussioux, David Yu-Tung Hui, Maxime Chevalier-Boisvert, Dzmitry Bahdanau, Yoshua Bengio

In adversarial imitation learning, a discriminator is trained to differentiate agent episodes from expert demonstrations representing the desired behavior.

Imitation Learning

BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning

6 code implementations ICLR 2019 Maxime Chevalier-Boisvert, Dzmitry Bahdanau, Salem Lahlou, Lucas Willems, Chitwan Saharia, Thien Huu Nguyen, Yoshua Bengio

Allowing humans to interactively train artificial agents to understand language instructions is desirable for both practical and scientific reasons, but given the poor data efficiency of the current learning methods, this goal may require substantial research efforts.

Grounded language learning
