InstructPix2Pix: Learning to Follow Image Editing Instructions

timothybrooks/instruct-pix2pix 17 Nov 2022

We propose a method for editing images from human instructions: given an input image and a written instruction that tells the model what to do, our model follows these instructions to edit the image.

Language Modelling, Text-based Image Editing

2,514
7.30 stars / hour

StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis

autonomousvision/stylegan-t 23 Jan 2023

Text-to-image synthesis has recently seen significant progress thanks to large pretrained language models, large-scale training data, and the introduction of scalable model families such as diffusion and autoregressive models.

Pretrained Language Models, Text-to-Image Generation

245
4.21 stars / hour

Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP

stanfordnlp/dsp 28 Dec 2022

Retrieval-augmented in-context learning has emerged as a powerful approach for addressing knowledge-intensive tasks using frozen language models (LM) and retrieval models (RM).

Retrieval

143
2.55 stars / hour

Learning the Beauty in Songs: Neural Singing Voice Beautifier

MoonInTheRiver/DiffSinger ACL 2022

We propose a latent-mapping algorithm that converts an amateur vocal tone to a professional one in the latent space.

Dynamic Time Warping

1,399
1.82 stars / hour

K-Planes: Explicit Radiance Fields in Space, Time, and Appearance

sarafridov/k-planes 24 Jan 2023

We introduce k-planes, a white-box model for radiance fields in arbitrary dimensions.

70
1.58 stars / hour

Hungry Hungry Hippos: Towards Language Modeling with State Space Models

hazyresearch/h3 28 Dec 2022

We use synthetic language modeling tasks to understand the gap between state space models (SSMs) and attention.

Few-Shot Learning, Language Modelling

149
1.32 stars / hour

Learning-Rate-Free Learning by D-Adaptation

facebookresearch/dadaptation 18 Jan 2023

In this work, we describe a single-loop method, with no back-tracking or line searches, which does not require knowledge of $D$ (the distance from the initial point to the solution set) yet asymptotically achieves the optimal rate of convergence for the complexity class of convex Lipschitz functions.

165
1.04 stars / hour

Towards Robust Blind Face Restoration with Codebook Lookup Transformer

sczhou/codeformer 22 Jun 2022

In this paper, we demonstrate that a learned discrete codebook prior in a small proxy space largely reduces the uncertainty and ambiguity of restoration mapping by casting blind face restoration as a code prediction task, while providing rich visual atoms for generating high-quality faces.

Blind Face Restoration

3,953
0.76 stars / hour

Multiview Compressive Coding for 3D Reconstruction

facebookresearch/mcc 19 Jan 2023

We introduce a simple framework that operates on 3D points of single objects or whole scenes coupled with category-agnostic large-scale training from diverse RGB-D videos.

3D Reconstruction, Self-Supervised Learning

176
0.69 stars / hour