Images Speak in Images: A Generalist Painter for In-Context Visual Learning

baaivision/painter 5 Dec 2022

In this work, we present Painter, a generalist model that addresses these obstacles with an "image"-centric solution: it redefines the outputs of core vision tasks as images and specifies task prompts as images as well.

Keypoint Detection Semantic Segmentation

Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model

wyhuai/ddnm 1 Dec 2022

Most existing Image Restoration (IR) models are task-specific and cannot be generalized to different degradation operators.

Colorization Deblurring +7

Melody transcription via generative pre-training

chrisdonahue/sheetsage 4 Dec 2022

The combination of generative pre-training and a new dataset for this task results in 77% stronger performance on melody transcription relative to the strongest available baseline.

Chord Recognition Information Retrieval +2

ExtremeBERT: A Toolkit for Accelerating Pretraining of Customized BERT

extreme-bert/extreme-bert 30 Nov 2022

In this paper, we present ExtremeBERT, a toolkit for accelerating and customizing BERT pretraining.

Molecular System Prediction Sentence Classification

DAMO-YOLO : A Report on Real-Time Object Detection Design

tinyvision/damo-yolo 23 Nov 2022

In this report, we present a fast and accurate object detection method dubbed DAMO-YOLO, which achieves higher performance than the state-of-the-art YOLO series.

Neural Architecture Search Object Detection +1

Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation

pals-ttic/sjc 1 Dec 2022

We propose to apply the chain rule to the learned gradients and back-propagate the score of a diffusion model through the Jacobian of a differentiable renderer, which we instantiate as a voxel radiance field.
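
The chain-rule step described in this abstract can be illustrated with a toy sketch. This is not the authors' implementation. As assumptions for illustration only, the "renderer" is a fixed linear map `A` (so its Jacobian is `A` itself) and the "score" is the analytic score of an isotropic Gaussian standing in for a learned diffusion score. All names (`A`, `MU`, `SIGMA`, `render`, `score`, `chained_gradient`) are hypothetical.

```python
# Toy stand-ins (hypothetical, for illustration only):
#   renderer: linear map x = A theta, from 3 parameters to a 2-pixel "image"
#   score:    analytic score of N(MU, SIGMA^2 I), in place of a learned one
A = [[1.0, 2.0, 0.0],
     [0.0, 1.0, -1.0]]
MU = [0.5, -0.5]
SIGMA = 1.0


def render(theta):
    """Map parameters theta (length 3) to an image x (length 2)."""
    return [sum(a * t for a, t in zip(row, theta)) for row in A]


def score(x):
    """Gradient of log p(x) with respect to x for the Gaussian above."""
    return [-(xi - mi) / SIGMA ** 2 for xi, mi in zip(x, MU)]


def chained_gradient(theta):
    """Chain rule: grad_theta log p(render(theta)) = J^T score(x), with J = A."""
    s = score(render(theta))
    return [sum(A[i][j] * s[i] for i in range(len(A)))
            for j in range(len(theta))]


def log_p(theta):
    """Unnormalised log-density of the rendered image (used only as a check)."""
    x = render(theta)
    return -sum((xi - mi) ** 2 for xi, mi in zip(x, MU)) / (2 * SIGMA ** 2)


def finite_difference(theta, eps=1e-6):
    """Central-difference gradient of log_p, to compare with the chained one."""
    grads = []
    for j in range(len(theta)):
        up = list(theta)
        dn = list(theta)
        up[j] += eps
        dn[j] -= eps
        grads.append((log_p(up) - log_p(dn)) / (2 * eps))
    return grads
```

Calling `chained_gradient(theta)` and `finite_difference(theta)` on the same `theta` should agree to within numerical error. In this toy setting, that confirms that multiplying the image-space score by the transposed renderer Jacobian gives the gradient with respect to the scene parameters.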

GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images

nv-tlabs/GET3D 22 Sep 2022

As several industries are moving towards modeling massive 3D virtual worlds, the need for content creation tools that can scale in terms of the quantity, quality, and diversity of 3D content is becoming evident.

DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models

hzfinfdu/diffusion-bert 28 Nov 2022

We present DiffusionBERT, a new generative masked language model based on discrete diffusion models.

Denoising Language Modelling +1

Compressing Volumetric Radiance Fields to 1 MB

algohunt/vqrf 29 Nov 2022

Approximating radiance fields with volumetric grids is one of the promising directions for improving NeRF, represented by methods like Plenoxels and DVGO, which achieve super-fast training convergence and real-time rendering.

Model Compression Neural Rendering +1

