VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation

modelscope/modelscope 15 Mar 2023

A diffusion probabilistic model (DPM), which constructs a forward diffusion process by gradually adding noise to data points and learns the reverse denoising process to generate new samples, has been shown to handle complex data distribution.

Denoising, Image Generation, +1

1,546
0.41 stars / hour

GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers

ist-daslab/gptq 31 Oct 2022

In this paper, we address this challenge, and propose GPTQ, a new one-shot weight quantization method based on approximate second-order information, that is both highly-accurate and highly-efficient.

Language Modelling, Model Compression, +1

317
0.40 stars / hour

Planning-oriented Autonomous Driving

opendrivelab/uniad 20 Dec 2022

Oriented at this, we revisit the key components within perception and prediction, and prioritize the tasks such that all these tasks contribute to planning.

Autonomous Driving, Philosophy

355
0.37 stars / hour

Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning

Pointcept/Pointcept 24 Mar 2023

As a pioneering work, PointContrast conducts unsupervised 3D representation learning via leveraging contrastive learning over raw RGB-D frames and proves its effectiveness on various downstream tasks.

Ranked #1 on Semantic Segmentation on ScanNet (using extra training data)

Contrastive Learning, Data Augmentation, +3

159
0.36 stars / hour

Ablating Concepts in Text-to-Image Diffusion Models

nupurkmr9/concept-ablation 23 Mar 2023

To achieve this goal, we propose an efficient method of ablating concepts in the pretrained model, i.e., preventing the generation of a target concept.

41
0.36 stars / hour

MaskSketch: Unpaired Structure-guided Masked Image Generation

lllyasviel/controlnet 10 Feb 2023

We show that intermediate self-attention maps of a masked generative transformer encode important structural information of the input image, such as scene layout and object shape, and we propose a novel sampling method based on this observation to enable structure-guided generation.

Conditional Image Generation, Image-to-Image Translation, +2

15,396
0.36 stars / hour

CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis

salesforce/CodeGen 25 Mar 2022

To democratize this, we train and release a family of large language models up to 16.1B parameters, called CODEGEN, on natural language and programming language data, and open source the training library JAXFORMER.

Code Generation, Language Modelling, +1

2,738
0.35 stars / hour

Bringing Inputs to Shared Domains for 3D Interacting Hands Recovery in the Wild

facebookresearch/interwild 23 Mar 2023

Hence, interacting hands of MoCap datasets are brought to the 2D scale space of single hands of ITW datasets.

29
0.35 stars / hour

High Fidelity Image Synthesis With Deep VAEs In Latent Space

ericl122333/latent-vae 23 Mar 2023

With this method, the VAE avoids modeling the fine-grained details that constitute the majority of the image's code length, allowing it to focus on learning its structural components.

Image Generation

14
0.35 stars / hour

Robust Speech Recognition via Large-Scale Weak Supervision

openai/whisper Preprint 2022

We study the capabilities of speech processing systems trained simply to predict large amounts of transcripts of audio on the internet.

Robust Speech Recognition, speech-recognition

30,124
0.34 stars / hour