SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

winfredy/sadtalker 22 Nov 2022

We present SadTalker, which generates 3D motion coefficients (head pose, expression) of the 3DMM from audio and implicitly modulates a novel 3D-aware face renderer for talking head generation.

Talking Head Generation

1.39 stars / hour

Learning Context-aware Classifier for Semantic Segmentation

Pointcept/Pointcept 21 Mar 2023

Semantic segmentation remains challenging because scenes contain diverse contexts, so a fixed classifier may not handle the varying feature distributions encountered during testing.

Semantic Segmentation

1.34 stars / hour

ReVersion: Diffusion-Based Relation Inversion from Images

ziqihuangg/reversion 23 Mar 2023

Specifically, we propose a novel relation-steering contrastive learning scheme to impose two critical properties of the relation prompt: 1) The relation prompt should capture the interaction between objects, enforced by the preposition prior.

Contrastive Learning

1.33 stars / hour

Deep symbolic regression for physics guided by units constraints: toward the automated discovery of physical laws

wassimtenachi/physo 6 Mar 2023

Here we present $\Phi$-SO, a Physical Symbolic Optimization framework for recovering analytical symbolic expressions from physics data using deep reinforcement learning techniques by learning units constraints.

Symbolic Regression

1.05 stars / hour

SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot

ist-daslab/sparsegpt 2 Jan 2023

We show for the first time that large-scale generative pretrained transformer (GPT) family models can be pruned to at least 50% sparsity in one shot, without any retraining, at minimal loss of accuracy.

Ranked #1 on Language Modelling on WikiText-2 (using extra training data)

Common Sense Reasoning Language Modelling +2

1.04 stars / hour
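
To make the "50% sparsity" figure concrete, the sketch below zeroes half of a small weight matrix in a single pass with no retraining. It uses plain magnitude pruning, which is a simpler stand-in and not the SparseGPT procedure itself; the function names `magnitude_prune` and `sparsity_of` are hypothetical and chosen for illustration only.

```python
def magnitude_prune(weights, sparsity=0.5):
    """Zero the smallest-magnitude entries of a 2D weight matrix in one pass.

    Assumption: plain magnitude pruning, used only to illustrate what a
    sparsity level means. It is not the SparseGPT algorithm.
    """
    # Collect (magnitude, row, column) for every entry, smallest first.
    cells = sorted(
        (abs(w), i, j)
        for i, row in enumerate(weights)
        for j, w in enumerate(row)
    )
    n_prune = int(len(cells) * sparsity)

    pruned = [list(row) for row in weights]  # copy; the input is left untouched
    for _, i, j in cells[:n_prune]:
        pruned[i][j] = 0.0
    return pruned


def sparsity_of(weights):
    """Fraction of entries that are exactly zero."""
    total = sum(len(row) for row in weights)
    zeros = sum(1 for row in weights for w in row if w == 0.0)
    return zeros / total


weights = [[0.9, -0.1], [0.05, -0.7]]
pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned, sparsity_of(pruned))
```

Here the two smallest-magnitude weights are removed, so half of the entries end up at zero while the largest ones are kept unchanged.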

CLIP goes 3D: Leveraging Prompt Tuning for Language Grounded 3D Recognition

deeptibhegde/clip-goes-3d 20 Mar 2023

Attempting to train the visual and text encoder to account for this shift results in catastrophic forgetting and a notable decrease in performance.

Retrieval Scene Understanding

0.99 stars / hour

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing

chenyangqiqi/fatezero 16 Mar 2023

We also have a better zero-shot shape-aware editing ability based on the text-to-video model.

Video Editing

0.98 stars / hour

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

BlinkDL/RWKV-LM 18 Nov 2022

We propose SmoothQuant, a training-free, accuracy-preserving, and general-purpose post-training quantization (PTQ) solution to enable 8-bit weight, 8-bit activation (W8A8) quantization for LLMs.

0.93 stars / hour
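
As a rough illustration of what W8A8 means, the sketch below quantizes both a weight vector and an activation vector to signed 8-bit integers, accumulates the dot product in integers, and rescales the result. It assumes simple symmetric absmax quantization per vector and does not include SmoothQuant's own smoothing step; `quantize_int8` and `int8_dot` are hypothetical names.

```python
def quantize_int8(values):
    """Symmetric absmax quantization of a list of floats to the range [-127, 127].

    Assumption: one scale per vector. This is a generic scheme, not the
    SmoothQuant method.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def int8_dot(weights, activations):
    """Dot product with 8-bit weights and 8-bit activations (W8A8)."""
    q_w, scale_w = quantize_int8(weights)
    q_a, scale_a = quantize_int8(activations)
    # Accumulate in integers, then rescale once back to floating point.
    accumulator = sum(w * a for w, a in zip(q_w, q_a))
    return accumulator * scale_w * scale_a


weights = [0.5, -1.0, 0.25]
activations = [2.0, 1.0, -4.0]
exact = sum(w * a for w, a in zip(weights, activations))
print(exact, int8_dot(weights, activations))
```

The quantized result should land close to the exact floating-point value; the gap is the rounding error that post-training quantization methods try to keep small.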

GPT-4 Technical Report

openai/evals Preprint 2023

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs.

0.90 stars / hour

LoRA: Low-Rank Adaptation of Large Language Models

microsoft/LoRA ICLR 2022

We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks.

Language Modelling

0.86 stars / hour
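
The LoRA description above can be sketched as a single linear layer whose original weight stays frozen while a low-rank product `B @ A` is added to its output. This is a minimal pure-Python illustration, not the microsoft/LoRA implementation: the class name `LoRALinear`, the zero initialization of `B`, the small Gaussian initialization of `A`, and the `alpha / rank` scaling are assumptions not stated in the listing.

```python
import random


def matvec(matrix, vector):
    """Multiply a 2D list (rows x cols) by a vector of length cols."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update B @ A (illustrative only)."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight          # frozen: never updated during adaptation
        self.rank = rank
        self.scale = alpha / rank     # assumed scaling factor
        # Trainable factors: A is (rank x d_in), B is (d_out x rank).
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]  # zero init: no change at start

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        return self.rank * (len(self.weight) + len(self.weight[0]))

    def frozen_params(self):
        return len(self.weight) * len(self.weight[0])


weight = [[float(i == j) for j in range(8)] for i in range(8)]  # 8x8 identity
layer = LoRALinear(weight, rank=2)
print(layer.trainable_params(), "trainable vs", layer.frozen_params(), "frozen")
```

With an 8x8 weight and rank 2, only the two small factors would be trained, which is the source of the parameter reduction the abstract describes; the saving grows as the layer dimensions increase relative to the rank.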