SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

winfredy/sadtalker 22 Nov 2022

We present SadTalker, which generates 3D motion coefficients (head pose, expression) of the 3DMM from audio and implicitly modulates a novel 3D-aware face render for talking head generation.

Talking Head Generation

536
1.24 stars / hour

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing

chenyangqiqi/fatezero 16 Mar 2023

We also have a better zero-shot shape-aware editing ability based on the text-to-video model.

Video Editing

348
1.23 stars / hour

Deep symbolic regression for physics guided by units constraints: toward the automated discovery of physical laws

wassimtenachi/physo 6 Mar 2023

Here we present $\Phi$-SO, a Physical Symbolic Optimization framework for recovering analytical symbolic expressions from physics data using deep reinforcement learning techniques by learning units constraints.

Symbolic Regression

1,195
1.20 stars / hour

GLM-130B: An Open Bilingual Pre-trained Model

thudm/glm-130b 5 Oct 2022

We introduce GLM-130B, a bilingual (English and Chinese) pre-trained language model with 130 billion parameters.

Language Modelling Multi-task Language Understanding +1

2,755
1.11 stars / hour

GPT Understands, Too

THUDM/GLM 18 Mar 2021

On the SuperGLUE benchmark, GPTs achieve performance comparable to, and sometimes better than, that of similar-sized BERTs in supervised learning.

Knowledge Probing Natural Language Understanding +1

986
0.95 stars / hour

GPT-4 Technical Report

openai/evals Preprint 2023

We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs.

6,246
0.94 stars / hour

Deep Learning for Camera Calibration and Beyond: A Survey

kangliao929/awesome-deep-camera-calibration 19 Mar 2023

In this paper, we provide a comprehensive survey of learning-based camera calibration techniques, by analyzing their strengths and limitations.

Camera Calibration

77
0.90 stars / hour

TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion

winddori2002/TriAAN-VC 16 Mar 2023

The existing methods do not simultaneously satisfy the above two aspects of VC, and their conversion outputs suffer from a trade-off problem between maintaining source contents and target characteristics.

Voice Conversion

23
0.87 stars / hour

CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation

KU-CVLAB/CAT-Seg 21 Mar 2023

However, the problem of transferring these capabilities learned from image-level supervision to the pixel-level task of segmentation and addressing arbitrary unseen categories at inference makes this task challenging.

Image Segmentation Open Vocabulary Semantic Segmentation +2

33
0.85 stars / hour

Generative Semantic Segmentation

fudan-zvg/gss 20 Mar 2023

To that end, the segmentation mask is expressed as a special type of image (dubbed maskige).

Semantic Segmentation

53
0.83 stars / hour