DPE: Disentanglement of Pose and Expression for General Video Portrait Editing

winfredy/sadtalker 16 Jan 2023

In this paper, we introduce a novel self-supervised disentanglement framework to decouple pose and expression without 3DMMs and paired data, which consists of a motion editing module, a pose generator, and an expression generator.

Disentanglement Talking Face Generation +1

Effectively Modeling Time Series with Simple Discrete State Spaces

hazyresearch/spacetime 16 Mar 2023

For expressivity, we propose a new SSM parameterization based on the companion matrix -- a canonical representation for discrete-time processes -- which enables SpaceTime's SSM layers to learn desirable autoregressive processes.

Time Series Classification
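
The companion matrix mentioned in the SpaceTime excerpt can be illustrated with a short sketch. This shows the textbook companion form of an autoregressive recurrence, not the paper's learned parameterization (which may lay out the matrix differently). The function names `companion_matrix` and `rollout` are hypothetical and chosen only for this example.

```python
import numpy as np


def companion_matrix(coeffs):
    """Companion matrix for x_t = coeffs[0]*x_{t-1} + ... + coeffs[p-1]*x_{t-p}.

    Assumption: textbook layout with the coefficients in the first row.
    """
    a = np.asarray(coeffs, dtype=float)
    p = a.shape[0]
    A = np.zeros((p, p))
    A[0, :] = a                  # first row holds the AR coefficients
    A[1:, :-1] = np.eye(p - 1)   # sub-diagonal shifts the history down by one
    return A


def rollout(coeffs, history, steps):
    """Generate `steps` future values; history[0] is the most recent value."""
    A = companion_matrix(coeffs)
    state = np.asarray(history, dtype=float)
    out = []
    for _ in range(steps):
        state = A @ state        # one step of the linear state-space recurrence
        out.append(state[0])     # the newest value sits in the first slot
    return np.array(out)


if __name__ == "__main__":
    # Fibonacci is the AR(2) process x_t = x_{t-1} + x_{t-2}.
    print(rollout([1.0, 1.0], [1.0, 1.0], 5))
```

Multiplying the state by the matrix once advances the process by one time step, which is why a single matrix of this shape can represent any fixed-order autoregressive process.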

Parameter is Not All You Need: Starting from Non-Parametric Networks for 3D Point Cloud Analysis

zrrskywalker/point-nn 14 Mar 2023

We present a Non-parametric Network for 3D point cloud analysis, Point-NN, which consists of purely non-learnable components: farthest point sampling (FPS), k-nearest neighbors (k-NN), and pooling operations, with trigonometric functions.

3D Point Cloud Classification Training-free 3D Part Segmentation +1
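
Farthest point sampling, one of the non-learnable components named in the Point-NN excerpt, can be sketched as follows. This is a minimal NumPy version of the standard greedy algorithm, not the repository's implementation. The function name `farthest_point_sampling` and the `start_index` parameter are assumptions made for this example.

```python
import numpy as np


def farthest_point_sampling(points, n_samples, start_index=0):
    """Greedily pick n_samples indices so each new point is far from those chosen.

    points: array of shape (N, D). Returns an integer array of shape (n_samples,).
    """
    pts = np.asarray(points, dtype=float)
    n = pts.shape[0]
    if not 1 <= n_samples <= n:
        raise ValueError("n_samples must be between 1 and the number of points")

    selected = np.empty(n_samples, dtype=int)
    selected[0] = start_index
    # Squared distance from every point to its nearest already-selected point.
    min_dist = np.full(n, np.inf)

    for i in range(1, n_samples):
        diff = pts - pts[selected[i - 1]]
        dist = np.einsum("ij,ij->i", diff, diff)   # squared Euclidean distances
        min_dist = np.minimum(min_dist, dist)
        selected[i] = int(np.argmax(min_dist))     # farthest from the current set
    return selected


if __name__ == "__main__":
    cloud = np.array([[0, 0, 0], [1, 0, 0], [2, 0, 0], [10, 0, 0]])
    print(farthest_point_sampling(cloud, 3))
```

The procedure has no trainable parameters, which is consistent with the excerpt's description of these components as non-learnable.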

Universal Instance Perception as Object Discovery and Retrieval

MasterBin-IIAU/UNINEXT 12 Mar 2023

All instance perception tasks aim at finding certain objects specified by some queries such as category names, language expressions, and target annotations, but this complete field has been split into multiple independent subtasks.

Multi-Object Tracking and Segmentation Multiple Object Tracking +12

Erasing Concepts from Diffusion Models

rohitgandikota/erasing 13 Mar 2023

We propose a fine-tuning method that can erase a visual concept from a pre-trained diffusion model, given only the name of the style and using negative guidance as a teacher.

Text-based Image Editing

ChatGPT Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions

vision-cair/chatcaptioner 12 Mar 2023

By continually acquiring new visual information from BLIP-2's answers, ChatCaptioner is able to generate more enriched image descriptions.

Image Captioning Question Answering

Eliciting Latent Predictions from Transformers with the Tuned Lens

alignmentresearch/tuned-lens 14 Mar 2023

We analyze transformers from the perspective of iterative inference, seeking to understand how model predictions are refined layer by layer.

Language Modelling

Zero-Shot Information Extraction via Chatting with ChatGPT

cocacola-lab/chatie 20 Feb 2023

Zero-shot information extraction (IE) aims to build IE systems from unannotated text.

Event Extraction named-entity-recognition +3

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

cloneofsimo/lora 25 Aug 2022

Once the subject is embedded in the output domain of the model, the unique identifier can be used to synthesize novel photorealistic images of the subject contextualized in different scenes.

Image Generation

Efficient Teacher: Semi-Supervised Object Detection for YOLOv5

AlibabaResearch/efficientteacher 15 Feb 2023

The Pseudo Label Assigner prevents the occurrence of bias caused by a large number of low-quality pseudo labels that may interfere with the Dense Detector during the student-teacher mutual learning mechanism, and the Epoch Adaptor utilizes domain and distribution adaptation to allow Dense Detector to learn globally distributed consistent features, making the training independent of the proportion of labeled data.

object-detection Object Detection +2

