SHERF: Generalizable Human NeRF from a Single Image

To this end, we propose a bank of 3D-aware hierarchical features, including global, point-level, and pixel-aligned features, to facilitate informative encoding.

3D Human Reconstruction

NeRF-LOAM: Neural Implicit Representation for Large-Scale Incremental LiDAR Odometry and Mapping

To bridge this gap, we propose NeRF-LOAM, a novel NeRF-based LiDAR odometry and mapping approach consisting of three modules: neural odometry, neural mapping, and mesh reconstruction.

SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

We present SadTalker, which generates 3D motion coefficients (head pose, expression) of the 3DMM from audio and implicitly modulates a novel 3D-aware face render for talking head generation.

Talking Head Generation

LoRA: Low-Rank Adaptation of Large Language Models

We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks.

Language Modelling
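The LoRA summary above describes freezing the pre-trained weights and training only a low-rank update. The following is a minimal pure-Python sketch of that idea for a single linear layer, not the authors' implementation. The class name `LoRALinear`, the helper `matmul`, and the `alpha` and `seed` parameters are hypothetical names chosen for illustration. The sketch assumes the effective weight is W + (alpha / r) * B A, with B initialized to zero so the layer initially matches the frozen model.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Linear layer with a frozen weight W and a trainable low-rank update B @ A.

    Hypothetical illustration: forward(x) = (W + (alpha / rank) * B @ A) x.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        self.weight = weight                      # frozen, shape (d_out, d_in)
        d_out, d_in = len(weight), len(weight[0])
        self.rank = rank
        self.scale = alpha / rank
        # A gets a small random init; B starts at zero, so B @ A is zero at first.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def delta(self):
        """Return the scaled low-rank update, shape (d_out, d_in)."""
        return [[self.scale * v for v in row] for row in matmul(self.B, self.A)]

    def forward(self, x):
        """Apply the adapted layer to one input vector x of length d_in."""
        d = self.delta()
        return [
            sum((w + dw) * xi for w, dw, xi in zip(w_row, d_row, x))
            for w_row, d_row in zip(self.weight, d)
        ]

    def trainable_params(self):
        """Count entries of A and B, the only parameters that would be trained."""
        d_out, d_in = len(self.weight), len(self.weight[0])
        return self.rank * (d_out + d_in)
```

In this sketch the trainable parameter count is r * (d_out + d_in) rather than d_out * d_in, which is where the reduction comes from when r is much smaller than both dimensions.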

Learning Context-aware Classifier for Semantic Segmentation

Semantic segmentation remains challenging because it must parse diverse contexts across different scenes, so a fixed classifier may not adequately handle the varying feature distributions encountered at test time.

Semantic Segmentation

SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot

We show for the first time that large-scale generative pretrained transformer (GPT) family models can be pruned to at least 50% sparsity in one-shot, without any retraining, at minimal loss of accuracy.

Ranked #1 on Language Modelling on WikiText-2 (using extra training data)

Common Sense Reasoning, Language Modelling (+2 more)

Neural Preset for Color Style Transfer

In this paper, we present a Neural Preset technique to address the limitations of existing color style transfer methods, including visual artifacts, vast memory requirement, and slow style switching speed.

Image Dehazing, Image Harmonization (+2 more)

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing

We also have a better zero-shot shape-aware editing ability based on the text-to-video model.

Video Editing

Planning-oriented Autonomous Driving

Oriented at this, we revisit the key components within perception and prediction, and prioritize the tasks such that all these tasks contribute to planning.

Autonomous Driving, Philosophy

Ablating Concepts in Text-to-Image Diffusion Models

To achieve this goal, we propose an efficient method of ablating concepts in the pretrained model, i.e., preventing the generation of a target concept.

