Planning-oriented Autonomous Driving

opendrivelab/uniad 20 Dec 2022

Oriented at this, we revisit the key components within perception and prediction, and prioritize the tasks such that all these tasks contribute to planning.

Autonomous Driving, Philosophy

DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing

microsoft/DeBERTa 18 Nov 2021

We thus propose a new gradient-disentangled embedding sharing method that avoids the tug-of-war dynamics, improving both training efficiency and the quality of the pre-trained model.

Natural Language Inference, Natural Language Understanding, +2

Generative Semantic Segmentation

fudan-zvg/gss 20 Mar 2023

To that end, the segmentation mask is expressed as a special type of image (dubbed maskige).

Semantic Segmentation

Deep Learning for Camera Calibration and Beyond: A Survey

kangliao929/awesome-deep-camera-calibration 19 Mar 2023

In this paper, we provide a comprehensive survey of learning-based camera calibration techniques, by analyzing their strengths and limitations.

Camera Calibration

GPT Understands, Too

THUDM/GLM 18 Mar 2021

On the SuperGLUE benchmark, GPTs achieve performance comparable to, and sometimes better than, similar-sized BERTs in supervised learning.

Knowledge Probing, Natural Language Understanding, +1

LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale

timdettmers/bitsandbytes 15 Aug 2022

We develop a procedure for Int8 matrix multiplication for feed-forward and attention projection layers in transformers, which cuts the memory needed for inference by half while retaining full-precision performance.

Language Modelling, Linguistic Acceptability, +4
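
As a rough illustration of the Int8 matrix multiplication idea in the entry above, the following is a minimal NumPy sketch of absmax quantization with per-row and per-column scales. It is not the bitsandbytes implementation, and it omits the separate higher-precision handling of outlier features that the paper describes. The function names `quantize_rows_absmax` and `int8_matmul` are hypothetical, chosen only for this example.

```python
import numpy as np

def quantize_rows_absmax(x):
    """Quantize each row of x to int8 using that row's own absmax scale."""
    absmax = np.max(np.abs(x), axis=1, keepdims=True)
    # Guard against all-zero rows so the division below is well defined.
    scale = np.where(absmax == 0, 1.0, absmax / 127.0)
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_matmul(a, b):
    """Approximate a @ b with int8 operands and int32 accumulation."""
    qa, sa = quantize_rows_absmax(a)       # one scale per row of a
    qb_t, sb = quantize_rows_absmax(b.T)   # one scale per column of b
    acc = qa.astype(np.int32) @ qb_t.T.astype(np.int32)
    # Undo both scalings: entry (i, j) is multiplied by sa[i] * sb[j].
    return acc.astype(np.float32) * sa * sb.T

rng = np.random.default_rng(0)
a = rng.standard_normal((4, 64)).astype(np.float32)   # toy activations
w = rng.standard_normal((64, 8)).astype(np.float32)   # toy weight matrix

approx = int8_matmul(a, w)
exact = a @ w
rel_err = float(np.abs(approx - exact).max() / np.abs(exact).max())

# Storage comparison: int8 weights versus the same weights in 16-bit floats.
qw, _ = quantize_rows_absmax(w.T)
bytes_int8 = qw.nbytes
bytes_fp16 = w.astype(np.float16).nbytes
```

In this toy setting the int8 weight array takes half the bytes of its 16-bit counterpart (ignoring the small per-column scale vector), which is the kind of saving the abstract refers to. The approximation error depends on the data and should be measured rather than assumed.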

Spherical Transformer for LiDAR-based 3D Recognition

dvlab-research/sphereformer 22 Mar 2023

In this work, we study the varying-sparsity distribution of LiDAR points and present SphereFormer to directly aggregate information from dense close points to the sparse distant ones.

3D Object Detection, 3D Semantic Segmentation, +2

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

cloneofsimo/lora 25 Aug 2022

Once the subject is embedded in the output domain of the model, the unique identifier can be used to synthesize novel photorealistic images of the subject contextualized in different scenes.

Image Generation

TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion

winddori2002/TriAAN-VC 16 Mar 2023

Existing methods do not simultaneously preserve the source content and capture the target speaker's characteristics, so their conversion outputs suffer from a trade-off between the two.

Voice Conversion

On the De-duplication of LAION-2B

ryanwebster90/snip-dedup 17 Mar 2023

Generative models, such as DALL-E, Midjourney, and Stable Diffusion, have societal implications that extend beyond the field of computer science.

