SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

BlinkDL/RWKV-LM 18 Nov 2022

We propose SmoothQuant, a training-free, accuracy-preserving, and general-purpose post-training quantization (PTQ) solution to enable 8-bit weight, 8-bit activation (W8A8) quantization for LLMs.
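
As a rough illustration of what "W8A8" means in the excerpt above, the sketch below quantizes both weights and activations to signed 8-bit integers with one symmetric scale per tensor, accumulates in integers, and rescales the result. It is not SmoothQuant's method, only the plain quantization scheme that the name refers to. The function names `quantize_int8` and `w8a8_dot` are hypothetical, and the per-tensor symmetric scaling is an assumption made for this example.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization (assumed scheme for this sketch).

    Returns integer codes in [-127, 127] and the float scale that maps
    a code back to an approximate real value (value ~= code * scale).
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def w8a8_dot(weights, activations):
    """Dot product with 8-bit weights and 8-bit activations."""
    q_w, s_w = quantize_int8(weights)
    q_a, s_a = quantize_int8(activations)
    # Integer multiply-accumulate, as an int8 kernel would do it.
    acc = sum(w * a for w, a in zip(q_w, q_a))
    # One floating-point rescale recovers an approximate real-valued result.
    return acc * s_w * s_a


weights = [0.5, -1.0, 0.25]
activations = [2.0, 4.0, -1.0]
exact = sum(w * a for w, a in zip(weights, activations))
approx = w8a8_dot(weights, activations)
print(exact, approx)
```

The gap between `exact` and `approx` is the quantization error. With a single scale per tensor, one large value stretches the scale and costs precision for all the smaller values.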


Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators

picsart-ai-research/text2video-zero 23 Mar 2023

Recent text-to-video generation approaches rely on computationally heavy training and require large-scale video datasets.

Image Generation, Text-to-Video Generation

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation

showlab/Tune-A-Video 22 Dec 2022

To replicate the success of text-to-image (T2I) generation, recent works employ large-scale video datasets to train a text-to-video (T2V) generator.

Style Transfer, Text-to-Video Generation

LoRA: Low-Rank Adaptation of Large Language Models

microsoft/LoRA ICLR 2022

We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks.

Language Modelling
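
The LoRA excerpt above describes a frozen weight matrix plus trainable rank decomposition matrices. A minimal sketch of one such layer follows, in pure Python, assuming the update is the product of two small matrices `B` and `A` scaled by `alpha / rank`. The class name `LoRALinear`, the initialization (small random `A`, zero `B`), and the scaling factor are assumptions of this example, not details from the listing.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen weight W (d_out x d_in) plus a low-rank update B @ A."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight  # frozen: never updated during adaptation
        # Trainable factors. B starts at zero, so the layer initially
        # behaves exactly like the frozen one (assumed initialization).
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.weight, x)               # W x
        update = matvec(self.B, matvec(self.A, x))  # B (A x)
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_parameter_count(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

    def full_parameter_count(self):
        return len(self.weight) * len(self.weight[0])


W = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]
layer = LoRALinear(W, rank=1)
print(layer.forward([1.0, 2.0, 3.0]))
print(layer.trainable_parameter_count(), layer.full_parameter_count())
```

For a `d_out x d_in` weight, the trainable count is `rank * (d_in + d_out)`, which is far smaller than `d_in * d_out` when the rank is small relative to the layer dimensions.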

Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior

junshutang/Make-It-3D 24 Mar 2023

In this work, we investigate the problem of creating high-fidelity 3D content from only a single image.

Text to 3D

Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models

lukashoel/text2room 21 Mar 2023

We present Text2Room, a method for generating room-scale textured 3D meshes from a given text prompt as input.

Monocular Depth Estimation

More than you've asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models

greshake/lm-safety 23 Feb 2023

In such attacks, an adversary can prompt the LLM to produce malicious content or override the original instructions and the employed filtering schemes.

Instruction Following, Retrieval

ReVersion: Diffusion-Based Relation Inversion from Images

ziqihuangg/reversion 23 Mar 2023

Specifically, we propose a novel relation-steering contrastive learning scheme to impose two critical properties of the relation prompt: 1) The relation prompt should capture the interaction between objects, enforced by the preposition prior.

Contrastive Learning

ADAPT: Action-aware Driving Caption Transformer

jxbbb/adapt 1 Feb 2023

To bridge the gap, we propose an end-to-end transformer-based architecture, ADAPT (Action-aware Driving cAPtion Transformer), which provides user-friendly natural language narrations and reasoning for each decision-making step of autonomous vehicular control and action.

Autonomous Driving, Decision Making

