LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention

zrrskywalker/llama-adapter 28 Mar 2023

We present LLaMA-Adapter, a lightweight adaptation method to efficiently fine-tune LLaMA into an instruction-following model.

Instruction Following, Language Modelling, +2

1,326
9.56 stars / hour

ChatDoctor: A Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge

kent0n-li/chatdoctor 24 Mar 2023

Recent large language models (LLMs) in the general domain, such as ChatGPT, have shown remarkable success in following instructions and producing human-like responses.

Medical Diagnosis

1,491
6.32 stars / hour

Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators

picsart-ai-research/text2video-zero 23 Mar 2023

Recent text-to-video generation approaches rely on computationally heavy training and require large-scale video datasets.

Image Generation, Text-to-Video Generation, +3

2,017
6.14 stars / hour

Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation

showlab/Tune-A-Video 22 Dec 2022

To replicate the success of text-to-image (T2I) generation, recent works employ large-scale video datasets to train a text-to-video (T2V) generator.

Style Transfer, Text-to-Video Generation, +1

2,444
4.66 stars / hour

PAniC-3D: Stylized Single-view 3D Reconstruction from Portraits of Anime Characters

shuhongchen/panic3d-anime-reconstruction 25 Mar 2023

We propose PAniC-3D, a system to reconstruct stylized 3D character heads directly from illustrated (p)ortraits of (ani)me (c)haracters.

3D Reconstruction, Single-View 3D Reconstruction

294
3.26 stars / hour

P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks

huggingface/peft 14 Oct 2021

Prompt tuning, which only tunes continuous prompts with a frozen language model, substantially reduces per-task storage and memory usage at training.

Language Modelling

2,849
3.10 stars / hour

Exploring the Impact of Instruction Data Scaling on Large Language Models: An Empirical Study on Real-World Use Cases

lianjiatech/belle 26 Mar 2023

However, current research rarely studies the impact of different amounts of instruction data on model performance, especially in real-world use cases.

2,794
2.74 stars / hour

Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models

hpcaitech/colossalai 6 Feb 2023

To address these challenges, we introduce a system that can jointly optimize distributed execution and gradient checkpointing plans.

Scheduling

25,490
2.18 stars / hour

Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior

junshutang/Make-It-3D 24 Mar 2023

In this work, we investigate the problem of creating high-fidelity 3D content from only a single image.

Text to 3D

385
2.17 stars / hour

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

BlinkDL/RWKV-LM 18 Nov 2022

We propose SmoothQuant, a training-free, accuracy-preserving, and general-purpose post-training quantization (PTQ) solution to enable 8-bit weight, 8-bit activation (W8A8) quantization for LLMs.

Quantization

4,611
1.92 stars / hour