Search Results for author: Shenggan Cheng

Found 5 papers, 2 papers with code

AutoChunk: Automated Activation Chunk for Memory-Efficient Long Sequence Inference

no code implementations • 19 Jan 2024 • Xuanlei Zhao, Shenggan Cheng, Guangyang Lu, Jiarui Fang, Haotian Zhou, Bin Jia, Ziming Liu, Yang You

The experiments demonstrate that AutoChunk can reduce over 80% of activation memory while keeping the speed loss within 10%, extend the maximum sequence length by 3.2x to 11.7x, and outperform state-of-the-art methods by a large margin.

Code Generation
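The idea in the title can be illustrated with a minimal, hypothetical sketch (PyTorch; not the paper's actual code): a position-wise layer is applied to the sequence in slices so that only one slice's intermediate activation is alive at a time, and AutoChunk's contribution is finding such chunk plans automatically rather than by hand. The layer choice, shapes, and chunk size below are invented for the example.

    import torch

    def chunked_mlp(x, linear1, linear2, chunk_size=1024):
        # Apply the MLP to the sequence in slices so that only one slice's
        # expanded hidden activation exists at any point in time.
        pieces = []
        for start in range(0, x.shape[1], chunk_size):
            piece = x[:, start:start + chunk_size, :]
            pieces.append(linear2(torch.nn.functional.gelu(linear1(piece))))
        return torch.cat(pieces, dim=1)

    # Hypothetical shapes: a long sequence where the 4x-expanded hidden state
    # would dominate activation memory if computed for all positions at once.
    lin1 = torch.nn.Linear(256, 1024)
    lin2 = torch.nn.Linear(1024, 256)
    x = torch.randn(1, 8192, 256)
    y = chunked_mlp(x, lin1, lin2)  # same output as linear2(gelu(linear1(x)))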

DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers

1 code implementation • 15 Mar 2024 • Xuanlei Zhao, Shenggan Cheng, Zangwei Zheng, Zheming Yang, Ziming Liu, Yang You

Scaling large models with long sequences across applications like language generation, video generation and multimodal tasks requires efficient sequence parallelism.

Text Generation · Video Generation
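As a rough illustration of sequence parallelism in general (a single-process sketch under assumed shapes, not DSP's dynamic multi-dimensional scheme), the sequence axis can be sharded for position-wise work and re-assembled where a layer needs the whole sequence; in a real multi-device run the torch.cat below would be a collective communication.

    import torch

    world_size = 4                               # pretend number of devices
    x = torch.randn(1, 2048, 256)                # (batch, sequence, hidden)
    shards = list(x.chunk(world_size, dim=1))    # shard the sequence axis

    mlp = torch.nn.Linear(256, 256)
    shards = [mlp(s) for s in shards]            # position-wise work stays local

    # Attention needs the full sequence, so the shards are re-assembled here;
    # on real devices this concat would be an all-gather / all-to-all.
    full = torch.cat(shards, dim=1)
    attn = torch.nn.MultiheadAttention(256, num_heads=4, batch_first=True)
    out, _ = attn(full, full, full)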

FTL: A universal framework for training low-bit DNNs via Feature Transfer

no code implementations • ECCV 2020 • Kunyuan Du, Ya Zhang, Haibing Guan, Qi Tian, Shenggan Cheng, James Lin

Compared with low-bit models trained directly, the proposed framework brings 0.5% to 3.4% accuracy gains to three different quantization schemes.

Quantization · Transfer Learning
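One plausible reading of "feature transfer" is a distillation-style objective that pulls a low-bit model's intermediate features toward full-precision ones; the sketch below is a hypothetical illustration under that assumption. The quantizer, layer shapes, and loss are invented for the example, and real low-bit training would also need a straight-through estimator so gradients can pass the rounding step.

    import torch
    import torch.nn.functional as F

    def fake_quantize(w, bits=2):
        # Toy uniform quantizer, invented for this sketch; real low-bit
        # schemes differ and are trained with a straight-through estimator.
        qmax = 2 ** (bits - 1) - 1
        scale = w.abs().max() / (qmax + 1e-8)
        return torch.round(w / scale).clamp(-qmax - 1, qmax) * scale

    teacher = torch.nn.Linear(128, 64)   # full-precision feature extractor
    student = torch.nn.Linear(128, 64)   # its weights get quantized below

    x = torch.randn(32, 128)
    with torch.no_grad():
        t_feat = teacher(x)              # full-precision reference features

    s_feat = F.linear(x, fake_quantize(student.weight), student.bias)

    # Feature-transfer term: align the low-bit model's features with the
    # full-precision ones, added alongside the usual task loss in training.
    transfer_loss = F.mse_loss(s_feat, t_feat)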
