Search Results for author: Shang Yang

Found 3 papers, 2 with code

TorchSparse++: Efficient Training and Inference Framework for Sparse Convolution on GPUs

1 code implementation · 25 Oct 2023 · Haotian Tang, Shang Yang, Zhijian Liu, Ke Hong, Zhongming Yu, Xiuyu Li, Guohao Dai, Yu Wang, Song Han

On top of this, we design the Sparse Autotuner, which extends the design space of existing sparse convolution libraries and searches for the best dataflow configurations for training and inference workloads.
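The excerpt describes an autotuner that searches a design space of dataflow configurations and selects the best one per workload. A minimal sketch of that general search pattern, assuming a hypothetical dictionary of candidate kernel variants (the names `kernel_variants`, `autotune`, and `workload` are illustrative, not TorchSparse++'s actual API):

```python
import time

def autotune(kernel_variants, workload):
    """Time each candidate configuration on the given workload and keep
    the fastest, mimicking an autotuner's search over a design space."""
    best_cfg, best_time = None, float("inf")
    for cfg, run in kernel_variants.items():
        start = time.perf_counter()
        run(workload)  # execute this dataflow variant once
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_cfg, best_time = cfg, elapsed
    return best_cfg
```

A real autotuner would average repeated timed runs and cache the chosen configuration per workload shape; this sketch only shows the select-the-fastest loop.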

Autonomous Driving · Recommendation Systems

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

2 code implementations · 1 Jun 2023 · Ji Lin, Jiaming Tang, Haotian Tang, Shang Yang, Xingyu Dang, Chuang Gan, Song Han

Large language models (LLMs) have shown excellent performance on various tasks, but the astronomical model size raises the hardware barrier for serving (memory size) and slows down token generation (memory bandwidth).
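AWQ's core idea is that a small fraction of weight channels matter most, and those can be protected by scaling them up (based on activation magnitude) before low-bit round-to-nearest quantization, then folding the inverse scale back in. A toy sketch of that activation-aware scaling, assuming per-channel activation means are available (function and variable names here are illustrative, not the released AWQ implementation):

```python
def quantize(w, n_bits=4):
    """Symmetric round-to-nearest quantization of a list of floats,
    returned in dequantized (float) form."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = max(abs(x) for x in w) / qmax or 1.0
    return [round(x / scale) * scale for x in w]

def awq_like(weights, act_means, alpha=0.5):
    """Scale each weight channel by (activation magnitude ** alpha)
    before quantizing, then divide the scale back out, so channels
    with large activations lose less precision."""
    s = [m ** alpha if m > 0 else 1.0 for m in act_means]
    scaled = [w * si for w, si in zip(weights, s)]
    dequantized = quantize(scaled)
    return [d / si for d, si in zip(dequantized, s)]
```

For example, a tiny weight paired with a large activation is rounded to zero by plain 4-bit quantization but survives the activation-aware scaling, which is the salient-channel protection the paper's title refers to.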

Common Sense Reasoning · Language Modelling · +1
