Search Results for author: Changsheng Zhao

Found 15 papers, 6 papers with code

Scaling Parameter-Constrained Language Models with Quality Data

no code implementations • 4 Oct 2024 • Ernie Chang, Matteo Paltenghi, Yang Li, Pin-Jie Lin, Changsheng Zhao, Patrick Huber, Zechun Liu, Rastislav Rabatin, Yangyang Shi, Vikas Chandra

Scaling laws in language modeling traditionally quantify training loss as a function of dataset size and model parameters, providing compute-optimal estimates but often neglecting the impact of data quality on model generalization.

Diversity · Language Modeling · +1
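For context, the classic form of such scaling laws predicts training loss from the parameter count N and token count D. Below is a minimal, illustrative Python sketch; the functional form and coefficients follow the widely cited Chinchilla fits and are placeholders here, not this paper's quality-aware formulation.

```python
# Chinchilla-style scaling law: loss as a function of parameters N and tokens D.
# Coefficients are the published Chinchilla fits, used purely as placeholders.
def scaling_loss(n_params: float, n_tokens: float,
                 E: float = 1.69, A: float = 406.4, B: float = 410.7,
                 alpha: float = 0.34, beta: float = 0.28) -> float:
    """Predicted loss L(N, D) = E + A / N^alpha + B / D^beta."""
    return E + A / n_params**alpha + B / n_tokens**beta

# Example: a 1B-parameter model trained on 20B vs. 100B tokens.
print(scaling_loss(1e9, 20e9))   # more data...
print(scaling_loss(1e9, 100e9))  # ...lowers the predicted loss
```

The abstract's point is that D alone is a crude proxy: two corpora of equal token count can yield very different generalization depending on their quality.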

Target-Aware Language Modeling via Granular Data Sampling

no code implementations • 23 Sep 2024 • Ernie Chang, Pin-Jie Lin, Yang Li, Changsheng Zhao, Daeil Kim, Rastislav Rabatin, Zechun Liu, Yangyang Shi, Vikas Chandra

A cost-effective and straightforward approach is sampling with low-dimensional data features, which allows selecting large-scale pretraining data for domain-specific use cases.

Language Modeling · Language Modelling · +2
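As an illustration of the general idea (not the paper's exact pipeline), the hypothetical sketch below scores candidate documents by the similarity of hashed character n-gram features to a target-domain centroid; the function names and the feature dimension are assumptions.

```python
# Hypothetical sketch of target-aware data selection with low-dimensional
# n-gram features (assumed design, not the paper's exact method).
import hashlib
import numpy as np

DIM = 256  # size of the hashed feature space (an assumption)

def featurize(text: str, n: int = 3) -> np.ndarray:
    """Hash character n-grams into a fixed-size count vector, then normalize."""
    vec = np.zeros(DIM)
    for i in range(len(text) - n + 1):
        h = int(hashlib.md5(text[i:i + n].encode()).hexdigest(), 16)
        vec[h % DIM] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def select(candidates: list[str], target_docs: list[str], k: int) -> list[str]:
    """Keep the k candidates whose features are closest to the target centroid."""
    centroid = np.mean([featurize(d) for d in target_docs], axis=0)
    scores = [float(featurize(c) @ centroid) for c in candidates]
    top = np.argsort(scores)[::-1][:k]
    return [candidates[i] for i in top]
```

The appeal of such features is cost: scoring with hashed n-grams is orders of magnitude cheaper than scoring with a neural model, which matters at pretraining scale.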

BUPTCMCC-6G-CMG+: A GBSM-Based ISAC Standard Channel Model Generator

no code implementations • 22 Sep 2024 • Changsheng Zhao, Jianhua Zhang, Yuxiang Zhang, Lei Tian, Heng Wang, Hanyuan Jiang, Yameng Liu, Wenjun Chen, Tao Jiang, Guangyi Liu

Integrated sensing and communication (ISAC) has been recognized as the key technology in the vision of the sixth generation (6G) era.

SpinQuant: LLM quantization with learned rotations

3 code implementations • 26 May 2024 • Zechun Liu, Changsheng Zhao, Igor Fedorov, Bilge Soran, Dhruv Choudhary, Raghuraman Krishnamoorthi, Vikas Chandra, Yuandong Tian, Tijmen Blankevoort

With 4-bit quantization of weight, activation, and KV-cache, SpinQuant narrows the accuracy gap on zero-shot reasoning tasks with full precision to merely 2.9 points on the LLaMA-2 7B model, surpassing LLM-QAT by 19.1 points and SmoothQuant by 25.0 points.

Quantization
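The core trick behind rotation-based quantization is that an orthogonal rotation R leaves a matrix product unchanged, (WR)(Rᵀx) = Wx, while spreading activation outliers across dimensions so the tensors quantize with less error. A minimal sketch, using a random orthogonal matrix where SpinQuant would learn one:

```python
# Sketch of rotation-before-quantization. A random orthogonal matrix stands
# in for SpinQuant's learned rotation; the quantizer is a toy per-tensor one.
import numpy as np

def random_rotation(dim: int, seed: int = 0) -> np.ndarray:
    """Orthogonal matrix via QR decomposition of a random Gaussian matrix."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q

def quantize_4bit(x: np.ndarray) -> np.ndarray:
    """Symmetric per-tensor 4-bit fake quantization (quantize + dequantize)."""
    scale = np.abs(x).max() / 7.0  # symmetric int4 range: [-7, 7]
    return np.clip(np.round(x / scale), -7, 7) * scale

dim = 64
rng = np.random.default_rng(1)
w = rng.standard_normal((dim, dim))
x = rng.standard_normal(dim)
x[3] = 20.0  # an activation outlier that dominates the quantization scale

r = random_rotation(dim)
plain   = quantize_4bit(w) @ quantize_4bit(x)
rotated = quantize_4bit(w @ r) @ quantize_4bit(r.T @ x)  # same product exactly
ref = w @ x
print(np.abs(plain - ref).mean(), np.abs(rotated - ref).mean())
```

Because R is folded into the weights offline, the rotated network computes the same function at full precision; only its quantization behavior changes.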

Basis Selection: Low-Rank Decomposition of Pretrained Large Language Models for Target Applications

no code implementations • 24 May 2024 • Yang Li, Changsheng Zhao, Hyungtak Lee, Ernie Chang, Yangyang Shi, Vikas Chandra

Large language models (LLMs) significantly enhance the performance of various applications, but they are computationally intensive and energy-demanding.

Code Generation · Low-rank compression · +1
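Low-rank decomposition in this setting typically means factoring a pretrained weight matrix into two thin matrices. A minimal sketch via truncated SVD; the paper's basis selection for specific target applications is not modeled here.

```python
# Truncated-SVD factorization of a weight matrix; a standard technique,
# shown only to illustrate the parameter savings.
import numpy as np

def low_rank_factor(w: np.ndarray, rank: int):
    """Factor w (out x in) into a (out x rank) @ b (rank x in)."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # absorb singular values into the left factor
    b = vt[:rank, :]
    return a, b

w = np.random.default_rng(0).standard_normal((1024, 1024))
a, b = low_rank_factor(w, rank=64)
# Parameter count drops from 1024*1024 to 2*1024*64; a @ b approximates w.
print(w.size, a.size + b.size)
```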

On The Open Prompt Challenge In Conditional Audio Generation

no code implementations • 1 Nov 2023 • Ernie Chang, Sidd Srinivasan, Mahi Luthra, Pin-Jie Lin, Varun Nagaraja, Forrest Iandola, Zechun Liu, Zhaoheng Ni, Changsheng Zhao, Yangyang Shi, Vikas Chandra

Text-to-audio generation (TTA) produces audio from a text description, learning from pairs of audio samples and hand-annotated text.

Audio Generation

Revisiting Sample Size Determination in Natural Language Understanding

1 code implementation • 1 Jul 2023 • Ernie Chang, Muhammad Hassan Rashid, Pin-Jie Lin, Changsheng Zhao, Vera Demberg, Yangyang Shi, Vikas Chandra

Knowing exactly how many data points need to be labeled to achieve a certain model performance is a hugely beneficial step towards reducing the overall annotation budget.

Active Learning · Natural Language Understanding
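A common way to frame sample size determination, which this paper revisits, is to fit a learning curve on small pilot sets and extrapolate. The sketch below fits a power law and inverts it for a target accuracy; the pilot numbers and the functional form are illustrative assumptions, not the paper's estimator.

```python
# Illustrative learning-curve extrapolation for sample size determination.
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    """Accuracy after n labels: saturates at a, approaches it as n^(-c)."""
    return a - b * n ** (-c)

# Hypothetical pilot results: accuracy after labeling n examples.
n_pilot = np.array([100, 200, 400, 800, 1600])
acc     = np.array([0.62, 0.68, 0.73, 0.77, 0.80])

(a, b, c), _ = curve_fit(power_law, n_pilot, acc, p0=[0.9, 1.0, 0.3],
                         maxfev=10000)

target = 0.85
# Invert a - b * n^(-c) = target  =>  n = (b / (a - target))^(1/c)
if a > target:
    n_needed = (b / (a - target)) ** (1.0 / c)
    print(f"estimated labels needed for {target:.0%} accuracy: {n_needed:.0f}")
else:
    print("target exceeds the fitted accuracy ceiling")
```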

LLM-QAT: Data-Free Quantization Aware Training for Large Language Models

3 code implementations • 29 May 2023 • Zechun Liu, Barlas Oguz, Changsheng Zhao, Ernie Chang, Pierre Stock, Yashar Mehdad, Yangyang Shi, Raghuraman Krishnamoorthi, Vikas Chandra

Several post-training quantization methods have been applied to large language models (LLMs), and have been shown to perform well down to 8-bits.

Data Free Quantization
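Quantization-aware training, as opposed to the post-training methods the abstract mentions, generally relies on fake quantization with a straight-through estimator so gradients can flow to the full-precision weights. A minimal PyTorch sketch of that mechanism; LLM-QAT's distinctive data-free step, generating its training data from the pretrained model itself, is not shown.

```python
# Minimal straight-through-estimator fake quantization, the generic QAT
# mechanism that methods like LLM-QAT build on.
import torch

class FakeQuant(torch.autograd.Function):
    """Symmetric k-bit fake quantization; gradients pass straight through."""
    @staticmethod
    def forward(ctx, x, bits: int = 4):
        qmax = 2 ** (bits - 1) - 1
        scale = x.abs().max().clamp(min=1e-8) / qmax
        return torch.clamp(torch.round(x / scale), -qmax, qmax) * scale

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out, None  # straight-through estimator

w = torch.randn(16, 16, requires_grad=True)
x = torch.randn(4, 16)
out = x @ FakeQuant.apply(w, 4).T
out.sum().backward()          # gradients reach the full-precision weights
print(w.grad.abs().mean())
```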

Hyperparameter-free Continuous Learning for Domain Classification in Natural Language Understanding

no code implementations • NAACL 2021 • Ting Hua, Yilin Shen, Changsheng Zhao, Yen-Chang Hsu, Hongxia Jin

Most existing continual learning approaches suffer from low accuracy and performance fluctuation, especially when the distributions of old and new data are significantly different.

Continual Learning · Domain Classification · +1

Automatic Mixed-Precision Quantization Search of BERT

no code implementations • 30 Dec 2021 • Changsheng Zhao, Ting Hua, Yilin Shen, Qian Lou, Hongxia Jin

Knowledge distillation, weight pruning, and quantization are known to be the main directions in model compression.

Knowledge Distillation · Model Compression · +2
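Mixed-precision quantization search assigns different bit-widths to different layers under a size budget. The toy greedy heuristic below, with made-up sensitivity scores, is only meant to illustrate the search space; the paper performs an automatic search, not this heuristic.

```python
# Toy mixed-precision bit allocation: keep more bits on layers whose
# quantization hurts the loss most, under an average-bit budget.
def allocate_bits(sensitivity: dict[str, float], budget_avg_bits: float,
                  choices=(2, 4, 8)) -> dict[str, int]:
    layers = sorted(sensitivity, key=sensitivity.get)  # least sensitive first
    bits = {name: max(choices) for name in layers}
    # Greedily lower bits on the least sensitive layers until under budget.
    for name in layers:
        if sum(bits.values()) / len(bits) <= budget_avg_bits:
            break
        bits[name] = min(choices)
    return bits

# Hypothetical per-layer sensitivity scores (higher = more fragile).
sens = {"embed": 0.9, "attn.0": 0.4, "ffn.0": 0.2, "attn.1": 0.5, "ffn.1": 0.1}
print(allocate_bits(sens, budget_avg_bits=5.0))
```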
