Search Results for author: Qihuang Zhong

Found 17 papers, 10 papers with code

ROSE Doesn't Do That: Boosting the Safety of Instruction-Tuned Large Language Models with Reverse Prompt Contrastive Decoding

no code implementations • 19 Feb 2024 • Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

With the development of instruction-tuned large language models (LLMs), improving the safety of LLMs has become more critical.

Revisiting Knowledge Distillation for Autoregressive Language Models

no code implementations • 19 Feb 2024 • Qihuang Zhong, Liang Ding, Li Shen, Juhua Liu, Bo Du, DaCheng Tao

Knowledge distillation (KD) is a common approach to compressing a teacher model by training a smaller student model, thereby reducing inference cost and memory footprint.

Knowledge Distillation
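
For context, the generic KD objective being revisited has the student match the teacher's temperature-softened token distribution. Below is a minimal, hedged sketch of that vanilla recipe (Hinton-style KL on logits); it is illustrative only and not the variant proposed in the paper.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Vanilla token-level KD: KL(teacher || student) on temperature-softened
    distributions. Logits are assumed to have shape (batch, seq_len, vocab)."""
    t = temperature
    log_p_student = F.log_softmax(student_logits / t, dim=-1)
    p_teacher = F.softmax(teacher_logits / t, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t ** 2)
```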

Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models

no code implementations • 20 Oct 2023 • Miaoxi Zhu, Qihuang Zhong, Li Shen, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

The key algorithm for solving ZSAQ is the SAM-SGA optimization, which aims to improve quantization accuracy and model generalization by optimizing a minimax problem.

Language Modelling · Quantization
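
Read generically, that minimax problem has the sharpness-aware form below, where $Q(\cdot)$ denotes the quantization function, $w$ the weights, and $\rho$ the perturbation radius; this is an assumed, textbook-style formulation rather than the exact ZSAQ objective.

$$\min_{w}\; \max_{\|\epsilon\|_2 \le \rho}\; \mathcal{L}\big(Q(w + \epsilon)\big)$$

The inner ascent seeks the worst-case weight perturbation, and the outer step minimizes the loss of the perturbed quantized model, which is how quantization accuracy gets tied to flatter minima.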

Self-Evolution Learning for Discriminative Language Model Pretraining

1 code implementation • 24 May 2023 • Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

Masked language modeling, widely used in discriminative language model (e.g., BERT) pretraining, commonly adopts a random masking strategy.

Language Modelling · Masked Language Modeling +1
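
For reference, the conventional random strategy mentioned above looks roughly like the sketch below (the standard BERT recipe with the usual 15% / 80-10-10 defaults); the paper's self-evolution scheme replaces this, and the function is illustrative only.

```python
import torch

def random_mask(input_ids, mask_token_id, vocab_size, mask_prob=0.15):
    """BERT-style random masking: pick ~15% of positions uniformly at random,
    then replace 80% with [MASK], 10% with a random token, and keep 10%."""
    input_ids = input_ids.clone()
    labels = input_ids.clone()
    masked = torch.bernoulli(torch.full(input_ids.shape, mask_prob)).bool()
    labels[~masked] = -100  # loss is only computed on masked positions
    replace = torch.bernoulli(torch.full(input_ids.shape, 0.8)).bool() & masked
    input_ids[replace] = mask_token_id
    rand = torch.bernoulli(torch.full(input_ids.shape, 0.5)).bool() & masked & ~replace
    input_ids[rand] = torch.randint(vocab_size, input_ids.shape)[rand]
    return input_ids, labels
```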

Revisiting Token Dropping Strategy in Efficient BERT Pretraining

1 code implementation • 24 May 2023 • Qihuang Zhong, Liang Ding, Juhua Liu, Xuebo Liu, Min Zhang, Bo Du, DaCheng Tao

Token dropping is a recently proposed strategy to speed up the pretraining of masked language models, such as BERT, by skipping the computation of a subset of the input tokens at several middle layers.
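
Conceptually, the skipped computation looks like the sketch below: the middle layers only process a high-scoring subset of tokens, and the dropped tokens are merged back afterwards. The layer stack, the norm-based importance proxy, and all names are assumptions for illustration, not the paper's implementation.

```python
import torch

def forward_with_token_dropping(layers, hidden, keep_ratio=0.5,
                                drop_start=4, drop_end=8):
    """Run a transformer stack, letting layers [drop_start, drop_end) see only
    the top-scoring tokens; dropped tokens skip those layers entirely."""
    _, seq_len, dim = hidden.shape
    full, keep_idx = None, None
    for i, layer in enumerate(layers):
        if i == drop_start:
            full = hidden                       # dropped tokens keep this state
            scores = hidden.norm(dim=-1)        # crude importance proxy
            k = max(1, int(seq_len * keep_ratio))
            keep_idx = scores.topk(k, dim=1).indices
            idx = keep_idx.unsqueeze(-1).expand(-1, -1, dim)
            hidden = torch.gather(hidden, 1, idx)
        if i == drop_end:
            idx = keep_idx.unsqueeze(-1).expand(-1, -1, dim)
            hidden = full.scatter(1, idx, hidden)  # merge kept tokens back
            full, keep_idx = None, None
        hidden = layer(hidden)
    return hidden
```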

Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks

no code implementations • 22 May 2023 • Haoqi Zheng, Qihuang Zhong, Liang Ding, Zhiliang Tian, Xin Niu, Dongsheng Li, DaCheng Tao

However, most mixup methods do not consider the varying degree of learning difficulty across different stages of training, and they generate new samples with one-hot labels, leading to model over-confidence.

Data Augmentation · Few-Shot Text Classification +1
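
For reference, generic mixup interpolates both the inputs and the labels; the over-confidence issue noted above arises when a text-mixup variant assigns a single one-hot label to the mixed sample instead of the interpolated soft label. The sketch below shows the soft-label form with illustrative names; it is not the method proposed in the paper.

```python
import torch
import torch.nn.functional as F

def mixup(emb_a, emb_b, label_a, label_b, num_classes, alpha=0.4):
    """Interpolate two sentence embeddings and their (soft) labels."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    mixed_emb = lam * emb_a + (1 - lam) * emb_b
    mixed_label = (lam * F.one_hot(label_a, num_classes).float()
                   + (1 - lam) * F.one_hot(label_b, num_classes).float())
    return mixed_emb, mixed_label
```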

Towards Making the Most of ChatGPT for Machine Translation

1 code implementation • 24 Mar 2023 • Keqin Peng, Liang Ding, Qihuang Zhong, Li Shen, Xuebo Liu, Min Zhang, Yuanxin Ouyang, DaCheng Tao

We show that: 1) The performance of ChatGPT depends largely on temperature, and a lower temperature usually achieves better performance; 2) Emphasizing the task information can further improve ChatGPT's performance, particularly in complex MT tasks; 3) Introducing domain information can elicit ChatGPT's generalization ability and improve its performance in the specific domain; 4) ChatGPT tends to generate hallucinations in non-English-centric MT tasks, which can be partially addressed by our proposed prompts but still needs to be highlighted for the MT/NLP community.

In-Context Learning · Machine Translation +2
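
A hedged sketch of a prompt following findings 2) and 3) above, paired with a low temperature per finding 1). The wording, the `build_mt_messages` helper, and the `translate` client call are all illustrative assumptions, not the paper's exact prompts or a real API.

```python
def build_mt_messages(src_text, src_lang="German", tgt_lang="English", domain="news"):
    """Emphasize the task and the domain in the system prompt."""
    system = (f"You are a professional translator specialized in the {domain} domain. "
              f"Translate the following {src_lang} sentence into {tgt_lang} and "
              "return only the translation.")
    return [{"role": "system", "content": system},
            {"role": "user", "content": src_text}]

# Hypothetical usage with any chat-completion client:
# reply = translate(build_mt_messages("Guten Morgen!"), temperature=0.2)
```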

AdaSAM: Boosting Sharpness-Aware Minimization with Adaptive Learning Rate and Momentum for Training Deep Neural Networks

no code implementations • 1 Mar 2023 • Hao Sun, Li Shen, Qihuang Zhong, Liang Ding, Shixiang Chen, Jingwei Sun, Jing Li, Guangzhong Sun, DaCheng Tao

Integrating SAM with an adaptive learning rate and momentum acceleration, dubbed AdaSAM, has already been explored empirically for training large-scale deep neural networks, but without a theoretical guarantee, owing to the threefold difficulty of analyzing the coupled perturbation step, adaptive learning rate, and momentum step.
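
Read at a high level, the update couples SAM's worst-case weight perturbation with an update from an adaptive, momentum-based optimizer such as Adam. Below is a minimal sketch under that reading; the function, its arguments, and the usage lines are assumptions, not the authors' code.

```python
import torch

def adasam_step(params, closure, optimizer, rho=0.05):
    """One SAM-style step whose outer update is delegated to an adaptive
    optimizer (e.g. Adam), which supplies the adaptive learning rate and
    momentum. `params` is a list of parameters; `closure()` recomputes the loss."""
    closure().backward()
    grads = [p.grad for p in params if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm() for g in grads])) + 1e-12
    eps = []
    with torch.no_grad():
        for p in params:
            e = None if p.grad is None else rho * p.grad / grad_norm
            if e is not None:
                p.add_(e)                    # ascend to the worst-case point
            eps.append(e)
    optimizer.zero_grad()
    closure().backward()                     # gradient at the perturbed weights
    with torch.no_grad():
        for p, e in zip(params, eps):
            if e is not None:
                p.sub_(e)                    # restore the original weights
    optimizer.step()                         # adaptive-lr + momentum update
    optimizer.zero_grad()

# Hypothetical usage:
# opt = torch.optim.Adam(model.parameters(), lr=2e-5)
# adasam_step(list(model.parameters()), lambda: model(batch).loss, opt)
```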

Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT

1 code implementation • 19 Feb 2023 • Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

Recently, ChatGPT has attracted great attention, as it can generate fluent and high-quality responses to human inquiries.

Question Answering · Sentiment Analysis

Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE

no code implementations • 18 Feb 2023 • Qihuang Zhong, Liang Ding, Keqin Peng, Juhua Liu, Bo Du, Li Shen, Yibing Zhan, DaCheng Tao

This technical report briefly describes our JDExplore d-team's submission Vega v1 on the General Language Understanding Evaluation (GLUE) leaderboard, where GLUE is a collection of nine natural language understanding tasks, including question answering, linguistic acceptability, sentiment analysis, text similarity, paraphrase detection, and natural language inference.

Contrastive Learning · Denoising +12

Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models

1 code implementation • 11 Oct 2022 • Qihuang Zhong, Liang Ding, Li Shen, Peng Mi, Juhua Liu, Bo Du, DaCheng Tao

Fine-tuning large pretrained language models on a limited training corpus usually suffers from poor generalization.

PANDA: Prompt Transfer Meets Knowledge Distillation for Efficient Model Adaptation

1 code implementation • 22 Aug 2022 • Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

Prompt Transfer (PoT) is a recently proposed approach to improving prompt tuning, in which the target prompt is initialized with an existing prompt trained on similar source tasks.

General Knowledge · Knowledge Distillation +1
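
For context, vanilla Prompt Transfer amounts to initializing the target task's soft prompt from a prompt already tuned on a source task, then training only the prompt while the backbone stays frozen. The sketch below shows that initialization with illustrative shapes and a hypothetical checkpoint name; PANDA's knowledge-distillation component is not shown.

```python
import torch

prompt_len, hidden_size = 20, 768
source_prompt = torch.load("source_task_prompt.pt")        # (prompt_len, hidden_size), hypothetical file
target_prompt = torch.nn.Parameter(source_prompt.clone())  # PoT: initialize target from source
optimizer = torch.optim.AdamW([target_prompt], lr=3e-2)    # only the prompt is trained
```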

E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation

1 code implementation • 30 May 2022 • Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, DaCheng Tao

To verify our hypothesis, we first empirically study the functionalities of the encoder and decoder in seq2seq pretrained language models, and find that the encoder plays a more important, yet under-exploited, role than the decoder with respect to downstream performance and neuron activation.

Denoising · Language Modelling +2

A Contrastive Cross-Channel Data Augmentation Framework for Aspect-based Sentiment Analysis

1 code implementation • COLING 2022 • Bing Wang, Liang Ding, Qihuang Zhong, Ximing Li, DaCheng Tao

Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task, which focuses on detecting the sentiment polarity towards the aspect in a sentence.

Aspect-Based Sentiment Analysis · Aspect-Based Sentiment Analysis (ABSA) +4
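
To make the task concrete, aspect-level labels attach a polarity to each aspect term rather than to the whole sentence; the record below is a made-up example, not data from the paper.

```python
sample = {
    "sentence": "The battery life is great, but the screen is too dim.",
    "aspects": [
        {"term": "battery life", "polarity": "positive"},
        {"term": "screen", "polarity": "negative"},
    ],
}
```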

Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis

1 code implementation • 13 Jan 2022 • Qihuang Zhong, Liang Ding, Juhua Liu, Bo Du, Hua Jin, DaCheng Tao

To this end, we propose a knowledge graph augmented network, KGAN, which aims to effectively incorporate external knowledge alongside explicit syntactic and contextual information.

Aspect-Based Sentiment Analysis · Aspect-Based Sentiment Analysis (ABSA) +2

Unified Instance and Knowledge Alignment Pretraining for Aspect-based Sentiment Analysis

1 code implementation • 26 Oct 2021 • Juhua Liu, Qihuang Zhong, Liang Ding, Hua Jin, Bo Du, DaCheng Tao

In practice, we formulate the models pretrained on the sampled instances as a knowledge guidance model and a learner model, respectively.

Aspect-Based Sentiment Analysis · Aspect-Based Sentiment Analysis (ABSA) +2
