Search Results for author: Yuxian Gu

Found 19 papers, 13 papers with code

When does Further Pre-training MLM Help? An Empirical Study on Task-Oriented Dialog Pre-training

1 code implementation EMNLP (insights) 2021 Qi Zhu, Yuxian Gu, Lingxiao Luo, Bing Li, Cheng Li, Wei Peng, Minlie Huang, Xiaoyan Zhu

Further pre-training language models on in-domain data (domain-adaptive pre-training, DAPT) or task-relevant data (task-adaptive pre-training, TAPT) before fine-tuning has been shown to improve downstream tasks’ performances.

Direct Preference Knowledge Distillation for Large Language Models

no code implementations28 Jun 2024 Yixing Li, Yuxian Gu, Li Dong, Dequan Wang, Yu Cheng, Furu Wei

Meanwhile, we prove the value and effectiveness of the introduced implicit reward and output preference in KD through experiments and theoretical analysis.

Knowledge Distillation

Instruction Pre-Training: Language Models are Supervised Multitask Learners

1 code implementation20 Jun 2024 Daixuan Cheng, Yuxian Gu, Shaohan Huang, Junyu Bi, Minlie Huang, Furu Wei

Unsupervised multitask pre-training has been the critical method behind the recent success of language models (LMs).

Towards Optimal Learning of Language Models

no code implementations27 Feb 2024 Yuxian Gu, Li Dong, Yaru Hao, Qingxiu Dong, Minlie Huang, Furu Wei

This work studies the general principles of improving the learning of language models (LMs), which aims at reducing the necessary training steps for achieving superior performance.

Data Compression Language Modelling

Pre-Training to Learn in Context

1 code implementation16 May 2023 Yuxian Gu, Li Dong, Furu Wei, Minlie Huang

In-context learning, where pre-trained language models learn to perform tasks from task examples and instructions in their contexts, has attracted much attention in the NLP community.

In-Context Learning Language Modelling +3

Structured Prompting: Scaling In-Context Learning to 1,000 Examples

1 code implementation13 Dec 2022 Yaru Hao, Yutao Sun, Li Dong, Zhixiong Han, Yuxian Gu, Furu Wei

Large language models have exhibited intriguing in-context learning capability, achieving promising zero- and few-shot performance without updating the parameters.

In-Context Learning

Learning Instructions with Unlabeled Data for Zero-Shot Cross-Task Generalization

1 code implementation17 Oct 2022 Yuxian Gu, Pei Ke, Xiaoyan Zhu, Minlie Huang

Recently, instruction tuning (IT), which fine-tunes a pre-trained language model on a massive collection of tasks described via human-craft instructions, has been shown effective in instruction learning for unseen tasks.

Language Modelling

Many-Class Text Classification with Matching

no code implementations23 May 2022 Yi Song, Yuxian Gu, Minlie Huang

In this work, we formulate \textbf{T}ext \textbf{C}lassification as a \textbf{M}atching problem between the text and the labels, and propose a simple yet effective framework named TCM.

text-classification Text Classification

PPT: Pre-trained Prompt Tuning for Few-shot Learning

1 code implementation ACL 2022 Yuxian Gu, Xu Han, Zhiyuan Liu, Minlie Huang

To ensure the generalization of PPT, we formulate similar classification tasks into a unified task form and pre-train soft prompts for this unified task.

Attribute Few-Shot Learning

EVA: An Open-Domain Chinese Dialogue System with Large-Scale Generative Pre-Training

2 code implementations3 Aug 2021 Hao Zhou, Pei Ke, Zheng Zhang, Yuxian Gu, Yinhe Zheng, Chujie Zheng, Yida Wang, Chen Henry Wu, Hao Sun, Xiaocong Yang, Bosi Wen, Xiaoyan Zhu, Minlie Huang, Jie Tang

Although pre-trained language models have remarkably enhanced the generation ability of dialogue systems, open-domain Chinese dialogue systems are still limited by the dialogue data and the model size compared with English ones.

CPM-2: Large-scale Cost-effective Pre-trained Language Models

2 code implementations20 Jun 2021 Zhengyan Zhang, Yuxian Gu, Xu Han, Shengqi Chen, Chaojun Xiao, Zhenbo Sun, Yuan YAO, Fanchao Qi, Jian Guan, Pei Ke, Yanzheng Cai, Guoyang Zeng, Zhixing Tan, Zhiyuan Liu, Minlie Huang, Wentao Han, Yang Liu, Xiaoyan Zhu, Maosong Sun

We present a suite of cost-effective techniques for the use of PLMs to deal with the efficiency issues of pre-training, fine-tuning, and inference.

Decoder

Train No Evil: Selective Masking for Task-Guided Pre-Training

1 code implementation EMNLP 2020 Yuxian Gu, Zhengyan Zhang, Xiaozhi Wang, Zhiyuan Liu, Maosong Sun

In this stage, the model is trained by masked language modeling on in-domain unsupervised data to learn domain-specific patterns and we propose a novel selective masking strategy to learn task-specific patterns.

Language Modelling Masked Language Modeling +1

Adapting Meta Knowledge Graph Information for Multi-Hop Reasoning over Few-Shot Relations

1 code implementation IJCNLP 2019 Xin Lv, Yuxian Gu, Xu Han, Lei Hou, Juanzi Li, Zhiyuan Liu

Multi-hop knowledge graph (KG) reasoning is an effective and explainable method for predicting the target entity via reasoning paths in query answering (QA) task.

Link Prediction Meta-Learning

Cannot find the paper you are looking for? You can Submit a new open access paper.