Search Results for author: Yuxian Gu

Found 17 papers, 12 papers with code

When does Further Pre-training MLM Help? An Empirical Study on Task-Oriented Dialog Pre-training

1 code implementation • EMNLP (insights) 2021 • Qi Zhu, Yuxian Gu, Lingxiao Luo, Bing Li, Cheng Li, Wei Peng, Minlie Huang, Xiaoyan Zhu

Further pre-training language models on in-domain data (domain-adaptive pre-training, DAPT) or task-relevant data (task-adaptive pre-training, TAPT) before fine-tuning has been shown to improve downstream tasks’ performances.

Paper
Code

Towards Optimal Learning of Language Models

no code implementations • 27 Feb 2024 • Yuxian Gu, Li Dong, Yaru Hao, Qingxiu Dong, Minlie Huang, Furu Wei

This work studies the general principles of improving the learning of language models (LMs), which aims at reducing the necessary training steps for achieving superior performance.

Data Compression Language Modelling

Paper
Add Code

Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models

no code implementations • 20 Feb 2024 • Haoran Li, Qingxiu Dong, Zhengyang Tang, Chaojun Wang, Xingxing Zhang, Haoyang Huang, Shaohan Huang, Xiaolong Huang, Zeqiang Huang, Dongdong Zhang, Yuxian Gu, Xin Cheng, Xun Wang, Si-Qing Chen, Li Dong, Wei Lu, Zhifang Sui, Benyou Wang, Wai Lam, Furu Wei

We introduce Generalized Instruction Tuning (called GLAN), a general and scalable method for instruction tuning of Large Language Models (LLMs).

Instruction Following Logical Reasoning +1

Paper
Add Code

MiniLLM: Knowledge Distillation of Large Language Models

2 code implementations • 14 Jun 2023 • Yuxian Gu, Li Dong, Furu Wei, Minlie Huang

In this work, we propose a KD approach that distills LLMs into smaller language models.

Instruction Following Knowledge Distillation +1

3,260

Paper
Code

Pre-Training to Learn in Context

1 code implementation • 16 May 2023 • Yuxian Gu, Li Dong, Furu Wei, Minlie Huang

In-context learning, where pre-trained language models learn to perform tasks from task examples and instructions in their contexts, has attracted much attention in the NLP community.

In-Context Learning Language Modelling +3

105

Paper
Code

Structured Prompting: Scaling In-Context Learning to 1,000 Examples

1 code implementation • 13 Dec 2022 • Yaru Hao, Yutao Sun, Li Dong, Zhixiong Han, Yuxian Gu, Furu Wei

Large language models have exhibited intriguing in-context learning capability, achieving promising zero- and few-shot performance without updating the parameters.

In-Context Learning

3,260

Paper
Code

Learning Instructions with Unlabeled Data for Zero-Shot Cross-Task Generalization

1 code implementation • 17 Oct 2022 • Yuxian Gu, Pei Ke, Xiaoyan Zhu, Minlie Huang

Recently, instruction tuning (IT), which fine-tunes a pre-trained language model on a massive collection of tasks described via human-craft instructions, has been shown effective in instruction learning for unseen tasks.

Language Modelling

Paper
Code

Many-Class Text Classification with Matching

no code implementations • 23 May 2022 • Yi Song, Yuxian Gu, Minlie Huang

In this work, we formulate \textbf{T}ext \textbf{C}lassification as a \textbf{M}atching problem between the text and the labels, and propose a simple yet effective framework named TCM.

text-classification Text Classification

Paper
Add Code

EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training

1 code implementation • 17 Mar 2022 • Yuxian Gu, Jiaxin Wen, Hao Sun, Yi Song, Pei Ke, Chujie Zheng, Zheng Zhang, Jianzhu Yao, Lei Liu, Xiaoyan Zhu, Minlie Huang

Large-scale pre-training has shown remarkable performance in building open-domain dialogue systems.

Chatbot

301

Paper
Code

CUGE: A Chinese Language Understanding and Generation Evaluation Benchmark

no code implementations • 27 Dec 2021 • Yuan YAO, Qingxiu Dong, Jian Guan, Boxi Cao, Zhengyan Zhang, Chaojun Xiao, Xiaozhi Wang, Fanchao Qi, Junwei Bao, Jinran Nie, Zheni Zeng, Yuxian Gu, Kun Zhou, Xuancheng Huang, Wenhao Li, Shuhuai Ren, Jinliang Lu, Chengqiang Xu, Huadong Wang, Guoyang Zeng, Zile Zhou, Jiajun Zhang, Juanzi Li, Minlie Huang, Rui Yan, Xiaodong He, Xiaojun Wan, Xin Zhao, Xu sun, Yang Liu, Zhiyuan Liu, Xianpei Han, Erhong Yang, Zhifang Sui, Maosong Sun

We argue that for general-purpose language intelligence evaluation, the benchmark itself needs to be comprehensive and systematic.

Paper
Add Code

PPT: Pre-trained Prompt Tuning for Few-shot Learning

1 code implementation • ACL 2022 • Yuxian Gu, Xu Han, Zhiyuan Liu, Minlie Huang

To ensure the generalization of PPT, we formulate similar classification tasks into a unified task form and pre-train soft prompts for this unified task.

Attribute Few-Shot Learning

106

Paper
Code

EVA: An Open-Domain Chinese Dialogue System with Large-Scale Generative Pre-Training

2 code implementations • 3 Aug 2021 • Hao Zhou, Pei Ke, Zheng Zhang, Yuxian Gu, Yinhe Zheng, Chujie Zheng, Yida Wang, Chen Henry Wu, Hao Sun, Xiaocong Yang, Bosi Wen, Xiaoyan Zhu, Minlie Huang, Jie Tang

Although pre-trained language models have remarkably enhanced the generation ability of dialogue systems, open-domain Chinese dialogue systems are still limited by the dialogue data and the model size compared with English ones.

565

Paper
Code

CPM-2: Large-scale Cost-effective Pre-trained Language Models

2 code implementations • 20 Jun 2021 • Zhengyan Zhang, Yuxian Gu, Xu Han, Shengqi Chen, Chaojun Xiao, Zhenbo Sun, Yuan YAO, Fanchao Qi, Jian Guan, Pei Ke, Yanzheng Cai, Guoyang Zeng, Zhixing Tan, Zhiyuan Liu, Minlie Huang, Wentao Han, Yang Liu, Xiaoyan Zhu, Maosong Sun

We present a suite of cost-effective techniques for the use of PLMs to deal with the efficiency issues of pre-training, fine-tuning, and inference.

Decoder

565

Paper
Code

Pre-Trained Models: Past, Present and Future

no code implementations • 14 Jun 2021 • Xu Han, Zhengyan Zhang, Ning Ding, Yuxian Gu, Xiao Liu, Yuqi Huo, Jiezhong Qiu, Yuan YAO, Ao Zhang, Liang Zhang, Wentao Han, Minlie Huang, Qin Jin, Yanyan Lan, Yang Liu, Zhiyuan Liu, Zhiwu Lu, Xipeng Qiu, Ruihua Song, Jie Tang, Ji-Rong Wen, Jinhui Yuan, Wayne Xin Zhao, Jun Zhu

Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved great success and become a milestone in the field of artificial intelligence (AI).

Computational Efficiency Self-Supervised Learning +1

Paper
Add Code

CPM: A Large-scale Generative Chinese Pre-trained Language Model

6 code implementations • 1 Dec 2020 • Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun

However, applying GPT-3 to address Chinese NLP tasks is still challenging, as the training corpus of GPT-3 is primarily English, and the parameters are not publicly available.

Cloze Test Language Modelling +1

1,590

Paper
Code

Train No Evil: Selective Masking for Task-Guided Pre-Training

1 code implementation • EMNLP 2020 • Yuxian Gu, Zhengyan Zhang, Xiaozhi Wang, Zhiyuan Liu, Maosong Sun

In this stage, the model is trained by masked language modeling on in-domain unsupervised data to learn domain-specific patterns and we propose a novel selective masking strategy to learn task-specific patterns.

Language Modelling Masked Language Modeling +1

Paper
Code

Adapting Meta Knowledge Graph Information for Multi-Hop Reasoning over Few-Shot Relations

1 code implementation • IJCNLP 2019 • Xin Lv, Yuxian Gu, Xu Han, Lei Hou, Juanzi Li, Zhiyuan Liu

Multi-hop knowledge graph (KG) reasoning is an effective and explainable method for predicting the target entity via reasoning paths in query answering (QA) task.

Ranked #3 on Link Prediction on NELL-995

Link Prediction Meta-Learning

112

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.