Search Results for author: Zhengxiao Du

Found 12 papers, 11 papers with code

P-Tuning: Prompt Tuning Can Be Comparable to Fine-tuning Across Scales and Tasks

no code implementations • ACL 2022 • Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, Jie Tang

Prompt tuning, which only tunes continuous prompts with a frozen language model, substantially reduces per-task storage and memory usage at training.

Language Modelling
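
Since both P-Tuning entries in this listing rest on the same idea, here is one minimal sketch of it: trainable continuous prompt embeddings prepended to the inputs of a frozen language model. The backbone (GPT-2), the prompt length, and the hyperparameters are illustrative assumptions, not the authors' released code.

```python
# Minimal continuous prompt tuning sketch (assumption: GPT-2 backbone, 20 soft tokens).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
for p in model.parameters():
    p.requires_grad = False                               # the language model stays frozen

n_prompt, dim = 20, model.config.n_embd
prompt = torch.nn.Parameter(torch.randn(n_prompt, dim) * 0.02)  # the only trainable weights

def score(text):
    ids = tok(text, return_tensors="pt").input_ids
    tok_emb = model.get_input_embeddings()(ids)                  # (1, seq_len, dim)
    inputs = torch.cat([prompt.unsqueeze(0), tok_emb], dim=1)    # prepend the soft prompt
    return model(inputs_embeds=inputs).logits

# Only `prompt` is optimized, so per-task storage shrinks to n_prompt * dim floats.
optimizer = torch.optim.Adam([prompt], lr=1e-3)
```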

SciGLM: Training Scientific Language Models with Self-Reflective Instruction Annotation and Tuning

1 code implementation • 15 Jan 2024 • Dan Zhang, Ziniu Hu, Sining Zhoubian, Zhengxiao Du, Kaiyu Yang, Zihan Wang, Yisong Yue, Yuxiao Dong, Jie Tang

We fine-tuned the ChatGLM family of language models with SciInstruct, enhancing their capabilities in scientific and mathematical reasoning.

Mathematical Reasoning
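
The snippet above only states that ChatGLM models were fine-tuned on SciInstruct. As a rough illustration of what supervised instruction tuning looks like in general, here is a generic sketch using GPT-2 as a stand-in backbone and a hypothetical sci_instruct.json file with instruction/output fields; it is not the SciGLM training pipeline.

```python
# Generic supervised instruction fine-tuning sketch (GPT-2 stand-in; the data file,
# its fields, and all hyperparameters are illustrative assumptions, not SciGLM's).
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

def encode(ex):
    # One training sequence = instruction followed by the reference answer.
    return tok(ex["instruction"] + "\n" + ex["output"] + tok.eos_token,
               truncation=True, max_length=512)

ds = (load_dataset("json", data_files="sci_instruct.json")["train"]
      .map(encode, remove_columns=["instruction", "output"]))

Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft-out", per_device_train_batch_size=2,
                           num_train_epochs=1, learning_rate=1e-5),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),  # pads and builds labels
).train()
```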

LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding

1 code implementation • 28 Aug 2023 • Yushi Bai, Xin Lv, Jiajie Zhang, Hongchang Lyu, Jiankai Tang, Zhidian Huang, Zhengxiao Du, Xiao Liu, Aohan Zeng, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li

In this paper, we introduce LongBench, the first bilingual, multi-task benchmark for long context understanding, enabling a more rigorous evaluation of long context understanding.

Code Completion • Few-Shot Learning

AgentBench: Evaluating LLMs as Agents

1 code implementation • 7 Aug 2023 • Xiao Liu, Hao Yu, Hanchen Zhang, Yifan Xu, Xuanyu Lei, Hanyu Lai, Yu Gu, Hangliang Ding, Kaiwen Men, Kejuan Yang, Shudan Zhang, Xiang Deng, Aohan Zeng, Zhengxiao Du, Chenhui Zhang, Sheng Shen, Tianjun Zhang, Yu Su, Huan Sun, Minlie Huang, Yuxiao Dong, Jie Tang

We present AgentBench, a multi-dimensional evolving benchmark that currently consists of 8 distinct environments to assess LLM-as-Agent's reasoning and decision-making abilities in a multi-turn open-ended generation setting.

Decision Making • Instruction Following
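
The multi-turn, open-ended setting the abstract describes boils down to a loop in which the model reads an observation, emits an action in free text, and the environment responds. The sketch below is a generic version of that loop; `env` and `call_llm` are hypothetical placeholders, not AgentBench's actual interface.

```python
# Generic multi-turn LLM-as-agent evaluation loop (env and call_llm are hypothetical).
def run_episode(env, call_llm, max_turns=20):
    history = [{"role": "system", "content": env.instructions()}]
    observation = env.reset()
    for _ in range(max_turns):
        history.append({"role": "user", "content": observation})
        action = call_llm(history)                     # the model decides its next action
        history.append({"role": "assistant", "content": action})
        observation, reward, done = env.step(action)   # the environment executes the action
        if done:
            return reward                              # task-specific score for this episode
    return 0.0                                         # ran out of turns without finishing
```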

P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks

3 code implementations • 14 Oct 2021 • Xiao Liu, Kaixuan Ji, Yicheng Fu, Weng Lam Tam, Zhengxiao Du, Zhilin Yang, Jie Tang

Prompt tuning, which only tunes continuous prompts with a frozen language model, substantially reduces per-task storage and memory usage at training.

Language Modelling

GLM: General Language Model Pretraining with Autoregressive Blank Infilling

9 code implementations • ACL 2022 • Zhengxiao Du, Yujie Qian, Xiao Liu, Ming Ding, Jiezhong Qiu, Zhilin Yang, Jie Tang

On a wide range of tasks across NLU, conditional and unconditional generation, GLM outperforms BERT, T5, and GPT given the same model sizes and data, and achieves the best performance from a single pretrained model with 1.25x the parameters of BERT-Large, demonstrating its generalizability to different downstream tasks.

Ranked #4 on Language Modelling on WikiText-103 (using extra training data)

Abstractive Text Summarization • Classification +4
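
Autoregressive blank infilling here means spans of the input are blanked out and the model learns to regenerate them left to right while reading the corrupted text. The sketch below is a simplified reading of that objective at the data-preparation level, not GLM's official preprocessing (which also shuffles spans and uses 2D positional encodings).

```python
# Simplified blank-infilling data preparation (illustrative, not GLM's official code).
import random

def blank_infilling_example(tokens, span_len=3):
    start = random.randrange(len(tokens) - span_len)
    # Part A: the corrupted text, with the span replaced by a single [MASK].
    part_a = tokens[:start] + ["[MASK]"] + tokens[start + span_len:]
    # Part B: the blanked span, to be generated autoregressively after Part A.
    part_b = ["[START]"] + tokens[start:start + span_len] + ["[END]"]
    # The model attends bidirectionally over Part A and left to right over Part B.
    return part_a + part_b, part_b

source, target = blank_infilling_example("the quick brown fox jumps over the lazy dog".split())
```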

GPT Understands, Too

6 code implementations • 18 Mar 2021 • Xiao Liu, Yanan Zheng, Zhengxiao Du, Ming Ding, Yujie Qian, Zhilin Yang, Jie Tang

Prompting a pretrained language model with natural language patterns has proven effective for natural language understanding (NLU).

Knowledge Probing • Language Modelling +2

Policy-Gradient Training of Fair and Unbiased Ranking Functions

1 code implementation • 19 Nov 2019 • Himank Yadav, Zhengxiao Du, Thorsten Joachims

While implicit feedback is an abundant and attractive source of data for learning to rank, it can produce unfair ranking policies for both exogenous and endogenous reasons.

Counterfactual • Decision Making +2

Cognitive Knowledge Graph Reasoning for One-shot Relational Learning

1 code implementation • 13 Jun 2019 • Zhengxiao Du, Chang Zhou, Ming Ding, Hongxia Yang, Jie Tang

Inferring new facts from existing knowledge graphs (KG) with explainable reasoning processes is a significant problem and has received much attention recently.

Knowledge Graphs • Relational Reasoning +1

Sequential Scenario-Specific Meta Learner for Online Recommendation

1 code implementation • 2 Jun 2019 • Zhengxiao Du, Xiaowei Wang, Hongxia Yang, Jingren Zhou, Jie Tang

Our approach is based on the insight that having a good generalization from a few examples relies on both a generic model initialization and an effective strategy for adapting this model to newly arising tasks.

Few-Shot Learning
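
The insight stated above, a shared model initialization plus a fast adaptation step on a few examples, is the core of optimization-based meta-learning. The sketch below is a generic first-order MAML-style training step under that reading, with a hypothetical `tasks` iterable of (support, query) batches; it is not the paper's recommender architecture.

```python
# Generic first-order MAML-style meta-training step (illustrative; `tasks`,
# `loss_fn`, and the hyperparameters are assumptions, not the paper's setup).
import copy
import torch

def meta_train_step(model, tasks, loss_fn, inner_lr=0.01, outer_lr=0.001):
    # A persistent optimizer would normally live outside this function.
    meta_opt = torch.optim.Adam(model.parameters(), lr=outer_lr)
    meta_opt.zero_grad()
    for support, query in tasks:                        # each task = one new scenario
        fast = copy.deepcopy(model)                     # start from the shared initialization
        inner_opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)
        x_s, y_s = support
        loss_fn(fast(x_s), y_s).backward()              # inner loop: adapt on a few examples
        inner_opt.step()
        x_q, y_q = query
        q_loss = loss_fn(fast(x_q), y_q)                # evaluate the adapted model
        grads = torch.autograd.grad(q_loss, fast.parameters())
        for p, g in zip(model.parameters(), grads):     # accumulate first-order meta-gradients
            p.grad = g.detach() if p.grad is None else p.grad + g.detach()
    meta_opt.step()                                     # update the shared initialization
```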
