Search Results for author: Aohan Zeng

Found 14 papers, 9 papers with code

CogDL: A Comprehensive Library for Graph Deep Learning

1 code implementation1 Mar 2021 Yukuo Cen, Zhenyu Hou, Yan Wang, Qibin Chen, Yizhen Luo, Zhongming Yu, Hengrui Zhang, Xingcheng Yao, Aohan Zeng, Shiguang Guo, Yuxiao Dong, Yang Yang, Peng Zhang, Guohao Dai, Yu Wang, Chang Zhou, Hongxia Yang, Jie Tang

In CogDL, we propose a unified design for the training and evaluation of GNN models for various graph tasks, making it unique among existing graph learning libraries.

Graph Classification Graph Embedding +5

FastMoE: A Fast Mixture-of-Expert Training System

3 code implementations24 Mar 2021 Jiaao He, Jiezhong Qiu, Aohan Zeng, Zhilin Yang, Jidong Zhai, Jie Tang

However, training trillion-scale MoE requires algorithm and system co-design for a well-tuned high performance distributed training system.

Language Modelling

Revisiting Parallel Context Windows: A Frustratingly Simple Alternative and Chain-of-Thought Deterioration

no code implementations24 May 2023 Kejuan Yang, Xiao Liu, Kaiwen Men, Aohan Zeng, Yuxiao Dong, Jie Tang

We identify two crucial limitations in the evaluation of recent parallel-integrated method Parallel Context Windows (PCW), which extends the maximum context lengths of language models, e. g., 2048 for LLaMA, by harnessing window-wise attention and positional embedding techniques.

Long-Context Understanding

AgentBench: Evaluating LLMs as Agents

1 code implementation7 Aug 2023 Xiao Liu, Hao Yu, Hanchen Zhang, Yifan Xu, Xuanyu Lei, Hanyu Lai, Yu Gu, Hangliang Ding, Kaiwen Men, Kejuan Yang, Shudan Zhang, Xiang Deng, Aohan Zeng, Zhengxiao Du, Chenhui Zhang, Sheng Shen, Tianjun Zhang, Yu Su, Huan Sun, Minlie Huang, Yuxiao Dong, Jie Tang

We present AgentBench, a multi-dimensional evolving benchmark that currently consists of 8 distinct environments to assess LLM-as-Agent's reasoning and decision-making abilities in a multi-turn open-ended generation setting.

Decision Making Instruction Following

LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding

1 code implementation28 Aug 2023 Yushi Bai, Xin Lv, Jiajie Zhang, Hongchang Lyu, Jiankai Tang, Zhidian Huang, Zhengxiao Du, Xiao Liu, Aohan Zeng, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li

In this paper, we introduce LongBench, the first bilingual, multi-task benchmark for long context understanding, enabling a more rigorous evaluation of long context understanding.

16k Code Completion +2

AgentTuning: Enabling Generalized Agent Abilities for LLMs

1 code implementation19 Oct 2023 Aohan Zeng, Mingdao Liu, Rui Lu, Bowen Wang, Xiao Liu, Yuxiao Dong, Jie Tang

Though many prompting methods have been proposed to complete particular agent tasks, there is lack of research focusing on improving the agent capabilities of LLMs themselves without compromising their general abilities.

Memorization

CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation

2 code implementations30 Nov 2023 Pei Ke, Bosi Wen, Zhuoer Feng, Xiao Liu, Xuanyu Lei, Jiale Cheng, Shengyuan Wang, Aohan Zeng, Yuxiao Dong, Hongning Wang, Jie Tang, Minlie Huang

Since the natural language processing (NLP) community started to make large language models (LLMs), such as GPT-4, act as a critic to evaluate the quality of generated texts, most of them only train a critique generation model of a specific scale on specific datasets.

Language Modelling Large Language Model

xTrimoPGLM: Unified 100B-Scale Pre-trained Transformer for Deciphering the Language of Protein

no code implementations11 Jan 2024 Bo Chen, Xingyi Cheng, Pan Li, Yangli-ao Geng, Jing Gong, Shen Li, Zhilei Bei, Xu Tan, Boyan Wang, Xin Zeng, Chiming Liu, Aohan Zeng, Yuxiao Dong, Jie Tang, Le Song

We propose a unified protein language model, xTrimoPGLM, to address these two types of tasks simultaneously through an innovative pre-training framework.

Protein Language Model

APAR: LLMs Can Do Auto-Parallel Auto-Regressive Decoding

no code implementations12 Jan 2024 Mingdao Liu, Aohan Zeng, Bowen Wang, Peng Zhang, Jie Tang, Yuxiao Dong

The massive adoption of large language models (LLMs) demands efficient deployment strategies.

Understanding Emergent Abilities of Language Models from the Loss Perspective

no code implementations23 Mar 2024 Zhengxiao Du, Aohan Zeng, Yuxiao Dong, Jie Tang

Recent studies have put into question the belief that emergent abilities in language models are exclusive to large models.

ChatGLM-RLHF: Practices of Aligning Large Language Models with Human Feedback

no code implementations1 Apr 2024 Zhenyu Hou, Yilin Niu, Zhengxiao Du, Xiaohan Zhang, Xiao Liu, Aohan Zeng, Qinkai Zheng, Minlie Huang, Hongning Wang, Jie Tang, Yuxiao Dong

The work presents our practices of aligning LLMs with human preferences, offering insights into the challenges and solutions in RLHF implementations.

ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline

1 code implementation3 Apr 2024 Yifan Xu, Xiao Liu, Xinghan Liu, Zhenyu Hou, Yueyan Li, Xiaohan Zhang, Zihan Wang, Aohan Zeng, Zhengxiao Du, Wenyi Zhao, Jie Tang, Yuxiao Dong

Large language models (LLMs) have shown excellent mastering of human language, but still struggle in real-world applications that require mathematical problem-solving.

Math

Cannot find the paper you are looking for? You can Submit a new open access paper.