Search Results for author: Yiduo Guo

Found 10 papers, 7 papers with code

PPTC-R benchmark: Towards Evaluating the Robustness of Large Language Models for PowerPoint Task Completion

1 code implementation • 6 Mar 2024 • Zekai Zhang, Yiduo Guo, Yaobo Liang, Dongyan Zhao, Nan Duan

The growing dependence on Large Language Models (LLMs) for completing user instructions necessitates a comprehensive understanding of their robustness to complex task completion in real-world situations.

Sentence

PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion

1 code implementation • 3 Nov 2023 • Yiduo Guo, Zekai Zhang, Yaobo Liang, Dongyan Zhao, Nan Duan

Recent evaluations of Large Language Models (LLMs) have centered around testing their zero-shot/few-shot capabilities for basic natural language tasks and their ability to translate instructions into tool APIs.

EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation

no code implementations • 12 Oct 2023 • Wang You, Wenshan Wu, Yaobo Liang, Shaoguang Mao, Chenfei Wu, Maosong Cao, Yuzhe Cai, Yiduo Guo, Yan Xia, Furu Wei, Nan Duan

In this paper, we propose a new framework called Evaluation-guided Iterative Plan Extraction for long-form narrative text generation (EIPE-text), which extracts plans from the corpus of narratives and utilizes the extracted plans to construct a better planner.

In-Context Learning • Text Generation
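The abstract snippet above describes the extract-and-refine loop only at a high level. The following is a minimal, hypothetical sketch of what an evaluation-guided plan-extraction loop could look like; the helpers (call_llm, evaluate_plan) and the stopping threshold are assumptions made for illustration, not the paper's actual implementation.

```python
# Hypothetical sketch of evaluation-guided iterative plan extraction
# (illustration only; call_llm, evaluate_plan, and the threshold are assumed).

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call (e.g., an API client returning text)."""
    raise NotImplementedError

def extract_plan(narrative: str, evaluate_plan, max_iters: int = 3,
                 threshold: float = 0.9) -> str:
    """Extract a plan from one narrative and refine it until the (assumed)
    evaluator judges that the plan covers the story well enough."""
    plan = call_llm("Summarize this narrative as a hierarchical plan:\n" + narrative)
    for _ in range(max_iters):
        score, critique = evaluate_plan(plan, narrative)  # assumed: returns (float, str)
        if score >= threshold:
            break
        plan = call_llm(
            "Improve this plan so it better reflects the narrative.\n"
            f"Plan:\n{plan}\nCritique:\n{critique}\nNarrative:\n{narrative}"
        )
    return plan
```

In this reading, the refined plans extracted from the corpus would then be used to construct the planner mentioned in the abstract.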

Class Incremental Learning via Likelihood Ratio Based Task Prediction

2 code implementations • 26 Sep 2023 • Haowei Lin, Yijia Shao, Weinan Qian, Ningxin Pan, Yiduo Guo, Bing Liu

An emerging theory-guided approach (called TIL+OOD) trains a task-specific model for each task within a network shared by all tasks, using a task-incremental learning (TIL) method to deal with catastrophic forgetting.

Class Incremental Learning • Incremental Learning
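Going only by the title and the sentence above, here is a minimal, hypothetical sketch of likelihood-ratio-style task prediction at inference time: each task keeps its own head trained with a TIL method, a per-task in-distribution score is contrasted with an out-of-distribution reference score, and the task with the highest ratio classifies the input. The function names and scoring interfaces are illustrative assumptions, not the authors' API.

```python
# Hypothetical sketch of likelihood-ratio based task prediction for
# class-incremental inference (illustration only, not the paper's code).
import numpy as np

def predict(x_feat, task_heads, id_scores, ood_scores):
    """task_heads: per-task callables mapping features -> class logits
    id_scores / ood_scores: per-task callables giving in-distribution and
    out-of-distribution reference scores (assumed to return log-densities)."""
    # Log-likelihood ratio per task: high when x looks like that task's data.
    ratios = [id_s(x_feat) - ood_s(x_feat)
              for id_s, ood_s in zip(id_scores, ood_scores)]
    task_id = int(np.argmax(ratios))               # predicted task
    class_logits = task_heads[task_id](x_feat)     # within-task class prediction
    return task_id, int(np.argmax(class_logits))
```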

Class-Incremental Learning based on Label Generation

1 code implementation • 22 Jun 2023 • Yijia Shao, Yiduo Guo, Dongyan Zhao, Bing Liu

Despite the great success of pre-trained language models, it is still a challenge to use these models for continual learning, especially for the class-incremental learning (CIL) setting due to catastrophic forgetting (CF).

Class Incremental Learning • Incremental Learning

Dealing with Cross-Task Class Discrimination in Online Continual Learning

1 code implementation • CVPR 2023 • Yiduo Guo, Bing Liu, Dongyan Zhao

A novel optimization objective with a gradient-based adaptive method is proposed to dynamically deal with the cross-task class discrimination problem in the online continual learning (CL) process.

Class Incremental Learning • Incremental Learning

Analyzing and Reducing the Performance Gap in Cross-Lingual Transfer with Fine-tuning Slow and Fast

no code implementations • 19 May 2023 • Yiduo Guo, Yaobo Liang, Dongyan Zhao, Bing Liu, Nan Duan

Existing research has shown that a multilingual pre-trained language model fine-tuned with one (source) language also performs well on downstream tasks for non-source languages, even though no fine-tuning is done on these languages.

Cross-Lingual Transfer • Language Modelling

Learning to Plan with Natural Language

1 code implementation • 20 Apr 2023 • Yiduo Guo, Yaobo Liang, Chenfei Wu, Wenshan Wu, Dongyan Zhao, Nan Duan

To obtain such a task plan, we propose the Learning to Plan method, which involves two phases: (1) in the first, learning-task-plan phase, it iteratively updates the task plan with new step-by-step solutions and behavioral instructions, which are obtained by prompting LLMs to derive them from training error feedback.

Transfer Learning
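The first phase described above (iteratively revising a natural-language task plan from training error feedback) can be pictured with the rough sketch below. The prompt wording and the call_llm / solve helpers are assumptions made for illustration; they are not the paper's actual prompts or code.

```python
# Hypothetical sketch of the iterative "learning task plan" phase
# (illustration only; helpers and prompt text are assumptions).

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call (e.g., an API client returning text)."""
    raise NotImplementedError

def learn_task_plan(task_plan: str, train_examples, solve, n_rounds: int = 3) -> str:
    """Iteratively revise a natural-language task plan using training errors."""
    for _ in range(n_rounds):
        errors = []
        for example in train_examples:
            prediction = solve(task_plan, example)        # attempt task with current plan
            if prediction != example["answer"]:
                errors.append((example, prediction))      # collect error feedback
        if not errors:
            break
        feedback = "\n".join(
            f"Input: {ex['input']}\nWrong answer: {pred}\nCorrect answer: {ex['answer']}"
            for ex, pred in errors
        )
        task_plan = call_llm(
            "Here is the current task plan:\n" + task_plan +
            "\nThese training examples were solved incorrectly:\n" + feedback +
            "\nRevise the plan with step-by-step solutions and behavioral instructions."
        )
    return task_plan
```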

AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models

2 code implementations • 13 Apr 2023 • Wanjun Zhong, Ruixiang Cui, Yiduo Guo, Yaobo Liang, Shuai Lu, Yanlin Wang, Amin Saied, Weizhu Chen, Nan Duan

Impressively, GPT-4 surpasses average human performance on SAT, LSAT, and math competitions, attaining a 95% accuracy rate on the SAT Math test and a 92.5% accuracy on the English test of the Chinese national college entrance exam.

Decision Making • Math
