Search Results for author: Shuaiyi Li

Found 7 papers, 2 papers with code

A Silver Bullet or a Compromise for Full Attention? A Comprehensive Study of Gist Token-based Context Compression

no code implementations · 23 Dec 2024 · Chenlong Deng, Zhisong Zhang, Kelong Mao, Shuaiyi Li, Xinting Huang, Dong Yu, Zhicheng Dou

In this work, we provide a thorough investigation of gist-based context compression methods to improve long-context processing in large language models.

Retrieval-augmented Generation
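
Gist-based compression condenses a long context into far fewer "gist" representations. As a rough illustration only (the function name, pooling scheme, and ratio below are assumptions, not the methods studied in the paper), segment-wise mean pooling shows the basic idea of trading sequence length for compactness:

```python
# Illustrative sketch of gist-token-style context compression (NOT the
# paper's exact method): a long sequence of hidden-state vectors is
# condensed into a much shorter sequence of "gist" vectors by pooling
# each segment of `ratio` consecutive positions into one vector.

def compress_to_gists(hidden_states, ratio=4):
    """Mean-pool every `ratio` consecutive vectors into one gist vector."""
    gists = []
    for start in range(0, len(hidden_states), ratio):
        chunk = hidden_states[start:start + ratio]
        dim = len(chunk[0])
        gists.append([sum(v[d] for v in chunk) / len(chunk) for d in range(dim)])
    return gists

# 8 positions with 2-dim states compress to 2 gist vectors at ratio 4.
states = [[float(i), float(-i)] for i in range(8)]
gists = compress_to_gists(states, ratio=4)
print(len(gists))  # 2
print(gists[0])    # [1.5, -1.5]
```

In real gist-token methods the compression is learned end-to-end rather than a fixed pooling rule; the sketch only conveys the length-reduction trade-off the paper investigates.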

Knowledge Boundary of Large Language Models: A Survey

no code implementations · 17 Dec 2024 · Moxin Li, Yong Zhao, Yang Deng, Wenxuan Zhang, Shuaiyi Li, Wenya Xie, See-Kiong Ng, Tat-Seng Chua

Although large language models (LLMs) store vast amounts of knowledge in their parameters, they still have limitations in memorizing and utilizing certain knowledge, leading to undesired behaviors such as generating untruthful and inaccurate responses.

Memorization, Survey

On the Transformations across Reward Model, Parameter Update, and In-Context Prompt

no code implementations · 24 Jun 2024 · Deng Cai, Huayang Li, Tingchen Fu, Siheng Li, Weiwen Xu, Shuaiyi Li, Bowen Cao, Zhisong Zhang, Xinting Huang, Leyang Cui, Yan Wang, Lemao Liu, Taro Watanabe, Shuming Shi

Despite the general capabilities of pre-trained large language models (LLMs), they still need further adaptation to better serve practical applications.

Consecutive Batch Model Editing with HooK Layers

1 code implementation · 8 Mar 2024 · Shuaiyi Li, Yang Deng, Deng Cai, Hongyuan Lu, Liang Chen, Wai Lam

Because the typical retraining paradigm is unacceptably time- and resource-consuming, researchers are turning to model editing for an effective way to edit model behavior directly while supporting both consecutive and batch editing scenarios.

Model, Model Editing

DepWiGNN: A Depth-wise Graph Neural Network for Multi-hop Spatial Reasoning in Text

1 code implementation · 19 Oct 2023 · Shuaiyi Li, Yang Deng, Wai Lam

Specifically, we design a novel node memory scheme and aggregate the information over the depth dimension instead of the breadth dimension of the graph, which empowers the ability to collect long dependencies without stacking multiple layers.

Graph Neural Network, Spatial Reasoning
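
The depth-wise idea above can be sketched in a few lines. This is an illustrative toy inspired by, not reproducing, DepWiGNN (the graph encoding, BFS traversal, and scalar "memory" below are all simplifying assumptions): each node's memory accumulates information along the path from a source node in a single pass, so a dependency three hops away is captured without stacking three message-passing layers.

```python
# Toy depth-wise aggregation: accumulate edge information along the path
# from a source node into a per-node memory, rather than aggregating over
# each node's neighborhood (the breadth dimension) layer by layer.
from collections import deque

def depthwise_aggregate(edges, source):
    """edges: dict mapping node -> list of (neighbor, edge_value) pairs.
    Returns a node-memory dict: the sum of edge values along the BFS
    path from `source` to each reachable node."""
    memory = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr, val in edges.get(node, []):
            if nbr not in memory:
                memory[nbr] = memory[node] + val  # accumulate over depth
                queue.append(nbr)
    return memory

# A 4-node chain: the node 3 hops from the source collects information
# from the whole path in one traversal.
chain = {0: [(1, 1)], 1: [(2, 2)], 2: [(3, 3)]}
print(depthwise_aggregate(chain, 0))  # {0: 0, 1: 1, 2: 3, 3: 6}
```

In the actual model the per-node memory holds learned vectors and the aggregation is differentiable; the scalar sums here only illustrate why depth-wise collection sidesteps the layer-stacking needed for long dependencies.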
