Search Results for author: Guanlin Liu

Found 8 papers, 0 papers with code

Flaming-hot Initiation with Regular Execution Sampling for Large Language Models

no code implementations 28 Oct 2024 Weizhe Chen, Zhicheng Zhang, Guanlin Liu, Renjie Zheng, Wenlei Shi, Chen Dun, Zheng Wu, Xing Jin, Lin Yan

Since the release of ChatGPT, large language models (LLMs) have demonstrated remarkable capabilities across various domains.

Tasks: Diversity, Math

Process Supervision-Guided Policy Optimization for Code Generation

no code implementations 23 Oct 2024 Ning Dai, Zheng Wu, Renjie Zheng, Ziyun Wei, Wenlei Shi, Xing Jin, Guanlin Liu, Chen Dun, Liang Huang, Lin Yan

Reinforcement Learning (RL) with unit test feedback has enhanced code generation in large language models (LLMs), but it relies on sparse rewards provided only after complete code evaluation, limiting learning efficiency and incremental improvements; a sketch of this sparse-reward setup appears below.

Tasks: Code Generation, Reinforcement Learning (RL)
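To make the sparse-reward point concrete, here is a minimal sketch of an outcome-only reward from unit-test feedback: the policy receives a single scalar only after the complete program has been evaluated, with no credit for partially correct code. This is illustrative only; it is not the process-supervision method proposed in the paper, and `sparse_unit_test_reward` is a hypothetical helper.

```python
# Minimal sketch of a sparse, outcome-only reward from unit-test feedback.
# Illustrative only; not the process-supervision method proposed in the paper.

from typing import Callable, List


def sparse_unit_test_reward(program: str, tests: List[Callable[[dict], None]]) -> float:
    """Return 1.0 iff the generated program passes every unit test, else 0.0."""
    namespace: dict = {}
    try:
        exec(program, namespace)      # run the candidate solution
        for test in tests:
            test(namespace)           # each test asserts against the defined functions
    except Exception:
        return 0.0                    # any failure collapses to the same zero signal
    return 1.0                        # no partial credit for partially correct code


def _test_add(ns: dict) -> None:
    assert ns["add"](2, 3) == 5


print(sparse_unit_test_reward("def add(a, b):\n    return a + b\n", [_test_add]))  # 1.0
print(sparse_unit_test_reward("def add(a, b):\n    return a - b\n", [_test_add]))  # 0.0
```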

Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization

no code implementations 11 Oct 2024 Guanlin Liu, Kaixuan Ji, Renjie Zheng, Zheng Wu, Chen Dun, Quanquan Gu, Lin Yan

Reinforcement Learning (RL) plays a crucial role in aligning large language models (LLMs) with human preferences and improving their ability to perform complex tasks.

Tasks: GSM8K, Language Modelling, +5

Optimal Cost Constrained Adversarial Attacks For Multiple Agent Systems

no code implementations 1 Nov 2023 Ziqing Lu, Guanlin Liu, Lifeng Lai, Weiyu Xu

Finding optimal adversarial attack strategies is an important topic in reinforcement learning and Markov decision processes.

Tasks: Adversarial Attack

Efficient Action Robust Reinforcement Learning with Probabilistic Policy Execution Uncertainty

no code implementations 15 Jul 2023 Guanlin Liu, Zhihan Zhou, Han Liu, Lifeng Lai

Robust reinforcement learning (RL) aims to find a policy that optimizes worst-case performance in the face of uncertainties; a sketch of the execution-uncertainty model appears below.

Tasks: Reinforcement Learning, +1
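The "probabilistic policy execution uncertainty" in the title can be pictured as follows. This is a hedged sketch of the uncertainty model only, assuming a gym-style `reset()`/`step()` interface and an illustrative class name; it is not the sample-efficient algorithm the paper develops.

```python
# Sketch of probabilistic policy execution uncertainty: with probability
# 1 - rho the intended action is executed, with probability rho some other
# (possibly adversarial) action is executed instead.  The gym-style interface
# and the class name are assumptions for illustration.

import random


class ActionPerturbedEnv:
    def __init__(self, env, rho: float, perturbation_policy):
        self.env = env                                  # environment with reset()/step()
        self.rho = rho                                  # probability of a perturbed execution
        self.perturbation_policy = perturbation_policy  # state -> alternative action
        self._state = None

    def reset(self):
        self._state = self.env.reset()
        return self._state

    def step(self, intended_action):
        if random.random() < self.rho:
            executed = self.perturbation_policy(self._state)  # execution deviates
        else:
            executed = intended_action
        self._state, reward, done, info = self.env.step(executed)
        return self._state, reward, done, info
```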

Efficient Action Poisoning Attacks on Linear Contextual Bandits

no code implementations 10 Dec 2021 Guanlin Liu, Lifeng Lai

We show that, in both white-box and black-box settings, the proposed attack schemes can force the LinUCB agent to pull a target arm very frequently by spending only logarithmic cost; a sketch of the poisoned interaction loop appears below.

Tasks: Multi-Armed Bandits
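The claim above can be read through the interaction loop sketched here. The attack rule shown (redirecting non-target pulls to the arm that currently looks worst) is only an illustrative placeholder, and the `agent`/`env` interfaces are assumptions; the paper's white-box and black-box schemes are not reproduced.

```python
# Sketch of an action-poisoning interaction with a contextual bandit agent
# such as LinUCB.  The attacker changes which arm is actually executed, the
# agent credits the observed reward to the arm it chose, and the attack cost
# counts the rounds in which the action was changed.


def run_poisoned_bandit(agent, env, target_arm: int, horizon: int):
    """agent: select(context) -> arm, update(context, arm, reward);
    env: context(t) -> context, worst_arm(context) -> arm, pull(arm, context) -> reward."""
    attack_cost = 0
    target_pulls = 0
    for t in range(horizon):
        context = env.context(t)
        chosen = agent.select(context)         # arm the agent intends to pull
        if chosen == target_arm:
            executed = chosen
            target_pulls += 1
        else:
            executed = env.worst_arm(context)  # placeholder attack rule
            attack_cost += 1                   # each changed action costs one unit
        reward = env.pull(executed, context)
        # The agent is unaware of the change and credits the reward to `chosen`.
        agent.update(context, chosen, reward)
    return target_pulls, attack_cost
```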

Provably Efficient Black-Box Action Poisoning Attacks Against Reinforcement Learning

no code implementations NeurIPS 2021 Guanlin Liu, Lifeng Lai

In this paper, we introduce a new class of attacks named action poisoning attacks, in which an adversary can change the action signal selected by the agent; a sketch of this threat model appears below.

Tasks: Reinforcement Learning, +1
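The threat model in the abstract can be sketched as a thin wrapper that sits between the agent and the environment and may overwrite the agent's action before execution. The `attack_rule` is a stand-in and the gym-style interface is assumed; the paper's provably efficient black-box scheme is not reproduced here.

```python
# Sketch of an action-poisoning threat model for RL: the adversary intercepts
# the agent's action and may replace it before the environment executes it.
# The attack rule is a placeholder, not the paper's black-box attack.


class ActionPoisonedEnv:
    def __init__(self, env, attack_rule):
        self.env = env                    # gym-like environment (assumption)
        self.attack_rule = attack_rule    # (state, agent_action) -> executed action or None
        self.attack_count = 0             # rounds in which the action was changed
        self._state = None

    def reset(self):
        self._state = self.env.reset()
        return self._state

    def step(self, agent_action):
        executed = self.attack_rule(self._state, agent_action)
        if executed is None:
            executed = agent_action       # adversary chooses not to intervene
        elif executed != agent_action:
            self.attack_count += 1
        self._state, reward, done, info = self.env.step(executed)
        # The agent still believes `agent_action` was executed.
        return self._state, reward, done, info
```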

Action-Manipulation Attacks Against Stochastic Bandits: Attacks and Defense

no code implementations 19 Feb 2020 Guanlin Liu, Lifeng Lai

To defend against this class of attacks, we introduce a novel algorithm that is robust to action-manipulation attacks when an upper bound on the total attack cost is given.
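One generic way a known bound A on the total attack cost can be exploited, shown here purely as an illustration and not as the paper's algorithm, is to widen each arm's UCB index by a slack term that shrinks as the arm is pulled more often, so that bounded manipulation cannot keep a bad arm's index above a good arm's.

```python
# Generic sketch of a budget-widened UCB index; an illustrative
# robustification, not the specific defense proposed in the paper.

import math


def robust_ucb_index(mean_est: float, pulls: int, t: int, attack_budget: float) -> float:
    """UCB index with an extra slack term proportional to the known attack budget."""
    if pulls == 0:
        return float("inf")                        # force initial exploration of every arm
    exploration = math.sqrt(2.0 * math.log(t) / pulls)
    corruption_slack = attack_budget / pulls       # bounded manipulation dilutes with more pulls
    return mean_est + exploration + corruption_slack
```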
