Search Results for author: Boyi Liu

Found 28 papers, 3 papers with code

Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning

no code implementations • 16 Feb 2024 • Zihao Li, Boyi Liu, Zhuoran Yang, Zhaoran Wang, Mengdi Wang

Designing algorithms for a constrained convex MDP faces several challenges, including (1) handling the large state space, (2) managing the exploration/exploitation tradeoff, and (3) solving the constrained optimization where the objective and the constraint are both nonlinear functions of the visitation measure.

reinforcement-learning

Improving Efficiency of DNN-based Relocalization Module for Autonomous Driving with Server-side Computing

no code implementations • 1 Dec 2023 • Dengbo Li, Jieren Cheng, Boyi Liu

Our findings highlight the vital role of server-side offloading in DNN-based camera relocation for autonomous vehicles, and we also discuss the results of data fusion.

Autonomous Driving

Let Models Speak Ciphers: Multiagent Debate through Embeddings

no code implementations • 10 Oct 2023 • Chau Pham, Boyi Liu, Yingxiang Yang, Zhengyu Chen, Tianyi Liu, Jianbo Yuan, Bryan A. Plummer, Zhaoran Wang, Hongxia Yang

Although natural language is an obvious choice for communication due to LLM's language understanding capability, the token sampling step needed when generating natural language poses a potential risk of information loss, as it uses only one token to represent the model's belief across the entire vocabulary.
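
The information-loss point in this snippet can be illustrated with a toy sketch. The three-token vocabulary, the embeddings, and the probabilities below are made-up values, and the probability-weighted average is only one assumed way of keeping the full distribution, not necessarily the paper's exact procedure.

```python
import random

# Hypothetical 3-token vocabulary with 2-d embeddings (illustrative values only).
EMBEDDINGS = {"yes": [1.0, 0.0], "no": [0.0, 1.0], "maybe": [0.5, 0.5]}

def sample_token(probs, rng):
    """Collapse the belief to a single token, as in ordinary text generation."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

def expected_embedding(probs):
    """Keep the whole belief as a probability-weighted average of embeddings."""
    dim = len(next(iter(EMBEDDINGS.values())))
    return [sum(probs[t] * EMBEDDINGS[t][i] for t in probs) for i in range(dim)]

# Assumed model belief over the vocabulary.
belief = {"yes": 0.5, "no": 0.25, "maybe": 0.25}

token = sample_token(belief, random.Random(0))
sampled_vec = EMBEDDINGS[token]        # one token's embedding; the other probabilities are discarded
soft_vec = expected_embedding(belief)  # reflects all three probabilities
```

Whichever token is sampled, `sampled_vec` is the same vector that a fully confident model would have produced, whereas `soft_vec` changes whenever the belief changes.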

Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency

1 code implementation • 29 Sep 2023 • Zhihan Liu, Hao Hu, Shenao Zhang, Hongyi Guo, Shuqi Ke, Boyi Liu, Zhaoran Wang

Specifically, we design a prompt template for reasoning that learns from the memory buffer and plans a future trajectory over a long horizon ("reason for future").

Differentiable Arbitrating in Zero-sum Markov Games

no code implementations • 20 Feb 2023 • Jing Wang, Meichen Song, Feng Gao, Boyi Liu, Zhaoran Wang, Yi Wu

We initiate the study of how to perturb the reward in a zero-sum Markov game with two players to induce a desirable Nash equilibrium, namely arbitrating.

Multi-agent Reinforcement Learning, reinforcement-learning, +1

An Efficient Approach to the Online Multi-Agent Path Finding Problem by Using Sustainable Information

no code implementations • 11 Jan 2023 • Mingkai Tang, Boyi Liu, Yuanhang Li, Hongji Liu, Ming Liu, Lujia Wang

The low-level solver, the Sustainable Reverse Safe Interval Path Planning algorithm (SRSIPP), is an efficient single-agent solver that uses previous planning context to reduce duplicate calculations.

Computational Efficiency, Multi-Agent Path Finding

An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models

no code implementations • 30 Dec 2022 • Yufeng Zhang, Boyi Liu, Qi Cai, Lingxiao Wang, Zhaoran Wang

In particular, such a representation instantiates the posterior distribution of the latent variable given input tokens, which plays a central role in predicting output labels and solving downstream tasks.

Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL

no code implementations • 20 Sep 2022 • Fengzhuo Zhang, Boyi Liu, Kaixin Wang, Vincent Y. F. Tan, Zhuoran Yang, Zhaoran Wang

The cooperative Multi-Agent Reinforcement Learning (MARL) with permutation invariant agents framework has achieved tremendous empirical successes in real-world applications.

Relational Reasoning

Differentiable Bilevel Programming for Stackelberg Congestion Games

1 code implementation • 15 Sep 2022 • Jiayang Li, Jing Yu, Qianni Wang, Boyi Liu, Zhaoran Wang, Yu Marco Nie

A Stackelberg congestion game (SCG) is a bilevel program in which a leader aims to maximize their own gain by anticipating and manipulating the equilibrium state at which followers settle by playing a congestion game.

Dynamic Graph Learning Based on Hierarchical Memory for Origin-Destination Demand Prediction

1 code implementation • 29 May 2022 • Ruixing Zhang, Liangzhe Han, Boyi Liu, Jiayuan Zeng, Leilei Sun

Last, an objective function is designed to derive the future OD demands according to the most recent node representations, and also to tackle the data sparsity problem in OD prediction.

Graph Learning, Graph Representation Learning

BooVI: Provably Efficient Bootstrapped Value Iteration

no code implementations • NeurIPS 2021 • Boyi Liu, Qi Cai, Zhuoran Yang, Zhaoran Wang

Despite the tremendous success of reinforcement learning (RL) with function approximation, efficient exploration remains a significant challenge, both practically and theoretically.

Efficient Exploration, Reinforcement Learning (RL)

Inducing Equilibria via Incentives: Simultaneous Design-and-Play Ensures Global Convergence

no code implementations • 4 Oct 2021 • Boyi Liu, Jiayang Li, Zhuoran Yang, Hoi-To Wai, Mingyi Hong, Yu Marco Nie, Zhaoran Wang

To regulate a social system comprised of self-interested agents, economic incentives are often required to induce a desirable outcome.

Bilevel Optimization

Foreground Object Structure Transfer for Unsupervised Domain Adaptation

no code implementations • 14 Sep 2021 • Jieren Cheng, Le Liu, Xiangyan Tang, Wenxuan Tu, Boyi Liu, Ke Zhou, Qiaobo Da, Yue Yang

In practice, since the labels of the target domain are not available, we use the clustering information of the source domain to assign pseudo labels to the target domain samples, and then use prior knowledge from the source domain data to guide those positive features to maximize the inter-class distance and minimize the intra-class distance.

Clustering, Object, +1

Policy Optimization in Zero-Sum Markov Games: Fictitious Self-Play Provably Attains Nash Equilibria

no code implementations • 1 Jan 2021 • Boyi Liu, Zhuoran Yang, Zhaoran Wang

Specifically, in each iteration, each player infers the policy of the opponent implicitly via policy evaluation and improves its current policy by taking the smoothed best-response via a proximal policy optimization (PPO) step.
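
As a rough illustration of a smoothed best response, the sketch below uses the standard closed form of a KL-regularized policy update, in which the new policy is proportional to the old policy times the exponentiated action values. The step size, the two-action setting, and the value estimates are assumptions chosen for illustration, not the paper's exact algorithm.

```python
import math

def proximal_policy_step(policy, action_values, step_size):
    """KL-regularized update: new(a) is proportional to old(a) * exp(step_size * Q(a))."""
    weights = [p * math.exp(step_size * q) for p, q in zip(policy, action_values)]
    total = sum(weights)
    return [w / total for w in weights]

old_policy = [0.5, 0.5]
# Hypothetical action values, e.g. estimated by policy evaluation against the opponent.
q_values = [1.0, 0.0]

new_policy = proximal_policy_step(old_policy, q_values, step_size=1.0)
```

With a step size of zero the policy is unchanged, and as the step size grows the update approaches the greedy best response, which is the sense in which the step is "smoothed".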

A Real-time Contribution Measurement Method for Participants in Federated Learning

no code implementations • 28 Sep 2020 • Bingjie Yan, Yize Zhou, Boyi Liu, Jun Wang, Yuhan Zhang, Li Liu, Xiaolan Nie, Zhiwei Fan, Zhixuan Liang

However, there is a lack of a sufficiently reasonable contribution measurement mechanism to distribute the reward for each agent.

Federated Learning

Experiments of Federated Learning for COVID-19 Chest X-ray Images

no code implementations • 5 Jul 2020 • Boyi Liu, Bingjie Yan, Yize Zhou, Yifan Yang, Yixian Zhang

However, to protect and respect patient privacy, hospitals' specific medical data cannot be leaked or shared without permission.

Federated Learning

Neural Trust Region/Proximal Policy Optimization Attains Globally Optimal Policy

no code implementations • NeurIPS 2019 • Boyi Liu, Qi Cai, Zhuoran Yang, Zhaoran Wang

Proximal policy optimization and trust region policy optimization (PPO and TRPO) with actor and critic parametrized by neural networks achieve significant empirical success in deep reinforcement learning.

Reinforcement Learning (RL)

Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy

no code implementations • 25 Jun 2019 • Boyi Liu, Qi Cai, Zhuoran Yang, Zhaoran Wang

Proximal policy optimization and trust region policy optimization (PPO and TRPO) with actor and critic parametrized by neural networks achieve significant empirical success in deep reinforcement learning.

reinforcement-learning, Reinforcement Learning (RL)

Traffic Flow Combination Forecasting Method Based on Improved LSTM and ARIMA

no code implementations • 25 Jun 2019 • Boyi Liu, Xiangyan Tang, Jieren Cheng, Pengchao Shi

In this paper, we define the traffic data time singularity ratio in the dropout module and propose a combination prediction method based on the improved long short-term memory neural network and time series autoregressive integrated moving average model (SDLSTM-ARIMA), which is derived from the Recurrent Neural Networks (RNN) model.

Time Series, Time Series Analysis, +1

Lifelong Federated Reinforcement Learning: A Learning Architecture for Navigation in Cloud Robotic Systems

no code implementations • 19 Jan 2019 • Boyi Liu, Lujia Wang, Ming Liu

To address the problem, we present a learning architecture for navigation in cloud robotic systems: Lifelong Federated Reinforcement Learning (LFRL).

reinforcement-learning, Reinforcement Learning (RL), +2
