Search Results for author: Lipeng Wan

Found 14 papers, 2 papers with code

Enhancing Decision Transformer with Diffusion-Based Trajectory Branch Generation

no code implementations • 18 Nov 2024 • Zhihong Liu, Long Qian, Zeyang Liu, Lipeng Wan, Xingyu Chen, Xuguang Lan

Decision Transformer (DT) can learn effective policies from offline datasets by converting offline reinforcement learning (RL) into a supervised sequence modeling task, where trajectory elements are generated autoregressively conditioned on the return-to-go (RTG). However, the sequence modeling approach tends to learn policies that converge to sub-optimal trajectories within the dataset, because it lacks bridging data leading to better trajectories, even when conditioned on the highest RTG. To address this issue, we introduce Diffusion-Based Trajectory Branch Generation (BG), which expands the trajectories of the dataset with branches generated by a diffusion model. Each branch is generated from a segment of a trajectory in the dataset and leads to trajectories with higher returns; we concatenate the generated branch with the trajectory segment as an expansion of the trajectory. After expansion, DT has more opportunities to learn policies that move to better trajectories, preventing it from converging to the sub-optimal ones. Empirically, after processing with BG, DT outperforms state-of-the-art sequence modeling methods on the D4RL benchmark, demonstrating the effectiveness of adding branches to the dataset without further modifications.

Tasks: D4RL, Reinforcement Learning (RL)
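As a rough illustration of the branch-expansion loop the abstract describes, here is a minimal sketch; `diffusion_model` and its `sample` interface are assumed names for exposition, not the authors' actual API.

```python
import numpy as np

def expand_dataset(trajectories, diffusion_model, segment_len=20):
    """Append a generated high-return branch to a segment of each trajectory.

    `diffusion_model.sample` is a hypothetical interface standing in for the
    paper's trajectory diffusion model.
    """
    expanded = []
    for traj in trajectories:
        # Pick a segment of the logged trajectory to branch from.
        start = np.random.randint(0, max(1, len(traj) - segment_len))
        segment = traj[start:start + segment_len]
        # Sample a continuation conditioned on the segment, aimed at a
        # higher return-to-go than the logged suffix.
        branch = diffusion_model.sample(condition=segment)
        # The concatenated segment + branch becomes a new training trajectory.
        expanded.append(segment + branch)
    # DT is then trained on the original trajectories plus the expansions.
    return trajectories + expanded
```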

Grounded Answers for Multi-agent Decision-making Problem through Generative World Model

no code implementations • 3 Oct 2024 • Zeyang Liu, Xinrui Yang, Shiguang Sun, Long Qian, Lipeng Wan, Xingyu Chen, Xuguang Lan

The simulator is a world model that learns dynamics and reward separately: the dynamics model comprises an image tokenizer and a causal transformer that generates interaction transitions autoregressively, while the reward model is a bidirectional transformer trained by maximizing the likelihood of trajectories in the expert demonstrations under language guidance.

Tasks: Decision Making, Image Generation (+2 more)
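A minimal structural sketch in PyTorch of the two-part world model described above; the layer sizes, module names, and use of `nn.TransformerEncoder` are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

class WorldModelSketch(nn.Module):
    def __init__(self, vocab_size=512, dim=256):
        super().__init__()
        # Stand-in for the image tokenizer's codebook embedding.
        self.tok = nn.Embedding(vocab_size, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        # Dynamics: transformer applied with a causal mask (autoregressive).
        self.dynamics = nn.TransformerEncoder(layer, num_layers=4)
        # Reward: bidirectional transformer (no mask) with a scalar head.
        self.reward_enc = nn.TransformerEncoder(layer, num_layers=2)
        self.reward_head = nn.Linear(dim, 1)

    def forward(self, tokens):
        x = self.tok(tokens)
        causal = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        next_repr = self.dynamics(x, mask=causal)      # transition prediction
        reward = self.reward_head(self.reward_enc(x))  # per-step reward
        return next_repr, reward.squeeze(-1)
```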

Imagine, Initialize, and Explore: An Effective Exploration Method in Multi-Agent Reinforcement Learning

no code implementations • 28 Feb 2024 • Zeyang Liu, Lipeng Wan, Xinrui Yang, Zhuoran Chen, Xingyu Chen, Xuguang Lan

To address this limitation, we propose Imagine, Initialize, and Explore (IIE), a novel method that offers a promising solution for efficient multi-agent exploration in complex scenarios.

Tasks: Action Generation, SMAC+ (+1 more)

Spatiotemporally adaptive compression for scientific dataset with feature preservation -- a case study on simulation data with extreme climate events analysis

no code implementations • 6 Jan 2024 • Qian Gong, Chengzhu Zhang, Xin Liang, Viktor Reshniak, Jieyang Chen, Anand Rangarajan, Sanjay Ranka, Nicolas Vidal, Lipeng Wan, Paul Ullrich, Norbert Podhorszki, Robert Jacob, Scott Klasky

Additionally, we integrate spatiotemporal feature detection with data compression and demonstrate that performing adaptive error-bounded compression in higher dimensional space enables greater compression ratios, leveraging the error propagation theory of a transformation-based compressor.

Tasks: Data Compression
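For readers unfamiliar with the error-bounded compression primitive the abstract builds on, here is a toy sketch of uniform error-bounded quantization; the paper itself works with a transform-based compressor, so this is illustrative only.

```python
import numpy as np

def quantize(data, error_bound):
    # Uniform bins of width 2*error_bound guarantee |x - x_hat| <= error_bound.
    return np.round(data / (2 * error_bound)).astype(np.int64)

def dequantize(codes, error_bound):
    return codes * (2 * error_bound)

x = np.random.rand(1000)
eb = 1e-3
# The reconstruction error never exceeds the user-specified bound.
assert np.max(np.abs(x - dequantize(quantize(x, eb), eb))) <= eb
```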

High-performance Data Management for Whole Slide Image Analysis in Digital Pathology

no code implementations • 10 Aug 2023 • Haoju Leng, Ruining Deng, Shunxing Bao, Dazheng Fang, Bryan A. Millis, Yucheng Tang, Haichun Yang, Xiao Wang, Yifan Peng, Lipeng Wan, Yuankai Huo

The performance evaluation encompasses two key scenarios: (1) a pure CPU-based image analysis scenario ("CPU scenario"), and (2) a GPU-based deep learning framework scenario ("GPU scenario").

Tasks: Management, whole slide images

An Accelerated Pipeline for Multi-label Renal Pathology Image Segmentation at the Whole Slide Image Level

1 code implementation • 23 May 2023 • Haoju Leng, Ruining Deng, Zuhayr Asad, R. Michael Womick, Haichun Yang, Lipeng Wan, Yuankai Huo

Our proposed method makes a two-fold contribution: (1) a Docker image is released for end-to-end slide-wise multi-tissue segmentation of WSIs; and (2) the pipeline is deployed on a GPU to accelerate prediction, achieving better segmentation quality in less time.

Tasks: Image Segmentation, Segmentation (+2 more)

MMRDN: Consistent Representation for Multi-View Manipulation Relationship Detection in Object-Stacked Scenes

no code implementations • 25 Apr 2023 • Han Wang, Jiayuan Zhang, Lipeng Wan, Xingyu Chen, Xuguang Lan, Nanning Zheng

Manipulation relationship detection (MRD) aims to guide the robot to grasp objects in the right order, which is important for ensuring safe and reliable grasping in object-stacked scenes.

Tasks: Position, Relationship Detection

Greedy-based Value Representation for Efficient Coordination in Multi-agent Reinforcement Learning

no code implementations • 29 Sep 2021 • Lipeng Wan, Zeyang Liu, Xingyu Chen, Han Wang, Xuguang Lan

Due to the representation limitation of the joint Q value function, multi-agent reinforcement learning (MARL) methods with linear or monotonic value decomposition cannot ensure optimal consistency (i.e., the correspondence between the individual greedy actions and the maximal true Q value), leading to instability and poor coordination.

Tasks: Multi-agent Reinforcement Learning, reinforcement-learning (+1 more)
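The optimal-consistency failure can be seen on the classic non-monotonic matrix game often used in this literature (the payoffs below are the standard QTRAN-style example, not taken from this paper): a linear, VDN-style decomposition Q_tot(a1, a2) = q1[a1] + q2[a2] cannot make the individual greedy actions point at the true joint optimum.

```python
import numpy as np

# Joint payoff for two agents with three actions each; the optimum 8 sits
# at (0, 0) behind a "cliff" of -12 penalties.
payoff = np.array([[  8., -12., -12.],
                   [-12.,   0.,   0.],
                   [-12.,   0.,   0.]])

# Least-squares fit of an additive decomposition q1[a1] + q2[a2].
A = np.zeros((9, 6))
for a1 in range(3):
    for a2 in range(3):
        A[a1 * 3 + a2, a1] = 1.0      # indicator for agent 1's action
        A[a1 * 3 + a2, 3 + a2] = 1.0  # indicator for agent 2's action
q, *_ = np.linalg.lstsq(A, payoff.ravel(), rcond=None)
q1, q2 = q[:3], q[3:]

greedy = (int(np.argmax(q1)), int(np.argmax(q2)))
print(greedy)          # (1, 1): the individual greedy actions...
print(payoff[greedy])  # 0.0: ...achieve payoff 0,
print(payoff.max())    # 8.0: while the true joint optimum is 8.
```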

Multi-agent Policy Optimization with Approximatively Synchronous Advantage Estimation

no code implementations • 7 Dec 2020 • Lipeng Wan, Xuwei Song, Xuguang Lan, Nanning Zheng

General policy-based multi-agent reinforcement learning methods address this challenge by introducing differentiated value functions or advantage functions for individual agents.

Tasks: Multi-agent Reinforcement Learning, Starcraft
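As a representative example of the per-agent advantage functions the abstract refers to, here is a COMA-style counterfactual advantage for a single agent (a known estimator from prior work, not the approximately synchronous estimator this paper proposes).

```python
import numpy as np

def counterfactual_advantage(q_values, policy, taken_action):
    """COMA-style advantage for one agent.

    q_values[a]: joint Q with agent i's action swapped to a, others fixed.
    policy[a]:   agent i's probability of taking action a.
    """
    baseline = np.dot(policy, q_values)  # expected Q under agent i's own policy
    return q_values[taken_action] - baseline

q = np.array([1.0, 0.5, -0.2])   # hypothetical Q(s, (a_i, a_-i)) per action
pi = np.array([0.6, 0.3, 0.1])
adv = counterfactual_advantage(q, pi, taken_action=0)  # 1.0 - 0.73 = 0.27
```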

FTRANS: Energy-Efficient Acceleration of Transformers using FPGA

no code implementations • 16 Jul 2020 • Bingbing Li, Santosh Pandey, Haowen Fang, Yanjun Lyv, Ji Li, Jieyang Chen, Mimi Xie, Lipeng Wan, Hang Liu, Caiwen Ding

In natural language processing (NLP), the "Transformer" architecture was proposed as the first transduction model relying entirely on self-attention mechanisms, without using sequence-aligned recurrent neural networks (RNNs) or convolution, and it achieved significant improvements on sequence-to-sequence tasks.

Tasks: Model Compression
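For context, a minimal NumPy sketch of the scaled dot-product self-attention the Transformer relies on; this is the generic formulation, not FTRANS's FPGA kernels.

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])           # pairwise token affinities
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)         # softmax over keys
    return weights @ v                                # weighted mix of values

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))                          # 5 tokens, 16-dim embeddings
W = [rng.normal(size=(16, 16)) for _ in range(3)]
out = self_attention(x, *W)                           # (5, 16) contextualized tokens
```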
