Search Results for author: Xuguang Lan

Found 25 papers, 7 papers with code

Human Action Recognition Based on Spatial-Temporal Attention

no code implementations • ICLR 2019 • Wensong Chan, Zhiqiang Tian, Xuguang Lan

Many state-of-the-art methods for recognizing human actions are based on attention mechanisms, which underscores the importance of attention in action recognition.

Action Recognition Temporal Action Localization

PRISM: Projection-based Reward Integration for Scene-Aware Real-to-Sim-to-Real Transfer with Few Demonstrations

no code implementations • 29 Apr 2025 • Haowen Sun, Han Wang, Chengzhong Ma, Shaolong Zhang, Jiawei Ye, Xingyu Chen, Xuguang Lan

Learning from few demonstrations to develop policies robust to variations in robot initial positions and object poses is a problem of significant practical interest in robotics.

Imitation Learning Reinforcement Learning (RL)

Playing Non-Embedded Card-Based Games with Reinforcement Learning

1 code implementation • 7 Apr 2025 • Tianyang Wu, Lipeng Wan, Yuhang Wang, Qiang Wan, Xuguang Lan

Developing non-embedded agents remains challenging, especially in card-based RTS games with complex features and large state spaces.

Board Games Decision Making +6

Bootstrapped Model Predictive Control

2 code implementations • 24 Mar 2025 • Yuhang Wang, Hanwei Guo, Sizhe Wang, Long Qian, Xuguang Lan

In this work, we introduce Bootstrapped Model Predictive Control (BMPC), a novel algorithm that performs policy learning in a bootstrapped manner.

continuous-control Continuous Control +3

Enhancing Decision Transformer with Diffusion-Based Trajectory Branch Generation

no code implementations • 18 Nov 2024 • Zhihong Liu, Long Qian, Zeyang Liu, Lipeng Wan, Xingyu Chen, Xuguang Lan

Decision Transformer (DT) can learn an effective policy from offline datasets by converting offline reinforcement learning (RL) into a supervised sequence modeling task, in which trajectory elements are generated auto-regressively conditioned on the return-to-go (RTG). However, this sequence modeling approach tends to learn policies that converge on the sub-optimal trajectories within the dataset, because the dataset lacks bridging data leading to better trajectories, even when the condition is set to the highest RTG. To address this issue, we introduce Diffusion-Based Trajectory Branch Generation (BG), which expands the trajectories of the dataset with branches generated by a diffusion model. Each trajectory branch is generated from a segment of a trajectory within the dataset and leads to trajectories with higher returns. We concatenate the generated branch with the trajectory segment as an expansion of the trajectory. After this expansion, DT has more opportunities to learn policies that move toward better trajectories, preventing it from converging to the sub-optimal ones. Empirically, after processing with BG, DT outperforms state-of-the-art sequence modeling methods on the D4RL benchmark, demonstrating the effectiveness of adding branches to the dataset without further modifications.
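
The expansion step lends itself to a short sketch. Below is a minimal, hypothetical Python rendering of the branch-expansion loop the abstract describes; `generate_branch` stands in for the conditional diffusion model, and every name here is an assumption rather than the authors' implementation.

from typing import Callable, List, Tuple

Transition = Tuple[list, list, float]            # (state, action, reward)
Trajectory = List[Transition]

def expand_dataset(
    dataset: List[Trajectory],
    generate_branch: Callable[[Trajectory], Trajectory],
    segment_len: int,
) -> List[Trajectory]:
    """Expand each trajectory with a diffusion-generated branch (sketch)."""
    expanded = list(dataset)
    for traj in dataset:
        if len(traj) < segment_len:
            continue
        segment = traj[:segment_len]             # segment kept from the dataset
        branch = generate_branch(segment)        # branch conditioned on the segment
        expanded.append(segment + branch)        # concatenated expansion for DT
    return expanded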

D4RL Reinforcement Learning (RL)

Innovative Thinking, Infinite Humor: Humor Research of Large Language Models through Structured Thought Leaps

no code implementations • 14 Oct 2024 • Han Wang, Yilin Zhao, Dian Li, Xiaohan Wang, Gang Liu, Xuguang Lan, Hui Wang

Humor is a culturally nuanced aspect of human language that poses challenges for both understanding and generation, requiring creativity and strong associative thinking from its participants.

Math

Grounded Answers for Multi-agent Decision-making Problem through Generative World Model

no code implementations • 3 Oct 2024 • Zeyang Liu, Xinrui Yang, Shiguang Sun, Long Qian, Lipeng Wan, Xingyu Chen, Xuguang Lan

The simulator is a world model that learns dynamics and reward separately. The dynamics model comprises an image tokenizer and a causal transformer that generates interaction transitions autoregressively, while the reward model is a bidirectional transformer trained by maximizing the likelihood of expert demonstration trajectories under language guidance.
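
A skeletal PyTorch sketch of that two-part design follows, with the image tokenizer omitted; module sizes, names, and layer choices are assumptions for illustration, not the paper's architecture.

import torch
import torch.nn as nn

class CausalDynamics(nn.Module):
    """Autoregressive transition model over discrete image tokens (sketch)."""
    def __init__(self, vocab=1024, d_model=256, n_heads=8, n_layers=4, max_len=512):
        super().__init__()
        self.tok = nn.Embedding(vocab, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.body = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab)            # next-token logits

    def forward(self, tokens):                           # tokens: (B, T) int64
        T = tokens.shape[1]
        x = self.tok(tokens) + self.pos(torch.arange(T, device=tokens.device))
        causal = nn.Transformer.generate_square_subsequent_mask(T).to(tokens.device)
        return self.head(self.body(x, mask=causal))

class BidirectionalReward(nn.Module):
    """Bidirectional transformer scoring per-step rewards (sketch)."""
    def __init__(self, vocab=1024, d_model=256, n_heads=8, n_layers=2, max_len=512):
        super().__init__()
        self.tok = nn.Embedding(vocab, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.body = nn.TransformerEncoder(layer, n_layers)   # no causal mask
        self.head = nn.Linear(d_model, 1)

    def forward(self, tokens):
        T = tokens.shape[1]
        x = self.tok(tokens) + self.pos(torch.arange(T, device=tokens.device))
        return self.head(self.body(x)).squeeze(-1)       # (B, T) reward estimates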

Decision Making Image Generation +2

Relation DETR: Exploring Explicit Position Relation Prior for Object Detection

2 code implementations • 16 Jul 2024 • Xiuquan Hou, Meiqin Liu, Senlin Zhang, Ping Wei, Badong Chen, Xuguang Lan

Experimental results on the dataset illustrate that the proposed explicit position relation achieves a clear improvement of 1.3% AP, highlighting its potential for universal object detection.
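
The snippet does not spell out the relation prior itself, but a common way to make pairwise box geometry explicit in relation-based detectors is a log-scale relative position encoding; the sketch below illustrates that general idea and is not the paper's exact construction.

import torch

def box_relation_features(boxes: torch.Tensor) -> torch.Tensor:
    """Pairwise log-scale geometry between boxes given as (cx, cy, w, h).

    boxes: (N, 4) tensor; returns (N, N, 4) relation features that an
    attention head or MLP could embed as an explicit position prior.
    """
    cx, cy, w, h = boxes.unbind(-1)
    eps = 1e-6
    dx = (cx[None, :] - cx[:, None]).abs() / w[:, None]
    dy = (cy[None, :] - cy[:, None]).abs() / h[:, None]
    dw = w[None, :] / w[:, None]
    dh = h[None, :] / h[:, None]
    return torch.stack(
        [torch.log(dx + eps), torch.log(dy + eps), torch.log(dw), torch.log(dh)],
        dim=-1,
    )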

object-detection Object Detection +2

Imagine, Initialize, and Explore: An Effective Exploration Method in Multi-Agent Reinforcement Learning

no code implementations • 28 Feb 2024 • Zeyang Liu, Lipeng Wan, Xinrui Yang, Zhuoran Chen, Xingyu Chen, Xuguang Lan

To address this limitation, we propose Imagine, Initialize, and Explore (IIE), a novel method that offers a promising solution for efficient multi-agent exploration in complex scenarios.

Action Generation SMAC+ +1

ESMC: Entire Space Multi-Task Model for Post-Click Conversion Rate via Parameter Constraint

no code implementations • 18 Jul 2023 • Zhenhao Jiang, Biao Zeng, Hao Feng, Jin Liu, Jicong Fan, Jie Zhang, Jia Jia, Ning Hu, Xingyu Chen, Xuguang Lan

We propose a novel Entire Space Multi-Task Model for Post-Click Conversion Rate via Parameter Constraint (ESMC) and two alternatives: Entire Space Multi-Task Model with Siamese Network (ESMS) and Entire Space Multi-Task Model in Global Domain (ESMG) to address the PSC issue.

Decision Making Recommendation Systems +1

MMRDN: Consistent Representation for Multi-View Manipulation Relationship Detection in Object-Stacked Scenes

no code implementations • 25 Apr 2023 • Han Wang, Jiayuan Zhang, Lipeng Wan, Xingyu Chen, Xuguang Lan, Nanning Zheng

Manipulation relationship detection (MRD) aims to guide the robot to grasp objects in the right order, which is important for ensuring safe and reliable grasping in object-stacked scenes.

Position Relationship Detection

Greedy-based Value Representation for Efficient Coordination in Multi-agent Reinforcement Learning

no code implementations • 29 Sep 2021 • Lipeng Wan, Zeyang Liu, Xingyu Chen, Han Wang, Xuguang Lan

Due to the representation limitation of the joint Q value function, multi-agent reinforcement learning (MARL) methods with linear or monotonic value decomposition cannot ensure optimal consistency (i.e., the correspondence between the individual greedy actions and the maximal true Q value), leading to instability and poor coordination.
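
This representation limitation is easy to see in a two-agent matrix game. The sketch below fits the best additive (VDN-style) decomposition Q_tot = q1(a1) + q2(a2) to a classic non-monotonic payoff matrix and shows the decentralized greedy joint action missing the true optimum; the payoff values are the standard counterexample from the value-decomposition literature, not taken from this paper.

import numpy as np

# Non-monotonic two-agent matrix game: the optimal joint action (0, 0)
# pays 8, but miscoordination around it is heavily punished.
payoff = np.array([[  8.0, -12.0, -12.0],
                   [-12.0,   0.0,   0.0],
                   [-12.0,   0.0,   0.0]])

# Fit the best additive decomposition by least squares over joint actions.
n = payoff.shape[0]
A = np.zeros((n * n, 2 * n))
for i in range(n):
    for j in range(n):
        A[i * n + j, i] = 1.0        # indicator for agent 1's action
        A[i * n + j, n + j] = 1.0    # indicator for agent 2's action
theta, *_ = np.linalg.lstsq(A, payoff.ravel(), rcond=None)
q1, q2 = theta[:n], theta[n:]

greedy = (int(np.argmax(q1)), int(np.argmax(q2)))
optimal = np.unravel_index(np.argmax(payoff), payoff.shape)
print("greedy joint action:", greedy, "-> payoff", payoff[greedy])     # 0.0
print("optimal joint action:", optimal, "-> payoff", payoff[optimal])  # 8.0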

Multi-agent Reinforcement Learning reinforcement-learning +1

INVIGORATE: Interactive Visual Grounding and Grasping in Clutter

no code implementations • 25 Aug 2021 • Hanbo Zhang, Yunfan Lu, Cunjun Yu, David Hsu, Xuguang Lan, Nanning Zheng

This paper presents INVIGORATE, a robot system that interacts with humans through natural language and grasps a specified object in clutter.

Blocking Object +5

Multi-agent Policy Optimization with Approximatively Synchronous Advantage Estimation

no code implementations • 7 Dec 2020 • Lipeng Wan, Xuwei Song, Xuguang Lan, Nanning Zheng

General policy-based multi-agent reinforcement learning methods address this challenge by introducing differentiated value functions or advantage functions for individual agents.

Multi-agent Reinforcement Learning Starcraft

A Boundary Based Out-of-Distribution Classifier for Generalized Zero-Shot Learning

2 code implementations • ECCV 2020 • Xingyu Chen, Xuguang Lan, Fuchun Sun, Nanning Zheng

Using a gating mechanism that discriminates unseen samples from seen samples, the GZSL problem can be decomposed into a conventional Zero-Shot Learning (ZSL) problem and a supervised classification problem.
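
A minimal sketch of that decomposition follows, with the gate abstracted as a score-plus-threshold test (the paper learns a boundary for the gate, which this omits); all function names are hypothetical.

def gated_gzsl_predict(x, seen_score, seen_classifier, zsl_classifier, threshold=0.5):
    """Route a test sample through the gate (sketch).

    seen_score(x) -> float in [0, 1]: confidence that x comes from a seen
    class; the threshold stands in for the learned out-of-distribution boundary.
    """
    if seen_score(x) >= threshold:
        return seen_classifier(x)   # supervised classification over seen classes
    return zsl_classifier(x)        # conventional ZSL over unseen classes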

Generalized Zero-Shot Learning

REGNet: REgion-based Grasp Network for End-to-end Grasp Detection in Point Clouds

1 code implementation • 28 Feb 2020 • Binglei Zhao, Hanbo Zhang, Xuguang Lan, Haoyu Wang, Zhiqiang Tian, Nanning Zheng

Reliable robotic grasping in unstructured environments is a crucial but challenging task.

Robotics

Hindsight Trust Region Policy Optimization

1 code implementation • 29 Jul 2019 • Hanbo Zhang, Site Bai, Xuguang Lan, David Hsu, Nanning Zheng

We propose Hindsight Trust Region Policy Optimization (HTRPO), a new RL algorithm that extends the highly successful TRPO algorithm with hindsight to tackle the challenge of sparse rewards.
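
The hindsight ingredient can be sketched as HER-style goal relabeling; HTRPO additionally folds such relabeled data into trust-region policy updates, which the sketch below omits. All names are illustrative, not the paper's code.

import random

def relabel_with_hindsight(episode, reward_fn, k=4):
    """Hindsight goal relabeling over one episode (sketch).

    episode: list of (state, action, achieved_goal) tuples.
    reward_fn(achieved_goal, goal) -> float, the sparse reward signal.
    Returns (state, action, substituted_goal, reward) tuples.
    """
    relabeled = []
    for t, (state, action, achieved) in enumerate(episode):
        for _ in range(k):
            # Substitute a goal actually achieved later in the episode,
            # turning a failed rollout into useful supervision.
            _, _, new_goal = random.choice(episode[t:])
            relabeled.append((state, action, new_goal, reward_fn(achieved, new_goal)))
    return relabeled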

Atari Games Policy Gradient Methods +2

A Real-time Robotic Grasp Approach with Oriented Anchor Box

no code implementations • 8 Sep 2018 • Hanbo Zhang, Xinwen Zhou, Xuguang Lan, Jin Li, Zhiqiang Tian, Nanning Zheng

The main component of our approach is a grasp detection network with oriented anchor boxes as detection priors.
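
A short sketch of tiling such oriented priors over a feature map follows, assuming one scale and a fixed set of discrete angles; the specific size, stride, aspect ratio, and angles are illustrative, not the paper's configuration.

import numpy as np

def oriented_anchors(feat_h, feat_w, stride,
                     sizes=(54.0,), angles_deg=(-75, -45, -15, 15, 45, 75)):
    """Tile oriented anchors (cx, cy, w, h, theta) over a feature map (sketch)."""
    ys, xs = np.mgrid[0:feat_h, 0:feat_w]
    centers = np.stack([(xs + 0.5) * stride, (ys + 0.5) * stride], -1).reshape(-1, 2)
    anchors = [
        (cx, cy, s, s / 2.0, np.deg2rad(a))    # 2:1 aspect ratio, assumed
        for cx, cy in centers
        for s in sizes
        for a in angles_deg
    ]
    return np.asarray(anchors)   # shape: (H * W * len(sizes) * len(angles), 5)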

Robotics

ROI-based Robotic Grasp Detection for Object Overlapping Scenes

no code implementations • 30 Aug 2018 • Hanbo Zhang, Xuguang Lan, Site Bai, Xinwen Zhou, Zhiqiang Tian, Nanning Zheng

Experimental results demonstrate that ROI-GD performs much better in object-overlapping scenes while remaining comparable with state-of-the-art grasp detection algorithms on the Cornell Grasp Dataset and the Jacquard Dataset.

Robotics