no code implementations • ICLR 2019 • Wensong Chan, Zhiqiang Tian, Xuguang Lan
Many state-of-the-art methods for recognizing human actions are based on attention mechanisms, which shows the importance of attention in action recognition.
no code implementations • 29 Apr 2025 • Haowen Sun, Han Wang, Chengzhong Ma, Shaolong Zhang, Jiawei Ye, Xingyu Chen, Xuguang Lan
Learning from few demonstrations to develop policies robust to variations in robot initial positions and object poses is a problem of significant practical interest in robotics.
1 code implementation • 7 Apr 2025 • Tianyang Wu, Lipeng Wan, Yuhang Wang, Qiang Wan, Xuguang Lan
Developing complex non-embedded agents remains challenging, especially in card-based RTS games with complex features and large state spaces.
2 code implementations • 24 Mar 2025 • Yuhang Wang, Hanwei Guo, Sizhe Wang, Long Qian, Xuguang Lan
In this work, we introduce Bootstrapped Model Predictive Control (BMPC), a novel algorithm that performs policy learning in a bootstrapped manner.
no code implementations • 18 Nov 2024 • Zhihong Liu, Long Qian, Zeyang Liu, Lipeng Wan, Xingyu Chen, Xuguang Lan
Decision Transformer (DT) can learn effective policies from offline datasets by converting offline reinforcement learning (RL) into a supervised sequence modeling task, where the trajectory elements are generated auto-regressively conditioned on the return-to-go (RTG). However, the sequence modeling approach tends to learn policies that converge on the sub-optimal trajectories within the dataset, because the dataset lacks bridging data for moving to better trajectories, even if the condition is set to the highest RTG. To address this issue, we introduce Diffusion-Based Trajectory Branch Generation (BG), which expands the trajectories of the dataset with branches generated by a diffusion model. The trajectory branch is generated based on a segment of a trajectory within the dataset, and leads to trajectories with higher returns. We concatenate the generated branch with the trajectory segment as an expansion of the trajectory. After expansion, DT has more opportunities to learn policies that move to better trajectories, preventing it from converging to the sub-optimal trajectories. Empirically, after processing with BG, DT outperforms state-of-the-art sequence modeling methods on the D4RL benchmark, demonstrating the effectiveness of adding branches to the dataset without further modifications.
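The return-to-go conditioning and the segment-plus-branch concatenation described above can be sketched as follows. This is a minimal illustration only: the `Step` type, the function names, and the hand-written branch are hypothetical, the RTG is assumed to be undiscounted, and the branch stands in for the output of a diffusion model that is not shown here.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    """One transition of a trajectory (hypothetical container for this sketch)."""
    state: int
    action: int
    reward: float


def returns_to_go(rewards: List[float]) -> List[float]:
    """Undiscounted sum of the current and all future rewards at each timestep."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Accumulate from the last step backwards.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


def expand_with_branch(segment: List[Step], branch: List[Step]) -> List[Step]:
    """Concatenate a dataset segment with a generated branch."""
    return list(segment) + list(branch)


# A segment taken from the dataset and a branch that, in the paper's setting,
# would come from a generative model (hand-written here as a placeholder).
segment = [Step(0, 1, 1.0), Step(1, 0, 0.0)]
branch = [Step(2, 1, 2.0), Step(3, 1, 3.0)]

expanded = expand_with_branch(segment, branch)
rtg = returns_to_go([step.reward for step in expanded])
```

Recomputing the RTG over the expanded trajectory gives the early steps of the segment a higher conditioning target than they had in the original data, which is the property the abstract relies on.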
no code implementations • 14 Oct 2024 • Han Wang, Yilin Zhao, Dian Li, Xiaohan Wang, Gang Liu, Xuguang Lan, Hui Wang
Humor is a culturally nuanced aspect of human language that presents challenges for understanding and generation, requiring participants to possess good creativity and strong associative thinking.
no code implementations • 3 Oct 2024 • Zeyang Liu, Xinrui Yang, Shiguang Sun, Long Qian, Lipeng Wan, Xingyu Chen, Xuguang Lan
The simulator is a world model that separately learns dynamics and reward, where the dynamics model comprises an image tokenizer as well as a causal transformer to generate interaction transitions autoregressively, and the reward model is a bidirectional transformer learned by maximizing the likelihood of trajectories in the expert demonstrations under language guidance.
2 code implementations • 16 Jul 2024 • Xiuquan Hou, Meiqin Liu, Senlin Zhang, Ping Wei, Badong Chen, Xuguang Lan
The experimental results on the dataset illustrate that the proposed explicit position relation achieves a clear improvement of 1.3% AP, highlighting its potential towards universal object detection.
Ranked #1 on Object Detection on SA-Det-100k
no code implementations • 28 Feb 2024 • Zeyang Liu, Lipeng Wan, Xinrui Yang, Zhuoran Chen, Xingyu Chen, Xuguang Lan
To address this limitation, we propose Imagine, Initialize, and Explore (IIE), a novel method that offers a promising solution for efficient multi-agent exploration in complex scenarios.
no code implementations • 18 Jul 2023 • Zhenhao Jiang, Biao Zeng, Hao Feng, Jin Liu, Jicong Fan, Jie Zhang, Jia Jia, Ning Hu, Xingyu Chen, Xuguang Lan
We propose a novel Entire Space Multi-Task Model for Post-Click Conversion Rate via Parameter Constraint (ESMC) and two alternatives: Entire Space Multi-Task Model with Siamese Network (ESMS) and Entire Space Multi-Task Model in Global Domain (ESMG) to address the PSC issue.
no code implementations • 25 Apr 2023 • Han Wang, Jiayuan Zhang, Lipeng Wan, Xingyu Chen, Xuguang Lan, Nanning Zheng
Manipulation relationship detection (MRD) aims to guide the robot to grasp objects in the right order, which is important for ensuring the safety and reliability of grasping in object-stacked scenes.
no code implementations • 22 Nov 2022 • Lipeng Wan, Zeyang Liu, Xingyu Chen, Xuguang Lan, Nanning Zheng
To ensure optimal consistency, the optimal node is required to be the unique STN.
Multi-agent Reinforcement Learning
reinforcement-learning
no code implementations • 29 Sep 2021 • Lipeng Wan, Zeyang Liu, Xingyu Chen, Han Wang, Xuguang Lan
Due to the representation limitation of the joint Q value function, multi-agent reinforcement learning (MARL) methods with linear or monotonic value decomposition cannot ensure optimal consistency (i.e., the correspondence between the individual greedy actions and the maximal true Q value), leading to instability and poor coordination.
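The notion of optimal consistency defined above can be checked directly in a small two-agent matrix game. The sketch below is illustrative only: the payoff table and the per-agent utilities are made-up numbers, not results from the paper, and all function names are hypothetical.

```python
from typing import List, Tuple


def joint_argmax(q_joint: List[List[float]]) -> Tuple[int, int]:
    """Joint action with the maximal true Q value (first one on ties)."""
    candidates = [
        (i, j) for i in range(len(q_joint)) for j in range(len(q_joint[0]))
    ]
    return max(candidates, key=lambda ij: q_joint[ij[0]][ij[1]])


def individual_greedy(q1: List[float], q2: List[float]) -> Tuple[int, int]:
    """Each agent independently picks the action maximizing its own utility."""
    a1 = max(range(len(q1)), key=lambda a: q1[a])
    a2 = max(range(len(q2)), key=lambda a: q2[a])
    return a1, a2


def is_optimally_consistent(
    q_joint: List[List[float]], q1: List[float], q2: List[float]
) -> bool:
    """True when the individual greedy actions hit the maximal true Q value."""
    return individual_greedy(q1, q2) == joint_argmax(q_joint)


# Hypothetical payoff table: the optimum (0, 0) is surrounded by penalties.
payoff = [
    [8.0, -12.0, -12.0],
    [-12.0, 0.0, 0.0],
    [-12.0, 0.0, 0.0],
]

# Hypothetical utilities that steer both agents away from action 0.
inconsistent = is_optimally_consistent(payoff, [-6.0, -1.0, -1.5], [-6.0, -1.0, -1.5])

# Hypothetical utilities whose greedy actions coincide with the optimum.
consistent = is_optimally_consistent(payoff, [1.0, 0.0, 0.0], [1.0, 0.0, 0.0])
```

In the first case the greedy joint action is (1, 1) while the true optimum is (0, 0), which is the kind of mismatch the abstract attributes to limited value decompositions.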
Multi-agent Reinforcement Learning
reinforcement-learning
no code implementations • 29 Aug 2021 • Xun Tan, Xingyu Chen, Guowei Zhang, Jishiyu Ding, Xuguang Lan
Fusing the two kinds of data usually helps to improve the detection results.
no code implementations • 25 Aug 2021 • Hanbo Zhang, Yunfan Lu, Cunjun Yu, David Hsu, Xuguang Lan, Nanning Zheng
This paper presents INVIGORATE, a robot system that interacts with humans through natural language and grasps a specified object in clutter.
no code implementations • 14 Jul 2021 • Jie Xu, Xingyu Chen, Xuguang Lan, Nanning Zheng
The experimental results show that our approach makes the interaction more efficient and safer.
1 code implementation • 29 Apr 2021 • Hanbo Zhang, Deyu Yang, Han Wang, Binglei Zhao, Xuguang Lan, Jishiyu Ding, Nanning Zheng
In this paper, we present a new dataset named REGRAD for the learning of relationships among objects and grasps.
no code implementations • 7 Dec 2020 • Lipeng Wan, Xuwei Song, Xuguang Lan, Nanning Zheng
To solve this challenge, general methods for policy-based multi-agent reinforcement learning introduce differentiated value functions or advantage functions for individual agents.
2 code implementations • ECCV 2020 • Xingyu Chen, Xuguang Lan, Fuchun Sun, Nanning Zheng
Using a gating mechanism that discriminates the unseen samples from the seen samples can decompose the Generalized Zero-Shot Learning (GZSL) problem into a conventional Zero-Shot Learning (ZSL) problem and a supervised classification problem.
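The decomposition described above amounts to routing each sample to one of two classifiers. The sketch below assumes a scalar gate score compared against a threshold; the scoring rule, the threshold, the toy classifiers, and all names are hypothetical and are not the paper's gating design.

```python
from typing import Callable, Sequence

Features = Sequence[float]


def gated_predict(
    features: Features,
    gate_score: Callable[[Features], float],
    threshold: float,
    seen_classifier: Callable[[Features], str],
    unseen_classifier: Callable[[Features], str],
) -> str:
    """Route a sample to the supervised or the zero-shot classifier."""
    if gate_score(features) >= threshold:
        # Judged to come from a seen class: ordinary supervised classification.
        return seen_classifier(features)
    # Judged to come from an unseen class: conventional ZSL over unseen labels.
    return unseen_classifier(features)


# Toy stand-ins for this sketch: the gate treats the first feature as a
# "seen-ness" score, and both classifiers return a fixed label.
def toy_gate(x: Features) -> float:
    return x[0]


def toy_seen(x: Features) -> str:
    return "seen:horse"


def toy_unseen(x: Features) -> str:
    return "unseen:zebra"


label_a = gated_predict([0.9, 0.1], toy_gate, 0.5, toy_seen, toy_unseen)
label_b = gated_predict([0.2, 0.7], toy_gate, 0.5, toy_seen, toy_unseen)
```

With this split, errors of the overall system come from either the gate or the selected classifier, which is why the quality of the seen/unseen discrimination matters.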
1 code implementation • 28 Feb 2020 • Binglei Zhao, Hanbo Zhang, Xuguang Lan, Haoyu Wang, Zhiqiang Tian, Nanning Zheng
Reliable robotic grasping in unstructured environments is a crucial but challenging task.
Robotics
1 code implementation • 29 Jul 2019 • Hanbo Zhang, Site Bai, Xuguang Lan, David Hsu, Nanning Zheng
We propose Hindsight Trust Region Policy Optimization (HTRPO), a new RL algorithm that extends the highly successful TRPO algorithm with hindsight to tackle the challenge of sparse rewards.
no code implementations • 19 Sep 2018 • Hanbo Zhang, Xuguang Lan, Site Bai, Lipeng Wan, Chenjie Yang, Nanning Zheng
Autonomous robotic grasping plays an important role in intelligent robotics.
Robotics
no code implementations • 8 Sep 2018 • Hanbo Zhang, Xinwen Zhou, Xuguang Lan, Jin Li, Zhiqiang Tian, Nanning Zheng
The main component of our approach is a grasp detection network with oriented anchor boxes as detection priors.
Robotics
no code implementations • 30 Aug 2018 • Hanbo Zhang, Xuguang Lan, Site Bai, Xinwen Zhou, Zhiqiang Tian, Nanning Zheng
Experimental results demonstrate that ROI-GD performs much better in object-overlapping scenes and, at the same time, remains comparable with state-of-the-art grasp detection algorithms on the Cornell Grasp Dataset and the Jacquard Dataset.
Robotics
no code implementations • 6 Mar 2018 • Xinwen Zhou, Xuguang Lan, Hanbo Zhang, Zhiqiang Tian, Yang Zhang, Nanning Zheng
The feature extractor is a deep convolutional neural network.