Reparameterized Policy Learning for Multimodal Trajectory Optimization

no code implementations • 20 Jul 2023 • Zhiao Huang, Litian Liang, Zhan Ling, Xuanlin Li, Chuang Gan, Hao Su

We then present a practical model-based RL method, called Reparameterized Policy Gradient (RPG), which leverages the multimodal policy parameterization and learned world model to achieve strong exploration capabilities and high data efficiency.

Reinforcement Learning (RL)

Distilling Large Vision-Language Model with Out-of-Distribution Generalizability

1 code implementation • ICCV 2023 • Xuanlin Li, Yunhao Fang, Minghua Liu, Zhan Ling, Zhuowen Tu, Hao Su

Model distillation, the process of creating smaller, faster models that maintain the performance of larger models, is a promising direction towards the solution.

Few-Shot Image Classification • Knowledge Distillation • +7

On the Efficacy of 3D Point Cloud Reinforcement Learning

1 code implementation • 11 Jun 2023 • Zhan Ling, Yunchao Yao, Xuanlin Li, Hao Su

Recent studies on visual reinforcement learning (visual RL) have explored the use of 3D visual representations.

3D Point Cloud Reinforcement Learning • Inductive Bias • +3

Deductive Verification of Chain-of-Thought Reasoning

1 code implementation • NeurIPS 2023 • Zhan Ling, Yunhao Fang, Xuanlin Li, Zhiao Huang, Mingu Lee, Roland Memisevic, Hao Su

In light of this, we propose to decompose a reasoning verification process into a series of step-by-step subprocesses, each receiving only its necessary context and premises.

Logical Reasoning

Frame Mining: a Free Lunch for Learning Robotic Manipulation from 3D Point Clouds

1 code implementation • 14 Oct 2022 • Minghua Liu, Xuanlin Li, Zhan Ling, Yangyan Li, Hao Su

We study how choices of input point cloud coordinate frames impact learning of manipulation skills from 3D point clouds.

3D Point Cloud Reinforcement Learning • Imitation Learning • +2

Improving Policy Optimization with Generalist-Specialist Learning

1 code implementation • 26 Jun 2022 • Zhiwei Jia, Xuanlin Li, Zhan Ling, Shuang Liu, Yiran Wu, Hao Su

Generalization in deep reinforcement learning over unseen environment variations usually requires policy learning over a large set of diverse training variations.

Imitation Learning

Approximate Convex Decomposition for 3D Meshes with Collision-Aware Concavity and Tree Search

1 code implementation • 5 May 2022 • Xinyue Wei, Minghua Liu, Zhan Ling, Hao Su

Approximate convex decomposition aims to decompose a 3D shape into a set of almost convex components, whose convex hulls can then be used to represent the input shape.

ManiSkill: Generalizable Manipulation Skill Benchmark with Large-Scale Demonstrations

3 code implementations • 30 Jul 2021 • Tongzhou Mu, Zhan Ling, Fanbo Xiang, Derek Yang, Xuanlin Li, Stone Tao, Zhiao Huang, Zhiwei Jia, Hao Su

Here we propose the SAPIEN Manipulation Skill Benchmark (ManiSkill) to benchmark manipulation skills over diverse objects in a full-physics simulator.

