1 code implementation • 5 Sep 2024 • JianGuo Zhang, Tian Lan, Ming Zhu, Zuxin Liu, Thai Hoang, Shirley Kokane, Weiran Yao, Juntao Tan, Akshara Prabhakar, Haolin Chen, Zhiwei Liu, Yihao Feng, Tulika Awalgaonkar, Rithesh Murthy, Eric Hu, Zeyuan Chen, Ran Xu, Juan Carlos Niebles, Shelby Heinecke, Huan Wang, Silvio Savarese, Caiming Xiong
By releasing the xLAM series, we aim to advance the performance of open-source LLMs for autonomous AI agents, potentially accelerating progress and democratizing access to high-performance models for agent tasks.
no code implementations • 13 Aug 2024 • Kexun Zhang, Weiran Yao, Zuxin Liu, Yihao Feng, Zhiwei Liu, Rithesh Murthy, Tian Lan, Lei Li, Renze Lou, Jiacheng Xu, Bo Pang, Yingbo Zhou, Shelby Heinecke, Silvio Savarese, Huan Wang, Caiming Xiong
For instance, a group of open-source SWE agents, with a maximum individual resolve rate of 27.3% on SWE-Bench Lite, can achieve a 34.3% resolve rate with DEI, a 25% relative improvement that beats most closed-source solutions.
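A quick arithmetic check of the quoted figures, using only the two resolve rates stated above:

```python
# DEI's 34.3% resolve rate relative to the best individual agent's 27.3%.
best_individual = 0.273
with_dei = 0.343
print(f"relative improvement: {(with_dei - best_individual) / best_individual:.1%}")
# -> relative improvement: 25.6%
```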
no code implementations • 26 Jun 2024 • Zuxin Liu, Thai Hoang, JianGuo Zhang, Ming Zhu, Tian Lan, Shirley Kokane, Juntao Tan, Weiran Yao, Zhiwei Liu, Yihao Feng, Rithesh Murthy, Liangwei Yang, Silvio Savarese, Juan Carlos Niebles, Huan Wang, Shelby Heinecke, Caiming Xiong
The advancement of function-calling agent models requires diverse, reliable, and high-quality datasets.
no code implementations • 25 Jun 2024 • Jesse Zhang, Minho Heo, Zuxin Liu, Erdem Biyik, Joseph J Lim, Yao Liu, Rasool Fakoor
Prior work in skill-based RL either requires expert supervision to define useful skills, which is hard to scale, or learns a skill-space from offline data with heuristics that limit the adaptability of the skills, making them difficult to transfer during downstream RL.
no code implementations • 12 Jun 2024 • Rithesh Murthy, Liangwei Yang, Juntao Tan, Tulika Manoj Awalgaonkar, Yilun Zhou, Shelby Heinecke, Sachin Desai, Jason Wu, Ran Xu, Sarah Tan, JianGuo Zhang, Zhiwei Liu, Shirley Kokane, Zuxin Liu, Ming Zhu, Huan Wang, Caiming Xiong, Silvio Savarese
The deployment of Large Language Models (LLMs) and Large Multimodal Models (LMMs) on mobile devices has gained significant attention due to the benefits of enhanced privacy, stability, and personalization.
1 code implementation • 20 May 2024 • Zhepeng Cen, Yihang Yao, Zuxin Liu, Ding Zhao
A key obstacle in this endeavor is the estimation of safety constraints, which is typically more difficult than estimating a reward metric due to the sparse nature of the constraint signals.
1 code implementation • 23 Feb 2024 • Zhiwei Liu, Weiran Yao, JianGuo Zhang, Liangwei Yang, Zuxin Liu, Juntao Tan, Prafulla K. Choubey, Tian Lan, Jason Wu, Huan Wang, Shelby Heinecke, Caiming Xiong, Silvio Savarese
Thus, we open-source a new AI agent library, AgentLite, which simplifies this process by offering a lightweight, user-friendly platform for innovating LLM agent reasoning, architectures, and applications.
2 code implementations • 23 Feb 2024 • JianGuo Zhang, Tian Lan, Rithesh Murthy, Zhiwei Liu, Weiran Yao, Juntao Tan, Thai Hoang, Liangwei Yang, Yihao Feng, Zuxin Liu, Tulika Awalgaonkar, Juan Carlos Niebles, Silvio Savarese, Shelby Heinecke, Huan Wang, Caiming Xiong
It meticulously standardizes and unifies these trajectories into a consistent format, streamlining the creation of a generic data loader optimized for agent training.
1 code implementation • 16 Jan 2024 • Zhepeng Cen, Zuxin Liu, Zitong Wang, Yihang Yao, Henry Lam, Ding Zhao
Offline reinforcement learning (RL) offers a promising direction for learning policies from pre-collected datasets without requiring further interactions with the environment.
no code implementations • 23 Dec 2023 • Yihang Yao, Zuxin Liu, Zhepeng Cen, Peide Huang, Tingnan Zhang, Wenhao Yu, Ding Zhao
Leveraging insights from this framework and recognizing the significance of redundant and conflicting constraint conditions, we introduce the Gradient Shaping (GradS) method for general Lagrangian-based safe RL algorithms to improve the training efficiency in terms of both reward and constraint satisfaction.
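For context, here is a minimal sketch of the generic Lagrangian-based primal-dual update that methods like GradS build on. This is an illustrative toy with stand-in gradients and cost values, not the paper's GradS implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
theta = rng.normal(size=4)          # toy policy parameters
lambdas = np.zeros(2)               # one Lagrange multiplier per constraint
thresholds = np.array([0.1, 0.1])   # cost limits d_i
lr_theta, lr_lambda = 1e-2, 1e-2

def reward_grad(theta):
    # Stand-in for a policy-gradient estimate of the reward objective.
    return -theta

def cost_grads_and_values(theta):
    # Stand-ins for per-constraint cost gradients and cost estimates.
    grads = np.stack([np.ones_like(theta), -np.ones_like(theta)])
    values = np.array([theta.sum(), -theta.sum()])
    return grads, values

for _ in range(1000):
    g_r = reward_grad(theta)
    g_c, costs = cost_grads_and_values(theta)
    # Primal step: ascend the Lagrangian L = J_r - sum_i lambda_i * (J_ci - d_i).
    theta = theta + lr_theta * (g_r - lambdas @ g_c)
    # Dual step: each multiplier grows while its constraint is violated.
    lambdas = np.maximum(0.0, lambdas + lr_lambda * (costs - thresholds))

print("theta:", theta, "lambdas:", lambdas)
```

Note that the two stand-in constraints above have opposing gradients, i.e. they conflict; this is exactly the regime where the vanilla update can stall or oscillate, and where reshaping the constraint gradients, as GradS proposes, is meant to help.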
no code implementations • 31 Oct 2023 • Haohong Lin, Wenhao Ding, Zuxin Liu, Yaru Niu, Jiacheng Zhu, Yuming Niu, Ding Zhao
However, maintaining safety in diverse safety-critical scenarios remains a significant challenge due to long-tailed and unforeseen scenarios absent from offline datasets.
no code implementations • 10 Oct 2023 • Fan Yang, Wenxuan Zhou, Zuxin Liu, Ding Zhao, David Held
This work introduces a novel approach that combines RL with trajectory optimization to manage this trade-off effectively.
no code implementations • 9 Oct 2023 • Zuxin Liu, Jesse Zhang, Kavosh Asadi, Yao Liu, Ding Zhao, Shoham Sabach, Rasool Fakoor
Inspired by recent advancements in parameter-efficient fine-tuning in language domains, we explore efficient fine-tuning techniques -- e.g., Bottleneck Adapters, P-Tuning, and Low-Rank Adaptation (LoRA) -- in TAIL to adapt large pretrained models for new tasks with limited demonstration data.
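As a concrete illustration of the LoRA idea named above, here is a minimal, self-contained sketch: a frozen linear layer augmented with a trainable low-rank update. The layer sizes, rank, and scaling are illustrative assumptions, not values from the TAIL paper:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features: int, out_features: int,
                 rank: int = 8, alpha: float = 16.0):
        super().__init__()
        # Frozen pretrained weight: only lora_a and lora_b receive gradients.
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)
        self.base.bias.requires_grad_(False)
        # Low-rank update: effective weight is W + (alpha / rank) * B @ A.
        self.lora_a = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(512, 512, rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable params: {trainable}")  # 2 * 8 * 512 = 8192 vs. 262144 frozen
```

Because lora_b starts at zero, the adapted layer initially reproduces the pretrained output exactly, and only the small A/B factors are updated during fine-tuning.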
no code implementations • NeurIPS 2023 • Yihang Yao, Zuxin Liu, Zhepeng Cen, Jiacheng Zhu, Wenhao Yu, Tingnan Zhang, Ding Zhao
Safe reinforcement learning (RL) focuses on training reward-maximizing agents subject to pre-defined safety constraints.
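For reference, this setting is usually formalized as a constrained Markov decision process; the objective below is the standard textbook formulation, not notation taken from this particular paper:

```latex
\max_{\pi}\; \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right]
\quad \text{s.t.} \quad
\mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, c_i(s_t, a_t)\right] \le d_i,
\qquad i = 1, \dots, m
```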
1 code implementation • 22 Sep 2023 • Hanjiang Hu, Zuxin Liu, Linyi Li, Jiacheng Zhu, Ding Zhao
The current certification process for assessing robustness is costly and time-consuming due to the extensive number of image projections required for Monte Carlo sampling in the 3D camera motion space.
1 code implementation • NeurIPS 2023 • Konwoo Kim, Gokul Swamy, Zuxin Liu, Ding Zhao, Sanjiban Choudhury, Zhiwei Steven Wu
Regardless of the particular task we want them to perform in an environment, there are often shared safety constraints we want our agents to respect.
3 code implementations • 15 Jun 2023 • Zuxin Liu, Zijian Guo, Haohong Lin, Yihang Yao, Jiacheng Zhu, Zhepeng Cen, Hanjiang Hu, Wenhao Yu, Tingnan Zhang, Jie Tan, Ding Zhao
This paper presents a comprehensive benchmarking suite tailored to offline safe reinforcement learning (RL) challenges, aiming to foster progress in the development and evaluation of safe learning algorithms in both the training and deployment phases.
1 code implementation • 14 Feb 2023 • Zuxin Liu, Zijian Guo, Yihang Yao, Zhepeng Cen, Wenhao Yu, Tingnan Zhang, Ding Zhao
Safe reinforcement learning (RL) trains a constraint-satisfying policy by interacting with the environment.
2 code implementations • 4 Oct 2022 • Hanjiang Hu, Zuxin Liu, Linyi Li, Jiacheng Zhu, Ding Zhao
To this end, we study the robustness of the visual perception model under camera motion perturbations to investigate the influence of camera motion on robotic perception.
no code implementations • 16 Sep 2022 • Mengdi Xu, Zuxin Liu, Peide Huang, Wenhao Ding, Zhepeng Cen, Bo Li, Ding Zhao
A trustworthy reinforcement learning algorithm should be competent in solving challenging real-world problems, including robustly handling uncertainties, satisfying safety constraints to avoid catastrophic failures, and generalizing to unseen scenarios during deployment.
1 code implementation • 29 May 2022 • Zuxin Liu, Zijian Guo, Zhepeng Cen, Huan Zhang, Jie Tan, Bo Li, Ding Zhao
One interesting and counter-intuitive finding is that the maximum-reward attack is strong: it can both induce unsafe behaviors and remain stealthy by maintaining the reward.
2 code implementations • 28 Jan 2022 • Zuxin Liu, Zhepeng Cen, Vladislav Isenbaev, Wei Liu, Zhiwei Steven Wu, Bo Li, Ding Zhao
Safe reinforcement learning (RL) aims to learn policies that satisfy certain constraints before deploying them to safety-critical applications.
1 code implementation • CVPR 2022 • Hanjiang Hu, Zuxin Liu, Sharad Chitlangia, Akhil Agnihotri, Ding Zhao
To this end, we introduce an easy-to-compute information-theoretic surrogate metric to quickly and quantitatively evaluate LiDAR placement for 3D detection of different object types.
no code implementations • 2 Jan 2021 • Baiming Chen, Zuxin Liu, Jiacheng Zhu, Mengdi Xu, Wenhao Ding, Ding Zhao
The algorithm is evaluated in realistic safety-critical environments with non-stationary disturbances.
1 code implementation • 9 Nov 2020 • Hanjiang Hu, Baoquan Yang, Zhijian Qiao, Shiqi Liu, Jiacheng Zhu, Zuxin Liu, Wenhao Ding, Ding Zhao, Hesheng Wang
Different environments pose a great challenge to robust outdoor visual perception for long-term autonomous driving, and the generalization of learning-based algorithms across different environments is still an open problem.
1 code implementation • 15 Oct 2020 • Zuxin Liu, Hongyi Zhou, Baiming Chen, Sicheng Zhong, Martial Hebert, Ding Zhao
We propose a model-based approach to enable RL agents to effectively explore the environment with unknown system dynamics and environment constraints given a very small violation budget.
Model-based Reinforcement Learning • Model Predictive Control • +3
no code implementations • 30 Jul 2020 • Zuxin Liu, Baiming Chen, Hongyi Zhou, Guru Koushik, Martial Hebert, Ding Zhao
Multi-agent navigation in dynamic environments is of great industrial value when deploying a large-scale fleet of robots in real-world applications.
1 code implementation • NeurIPS 2020 • Mengdi Xu, Wenhao Ding, Jiacheng Zhu, Zuxin Liu, Baiming Chen, Ding Zhao
We propose a transition prior to account for the temporal dependencies in streaming data and update the mixture online via sequential variational inference.
1 code implementation • 11 May 2020 • Baiming Chen, Mengdi Xu, Zuxin Liu, Liang Li, Ding Zhao
We also test the proposed algorithm in traffic scenarios that require coordination of all autonomous vehicles to show the practical value of delay-awareness.
2 code implementations • 22 Sep 2018 • Chao Yu, Zuxin Liu, Xinjun Liu, Fugui Xie, Yi Yang, Qi Wei, Qiao Fei
It is one of the state-of-the-art SLAM systems for highly dynamic environments.
Robotics