Search Results for author: Tsun-Hsuan Wang

Found 31 papers, 11 papers with code

LuciBot: Automated Robot Policy Learning from Generated Videos

no code implementations • 12 Mar 2025 • Xiaowen Qiu, Yian Wang, Jiting Cai, Zhehuan Chen, Chunru Lin, Tsun-Hsuan Wang, Chuang Gan

While prior works use large language models (LLMs) or vision-language models (VLMs) to generate rewards, these approaches are largely limited to simple tasks with well-defined rewards, such as pick-and-place.

Video Generation

Articulate AnyMesh: Open-Vocabulary 3D Articulated Objects Modeling

no code implementations • 4 Feb 2025 • Xiaowen Qiu, Jincheng Yang, Yian Wang, Zhehuan Chen, YuFei Wang, Tsun-Hsuan Wang, Zhou Xian, Chuang Gan

3D articulated objects modeling has long been a challenging problem, since it requires capturing both accurate surface geometries and semantically meaningful, spatially precise structures, parts, and joints.

Object • Visual Prompting

Embodied Red Teaming for Auditing Robotic Foundation Models

no code implementations • 27 Nov 2024 • Sathwik Karnik, Zhang-Wei Hong, Nishant Abhangi, Yen-Chen Lin, Tsun-Hsuan Wang, Christophe Dupuy, Rahul Gupta, Pulkit Agrawal

Language-conditioned robot models have the potential to enable robots to perform a wide range of tasks based on natural language instructions.

Red Teaming

Flex: End-to-End Text-Instructed Visual Navigation from Foundation Model Features

no code implementations • 16 Oct 2024 • Makram Chahine, Alex Quach, Alaa Maalouf, Tsun-Hsuan Wang, Daniela Rus

End-to-end learning directly maps sensory inputs to actions, creating highly integrated and efficient policies for complex robotics tasks.

Visual Navigation

Human Insights Driven Latent Space for Different Driving Perspectives: A Unified Encoder for Efficient Multi-Task Inference

no code implementations • 16 Sep 2024 • Huy-Dung Nguyen, Anass Bairouk, Mirjana Maras, Wei Xiao, Tsun-Hsuan Wang, Patrick Chareyre, Ramin Hasani, Marc Blanchon, Daniela Rus

While our performance in steering angle estimation is comparable to existing methods, the integration of human-like perception through multi-task learning holds significant potential for advancing autonomous driving systems.

Autonomous Driving • Knowledge Distillation +4

ABNet: Attention BarrierNet for Safe and Scalable Robot Learning

1 code implementation • 18 Jun 2024 • Wei Xiao, Tsun-Hsuan Wang, Daniela Rus

Safe learning is central to AI-enabled robots where a single failure may lead to catastrophic results.

Autonomous Driving • Robot Manipulation

LLM and Simulation as Bilevel Optimizers: A New Paradigm to Advance Physical Scientific Discovery

1 code implementation • 16 May 2024 • Pingchuan Ma, Tsun-Hsuan Wang, Minghao Guo, Zhiqing Sun, Joshua B. Tenenbaum, Daniela Rus, Chuang Gan, Wojciech Matusik

Large Language Models have recently gained significant attention in scientific discovery for their extensive knowledge and advanced reasoning capabilities.

Bilevel Optimization • scientific discovery

Probing Multimodal LLMs as World Models for Driving

1 code implementation • 9 May 2024 • Shiva Sreeram, Tsun-Hsuan Wang, Alaa Maalouf, Guy Rosman, Sertac Karaman, Daniela Rus

We provide a sober look at the application of Multimodal Large Language Models (MLLMs) in autonomous driving, challenging common assumptions about their ability to interpret dynamic driving scenarios.

Autonomous Driving • Trajectory Planning

Grounding Language Plans in Demonstrations Through Counterfactual Perturbations

no code implementations • 25 Mar 2024 • Yanwei Wang, Tsun-Hsuan Wang, Jiayuan Mao, Michael Hagenow, Julie Shah

Grounding the common-sense reasoning of Large Language Models (LLMs) in physical domains remains a pivotal yet unsolved problem for embodied AI.

Common Sense Reasoning • counterfactual +2

Curiosity-driven Red-teaming for Large Language Models

1 code implementation • 29 Feb 2024 • Zhang-Wei Hong, Idan Shenfeld, Tsun-Hsuan Wang, Yung-Sung Chuang, Aldo Pareja, James Glass, Akash Srivastava, Pulkit Agrawal

To probe when an LLM generates unwanted content, the current paradigm is to recruit a "red team" of human testers to design input prompts (i.e., test cases) that elicit undesirable responses from LLMs.

Red Teaming • Reinforcement Learning (RL)

Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models

no code implementations • 26 Oct 2023 • Tsun-Hsuan Wang, Alaa Maalouf, Wei Xiao, Yutong Ban, Alexander Amini, Guy Rosman, Sertac Karaman, Daniela Rus

As autonomous driving technology matures, end-to-end methodologies have emerged as a leading strategy, promising seamless integration from perception to control via deep learning.

Autonomous Driving • Data Augmentation

SafeDiffuser: Safe Planning with Diffusion Probabilistic Models

no code implementations • 31 May 2023 • Wei Xiao, Tsun-Hsuan Wang, Chuang Gan, Daniela Rus

Diffusion model-based approaches have shown promise in data-driven planning, but they offer no safety guarantees, making them hard to apply in safety-critical applications.

Denoising

Towards Generalist Robots: A Promising Paradigm via Generative Simulation

no code implementations • 17 May 2023 • Zhou Xian, Theophile Gervet, Zhenjia Xu, Yi-Ling Qiao, Tsun-Hsuan Wang, Yian Wang

This document serves as a position paper that outlines the authors' vision for a potential pathway towards generalist robots.

Scene Generation

Towards Cooperative Flight Control Using Visual-Attention

no code implementations • 21 Dec 2022 • Lianhao Yin, Makram Chahine, Tsun-Hsuan Wang, Tim Seyde, Chao Liu, Mathias Lechner, Ramin Hasani, Daniela Rus

We propose an air-guardian system that facilitates cooperation between a pilot with eye tracking and a parallel end-to-end neural control system.

Feature Importance

Interpreting Neural Policies with Disentangled Tree Representations

no code implementations • 13 Oct 2022 • Tsun-Hsuan Wang, Wei Xiao, Tim Seyde, Ramin Hasani, Daniela Rus

The advancement of robots, particularly those functioning in complex human-centric environments, relies on control solutions that are driven by machine learning.

Disentanglement

On the Forward Invariance of Neural ODEs

no code implementations • 10 Oct 2022 • Wei Xiao, Tsun-Hsuan Wang, Ramin Hasani, Mathias Lechner, Yutong Ban, Chuang Gan, Daniela Rus

We propose a new method to ensure neural ordinary differential equations (ODEs) satisfy output specifications by using invariance set propagation.

Autonomous Vehicles • Collision Avoidance +2
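The forward-invariance idea mentioned in the abstract follows the standard barrier-function condition: a set S = {x : h(x) >= 0} stays invariant along a flow if dh/dt >= -alpha * h(x). A minimal illustrative sketch (not the paper's invariance set propagation method; the dynamics, barrier h, and correction rule below are all assumed for the example):

```python
import numpy as np

def simulate(x0, f, h, grad_h, alpha=1.0, dt=0.01, steps=500):
    """Euler-integrate dx/dt = f(x), minimally correcting the vector
    field whenever the barrier condition grad_h(x)*f(x) >= -alpha*h(x)
    is violated, which keeps the set {x : h(x) >= 0} forward invariant."""
    x = float(x0)
    for _ in range(steps):
        v = f(x)
        residual = grad_h(x) * v + alpha * h(x)  # barrier condition residual
        if residual < 0:                         # correct only on violation
            v += -residual * grad_h(x) / (grad_h(x) ** 2)
        x += dt * v
    return x

# Toy example: the raw dynamics push x toward -2, but the barrier
# h(x) = x keeps the trajectory in x >= 0.
x_final = simulate(x0=1.0, f=lambda x: -(x + 2.0),
                   h=lambda x: x, grad_h=lambda x: 1.0)
```

With the corrected field the state decays toward the barrier boundary but never crosses it, which is the behavior an invariance guarantee enforces.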

Are All Vision Models Created Equal? A Study of the Open-Loop to Closed-Loop Causality Gap

no code implementations • 9 Oct 2022 • Mathias Lechner, Ramin Hasani, Alexander Amini, Tsun-Hsuan Wang, Thomas A. Henzinger, Daniela Rus

Our results imply that the causality gap can be closed in the first situation by our proposed training guideline, regardless of network architecture, whereas achieving out-of-distribution generalization (the second situation) requires further investigation, for instance into data diversity rather than model architecture.

Autonomous Driving +3

Liquid Structural State-Space Models

1 code implementation • 26 Sep 2022 • Ramin Hasani, Mathias Lechner, Tsun-Hsuan Wang, Makram Chahine, Alexander Amini, Daniela Rus

A proper parametrization of state transition matrices of linear state-space models (SSMs) followed by standard nonlinearities enables them to efficiently learn representations from sequential data, establishing the state-of-the-art on a large series of long-range sequence modeling benchmarks.

Heart rate estimation • Long-range modeling +4
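The abstract's core recipe — a linear state-space recurrence followed by a standard pointwise nonlinearity — can be sketched generically. This is a plain discrete-time SSM layer, not the paper's liquid parametrization, and the matrices below are illustrative values only:

```python
import numpy as np

def ssm_layer(u, A, B, C, nonlin=np.tanh):
    """Run the discrete linear SSM  x[t+1] = A x[t] + B u[t],
    y[t] = C x[t], then apply a pointwise nonlinearity to the outputs,
    as in SSM-based sequence layers."""
    x = np.zeros(A.shape[0])
    ys = []
    for u_t in u:
        x = A @ x + B * u_t          # linear state update
        ys.append(C @ x)             # linear readout
    return nonlin(np.array(ys))      # standard nonlinearity

# Toy usage: a 2-state SSM filtering a short impulse input.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([1.0, 0.5])
C = np.array([0.5, 0.5])
y = ssm_layer(np.array([1.0, 0.0, 0.0, 0.0]), A, B, C)
```

The parametrization of A is exactly where SSM variants differ; stability (eigenvalues inside the unit circle, as here) is what lets such layers handle long sequences.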

Differentiable Control Barrier Functions for Vision-based End-to-End Autonomous Driving

no code implementations • 4 Mar 2022 • Wei Xiao, Tsun-Hsuan Wang, Makram Chahine, Alexander Amini, Ramin Hasani, Daniela Rus

They are interpretable at scale, achieve great test performance under limited training data, and are safety guaranteed in a series of autonomous driving scenarios such as lane keeping and obstacle avoidance.

Autonomous Driving

V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction

3 code implementations • ECCV 2020 • Tsun-Hsuan Wang, Sivabalan Manivasagam, Ming Liang, Bin Yang, Wenyuan Zeng, James Tu, Raquel Urtasun

In this paper, we explore the use of vehicle-to-vehicle (V2V) communication to improve the perception and motion forecasting performance of self-driving vehicles.

3D Object Detection • Motion Forecasting

3D LiDAR and Stereo Fusion using Stereo Matching Network with Conditional Cost Volume Normalization

1 code implementation • 5 Apr 2019 • Tsun-Hsuan Wang, Hou-Ning Hu, Chieh Hubert Lin, Yi-Hsuan Tsai, Wei-Chen Chiu, Min Sun

The complementary characteristics of active and passive depth sensing techniques motivate the fusion of the LiDAR sensor and stereo camera for improved depth perception.

Depth Completion • Stereo-LiDAR Fusion +2

Point-to-Point Video Generation

3 code implementations • ICCV 2019 • Tsun-Hsuan Wang, Yen-Chi Cheng, Chieh Hubert Lin, Hwann-Tzong Chen, Min Sun

We introduce point-to-point video generation that controls the generation process with two control points: the targeted start- and end-frames.

Image Manipulation • Video Editing +1

Plug-and-Play: Improve Depth Estimation via Sparse Data Propagation

2 code implementations • 20 Dec 2018 • Tsun-Hsuan Wang, Fu-En Wang, Juan-Ting Lin, Yi-Hsuan Tsai, Wei-Chen Chiu, Min Sun

We propose a novel plug-and-play (PnP) module that improves depth prediction by taking arbitrary patterns of sparse depth as input.

Depth Estimation • Depth Prediction
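The idea of refining a dense prediction with arbitrary patterns of sparse depth can be illustrated in a much-simplified form — blending the prediction toward the measurements at observed pixels. The actual PnP module instead updates intermediate network features by gradient descent; this sketch, including the blend weight, is purely illustrative:

```python
import numpy as np

def refine_with_sparse(dense_pred, sparse_depth, mask, weight=1.0):
    """Blend a dense depth prediction toward sparse measurements.
    `mask` marks pixels where sparse_depth is valid; weight=1.0 means
    trusting the measurement fully at those pixels."""
    out = dense_pred.copy()
    out[mask] = (1 - weight) * dense_pred[mask] + weight * sparse_depth[mask]
    return out

pred = np.full((4, 4), 2.0)               # dense prediction: 2 m everywhere
sparse = np.zeros((4, 4))
sparse[1, 1] = 3.5                        # one valid sparse depth point
mask = sparse > 0
refined = refine_with_sparse(pred, sparse, mask)
```

Because the sparse pattern enters only through the boolean mask, any measurement layout (LiDAR lines, random samples) is handled the same way — the property the abstract emphasizes.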

Omnidirectional CNN for Visual Place Recognition and Navigation

no code implementations • 12 Mar 2018 • Tsun-Hsuan Wang, Hung-Jui Huang, Juan-Ting Lin, Chan-Wei Hu, Kuo-Hao Zeng, Min Sun

Given a visual input, the task of the O-CNN is not to retrieve the matched place exemplar, but to retrieve the closest place exemplar and estimate the relative distance between the input and the closest place.

Navigate • Visual Place Recognition
