Search Results for author: Guy Rosman

Found 37 papers, 5 papers with code

Holistic Surgical Phase Recognition with Hierarchical Input Dependent State Space Models

no code implementations 26 Jun 2025 Haoyang Wu, Tsun-Hsuan Wang, Mathias Lechner, Ramin Hasani, Jennifer A. Eckhoff, Paul Pak, Ozanan R. Meireles, Guy Rosman, Yutong Ban, Daniela Rus

Surgical workflow analysis is essential in robot-assisted surgeries, yet the long duration of such procedures poses significant challenges for comprehensive video analysis.

State Space Models Surgical phase recognition

Shared Autonomy for Proximal Teaching

no code implementations 27 Feb 2025 Megha Srivastava, Reihaneh Iranmanesh, Yuchen Cui, Deepak Gopinath, Emily Sumner, Andrew Silva, Laporsha Dees, Guy Rosman, Dorsa Sadigh

We use this to design Z-COACH, a method for using shared autonomy to provide personalized instruction targeting interpretable task sub-skills.

Autonomous Driving

Generating Out-Of-Distribution Scenarios Using Language Models

no code implementations 25 Nov 2024 Erfan Aasi, Phat Nguyen, Shiva Sreeram, Guy Rosman, Sertac Karaman, Daniela Rus

In this paper, we leverage these LLM strengths to introduce a framework for generating diverse OOD driving scenarios.

Autonomous Driving Common Sense Reasoning +2

Learning autonomous driving from aerial imagery

no code implementations 18 Oct 2024 Varun Murali, Guy Rosman, Sertac Karaman, Daniela Rus

In this work, we consider the problem of learning end-to-end perception-to-control for ground vehicles solely from aerial imagery.

Autonomous Driving Autonomous Navigation +2

Dreaming to Assist: Learning to Align with Human Objectives for Shared Control in High-Speed Racing

no code implementations 14 Oct 2024 Jonathan DeCastro, Andrew Silva, Deepak Gopinath, Emily Sumner, Thomas M. Balch, Laporsha Dees, Guy Rosman

Tight coordination is required for effective human-robot teams in domains involving fast dynamics and tactical decisions, such as multi-car racing.

Car Racing

Probing Multimodal LLMs as World Models for Driving

1 code implementation 9 May 2024 Shiva Sreeram, Tsun-Hsuan Wang, Alaa Maalouf, Guy Rosman, Sertac Karaman, Daniela Rus

We provide a sober look at the application of Multimodal Large Language Models (MLLMs) in autonomous driving, challenging common assumptions about their ability to interpret dynamic driving scenarios.

Autonomous Driving Trajectory Planning

Blending Data-Driven Priors in Dynamic Games

no code implementations 21 Feb 2024 Justin Lidard, Haimin Hu, Asher Hancock, Zixu Zhang, Albert Gimó Contreras, Vikash Modi, Jonathan DeCastro, Deepak Gopinath, Guy Rosman, Naomi Ehrich Leonard, María Santos, Jaime Fernández Fisac

We formulate KLGame, an algorithm for solving non-cooperative dynamic games with Kullback-Leibler (KL) regularization with respect to a general, stochastic, and possibly multi-modal reference policy.

Autonomous Driving Motion Planning

Hypergraph-Transformer (HGT) for Interactive Event Prediction in Laparoscopic and Robotic Surgery

no code implementations 3 Feb 2024 Lianhao Yin, Yutong Ban, Jennifer Eckhoff, Ozanan Meireles, Daniela Rus, Guy Rosman

Understanding and anticipating intraoperative events and actions is critical for intraoperative assistance and decision-making during minimally invasive surgery.

Decision Making Knowledge Graphs

Drive Anywhere: Generalizable End-to-end Autonomous Driving with Multi-modal Foundation Models

no code implementations 26 Oct 2023 Tsun-Hsuan Wang, Alaa Maalouf, Wei Xiao, Yutong Ban, Alexander Amini, Guy Rosman, Sertac Karaman, Daniela Rus

As autonomous driving technology matures, end-to-end methodologies have emerged as a leading strategy, promising seamless integration from perception to control via deep learning.

Autonomous Driving Data Augmentation

Specification-Guided Data Aggregation for Semantically Aware Imitation Learning

no code implementations 29 Mar 2023 Ameesh Shah, Jonathan DeCastro, John Gideon, Beyazit Yalcinkaya, Guy Rosman, Sanjit A. Seshia

Advancements in simulation and formal methods-guided environment sampling have enabled the rigorous evaluation of machine learning models in a number of safety-critical scenarios, such as autonomous driving.

Autonomous Driving Imitation Learning

Leveraging Smooth Attention Prior for Multi-Agent Trajectory Prediction

no code implementations 8 Mar 2022 Zhangjie Cao, Erdem Biyik, Guy Rosman, Dorsa Sadigh

At a certain time, to forecast a reasonable future trajectory, each agent needs to pay attention to the interactions with only a small group of most relevant agents instead of unnecessarily paying attention to all the other agents.

Prediction Trajectory Prediction

Concept Graph Neural Networks for Surgical Video Understanding

no code implementations 27 Feb 2022 Yutong Ban, Jennifer A. Eckhoff, Thomas M. Ward, Daniel A. Hashimoto, Ozanan R. Meireles, Daniela Rus, Guy Rosman

We constantly integrate our knowledge and understanding of the world to enhance our interpretation of what we see.

Video Understanding

Trajectory Prediction with Linguistic Representations

no code implementations 19 Oct 2021 Yen-Ling Kuo, Xin Huang, Andrei Barbu, Stephen G. McGill, Boris Katz, John J. Leonard, Guy Rosman

Language allows humans to build mental models that interpret what is happening around them, resulting in more accurate long-term predictions.

Prediction Trajectory Prediction

TIP: Task-Informed Motion Prediction for Intelligent Vehicles

no code implementations 17 Oct 2021 Xin Huang, Guy Rosman, Ashkan Jasour, Stephen G. McGill, John J. Leonard, Brian C. Williams

When predicting trajectories of road agents, motion predictors usually approximate the future distribution by a limited number of samples.

Autonomous Driving Decision Making +2

MAAD: A Model and Dataset for "Attended Awareness" in Driving

1 code implementation 16 Oct 2021 Deepak Gopinath, Guy Rosman, Simon Stent, Katsuya Terahata, Luke Fletcher, Brenna Argall, John Leonard

Our model takes as input scene information in the form of a video and noisy gaze estimates, and outputs visual saliency, a refined gaze estimate, and an estimate of the person's attended awareness.

Denoising

Risk Conditioned Neural Motion Planning

1 code implementation 4 Aug 2021 Xin Huang, Meng Feng, Ashkan Jasour, Guy Rosman, Brian Williams

In this paper, we propose an extension of the soft actor-critic model to estimate the execution risk of a plan through a risk critic and produce risk-bounded policies efficiently by adding an extra risk term in the loss function of the policy network.

Deep Reinforcement Learning Motion Planning

SUPR-GAN: SUrgical PRediction GAN for Event Anticipation in Laparoscopic and Robotic Surgery

no code implementations 10 May 2021 Yutong Ban, Guy Rosman, Jennifer A. Eckhoff, Thomas M. Ward, Daniel A. Hashimoto, Taisei Kondo, Hidekazu Iwaki, Ozanan R. Meireles, Daniela Rus

Comprehension of surgical workflow is the foundation upon which artificial intelligence (AI) and machine learning (ML) hold the potential to assist intraoperative decision-making and risk mitigation.

Decision Making Generative Adversarial Network

Aggregating Long-Term Context for Learning Laparoscopic and Robot-Assisted Surgical Workflows

no code implementations 1 Sep 2020 Yutong Ban, Guy Rosman, Thomas Ward, Daniel Hashimoto, Taisei Kondo, Hidekazu Iwaki, Ozanan Meireles, Daniela Rus

With an understanding of the complete surgical workflow, robots are able to assist surgeons during intra-operative events, for example by giving a warning when the surgeon is entering specific key or high-risk phases.

Surgical phase recognition

Driving Through Ghosts: Behavioral Cloning with False Positives

no code implementations 29 Aug 2020 Andreas Bühler, Adrien Gaidon, Andrei Cramariuc, Rares Ambrus, Guy Rosman, Wolfram Burgard

In this work, we propose a behavioral cloning approach that can safely leverage imperfect perception without being conservative.

Autonomous Driving

Reinforcement Learning based Control of Imitative Policies for Near-Accident Driving

1 code implementation 1 Jul 2020 Zhangjie Cao, Erdem Biyik, Woodrow Z. Wang, Allan Raventos, Adrien Gaidon, Guy Rosman, Dorsa Sadigh

To address driving in near-accident scenarios, we propose a hierarchical reinforcement and imitation learning (H-ReIL) approach that consists of low-level policies learned by IL for discrete driving modes, and a high-level policy learned by RL that switches between different driving modes.

Autonomous Driving Imitation Learning +2

Uncertainty-Aware Driver Trajectory Prediction at Urban Intersections

no code implementations 16 Jan 2019 Xin Huang, Stephen McGill, Brian C. Williams, Luke Fletcher, Guy Rosman

In this paper, we propose a variational neural network approach that predicts future driver trajectory distributions for the vehicle based on multiple sensors.

Mixture-of-Experts Prediction +1

Variational End-to-End Navigation and Localization

no code implementations 25 Nov 2018 Alexander Amini, Guy Rosman, Sertac Karaman, Daniela Rus

We define a novel variational network capable of learning from raw camera data of the environment as well as higher level roadmaps to predict (1) a full probability distribution over the possible control commands; and (2) a deterministic control command capable of navigating on the route specified within the map.

A Nonparametric Model for Multimodal Collaborative Activities Summarization

no code implementations 4 Sep 2017 Guy Rosman, John W. Fisher III, Daniela Rus

We demonstrate the utility of this model for inference tasks such as activity detection, classification, and summarization.

Action Detection Activity Detection

Information-Driven Adaptive Structured-Light Scanners

no code implementations CVPR 2016 Guy Rosman, Daniela Rus, John W. Fisher III

We then demonstrate how different choices of relevant variable sets (corresponding to the subproblems of localization and mapping) lead to different criteria for pattern selection and can be computed in an online fashion.

Pose Estimation

Real-Time Depth Refinement for Specular Objects

no code implementations CVPR 2016 Roy Or-El, Rom Hershkovitz, Aaron Wetzler, Guy Rosman, Alfred M. Bruckstein, Ron Kimmel

The introduction of consumer RGB-D scanners set off a major boost in 3D computer vision research.

Coresets for k-Segmentation of Streaming Data

no code implementations NeurIPS 2014 Guy Rosman, Mikhail Volkov, Dan Feldman, John W. Fisher III, Daniela Rus

We consider the problem of computing optimal segmentation of such signals by a k-piecewise linear function, using only one pass over the data by maintaining a coreset for the signal.

Segmentation Time Series +1

Aerial Reconstructions via Probabilistic Data Fusion

no code implementations CVPR 2014 Randi Cabezas, Oren Freifeld, Guy Rosman, John W. Fisher III

We propose an integrated probabilistic model for multi-modal fusion of aerial imagery, LiDAR data, and (optional) GPS measurements.

A Mixture of Manhattan Frames: Beyond the Manhattan World

no code implementations CVPR 2014 Julian Straub, Guy Rosman, Oren Freifeld, John J. Leonard, John W. Fisher III

Traditional approaches to scene representation exploit this phenomenon via the somewhat restrictive assumption that every plane is perpendicular to one of the axes of a single coordinate system.
