Search Results for author: Raquel Urtasun

Found 200 papers, 46 papers with code

QuAD: Query-based Interpretable Neural Motion Planning for Autonomous Driving

no code implementations 1 Apr 2024 Sourav Biswas, Sergio Casas, Quinlan Sykora, Ben Agro, Abbas Sadat, Raquel Urtasun

Instead, we shift the paradigm to have the planner query occupancy at relevant spatio-temporal points, restricting the computation to those regions of interest.

Autonomous Driving Collision Avoidance +3

LightSim: Neural Lighting Simulation for Urban Scenes

no code implementations 11 Dec 2023 Ava Pun, Gary Sun, Jingkang Wang, Yun Chen, Ze Yang, Sivabalan Manivasagam, Wei-Chiu Ma, Raquel Urtasun

Different outdoor illumination conditions drastically alter the appearance of urban scenes, and they can harm the performance of image-based robot perception systems if not seen during training.

Reconstructing Objects in-the-wild for Realistic Sensor Simulation

no code implementations 9 Nov 2023 Ze Yang, Sivabalan Manivasagam, Yun Chen, Jingkang Wang, Rui Hu, Raquel Urtasun

In this work, we present NeuSim, a novel approach that estimates accurate geometry and realistic appearance from sparse in-the-wild data captured at distance and at limited viewpoints.

Adv3D: Generating Safety-Critical 3D Objects through Closed-Loop Simulation

no code implementations 2 Nov 2023 Jay Sarva, Jingkang Wang, James Tu, Yuwen Xiong, Sivabalan Manivasagam, Raquel Urtasun

In this paper, we propose a framework, Adv3D, that takes real world scenarios and performs closed-loop sensor simulation to evaluate autonomy performance, and finds vehicle shapes that make the scenario more challenging, resulting in autonomy failures and uncomfortable SDV maneuvers.

Motion Planning

LabelFormer: Object Trajectory Refinement for Offboard Perception from LiDAR Point Clouds

no code implementations 2 Nov 2023 Anqi Joyce Yang, Sergio Casas, Nikita Dvornik, Sean Segal, Yuwen Xiong, Jordan Sir Kwang Hu, Carter Fang, Raquel Urtasun

Auto-labels are most commonly generated via a two-stage approach: first, objects are detected and tracked over time, and then each object trajectory is passed to a learned refinement model to improve accuracy.

Learning Realistic Traffic Agents in Closed-loop

no code implementations 2 Nov 2023 Chris Zhang, James Tu, Lunjun Zhang, Kelvin Wong, Simon Suo, Raquel Urtasun

Our experiments show that RTR learns more realistic and generalizable traffic simulation policies, achieving significantly better tradeoffs between human-like driving and traffic compliance in both nominal and long-tail scenarios.

Imitation Learning Reinforcement Learning (RL)

CADSim: Robust and Scalable in-the-wild 3D Reconstruction for Controllable Sensor Simulation

no code implementations 2 Nov 2023 Jingkang Wang, Sivabalan Manivasagam, Yun Chen, Ze Yang, Ioan Andrei Bârsan, Anqi Joyce Yang, Wei-Chiu Ma, Raquel Urtasun

To tackle these issues, we present CADSim, which combines part-aware object-class priors via a small set of CAD models with differentiable rendering to automatically reconstruct vehicle geometry, including articulated wheels, with high-quality appearance.

3D Reconstruction

UltraLiDAR: Learning Compact Representations for LiDAR Completion and Generation

no code implementations 2 Nov 2023 Yuwen Xiong, Wei-Chiu Ma, Jingkang Wang, Raquel Urtasun

We show that by aligning the representation of a sparse point cloud to that of a dense point cloud, we can densify the sparse point clouds as if they were captured by a real high-density LiDAR, drastically reducing the cost.

MemorySeg: Online LiDAR Semantic Segmentation with a Latent Memory

no code implementations ICCV 2023 Enxu Li, Sergio Casas, Raquel Urtasun

To address this challenge, we propose a novel framework for semantic segmentation of a temporal sequence of LiDAR point clouds that utilizes a memory network to store, update and retrieve past information.

LIDAR Semantic Segmentation Segmentation +1

4D-Former: Multimodal 4D Panoptic Segmentation

no code implementations 2 Nov 2023 Ali Athar, Enxu Li, Sergio Casas, Raquel Urtasun

4D panoptic segmentation is a challenging but practically useful task that requires every point in a LiDAR point-cloud sequence to be assigned a semantic class label, and individual objects to be segmented and tracked over time.

4D Panoptic Segmentation Panoptic Segmentation +2

UniSim: A Neural Closed-Loop Sensor Simulator

2 code implementations CVPR 2023 Ze Yang, Yun Chen, Jingkang Wang, Sivabalan Manivasagam, Wei-Chiu Ma, Anqi Joyce Yang, Raquel Urtasun

Previously recorded driving logs provide a rich resource to build these new scenarios from, but for closed loop evaluation, we need to modify the sensor data based on the new scene configuration and the SDV's decisions, as actors might be added or removed and the trajectories of existing actors and the SDV will differ from the original log.

Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving

no code implementations CVPR 2023 Ben Agro, Quinlan Sykora, Sergio Casas, Raquel Urtasun

A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants.

Future prediction object-detection +2

Rethinking Closed-loop Training for Autonomous Driving

no code implementations 27 Jun 2023 Chris Zhang, Runsheng Guo, Wenyuan Zeng, Yuwen Xiong, Binbin Dai, Rui Hu, Mengye Ren, Raquel Urtasun

Recent advances in high-fidelity simulators have enabled closed-loop training of autonomous driving agents, potentially solving the distribution shift in training v. s.

Autonomous Driving

Learning Compact Representations for LiDAR Completion and Generation

no code implementations CVPR 2023 Yuwen Xiong, Wei-Chiu Ma, Jingkang Wang, Raquel Urtasun

We show that by aligning the representation of a sparse point cloud to that of a dense point cloud, we can densify the sparse point clouds as if they were captured by a real high-density LiDAR, drastically reducing the cost.

MixSim: A Hierarchical Framework for Mixed Reality Traffic Simulation

no code implementations CVPR 2023 Simon Suo, Kelvin Wong, Justin Xu, James Tu, Alexander Cui, Sergio Casas, Raquel Urtasun

Towards this goal, we propose to leverage the wealth of interesting scenarios captured in the real world and make them reactive and controllable to enable closed-loop SDV evaluation in what-if situations.

Mixed Reality

Towards Zero Domain Gap: A Comprehensive Study of Realistic LiDAR Simulation for Autonomy Testing

no code implementations ICCV 2023 Sivabalan Manivasagam, Ioan Andrei Bârsan, Jingkang Wang, Ze Yang, Raquel Urtasun

We leverage this setting to analyze what aspects of LiDAR simulation, such as pulse phenomena, scanning effects, and asset quality, affect the domain gap with respect to the autonomy system, including perception, prediction, and motion planning, and analyze how modifications to the simulated LiDAR influence each part.

Motion Planning

GoRela: Go Relative for Viewpoint-Invariant Motion Forecasting

no code implementations 4 Nov 2022 Alexander Cui, Sergio Casas, Kelvin Wong, Simon Suo, Raquel Urtasun

However, this approach is computationally expensive for multi-agent prediction as inference needs to be run for each agent.

Motion Forecasting

Virtual Correspondence: Humans as a Cue for Extreme-View Geometry

no code implementations CVPR 2022 Wei-Chiu Ma, Anqi Joyce Yang, Shenlong Wang, Raquel Urtasun, Antonio Torralba

Similar to classic correspondences, VCs conform with epipolar geometry; unlike classic correspondences, VCs do not need to be co-visible across views.

3D Reconstruction Novel View Synthesis +1

NP-DRAW: A Non-Parametric Structured Latent Variable Model for Image Generation

1 code implementation 25 Jun 2021 Xiaohui Zeng, Raquel Urtasun, Richard Zemel, Sanja Fidler, Renjie Liao

1) We propose a non-parametric prior distribution over the appearance of image parts so that the latent variable "what-to-draw" per step becomes a categorical random variable.

Image Generation

Just Label What You Need: Fine-Grained Active Selection for Perception and Prediction through Partially Labeled Scenes

no code implementations 8 Apr 2021 Sean Segal, Nishanth Kumar, Sergio Casas, Wenyuan Zeng, Mengye Ren, Jingkang Wang, Raquel Urtasun

As data collection is often significantly cheaper than labeling in this domain, the decision of which subset of examples to label can have a profound impact on model performance.

Active Learning

IntentNet: Learning to Predict Intention from Raw Sensor Data

no code implementations 20 Jan 2021 Sergio Casas, Wenjie Luo, Raquel Urtasun

In order to plan a safe maneuver, self-driving vehicles need to understand the intent of other traffic participants.

Deep Feedback Inverse Problem Solver

no code implementations ECCV 2020 Wei-Chiu Ma, Shenlong Wang, Jiayuan Gu, Sivabalan Manivasagam, Antonio Torralba, Raquel Urtasun

Specifically, at each iteration, the neural network takes the feedback as input and outputs an update on the current estimation.

Pose Estimation

Non-parametric Memory for Spatio-Temporal Segmentation of Construction Zones for Self-Driving

no code implementations 18 Jan 2021 Min Bai, Shenlong Wang, Kelvin Wong, Ersin Yumer, Raquel Urtasun

In this paper, we introduce a non-parametric memory representation for spatio-temporal segmentation that captures the local space and time around an autonomous vehicle (AV).

Mending Neural Implicit Modeling for 3D Vehicle Reconstruction in the Wild

no code implementations 18 Jan 2021 Shivam Duggal, ZiHao Wang, Wei-Chiu Ma, Sivabalan Manivasagam, Justin Liang, Shenlong Wang, Raquel Urtasun

Reconstructing high-quality 3D objects from sparse, partial observations from a single view is of crucial importance for various applications in computer vision, robotics, and graphics.

3D Object Reconstruction

Deep Structured Reactive Planning

no code implementations 18 Jan 2021 Jerry Liu, Wenyuan Zeng, Raquel Urtasun, Ersin Yumer

An intelligent agent operating in the real-world must balance achieving its goal with maintaining the safety and comfort of not only itself, but also other participants within the surrounding scene.

MP3: A Unified Model to Map, Perceive, Predict and Plan

no code implementations CVPR 2021 Sergio Casas, Abbas Sadat, Raquel Urtasun

High-definition maps (HD maps) are a key component of most modern self-driving systems due to their valuable semantic and geometric information.

Deep Parametric Continuous Convolutional Neural Networks

no code implementations CVPR 2018 Shenlong Wang, Simon Suo, Wei-Chiu Ma, Andrei Pokrovsky, Raquel Urtasun

Standard convolutional neural networks assume a grid structured input is available and exploit discrete convolutions as their fundamental building blocks.

Ranked #2 on Semantic Segmentation on S3DIS Area5 (Number of params metric)

Motion Estimation Point Cloud Segmentation +1

End-to-end Interpretable Neural Motion Planner

1 code implementation CVPR 2019 Wenyuan Zeng, Wenjie Luo, Simon Suo, Abbas Sadat, Bin Yang, Sergio Casas, Raquel Urtasun

In this paper, we propose a neural motion planner (NMP) for learning to drive autonomously in complex urban scenarios that include traffic-light handling, yielding, and interactions with multiple road-users.

Deep Multi-Task Learning for Joint Localization, Perception, and Prediction

no code implementations CVPR 2021 John Phillips, Julieta Martinez, Ioan Andrei Bârsan, Sergio Casas, Abbas Sadat, Raquel Urtasun

Over the last few years, we have witnessed tremendous progress on many subtasks of autonomous driving, including perception, motion forecasting, and motion planning.

Motion Forecasting Motion Planning +1

Exploring Adversarial Robustness of Multi-Sensor Perception Systems in Self Driving

no code implementations 17 Jan 2021 James Tu, Huichen Li, Xinchen Yan, Mengye Ren, Yun Chen, Ming Liang, Eilyan Bitar, Ersin Yumer, Raquel Urtasun

Yet, there have been limited studies on the adversarial robustness of multi-modal models that fuse LiDAR features with image features.

Adversarial Robustness Denoising +1

PLUMENet: Efficient 3D Object Detection from Stereo Images

1 code implementation 17 Jan 2021 Yan Wang, Bin Yang, Rui Hu, Ming Liang, Raquel Urtasun

In this paper we propose a model that unifies these two tasks and performs them in the same metric space.

3D Object Detection From Stereo Images Depth Estimation +2

Cost-Efficient Online Hyperparameter Optimization

no code implementations 17 Jan 2021 Jingkang Wang, Mengye Ren, Ilija Bogunovic, Yuwen Xiong, Raquel Urtasun

Recent work on hyperparameter optimization (HPO) has shown the possibility of training certain hyperparameters together with regular parameters.

Bayesian Optimization Hyperparameter Optimization

Auto4D: Learning to Label 4D Objects from Sequential Point Clouds

no code implementations 17 Jan 2021 Bin Yang, Min Bai, Ming Liang, Wenyuan Zeng, Raquel Urtasun

The key idea is to decompose the 4D object label into two parts: the object size in 3D that's fixed through time for rigid objects, and the motion path describing the evolution of the object's pose through time.

3D Object Detection Object

S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling

no code implementations CVPR 2021 Ze Yang, Shenlong Wang, Sivabalan Manivasagam, Zeng Huang, Wei-Chiu Ma, Xinchen Yan, Ersin Yumer, Raquel Urtasun

Constructing and animating humans is an important component for building virtual worlds in a wide variety of applications such as virtual reality or robotics testing in simulation.

Asynchronous Multi-View SLAM

no code implementations 17 Jan 2021 Anqi Joyce Yang, Can Cui, Ioan Andrei Bârsan, Raquel Urtasun, Shenlong Wang

Existing multi-camera SLAM systems assume synchronized shutters for all cameras, which is often not the case in practice.

Sensor Modeling

Network Automatic Pruning: Start NAP and Take a Nap

no code implementations 17 Jan 2021 Wenyuan Zeng, Yuwen Xiong, Raquel Urtasun

This process is typically time-consuming and requires expert knowledge to achieve good results.

Network Pruning

TrafficSim: Learning to Simulate Realistic Multi-Agent Behaviors

1 code implementation CVPR 2021 Simon Suo, Sebastian Regalado, Sergio Casas, Raquel Urtasun

We show TrafficSim generates significantly more realistic and diverse traffic scenarios as compared to a diverse set of baselines.

Common Sense Reasoning Data Augmentation

Adversarial Attacks On Multi-Agent Communication

no code implementations ICCV 2021 James Tu, TsunHsuan Wang, Jingkang Wang, Sivabalan Manivasagam, Mengye Ren, Raquel Urtasun

Growing at a fast pace, modern autonomous systems will soon be deployed at scale, opening up the possibility for cooperative multi-agent systems.

Domain Adaptation

Self-Supervised Representation Learning from Flow Equivariance

no code implementations ICCV 2021 Yuwen Xiong, Mengye Ren, Wenyuan Zeng, Raquel Urtasun

Motivated by this ability, we present a new self-supervised learning representation framework that can be directly deployed on a video stream of complex scenes with many moving objects.

Instance Segmentation object-detection +5

SceneGen: Learning to Generate Realistic Traffic Scenes

no code implementations CVPR 2021 Shuhan Tan, Kelvin Wong, Shenlong Wang, Sivabalan Manivasagam, Mengye Ren, Raquel Urtasun

Existing methods typically insert actors into the scene according to a set of hand-crafted heuristics and are limited in their ability to model the true complexity and diversity of real traffic scenes, thus inducing a content gap between synthesized and real traffic scenes.

VideoClick: Video Object Segmentation with a Single Click

no code implementations 16 Jan 2021 Namdar Homayounfar, Justin Liang, Wei-Chiu Ma, Raquel Urtasun

Towards this goal, in this paper we propose a bottom up approach where given a single click for each object in a video, we obtain the segmentation masks of these objects in the full video.

Object Segmentation +4

LookOut: Diverse Multi-Future Prediction and Planning for Self-Driving

no code implementations ICCV 2021 Alexander Cui, Sergio Casas, Abbas Sadat, Renjie Liao, Raquel Urtasun

In this paper, we present LookOut, a novel autonomy system that perceives the environment, predicts a diverse set of futures of how the scene might unroll and estimates the trajectory of the SDV by optimizing a set of contingency plans over these future realizations.

Future prediction Motion Forecasting

AdvSim: Generating Safety-Critical Scenarios for Self-Driving Vehicles

no code implementations CVPR 2021 Jingkang Wang, Ava Pun, James Tu, Sivabalan Manivasagam, Abbas Sadat, Sergio Casas, Mengye Ren, Raquel Urtasun

Importantly, by simulating directly from sensor data, we obtain adversarial scenarios that are safety-critical for the full autonomy stack.

Diverse Complexity Measures for Dataset Curation in Self-driving

no code implementations 16 Jan 2021 Abbas Sadat, Sean Segal, Sergio Casas, James Tu, Bin Yang, Raquel Urtasun, Ersin Yumer

Our experiments on a wide range of tasks and models show that the proposed curation pipeline is able to select datasets that lead to better generalization and higher performance.

Active Learning Motion Forecasting +1

Safety-Oriented Pedestrian Motion and Scene Occupancy Forecasting

no code implementations 7 Jan 2021 Katie Luo, Sergio Casas, Renjie Liao, Xinchen Yan, Yuwen Xiong, Wenyuan Zeng, Raquel Urtasun

On two large-scale real-world datasets, nuScenes and ATG4D, we showcase that our scene-occupancy predictions are more accurate and better calibrated than those from state-of-the-art motion forecasting methods, while also matching their performance in pedestrian motion forecasting metrics.

Motion Forecasting

Pit30M: A Benchmark for Global Localization in the Age of Self-Driving Cars

1 code implementation 23 Dec 2020 Julieta Martinez, Sasha Doubov, Jack Fan, Ioan Andrei Bârsan, Shenlong Wang, Gellért Máttyus, Raquel Urtasun

We are interested in understanding whether retrieval-based localization approaches are good enough in the context of self-driving vehicles.

LIDAR Semantic Segmentation Retrieval +2

DAGMapper: Learning to Map by Discovering Lane Topology

no code implementations ICCV 2019 Namdar Homayounfar, Wei-Chiu Ma, Justin Liang, Xinyu Wu, Jack Fan, Raquel Urtasun

One of the fundamental challenges in scaling self-driving is being able to create accurate high-definition maps (HD maps) at low cost.

Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net

no code implementations CVPR 2018 Wenjie Luo, Bin Yang, Raquel Urtasun

In this paper we propose a novel deep neural network that is able to jointly reason about 3D detection, tracking and motion forecasting given data captured by a 3D sensor.

Motion Forecasting

End-to-End Deep Structured Models for Drawing Crosswalks

no code implementations ECCV 2018 Justin Liang, Raquel Urtasun

In this paper we address the problem of detecting crosswalks from LiDAR and camera imagery.

Convolutional Recurrent Network for Road Boundary Extraction

no code implementations CVPR 2019 Justin Liang, Namdar Homayounfar, Wei-Chiu Ma, Shenlong Wang, Raquel Urtasun

Creating high-definition maps that contain precise information about the static elements of the scene is of utmost importance for enabling self-driving cars to drive safely.

Self-Driving Cars

HDNET: Exploiting HD Maps for 3D Object Detection

no code implementations 21 Dec 2020 Bin Yang, Ming Liang, Raquel Urtasun

In this paper we show that High-Definition (HD) maps provide strong priors that can boost the performance and robustness of modern 3D object detectors.

3D Object Detection Object +1

Deep Continuous Fusion for Multi-Sensor 3D Object Detection

no code implementations ECCV 2018 Ming Liang, Bin Yang, Shenlong Wang, Raquel Urtasun

In this paper, we propose a novel 3D object detector that can exploit both LiDAR and cameras to perform very accurate localization.

3D Object Detection Object +1

Learning to Localize Through Compressed Binary Maps

no code implementations CVPR 2019 Xinkai Wei, Ioan Andrei Bârsan, Shenlong Wang, Julieta Martinez, Raquel Urtasun

One of the main difficulties of scaling current localization systems to large environments is the on-board storage required for the maps.

Learning to Localize Using a LiDAR Intensity Map

no code implementations 20 Dec 2020 Ioan Andrei Bârsan, Shenlong Wang, Andrei Pokrovsky, Raquel Urtasun

In this paper we propose a real-time, calibration-agnostic and effective localization system for self-driving cars.

Self-Driving Cars

A PAC-Bayesian Approach to Generalization Bounds for Graph Neural Networks

no code implementations ICLR 2021 Renjie Liao, Raquel Urtasun, Richard Zemel

In this paper, we derive generalization bounds for the two primary classes of graph neural networks (GNNs), namely graph convolutional networks (GCNs) and message passing GNNs (MPGNNs), via a PAC-Bayesian approach.

Generalization Bounds

GeoNet++: Iterative Geometric Neural Network with Edge-Aware Refinement for Joint Depth and Surface Normal Estimation

2 code implementations 13 Dec 2020 Xiaojuan Qi, Zhengzhe Liu, Renjie Liao, Philip H. S. Torr, Raquel Urtasun, Jiaya Jia

Note that GeoNet++ is generic and can be used in other depth/normal prediction frameworks to improve the quality of 3D reconstruction and pixel-wise accuracy of depth and surface normals.

3D Reconstruction Depth Estimation +2

Recovering and Simulating Pedestrians in the Wild

no code implementations 16 Nov 2020 Ze Yang, Siva Manivasagam, Ming Liang, Bin Yang, Wei-Chiu Ma, Raquel Urtasun

We then incorporate the reconstructed pedestrian assets bank in a realistic LiDAR simulation system by performing motion retargeting, and show that the simulated LiDAR data can be used to significantly reduce the amount of annotated real-world data required for visual perception tasks.

Data Augmentation motion retargeting

MuSCLE: Multi Sweep Compression of LiDAR using Deep Entropy Models

no code implementations NeurIPS 2020 Sourav Biswas, Jerry Liu, Kelvin Wong, Shenlong Wang, Raquel Urtasun

Our model exploits spatio-temporal relationships across multiple LiDAR sweeps to reduce the bitrate of both geometry and intensity values.

StrObe: Streaming Object Detection from LiDAR Packets

no code implementations 12 Nov 2020 Davi Frossard, Simon Suo, Sergio Casas, James Tu, Rui Hu, Raquel Urtasun

In this paper we propose StrObe, a novel approach that minimizes latency by ingesting LiDAR packets and emitting a stream of detections without waiting for the full sweep to be built.

Object object-detection +1

Universal Embeddings for Spatio-Temporal Tagging of Self-Driving Logs

no code implementations 12 Nov 2020 Sean Segal, Eric Kee, Wenjie Luo, Abbas Sadat, Ersin Yumer, Raquel Urtasun

In this paper, we tackle the problem of spatio-temporal tagging of self-driving scenes from raw sensor data.

Blocking TAG +1

Learning to Communicate and Correct Pose Errors

no code implementations 10 Nov 2020 Nicholas Vadivelu, Mengye Ren, James Tu, Jingkang Wang, Raquel Urtasun

Learned communication makes multi-agent systems more effective by aggregating distributed information.

Motion Forecasting object-detection +1

Perceive, Attend, and Drive: Learning Spatial Attention for Safe Self-Driving

no code implementations 2 Nov 2020 Bob Wei, Mengye Ren, Wenyuan Zeng, Ming Liang, Bin Yang, Raquel Urtasun

In this paper, we propose an end-to-end self-driving network featuring a sparse attention module that learns to automatically attend to important regions of the input.

Motion Planning

LiRaNet: End-to-End Trajectory Prediction using Spatio-Temporal Radar Fusion

no code implementations 2 Oct 2020 Meet Shah, Zhiling Huang, Ankit Laddha, Matthew Langford, Blake Barber, Sidney Zhang, Carlos Vallespi-Gonzalez, Raquel Urtasun

In this paper, we present LiRaNet, a novel end-to-end trajectory prediction method which utilizes radar sensor information along with widely used lidar and high definition (HD) maps.

Trajectory Prediction

Conditional Entropy Coding for Efficient Video Compression

no code implementations ECCV 2020 Jerry Liu, Shenlong Wang, Wei-Chiu Ma, Meet Shah, Rui Hu, Pranaab Dhawan, Raquel Urtasun

We propose a very simple and efficient video compression framework that only focuses on modeling the conditional entropy between frames.

MS-SSIM SSIM +1

Weakly-supervised 3D Shape Completion in the Wild

no code implementations ECCV 2020 Jiayuan Gu, Wei-Chiu Ma, Sivabalan Manivasagam, Wenyuan Zeng, ZiHao Wang, Yuwen Xiong, Hao Su, Raquel Urtasun

3D shape completion for real data is important but challenging, since partial point clouds acquired by real-world sensors are usually sparse, noisy and unaligned.

Point Cloud Registration Pose Estimation

V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction

3 code implementations ECCV 2020 Tsun-Hsuan Wang, Sivabalan Manivasagam, Ming Liang, Bin Yang, Wenyuan Zeng, James Tu, Raquel Urtasun

In this paper, we explore the use of vehicle-to-vehicle (V2V) communication to improve the perception and motion forecasting performance of self-driving vehicles.

3D Object Detection Motion Forecasting

DSDNet: Deep Structured self-Driving Network

no code implementations ECCV 2020 Wenyuan Zeng, Shenlong Wang, Renjie Liao, Yun Chen, Bin Yang, Raquel Urtasun

In this paper, we propose the Deep Structured self-Driving Network (DSDNet), which performs object detection, motion prediction, and motion planning with a single neural network.

Motion Planning motion prediction +2

Perceive, Predict, and Plan: Safe Motion Planning Through Interpretable Semantic Representations

no code implementations ECCV 2020 Abbas Sadat, Sergio Casas, Mengye Ren, Xinyu Wu, Pranaab Dhawan, Raquel Urtasun

In this paper we propose a novel end-to-end learnable network that performs joint perception, prediction and motion planning for self-driving vehicles and produces interpretable intermediate representations.

Motion Planning

LoCo: Local Contrastive Representation Learning

no code implementations NeurIPS 2020 Yuwen Xiong, Mengye Ren, Raquel Urtasun

Deep neural nets typically perform end-to-end backpropagation to learn the weights, a procedure that creates synchronization constraints in the weight update step across layers and is not biologically plausible.

Contrastive Learning Decoder +5

LevelSet R-CNN: A Deep Variational Method for Instance Segmentation

no code implementations 30 Jul 2020 Namdar Homayounfar, Yuwen Xiong, Justin Liang, Wei-Chiu Ma, Raquel Urtasun

Obtaining precise instance segmentation masks is of high importance in many modern applications such as robotic manipulation and autonomous driving.

Autonomous Driving Instance Segmentation +2

RadarNet: Exploiting Radar for Robust Perception of Dynamic Objects

no code implementations ECCV 2020 Bin Yang, Runsheng Guo, Ming Liang, Sergio Casas, Raquel Urtasun

We tackle the problem of exploiting Radar for perception in the context of self-driving as Radar provides complementary information to other sensors such as LiDAR or cameras in the form of Doppler velocity.

object-detection Object Detection

Learning Lane Graph Representations for Motion Forecasting

1 code implementation ECCV 2020 Ming Liang, Bin Yang, Rui Hu, Yun Chen, Renjie Liao, Song Feng, Raquel Urtasun

We propose a motion forecasting model that exploits a novel structured map representation as well as actor-map interactions.

Motion Forecasting Trajectory Prediction

Implicit Latent Variable Model for Scene-Consistent Motion Forecasting

no code implementations ECCV 2020 Sergio Casas, Cole Gulino, Simon Suo, Katie Luo, Renjie Liao, Raquel Urtasun

In order to plan a safe maneuver, an autonomous vehicle must accurately perceive its environment and understand the interactions among traffic participants.

Decoder Motion Forecasting +1

Hierarchical Verification for Adversarial Robustness

no code implementations ICML 2020 Cong Han Lim, Raquel Urtasun, Ersin Yumer

We show that, under certain conditions on the algorithm parameters, LayerCert provably reduces the number and size of the convex programs that one needs to solve compared to GeoCert.

Adversarial Robustness

LiDARsim: Realistic LiDAR Simulation by Leveraging the Real World

no code implementations CVPR 2020 Sivabalan Manivasagam, Shenlong Wang, Kelvin Wong, Wenyuan Zeng, Mikita Sazanovich, Shuhan Tan, Bin Yang, Wei-Chiu Ma, Raquel Urtasun

We first utilize ray casting over the 3D scene and then use a deep neural network to produce deviations from the physics-based simulation, producing realistic LiDAR point clouds.

The Importance of Prior Knowledge in Precise Multimodal Prediction

no code implementations 4 Jun 2020 Sergio Casas, Cole Gulino, Simon Suo, Raquel Urtasun

Towards this goal, we design a framework that leverages REINFORCE to incorporate non-differentiable priors over sample trajectories from a probabilistic model, thus optimizing the whole distribution.

Motion Forecasting Motion Planning

ShapeAdv: Generating Shape-Aware Adversarial 3D Point Clouds

no code implementations 24 May 2020 Kibok Lee, Zhuoyuan Chen, Xinchen Yan, Raquel Urtasun, Ersin Yumer

Our shape-aware adversarial attacks are orthogonal to existing point cloud based attacks and shed light on the vulnerability of 3D deep neural networks.

Physically Realizable Adversarial Examples for LiDAR Object Detection

no code implementations CVPR 2020 James Tu, Mengye Ren, Siva Manivasagam, Ming Liang, Bin Yang, Richard Du, Frank Cheng, Raquel Urtasun

Modern autonomous driving systems rely heavily on deep learning models to process point cloud sensory data; meanwhile, deep models have been shown to be susceptible to adversarial attacks with visually imperceptible perturbations.

Adversarial Defense Autonomous Driving +4

Dense RepPoints: Representing Visual Objects with Dense Point Sets

2 code implementations ECCV 2020 Ze Yang, Yinghao Xu, Han Xue, Zheng Zhang, Raquel Urtasun, Li-Wei Wang, Stephen Lin, Han Hu

We present a new object representation, called Dense RepPoints, that utilizes a large set of points to describe an object at multiple levels, including both box level and pixel level.

Object Object Detection

PolyTransform: Deep Polygon Transformer for Instance Segmentation

no code implementations CVPR 2020 Justin Liang, Namdar Homayounfar, Wei-Chiu Ma, Yuwen Xiong, Rui Hu, Raquel Urtasun

In this paper, we propose PolyTransform, a novel instance segmentation algorithm that produces precise, geometry-preserving masks by combining the strengths of prevailing segmentation approaches and modern polygon-based methods.

Ranked #1000000000 on Instance Segmentation on Cityscapes test (using extra training data)

Instance Segmentation Segmentation +1

Identifying Unknown Instances for Autonomous Driving

no code implementations 24 Oct 2019 Kelvin Wong, Shenlong Wang, Mengye Ren, Ming Liang, Raquel Urtasun

In the past few years, we have seen great progress in perception algorithms, particularly through the use of deep learning.

Autonomous Driving Instance Segmentation +1

Jointly Learnable Behavior and Trajectory Planning for Self-Driving Vehicles

no code implementations 10 Oct 2019 Abbas Sadat, Mengye Ren, Andrei Pokrovsky, Yen-Chen Lin, Ersin Yumer, Raquel Urtasun

The motion planners used in self-driving vehicles need to generate trajectories that are safe, comfortable, and obey the traffic rules.

Trajectory Planning

Learning to Remember from a Multi-Task Teacher

no code implementations 10 Oct 2019 Yuwen Xiong, Mengye Ren, Raquel Urtasun

Recent studies on catastrophic forgetting during sequential learning typically focus on fixing the accuracy of the predictions for a previously learned task.

Meta-Learning

DSIC: Deep Stereo Image Compression

1 code implementation ICCV 2019 Jerry Liu, Shenlong Wang, Raquel Urtasun

In this paper we tackle the problem of stereo image compression, and leverage the fact that the two images have overlapping fields of view to further compress the representations.

Decoder Image Compression

DeepSignals: Predicting Intent of Drivers Through Visual Signals

no code implementations 3 May 2019 Davi Frossard, Eric Kee, Raquel Urtasun

Detecting the intention of drivers is an essential task in self-driving, necessary to anticipate sudden events like lane changes and stops.

LanczosNet: Multi-Scale Deep Graph Convolutional Networks

1 code implementation ICLR 2019 Renjie Liao, Zhizhen Zhao, Raquel Urtasun, Richard S. Zemel

We propose the Lanczos network (LanczosNet), which uses the Lanczos algorithm to construct low rank approximations of the graph Laplacian for graph convolution.

Node Classification

SurfConv: Bridging 3D and 2D Convolution for RGBD Images

1 code implementation CVPR 2018 Hang Chu, Wei-Chiu Ma, Kaustav Kundu, Raquel Urtasun, Sanja Fidler

On the other hand, 3D convolution wastes a large amount of memory on mostly unoccupied 3D space, which consists of only the surface visible to the sensor.

3D Semantic Segmentation

Graph HyperNetworks for Neural Architecture Search

1 code implementation ICLR 2019 Chris Zhang, Mengye Ren, Raquel Urtasun

Neural architecture search (NAS) automatically finds the best task-specific neural network topology, outperforming many manual architecture designs.

Neural Architecture Search

MLPrune: Multi-Layer Pruning for Automated Neural Network Compression

no code implementations 27 Sep 2018 Wenyuan Zeng, Raquel Urtasun

Model compression can significantly reduce the computation and memory footprint of large neural networks.

Neural Network Compression

Single Image Intrinsic Decomposition without a Single Intrinsic Image

no code implementations ECCV 2018 Wei-Chiu Ma, Hang Chu, Bolei Zhou, Raquel Urtasun, Antonio Torralba

At inference time, our model can be easily reduced to a single stream module that performs intrinsic decomposition on a single input image.

Intrinsic Image Decomposition

End-to-end Learning of Multi-sensor 3D Tracking by Detection

no code implementations 29 Jun 2018 Davi Frossard, Raquel Urtasun

In this paper we propose a novel approach to tracking by detection that can exploit both cameras as well as LIDAR data to produce very accurate 3D trajectories.

Multiple Object Tracking

Matching Adversarial Networks

no code implementations CVPR 2018 Gellért Máttyus, Raquel Urtasun

We argue that the main difficulty of applying CGANs to supervised tasks is that the generator training consists of optimizing a loss function that does not depend directly on the ground truth labels.

Depth Estimation Line Detection +1

Learning to Reweight Examples for Robust Deep Learning

9 code implementations ICML 2018 Mengye Ren, Wenyuan Zeng, Bin Yang, Raquel Urtasun

Deep neural networks have been shown to be very powerful modeling tools for many supervised learning tasks involving complex input patterns.

Meta-Learning

Inference in Probabilistic Graphical Models by Graph Neural Networks

1 code implementation 21 Mar 2018 KiJung Yoon, Renjie Liao, Yuwen Xiong, Lisa Zhang, Ethan Fetaya, Raquel Urtasun, Richard Zemel, Xaq Pitkow

Message-passing algorithms, such as belief propagation, are a natural way to disseminate evidence amongst correlated variables while exploiting the graph structure, but these algorithms can struggle when the conditional dependency graphs contain loops.

Decision Making

Reviving and Improving Recurrent Back-Propagation

1 code implementation ICML 2018 Renjie Liao, Yuwen Xiong, Ethan Fetaya, Lisa Zhang, KiJung Yoon, Xaq Pitkow, Raquel Urtasun, Richard Zemel

We examine all RBP variants along with BPTT and TBPTT in three different application domains: associative memory with continuous Hopfield networks, document classification in citation networks using graph neural networks and hyperparameter optimization for fully connected networks.

Document Classification Hyperparameter Optimization

Learning deep structured active contours end-to-end

2 code implementations CVPR 2018 Diego Marcos, Devis Tuia, Benjamin Kellenberger, Lisa Zhang, Min Bai, Renjie Liao, Raquel Urtasun

The world is covered with millions of buildings, and precisely knowing each instance's position and extents is vital to a multitude of applications.

Instance Segmentation Segmentation +1

SBNet: Sparse Blocks Network for Fast Inference

2 code implementations CVPR 2018 Mengye Ren, Andrei Pokrovsky, Bin Yang, Raquel Urtasun

Conventional deep convolutional neural networks (CNNs) apply convolution operators uniformly in space across all feature maps for hundreds of layers - this incurs a high computational cost for real-time applications.

3D Object Detection Object +2

Be Your Own Prada: Fashion Synthesis with Structural Coherence

no code implementations ICCV 2017 Shizhan Zhu, Sanja Fidler, Raquel Urtasun, Dahua Lin, Chen Change Loy

In the second stage, a generative model with a newly proposed compositional mapping layer is used to render the final image with precise regions and textures conditioned on this map.

Fashion Synthesis Semantic Segmentation +1

3D Graph Neural Networks for RGBD Semantic Segmentation

2 code implementations ICCV 2017 Xiaojuan Qi, Renjie Liao, Jiaya Jia, Sanja Fidler, Raquel Urtasun

Each node in the graph corresponds to a set of points and is associated with a hidden representation vector initialized with an appearance feature extracted by a unary CNN from 2D images.

Ranked #30 on Semantic Segmentation on SUN-RGBD (using extra training data)

RGBD Semantic Segmentation Semantic Segmentation

DeepRoadMapper: Extracting Road Topology From Aerial Images

no code implementations ICCV 2017 Gellert Mattyus, Wenjie Luo, Raquel Urtasun

In contrast, in this paper we propose an approach that directly estimates road topology from aerial images.

Autonomous Driving

SGN: Sequential Grouping Networks for Instance Segmentation

no code implementations ICCV 2017 Shu Liu, Jiaya Jia, Sanja Fidler, Raquel Urtasun

By exploiting two-directional information, the second network groups horizontal and vertical lines into connected components.

Instance Segmentation Object +1

Deep Spectral Clustering Learning

no code implementations ICML 2017 Marc T. Law, Raquel Urtasun, Richard S. Zemel

We derive a closed-form expression for the gradient that is efficient to compute: the complexity to compute the gradient is linear in the size of the training mini-batch and quadratic in the representation dimensionality.

Clustering Metric Learning

The Reversible Residual Network: Backpropagation Without Storing Activations

9 code implementations NeurIPS 2017 Aidan N. Gomez, Mengye Ren, Raquel Urtasun, Roger B. Grosse

Deep residual networks (ResNets) have significantly pushed forward the state-of-the-art on image classification, increasing in performance as networks grow both deeper and wider.

General Classification Image Classification

Sports Field Localization via Deep Structured Models

no code implementations CVPR 2017 Namdar Homayounfar, Sanja Fidler, Raquel Urtasun

In this work, we propose a novel way of efficiently localizing a sports field from a single broadcast image of the game.

Semantic Segmentation

Annotating Object Instances with a Polygon-RNN

2 code implementations CVPR 2017 Lluis Castrejon, Kaustav Kundu, Raquel Urtasun, Sanja Fidler

We show that our approach speeds up the annotation process by a factor of 4.7 across all classes in Cityscapes, while achieving 78.4% agreement in IoU with original ground-truth, matching the typical agreement between human annotators.

Object Segmentation +1

Towards Diverse and Natural Image Descriptions via a Conditional GAN

1 code implementation ICCV 2017 Bo Dai, Sanja Fidler, Raquel Urtasun, Dahua Lin

Despite the substantial progress in recent years, the image captioning techniques are still far from being perfect. Sentences produced by existing methods, e.g. those based on RNNs, are often overly rigid and lacking in variability.

Image Captioning

MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving

16 code implementations 22 Dec 2016 Marvin Teichmann, Michael Weber, Marius Zoellner, Roberto Cipolla, Raquel Urtasun

While most approaches to semantic reasoning have focused on improving performance, in this paper we argue that computational times are very important in order to enable real time applications such as autonomous driving.

Autonomous Driving General Classification +2

Proximal Deep Structured Models

no code implementations NeurIPS 2016 Shenlong Wang, Sanja Fidler, Raquel Urtasun

Many problems in real-world applications involve predicting continuous-valued random variables that are statistically related.

Image Denoising Optical Flow Estimation

TorontoCity: Seeing the World with a Million Eyes

no code implementations ICCV 2017 Shenlong Wang, Min Bai, Gellert Mattyus, Hang Chu, Wenjie Luo, Bin Yang, Justin Liang, Joel Cheverie, Sanja Fidler, Raquel Urtasun

In this paper we introduce the TorontoCity benchmark, which covers the full greater Toronto area (GTA) with 712.5 $km^2$ of land, 8439 $km$ of road and around 400,000 buildings.

Instance Segmentation Semantic Segmentation

Deep Watershed Transform for Instance Segmentation

3 code implementations CVPR 2017 Min Bai, Raquel Urtasun

Most contemporary approaches to instance segmentation use complex pipelines involving conditional random fields, recurrent neural networks, object proposals, or template matching schemes.

Instance Segmentation Object +3

Efficient Summarization with Read-Again and Copy Mechanism

no code implementations 10 Nov 2016 Wenyuan Zeng, Wenjie Luo, Sanja Fidler, Raquel Urtasun

Towards this goal, we first introduce a simple mechanism that first reads the input sequence before committing to a representation of each word.

Decoder

Find your Way by Observing the Sun and Other Semantic Cues

no code implementations 23 Jun 2016 Wei-Chiu Ma, Shenlong Wang, Marcus A. Brubaker, Sanja Fidler, Raquel Urtasun

In this paper we present a robust, efficient and affordable approach to self-localization which requires neither GPS nor knowledge about the appearance of the world.

HD Maps: Fine-Grained Road Segmentation by Parsing Ground and Aerial Images

no code implementations CVPR 2016 Gellert Mattyus, Shenlong Wang, Sanja Fidler, Raquel Urtasun

In this paper we present an approach to enhance existing maps with fine grained segmentation categories such as parking spots and sidewalk, as well as the number and location of road lanes.

Road Segmentation

Soccer Field Localization from a Single Image

no code implementations 10 Apr 2016 Namdar Homayounfar, Sanja Fidler, Raquel Urtasun

In this work, we propose a novel way of efficiently localizing a soccer field from a single broadcast image of the game.

Exploiting Semantic Information and Deep Matching for Optical Flow

no code implementations 6 Apr 2016 Min Bai, Wenjie Luo, Kaustav Kundu, Raquel Urtasun

We tackle the problem of estimating optical flow from a monocular camera in the context of autonomous driving.

Autonomous Driving Optical Flow Estimation

Instance-Level Segmentation for Autonomous Driving with Deep Densely Connected MRFs

no code implementations CVPR 2016 Ziyu Zhang, Sanja Fidler, Raquel Urtasun

Our aim is to provide a pixel-wise instance-level labeling of a monocular image in the context of autonomous driving.

Autonomous Driving

Enhancing Road Maps by Parsing Aerial Images Around the World

no code implementations ICCV 2015 Gellert Mattyus, Shenlong Wang, Sanja Fidler, Raquel Urtasun

In recent years, contextual models that exploit maps have been shown to be very effective for many recognition and localization tasks.

Semantic Segmentation

Lost Shopping! Monocular Localization in Large Indoor Spaces

no code implementations ICCV 2015 Shenlong Wang, Sanja Fidler, Raquel Urtasun

In this paper we propose a novel approach to localization in very large indoor spaces (i.e., 200+ store shopping malls) that takes a single image and a floor plan of the environment as input.

Text Detection Translation

Order-Embeddings of Images and Language

2 code implementations 19 Nov 2015 Ivan Vendrov, Ryan Kiros, Sanja Fidler, Raquel Urtasun

Hypernymy, textual entailment, and image captioning can be seen as special cases of a single visual-semantic hierarchy over words, sentences, and images.

Cross-Modal Retrieval Image Captioning +2

Reduced-Precision Strategies for Bounded Memory in Deep Neural Nets

no code implementations 17 Nov 2015 Patrick Judd, Jorge Albericio, Tayler Hetherington, Tor Aamodt, Natalie Enright Jerger, Raquel Urtasun, Andreas Moshovos

A diverse set of CNNs is analyzed showing that compared to a conventional implementation using a 32-bit floating-point representation for all layers, and with less than 1% loss in relative accuracy, the data footprint required by these networks can be reduced by an average of 74% and up to 92%.

Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books

3 code implementations ICCV 2015 Yukun Zhu, Ryan Kiros, Richard Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, Sanja Fidler

Books are a rich source of both fine-grained information (what a character, an object or a scene looks like) and high-level semantics (what someone is thinking and feeling, and how these states evolve through a story).

Descriptive Sentence +2

Skip-Thought Vectors

16 code implementations NeurIPS 2015 Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard S. Zemel, Antonio Torralba, Raquel Urtasun, Sanja Fidler

The end result is an off-the-shelf encoder that can produce highly generic sentence representations that are robust and perform well in practice.

Decoder Sentence

Rent3D: Floor-Plan Priors for Monocular Layout Estimation

no code implementations CVPR 2015 Chenxi Liu, Alexander G. Schwing, Kaustav Kundu, Raquel Urtasun, Sanja Fidler

What sets us apart from past work in layout estimation is the use of floor plans as a source of prior knowledge, as well as localization of each image within a bigger space (apartment).

Learning to Segment Under Various Forms of Weak Supervision

no code implementations CVPR 2015 Jia Xu, Alexander G. Schwing, Raquel Urtasun

Despite the promising performance of conventional fully supervised algorithms, semantic segmentation has remained an important, yet challenging task.

Segmentation Semantic Segmentation

Fully Connected Deep Structured Networks

no code implementations 9 Mar 2015 Alexander G. Schwing, Raquel Urtasun

Convolutional neural networks with many layers have recently been shown to achieve excellent results on many high-level tasks such as image classification, object detection and more recently also semantic segmentation.

General Classification Image Classification +6

Generating Multi-Sentence Lingual Descriptions of Indoor Scenes

no code implementations 28 Feb 2015 Dahua Lin, Chen Kong, Sanja Fidler, Raquel Urtasun

This paper proposes a novel framework for generating lingual descriptions of indoor scenes.

Sentence Text Generation

segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection

no code implementations CVPR 2015 Yukun Zhu, Raquel Urtasun, Ruslan Salakhutdinov, Sanja Fidler

In this paper, we propose an approach that exploits object segmentation in order to improve the accuracy of object detection.

Object object-detection