no code implementations • 1 May 2025 • Chih-Hao Lin, Zian Wang, Ruofan Liang, Yuxuan Zhang, Sanja Fidler, Shenlong Wang, Zan Gojcic
Generating realistic and controllable weather effects in videos is valuable for many applications.
no code implementations • 21 Apr 2025 • Hongchi Xia, Entong Su, Marius Memmel, Arhan Jain, Raymond Yu, Numfor Mbiziwo-Tiapo, Ali Farhadi, Abhishek Gupta, Shenlong Wang, Wei-Chiu Ma
Creating virtual digital replicas from real-world data unlocks significant potential across domains like gaming and robotics.
1 code implementation • 27 Mar 2025 • David Yifan Yao, Albert J. Zhai, Shenlong Wang
This paper presents a unified approach to understanding dynamic scenes from casual videos.
no code implementations • 26 Mar 2025 • Boyuan Chen, Hanxiao Jiang, Shaowei Liu, Saurabh Gupta, Yunzhu Li, Hao Zhao, Shenlong Wang
Envisioning physically plausible outcomes from a single image requires a deep understanding of the world's dynamics.
1 code implementation • 23 Mar 2025 • Hanxiao Jiang, Hao-Yu Hsu, Kaifeng Zhang, Hsin-Ni Yu, Shenlong Wang, Yunzhu Li
Creating a physical digital twin of a real-world object has immense potential in robotics, content creation, and XR.
1 code implementation • 11 Mar 2025 • Kai Deng, Jian Yang, Shenlong Wang, Jin Xie
Consequently, GigaSLAM delivers high-precision tracking and visually faithful rendering on urban outdoor benchmarks, establishing a robust SLAM solution for large-scale, long-term scenarios, and significantly extending the applicability of Gaussian Splatting SLAM systems to unbounded outdoor environments.
no code implementations • 28 Feb 2025 • Joao Marcos Correia Marques, Nils Dengler, Tobias Zaenker, Jesper Mucke, Shenlong Wang, Maren Bennewitz, Kris Hauser
To tackle this, we define a POMDP whose belief is summarized by a metric-semantic grid map and propose a novel framework that uses neural networks to perform map-space belief updates to reason efficiently and simultaneously about object geometries, locations, categories, occlusions, and manipulation physics.
no code implementations • 13 Feb 2025 • Jing Wen, Alexander G. Schwing, Shenlong Wang
Generalizable rendering of an animatable human avatar from sparse inputs relies on data priors and inductive biases extracted from training on large data to avoid scene-specific optimization and to enable fast reconstruction.
no code implementations • 14 Nov 2024 • Albert J. Zhai, Xinlei Wang, Kaiyuan Li, Zhao Jiang, Junxiong Zhou, Sheng Wang, Zhenong Jin, Kaiyu Guan, Shenlong Wang
The ability to automatically build 3D digital twins of plants from images has countless applications in agriculture, environmental science, robotics, and other fields.
1 code implementation • 4 Nov 2024 • Hao-Yu Hsu, Zhi-Hao Lin, Albert Zhai, Hongchi Xia, Shenlong Wang
Modern visual effects (VFX) software has made it possible for skilled artists to create imagery of virtually anything.
1 code implementation • 27 Sep 2024 • Shaowei Liu, Zhongzheng Ren, Saurabh Gupta, Shenlong Wang
We present PhysGen, a novel image-to-video generation method that converts a single image and an input condition (e.g., a force and torque applied to an object in the image) into a realistic, physically plausible, and temporally consistent video.
no code implementations • 24 Sep 2024 • Jae Yong Lee, Yuqun Wu, Chuhang Zou, Derek Hoiem, Shenlong Wang
The goal of this paper is to encode a 3D scene into an extremely compact representation from 2D images and to enable its transmission, decoding, and rendering in real time across various platforms.
no code implementations • 6 Sep 2024 • Boyuan Tian, Yihan Pang, Muhammad Huzaifa, Shenlong Wang, Sarita Adve
We identify a Pareto-optimal curve and show that the designs on the curve are achievable only through synergistic co-optimization of all three optimization classes and by considering the latency and accuracy needs of downstream scene reconstruction consumers.
1 code implementation • 2 Sep 2024 • Ansh Sharma, Albert Xiao, Praneet Rathi, Rohit Kundu, Albert Zhai, Yuan Shen, Shenlong Wang
In this work, we present a novel method for extensive multi-scale generative terrain modeling.
no code implementations • 2 Jun 2024 • Yuan Shen, Duygu Ceylan, Paul Guerrero, Zexiang Xu, Niloy J. Mitra, Shenlong Wang, Anna Frühstück
We demonstrate that it is possible to directly repurpose existing (pretrained) video models for 3D super-resolution and thus sidestep the problem of the shortage of large repositories of high-quality 3D training models.
no code implementations • 15 Apr 2024 • Hongchi Xia, Zhi-Hao Lin, Wei-Chiu Ma, Shenlong Wang
Creating high-quality and interactive virtual environments, such as games and simulators, often involves complex and costly manual modeling processes.
no code implementations • 12 Apr 2024 • Yuqun Wu, Jae Yong Lee, Chuhang Zou, Shenlong Wang, Derek Hoiem
The latest regularized Neural Radiance Field (NeRF) approaches produce poor geometry and view extrapolation for large scale sparse view scenes, such as ETH3D.
1 code implementation • CVPR 2024 • Jing Wen, Xiaoming Zhao, Zhongzheng Ren, Alexander G. Schwing, Shenlong Wang
We introduce GoMAvatar, a novel approach for real-time, memory-efficient, high-quality animatable human modeling.
no code implementations • CVPR 2024 • Albert J. Zhai, Yuan Shen, Emily Y. Chen, Gloria X. Wang, Xinlei Wang, Sheng Wang, Kaiyu Guan, Shenlong Wang
Can computers perceive the physical properties of objects solely through vision?
1 code implementation • 3 Apr 2024 • Vlas Zyrianov, Henry Che, Zhijian Liu, Shenlong Wang
We present LidarDM, a novel LiDAR generative model capable of producing realistic, layout-aware, physically plausible, and temporally coherent LiDAR videos.
1 code implementation • 23 Feb 2024 • Hanxiao Jiang, Binghao Huang, Ruihai Wu, Zhuoran Li, Shubham Garg, Hooshang Nayyeri, Shenlong Wang, Yunzhu Li
We introduce the novel task of interactive scene exploration, wherein robots autonomously explore environments and produce an action-conditioned scene graph (ACSG) that captures the structure of the underlying environment.
no code implementations • 23 Jan 2024 • Chih-Hao Lin, Jia-Bin Huang, Zhengqin Li, Zhao Dong, Christian Richardt, Tuotuo Li, Michael Zollhöfer, Johannes Kopf, Shenlong Wang, Changil Kim
Our results show that IRIS effectively recovers HDR lighting, accurate material, and plausible camera response functions, supporting photorealistic relighting and object insertion.
1 code implementation • 17 Jan 2024 • Benjamin Ummenhofer, Sanskar Agrawal, Rene Sepulveda, Yixing Lao, Kai Zhang, Tianhang Cheng, Stephan Richter, Shenlong Wang, German Ros
Reconstructing an object from photos and placing it virtually in a new environment goes beyond the standard novel view synthesis task: the object's appearance must adapt not only to the novel viewpoint but also to the new lighting conditions. Yet evaluations of inverse rendering methods rely on novel view synthesis data or simplistic synthetic datasets for quantitative analysis.
1 code implementation • NeurIPS 2023 • Tianhang Cheng, Wei-Chiu Ma, Kaiyu Guan, Antonio Torralba, Shenlong Wang
Our world is full of identical objects (e.g., cans of Coke, cars of the same model).
1 code implementation • 16 Nov 2023 • Joao Marcos Correia Marques, Albert Zhai, Shenlong Wang, Kris Hauser
Semantic 3D mapping, the process of fusing depth and image segmentation information between multiple views to build 3D maps annotated with object classes in real-time, is a recent topic of interest.
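The fusion step described above can be illustrated with a generic recursive Bayesian update of a per-voxel class distribution — a minimal sketch of the standard rule (multiply and renormalize), not necessarily the exact update used in this paper:

```python
import numpy as np

def fuse_observation(voxel_probs, obs_probs, eps=1e-9):
    """Fuse a per-voxel class distribution with a new per-view softmax
    observation: elementwise product followed by renormalization."""
    fused = voxel_probs * obs_probs
    return fused / (fused.sum() + eps)

prior = np.array([0.25, 0.25, 0.25, 0.25])   # uniform prior over 4 classes
view1 = np.array([0.70, 0.10, 0.10, 0.10])   # segmentation from view 1
view2 = np.array([0.60, 0.20, 0.10, 0.10])   # segmentation from view 2
post = fuse_observation(fuse_observation(prior, view1), view2)
print(post.argmax())  # class 0 dominates after two consistent views
```

Each additional consistent view sharpens the distribution, which is why multi-view fusion outperforms any single segmentation.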
1 code implementation • ICCV 2023 • Shaowei Liu, Yang Zhou, Jimei Yang, Saurabh Gupta, Shenlong Wang
This paper presents a novel object-centric contact representation ContactGen for hand-object interaction.
no code implementations • ICCV 2023 • Xiyue Zhu, Vlas Zyrianov, Zhijian Liu, Shenlong Wang
Despite tremendous advancements in bird's-eye view (BEV) perception, existing models fall short in generating realistic and coherent semantic map layouts, and they fail to account for uncertainties arising from partial sensor information (such as occlusion or limited coverage).
no code implementations • 15 Jun 2023 • Chih-Hao Lin, Bohan Liu, Yi-Ting Chen, Kuan-Sheng Chen, David Forsyth, Jia-Bin Huang, Anand Bhattad, Shenlong Wang
We present UrbanIR (Urban Scene Inverse Rendering), a new inverse graphics model that enables realistic, free-viewpoint renderings of scenes under various lighting conditions with a single video.
1 code implementation • CVPR 2023 • Shaowei Liu, Saurabh Gupta, Shenlong Wang
We build rearticulable models for arbitrary everyday man-made objects containing an arbitrary number of parts that are connected in arbitrary ways via one-degree-of-freedom joints.
no code implementations • ICCV 2023 • Albert J. Zhai, Shenlong Wang
In this work, we present a straightforward method for learning these regularities by predicting the locations of unobserved objects from incomplete semantic maps.
no code implementations • 2 Dec 2022 • Jae Yong Lee, Yuqun Wu, Chuhang Zou, Shenlong Wang, Derek Hoiem
Instead, we propose to encode features in bins of Fourier features that are commonly used for positional encoding.
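The Fourier features referenced here are the standard sin/cos positional encoding at geometrically spaced frequencies; a minimal NumPy sketch (illustrative, not the paper's binning scheme):

```python
import numpy as np

def fourier_features(x, num_bands=4):
    """Map scalar coordinates in [0, 1] to sin/cos features at
    geometrically spaced frequencies (standard positional encoding)."""
    freqs = 2.0 ** np.arange(num_bands) * np.pi   # [pi, 2pi, 4pi, 8pi]
    angles = x[..., None] * freqs                 # broadcast over bands
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

coords = np.linspace(0.0, 1.0, 5)
feats = fourier_features(coords)
print(feats.shape)  # (5, 8): 4 sin bands + 4 cos bands per coordinate
```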
1 code implementation • ICCV 2023 • Yuan Li, Zhi-Hao Lin, David Forsyth, Jia-Bin Huang, Shenlong Wang
Physical simulations produce excellent predictions of weather effects.
no code implementations • 4 Nov 2022 • Yuefan Wu, Zeyuan Chen, Shaowei Liu, Zhongzheng Ren, Shenlong Wang
Recovering the skeletal shape of an animal from a monocular video is a longstanding challenge.
1 code implementation • 8 Sep 2022 • Vlas Zyrianov, Xiyue Zhu, Shenlong Wang
We present LiDARGen, a novel, effective, and controllable generative model that produces realistic LiDAR point cloud sensory readings.
no code implementations • CVPR 2022 • Wei-Chiu Ma, Anqi Joyce Yang, Shenlong Wang, Raquel Urtasun, Antonio Torralba
Similar to classic correspondences, VCs conform with epipolar geometry; unlike classic correspondences, VCs do not need to be co-visible across views.
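Conforming with epipolar geometry means a correspondence $(x_1, x_2)$ satisfies $x_2^\top F x_1 = 0$ for the fundamental matrix $F$. A tiny numerical check — the $F$ and points below are made-up illustrations, not from the paper:

```python
import numpy as np

def epipolar_residual(F, x1, x2):
    """Algebraic epipolar residual x2^T F x1 for homogeneous image points.
    Zero (up to noise) for any correspondence consistent with F."""
    return float(x2 @ F @ x1)

# Toy fundamental matrix for a pure horizontal translation (rectified stereo):
# corresponding points then lie on the same image row.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
x1 = np.array([10.0, 5.0, 1.0])  # (x, y, 1) in image 1
x2 = np.array([3.0, 5.0, 1.0])   # same row in image 2 -> residual 0
print(epipolar_residual(F, x1, x2))  # 0.0
```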
1 code implementation • CVPR 2022 • Zhi-Hao Lin, Wei-Chiu Ma, Hao-Yu Hsu, Yu-Chiang Frank Wang, Shenlong Wang
We present Neural Mixtures of Planar Experts (NeurMiPs), a novel planar-based scene representation for modeling geometry and appearance.
no code implementations • ECCV 2020 • Wei-Chiu Ma, Shenlong Wang, Jiayuan Gu, Sivabalan Manivasagam, Antonio Torralba, Raquel Urtasun
Specifically, at each iteration, the neural network takes the feedback as input and outputs an update on the current estimation.
no code implementations • 18 Jan 2021 • Min Bai, Shenlong Wang, Kelvin Wong, Ersin Yumer, Raquel Urtasun
In this paper, we introduce a non-parametric memory representation for spatio-temporal segmentation that captures the local space and time around an autonomous vehicle (AV).
no code implementations • 18 Jan 2021 • Shivam Duggal, ZiHao Wang, Wei-Chiu Ma, Sivabalan Manivasagam, Justin Liang, Shenlong Wang, Raquel Urtasun
Reconstructing high-quality 3D objects from sparse, partial observations from a single view is of crucial importance for various applications in computer vision, robotics, and graphics.
no code implementations • CVPR 2018 • Shenlong Wang, Simon Suo, Wei-Chiu Ma, Andrei Pokrovsky, Raquel Urtasun
Standard convolutional neural networks assume a grid structured input is available and exploit discrete convolutions as their fundamental building blocks.
Ranked #2 on Semantic Segmentation on S3DIS Area5 (Number of params metric)
no code implementations • CVPR 2021 • Ze Yang, Shenlong Wang, Sivabalan Manivasagam, Zeng Huang, Wei-Chiu Ma, Xinchen Yan, Ersin Yumer, Raquel Urtasun
Constructing and animating humans is an important component for building virtual worlds in a wide variety of applications such as virtual reality or robotics testing in simulation.
no code implementations • 17 Jan 2021 • Anqi Joyce Yang, Can Cui, Ioan Andrei Bârsan, Raquel Urtasun, Shenlong Wang
Existing multi-camera SLAM systems assume synchronized shutters for all cameras, which is often not the case in practice.
no code implementations • CVPR 2021 • Shuhan Tan, Kelvin Wong, Shenlong Wang, Sivabalan Manivasagam, Mengye Ren, Raquel Urtasun
Existing methods typically insert actors into the scene according to a set of hand-crafted heuristics and are limited in their ability to model the true complexity and diversity of real traffic scenes, thus inducing a content gap between synthesized and real traffic scenes.
no code implementations • CVPR 2021 • Yun Chen, Frieda Rong, Shivam Duggal, Shenlong Wang, Xinchen Yan, Sivabalan Manivasagam, Shangjie Xue, Ersin Yumer, Raquel Urtasun
Scalable sensor simulation is an important yet challenging open problem for safety-critical domains such as self-driving.
1 code implementation • 23 Dec 2020 • Julieta Martinez, Sasha Doubov, Jack Fan, Ioan Andrei Bârsan, Shenlong Wang, Gellért Máttyus, Raquel Urtasun
We are interested in understanding whether retrieval-based localization approaches are good enough in the context of self-driving vehicles.
no code implementations • CVPR 2019 • Justin Liang, Namdar Homayounfar, Wei-Chiu Ma, Shenlong Wang, Raquel Urtasun
Creating high definition maps that contain precise information of static elements of the scene is of utmost importance for enabling self driving cars to drive safely.
no code implementations • 20 Dec 2020 • Ioan Andrei Bârsan, Shenlong Wang, Andrei Pokrovsky, Raquel Urtasun
In this paper we propose a real-time, calibration-agnostic and effective localization system for self-driving cars.
no code implementations • CVPR 2019 • Xinkai Wei, Ioan Andrei Bârsan, Shenlong Wang, Julieta Martinez, Raquel Urtasun
One of the main difficulties of scaling current localization systems to large environments is the on-board storage required for the maps.
no code implementations • ECCV 2018 • Ming Liang, Bin Yang, Shenlong Wang, Raquel Urtasun
In this paper, we propose a novel 3D object detector that can exploit both LIDAR as well as cameras to perform very accurate localization.
no code implementations • NeurIPS 2020 • Sourav Biswas, Jerry Liu, Kelvin Wong, Shenlong Wang, Raquel Urtasun
Our model exploits spatio-temporal relationships across multiple LiDAR sweeps to reduce the bitrate of both geometry and intensity values.
no code implementations • ECCV 2020 • Jerry Liu, Shenlong Wang, Wei-Chiu Ma, Meet Shah, Rui Hu, Pranaab Dhawan, Raquel Urtasun
We propose a very simple and efficient video compression framework that only focuses on modeling the conditional entropy between frames.
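The quantity being modeled is the conditional entropy $H(X_t \mid X_{t-1})$, the expected number of bits needed to code a frame given the previous one. A toy empirical estimate on symbol sequences (illustrative only; real codecs model pixels with neural networks):

```python
import math
from collections import Counter

def conditional_entropy(prev, curr):
    """Empirical H(curr | prev) in bits for two aligned symbol sequences."""
    pairs = Counter(zip(prev, curr))
    marg = Counter(prev)
    n = len(prev)
    h = 0.0
    for (p, c), cnt in pairs.items():
        p_joint = cnt / n          # empirical joint probability
        p_cond = cnt / marg[p]     # empirical conditional probability
        h -= p_joint * math.log2(p_cond)
    return h

prev = [0, 0, 1, 1, 0, 1, 0, 1]
curr = [0, 0, 1, 1, 0, 1, 0, 1]  # perfectly predictable from prev
print(conditional_entropy(prev, curr))  # 0.0 bits: nothing left to code
```

When frames are highly predictable from their predecessors, the conditional entropy — and hence the achievable bitrate — approaches zero.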
no code implementations • ECCV 2020 • Wenyuan Zeng, Shenlong Wang, Renjie Liao, Yun Chen, Bin Yang, Raquel Urtasun
In this paper, we propose the Deep Structured self-Driving Network (DSDNet), which performs object detection, motion prediction, and motion planning with a single neural network.
no code implementations • CVPR 2020 • Sivabalan Manivasagam, Shenlong Wang, Kelvin Wong, Wenyuan Zeng, Mikita Sazanovich, Shuhan Tan, Bin Yang, Wei-Chiu Ma, Raquel Urtasun
We first utilize ray casting over the 3D scene and then use a deep neural network to produce deviations from the physics-based simulation, producing realistic LiDAR point clouds.
6 code implementations • CVPR 2020 • Lila Huang, Shenlong Wang, Kelvin Wong, Jerry Liu, Raquel Urtasun
We present a novel deep compression algorithm to reduce the memory footprint of LiDAR point clouds.
no code implementations • 24 Oct 2019 • Kelvin Wong, Shenlong Wang, Mengye Ren, Ming Liang, Raquel Urtasun
In the past few years, we have seen great progress in perception algorithms, particularly through the use of deep learning.
2 code implementations • NeurIPS 2019 • Renjie Liao, Yujia Li, Yang Song, Shenlong Wang, Charlie Nash, William L. Hamilton, David Duvenaud, Raquel Urtasun, Richard S. Zemel
Our model generates graphs one block of nodes and associated edges at a time.
1 code implementation • ICCV 2019 • Shivam Duggal, Shenlong Wang, Wei-Chiu Ma, Rui Hu, Raquel Urtasun
Our goal is to significantly speed up the runtime of current state-of-the-art stereo algorithms to enable real-time inference.
1 code implementation • ICCV 2019 • Jerry Liu, Shenlong Wang, Raquel Urtasun
In this paper we tackle the problem of stereo image compression, and leverage the fact that the two images have overlapping fields of view to further compress the representations.
no code implementations • 8 Aug 2019 • Wei-Chiu Ma, Ignacio Tartavull, Ioan Andrei Bârsan, Shenlong Wang, Min Bai, Gellert Mattyus, Namdar Homayounfar, Shrinidhi Kowshika Lakshmikanth, Andrei Pokrovsky, Raquel Urtasun
In this paper we propose a novel semantic localization algorithm that exploits multiple sensors and has precision on the order of a few centimeters.
no code implementations • 4 May 2019 • Min Bai, Gellert Mattyus, Namdar Homayounfar, Shenlong Wang, Shrinidhi Kowshika Lakshmikanth, Raquel Urtasun
Reliable and accurate lane detection has been a long-standing problem in the field of autonomous driving.
no code implementations • CVPR 2019 • Wei-Chiu Ma, Shenlong Wang, Rui Hu, Yuwen Xiong, Raquel Urtasun
In this paper we tackle the problem of scene flow estimation in the context of self-driving.
no code implementations • NeurIPS 2016 • Shenlong Wang, Sanja Fidler, Raquel Urtasun
Many problems in real-world applications involve predicting continuous-valued random variables that are statistically related.
no code implementations • ICCV 2017 • Shenlong Wang, Min Bai, Gellert Mattyus, Hang Chu, Wenjie Luo, Bin Yang, Justin Liang, Joel Cheverie, Sanja Fidler, Raquel Urtasun
In this paper we introduce the TorontoCity benchmark, which covers the full greater Toronto area (GTA) with 712.5 $km^2$ of land, 8439 $km$ of road, and around 400,000 buildings.
no code implementations • 17 Nov 2016 • Shenlong Wang, Linjie Luo, Ning Zhang, Jia Li
We propose AutoScaler, a scale-attention network to explicitly optimize this trade-off in visual correspondence tasks.
no code implementations • 23 Jun 2016 • Wei-Chiu Ma, Shenlong Wang, Marcus A. Brubaker, Sanja Fidler, Raquel Urtasun
In this paper we present a robust, efficient, and affordable approach to self-localization that requires neither GPS nor knowledge about the appearance of the world.
no code implementations • CVPR 2016 • Gellert Mattyus, Shenlong Wang, Sanja Fidler, Raquel Urtasun
In this paper we present an approach to enhance existing maps with fine grained segmentation categories such as parking spots and sidewalk, as well as the number and location of road lanes.
no code implementations • CVPR 2016 • Shenlong Wang, Sean Ryan Fanello, Christoph Rhemann, Shahram Izadi, Pushmeet Kohli
In contrast to conventional approaches that rely on pairwise distance computation, our algorithm isolates distinctive pixel pairs that hit the same leaf during traversal through multiple learned tree structures.
no code implementations • ICCV 2015 • Gellert Mattyus, Shenlong Wang, Sanja Fidler, Raquel Urtasun
In recent years, contextual models that exploit maps have been shown to be very effective for many recognition and localization tasks.
no code implementations • ICCV 2015 • Shenlong Wang, Sanja Fidler, Raquel Urtasun
In this paper we propose a novel approach to localization in very large indoor spaces (i.e., shopping malls with 200+ stores) that takes a single image and a floor plan of the environment as input.
no code implementations • CVPR 2015 • Shenlong Wang, Sanja Fidler, Raquel Urtasun
In this paper we are interested in exploiting geographic priors to help outdoor scene understanding.
no code implementations • NeurIPS 2014 • Shenlong Wang, Alex Schwing, Raquel Urtasun
In this paper, we prove that every multivariate polynomial with even degree can be decomposed into a sum of convex and concave polynomials.
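A worked one-dimensional instance of such a decomposition (illustrative; the paper's construction and its choice of $\lambda$ are what the proof establishes):

```latex
f(x) = x^4 - 3x^2
     = \underbrace{x^4}_{\text{convex}} \;+\; \underbrace{(-3x^2)}_{\text{concave}} .
% More generally, for an even-degree-2d polynomial f and a sufficiently
% large \lambda > 0, one decomposition of this flavor is
f(x) = \underbrace{\bigl(f(x) + \lambda \lVert x \rVert^{2d}\bigr)}_{\text{convex}}
     \;+\; \underbrace{\bigl(-\lambda \lVert x \rVert^{2d}\bigr)}_{\text{concave}} .
```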