Search Results for author: Wei-Chiu Ma

Found 36 papers, 8 papers with code

BLINK: Multimodal Large Language Models Can See but Not Perceive

no code implementations • 18 Apr 2024 • Xingyu Fu, Yushi Hu, Bangzheng Li, Yu Feng, Haoyu Wang, Xudong Lin, Dan Roth, Noah A. Smith, Wei-Chiu Ma, Ranjay Krishna

We introduce Blink, a new benchmark for multimodal language models (LLMs) that focuses on core visual perception abilities not found in other evaluations.

Depth Estimation • Multiple-choice • +1

Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video

no code implementations • 15 Apr 2024 • Hongchi Xia, Zhi-Hao Lin, Wei-Chiu Ma, Shenlong Wang

Creating high-quality and interactive virtual environments, such as games and simulators, often involves complex and costly manual modeling processes.

Structure from Duplicates: Neural Inverse Graphics from a Pile of Objects

1 code implementation • NeurIPS 2023 • Tianhang Cheng, Wei-Chiu Ma, Kaiyu Guan, Antonio Torralba, Shenlong Wang

Our world is full of identical objects (e.g., cans of coke, cars of same model).

Image Reconstruction • Object • +1

LightSim: Neural Lighting Simulation for Urban Scenes

no code implementations • 11 Dec 2023 • Ava Pun, Gary Sun, Jingkang Wang, Yun Chen, Ze Yang, Sivabalan Manivasagam, Wei-Chiu Ma, Raquel Urtasun

Different outdoor illumination conditions drastically alter the appearance of urban scenes, and they can harm the performance of image-based robot perception systems if not seen during training.

CADSim: Robust and Scalable in-the-wild 3D Reconstruction for Controllable Sensor Simulation

no code implementations • 2 Nov 2023 • Jingkang Wang, Sivabalan Manivasagam, Yun Chen, Ze Yang, Ioan Andrei Bârsan, Anqi Joyce Yang, Wei-Chiu Ma, Raquel Urtasun

To tackle these issues, we present CADSim, which combines part-aware object-class priors via a small set of CAD models with differentiable rendering to automatically reconstruct vehicle geometry, including articulated wheels, with high-quality appearance.

3D Reconstruction

UltraLiDAR: Learning Compact Representations for LiDAR Completion and Generation

no code implementations • 2 Nov 2023 • Yuwen Xiong, Wei-Chiu Ma, Jingkang Wang, Raquel Urtasun

We show that by aligning the representation of a sparse point cloud to that of a dense point cloud, we can densify the sparse point clouds as if they were captured by a real high-density LiDAR, drastically reducing the cost.

UniSim: A Neural Closed-Loop Sensor Simulator

no code implementations • CVPR 2023 • Ze Yang, Yun Chen, Jingkang Wang, Sivabalan Manivasagam, Wei-Chiu Ma, Anqi Joyce Yang, Raquel Urtasun

Previously recorded driving logs provide a rich resource to build these new scenarios from, but for closed loop evaluation, we need to modify the sensor data based on the new scene configuration and the SDV's decisions, as actors might be added or removed and the trajectories of existing actors and the SDV will differ from the original log.

What Happened 3 Seconds Ago? Inferring the Past with Thermal Imaging

1 code implementation • CVPR 2023 • Zitian Tang, Wenjie Ye, Wei-Chiu Ma, Hang Zhao

Inferring past human motion from RGB images is challenging due to the inherent uncertainty of the prediction problem.

Human-Object Interaction Detection • Pose Estimation

VoxelFormer: Bird's-Eye-View Feature Generation based on Dual-view Attention for Multi-view 3D Object Detection

1 code implementation • 3 Apr 2023 • Zhuoling Li, Chuanrui Zhang, Wei-Chiu Ma, Yipin Zhou, Linyan Huang, Haoqian Wang, SerNam Lim, Hengshuang Zhao

In recent years, transformer-based detectors have demonstrated remarkable performance in 2D visual perception tasks.

3D Object Detection • object-detection

Learning Compact Representations for LiDAR Completion and Generation

no code implementations • CVPR 2023 • Yuwen Xiong, Wei-Chiu Ma, Jingkang Wang, Raquel Urtasun

Virtual Correspondence: Humans as a Cue for Extreme-View Geometry

no code implementations • CVPR 2022 • Wei-Chiu Ma, Anqi Joyce Yang, Shenlong Wang, Raquel Urtasun, Antonio Torralba

Similar to classic correspondences, VCs conform with epipolar geometry; unlike classic correspondences, VCs do not need to be co-visible across views.

3D Reconstruction • Novel View Synthesis • +1

NeurMiPs: Neural Mixture of Planar Experts for View Synthesis

1 code implementation • CVPR 2022 • Zhi-Hao Lin, Wei-Chiu Ma, Hao-Yu Hsu, Yu-Chiang Frank Wang, Shenlong Wang

We present Neural Mixtures of Planar Experts (NeurMiPs), a novel planar-based scene representation for modeling geometry and appearance.

Novel View Synthesis

BARF: Bundle-Adjusting Neural Radiance Fields

4 code implementations • ICCV 2021 • Chen-Hsuan Lin, Wei-Chiu Ma, Antonio Torralba, Simon Lucey

In this paper, we propose Bundle-Adjusting Neural Radiance Fields (BARF) for training NeRF from imperfect (or even unknown) camera poses -- the joint problem of learning neural 3D representations and registering camera frames.

Visual Localization

Deep Feedback Inverse Problem Solver

no code implementations • ECCV 2020 • Wei-Chiu Ma, Shenlong Wang, Jiayuan Gu, Sivabalan Manivasagam, Antonio Torralba, Raquel Urtasun

Specifically, at each iteration, the neural network takes the feedback as input and outputs an update on the current estimation.

Pose Estimation

Mending Neural Implicit Modeling for 3D Vehicle Reconstruction in the Wild

no code implementations • 18 Jan 2021 • Shivam Duggal, ZiHao Wang, Wei-Chiu Ma, Sivabalan Manivasagam, Justin Liang, Shenlong Wang, Raquel Urtasun

Reconstructing high-quality 3D objects from sparse, partial observations from a single view is of crucial importance for various applications in computer vision, robotics, and graphics.

3D Object Reconstruction

Deep Parametric Continuous Convolutional Neural Networks

no code implementations • CVPR 2018 • Shenlong Wang, Simon Suo, Wei-Chiu Ma, Andrei Pokrovsky, Raquel Urtasun

Standard convolutional neural networks assume a grid structured input is available and exploit discrete convolutions as their fundamental building blocks.

Ranked #2 on Semantic Segmentation on S3DIS Area5 (Number of params metric)

Motion Estimation • Point Cloud Segmentation • +1

S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling

no code implementations • CVPR 2021 • Ze Yang, Shenlong Wang, Sivabalan Manivasagam, Zeng Huang, Wei-Chiu Ma, Xinchen Yan, Ersin Yumer, Raquel Urtasun

Constructing and animating humans is an important component for building virtual worlds in a wide variety of applications such as virtual reality or robotics testing in simulation.

VideoClick: Video Object Segmentation with a Single Click

no code implementations • 16 Jan 2021 • Namdar Homayounfar, Justin Liang, Wei-Chiu Ma, Raquel Urtasun

Towards this goal, in this paper we propose a bottom up approach where given a single click for each object in a video, we obtain the segmentation masks of these objects in the full video.

Object Segmentation +4

Hierarchical Recurrent Attention Networks for Structured Online Maps

no code implementations • CVPR 2018 • Namdar Homayounfar, Wei-Chiu Ma, Shrinidhi Kowshika Lakshmikanth, Raquel Urtasun

In this paper, we tackle the problem of online road network extraction from sparse 3D point clouds.

DAGMapper: Learning to Map by Discovering Lane Topology

no code implementations • ICCV 2019 • Namdar Homayounfar, Wei-Chiu Ma, Justin Liang, Xinyu Wu, Jack Fan, Raquel Urtasun

One of the fundamental challenges to scale self-driving is being able to create accurate high definition maps (HD maps) with low cost.

Convolutional Recurrent Network for Road Boundary Extraction

no code implementations • CVPR 2019 • Justin Liang, Namdar Homayounfar, Wei-Chiu Ma, Shenlong Wang, Raquel Urtasun

Creating high definition maps that contain precise information of static elements of the scene is of utmost importance for enabling self driving cars to drive safely.

Self-Driving Cars

Recovering and Simulating Pedestrians in the Wild

no code implementations • 16 Nov 2020 • Ze Yang, Siva Manivasagam, Ming Liang, Bin Yang, Wei-Chiu Ma, Raquel Urtasun

We then incorporate the reconstructed pedestrian assets bank in a realistic LiDAR simulation system by performing motion retargeting, and show that the simulated LiDAR data can be used to significantly reduce the amount of annotated real-world data required for visual perception tasks.

Data Augmentation • motion retargeting

Weakly-supervised 3D Shape Completion in the Wild

no code implementations • ECCV 2020 • Jiayuan Gu, Wei-Chiu Ma, Sivabalan Manivasagam, Wenyuan Zeng, ZiHao Wang, Yuwen Xiong, Hao Su, Raquel Urtasun

3D shape completion for real data is important but challenging, since partial point clouds acquired by real-world sensors are usually sparse, noisy and unaligned.

Point Cloud Registration • Pose Estimation

Conditional Entropy Coding for Efficient Video Compression

no code implementations • ECCV 2020 • Jerry Liu, Shenlong Wang, Wei-Chiu Ma, Meet Shah, Rui Hu, Pranaab Dhawan, Raquel Urtasun

We propose a very simple and efficient video compression framework that only focuses on modeling the conditional entropy between frames.

MS-SSIM • SSIM • +1

LevelSet R-CNN: A Deep Variational Method for Instance Segmentation

no code implementations • 30 Jul 2020 • Namdar Homayounfar, Yuwen Xiong, Justin Liang, Wei-Chiu Ma, Raquel Urtasun

Obtaining precise instance segmentation masks is of high importance in many modern applications such as robotic manipulation and autonomous driving.

Autonomous Driving • Instance Segmentation • +2

LiDARsim: Realistic LiDAR Simulation by Leveraging the Real World

no code implementations • CVPR 2020 • Sivabalan Manivasagam, Shenlong Wang, Kelvin Wong, Wenyuan Zeng, Mikita Sazanovich, Shuhan Tan, Bin Yang, Wei-Chiu Ma, Raquel Urtasun

We first utilize ray casting over the 3D scene and then use a deep neural network to produce deviations from the physics-based simulation, producing realistic LiDAR point clouds.

PolyTransform: Deep Polygon Transformer for Instance Segmentation

no code implementations • CVPR 2020 • Justin Liang, Namdar Homayounfar, Wei-Chiu Ma, Yuwen Xiong, Rui Hu, Raquel Urtasun

In this paper, we propose PolyTransform, a novel instance segmentation algorithm that produces precise, geometry-preserving masks by combining the strengths of prevailing segmentation approaches and modern polygon-based methods.

Instance Segmentation • Segmentation • +1

DeepPruner: Learning Efficient Stereo Matching via Differentiable PatchMatch

1 code implementation • ICCV 2019 • Shivam Duggal, Shenlong Wang, Wei-Chiu Ma, Rui Hu, Raquel Urtasun

Our goal is to significantly speed up the runtime of current state-of-the-art stereo algorithms to enable real-time inference.

Stereo Matching • Stereo Matching Hand

Exploiting Sparse Semantic HD Maps for Self-Driving Vehicle Localization

no code implementations • 8 Aug 2019 • Wei-Chiu Ma, Ignacio Tartavull, Ioan Andrei Bârsan, Shenlong Wang, Min Bai, Gellert Mattyus, Namdar Homayounfar, Shrinidhi Kowshika Lakshmikanth, Andrei Pokrovsky, Raquel Urtasun

In this paper we propose a novel semantic localization algorithm that exploits multiple sensors and has precision on the order of a few centimeters.

Self-Driving Cars

Deep Rigid Instance Scene Flow

no code implementations • CVPR 2019 • Wei-Chiu Ma, Shenlong Wang, Rui Hu, Yuwen Xiong, Raquel Urtasun

In this paper we tackle the problem of scene flow estimation in the context of self-driving.

Rolling Shutter Correction • Scene Flow Estimation

The Sound of Motions

1 code implementation • ICCV 2019 • Hang Zhao, Chuang Gan, Wei-Chiu Ma, Antonio Torralba

Sounds originate from object motions and vibrations of surrounding air.

SurfConv: Bridging 3D and 2D Convolution for RGBD Images

1 code implementation • CVPR 2018 • Hang Chu, Wei-Chiu Ma, Kaustav Kundu, Raquel Urtasun, Sanja Fidler

On the other hand, 3D convolution wastes a large amount of memory on mostly unoccupied 3D space, which consists of only the surface visible to the sensor.

3D Semantic Segmentation

Single Image Intrinsic Decomposition without a Single Intrinsic Image

no code implementations • ECCV 2018 • Wei-Chiu Ma, Hang Chu, Bolei Zhou, Raquel Urtasun, Antonio Torralba

At inference time, our model can be easily reduced to a single stream module that performs intrinsic decomposition on a single input image.

Intrinsic Image Decomposition

Find your Way by Observing the Sun and Other Semantic Cues

no code implementations • 23 Jun 2016 • Wei-Chiu Ma, Shenlong Wang, Marcus A. Brubaker, Sanja Fidler, Raquel Urtasun

In this paper we present a robust, efficient and affordable approach to self-localization which requires neither GPS nor knowledge about the appearance of the world.

Forecasting Interactive Dynamics of Pedestrians with Fictitious Play

no code implementations • CVPR 2017 • Wei-Chiu Ma, De-An Huang, Namhoon Lee, Kris M. Kitani

We develop predictive models of pedestrian dynamics by encoding the coupled nature of multi-pedestrian interaction using game theory, and deep learning-based visual analysis to estimate person-specific behavior parameters.

Decision Making

How Do We Use Our Hands? Discovering a Diverse Set of Common Grasps

no code implementations • CVPR 2015 • De-An Huang, Minghuang Ma, Wei-Chiu Ma, Kris M. Kitani

Furthermore, we develop a hierarchical extension to the DPP clustering algorithm and show that it can be used to discover appearance-based grasp taxonomies.

Clustering • Online Clustering
