Search Results for author: Xinshuo Weng

Found 37 papers, 17 papers with code

Augmenting Lane Perception and Topology Understanding with Standard Definition Navigation Maps

1 code implementation • 7 Nov 2023 • Katie Z Luo, Xinshuo Weng, Yan Wang, Shuang Wu, Jie Li, Kilian Q Weinberger, Yue Wang, Marco Pavone

We propose a novel framework to integrate SD maps into online map prediction and propose a Transformer-based encoder, SD Map Encoder Representations from transFormers, to leverage priors in SD maps for the lane-topology prediction task.

Autonomous Driving Lane Detection

EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision

1 code implementation • 3 Nov 2023 • Jiawei Yang, Boris Ivanovic, Or Litany, Xinshuo Weng, Seung Wook Kim, Boyi Li, Tong Che, Danfei Xu, Sanja Fidler, Marco Pavone, Yue Wang

We present EmerNeRF, a simple yet powerful approach for learning spatial-temporal representations of dynamic driving scenes.

Optical Flow Estimation Semantic Segmentation

484

Language Conditioned Traffic Generation

1 code implementation • 16 Jul 2023 • Shuhan Tan, Boris Ivanovic, Xinshuo Weng, Marco Pavone, Philipp Kraehenbuehl

In this work, we turn to language as a source of supervision for dynamic traffic scene generation.

Language Modelling Large Language Model +1

Robust Trajectory Prediction against Adversarial Attacks

no code implementations • 29 Jul 2022 • Yulong Cao, Danfei Xu, Xinshuo Weng, Zhuoqing Mao, Anima Anandkumar, Chaowei Xiao, Marco Pavone

We demonstrate that our method improves performance by 46% on adversarial data at the cost of only a 3% performance degradation on clean data, compared to the model trained with clean data.

Autonomous Driving Data Augmentation +1

Multiface: A Dataset for Neural Face Rendering

1 code implementation • 22 Jul 2022 • Cheng-hsin Wuu, Ningyuan Zheng, Scott Ardisson, Rohan Bali, Danielle Belko, Eric Brockmeyer, Lucas Evans, Timothy Godisart, Hyowon Ha, Xuhua Huang, Alexander Hypes, Taylor Koska, Steven Krenn, Stephen Lombardi, Xiaomin Luo, Kevyn McPhail, Laura Millerschoen, Michal Perdoch, Mark Pitts, Alexander Richard, Jason Saragih, Junko Saragih, Takaaki Shiratori, Tomas Simon, Matt Stewart, Autumn Trimble, Xinshuo Weng, David Whitewolf, Chenglei Wu, Shoou-I Yu, Yaser Sheikh

Along with the release of the dataset, we conduct ablation studies on the influence of different model architectures on the model's capacity to interpolate novel viewpoints and expressions.

Novel View Synthesis

703

Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking

7 code implementations • CVPR 2023 • Jinkun Cao, Jiangmiao Pang, Xinshuo Weng, Rawal Khirodkar, Kris Kitani

Instead of relying only on the linear state estimate (i.e., the estimation-centric approach), we use object observations (i.e., the measurements by the object detector) to compute a virtual trajectory over the occlusion period to fix the error accumulation of filter parameters during the occlusion period.

Ranked #2 on Multiple Object Tracking on KITTI Tracking test

Multi-Object Tracking Multiple Object Tracking +3

12,048

Whose Track Is It Anyway? Improving Robustness to Tracking Errors With Affinity-Based Trajectory Prediction

no code implementations • CVPR 2022 • Xinshuo Weng, Boris Ivanovic, Kris Kitani, Marco Pavone

This is typically caused by the propagation of errors from tracking to prediction, such as noisy tracks, fragments, and identity switches.

Autonomous Driving Decision Making +2

MTP: Multi-Hypothesis Tracking and Prediction for Reduced Error Propagation

1 code implementation • 18 Oct 2021 • Xinshuo Weng, Boris Ivanovic, Marco Pavone

Recently, there has been tremendous progress in developing each individual module of the standard perception-planning robot autonomy pipeline, including detection, tracking, prediction of other agents' trajectories, and ego-agent trajectory planning.

Trajectory Planning

Multi-Echo LiDAR for 3D Object Detection

no code implementations • ICCV 2021 • Yunze Man, Xinshuo Weng, Prasanna Kumar Sivakuma, Matthew O'Toole, Kris Kitani

LiDAR sensors can be used to obtain a wide range of measurement signals other than a simple 3D point cloud, and those signals can be leveraged to improve perception tasks like 3D object detection.

3D Object Detection Object +1

Multi-Modality Task Cascade for 3D Object Detection

1 code implementation • 8 Jul 2021 • Jinhyung Park, Xinshuo Weng, Yunze Man, Kris Kitani

To provide a more integrated approach, we propose a novel Multi-Modality Task Cascade network (MTC-RCNN) that leverages 3D box proposals to improve 2D segmentation predictions, which are then used to further refine the 3D boxes.

3D Object Detection Object +3

Wide-Baseline Multi-Camera Calibration using Person Re-Identification

no code implementations • CVPR 2021 • Yan Xu, Yu-Jhe Li, Xinshuo Weng, Kris Kitani

We address the problem of estimating the 3D pose of a network of cameras for large-environment wide-baseline scenarios, e.g., cameras for construction sites, sports stadiums, and public spaces.

Camera Calibration Person Re-Identification

AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent Forecasting

2 code implementations • ICCV 2021 • Ye Yuan, Xinshuo Weng, Yanglan Ou, Kris Kitani

Instead, we would prefer a method that allows an agent's state at one time to directly affect another agent's state at a future time.

Ranked #10 on Trajectory Prediction on ETH/UCY

Autonomous Driving Pedestrian Trajectory Prediction +1

243

Supervision by Registration and Triangulation for Landmark Detection

1 code implementation • 25 Jan 2021 • Xuanyi Dong, Yi Yang, Shih-En Wei, Xinshuo Weng, Yaser Sheikh, Shoou-I Yu

End-to-end training is made possible by differentiable registration and 3D triangulation modules.

Optical Flow Estimation

917

Visio-Temporal Attention for Multi-Camera Multi-Target Association

no code implementations • ICCV 2021 • Yu-Jhe Li, Xinshuo Weng, Yan Xu, Kris M. Kitani

We propose an inter-tracklet (person to person) attention mechanism that learns a representation for a target tracklet while taking into account other tracklets across multiple views.

AutoSelect: Automatic and Dynamic Detection Selection for 3D Multi-Object Tracking

no code implementations • 10 Dec 2020 • Xinshuo Weng, Kris Kitani

Also, this threshold is sensitive to many factors, such as the target object category, so the threshold must be searched again if these factors change.

3D Multi-Object Tracking Object

End-to-End 3D Multi-Object Tracking and Trajectory Forecasting

no code implementations • 25 Aug 2020 • Xinshuo Weng, Ye Yuan, Kris Kitani

To evaluate this hypothesis, we propose a unified solution for 3D MOT and trajectory forecasting which also incorporates two additional novel computational units.

3D Multi-Object Tracking Trajectory Forecasting

Graph Neural Networks for 3D Multi-Object Tracking

no code implementations • 20 Aug 2020 • Xinshuo Weng, Yongxin Wang, Yunze Man, Kris Kitani

3D Multi-object tracking (MOT) is crucial to autonomous systems.

3D Multi-Object Tracking Object

AB3DMOT: A Baseline for 3D Multi-Object Tracking and New Evaluation Metrics

no code implementations • 18 Aug 2020 • Xinshuo Weng, Jianren Wang, David Held, Kris Kitani

Additionally, 3D MOT datasets such as KITTI evaluate MOT methods in 2D space and standardized 3D MOT evaluation tools are missing for a fair comparison of 3D MOT methods.

3D Multi-Object Tracking Autonomous Driving

Joint Object Detection and Multi-Object Tracking with Graph Neural Networks

1 code implementation • 23 Jun 2020 • Yongxin Wang, Kris Kitani, Xinshuo Weng

Although the two components depend on each other, prior works often design detection and data association modules separately and train them with separate objectives.

Ranked #1 on Multi-Object Tracking on 2D MOT 2015

Multi-Object Tracking Object +2

459

When We First Met: Visual-Inertial Person Localization for Co-Robot Rendezvous

no code implementations • 17 Jun 2020 • Xi Sun, Xinshuo Weng, Kris Kitani

We propose a method to learn a visual-inertial feature space in which the motion of a person in video can be easily matched to the motion measured by a wearable inertial measurement unit (IMU).

GNN3DMOT: Graph Neural Network for 3D Multi-Object Tracking with Multi-Feature Learning

1 code implementation • 12 Jun 2020 • Xinshuo Weng, Yongxin Wang, Yunze Man, Kris Kitani

As a result, the feature of one object is informed by the features of other objects, so that the object feature can lean towards objects with similar features (i.e., objects that probably share the same ID) and deviate from objects with dissimilar features (i.e., objects that probably have different IDs), leading to a more discriminative feature for each object; (2) instead of obtaining the feature from either 2D or 3D space as in prior work, we propose a novel joint feature extractor to learn appearance and motion features from 2D and 3D space simultaneously.

3D Multi-Object Tracking Object

GNN3DMOT: Graph Neural Network for 3D Multi-Object Tracking With 2D-3D Multi-Feature Learning

1 code implementation • CVPR 2020 • Xinshuo Weng, Yongxin Wang, Yunze Man, Kris M. Kitani

3D Multi-Object Tracking Object

Inverting the Pose Forecasting Pipeline with SPF2: Sequential Pointcloud Forecasting for Sequential Pose Forecasting

no code implementations • 18 Mar 2020 • Xinshuo Weng, Jianren Wang, Sergey Levine, Kris Kitani, Nicholas Rhinehart

Through experiments on a robotic manipulation dataset and two driving datasets, we show that SPFNet is effective for the SPF task, our forecast-then-detect pipeline outperforms the detect-then-forecast approaches to which we compared, and that pose forecasting performance improves with the addition of unlabeled data.

Decision Making Future prediction +1

PTP: Parallelized Tracking and Prediction with Graph Neural Networks and Diversity Sampling

no code implementations • 17 Mar 2020 • Xinshuo Weng, Ye Yuan, Kris Kitani

We evaluate on KITTI and nuScenes datasets showing that our method with socially-aware feature learning and diversity sampling achieves new state-of-the-art performance on 3D MOT and trajectory prediction.

3D Multi-Object Tracking Trajectory Forecasting

Learning Shape Representations for Clothing Variations in Person Re-Identification

no code implementations • 16 Mar 2020 • Yu-Jhe Li, Zhengyi Luo, Xinshuo Weng, Kris M. Kitani

To tackle the re-ID problem in the context of clothing changes, we propose a novel representation learning model which is able to generate a body shape feature representation without being affected by clothing color or patterns.

Disentanglement Person Re-Identification

3D Multi-Object Tracking: A Baseline and New Evaluation Metrics

1 code implementation • 9 Jul 2019 • Xinshuo Weng, Jianren Wang, David Held, Kris Kitani

Additionally, 3D MOT datasets such as KITTI evaluate MOT methods in the 2D space and standardized 3D MOT evaluation tools are missing for a fair comparison of 3D MOT methods.

Ranked #3 on 3D Multi-Object Tracking on KITTI

3D Multi-Object Tracking Autonomous Driving +1

1,623

Learning Spatio-Temporal Features with Two-Stream Deep 3D CNNs for Lipreading

no code implementations • 4 May 2019 • Xinshuo Weng, Kris Kitani

We evaluate different combinations of front-end and back-end modules with the grayscale video and optical flow inputs on the LRW dataset.

General Classification Lipreading +1

Monocular 3D Object Detection with Pseudo-LiDAR Point Cloud

1 code implementation • 23 Mar 2019 • Xinshuo Weng, Kris Kitani

Following the pipeline of two-stage 3D detection algorithms, we detect 2D object proposals in the input image and extract a point cloud frustum from the pseudo-LiDAR for each proposal.

Monocular 3D Object Detection Monocular Depth Estimation +3

129

On the Importance of Video Action Recognition for Visual Lipreading

no code implementations • 22 Mar 2019 • Xinshuo Weng

We focus on word-level visual lipreading, which requires decoding the word from the speaker's video.

Action Recognition Lipreading +2

Future Near-Collision Prediction from Monocular Video: Feasibility, Dataset, and Challenges

1 code implementation • 21 Mar 2019 • Aashi Manglik, Xinshuo Weng, Eshed Ohn-Bar, Kris M. Kitani

Our results show that our proposed multi-stream CNN is the best model for predicting time to near-collision.

Robotics

Deep Reinforcement Learning for Autonomous Driving

1 code implementation • 28 Nov 2018 • Sen Wang, Daoyuan Jia, Xinshuo Weng

To deal with these challenges, we first adopt the deep deterministic policy gradient (DDPG) algorithm, which has the capacity to handle complex state and action spaces in a continuous domain.

Autonomous Driving reinforcement-learning +1

CyLKs: Unsupervised Cycle Lucas-Kanade Network for Landmark Tracking

no code implementations • 28 Nov 2018 • Xinshuo Weng, Wentao Han

Across a majority of modern learning-based tracking systems, expensive annotations are needed to achieve state-of-the-art performance.

Landmark Tracking

Image Labeling with Markov Random Fields and Conditional Random Fields

no code implementations • 28 Nov 2018 • Shangxuan Wu, Xinshuo Weng

Most existing methods for object segmentation in computer vision are formulated as a labeling task.

Segmentation Semantic Segmentation

GroundNet: Monocular Ground Plane Normal Estimation with Geometric Consistency

no code implementations • 17 Nov 2018 • Yunze Man, Xinshuo Weng, Xi Li, Kris Kitani

We focus on estimating the 3D orientation of the ground plane from a single image.

Line Detection Segmentation +1

Supervision-by-Registration: An Unsupervised Approach to Improve the Precision of Facial Landmark Detectors

1 code implementation • CVPR 2018 • Xuanyi Dong, Shoou-I Yu, Xinshuo Weng, Shih-En Wei, Yi Yang, Yaser Sheikh

In this paper, we present supervision-by-registration, an unsupervised approach to improve the precision of facial landmark detectors on both images and video.

Ranked #1 on Facial Landmark Detection on 300-VW (C)

Facial Landmark Detection Optical Flow Estimation

757

Rotational Rectification Network: Enabling Pedestrian Detection for Mobile Vision

no code implementations • 19 Jun 2017 • Xinshuo Weng, Shangxuan Wu, Fares Beainy, Kris Kitani

To address this issue, we propose a Rotational Rectification Network (R2N) that can be inserted into any CNN-based pedestrian (or object) detector to adapt it to significant changes in camera rotation.

Pedestrian Detection valid

Visual Compiler: Synthesizing a Scene-Specific Pedestrian Detector and Pose Estimator

no code implementations • 15 Dec 2016 • Namhoon Lee, Xinshuo Weng, Vishnu Naresh Boddeti, Yu Zhang, Fares Beainy, Kris Kitani, Takeo Kanade

We introduce the concept of a Visual Compiler that generates a scene-specific pedestrian detector and pose estimator without any pedestrian observations.

Human Detection Pose Estimation
