Search Results for author: Kris Kitani

Found 56 papers, 24 papers with code

Embodied Scene-aware Human Pose Estimation

no code implementations 18 Jun 2022 Zhengyi Luo, Shun Iwase, Ye Yuan, Kris Kitani

We propose embodied scene-aware human pose estimation where we estimate 3D poses based on a simulated agent's proprioception and scene awareness, along with external third-person observations.

Ranked #72 on 3D Human Pose Estimation on Human3.6M (PA-MPJPE metric)

3D Human Pose Estimation Causal Inference

Online No-regret Model-Based Meta RL for Personalized Navigation

no code implementations 5 Apr 2022 Yuda Song, Ye Yuan, Wen Sun, Kris Kitani

Our theoretical analysis shows that our method is a no-regret algorithm and we provide the convergence rate in the agnostic setting.

Model-based Reinforcement Learning

Occluded Human Mesh Recovery

no code implementations CVPR 2022 Rawal Khirodkar, Shashank Tripathi, Kris Kitani

Along with the input image, we condition the top-down model on spatial context from the image in the form of body-center heatmaps.

Ranked #34 on 3D Human Pose Estimation on 3DPW (using extra training data)

3D Human Pose Estimation Human Mesh Recovery

Modality-Agnostic Learning for Radar-Lidar Fusion in Vehicle Detection

no code implementations CVPR 2022 Yu-Jhe Li, Jinhyung Park, Matthew O'Toole, Kris Kitani

To mitigate this problem, we propose the Self-Training Multimodal Vehicle Detection Network (ST-MVDNet) which leverages a Teacher-Student mutual learning framework and a simulated sensor noise model used in strong data augmentation for Lidar and Radar.

Autonomous Vehicles Data Augmentation

GLAMR: Global Occlusion-Aware Human Mesh Recovery with Dynamic Cameras

1 code implementation CVPR 2022 Ye Yuan, Umar Iqbal, Pavlo Molchanov, Kris Kitani, Jan Kautz

Since the joint reconstruction of human motions and camera poses is underconstrained, we propose a global trajectory predictor that generates global human trajectories based on local body movements.

3D Human Pose Estimation Human Mesh Recovery

DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion

3 code implementations CVPR 2022 Peize Sun, Jinkun Cao, Yi Jiang, Zehuan Yuan, Song Bai, Kris Kitani, Ping Luo

A typical pipeline for multi-object tracking (MOT) is to use a detector for object localization, and following re-identification (re-ID) for object association.

Multi-Object Tracking object-detection +2

Cross-Domain Adaptive Teacher for Object Detection

1 code implementation CVPR 2022 Yu-Jhe Li, Xiaoliang Dai, Chih-Yao Ma, Yen-Cheng Liu, Kan Chen, Bichen Wu, Zijian He, Kris Kitani, Peter Vajda

To mitigate this problem, we propose a teacher-student framework named Adaptive Teacher (AT) which leverages domain adversarial learning and weak-strong data augmentation to address the domain gap.

Data Augmentation Domain Adaptation +2

AEI: Actors-Environment Interaction with Adaptive Attention for Temporal Action Proposals Generation

1 code implementation 21 Oct 2021 Khoa Vo, Hyekang Joo, Kashu Yamazaki, Sang Truong, Kris Kitani, Minh-Triet Tran, Ngan Le

In this paper, we make an attempt to simulate that ability of a human by proposing Actor Environment Interaction (AEI) network to improve the video representation for temporal action proposals generation.

Action Detection Temporal Action Proposal Generation

Ego4D: Around the World in 3,000 Hours of Egocentric Video

1 code implementation CVPR 2022 Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei HUANG, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite.


Transform2Act: Learning a Transform-and-Control Policy for Efficient Agent Design

1 code implementation ICLR 2022 Ye Yuan, Yuda Song, Zhengyi Luo, Wen Sun, Kris Kitani

Specifically, we learn a conditional policy that, in an episode, first applies a sequence of transform actions to modify an agent's skeletal structure and joint attributes, and then applies control actions under the new design.

Decision Making Policy Gradient Methods

Multi-Echo LiDAR for 3D Object Detection

no code implementations ICCV 2021 Yunze Man, Xinshuo Weng, Prasanna Kumar Sivakumar, Matthew O'Toole, Kris Kitani

LiDAR sensors can be used to obtain a wide range of measurement signals other than a simple 3D point cloud, and those signals can be leveraged to improve perception tasks like 3D object detection.

3D Object Detection object-detection

Multi-Modality Task Cascade for 3D Object Detection

1 code implementation 8 Jul 2021 Jinhyung Park, Xinshuo Weng, Yunze Man, Kris Kitani

To provide a more integrated approach, we propose a novel Multi-Modality Task Cascade network (MTC-RCNN) that leverages 3D box proposals to improve 2D segmentation predictions, which are then used to further refine the 3D boxes.

3D Object Detection object-detection

Dynamics-Regulated Kinematic Policy for Egocentric Pose Estimation

1 code implementation NeurIPS 2021 Zhengyi Luo, Ryo Hachiuma, Ye Yuan, Kris Kitani

By comparing the pose instructed by the kinematic model against the pose generated by the dynamics model, we can use their misalignment to further improve the kinematic model.

Egocentric Pose Estimation Human-Object Interaction Detection +1

Wide-Baseline Multi-Camera Calibration using Person Re-Identification

no code implementations CVPR 2021 Yan Xu, Yu-Jhe Li, Xinshuo Weng, Kris Kitani

We address the problem of estimating the 3D pose of a network of cameras for large-environment wide-baseline scenarios, e.g., cameras for construction sites, sports stadiums, and public spaces.

Camera Calibration Person Re-Identification

SimPoE: Simulated Character Control for 3D Human Pose Estimation

no code implementations CVPR 2021 Ye Yuan, Shih-En Wei, Tomas Simon, Kris Kitani, Jason Saragih

Based on this refined kinematic pose, the policy learns to compute dynamics-based control (e.g., joint torques) of the character to advance the current-frame pose estimate to the pose estimate of the next frame.

3D Human Pose Estimation

Efficient Model Performance Estimation via Feature Histories

no code implementations 7 Mar 2021 Shengcao Cao, Xiaofang Wang, Kris Kitani

Using a sampling-based search algorithm and parallel computing, our method can find an architecture which is better than DARTS and with an 80% reduction in wall-clock search time.

Image Classification Neural Architecture Search

Inverse Reinforcement Learning with Explicit Policy Estimates

no code implementations 4 Mar 2021 Navyata Sanghvi, Shinnosuke Usami, Mohit Sharma, Joachim Groeger, Kris Kitani

Various methods for solving the inverse reinforcement learning (IRL) problem have been developed independently in machine learning and economics.


DeepBLE: Generalizing RSSI-based Localization Across Different Devices

no code implementations 27 Feb 2021 Harsh Agarwal, Navyata Sanghvi, Vivek Roy, Kris Kitani

Accurate smartphone localization (< 1-meter error) for indoor navigation using only RSSI received from a set of BLE beacons remains a challenging problem, due to the inherent noise of RSSI measurements.

IDOL: Inertial Deep Orientation-Estimation and Localization

1 code implementation 8 Feb 2021 Scott Sun, Dennis Melamed, Kris Kitani

Many smartphone applications use inertial measurement units (IMUs) to sense movement, but the use of these sensors for pedestrian localization can be challenging due to their noise characteristics.

AutoSelect: Automatic and Dynamic Detection Selection for 3D Multi-Object Tracking

no code implementations 10 Dec 2020 Xinshuo Weng, Kris Kitani

Also, this threshold is sensitive to many factors such as target object category so we need to re-search the threshold if these factors change.

3D Multi-Object Tracking

Rethinking Transformer-based Set Prediction for Object Detection

1 code implementation ICCV 2021 Zhiqing Sun, Shengcao Cao, Yiming Yang, Kris Kitani

DETR is a recently proposed Transformer-based method which views object detection as a set prediction problem and achieves state-of-the-art performance but demands extra-long training time to converge.

object-detection Object Detection

Audio-Visual Self-Supervised Terrain Type Discovery for Mobile Platforms

no code implementations 13 Oct 2020 Akiyoshi Kurobe, Yoshikatsu Nakajima, Hideo Saito, Kris Kitani

The ability to both recognize and discover terrain characteristics is an important function required for many autonomous ground robots such as social robots, assistive robots, autonomous vehicles, and ground exploration robots.

Autonomous Vehicles Self-Supervised Learning

End-to-End 3D Multi-Object Tracking and Trajectory Forecasting

no code implementations 25 Aug 2020 Xinshuo Weng, Ye Yuan, Kris Kitani

To evaluate this hypothesis, we propose a unified solution for 3D MOT and trajectory forecasting which also incorporates two additional novel computational units.

3D Multi-Object Tracking Trajectory Forecasting

Few-Shot Learning with Intra-Class Knowledge Transfer

no code implementations 22 Aug 2020 Vivek Roy, Yan Xu, Yu-Xiong Wang, Kris Kitani, Ruslan Salakhutdinov, Martial Hebert

Recent works have proposed to solve this task by augmenting the training data of the few-shot classes using generative models with the few-shot training samples as the seeds.

Few-Shot Learning Transfer Learning

AB3DMOT: A Baseline for 3D Multi-Object Tracking and New Evaluation Metrics

no code implementations 18 Aug 2020 Xinshuo Weng, Jianren Wang, David Held, Kris Kitani

Additionally, 3D MOT datasets such as KITTI evaluate MOT methods in 2D space and standardized 3D MOT evaluation tools are missing for a fair comparison of 3D MOT methods.

3D Multi-Object Tracking Autonomous Driving

Efficient Non-Line-of-Sight Imaging from Transient Sinograms

no code implementations ECCV 2020 Mariko Isogawa, Dorian Chan, Ye Yuan, Kris Kitani, Matthew O'Toole

Non-line-of-sight (NLOS) imaging techniques use light that diffusely reflects off of visible surfaces (e.g., walls) to see around corners.

Joint Object Detection and Multi-Object Tracking with Graph Neural Networks

1 code implementation 23 Jun 2020 Yongxin Wang, Kris Kitani, Xinshuo Weng

Despite the fact that the two components are dependent on each other, prior works often design detection and data association modules separately which are trained with separate objectives.

Multi-Object Tracking object-detection +1

When We First Met: Visual-Inertial Person Localization for Co-Robot Rendezvous

no code implementations 17 Jun 2020 Xi Sun, Xinshuo Weng, Kris Kitani

We propose a method to learn a visual-inertial feature space in which the motion of a person in video can be easily matched to the motion measured by a wearable inertial measurement unit (IMU).

Residual Force Control for Agile Human Behavior Imitation and Extended Motion Synthesis

1 code implementation NeurIPS 2020 Ye Yuan, Kris Kitani

Our approach is the first humanoid control method that successfully learns from a large-scale human motion dataset (Human3.6M) and generates diverse long-term motions.

motion synthesis

GNN3DMOT: Graph Neural Network for 3D Multi-Object Tracking with Multi-Feature Learning

1 code implementation 12 Jun 2020 Xinshuo Weng, Yongxin Wang, Yunze Man, Kris Kitani

As a result, the feature of one object is informed of the features of other objects so that the object feature can lean towards the object with similar feature (i.e., object probably with a same ID) and deviate from objects with dissimilar features (i.e., object probably with different IDs), leading to a more discriminative feature for each object; (2) instead of obtaining the feature from either 2D or 3D space in prior work, we propose a novel joint feature extractor to learn appearance and motion features from 2D and 3D space simultaneously.

3D Multi-Object Tracking

No-Reference Image Quality Assessment via Feature Fusion and Multi-Task Learning

no code implementations 6 Jun 2020 S. Alireza Golestaneh, Kris Kitani

In our experiments, we demonstrate that by utilizing multi-task learning and our proposed feature fusion method, our model yields better performance for the NR-IQA task.

Multi-Task Learning No-Reference Image Quality Assessment

Optical Non-Line-of-Sight Physics-based 3D Human Pose Estimation

1 code implementation CVPR 2020 Mariko Isogawa, Ye Yuan, Matthew O'Toole, Kris Kitani

We bring together a diverse set of technologies from NLOS imaging, human pose estimation and deep reinforcement learning to construct an end-to-end data processing pipeline that converts a raw stream of photon measurements into a full 3D human pose sequence estimate.

3D Human Pose Estimation

Inverting the Pose Forecasting Pipeline with SPF2: Sequential Pointcloud Forecasting for Sequential Pose Forecasting

no code implementations 18 Mar 2020 Xinshuo Weng, Jianren Wang, Sergey Levine, Kris Kitani, Nicholas Rhinehart

Through experiments on a robotic manipulation dataset and two driving datasets, we show that SPFNet is effective for the SPF task, our forecast-then-detect pipeline outperforms the detect-then-forecast approaches to which we compared, and that pose forecasting performance improves with the addition of unlabeled data.

Decision Making Future prediction +1

DLow: Diversifying Latent Flows for Diverse Human Motion Prediction

1 code implementation ECCV 2020 Ye Yuan, Kris Kitani

To obtain samples from a pretrained generative model, most existing generative human motion prediction methods draw a set of independent Gaussian latent codes and convert them to motion samples.

Human motion prediction Human Pose Forecasting +1

PTP: Parallelized Tracking and Prediction with Graph Neural Networks and Diversity Sampling

no code implementations 17 Mar 2020 Xinshuo Weng, Ye Yuan, Kris Kitani

We evaluate on KITTI and nuScenes datasets showing that our method with socially-aware feature learning and diversity sampling achieves new state-of-the-art performance on 3D MOT and trajectory prediction.

3D Multi-Object Tracking Trajectory Forecasting

Estimating 3D Camera Pose from 2D Pedestrian Trajectories

no code implementations 12 Dec 2019 Yan Xu, Vivek Roy, Kris Kitani

We propose an alternative strategy for extracting 3D information to solve for camera pose by using pedestrian trajectories.

Pose Estimation

Incremental Class Discovery for Semantic Segmentation with RGBD Sensing

no code implementations ICCV 2019 Yoshikatsu Nakajima, Byeongkeun Kang, Hideo Saito, Kris Kitani

This work addresses the task of open world semantic segmentation using RGBD sensing to discover new semantic classes over time.

Semantic Segmentation

Diverse Trajectory Forecasting with Determinantal Point Processes

no code implementations ICLR 2020 Ye Yuan, Kris Kitani

To learn the parameters of the DSF, the diversity of the trajectory samples is evaluated by a diversity loss based on a determinantal point process (DPP).

Autonomous Vehicles Point Processes +1

3D Multi-Object Tracking: A Baseline and New Evaluation Metrics

1 code implementation 9 Jul 2019 Xinshuo Weng, Jianren Wang, David Held, Kris Kitani

Additionally, 3D MOT datasets such as KITTI evaluate MOT methods in the 2D space and standardized 3D MOT evaluation tools are missing for a fair comparison of 3D MOT methods.

3D Multi-Object Tracking Autonomous Driving +1

Ego-Pose Estimation and Forecasting as Real-Time PD Control

1 code implementation ICCV 2019 Ye Yuan, Kris Kitani

We propose the use of a proportional-derivative (PD) control based policy learned via reinforcement learning (RL) to estimate and forecast 3D human pose from egocentric videos.

Egocentric Pose Estimation Human Pose Forecasting

Doctor of Crosswise: Reducing Over-parametrization in Neural Networks

1 code implementation 24 May 2019 J. D. Curtó, I. C. Zarza, Kris Kitani, Irwin King, Michael R. Lyu

Dr. of Crosswise proposes a new architecture to reduce over-parametrization in Neural Networks.

Learning Spatio-Temporal Features with Two-Stream Deep 3D CNNs for Lipreading

no code implementations 4 May 2019 Xinshuo Weng, Kris Kitani

We evaluate different combinations of front-end and back-end modules with the grayscale video and optical flow inputs on the LRW dataset.

General Classification Lipreading +1

PRECOG: PREdiction Conditioned On Goals in Visual Multi-Agent Settings

2 code implementations ICCV 2019 Nicholas Rhinehart, Rowan Mcallister, Kris Kitani, Sergey Levine

For autonomous vehicles (AVs) to behave appropriately on roads populated by human-driven vehicles, they must be able to reason about the uncertain intentions and decisions of other drivers from rich perceptual information.

Autonomous Vehicles

Monocular 3D Object Detection with Pseudo-LiDAR Point Cloud

1 code implementation 23 Mar 2019 Xinshuo Weng, Kris Kitani

Following the pipeline of two-stage 3D detection algorithms, we detect 2D object proposals in the input image and extract a point cloud frustum from the pseudo-LiDAR for each proposal.

Monocular 3D Object Detection Monocular Depth Estimation +2

MGpi: A Computational Model of Multiagent Group Perception and Interaction

1 code implementation 4 Mar 2019 Navyata Sanghvi, Ryo Yonetani, Kris Kitani

Toward enabling next-generation robots capable of socially intelligent interaction with humans, we present a computational model of interactions in a social environment of multiple agents and multiple groups.

Imitation Learning

3D Ego-Pose Estimation via Imitation Learning

no code implementations ECCV 2018 Ye Yuan, Kris Kitani

Motivated by this, we propose a novel control-based approach to model human motion with physics simulation and use imitation learning to learn a video-conditioned control policy for ego-pose estimation.

Domain Adaptation Imitation Learning +1

Understanding hand-object manipulation by modeling the contextual relationship between actions, grasp types and object attributes

no code implementations 22 Jul 2018 Minjie Cai, Kris Kitani, Yoichi Sato

In the proposed model, we explore various semantic relationships between actions, grasp types and object attributes, and show how the context can be used to boost the recognition of each component.

Personalized Dynamics Models for Adaptive Assistive Navigation Systems

no code implementations 11 Apr 2018 Eshed Ohn-Bar, Kris Kitani, Chieko Asakawa

Consider an assistive system that guides visually impaired users through speech and haptic feedback to their destination.

Model-based Reinforcement Learning Transfer Learning

Rotational Rectification Network: Enabling Pedestrian Detection for Mobile Vision

no code implementations 19 Jun 2017 Xinshuo Weng, Shangxuan Wu, Fares Beainy, Kris Kitani

To address this issue, we propose a Rotational Rectification Network (R2N) that can be inserted into any CNN-based pedestrian (or object) detector to adapt it to significant changes in camera rotation.

Pedestrian Detection

Visual Compiler: Synthesizing a Scene-Specific Pedestrian Detector and Pose Estimator

no code implementations 15 Dec 2016 Namhoon Lee, Xinshuo Weng, Vishnu Naresh Boddeti, Yu Zhang, Fares Beainy, Kris Kitani, Takeo Kanade

We introduce the concept of a Visual Compiler that generates a scene specific pedestrian detector and pose estimator without any pedestrian observations.

Human Detection Pose Estimation
