no code implementations • CVPR 2022 • Fanyi Xiao, Kaustav Kundu, Joseph Tighe, Davide Modolo
Most self-supervised video representation learning approaches focus on action recognition.
no code implementations • CVPR 2022 • A S M Iftekhar, Hao Chen, Kaustav Kundu, Xinyu Li, Joseph Tighe, Davide Modolo
We propose a novel one-stage Transformer-based semantic and spatial refined transformer (SSRT) to solve the Human-Object Interaction detection task, which requires localizing humans and objects and predicting their interactions.
no code implementations • CVPR 2022 • Bing Shuai, Xinyu Li, Kaustav Kundu, Joseph Tighe
In this work, we explore training such a model using only person box annotations, thus removing the need to manually label a training dataset with additional person identity annotations, which are expensive to collect.
1 code implementation • CVPR 2022 • Jiaojiao Zhao, Yanyi Zhang, Xinyu Li, Hao Chen, Shuai Bing, Mingze Xu, Chunhui Liu, Kaustav Kundu, Yuanjun Xiong, Davide Modolo, Ivan Marsic, Cees G. M. Snoek, Joseph Tighe
We propose TubeR: a simple solution for spatio-temporal video action detection.
no code implementations • NeurIPS 2020 • Kaustav Kundu, Joseph Tighe
Ignoring these un-annotated labels results in a loss of supervisory signal, which reduces the performance of the classification models.
no code implementations • CVPR 2021 • Sijie Yan, Yuanjun Xiong, Kaustav Kundu, Shuo Yang, Siqi Deng, Meng Wang, Wei Xia, Stefano Soatto
Reducing inconsistencies in the behavior of different versions of an AI system can be as important in practice as reducing its overall error.
1 code implementation • CVPR 2018 • Hang Chu, Wei-Chiu Ma, Kaustav Kundu, Raquel Urtasun, Sanja Fidler
On the other hand, 3D convolution wastes a large amount of memory on mostly unoccupied 3D space, which consists of only the surface visible to the sensor.
no code implementations • 13 Oct 2018 • Enric Corona, Kaustav Kundu, Sanja Fidler
In particular, our aim is to infer poses for objects not seen at training time, but whose 3D CAD models are available at test time.
2 code implementations • CVPR 2017 • Lluis Castrejon, Kaustav Kundu, Raquel Urtasun, Sanja Fidler
We show that our approach speeds up the annotation process by a factor of 4.7 across all classes in Cityscapes, while achieving 78.4% agreement in IoU with original ground-truth, matching the typical agreement between human annotators.
no code implementations • 27 Aug 2016 • Xiaozhi Chen, Kaustav Kundu, Yukun Zhu, Huimin Ma, Sanja Fidler, Raquel Urtasun
We then exploit a CNN on top of these proposals to perform object detection.
no code implementations • CVPR 2016 • Xiaozhi Chen, Kaustav Kundu, Ziyu Zhang, Huimin Ma, Sanja Fidler, Raquel Urtasun
The focus of this paper is on proposal generation.
Ranked #8 on Vehicle Pose Estimation on KITTI Cars Hard
no code implementations • 6 Apr 2016 • Min Bai, Wenjie Luo, Kaustav Kundu, Raquel Urtasun
We tackle the problem of estimating optical flow from a monocular camera in the context of autonomous driving.
no code implementations • NeurIPS 2015 • Xiaozhi Chen, Kaustav Kundu, Yukun Zhu, Andrew G. Berneshawi, Huimin Ma, Sanja Fidler, Raquel Urtasun
The goal of this paper is to generate high-quality 3D object proposals in the context of autonomous driving.
Ranked #10 on Vehicle Pose Estimation on KITTI Cars Hard
no code implementations • CVPR 2015 • Chenxi Liu, Alexander G. Schwing, Kaustav Kundu, Raquel Urtasun, Sanja Fidler
What sets us apart from past work in layout estimation is the use of floor plans as a source of prior knowledge, as well as localization of each image within a bigger space (apartment).