1 code implementation • 25 Feb 2023 • Tarasha Khurana, Peiyun Hu, David Held, Deva Ramanan
One promising self-supervised task is 3D point cloud forecasting from unannotated LiDAR sequences.
2 code implementations • 16 Feb 2023 • Kangle Deng, Gengshan Yang, Deva Ramanan, Jun-Yan Zhu
We propose pix2pix3D, a 3D-aware conditional generative model for controllable photorealistic image synthesis.
1 code implementation • 16 Jan 2023 • Zhiqiu Lin, Samuel Yu, Zhiyi Kuang, Deepak Pathak, Deva Ramanan
By repurposing class names as additional one-shot training samples, we achieve SOTA results with an embarrassingly simple linear classifier for vision-language adaptation.
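The recipe above can be pictured with a minimal sketch: pool precomputed image embeddings and class-name (text) embeddings into one training set and fit an ordinary linear classifier. All names, shapes, and hyperparameters below are illustrative placeholders assuming CLIP-style encoders have already produced the features; this is a sketch of the idea, not the authors' released code.

```python
# Hedged sketch: class-name (text) embeddings act as extra one-shot training
# samples for a linear classifier over a shared vision-language embedding space.
# Feature extraction (e.g., CLIP encoders) is assumed to have happened already.
import torch
import torch.nn.functional as F

num_classes, dim = 10, 512
image_feats = torch.randn(160, dim)            # few-shot image embeddings (placeholder)
image_labels = torch.randint(0, num_classes, (160,))
text_feats = torch.randn(num_classes, dim)     # one embedding per class name (placeholder)
text_labels = torch.arange(num_classes)

# Pool image and text samples into a single training set.
x = F.normalize(torch.cat([image_feats, text_feats]), dim=-1)
y = torch.cat([image_labels, text_labels])

clf = torch.nn.Linear(dim, num_classes)
opt = torch.optim.AdamW(clf.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = F.cross_entropy(clf(x), y)
    loss.backward()
    opt.step()
```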
no code implementations • 10 Jan 2023 • Xindi Wu, KwunFung Lau, Francesco Ferroni, Aljoša Ošep, Deva Ramanan
Moreover, we show that our retrieved maps can be used to update or expand existing maps and even show proof-of-concept results for visual localization and image retrieval from spatial graphs.
no code implementations • 6 Jan 2023 • Ali Athar, Alexander Hermans, Jonathon Luiten, Deva Ramanan, Bastian Leibe
A single TarViS model can be trained jointly on a collection of datasets spanning different tasks, and can hot-swap between tasks during inference without any task-specific retraining.
2 code implementations • Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 2021 • Benjamin Wilson, William Qi, Tanmay Agarwal, John Lambert, Jagjeet Singh, Siddhesh Khandelwal, Bowen Pan, Ratnesh Kumar, Andrew Hartnett, Jhony Kaesemodel Pontes, Deva Ramanan, Peter Carr, James Hays
Models are tasked with the prediction of future motion for "scored actors" in each scenario and are provided with track histories that capture object location, heading, velocity, and category.
no code implementations • 25 Nov 2022 • Shubham Gupta, Jeet Kanjani, Mengtian Li, Francesco Ferroni, James Hays, Deva Ramanan, Shu Kong
We focus on the task of far-field 3D detection (Far3Det) of objects beyond a certain distance from an observer, e.g., $>$50m.
no code implementations • 16 Nov 2022 • Neehar Peri, Achal Dave, Deva Ramanan, Shu Kong
Moreover, semantic classes are often organized within a hierarchy, e.g., tail classes such as child and construction-worker are arguably subclasses of pedestrian.
no code implementations • 9 Nov 2022 • Yang Liu, Shen Yan, Laura Leal-Taixé, James Hays, Deva Ramanan
We draw inspiration from human visual classification studies and propose generalizing augmentation with invariant transforms to soft augmentation, where the learning target softens non-linearly as a function of the degree of the transform applied to the sample: e.g., more aggressive image crop augmentations produce less confident learning targets.
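A rough sketch of that idea follows, under the assumption that the softening curve is a simple power of the crop fraction; the paper's exact schedule and loss may differ, and all names here are illustrative.

```python
# Hedged sketch of soft augmentation: the target confidence decays non-linearly
# with how much of the image the crop removes, blending the one-hot label toward
# uniform for aggressive crops.
import torch
import torch.nn.functional as F

def soft_target(label, num_classes, crop_fraction, power=2.0, min_conf=0.1):
    """Blend a one-hot label toward uniform as the crop gets more aggressive."""
    conf = min_conf + (1.0 - min_conf) * (1.0 - crop_fraction) ** power
    target = torch.full((num_classes,), (1.0 - conf) / (num_classes - 1))
    target[label] = conf
    return target

logits = torch.randn(1, 100)                                   # model output for one sample
target = soft_target(label=3, num_classes=100, crop_fraction=0.6).unsqueeze(0)
loss = F.kl_div(F.log_softmax(logits, dim=-1), target, reduction="batchmean")
```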
1 code implementation • 19 Oct 2022 • Vladimir Fomenko, Ismail Elezi, Deva Ramanan, Laura Leal-Taixé, Aljoša Ošep
We then train our network to learn to classify each RoI, either as one of the known classes, seen in the source dataset, or one of the novel classes, with a long-tail distribution constraint on the class assignments, reflecting the natural frequency of classes in the real world.
no code implementations • 10 Oct 2022 • Zhiqiu Lin, Deepak Pathak, Yu-Xiong Wang, Deva Ramanan, Shu Kong
LECO requires learning classifiers in distinct time periods (TPs); each TP introduces a new ontology of "fine" labels that refines old ontologies of "coarse" labels (e.g., dog breeds that refine the previous ${\tt dog}$).
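One way to picture the coarse-to-fine setup is the toy snippet below: a model that predicts over the new fine ontology can still be supervised by old coarse labels by summing fine-class probabilities within each coarse parent. The mapping and shapes are hypothetical; this sketches the setup, not the paper's training recipe.

```python
# Hedged sketch: reuse old coarse labels to supervise a fine-grained classifier by
# aggregating fine-class probabilities into their coarse parents.
import torch
import torch.nn.functional as F

fine_to_coarse = torch.tensor([0, 0, 0, 1, 1])   # e.g., 3 dog breeds -> dog, 2 cat breeds -> cat
fine_logits = torch.randn(8, 5)                  # model predicts over the fine ontology
coarse_labels = torch.randint(0, 2, (8,))        # older data only carries coarse labels

fine_probs = F.softmax(fine_logits, dim=-1)
coarse_probs = torch.zeros(8, 2).index_add(1, fine_to_coarse, fine_probs)
loss_on_old_data = F.nll_loss(torch.log(coarse_probs + 1e-8), coarse_labels)
```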
1 code implementation • 4 Oct 2022 • Tarasha Khurana, Peiyun Hu, Achal Dave, Jason Ziglar, David Held, Deva Ramanan
Self-supervised representations proposed for large-scale planning, such as ego-centric freespace, confound these two motions, making the representation difficult to use for downstream motion planners.
1 code implementation • 25 Sep 2022 • Ali Athar, Jonathon Luiten, Paul Voigtlaender, Tarasha Khurana, Achal Dave, Bastian Leibe, Deva Ramanan
Multiple existing benchmarks involve tracking and segmenting objects in video, e.g., Video Object Segmentation (VOS) and Multi-Object Tracking and Segmentation (MOTS), but there is little interaction between them due to the use of disparate benchmark datasets and metrics (e.g., J&F, mAP, sMOTSA).
Multi-Object Tracking • Multi-Object Tracking and Segmentation • +4
1 code implementation • 11 Aug 2022 • Jason Y. Zhang, Deva Ramanan, Shubham Tulsiani
We describe a data-driven method for inferring the camera viewpoints given multiple images of an arbitrary object.
1 code implementation • 1 Jun 2022 • Ali Athar, Jonathon Luiten, Alexander Hermans, Deva Ramanan, Bastian Leibe
Recently, "Masked Attention" was proposed in which a given object representation only attends to those image pixel features for which the segmentation mask of that object is active.
1 code implementation • CVPR 2022 • Neehar Peri, Jonathon Luiten, Mengtian Li, Aljoša Ošep, Laura Leal-Taixé, Deva Ramanan
Object detection and forecasting are fundamental components of embodied perception.
1 code implementation • CVPR 2022 • Shaden Alshammari, Yu-Xiong Wang, Deva Ramanan, Shu Kong
In contrast, weight decay penalizes larger weights more heavily and so learns small, balanced weights; the MaxNorm constraint encourages growing small weights within a norm ball but caps all the weights at the radius (a toy sketch follows below).
Ranked #6 on Long-tail Learning on CIFAR-100-LT (ρ=10)
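As promised above, a toy sketch of that contrast: weight decay is applied by the optimizer at every step, while the MaxNorm constraint is a per-class projection back into an L2 ball after each update. Shapes and hyperparameters are illustrative, not the paper's settings.

```python
# Hedged sketch: weight decay via the optimizer, MaxNorm via post-step projection.
import torch
import torch.nn.functional as F

classifier = torch.nn.Linear(512, 1000, bias=False)
opt = torch.optim.SGD(classifier.parameters(), lr=0.1, weight_decay=5e-4)

def project_max_norm(linear, max_norm=1.0):
    # Renorm each row (one weight vector per class) whose L2 norm exceeds max_norm.
    with torch.no_grad():
        linear.weight.copy_(torch.renorm(linear.weight, p=2, dim=0, maxnorm=max_norm))

x, y = torch.randn(32, 512), torch.randint(0, 1000, (32,))
loss = F.cross_entropy(classifier(x), y)
loss.backward()
opt.step()
project_max_norm(classifier)   # cap per-class weights inside the norm ball
```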
1 code implementation • 17 Jan 2022 • Zhiqiu Lin, Jia Shi, Deepak Pathak, Deva Ramanan
The major strength of CLEAR over prior CL benchmarks is the smooth temporal evolution of visual concepts with real-world imagery, including both high-quality labeled data along with abundant unlabeled samples per time period for continual semi-supervised learning.
no code implementations • CVPR 2022 • Yang Liu, Idil Esen Zulfikar, Jonathon Luiten, Achal Dave, Deva Ramanan, Bastian Leibe, Aljoša Ošep, Laura Leal-Taixé
A benchmark that would allow us to perform an apple-to-apple comparison of existing efforts is a crucial first step towards advancing this important research field.
1 code implementation • CVPR 2022 • Gengshan Yang, Minh Vo, Natalia Neverova, Deva Ramanan, Andrea Vedaldi, Hanbyul Joo
Our key insight is to merge three schools of thought: (1) classic deformable shape models that make use of articulated bones and blend skinning, (2) volumetric neural radiance fields (NeRFs) that are amenable to gradient-based optimization, and (3) canonical embeddings that generate correspondences between pixels and an articulated model.
1 code implementation • CVPR 2022 • Haithem Turki, Deva Ramanan, Mahadev Satyanarayanan
We use neural radiance fields (NeRFs) to build interactive 3D environments from large-scale visual captures spanning buildings or even multiple city blocks collected primarily from drones.
1 code implementation • CVPR 2022 • Ali Athar, Jonathon Luiten, Alexander Hermans, Deva Ramanan, Bastian Leibe
Existing state-of-the-art methods for Video Object Segmentation (VOS) learn low-level pixel-to-pixel correspondences between frames to propagate object masks across video.
1 code implementation • NeurIPS 2021 • Gengshan Yang, Deqing Sun, Varun Jampani, Daniel Vlasic, Forrester Cole, Ce Liu, Deva Ramanan
The surface embeddings are implemented as coordinate-based MLPs that are fit to each video via consistency and contrastive reconstruction losses. Experimental results show that ViSER compares favorably against prior work on challenging videos of humans with loose clothing and unusual poses, as well as animal videos from DAVIS and YTVOS.
3D Shape Reconstruction • 3D Shape Reconstruction from Videos • +1
1 code implementation • NeurIPS 2021 • Jason Y. Zhang, Gengshan Yang, Shubham Tulsiani, Deva Ramanan
NeRS learns a neural shape representation of a closed surface that is diffeomorphic to a sphere, guaranteeing water-tight reconstructions.
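A purely illustrative way to see why a sphere parameterization yields closed surfaces: an MLP that displaces points on the unit sphere produces, by construction, a deformation of the sphere. The actual NeRS model also predicts texture and illumination, which is omitted in this sketch.

```python
# Hedged sketch: a sphere-parameterized surface. Displacing unit-sphere points with
# an MLP gives a surface that is a deformation of a sphere, hence closed.
import torch

mlp = torch.nn.Sequential(
    torch.nn.Linear(3, 128), torch.nn.ReLU(),
    torch.nn.Linear(128, 128), torch.nn.ReLU(),
    torch.nn.Linear(128, 3),
)

uv = torch.randn(2048, 3)
sphere_pts = uv / uv.norm(dim=-1, keepdim=True)   # sample points on the unit sphere
surface_pts = sphere_pts + mlp(sphere_pts)        # deformed (closed) surface points
```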
no code implementations • ICCV 2021 • Fait Poms, Vishnu Sarukkai, Ravi Teja Mullapudi, Nimit S. Sohoni, William R. Mark, Deva Ramanan, Kayvon Fatahalian
For machine learning models trained with limited labeled training data, validation stands to become the main bottleneck to reducing overall annotation costs.
1 code implementation • ICCV 2021 • Chittesh Thavamani, Mengtian Li, Nicolas Cebron, Deva Ramanan
Efficient processing of high-res video streams is safety-critical for many robotics applications such as autonomous driving.
1 code implementation • CVPR 2022 • Kangle Deng, Andrew Liu, Jun-Yan Zhu, Deva Ramanan
Crucially, SfM also produces sparse 3D points that can be used as "free" depth supervision during training: we add a loss to encourage that the distribution of a ray's terminating depth matches a given 3D keypoint, incorporating depth uncertainty.
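A simplified stand-in for such a depth term is sketched below; the paper's actual loss is KL-based with reprojection-error-derived uncertainty, and the function and tensor names here are hypothetical.

```python
# Hedged sketch: encourage a ray's rendered termination distribution (weights over
# depth samples) to concentrate around the SfM keypoint depth, with a tolerance
# set by the keypoint's uncertainty sigma.
import torch

def depth_supervision_loss(weights, t_samples, keypoint_depth, sigma):
    # weights:   (num_rays, num_samples) rendering weights along each ray
    # t_samples: (num_rays, num_samples) sample depths along each ray
    target = torch.exp(-0.5 * ((t_samples - keypoint_depth[:, None]) / sigma[:, None]) ** 2)
    target = target / (target.sum(dim=-1, keepdim=True) + 1e-8)
    kl = torch.sum(weights * (torch.log(weights + 1e-8) - torch.log(target + 1e-8)), dim=-1)
    return kl.mean()

weights = torch.softmax(torch.randn(4, 64), dim=-1)
t_samples = torch.linspace(2.0, 6.0, 64).expand(4, 64)
loss = depth_supervision_loss(weights, t_samples,
                              torch.tensor([3.0, 4.0, 3.5, 5.0]), torch.full((4,), 0.1))
```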
1 code implementation • CVPR 2021 • Peiyun Hu, Aaron Huang, John Dolan, David Held, Deva Ramanan
Finally, we propose future freespace as an additional source of annotation-free supervision.
1 code implementation • CVPR 2021 • Gengshan Yang, Deqing Sun, Varun Jampani, Daniel Vlasic, Forrester Cole, Huiwen Chang, Deva Ramanan, William T. Freeman, Ce Liu
Remarkable progress has been made in 3D reconstruction of rigid structures from a video or a collection of images.
no code implementations • 22 Apr 2021 • Yang Liu, Idil Esen Zulfikar, Jonathon Luiten, Achal Dave, Deva Ramanan, Bastian Leibe, Aljoša Ošep, Laura Leal-Taixé
We hope to open a new front in multi-object tracking research that will hopefully bring us a step closer to intelligent systems that can operate safely in the real world.
2 code implementations • 7 Apr 2021 • Yi-Ting Chen, Jinghao Shi, Zelin Ye, Christoph Mertz, Deva Ramanan, Shu Kong
Object detection with multimodal inputs can improve many safety-critical systems such as autonomous vehicles (AVs).
no code implementations • 7 Apr 2021 • Zhiqiu Lin, Deva Ramanan, Aayush Bansal
We present streaming self-training (SST) that aims to democratize the process of learning visual recognition models such that a non-expert user can define a new task depending on their needs via a few labeled examples and minimal domain knowledge.
1 code implementation • ICCV 2021 • Shu Kong, Deva Ramanan
However, the former generalizes poorly to diverse open test data due to overfitting to the training outliers, which are unlikely to exhaustively span the open-world.
no code implementations • 31 Mar 2021 • Kevin Wang, Deva Ramanan, Aayush Bansal
Associating latent codes of a video and manifold projection enables users to make desired edits.
2 code implementations • 1 Feb 2021 • Achal Dave, Piotr Dollár, Deva Ramanan, Alexander Kirillov, Ross Girshick
On one hand, this is desirable as it treats all classes equally.
1 code implementation • CVPR 2021 • Gengshan Yang, Deva Ramanan
Geometric motion segmentation algorithms, however, generalize to novel scenes, but have yet to achieve comparable performance to appearance-based ones, due to noisy motion estimations and degenerate motion configurations.
no code implementations • ICCV 2021 • Ravi Teja Mullapudi, Fait Poms, William R. Mark, Deva Ramanan, Kayvon Fatahalian
In this paper, we consider the scenario where we start with as little as five labeled positives of a rare category and a large amount of unlabeled data, of which 99.9% is negatives.
no code implementations • 1 Jan 2021 • Shu Kong, Deva Ramanan
Machine-learned safety-critical systems need to be self-aware and reliably know their unknowns in the open-world.
1 code implementation • ICCV 2021 • Tarasha Khurana, Achal Dave, Deva Ramanan
We demonstrate that current detection and tracking systems perform dramatically worse on this task.
1 code implementation • CVPR 2021 • Ravi Teja Mullapudi, Fait Poms, William R. Mark, Deva Ramanan, Kayvon Fatahalian
We focus on the real-world problem of training accurate deep models for image classification of a small number of rare categories.
1 code implementation • 24 Aug 2020 • Siddhesh Khandelwal, William Qi, Jagjeet Singh, Andrew Hartnett, Deva Ramanan
Forecasting the long-term future motion of road actors is a core challenge to the deployment of safe autonomous vehicles (AVs).
1 code implementation • ECCV 2020 • Jason Y. Zhang, Sam Pepose, Hanbyul Joo, Deva Ramanan, Jitendra Malik, Angjoo Kanazawa
We present a method that infers spatial arrangements and shapes of humans and objects in a globally consistent 3D scene, all from a single image in-the-wild captured in an uncontrolled environment.
3D Human Pose Estimation • 3D Shape Reconstruction From A Single 2D Image • +2
no code implementations • CVPR 2020 • Aayush Bansal, Minh Vo, Yaser Sheikh, Deva Ramanan, Srinivasa Narasimhan
We present a data-driven approach for 4D space-time visualization of dynamic events from videos captured by hand-held multiple cameras.
1 code implementation • ECCV 2020 • Mengtian Li, Yu-Xiong Wang, Deva Ramanan
While past work has studied the algorithmic trade-off between latency and accuracy, there has not been a clear metric to compare different methods along the Pareto optimal latency-accuracy curve.
Ranked #2 on Real-Time Object Detection on Argoverse-HD (Detection-Only, Val) (using extra training data)
no code implementations • ECCV 2020 • Achal Dave, Tarasha Khurana, Pavel Tokmakov, Cordelia Schmid, Deva Ramanan
To this end, we ask annotators to label objects that move at any point in the video, and give names to them post factum.
no code implementations • ICLR 2020 • Rohit Girdhar, Deva Ramanan
In this work, we build a video dataset with fully observable and controllable object and scene bias, and which truly requires spatiotemporal understanding in order to be solved.
1 code implementation • 31 Mar 2020 • Ligong Han, Robert F. Murphy, Deva Ramanan
A key step in understanding the spatial organization of cells and tissues is the ability to construct generative models that accurately reflect that organization.
1 code implementation • ICLR 2021 • Kangle Deng, Aayush Bansal, Deva Ramanan
We present an unsupervised approach that converts the input speech of any individual into audiovisual streams of potentially infinitely many output speakers.
1 code implementation • ICLR 2020 • William Qi, Ravi Teja Mullapudi, Saurabh Gupta, Deva Ramanan
In this paper, we combine the best of both worlds with a modular approach that learns a spatial representation of a scene that is trained to be effective when coupled with traditional geometric planners.
2 code implementations • CVPR 2019 • Gengshan Yang, Joshua Manela, Michael Happold, Deva Ramanan
We explore the problem of real-time stereo matching on high-res imagery.
1 code implementation • 12 Dec 2019 • Gengshan Yang, Peiyun Hu, Deva Ramanan
Such approaches cannot diagnose when failures might occur.
1 code implementation • CVPR 2020 • Peiyun Hu, Jason Ziglar, David Held, Deva Ramanan
On the NuScenes 3D detection benchmark, we show that, by adding an additional stream for visibility input, we can significantly improve the overall detection accuracy of a state-of-the-art 3D detector.
no code implementations • 10 Dec 2019 • Peiyun Hu, David Held, Deva Ramanan
We prove that if we score a segmentation by the worst objectness among its individual segments, there is an efficient algorithm that finds the optimal worst-case segmentation among an exponentially large number of candidate segmentations.
2 code implementations • NeurIPS 2019 • Gengshan Yang, Deva Ramanan
As a result, SOTA networks also employ various heuristics designed to limit volumetric processing, leading to limited accuracy and overfitting.
Ranked #11 on Optical Flow Estimation on Sintel-final
no code implementations • 8 Nov 2019 • Bhavan Jasani, Rohit Girdhar, Deva Ramanan
Joint vision and language tasks like visual question answering are fascinating because they explore high-level understanding, but at the same time, can be more prone to language biases.
2 code implementations • CVPR 2019 • Ming-Fang Chang, John Lambert, Patsorn Sangkloy, Jagjeet Singh, Slawomir Bak, Andrew Hartnett, De Wang, Peter Carr, Simon Lucey, Deva Ramanan, James Hays
In our baseline experiments, we illustrate how detailed map information such as lane direction, drivable area, and ground height improves the accuracy of 3D object tracking and motion forecasting.
no code implementations • 25 Oct 2019 • Achal Dave, Pavel Tokmakov, Cordelia Schmid, Deva Ramanan
Moreover, at test time the same network can be applied to detection and tracking, resulting in a unified approach for the two tasks.
no code implementations • ICLR 2020 • Jessica Lee, Deva Ramanan, Rohit Girdhar
We address the task of unsupervised retargeting of human actions from one video to another.
1 code implementation • 10 Oct 2019 • Rohit Girdhar, Deva Ramanan
In this work, we build a video dataset with fully observable and controllable object and scene bias, and which truly requires spatiotemporal understanding in order to be solved.
no code implementations • ICCV 2019 • Phuc Xuan Nguyen, Deva Ramanan, Charless C. Fowlkes
Our approach makes use of two innovations to attention-modeling in weakly-supervised learning.
Action Localization • Weakly Supervised Action Localization • +1
no code implementations • CVPR 2017 • Yu-Xiong Wang, Deva Ramanan, Martial Hebert
One of their remarkable properties is the ability to transfer knowledge from a large source dataset to a (typically smaller) target dataset.
no code implementations • CVPR 2019 • Aayush Bansal, Yaser Sheikh, Deva Ramanan
We introduce a data-driven approach for interactively synthesizing in-the-wild images from semantic label maps.
1 code implementation • ICCV 2021 • Vaishaal Shankar, Achal Dave, Rebecca Roelofs, Deva Ramanan, Benjamin Recht, Ludwig Schmidt
Additionally, we evaluate three detection models and show that natural perturbations induce both classification as well as localization errors, leading to a median drop in detection mAP of 14 points.
no code implementations • ICML Workshop Deep_Phenomen 2019 • Vaishaal Shankar, Achal Dave, Rebecca Roelofs, Deva Ramanan, Benjamin Recht, Ludwig Schmidt
We introduce a systematic framework for quantifying the robustness of classifiers to naturally occurring perturbations of images found in videos.
1 code implementation • ICLR 2020 • Mengtian Li, Ersin Yumer, Deva Ramanan
We also revisit existing approaches for fast convergence and show that budget-aware learning schedules readily outperform such approaches under (the practical but under-explored) budgeted training setting.
no code implementations • 11 Feb 2019 • Achal Dave, Pavel Tokmakov, Deva Ramanan
To address this concern, we propose two new benchmarks for generic, moving object detection, and show that our model matches top-down methods on common categories, while significantly outperforming both top-down and bottom-up methods on never-before-seen categories.
no code implementations • ICCV 2019 • Rohit Girdhar, Du Tran, Lorenzo Torresani, Deva Ramanan
In this work, we propose an alternative approach to learning video representations that requires no semantically labeled videos and instead leverages the years of effort in collecting and labeling large and clean still-image datasets.
Ranked #69 on Action Recognition on HMDB-51 (using extra training data)
3 code implementations • 2 Jan 2019 • Mengtian Li, Zhe Lin, Radomir Mech, Ersin Yumer, Deva Ramanan
Edges, boundaries and contours are important subjects of study in both computer graphics and computer vision.
1 code implementation • ICCV 2019 • Ravi Teja Mullapudi, Steven Chen, Keyi Zhang, Deva Ramanan, Kayvon Fatahalian
Rather than learn a specialized student model on offline data from the video stream, we train the student in an online fashion on the live video, intermittently running the teacher to provide a target for learning.
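The training loop can be pictured roughly as below: the lightweight student is updated on the live stream, with the expensive teacher queried only every few frames to provide targets. The models, stream, and losses here are stand-ins for illustration, not the paper's architecture.

```python
# Hedged sketch of online distillation on a live video stream: the student trains
# against intermittent teacher outputs and runs cheap inference on every frame.
import torch
import torch.nn.functional as F

student = torch.nn.Conv2d(3, 21, kernel_size=3, padding=1)   # cheap per-frame model (placeholder)
teacher = torch.nn.Conv2d(3, 21, kernel_size=3, padding=1)   # stand-in for an expensive model
opt = torch.optim.SGD(student.parameters(), lr=1e-3)
teacher_stride = 8                                           # how often to query the teacher

for frame_idx in range(64):                                  # pretend video stream
    frame = torch.randn(1, 3, 64, 64)
    if frame_idx % teacher_stride == 0:                      # intermittently run the teacher
        with torch.no_grad():
            target = teacher(frame)
        opt.zero_grad()
        loss = F.mse_loss(student(frame), target)            # distill onto the live frame
        loss.backward()
        opt.step()
    pred = student(frame)                                    # cheap inference on every frame
```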
no code implementations • ECCV 2018 • Liang-Yan Gui, Yu-Xiong Wang, Deva Ramanan, Jose M. F. Moura
This paper addresses the problem of few-shot human motion prediction, in the spirit of the recent progress on few-shot learning and meta-learning.
1 code implementation • ECCV 2018 • Aayush Bansal, Shugao Ma, Deva Ramanan, Yaser Sheikh
We introduce a data-driven approach for unsupervised video retargeting that translates content from one domain to another while preserving the style native to a domain, i.e., if the contents of John Oliver's speech were to be transferred to Stephen Colbert, then the generated content/speech should be in Stephen Colbert's style.
no code implementations • ICML 2018 • Phuc Nguyen, Deva Ramanan, Charless Fowlkes
Much recent work on visual recognition aims to scale up learning to massive, noisily-annotated datasets.
1 code implementation • 6 Apr 2018 • Bailey Kong, James Supancic, Deva Ramanan, Charless C. Fowlkes
We investigate the problem of automatically determining what type of shoe left an impression found at a crime scene.
1 code implementation • ICLR 2019 • Peiyun Hu, Zachary C. Lipton, Anima Anandkumar, Deva Ramanan
While many active learning papers assume that the learner can simply ask for a label and receive it, real annotation often presents a mismatch between the form of a label (say, one among many classes), and the form of an annotation (typically yes/no binary feedback).
no code implementations • 6 Feb 2018 • Mengtian Li, Laszlo Jeni, Deva Ramanan
While most prior work treats this as a regression problem, we instead formulate it as a discrete $K$-way classification task, where a classifier is trained to return one of $K$ discrete alignments.
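A hedged sketch of that regression-as-classification idea follows, assuming a single continuous alignment parameter discretized into K bins; the paper operates on full face-alignment hypotheses, so this is only the skeleton of the formulation with placeholder names.

```python
# Hedged sketch: discretize a continuous alignment parameter into K bins, train a
# K-way classifier, and decode the predicted bin center at test time.
import torch
import torch.nn.functional as F

K, lo, hi = 64, -20.0, 20.0                    # K discrete alignments over a range
bin_centers = torch.linspace(lo, hi, K)

def to_bin(offset):
    # Map continuous targets to class indices (clamped to the last bin).
    return torch.bucketize(offset, bin_centers).clamp(max=K - 1)

feats = torch.randn(32, 128)                   # placeholder image features
offsets = torch.empty(32).uniform_(lo, hi)     # placeholder ground-truth offsets
clf = torch.nn.Linear(128, K)
loss = F.cross_entropy(clf(feats), to_bin(offsets))
pred_offset = bin_centers[clf(feats).argmax(dim=-1)]   # decode back to a continuous value
```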
no code implementations • NeurIPS 2017 • Yu-Xiong Wang, Deva Ramanan, Martial Hebert
We cast this problem as transfer learning, where knowledge from the data-rich classes in the head of the distribution is transferred to the data-poor classes in the tail.
no code implementations • 29 Nov 2017 • Victor Fragoso, Chunhui Liu, Aayush Bansal, Deva Ramanan
We present compositional nearest neighbors (CompNN), a simple approach to visually interpreting distributed representations learned by a convolutional neural network (CNN) for pixel-level tasks (e.g., image synthesis and segmentation).
1 code implementation • NeurIPS 2017 • Rohit Girdhar, Deva Ramanan
We introduce a simple yet surprisingly powerful model to incorporate attention in action recognition and human object interaction tasks.
Ranked #7 on Human-Object Interaction Detection on HICO
1 code implementation • ICLR 2018 • Aayush Bansal, Yaser Sheikh, Deva Ramanan
We present a simple nearest-neighbor (NN) approach that synthesizes high-frequency photorealistic images from an "incomplete" signal such as a low-resolution image, a surface normal map, or edges.
no code implementations • ICCV 2017 • Chen Huang, Simon Lucey, Deva Ramanan
Our fundamental insight is to take an adaptive approach, where easy frames are processed with cheap features (such as pixel values), while challenging frames are processed with invariant but expensive deep features.
no code implementations • 8 Aug 2017 • Manuel Günther, Peiyun Hu, Christian Herrmann, Chi Ho Chan, Min Jiang, Shufan Yang, Akshay Raj Dhamija, Deva Ramanan, Jürgen Beyerer, Josef Kittler, Mohamad Al Jazaery, Mohammad Iqbal Nouyed, Guodong Guo, Cezary Stankiewicz, Terrance E. Boult
Face detection and recognition benchmarks have shifted toward more difficult environments.
no code implementations • 22 Jul 2017 • Zachary Pezzementi, Trenton Tabor, Peiyun Hu, Jonathan K. Chang, Deva Ramanan, Carl Wellington, Benzun P. Wisely Babu, Herman Herman
Person detection from vehicles has made rapid progress recently with the advent of multiple high-quality datasets of urban and highway driving, yet no large-scale benchmark is available for the same problem in off-road or agricultural environments.
no code implementations • ICCV 2017 • James Steven Supancic III, Deva Ramanan
We formulate tracking as an online decision-making process, where a tracking agent must follow an object despite ambiguous image frames and a limited computational budget.
no code implementations • CVPR 2017 • Achal Dave, Olga Russakovsky, Deva Ramanan
While deep feature learning has revolutionized techniques for static-image understanding, the same does not quite hold for video processing.
no code implementations • CVPR 2017 • Rohit Girdhar, Deva Ramanan, Abhinav Gupta, Josef Sivic, Bryan Russell
In this work, we introduce a new video representation for action classification that aggregates local convolutional features across the entire spatio-temporal extent of the video.
1 code implementation • CVPR 2017 • Shiyu Huang, Deva Ramanan
Such "in-the-tail" data is notoriously hard to observe, making both training and testing difficult.
1 code implementation • ICCV 2017 • Hamed Kiani Galoogahi, Ashton Fagg, Chen Huang, Deva Ramanan, Simon Lucey
In this paper, we propose the first higher frame rate video dataset (called Need for Speed - NfS) and benchmark for visual object tracking.
1 code implementation • 21 Feb 2017 • Aayush Bansal, Xinlei Chen, Bryan Russell, Abhinav Gupta, Deva Ramanan
We explore design principles for general pixel-level prediction problems, from low-level edge detection to mid-level surface normal estimation to high-level semantic segmentation.
no code implementations • CVPR 2017 • Ching-Hang Chen, Deva Ramanan
While many approaches try to directly predict 3D pose from image measurements, we explore a simple architecture that reasons through intermediate 2D pose predictions (a toy 2D-to-3D lifting sketch follows below).
Ranked #248 on 3D Human Pose Estimation on Human3.6M
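The sketch below illustrates the second stage of such a pipeline with a simple nearest-neighbor lookup from predicted 2D keypoints into a library of paired (2D projection, 3D pose) exemplars. It is close in spirit to, but not identical with, the paper's matching step, and all sizes are placeholders.

```python
# Hedged sketch: "lift" predicted 2D keypoints to 3D by nearest-neighbor matching
# against a library of exemplar poses with known 3D.
import torch

library_2d = torch.randn(10000, 17, 2)     # 2D projections of known 3D poses (placeholder)
library_3d = torch.randn(10000, 17, 3)     # corresponding 3D poses (placeholder)

def lift_to_3d(pred_2d):
    # pred_2d: (17, 2) keypoints from a 2D pose estimator
    dists = ((library_2d - pred_2d) ** 2).sum(dim=(1, 2))
    return library_3d[dists.argmin()]      # return the 3D pose of the closest 2D match

pose_3d = lift_to_3d(torch.randn(17, 2))
```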
no code implementations • 15 Dec 2016 • Vivek Krishnan, Deva Ramanan
We consider the task of visual net surgery, in which a CNN can be reconfigured without extra data to recognize novel concepts that may be omitted from the training set.
20 code implementations • CVPR 2017 • Peiyun Hu, Deva Ramanan
We explore three aspects of the problem in the context of finding small faces: the role of scale invariance, image resolution, and contextual reasoning.
Ranked #14 on Face Detection on WIDER Face (Medium)
no code implementations • 21 Sep 2016 • Aayush Bansal, Xinlei Chen, Bryan Russell, Abhinav Gupta, Deva Ramanan
We explore architectures for general pixel-level prediction problems, from low-level edge detection to mid-level surface normal estimation to high-level semantic segmentation.
no code implementations • 31 Mar 2016 • Phuc Xuan Nguyen, Gregory Rogez, Charless Fowlkes, Deva Ramanan
Micro-videos are six-second videos popular on social media networks with several unique properties.
no code implementations • ICCV 2015 • James S. Supancic III, Gregory Rogez, Yi Yang, Jamie Shotton, Deva Ramanan
To spur further progress we introduce a challenging new dataset with diverse, cluttered scenes.
no code implementations • ICCV 2015 • Chunshui Cao, Xian-Ming Liu, Yi Yang, Yinan Yu, Jiang Wang, Zilei Wang, Yongzhen Huang, Liang Wang, Chang Huang, Wei Xu, Deva Ramanan, Thomas S. Huang
While feedforward deep convolutional neural networks (CNNs) have been a great success in computer vision, it is important to remember that the human visual cortex generally contains more feedback connections than feedforward connections.
no code implementations • ICCV 2015 • Gregory Rogez, James S. Supancic III, Deva Ramanan
We analyze functional manipulations of handheld objects, formalizing the problem as one of fine-grained grasp classification.
1 code implementation • CVPR 2016 • Peiyun Hu, Deva Ramanan
We show that RGs can be optimized with a quadratic program (QP), which can in turn be optimized with a recurrent neural network (with rectified linear units); a toy sketch of this view follows below.
Ranked #40 on Pose Estimation on MPII Human Pose
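The promised toy sketch: a non-negativity-constrained QP can be approached by projected gradient ascent, and each iteration is exactly a recurrent linear update followed by a rectified-linear unit. The objective and step size below are arbitrary illustrations, not the paper's model.

```python
# Hedged sketch: solve  max_x  b^T x - 0.5 x^T A x  s.t. x >= 0  by iterating a
# projected-gradient update, where the ReLU is the projection onto x >= 0.
import torch

def recurrent_qp_solve(A, b, steps=200, lr=0.05):
    x = torch.zeros_like(b)
    for _ in range(steps):
        grad = b - A @ x                   # gradient of the concave QP objective
        x = torch.relu(x + lr * grad)      # recurrent linear update + rectification
    return x

A = torch.eye(5) * 2.0
b = torch.tensor([1.0, -1.0, 0.5, 2.0, -0.3])
x_star = recurrent_qp_solve(A, b)          # approx. relu(b) / 2 for this diagonal A
```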
no code implementations • CVPR 2015 • Gregory Rogez, James S. Supancic III, Deva Ramanan
In egocentric views, hands and arms are observable within a well defined volume in front of the camera.
no code implementations • ICCV 2015 • Songfan Yang, Deva Ramanan
We explore multi-scale convolutional neural nets (CNNs) for image classification.
no code implementations • 24 Apr 2015 • James Steven Supancic III, Gregory Rogez, Yi Yang, Jamie Shotton, Deva Ramanan
To spur further progress we introduce a challenging new dataset with diverse, cluttered scenes.
no code implementations • 5 Mar 2015 • Xiangxin Zhu, Carl Vondrick, Charless Fowlkes, Deva Ramanan
Datasets for training object recognition systems are steadily increasing in size.
no code implementations • 29 Nov 2014 • Gregory Rogez, James S. Supancic III, Deva Ramanan
We tackle the problem of estimating the 3D pose of an individual's upper limbs (arms+hands) from a chest-mounted depth camera.
no code implementations • 29 Nov 2014 • Gregory Rogez, James S. Supancic III, Maryam Khademi, Jose Maria Martinez Montiel, Deva Ramanan
We focus on the task of everyday hand pose estimation from egocentric viewpoints.
no code implementations • CVPR 2014 • Xiangxin Zhu, Dragomir Anguelov, Deva Ramanan
We argue that object subcategories follow a long-tail distribution: a few subcategories are common, while many are rare.
no code implementations • CVPR 2014 • Hamed Pirsiavash, Deva Ramanan
Real-world videos of human activities exhibit temporal structure at various scales; long videos are typically composed out of multiple action instances, where each instance is itself composed of sub-actions with variable durations and orderings.
no code implementations • CVPR 2014 • Mohsen Hejrati, Deva Ramanan
We introduce an efficient "brute-force" approach to inference that searches through a large number of candidate reconstructions, returning the optimal one.
no code implementations • CVPR 2014 • Golnaz Ghiasi, Yi Yang, Deva Ramanan, Charless C. Fowlkes
Occlusion poses a significant difficulty for object recognition due to the combinatorial diversity of possible occlusion patterns.
29 code implementations • 1 May 2014 • Tsung-Yi Lin, Michael Maire, Serge Belongie, Lubomir Bourdev, Ross Girshick, James Hays, Pietro Perona, Deva Ramanan, C. Lawrence Zitnick, Piotr Dollár
We present a new dataset with the goal of advancing the state-of-the-art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding.
no code implementations • 6 Dec 2013 • Deva Ramanan
This manuscript describes a method for training linear SVMs (including binary SVMs, SVM regression, and structural SVMs) from large, out-of-core training datasets.
no code implementations • CVPR 2013 • Dennis Park, C. L. Zitnick, Deva Ramanan, Piotr Dollar
We describe novel but simple motion features for the problem of detecting objects in video sequences.
no code implementations • CVPR 2013 • Xiaofeng Ren, Deva Ramanan
Object detection has seen huge progress in recent years, thanks in large part to the heavily-engineered Histograms of Oriented Gradients (HOG) features.
no code implementations • CVPR 2013 • James S. Supancic III, Deva Ramanan
We address the problem of long-term object tracking, where the object may become occluded or leave the view.
no code implementations • NeurIPS 2012 • Mohsen Hejrati, Deva Ramanan
We use a morphable model to capture 3D within-class variation, and use a weak-perspective camera model to capture viewpoint.
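The two ingredients named above can be written down in a few lines: a linear morphable model for within-class 3D shape variation and a weak-perspective projection (scale, rotation, translation). The numbers below are placeholders, not the paper's learned model.

```python
# Hedged sketch: linear morphable shape model plus weak-perspective projection.
#   3D shape   = mean + sum_k alpha_k * basis_k
#   2D points  = s * (first two rows of R) @ X + t
import torch

num_basis, num_pts = 5, 30
mean_shape = torch.randn(num_pts, 3)           # mean 3D shape (placeholder)
basis = torch.randn(num_basis, num_pts, 3)     # shape basis capturing within-class variation
alpha = torch.randn(num_basis)                 # per-instance shape coefficients

shape_3d = mean_shape + (alpha[:, None, None] * basis).sum(dim=0)

s = 0.8                                        # weak-perspective scale
R = torch.linalg.qr(torch.randn(3, 3)).Q       # an orthonormal rotation-like matrix
t = torch.tensor([0.1, -0.2])                  # 2D translation
proj_2d = s * shape_3d @ R.T[:, :2] + t        # (num_pts, 2) projected keypoints
```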
no code implementations • NeurIPS 2011 • Carl Vondrick, Deva Ramanan
We introduce a novel active learning framework for video annotation.
no code implementations • NeurIPS 2011 • Levi Boyles, Anoop Korattikara, Deva Ramanan, Max Welling
Learning problems such as logistic regression are typically formulated as pure optimization problems defined on some loss function.
no code implementations • NeurIPS 2009 • Hamed Pirsiavash, Deva Ramanan, Charless C. Fowlkes
Bilinear classifiers are a discriminative variant of bilinear models, which capture the dependence of data on multiple factors.