Search Results for author: Rahul Sukthankar

Found 45 papers, 14 papers with code

Semi-supervised Learning with Weakly-Related Unlabeled Data : Towards Better Text Categorization

no code implementations • NeurIPS 2008 • Liu Yang, Rong Jin, Rahul Sukthankar

For empirical evaluation, we present a direct comparison with a number of state-of-the-art methods for inductive semi-supervised learning and text categorization; and we show that SSLW results in a significant improvement in categorization accuracy, equipped with a small training set and an unlabeled resource that is weakly related to the test beds.

General Classification Text Categorization

An Integer Projected Fixed Point Method for Graph Matching and MAP Inference

no code implementations • NeurIPS 2009 • Marius Leordeanu, Martial Hebert, Rahul Sukthankar

When applied to MAP inference, the algorithm is a parallel extension of Iterated Conditional Modes (ICM) with climbing and convergence properties that make it a compelling alternative to the sequential ICM.

Graph Matching

Discriminative Segment Annotation in Weakly Labeled Video

no code implementations • CVPR 2013 • Kevin Tang, Rahul Sukthankar, Jay Yagnik, Li Fei-Fei

Second, we ensure that CRANE is robust to label noise, both from tagged videos that fail to contain the concept and from occasional negative videos that do.

Spatiotemporal Deformable Part Models for Action Detection

no code implementations • CVPR 2013 • Yicong Tian, Rahul Sukthankar, Mubarak Shah

Deformable part models have achieved impressive performance for object detection, even on difficult image datasets.

Action Detection object-detection +1

Thoughts on a Recursive Classifier Graph: a Multiclass Network for Deep Object Recognition

no code implementations • 2 Apr 2014 • Marius Leordeanu, Rahul Sukthankar

In this manner we can learn and grow both a deep, complex graph of classifiers and a rich pool of features at different levels of abstraction and interpretation.

Object Recognition

Recognition of Complex Events: Exploiting Temporal Dynamics between Underlying Concepts

no code implementations • CVPR 2014 • Subhabrata Bhattacharya, Mahdi M. Kalayeh, Rahul Sukthankar, Mubarak Shah

While approaches based on bags of features excel at low-level action classification, they are ill-suited for recognizing complex events in video, where concept-based temporal representations currently dominate.

Action Classification Event Detection +3

Large-Scale Video Classification with Convolutional Neural Networks

1 code implementation • CVPR 2014 • Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, Li Fei-Fei

We further study the generalization performance of our best model by retraining the top layers on the UCF-101 Action Recognition dataset and observe significant performance improvements compared to the UCF-101 baseline model (63.3% up from 43.9%).

Ranked #9 on Action Recognition on Sports-1M

Action Recognition Classification +3

Features in Concert: Discriminative Feature Selection meets Unsupervised Clustering

no code implementations • 27 Nov 2014 • Marius Leordeanu, Alexandra Radu, Rahul Sukthankar

Feature selection is an essential problem in computer vision, important for category learning and recognition.

Clustering feature selection

Articulated motion discovery using pairs of trajectories

no code implementations • CVPR 2015 • Luca Del Pero, Susanna Ricco, Rahul Sukthankar, Vittorio Ferrari

We propose an unsupervised approach for discovering characteristic motion patterns in videos of highly articulated objects performing natural, unscripted behaviors, such as tigers in the wild.

Recovering Spatiotemporal Correspondence between Deformable Objects by Exploiting Consistent Foreground Motion in Video

no code implementations • 1 Dec 2014 • Luca Del Pero, Susanna Ricco, Rahul Sukthankar, Vittorio Ferrari

Given unstructured videos of deformable objects, we automatically recover spatiotemporal correspondences to map one object to another (such as animals in the wild).

Object

Temporal Localization of Fine-Grained Actions in Videos by Domain Transfer from Web Images

1 code implementation • 4 Apr 2015 • Chen Sun, Sanketh Shetty, Rahul Sukthankar, Ram Nevatia

To solve this problem, we propose a simple yet effective method that takes weak video labels and noisy image labels as input, and generates localized action frames as output.

Action Recognition Temporal Action Localization +1

Robust Video Segment Proposals With Painless Occlusion Handling

no code implementations • CVPR 2015 • Zhengyang Wu, Fuxin Li, Rahul Sukthankar, James M. Rehg

We propose a robust algorithm to generate video segment proposals.

Occlusion Handling

MatchNet: Unifying Feature and Metric Learning for Patch-Based Matching

2 code implementations • CVPR 2015 • Xufeng Han, Thomas Leung, Yangqing Jia, Rahul Sukthankar, Alexander C. Berg

We perform a comprehensive set of experiments on standard datasets to carefully study the contributions of each aspect of MatchNet, with direct comparisons to established methods.

Computational Efficiency Metric Learning +1

Coreset-Based Adaptive Tracking

no code implementations • 19 Nov 2015 • Abhimanyu Dubey, Nikhil Naik, Dan Raviv, Rahul Sukthankar, Ramesh Raskar

We propose a method for learning from streaming visual data using a compact, constant size representation of all the data that was seen until a given moment.

Object Object Tracking

Variable Rate Image Compression with Recurrent Neural Networks

1 code implementation • 19 Nov 2015 • George Toderici, Sean M. O'Malley, Sung Jin Hwang, Damien Vincent, David Minnen, Shumeet Baluja, Michele Covell, Rahul Sukthankar

A large fraction of Internet traffic is now driven by requests from mobile devices with relatively small screens and often stringent bandwidth requirements.

Image Compression Image Reconstruction

Behavior Discovery and Alignment of Articulated Object Classes from Unstructured Video

no code implementations • 30 Nov 2015 • Luca Del Pero, Susanna Ricco, Rahul Sukthankar, Vittorio Ferrari

On behavior discovery, we outperform the state-of-the-art Improved DTF descriptor.

Retrieval

Labeling the Features Not the Samples: Efficient Video Classification with Minimal Supervision

no code implementations • 1 Dec 2015 • Marius Leordeanu, Alexandra Radu, Shumeet Baluja, Rahul Sukthankar

Our method works both as a feature selection mechanism and as a fully competitive classifier.

Clustering feature selection +2

The THUMOS Challenge on Action Recognition for Videos "in the Wild"

no code implementations • 21 Apr 2016 • Haroon Idrees, Amir R. Zamir, Yu-Gang Jiang, Alex Gorban, Ivan Laptev, Rahul Sukthankar, Mubarak Shah

Additionally, we include a comprehensive empirical study evaluating the differences in action recognition between trimmed and untrimmed videos, and how well methods trained on trimmed videos generalize to untrimmed videos.

Action Classification Action Recognition +3

Discovering the Physical Parts of an Articulated Object Class From Multiple Videos

no code implementations • CVPR 2016 • Luca Del Pero, Susanna Ricco, Rahul Sukthankar, Vittorio Ferrari

We propose a motion-based method to discover the physical parts of an articulated object class (e.g., head/torso/leg of a horse) from multiple videos.

Motion Segmentation Object +1

Beyond Skip Connections: Top-Down Modulation for Object Detection

1 code implementation • 20 Dec 2016 • Abhinav Shrivastava, Rahul Sukthankar, Jitendra Malik, Abhinav Gupta

But most of these fine details are lost in the early convolutional layers.

Ranked #203 on Object Detection on COCO test-dev

Object object-detection +1

Traffic Lights with Auction-Based Controllers: Algorithms and Real-World Data

no code implementations • 3 Feb 2017 • Shumeet Baluja, Michele Covell, Rahul Sukthankar

Real-time optimization of traffic flow addresses important practical problems: reducing a driver's wasted time, improving city-wide efficiency, reducing gas emissions and improving air quality.

Cognitive Mapping and Planning for Visual Navigation

6 code implementations • CVPR 2017 • Saurabh Gupta, Varun Tolani, James Davidson, Sergey Levine, Rahul Sukthankar, Jitendra Malik

The accumulated belief of the world enables the agent to track visited regions of the environment.

Visual Navigation

Robust Adversarial Reinforcement Learning

6 code implementations • ICML 2017 • Lerrel Pinto, James Davidson, Rahul Sukthankar, Abhinav Gupta

Deep neural networks coupled with fast simulation and improved computation have led to recent successes in the field of reinforcement learning (RL).

Friction reinforcement-learning +1

SfM-Net: Learning of Structure and Motion from Video

no code implementations • 25 Apr 2017 • Sudheendra Vijayanarasimhan, Susanna Ricco, Cordelia Schmid, Rahul Sukthankar, Katerina Fragkiadaki

We propose SfM-Net, a geometry-aware neural network for motion estimation in videos that decomposes frame-to-frame pixel motion in terms of scene and object depth, camera motion and 3D object rotations and translations.

Motion Estimation Object +1

Motion Prediction Under Multimodality with Conditional Stochastic Networks

no code implementations • 5 May 2017 • Katerina Fragkiadaki, Jonathan Huang, Alex Alemi, Sudheendra Vijayanarasimhan, Susanna Ricco, Rahul Sukthankar

In this work, we present stochastic neural network architectures that handle such multimodality through stochasticity: future trajectories of objects, body joints or frames are represented as deep, non-linear transformations of random (as opposed to deterministic) variables.

motion prediction Optical Flow Estimation +2

WebVision Challenge: Visual Learning and Understanding With Web Data

no code implementations • 16 May 2017 • Wen Li, Li-Min Wang, Wei Li, Eirikur Agustsson, Jesse Berent, Abhinav Gupta, Rahul Sukthankar, Luc van Gool

The 2017 WebVision challenge consists of two tracks: the image classification task on the WebVision test set and the transfer learning task on the PASCAL VOC 2012 dataset.

Benchmarking Image Classification +1

AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions

8 code implementations • CVPR 2018 • Chunhui Gu, Chen Sun, David A. Ross, Carl Vondrick, Caroline Pantofaru, Yeqing Li, Sudheendra Vijayanarasimhan, George Toderici, Susanna Ricco, Rahul Sukthankar, Cordelia Schmid, Jitendra Malik

The AVA dataset densely annotates 80 atomic visual actions in 430 15-minute video clips, where actions are localized in space and time, resulting in 1.58M action labels with multiple labels per person occurring frequently.

Ranked #6 on Action Detection on UCF101-24

Action Detection +3

Object category learning and retrieval with weak supervision

1 code implementation • 26 Jan 2018 • Steven Hickson, Anelia Angelova, Irfan Essa, Rahul Sukthankar

We consider the problem of retrieving objects from image data and learning to classify them into meaningful semantic categories with minimal supervision.

Clustering Deep Clustering +2

Rethinking the Faster R-CNN Architecture for Temporal Action Localization

no code implementations • CVPR 2018 • Yu-Wei Chao, Sudheendra Vijayanarasimhan, Bryan Seybold, David A. Ross, Jia Deng, Rahul Sukthankar

We propose TAL-Net, an improved approach to temporal action localization in video that is inspired by the Faster R-CNN object detection framework.

Ranked #27 on Temporal Action Localization on THUMOS’14

Action Classification General Classification +3

Actor-Centric Relation Network

1 code implementation • ECCV 2018 • Chen Sun, Abhinav Shrivastava, Carl Vondrick, Kevin Murphy, Rahul Sukthankar, Cordelia Schmid

A visualization of the learned relation features confirms that our approach is able to attend to the relevant relations for each action.

Ranked #15 on Action Recognition on AVA v2.1

Action Classification Action Detection +5

Modulated Policy Hierarchies

no code implementations • 30 Nov 2018 • Alexander Pashevich, Danijar Hafner, James Davidson, Rahul Sukthankar, Cordelia Schmid

To achieve this, we study different modulation signals and exploration for hierarchical controllers.

Reinforcement Learning (RL)

D3D: Distilled 3D Networks for Video Action Recognition

1 code implementation • 19 Dec 2018 • Jonathan C. Stroud, David A. Ross, Chen Sun, Jia Deng, Rahul Sukthankar

State-of-the-art methods for video action recognition commonly use an ensemble of two networks: the spatial stream, which takes RGB frames as input, and the temporal stream, which takes optical flow as input.

Ranked #11 on Action Recognition on AVA v2.1

Action Classification Action Recognition +2

Relational Action Forecasting

no code implementations • CVPR 2019 • Chen Sun, Abhinav Shrivastava, Carl Vondrick, Rahul Sukthankar, Kevin Murphy, Cordelia Schmid

This paper focuses on multi-person action forecasting in videos.

Action Classification Action Recognition +1

An Efficient 3D CNN for Action/Object Segmentation in Video

no code implementations • 21 Jul 2019 • Rui Hou, Chen Chen, Rahul Sukthankar, Mubarak Shah

Convolutional Neural Network (CNN) based image segmentation has made great progress in recent years.

Ranked #64 on Semi-Supervised Video Object Segmentation on DAVIS 2016

Action Segmentation Image Segmentation +6

Selfie Drone Stick: A Natural Interface for Quadcopter Photography

no code implementations • 14 Sep 2019 • Saif Alabachi, Gita Sukthankar, Rahul Sukthankar

This paper describes two key innovations required to deploy deep reinforcement learning models on a real robot: 1) an abstract state representation for transferring learning from simulation to the hardware platform, and 2) reward shaping and staging paradigms for training the controller.

Weakly Supervised 3D Human Pose and Shape Reconstruction with Normalizing Flows

no code implementations • ECCV 2020 • Andrei Zanfir, Eduard Gabriel Bazavan, Hongyi Xu, Bill Freeman, Rahul Sukthankar, Cristian Sminchisescu

Monocular 3D human pose and shape estimation is challenging due to the many degrees of freedom of the human body and the difficulty of acquiring training data for large-scale supervised learning in complex visual scenes.

Ranked #54 on 3D Human Pose Estimation on 3DPW (PA-MPJPE metric)

3D human pose and shape estimation Self-Supervised Learning

Speech2Action: Cross-modal Supervision for Action Recognition

no code implementations • CVPR 2020 • Arsha Nagrani, Chen Sun, David Ross, Rahul Sukthankar, Cordelia Schmid, Andrew Zisserman

We train a BERT-based Speech2Action classifier on over a thousand movie screenplays, to predict action labels from transcribed speech segments.

Action Recognition

Learning Video Representations from Textual Web Supervision

no code implementations • 29 Jul 2020 • Jonathan C. Stroud, Zhichao Lu, Chen Sun, Jia Deng, Rahul Sukthankar, Cordelia Schmid, David A. Ross

Based on this observation, we propose to use text as a method for learning video representations.

Action Recognition Representation Learning

The End-of-End-to-End: A Video Understanding Pentathlon Challenge (2020)

1 code implementation • 3 Aug 2020 • Samuel Albanie, Yang Liu, Arsha Nagrani, Antoine Miech, Ernesto Coto, Ivan Laptev, Rahul Sukthankar, Bernard Ghanem, Andrew Zisserman, Valentin Gabeur, Chen Sun, Karteek Alahari, Cordelia Schmid, Shi-Zhe Chen, Yida Zhao, Qin Jin, Kaixu Cui, Hui Liu, Chen Wang, Yudong Jiang, Xiaoshuai Hao

This report summarizes the results of the first edition of the challenge together with the findings of the participants.

Natural Language Queries Retrieval +3

Neural Descent for Visual 3D Human Pose and Shape

no code implementations • CVPR 2021 • Andrei Zanfir, Eduard Gabriel Bazavan, Mihai Zanfir, William T. Freeman, Rahul Sukthankar, Cristian Sminchisescu

We present a deep neural network methodology to reconstruct the 3D pose and shape of people, given an input RGB image.

Ranked #62 on 3D Human Pose Estimation on 3DPW (MPJPE metric)

3D Human Pose Estimation

Semi-Supervised Learning for Multi-Task Scene Understanding by Neural Graph Consensus

3 code implementations • 2 Oct 2020 • Marius Leordeanu, Mihai Pirvu, Dragos Costea, Alina Marcu, Emil Slusanschi, Rahul Sukthankar

The unsupervised learning process is repeated over several generations, in which each edge becomes a "student" and also part of different ensemble "teachers" for training other students.

Scene Understanding Semantic Segmentation

THUNDR: Transformer-based 3D HUmaN Reconstruction with Markers

no code implementations • ICCV 2021 • Mihai Zanfir, Andrei Zanfir, Eduard Gabriel Bazavan, William T. Freeman, Rahul Sukthankar, Cristian Sminchisescu

We present THUNDR, a transformer-based deep neural network methodology to reconstruct the 3D pose and shape of people, given monocular RGB images.

Ranked #41 on 3D Human Pose Estimation on 3DPW (MPJPE metric)

3D Human Pose Estimation 3D Human Reconstruction +2

Discrete Representations Strengthen Vision Transformer Robustness

1 code implementation • ICLR 2022 • Chengzhi Mao, Lu Jiang, Mostafa Dehghani, Carl Vondrick, Rahul Sukthankar, Irfan Essa

Vision Transformer (ViT) is emerging as the state-of-the-art architecture for image recognition.

Ranked #3 on Domain Generalization on Stylized-ImageNet

Domain Generalization Image Classification

HSPACE: Synthetic Parametric Humans Animated in Complex Environments

no code implementations • 23 Dec 2021 • Eduard Gabriel Bazavan, Andrei Zanfir, Mihai Zanfir, William T. Freeman, Rahul Sukthankar, Cristian Sminchisescu

We combine a hundred diverse individuals of varying ages, gender, proportions, and ethnicity, with hundreds of motions and scenes, as well as parametric variations in body shape (for a total of 1,600 different humans), in order to generate an initial dataset of over 1 million frames.

Ranked #1 on 3D Human Pose Estimation on HSPACE

3D Human Pose Estimation Scene Understanding

Self-supervised Hypergraphs for Learning Multiple World Interpretations

no code implementations • 15 Aug 2023 • Alina Marcu, Mihai Pirvu, Dragos Costea, Emanuela Haller, Emil Slusanschi, Ahmed Nabil Belbachir, Rahul Sukthankar, Marius Leordeanu

Thus, each node could be an input node in some hyperedges and an output node in others.

Multi-Task Learning Self-Supervised Learning
