no code implementations • NeurIPS 2012 • Kevin Tang, Vignesh Ramanathan, Li Fei-Fei, Daphne Koller
In this paper, we tackle the problem of adapting object detectors learned from images to work well on videos.
no code implementations • CVPR 2013 • Vignesh Ramanathan, Bangpeng Yao, Li Fei-Fei
We deal with the problem of recognizing social roles played by people in an event.
no code implementations • CVPR 2014 • Alexandre Alahi, Vignesh Ramanathan, Li Fei-Fei
In crowded spaces such as city centers or train stations, human mobility looks complex, but is often influenced by only a few causes.
no code implementations • ICCV 2015 • Vignesh Ramanathan, Kevin Tang, Greg Mori, Li Fei-Fei
In this paper, we propose to learn temporal embeddings of video frames for complex video analysis.
no code implementations • CVPR 2015 • Vignesh Ramanathan, Cong-Cong Li, Jia Deng, Wei Han, Zhen Li, Kunlong Gu, Yang Song, Samy Bengio, Charles Rosenberg, Li Fei-Fei
Human actions capture a wide variety of interactions between people and objects.
no code implementations • CVPR 2016 • Vignesh Ramanathan, Jonathan Huang, Sami Abu-El-Haija, Alexander Gorban, Kevin Murphy, Li Fei-Fei
In this paper, we propose a model which learns to detect events in such videos while automatically "attending" to the people responsible for the event.
no code implementations • CVPR 2016 • Alexandre Alahi, Kratarth Goel, Vignesh Ramanathan, Alexandre Robicquet, Li Fei-Fei, Silvio Savarese
Different from the conventional LSTM, we share the information between multiple LSTMs through a new pooling layer.
Ranked #1 on Trajectory Prediction on Stanford Drone (FDE(8/12) @K=5 metric)
no code implementations • CVPR 2017 • Serena Yeung, Vignesh Ramanathan, Olga Russakovsky, Liyue Shen, Greg Mori, Li Fei-Fei
Our method uses Q-learning to learn a data labeling policy on a small labeled training dataset, and then uses this to automatically label noisy web data for new visual concepts.
4 code implementations • ECCV 2018 • Dhruv Mahajan, Ross Girshick, Vignesh Ramanathan, Kaiming He, Manohar Paluri, Yixuan Li, Ashwin Bharambe, Laurens van der Maaten
ImageNet classification is the de facto pretraining task for these models.
Ranked #221 on Image Classification on ImageNet (using extra training data)
no code implementations • CVPR 2018 • De-An Huang, Vignesh Ramanathan, Dhruv Mahajan, Lorenzo Torresani, Manohar Paluri, Li Fei-Fei, Juan Carlos Niebles
The ability to capture temporal information has been critical to the development of video understanding models.
no code implementations • CVPR 2019 • Zhenheng Yang, Dhruv Mahajan, Deepti Ghadiyaram, Ram Nevatia, Vignesh Ramanathan
Weakly supervised object detection aims at reducing the amount of supervision required to train detection models.
Ranked #1 on Weakly Supervised Object Detection on Charades
no code implementations • CVPR 2020 • Vignesh Ramanathan, Rui Wang, Dhruv Mahajan
This requires a detection framework that can be jointly trained with a limited number of bounding-box-annotated images and a large number of weakly labelled images.
no code implementations • 13 Aug 2020 • Rui Wang, Dhruv Mahajan, Vignesh Ramanathan
It is valuable to train a good proposal model that generalizes to unseen classes.
no code implementations • ICCV 2021 • Vignesh Ramanathan, Rui Wang, Dhruv Mahajan
State-of-the-art object detection approaches typically rely on pre-trained classification models to achieve better performance and faster convergence.
no code implementations • CVPR 2021 • Qing Liu, Vignesh Ramanathan, Dhruv Mahajan, Alan Yuille, Zhenheng Yang
However, existing approaches which rely only on image-level class labels predominantly suffer from errors due to (a) partial segmentation of objects and (b) missing object predictions.
no code implementations • CVPR 2021 • Abhimanyu Dubey, Vignesh Ramanathan, Alex Pentland, Dhruv Mahajan
We show that the existing approaches either do not scale to this dataset or underperform compared to the simple baseline of training a model on the union of data from all training domains.
no code implementations • 10 May 2021 • Lakshmi Annamalai, Vignesh Ramanathan, Chetan Singh Thakur
Compared to competing supervised approaches, ours is a task-agnostic approach ideally suited for the event domain, where task-specific labeled data is scarce.
1 code implementation • CVPR 2023 • Jang Hyun Cho, Philipp Krähenbühl, Vignesh Ramanathan
PartDistillation transfers the part information of an instance segmentation model into a part segmentation model through self-supervised self-training on a large dataset.
1 code implementation • CVPR 2023 • Vignesh Ramanathan, Anmol Kalia, Vladan Petrovic, Yi Wen, Baixue Zheng, Baishan Guo, Rui Wang, Aaron Marquez, Rama Kovvuri, Abhishek Kadian, Amir Mousavi, Yiwen Song, Abhimanyu Dubey, Dhruv Mahajan
This motivates the need for large datasets which go beyond traditional object masks and provide richer annotations such as part masks and attributes.
1 code implementation • CVPR 2023 • Filip Radenovic, Abhimanyu Dubey, Abhishek Kadian, Todor Mihaylov, Simon Vandenhende, Yash Patel, Yi Wen, Vignesh Ramanathan, Dhruv Mahajan
Vision-language models trained with contrastive learning on large-scale noisy data are becoming increasingly popular for zero-shot recognition problems.
no code implementations • 27 Sep 2023 • Xiaoliang Dai, Ji Hou, Chih-Yao Ma, Sam Tsai, Jialiang Wang, Rui Wang, Peizhao Zhang, Simon Vandenhende, Xiaofang Wang, Abhimanyu Dubey, Matthew Yu, Abhishek Kadian, Filip Radenovic, Dhruv Mahajan, Kunpeng Li, Yue Zhao, Vladan Petrovic, Mitesh Kumar Singh, Simran Motwani, Yi Wen, Yiwen Song, Roshan Sumbaly, Vignesh Ramanathan, Zijian He, Peter Vajda, Devi Parikh
Training text-to-image models with web-scale image-text pairs enables the generation of a wide range of visual concepts from text.
no code implementations • 6 Dec 2023 • Ivona Najdenkoska, Animesh Sinha, Abhimanyu Dubey, Dhruv Mahajan, Vignesh Ramanathan, Filip Radenovic
We propose Context Diffusion, a diffusion-based framework that enables image generation models to learn from visual examples presented in context.