Search Results for author: Abhinav Shrivastava

Found 100 papers, 46 papers with code

MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

1 code implementation 8 Apr 2024 Bo He, Hengduo Li, Young Kyun Jang, Menglin Jia, Xuefei Cao, Ashish Shah, Abhinav Shrivastava, Ser-Nam Lim

However, existing LLM-based large multimodal models (e.g., Video-LLaMA, VideoChat) can only take in a limited number of frames for short video understanding.

Question Answering Video Captioning +4

Measuring Style Similarity in Diffusion Models

1 code implementation 1 Apr 2024 Gowthami Somepalli, Anubhav Gupta, Kamal Gupta, Shramay Palta, Micah Goldblum, Jonas Geiping, Abhinav Shrivastava, Tom Goldstein

We also propose a method to extract style descriptors that can be used to attribute style of a generated image to the images used in the training dataset of a text-to-image model.

Attribute

LiFT: A Surprisingly Simple Lightweight Feature Transform for Dense ViT Descriptors

no code implementations 21 Mar 2024 Saksham Suri, Matthew Walmer, Kamal Gupta, Abhinav Shrivastava

We present a simple self-supervised method to enhance the performance of ViT features for dense downstream tasks.

Object Discovery

Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models

no code implementations 22 Feb 2024 Yixuan Ren, Yang Zhou, Jimei Yang, Jing Shi, Difan Liu, Feng Liu, Mingi Kwon, Abhinav Shrivastava

With the emergence of text-to-video (T2V) diffusion models, its temporal counterpart, motion customization, has not yet been well investigated.

Video Generation

Explaining the Implicit Neural Canvas: Connecting Pixels to Neurons by Tracing their Contributions

no code implementations 18 Jan 2024 Namitha Padmanabhan, Matthew Gwilliam, Pulkit Kumar, Shishira R Maiya, Max Ehrlich, Abhinav Shrivastava

We call the aggregate of these contribution maps the Implicit Neural Canvas and we use this concept to demonstrate that the INRs which we study learn to "see" the frames they represent in surprising ways.

Novel View Synthesis Video Compression

Video Dynamics Prior: An Internal Learning Approach for Robust Video Enhancements

no code implementations NeurIPS 2023 Gaurav Shrivastava, Ser-Nam Lim, Abhinav Shrivastava

In this paper, we present a novel robust framework for low-level vision tasks, including denoising, object removal, frame interpolation, and super-resolution, that does not require any external training data corpus.

Denoising Super-Resolution

EAGLES: Efficient Accelerated 3D Gaussians with Lightweight EncodingS

no code implementations 7 Dec 2023 Sharath Girish, Kamal Gupta, Abhinav Shrivastava

We validate the effectiveness of our approach on a variety of datasets and scenes preserving the visual quality while consuming 10-20x less memory and faster training/inference speed.

Gen2Det: Generate to Detect

no code implementations 7 Dec 2023 Saksham Suri, Fanyi Xiao, Animesh Sinha, Sean Chang Culatana, Raghuraman Krishnamoorthi, Chenchen Zhu, Abhinav Shrivastava

In the long-tailed detection setting on LVIS, Gen2Det improves the performance on rare categories by a large margin while also significantly improving the performance on other categories, e.g., we see an improvement of 2.13 Box AP and 1.84 Mask AP over just training on real data on LVIS with Mask R-CNN.

Image Generation Object +2

Multimodality-guided Image Style Transfer using Cross-modal GAN Inversion

no code implementations 4 Dec 2023 Hanyu Wang, Pengxiang Wu, Kevin Dela Rosa, Chen Wang, Abhinav Shrivastava

Compared to IIST, such approaches provide more flexibility with text-specified styles, which are useful in scenarios where the style is hard to define with reference images.

Style Transfer

A Video is Worth 10,000 Words: Training and Benchmarking with Diverse Captions for Better Long Video Retrieval

no code implementations 30 Nov 2023 Matthew Gwilliam, Michael Cogswell, Meng Ye, Karan Sikka, Abhinav Shrivastava, Ajay Divakaran

To provide a more thorough evaluation of the capabilities of long video retrieval systems, we propose a pipeline that leverages state-of-the-art large language models to carefully generate a diverse set of synthetic captions for long videos.

Benchmarking Retrieval +2

Multi-entity Video Transformers for Fine-Grained Video Representation Learning

1 code implementation 17 Nov 2023 Matthew Walmer, Rose Kanjirathinkal, Kai Sheng Tai, Keyur Muzumdar, Taipeng Tian, Abhinav Shrivastava

In this work, we advance the state-of-the-art for this area by re-examining the design of transformer architectures for video representation learning.

Representation Learning

SHACIRA: Scalable HAsh-grid Compression for Implicit Neural Representations

no code implementations ICCV 2023 Sharath Girish, Abhinav Shrivastava, Kamal Gupta

Implicit Neural Representations (INR) or neural fields have emerged as a popular framework to encode multimedia signals such as images and radiance fields while retaining high-quality.

Quantization

Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization

1 code implementation 18 Aug 2023 Soumik Mukhopadhyay, Saksham Suri, Ravi Teja Gadde, Abhinav Shrivastava

We show results on both reconstruction (same audio-video inputs) as well as cross (different audio-video inputs) settings on Voxceleb2 and LRW datasets.

Diffusion Models Beat GANs on Image Classification

1 code implementation 17 Jul 2023 Soumik Mukhopadhyay, Matthew Gwilliam, Vatsal Agarwal, Namitha Padmanabhan, Archana Swaminathan, Srinidhi Hegde, Tianyi Zhou, Abhinav Shrivastava

We explore optimal methods for extracting and using these embeddings for classification tasks, demonstrating promising results on the ImageNet classification task.

Classification Denoising +5

MOST: Multiple Object localization with Self-supervised Transformers for object discovery

no code implementations ICCV 2023 Sai Saketh Rambhatla, Ishan Misra, Rama Chellappa, Abhinav Shrivastava

In this work, we present Multiple Object localization with Self-supervised Transformers (MOST) that uses features of transformers trained using self-supervised learning to localize multiple objects in real world images.

Object object-detection +6

ASIC: Aligning Sparse in-the-wild Image Collections

no code implementations ICCV 2023 Kamal Gupta, Varun Jampani, Carlos Esteves, Abhinav Shrivastava, Ameesh Makadia, Noah Snavely, Abhishek Kar

We present a self-supervised technique that directly optimizes on a sparse collection of images of a particular object/object category to obtain consistent dense correspondences across the collection.

Object

Towards Scalable Neural Representation for Diverse Videos

no code implementations CVPR 2023 Bo He, Xitong Yang, Hanyu Wang, Zuxuan Wu, Hao Chen, Shuaiyi Huang, Yixuan Ren, Ser-Nam Lim, Abhinav Shrivastava

Implicit neural representations (INR) have gained increasing attention in representing 3D scenes and images, and have been recently applied to encode videos (e.g., NeRV, E-NeRV).

Action Recognition Video Compression

COVID-VTS: Fact Extraction and Verification on Short Video Platforms

1 code implementation 15 Feb 2023 Fuxiao Liu, Yaser Yacoob, Abhinav Shrivastava

We introduce a new benchmark, COVID-VTS, for fact-checking multi-modal information involving short-duration videos with COVID19-focused information from both the real world and machine generation.

Fact Checking Fact Selection +1

BT^2: Backward-compatible Training with Basis Transformation

1 code implementation ICCV 2023 Yifei Zhou, Zilu Li, Abhinav Shrivastava, Hengshuang Zhao, Antonio Torralba, Taipeng Tian, Ser-Nam Lim

In this way, the new representation can be directly compared with the old representation, in principle avoiding the need for any backfilling.

Teaching Matters: Investigating the Role of Supervision in Vision Transformers

1 code implementation CVPR 2023 Matthew Walmer, Saksham Suri, Kamal Gupta, Abhinav Shrivastava

We compare ViTs trained through different methods of supervision, and show that they learn a diverse range of behaviors in terms of their attention, representations, and downstream performance.

CNeRV: Content-adaptive Neural Representation for Visual Data

no code implementations 18 Nov 2022 Hao Chen, Matt Gwilliam, Bo He, Ser-Nam Lim, Abhinav Shrivastava

We match the performance of NeRV, a state-of-the-art implicit neural representation, on the reconstruction task for frames seen during training while far surpassing for frames that are skipped during training (unseen images).

Data Compression

$BT^2$: Backward-compatible Training with Basis Transformation

1 code implementation 8 Nov 2022 Yifei Zhou, Zilu Li, Abhinav Shrivastava, Hengshuang Zhao, Antonio Torralba, Taipeng Tian, Ser-Nam Lim

In this way, the new representation can be directly compared with the old representation, in principle avoiding the need for any backfilling.

Retrieval

Learning Semantic Correspondence with Sparse Annotations

1 code implementation 15 Aug 2022 Shuaiyi Huang, Luyu Yang, Bo He, Songyang Zhang, Xuming He, Abhinav Shrivastava

In this paper, we aim to address the challenge of label sparsity in semantic correspondence by enriching supervision signals from sparse keypoint annotations.

Denoising Semantic correspondence

Beyond Supervised vs. Unsupervised: Representative Benchmarking and Analysis of Image Representation Learning

1 code implementation CVPR 2022 Matthew Gwilliam, Abhinav Shrivastava

In this paper, we compare methods using performance-based benchmarks such as linear evaluation, nearest neighbor classification, and clustering for several different datasets, demonstrating the lack of a clear front-runner within the current state-of-the-art.

Benchmarking Clustering +3

Disentangling Visual Embeddings for Attributes and Objects

1 code implementation CVPR 2022 Nirat Saini, Khoi Pham, Abhinav Shrivastava

We use visual decomposed features to hallucinate embeddings that are representative for the seen and novel compositions to better regularize the learning of our model.

Attribute Compositional Zero-Shot Learning +2

Neural Space-filling Curves

no code implementations 18 Apr 2022 Hanyu Wang, Kamal Gupta, Larry Davis, Abhinav Shrivastava

We present Neural Space-filling Curves (SFCs), a data-driven approach to infer a context-based scan order for a set of images.

Image Compression

LilNetX: Lightweight Networks with EXtreme Model Compression and Structured Sparsification

1 code implementation 6 Apr 2022 Sharath Girish, Kamal Gupta, Saurabh Singh, Abhinav Shrivastava

We introduce LilNetX, an end-to-end trainable technique for neural networks that enables learning models with specified accuracy-rate-computation trade-off.

Model Compression

ASM-Loc: Action-aware Segment Modeling for Weakly-Supervised Temporal Action Localization

1 code implementation CVPR 2022 Bo He, Xitong Yang, Le Kang, Zhiyu Cheng, Xin Zhou, Abhinav Shrivastava

Without the boundary information of action segments, existing methods mostly rely on multiple instance learning (MIL), where the predictions of unlabeled instances (i.e., video snippets) are supervised by classifying labeled bags (i.e., untrimmed videos).

Weakly Supervised Temporal Action Localization

ObjectFormer for Image Manipulation Detection and Localization

no code implementations CVPR 2022 Junke Wang, Zuxuan Wu, Jingjing Chen, Xintong Han, Abhinav Shrivastava, Ser-Nam Lim, Yu-Gang Jiang

Recent advances in image editing techniques have posed serious challenges to the trustworthiness of multimedia data, which drives the research of image tampering detection.

Image Manipulation Image Manipulation Detection

Dual-Key Multimodal Backdoors for Visual Question Answering

1 code implementation CVPR 2022 Matthew Walmer, Karan Sikka, Indranil Sur, Abhinav Shrivastava, Susmit Jha

This is challenging for the attacker as the detector can distort or ignore the visual trigger entirely, which leads to models where backdoors are over-reliant on the language trigger.

Question Answering Visual Question Answering

Burn After Reading: Online Adaptation for Cross-domain Streaming Data

no code implementations 8 Dec 2021 Luyu Yang, Mingfei Gao, Zeyuan Chen, Ran Xu, Abhinav Shrivastava, Chetan Ramaiah

In the context of online privacy, many methods propose complex privacy and security preserving measures to protect sensitive data.

Unsupervised Domain Adaptation

PatchGame: Learning to Signal Mid-level Patches in Referential Games

1 code implementation NeurIPS 2021 Kamal Gupta, Gowthami Somepalli, Anubhav Gupta, Vinoj Jayasundara, Matthias Zwicker, Abhinav Shrivastava

We study a referential game (a type of signaling game) where two agents communicate with each other via a discrete bottleneck to achieve a common goal.

NeRV: Neural Representations for Videos

3 code implementations NeurIPS 2021 Hao Chen, Bo He, Hanyu Wang, Yixuan Ren, Ser-Nam Lim, Abhinav Shrivastava

In contrast, with NeRV, we can use any neural network compression method as a proxy for video compression, and achieve comparable performance to traditional frame-based video compression approaches (H.264, HEVC, etc.).

Denoising Neural Network Compression +3

A Frequency Perspective of Adversarial Robustness

no code implementations 26 Oct 2021 Shishira R Maiya, Max Ehrlich, Vatsal Agarwal, Ser-Nam Lim, Tom Goldstein, Abhinav Shrivastava

Our analysis shows that adversarial examples are neither in high-frequency nor in low-frequency components, but are simply dataset dependent.

Adversarial Robustness

HR-RCNN: Hierarchical Relational Reasoning for Object Detection

no code implementations 26 Oct 2021 Hao Chen, Abhinav Shrivastava

Incorporating relational reasoning in neural networks for object recognition remains an open problem.

Graph Attention Instance Segmentation +6

Diverse Video Generation using a Gaussian Process Trigger

1 code implementation ICLR 2021 Gaurav Shrivastava, Abhinav Shrivastava

Our approach, Diverse Video Generator, uses a Gaussian Process (GP) to learn priors on future states given the past and maintains a probability distribution over possible futures given a particular sample.

Ranked #1 on Video Prediction on KTH (Diversity metric)

Video Generation Video Prediction

Hierarchical Video Prediction Using Relational Layouts for Human-Object Interactions

no code implementations CVPR 2021 Navaneeth Bodla, Gaurav Shrivastava, Rama Chellappa, Abhinav Shrivastava

Our work builds on hierarchical video prediction models, which disentangle the video generation process into two stages: predicting a high-level representation, such as pose sequence, and then learning a pose-to-pixels translation model for pixel generation.

Human-Object Interaction Detection Object +4

Learning Graphs for Knowledge Transfer With Limited Labels

no code implementations CVPR 2021 Pallabi Ghosh, Nirat Saini, Larry S. Davis, Abhinav Shrivastava

The standard paradigm is to utilize relationships in the input graph to transfer information using GCNs from training to testing nodes in the graph; for example, the semi-supervised, zero-shot, and few-shot learning setups.

Benchmarking Few-Shot action recognition +3

Learning to Predict Visual Attributes in the Wild

no code implementations CVPR 2021 Khoi Pham, Kushal Kafle, Zhe Lin, Zhihong Ding, Scott Cohen, Quan Tran, Abhinav Shrivastava

In this paper, we introduce a large-scale in-the-wild visual attribute prediction dataset consisting of over 927K attribute annotations for over 260K object instances.

Attribute Contrastive Learning +2

Rethinking Pseudo Labels for Semi-Supervised Object Detection

no code implementations 1 Jun 2021 Hengduo Li, Zuxuan Wu, Abhinav Shrivastava, Larry S. Davis

In this paper, we introduce certainty-aware pseudo labels tailored for object detection, which can effectively estimate the classification and localization quality of derived pseudo labels.

Classification Image Classification +4

Towards Discovery and Attribution of Open-world GAN Generated Images

1 code implementation ICCV 2021 Sharath Girish, Saksham Suri, Saketh Rambhatla, Abhinav Shrivastava

Through extensive experiments, we show that our algorithm discovers unseen GANs with high accuracy and also generalizes to GANs trained on unseen real datasets.

Attribute Clustering +1

The Pursuit of Knowledge: Discovering and Localizing Novel Categories using Dual Memory

no code implementations ICCV 2021 Sai Saketh Rambhatla, Rama Chellappa, Abhinav Shrivastava

We tackle object category discovery, which is the problem of discovering and localizing novel objects in a large unlabeled dataset.

Object

Learned Spatial Representations for Few-shot Talking-Head Synthesis

no code implementations ICCV 2021 Moustafa Meshry, Saksham Suri, Larry S. Davis, Abhinav Shrivastava

In contrast, we propose to factorize the representation of a subject into its spatial and style components.

StEP: Style-based Encoder Pre-training for Multi-modal Image Synthesis

no code implementations CVPR 2021 Moustafa Meshry, Yixuan Ren, Larry S Davis, Abhinav Shrivastava

Specifically, we pre-train a generic style encoder using a novel proxy task to learn an embedding of images, from arbitrary domains, into a low-dimensional style latent space.

Image Generation Translation

Knowledge Evolution in Neural Networks

1 code implementation CVPR 2021 Ahmed Taha, Abhinav Shrivastava, Larry Davis

We evaluate KE using relatively small datasets (e.g., CUB-200) and randomly initialized deep networks.

Metric Learning

SVMax: A Feature Embedding Regularizer

1 code implementation 4 Mar 2021 Ahmed Taha, Alex Hanson, Abhinav Shrivastava, Larry Davis

The SVMax regularizer supports both supervised and unsupervised learning.

Retrieval

Deep Video Inpainting Detection

no code implementations 26 Jan 2021 Peng Zhou, Ning Yu, Zuxuan Wu, Larry S. Davis, Abhinav Shrivastava, Ser-Nam Lim

This paper studies video inpainting detection, which localizes an inpainted region in a video both spatially and temporally.

Video Inpainting

Multimodal Attention for Layout Synthesis in Diverse Domains

no code implementations 1 Jan 2021 Kamal Gupta, Vijay Mahadevan, Alessandro Achille, Justin Lazarow, Larry S. Davis, Abhinav Shrivastava

We address the problem of scene layout generation for diverse domains such as images, mobile applications, documents and 3D objects.

Learning What Not to Model: Gaussian Process Regression with Negative Constraints

no code implementations 1 Jan 2021 Gaurav Shrivastava, Harsh Shrivastava, Abhinav Shrivastava

But, what if for an input point '$\bar{\mathbf{x}}$', we want to constrain the GP to avoid a target regression value '$\bar{y}(\bar{\mathbf{x}})$' (a negative datapair)?

Navigate regression

GTA: Global Temporal Attention for Video Action Understanding

no code implementations 15 Dec 2020 Bo He, Xitong Yang, Zuxuan Wu, Hao Chen, Ser-Nam Lim, Abhinav Shrivastava

To this end, we introduce Global Temporal Attention (GTA), which performs global temporal attention on top of spatial attention in a decoupled manner.

Action Recognition Action Understanding +1

The Lottery Ticket Hypothesis for Object Recognition

1 code implementation CVPR 2021 Sharath Girish, Shishira R. Maiya, Kamal Gupta, Hao Chen, Larry Davis, Abhinav Shrivastava

The recently proposed Lottery Ticket Hypothesis (LTH) states that deep neural networks trained on large datasets contain smaller subnetworks that achieve on par performance as the dense networks.

Instance Segmentation Keypoint Estimation +5

Analyzing and Mitigating JPEG Compression Defects in Deep Learning

no code implementations 17 Nov 2020 Max Ehrlich, Larry Davis, Ser-Nam Lim, Abhinav Shrivastava

We show that there is a significant penalty on common performance metrics for high compression.

Learning Visual Representations for Transfer Learning by Suppressing Texture

1 code implementation 3 Nov 2020 Shlok Mishra, Anshul Shah, Ankan Bansal, Janit Anjaria, Jonghyun Choi, Abhinav Shrivastava, Abhishek Sharma, David Jacobs

Recent literature has shown that features obtained from supervised training of CNNs may over-emphasize texture rather than encoding high-level information.

Image Classification object-detection +3

Pose And Joint-Aware Action Recognition

1 code implementation 16 Oct 2020 Anshul Shah, Shlok Mishra, Ankan Bansal, Jun-Cheng Chen, Rama Chellappa, Abhinav Shrivastava

Unlike other modalities, constellation of joints and their motion generate models with succinct human motion information for activity recognition.

Action Classification Action Recognition In Videos +5

Improved Modeling of 3D Shapes with Multi-view Depth Maps

1 code implementation 7 Sep 2020 Kamal Gupta, Susmija Jabbireddy, Ketul Shah, Abhinav Shrivastava, Matthias Zwicker

Our simple encoder-decoder framework, comprised of a novel identity encoder and class-conditional viewpoint generator, generates 3D consistent depth maps.

Image Generation

Deep Co-Training with Task Decomposition for Semi-Supervised Domain Adaptation

1 code implementation ICCV 2021 Luyu Yang, Yan Wang, Mingfei Gao, Abhinav Shrivastava, Kilian Q. Weinberger, Wei-Lun Chao, Ser-Nam Lim

To integrate the strengths of the two classifiers, we apply the well-established co-training framework, in which the two classifiers exchange their high confident predictions to iteratively "teach each other" so that both classifiers can excel in the target domain.

Semi-supervised Domain Adaptation Unsupervised Domain Adaptation

End-to-end Learning of Compressible Features

1 code implementation 23 Jul 2020 Saurabh Singh, Sami Abu-El-Haija, Nick Johnston, Johannes Ballé, Abhinav Shrivastava, George Toderici

We propose a learned method that jointly optimizes for compressibility along with the task objective for learning the features.

Quantization

Curriculum Manager for Source Selection in Multi-Source Domain Adaptation

no code implementations ECCV 2020 Luyu Yang, Yogesh Balaji, Ser-Nam Lim, Abhinav Shrivastava

In this paper, we proposed an adversarial agent that learns a dynamic curriculum for source samples, called Curriculum Manager for Source Selection (CMSS).

Multi-Source Unsupervised Domain Adaptation Unsupervised Domain Adaptation

Group Ensemble: Learning an Ensemble of ConvNets in a single ConvNet

1 code implementation 1 Jul 2020 Hao Chen, Abhinav Shrivastava

Owing to group convolution and the shared-base, GENet can fully leverage the advantage of explicit ensemble learning while retaining the same computation as a single ConvNet.

Action Recognition Ensemble Learning +2

LayoutTransformer: Layout Generation and Completion with Self-attention

2 code implementations ICCV 2021 Kamal Gupta, Justin Lazarow, Alessandro Achille, Larry Davis, Vijay Mahadevan, Abhinav Shrivastava

Generating a new layout or extending an existing layout requires understanding the relationships between these primitives.

Quantization Guided JPEG Artifact Correction

1 code implementation ECCV 2020 Max Ehrlich, Larry Davis, Ser-Nam Lim, Abhinav Shrivastava

The JPEG image compression algorithm is the most popular method of image compression because of its ability for large compression ratios.

JPEG Artifact Correction Quantization

Spatial Priming for Detecting Human-Object Interactions

no code implementations 9 Apr 2020 Ankan Bansal, Sai Saketh Rambhatla, Abhinav Shrivastava, Rama Chellappa

The proposed method consists of a layout module which primes a visual module to predict the type of interaction between a human and an object.

Human-Object Interaction Detection Object

PatchVAE: Learning Local Latent Codes for Recognition

1 code implementation CVPR 2020 Kamal Gupta, Saurabh Singh, Abhinav Shrivastava

Unsupervised representation learning holds the promise of exploiting large amounts of unlabeled data to learn general representations.

Representation Learning

Hand-Priming in Object Localization for Assistive Egocentric Vision

no code implementations 28 Feb 2020 Kyungjun Lee, Abhinav Shrivastava, Hernisa Kacorri

Egocentric vision holds great promises for increasing access to visual information and improving the quality of life for people with visual impairments, with object recognition being one of the daily challenges for this population.

Hand Segmentation Multi-Task Learning +3

Depth Completion Using a View-constrained Deep Prior

no code implementations 21 Jan 2020 Pallabi Ghosh, Vibhav Vineet, Larry S. Davis, Abhinav Shrivastava, Sudipta Sinha, Neel Joshi

Given color images and noisy and incomplete target depth maps, we optimize a randomly-initialized CNN model to reconstruct a depth map restored by virtue of using the CNN network structure as a prior combined with a view-constrained photo-consistency loss.

Depth Completion Image Denoising

Scalable Model Compression by Entropy Penalized Reparameterization

no code implementations ICLR 2020 Deniz Oktay, Johannes Ballé, Saurabh Singh, Abhinav Shrivastava

We describe a simple and general neural network weight compression approach, in which the network parameters (weights and biases) are represented in a "latent" space, amounting to a reparameterization.

General Classification Model Compression

Render4Completion: Synthesizing Multi-View Depth Maps for 3D Shape Completion

no code implementations 17 Apr 2019 Tao Hu, Zhizhong Han, Abhinav Shrivastava, Matthias Zwicker

Different from image-to-image translation network that completes each view separately, our novel network, multi-view completion net (MVCN), leverages information from all views of a 3D shape to help the completion of each single view.

Image-to-Image Translation Translation

Detecting Human-Object Interactions via Functional Generalization

no code implementations 5 Apr 2019 Ankan Bansal, Sai Saketh Rambhatla, Abhinav Shrivastava, Rama Chellappa

We present an approach for detecting human-object interactions (HOIs) in images, based on the idea that humans interact with functionally similar objects in a similar manner.

Human-Object Interaction Detection Object

Generate, Segment and Refine: Towards Generic Manipulation Segmentation

1 code implementation 24 Nov 2018 Peng Zhou, Bor-Chun Chen, Xintong Han, Mahyar Najibi, Abhinav Shrivastava, Ser Nam Lim, Larry S. Davis

The advent of image sharing platforms and the easy availability of advanced photo editing software have resulted in large quantities of manipulated images being shared on the internet.

Detecting Image Manipulation Image Generation +3

Actor-Centric Relation Network

1 code implementation ECCV 2018 Chen Sun, Abhinav Shrivastava, Carl Vondrick, Kevin Murphy, Rahul Sukthankar, Cordelia Schmid

A visualization of the learned relation features confirms that our approach is able to attend to the relevant relations for each action.

Action Classification Action Detection +5

Training Region-based Object Detectors with Online Hard Example Mining

5 code implementations CVPR 2016 Abhinav Shrivastava, Abhinav Gupta, Ross Girshick

Our motivation is the same as it has always been -- detection datasets contain an overwhelming number of easy examples and a small number of hard examples.

object-detection Object Detection

Watch and Learn: Semi-Supervised Learning of Object Detectors from Videos

no code implementations 21 May 2015 Ishan Misra, Abhinav Shrivastava, Martial Hebert

We present a semi-supervised approach that localizes multiple unknown object instances in long videos.

Object object-detection +1

Mid-level Elements for Object Detection

no code implementations 27 Apr 2015 Aayush Bansal, Abhinav Shrivastava, Carl Doersch, Abhinav Gupta

Building on the success of recent discriminative mid-level elements, we propose a surprisingly simple approach for object detection which performs comparably to the current state-of-the-art approaches on PASCAL VOC comp-3 detection challenge (no external data).

Object object-detection +1

Enriching Visual Knowledge Bases via Object Discovery and Segmentation

no code implementations CVPR 2014 Xinlei Chen, Abhinav Shrivastava, Abhinav Gupta

In this paper, we propose to enrich these knowledge bases by automatically discovering objects and their segmentations from noisy Internet images.

Object Discovery Segmentation
