Search Results for author: Larry S. Davis

Found 114 papers, 31 papers with code

Building an Open-Vocabulary Video CLIP Model with Better Architectures, Optimization and Data

1 code implementation 8 Oct 2023 Zuxuan Wu, Zejia Weng, Wujian Peng, Xitong Yang, Ang Li, Larry S. Davis, Yu-Gang Jiang

Despite significant results achieved by Contrastive Language-Image Pretraining (CLIP) in zero-shot image recognition, limited effort has been made exploring its potential for zero-shot video recognition.

Action Recognition Continual Learning +5

TAG: Boosting Text-VQA via Text-aware Visual Question-answer Generation

1 code implementation 3 Aug 2022 Jun Wang, Mingfei Gao, Yuqian Hu, Ramprasaath R. Selvaraju, Chetan Ramaiah, Ran Xu, Joseph F. JaJa, Larry S. Davis

To address this deficiency, we develop a new method to generate high-quality and diverse QA pairs by explicitly utilizing the existing rich text available in the scene context of each image.

Answer Generation Question-Answer-Generation +3

InvGAN: Invertible GANs

no code implementations 8 Dec 2021 Partha Ghosh, Dominik Zietlow, Michael J. Black, Larry S. Davis, Xiaochen Hu

Our InvGAN, short for Invertible GAN, successfully embeds real images to the latent space of a high quality generative model.

Data Augmentation Image Inpainting +1

Learning Graphs for Knowledge Transfer With Limited Labels

no code implementations CVPR 2021 Pallabi Ghosh, Nirat Saini, Larry S. Davis, Abhinav Shrivastava

The standard paradigm is to utilize relationships in the input graph to transfer information using GCNs from training to testing nodes in the graph; for example, the semi-supervised, zero-shot, and few-shot learning setups.

Benchmarking Few-Shot action recognition +3

Rethinking Pseudo Labels for Semi-Supervised Object Detection

no code implementations 1 Jun 2021 Hengduo Li, Zuxuan Wu, Abhinav Shrivastava, Larry S. Davis

In this paper, we introduce certainty-aware pseudo labels tailored for object detection, which can effectively estimate the classification and localization quality of derived pseudo labels.

Classification Image Classification +4

Learned Spatial Representations for Few-shot Talking-Head Synthesis

no code implementations ICCV 2021 Moustafa Meshry, Saksham Suri, Larry S. Davis, Abhinav Shrivastava

In contrast, we propose to factorize the representation of a subject into its spatial and style components.

THAT: Two Head Adversarial Training for Improving Robustness at Scale

no code implementations 25 Mar 2021 Zuxuan Wu, Tom Goldstein, Larry S. Davis, Ser-Nam Lim

Many variants of adversarial training have been proposed, with most research focusing on problems with relatively few classes.

Vocal Bursts Valence Prediction

Scale Normalized Image Pyramids with AutoFocus for Object Detection

1 code implementation 10 Feb 2021 Bharat Singh, Mahyar Najibi, Abhishek Sharma, Larry S. Davis

The resulting algorithm is referred to as AutoFocus and results in a 2.5-5 times speed-up during inference when used with SNIP.

Object object-detection +1

Deep Video Inpainting Detection

no code implementations 26 Jan 2021 Peng Zhou, Ning Yu, Zuxuan Wu, Larry S. Davis, Abhinav Shrivastava, Ser-Nam Lim

This paper studies video inpainting detection, which localizes an inpainted region in a video both spatially and temporally.

Video Inpainting

Multimodal Attention for Layout Synthesis in Diverse Domains

no code implementations 1 Jan 2021 Kamal Gupta, Vijay Mahadevan, Alessandro Achille, Justin Lazarow, Larry S. Davis, Abhinav Shrivastava

We address the problem of scene layout generation for diverse domains such as images, mobile applications, documents and 3D objects.

InfoFocus: 3D Object Detection for Autonomous Driving with Dynamic Information Modeling

no code implementations ECCV 2020 Jun Wang, Shiyi Lan, Mingfei Gao, Larry S. Davis

Results show that our framework achieves the state-of-the-art performance with 31 FPS and improves our baseline significantly by 9.0% mAP on the nuScenes test set.

3D Object Detection Autonomous Driving +2

DeepStrip: High Resolution Boundary Refinement

no code implementations 25 Mar 2020 Peng Zhou, Brian Price, Scott Cohen, Gregg Wilensky, Larry S. Davis

In this paper, we target refining the boundaries in high resolution images given low resolution masks.

Vocal Bursts Intensity Prediction

Depth Completion Using a View-constrained Deep Prior

no code implementations 21 Jan 2020 Pallabi Ghosh, Vibhav Vineet, Larry S. Davis, Abhinav Shrivastava, Sudipta Sinha, Neel Joshi

Given color images and noisy and incomplete target depth maps, we optimize a randomly-initialized CNN model to reconstruct a depth map restored by virtue of using the CNN network structure as a prior combined with a view-constrained photo-consistency loss.

Depth Completion Image Denoising

Recognizing Instagram Filtered Images with Feature De-stylization

2 code implementations 30 Dec 2019 Zhe Wu, Zuxuan Wu, Bharat Singh, Larry S. Davis

Deep neural networks have been shown to suffer from poor generalization when small perturbations are added (like Gaussian noise), yet little work has been done to evaluate their robustness to more natural image transformations like photo filters.

Style Transfer

Fashion Outfit Complementary Item Retrieval

1 code implementation CVPR 2020 Yen-Liang Lin, Son Tran, Larry S. Davis

We evaluate our method on the outfit compatibility, FITB and new retrieval tasks.

Retrieval

Learning from Noisy Anchors for One-stage Object Detection

1 code implementation CVPR 2020 Hengduo Li, Zuxuan Wu, Chen Zhu, Caiming Xiong, Richard Socher, Larry S. Davis

State-of-the-art object detectors rely on regressing and classifying an extensive list of possible anchors, which are divided into positive and negative samples based on their intersection-over-union (IoU) with corresponding groundtruth objects.

Classification General Classification +3

LiteEval: A Coarse-to-Fine Framework for Resource Efficient Video Recognition

no code implementations NeurIPS 2019 Zuxuan Wu, Caiming Xiong, Yu-Gang Jiang, Larry S. Davis

This paper presents LiteEval, a simple yet effective coarse-to-fine framework for resource efficient video recognition, suitable for both online and offline scenarios.

Video Recognition

Consistency-based Semi-supervised Active Learning: Towards Minimizing Labeling Cost

no code implementations ECCV 2020 Mingfei Gao, Zizhao Zhang, Guo Yu, Sercan O. Arik, Larry S. Davis, Tomas Pfister

Active learning (AL) combines data labeling and model training to minimize the labeling cost by prioritizing the selection of high value data that can best improve model performance.

Active Learning Image Classification +1

Consistency-Based Semi-Supervised Active Learning: Towards Minimizing Labeling Budget

no code implementations 25 Sep 2019 Mingfei Gao, Zizhao Zhang, Guo Yu, Sercan O. Arik, Larry S. Davis, Tomas Pfister

Active learning (AL) aims to integrate data labeling and model training in a unified way, and to minimize the labeling budget by prioritizing the selection of high value data that can best improve model performance.

Active Learning Representation Learning

Cross-X Learning for Fine-Grained Visual Categorization

no code implementations ICCV 2019 Wei Luo, Xitong Yang, Xianjie Mo, Yuheng Lu, Larry S. Davis, Jun Li, Jian Yang, Ser-Nam Lim

Recognizing objects from subcategories with very subtle differences remains a challenging task due to the large intra-class and small inter-class variation.

Ranked #18 on Fine-Grained Image Classification on NABirds (using extra training data)

Fine-Grained Image Classification Fine-Grained Visual Categorization

HiCoRe: Visual Hierarchical Context-Reasoning

no code implementations 2 Sep 2019 Pedro H. Bugatti, Priscila T. M. Saito, Larry S. Davis

To do so, we build and apply graphs to graph convolution networks with convolutional neural networks.

Graph Attention

WSLLN: Weakly Supervised Natural Language Localization Networks

no code implementations 31 Aug 2019 Mingfei Gao, Larry S. Davis, Richard Socher, Caiming Xiong

We propose weakly supervised language localization networks (WSLLN) to detect events in long, untrimmed videos given language queries.

Sentence

Truncated Cauchy Non-negative Matrix Factorization

no code implementations 2 Jun 2019 Naiyang Guan, Tongliang Liu, Yangmuzi Zhang, DaCheng Tao, Larry S. Davis

Non-negative matrix factorization (NMF) minimizes the Euclidean distance between the data matrix and its low rank approximation, and it fails when applied to corrupted data because the loss function is sensitive to outliers.

Clustering Image Clustering

ACE: Adapting to Changing Environments for Semantic Segmentation

no code implementations ICCV 2019 Zuxuan Wu, Xin Wang, Joseph E. Gonzalez, Tom Goldstein, Larry S. Davis

However, neural classifiers are often extremely brittle when confronted with domain shift: changes in the input distribution that occur over time.

Meta-Learning Semantic Segmentation

An Analysis of Pre-Training on Object Detection

no code implementations 11 Apr 2019 Hengduo Li, Bharat Singh, Mahyar Najibi, Zuxuan Wu, Larry S. Davis

We analyze how well their features generalize to tasks like image classification, semantic segmentation and object detection on small datasets like PASCAL-VOC, Caltech-256, SUN-397, Flowers-102 etc.

Avg Classification +6

M2KD: Multi-model and Multi-level Knowledge Distillation for Incremental Learning

no code implementations 3 Apr 2019 Peng Zhou, Long Mai, Jianming Zhang, Ning Xu, Zuxuan Wu, Larry S. Davis

Instead of sequentially distilling knowledge only from the last model, we directly leverage all previous model snapshots.

Incremental Learning Knowledge Distillation

StartNet: Online Detection of Action Start in Untrimmed Videos

no code implementations ICCV 2019 Mingfei Gao, Mingze Xu, Larry S. Davis, Richard Socher, Caiming Xiong

We propose StartNet to address Online Detection of Action Start (ODAS) where action starts and their associated categories are detected in untrimmed, streaming videos.

Action Classification Policy Gradient Methods

Compatible and Diverse Fashion Image Inpainting

no code implementations 4 Feb 2019 Xintong Han, Zuxuan Wu, Weilin Huang, Matthew R. Scott, Larry S. Davis

The latent representations are jointly optimized with the corresponding generation network to condition the synthesis process, encouraging a diverse set of generated results that are visually compatible with existing fashion garments.

Fashion Synthesis Image Inpainting

TAN: Temporal Aggregation Network for Dense Multi-label Action Recognition

no code implementations 14 Dec 2018 Xiyang Dai, Bharat Singh, Joe Yue-Hei Ng, Larry S. Davis

We present Temporal Aggregation Network (TAN) which decomposes 3D convolutions into spatial and temporal aggregation blocks.

Action Recognition Temporal Action Localization

AutoFocus: Efficient Multi-Scale Inference

1 code implementation ICCV 2019 Mahyar Najibi, Bharat Singh, Larry S. Davis

Instead of processing an entire image pyramid, AutoFocus adopts a coarse to fine approach and only processes regions which are likely to contain small objects at finer scales.

MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment

no code implementations CVPR 2019 Da Zhang, Xiyang Dai, Xin Wang, Yuan-Fang Wang, Larry S. Davis

In this paper, we present Moment Alignment Network (MAN), a novel framework that unifies the candidate moment encoding and temporal structural reasoning in a single-shot feed-forward network.

Moment Retrieval Natural Language Moment Retrieval +1

Universal Adversarial Training

no code implementations 27 Nov 2018 Ali Shafahi, Mahyar Najibi, Zheng Xu, John Dickerson, Larry S. Davis, Tom Goldstein

Standard adversarial attacks change the predicted class label of a selected image by adding specially tailored small perturbations to its pixels.

Stacked Spatio-Temporal Graph Convolutional Networks for Action Segmentation

no code implementations 26 Nov 2018 Pallabi Ghosh, Yi Yao, Larry S. Davis, Ajay Divakaran

We show results on CAD120 (which provides pre-computed node features and edge weights for fair performance comparison across algorithms) as well as a more complex real-world activity dataset, Charades.

Action Recognition Action Segmentation +2

Generate, Segment and Refine: Towards Generic Manipulation Segmentation

1 code implementation 24 Nov 2018 Peng Zhou, Bor-Chun Chen, Xintong Han, Mahyar Najibi, Abhinav Shrivastava, Ser Nam Lim, Larry S. Davis

The advent of image sharing platforms and the easy availability of advanced photo editing software have resulted in large quantities of manipulated images being shared on the internet.

Detecting Image Manipulation Image Generation +3

Explicit Bias Discovery in Visual Question Answering Models

no code implementations CVPR 2019 Varun Manjunatha, Nirat Saini, Larry S. Davis

It is of interest to the community to explicitly discover such biases, both for understanding the behavior of such models, and towards debugging them.

Question Answering Visual Question Answering

Modeling Local Geometric Structure of 3D Point Clouds using Geo-CNN

2 code implementations CVPR 2019 Shiyi Lan, Ruichi Yu, Gang Yu, Larry S. Davis

This encourages the network to preserve the geometric structure in Euclidean space throughout the feature extraction hierarchy.

Modeling Local Geometric Structure

Temporal Recurrent Networks for Online Action Detection

2 code implementations ICCV 2019 Mingze Xu, Mingfei Gao, Yi-Ting Chen, Larry S. Davis, David J. Crandall

Most work on temporal action detection is formulated as an offline problem, in which the start and end times of actions are determined after the entire video is fully observed.

Online Action Detection

Improving Annotation for 3D Pose Dataset of Fine-Grained Object Categories

2 code implementations 19 Oct 2018 Yaming Wang, Xiao Tan, Yi Yang, Ziyu Li, Xiao Liu, Feng Zhou, Larry S. Davis

Existing 3D pose datasets of object categories are limited to generic object types and lack fine-grained information.

3D Pose Estimation Object +1

Weakly-supervised Video Summarization using Variational Encoder-Decoder and Web Prior

no code implementations ECCV 2018 Sijia Cai, WangMeng Zuo, Larry S. Davis, Lei Zhang

Video summarization is a challenging under-constrained problem because the underlying summary of a single video strongly depends on users' subjective understandings.

Saliency Prediction Supervised Video Summarization

Soft Sampling for Robust Object Detection

1 code implementation 18 Jun 2018 Zhe Wu, Navaneeth Bodla, Bharat Singh, Mahyar Najibi, Rama Chellappa, Larry S. Davis

Interestingly, we observe that after dropping 30% of the annotations (and labeling them as background), the performance of CNN-based object detectors like Faster-RCNN only drops by 5% on the PASCAL VOC dataset.

Object object-detection +1

An Analysis of Scale Invariance in Object Detection - SNIP

no code implementations CVPR 2018 Bharat Singh, Larry S. Davis

On the COCO dataset, our single model performance is 45.7% and an ensemble of 3 networks obtains an mAP of 48.3%.

object-detection Object Detection

SNIPER: Efficient Multi-Scale Training

4 code implementations NeurIPS 2018 Bharat Singh, Mahyar Najibi, Larry S. Davis

Our implementation based on Faster-RCNN with a ResNet-101 backbone obtains an mAP of 47.6% on the COCO dataset for bounding box detection and can process 5 images per second during inference with a single GPU.

object-detection Object Detection +1

Learning Rich Features for Image Manipulation Detection

2 code implementations CVPR 2018 Peng Zhou, Xintong Han, Vlad I. Morariu, Larry S. Davis

Image manipulation detection is different from traditional semantic object detection because it pays more attention to tampering artifacts than to image content, which suggests that richer features need to be learned.

Image Manipulation Image Manipulation Detection +3

DCAN: Dual Channel-wise Alignment Networks for Unsupervised Scene Adaptation

no code implementations ECCV 2018 Zuxuan Wu, Xintong Han, Yen-Liang Lin, Mustafa Gökhan Uzunbas, Tom Goldstein, Ser Nam Lim, Larry S. Davis

In particular, given an image from the source domain and unlabeled samples from the target domain, the generator synthesizes new images on-the-fly to resemble samples from the target domain in appearance and the segmentation network further refines high-level features before predicting semantic maps, both of which leverage feature statistics of sampled images from the target domain.

Segmentation Semantic Segmentation

Layout-induced Video Representation for Recognizing Agent-in-Place Actions

no code implementations ICCV 2019 Ruichi Yu, Hongcheng Wang, Ang Li, Jingxiao Zheng, Vlad I. Morariu, Larry S. Davis

We address the recognition of agent-in-place actions, which are associated with agents who perform them and places where they occur, in the context of outdoor home surveillance.

ReMotENet: Efficient Relevant Motion Event Detection for Large-scale Home Surveillance Videos

no code implementations 6 Jan 2018 Ruichi Yu, Hongcheng Wang, Larry S. Davis

To dramatically speed up relevant motion event detection and improve its performance, we propose a novel network for relevant motion event detection, ReMotENet, which is a unified, end-to-end data-driven method using spatial-temporal attention-based 3D ConvNets to jointly model the appearance and motion of objects-of-interest in a video.

Event Detection object-detection +1

Deception Detection in Videos

no code implementations 12 Dec 2017 Zhe Wu, Bharat Singh, Larry S. Davis, V. S. Subrahmanian

We present a system for covert automated deception detection in real-life courtroom trial videos.

Action Recognition Deception Detection In Videos +1

R-FCN-3000 at 30fps: Decoupling Detection and Classification

2 code implementations CVPR 2018 Bharat Singh, Hengduo Li, Abhishek Sharma, Larry S. Davis

Our approach is a modification of the R-FCN architecture in which position-sensitive filters are shared across different object classes for performing localization.

Classification General Classification +2

An Analysis of Scale Invariance in Object Detection - SNIP

no code implementations 22 Nov 2017 Bharat Singh, Larry S. Davis

On the COCO dataset, our single model performance is 45.7% and an ensemble of 3 networks obtains an mAP of 48.3%.

Object object-detection +1

BlockDrop: Dynamic Inference Paths in Residual Networks

1 code implementation CVPR 2018 Zuxuan Wu, Tushar Nagarajan, Abhishek Kumar, Steven Rennie, Larry S. Davis, Kristen Grauman, Rogerio Feris

Very deep convolutional neural networks offer excellent recognition results, yet their computational expense limits their impact for many real-world applications.

VITON: An Image-based Virtual Try-on Network

6 code implementations CVPR 2018 Xintong Han, Zuxuan Wu, Zhe Wu, Ruichi Yu, Larry S. Davis

We present an image-based VIirtual Try-On Network (VITON) without using 3D information in any form, which seamlessly transfers a desired clothing item onto the corresponding region of a person using a coarse-to-fine strategy.

Descriptive Virtual Try-on

NISP: Pruning Networks using Neuron Importance Score Propagation

no code implementations CVPR 2018 Ruichi Yu, Ang Li, Chun-Fu Chen, Jui-Hsin Lai, Vlad I. Morariu, Xintong Han, Mingfei Gao, Ching-Yung Lin, Larry S. Davis

In contrast, we argue that it is essential to prune neurons in the entire neuron network jointly based on a unified goal: minimizing the reconstruction error of important responses in the "final response layer" (FRL), which is the second-to-last layer before classification, for a pruned network to retain its predictive power.

Network Pruning

Dynamic Zoom-in Network for Fast Object Detection in Large Images

no code implementations CVPR 2018 Mingfei Gao, Ruichi Yu, Ang Li, Vlad I. Morariu, Larry S. Davis

We introduce a generic framework that reduces the computational cost of object detection while retaining accuracy for scenarios where objects with varied sizes appear in high resolution images.

object-detection Real-Time Object Detection

C-WSL: Count-guided Weakly Supervised Localization

no code implementations ECCV 2018 Mingfei Gao, Ang Li, Ruichi Yu, Vlad I. Morariu, Larry S. Davis

We introduce count-guided weakly supervised localization (C-WSL), an approach that uses per-class object count as a new form of supervision to improve weakly supervised localization (WSL).

Object

On Encoding Temporal Evolution for Real-time Action Prediction

no code implementations 22 Sep 2017 Fahimeh Rezazadegan, Sareh Shirazi, Mahsa Baktashmotlagh, Larry S. Davis

Anticipating future actions is a key component of intelligence, specifically when it applies to real-time systems, such as robots or autonomous cars.

Temporal Context Network for Activity Localization in Videos

no code implementations ICCV 2017 Xiyang Dai, Bharat Singh, Guyue Zhang, Larry S. Davis, Yan Qiu Chen

For each temporal segment inside a proposal, features are uniformly sampled at a pair of scales and are input to a temporal convolutional neural network for classification.

General Classification Temporal Localization

Learning Fashion Compatibility with Bidirectional LSTMs

2 code implementations 18 Jul 2017 Xintong Han, Zuxuan Wu, Yu-Gang Jiang, Larry S. Davis

To this end, we propose to jointly learn a visual-semantic embedding and the compatibility relationships among fashion items in an end-to-end fashion.

Attribute

FASON: First and Second Order Information Fusion Network for Texture Recognition

no code implementations CVPR 2017 Xiyang Dai, Joe Yue-Hei Ng, Larry S. Davis

We then build a multi-level deep architecture to exploit the first and second order information within different convolutional layers.

Soft-NMS -- Improving Object Detection With One Line of Code

8 code implementations ICCV 2017 Navaneeth Bodla, Bharat Singh, Rama Chellappa, Larry S. Davis

To this end, we propose Soft-NMS, an algorithm which decays the detection scores of all other objects as a continuous function of their overlap with M. Hence, no object is eliminated in this process.

Object object-detection +1

Weakly-Supervised Spatial Context Networks

no code implementations 10 Apr 2017 Zuxuan Wu, Larry S. Davis, Leonid Sigal

In particular, we propose spatial context networks that learn to predict a representation of one image patch from another image patch, within the same image, conditioned on their real-valued relative spatial offset.

Object Object Categorization

Fast-AT: Fast Automatic Thumbnail Generation using Deep Neural Networks

no code implementations CVPR 2017 Seyed A. Esmaeili, Bharat Singh, Larry S. Davis

It is a fully-convolutional deep neural network, which learns specific filters for thumbnails of different sizes and aspect ratios.

Generalized Deep Image to Image Regression

1 code implementation CVPR 2017 Venkataraman Santhanam, Vlad I. Morariu, Larry S. Davis

We present a Deep Convolutional Neural Network architecture which serves as a generic image-to-image regressor that can be trained end-to-end without any further machinery.

Colorization Denoising +1

ActionFlowNet: Learning Motion Representation for Action Recognition

no code implementations 9 Dec 2016 Joe Yue-Hei Ng, Jonghyun Choi, Jan Neumann, Larry S. Davis

Even with the recent advances in convolutional neural networks (CNN) in various visual recognition tasks, the state-of-the-art action recognition system still relies on hand-crafted motion features such as optical flow to achieve the best performance.

Action Recognition Optical Flow Estimation +1

Learning a Discriminative Filter Bank within a CNN for Fine-grained Recognition

1 code implementation CVPR 2018 Yaming Wang, Vlad I. Morariu, Larry S. Davis

Compared to earlier multistage frameworks using CNN features, recent end-to-end deep approaches for fine-grained recognition essentially enhance the mid-level learning capability of CNNs.

Representation Learning

Generating Holistic 3D Scene Abstractions for Text-based Image Retrieval

no code implementations CVPR 2017 Ang Li, Jin Sun, Joe Yue-Hei Ng, Ruichi Yu, Vlad I. Morariu, Larry S. Davis

Since interactions between objects can be reduced to a limited set of atomic spatial relations in 3D, we study the possibility of inferring 3D structure from a text description rather than an image, applying physical relation models to synthesize holistic 3D abstract object layouts satisfying the spatial constraints present in a textual description.

Image Retrieval Object +3

ModelHub: Towards Unified Data and Lifecycle Management for Deep Learning

no code implementations 18 Nov 2016 Hui Miao, Ang Li, Larry S. Davis, Amol Deshpande

The deep learning modeling lifecycle generates a rich set of data artifacts, such as learned parameters and training logs, and comprises several frequently conducted tasks, e.g., to understand the model behaviors and to try out new models.

Management

Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection

no code implementations 11 Oct 2016 Xianzhi Du, Mostafa El-Khamy, Jungwon Lee, Larry S. Davis

A single shot deep convolutional network is trained as an object detector to generate all possible pedestrian candidates of different sizes and occlusions.

Pedestrian Detection Semantic Segmentation

The Role of Context Selection in Object Detection

no code implementations 9 Sep 2016 Ruichi Yu, Xi Chen, Vlad I. Morariu, Larry S. Davis

We investigate the reasons why context in object detection has limited utility by isolating and evaluating the predictive power of different context cues under ideal conditions in which context is provided by an oracle.

Object object-detection +1

Modeling Context Between Objects for Referring Expression Understanding

1 code implementation 1 Aug 2016 Varun K. Nagaraja, Vlad I. Morariu, Larry S. Davis

Our approach uses an LSTM to learn the probability of a referring expression, with input features from a region and a context region.

Multiple Instance Learning Object +1

Mining Discriminative Triplets of Patches for Fine-Grained Classification

no code implementations CVPR 2016 Yaming Wang, Jonghyun Choi, Vlad I. Morariu, Larry S. Davis

Fine-grained classification involves distinguishing between similar sub-categories based on subtle differences in highly localized regions; therefore, accurate localization of discriminative regions remains a major challenge.

Classification General Classification

Scalable Gaussian Processes for Supervised Hashing

no code implementations 25 Apr 2016 Bahadir Ozdemir, Larry S. Davis

We propose a flexible procedure for large-scale image search by hash functions with kernels.

Binary Classification Gaussian Processes +4

Supervised Incremental Hashing

no code implementations 25 Apr 2016 Bahadir Ozdemir, Mahyar Najibi, Larry S. Davis

In the first stage of classification, binary codes are considered as class labels by a set of binary SVMs; each corresponds to one bit.

General Classification Image Retrieval

Learning Temporal Regularity in Video Sequences

2 code implementations CVPR 2016 Mahmudul Hasan, Jonghyun Choi, Jan Neumann, Amit K. Roy-Chowdhury, Larry S. Davis

Perceiving meaningful activities in a long video sequence is a challenging problem due to the ambiguous definition of 'meaningfulness' as well as clutter in the scene.

Semi-supervised Anomaly Detection Video Anomaly Detection

Generating Discriminative Object Proposals via Submodular Ranking

no code implementations 11 Feb 2016 Yangmuzi Zhang, Zhuolin Jiang, Xi Chen, Larry S. Davis

Based on the multi-scale nature of objects in images, our approach is built on top of a hierarchical segmentation.

Image Segmentation Object +3

Action Recognition with Image Based CNN Features

no code implementations 13 Dec 2015 Mahdyar Ravanbakhsh, Hossein Mousavi, Mohammad Rastegari, Vittorio Murino, Larry S. Davis

Action recognition tasks usually rely on complex handcrafted structures as features to represent the human action model.

Action Recognition Temporal Action Localization

VRFP: On-the-fly Video Retrieval using Web Images and Fast Fisher Vector Products

no code implementations 10 Dec 2015 Xintong Han, Bharat Singh, Vlad I. Morariu, Larry S. Davis

VRFP is a real-time video retrieval framework based on short text input queries, which obtains weakly labeled training images from the web after the query is known.

Re-Ranking Retrieval +2

Multi-Task Learning With Low Rank Attribute Embedding for Person Re-Identification

no code implementations ICCV 2015 Chi Su, Fan Yang, Shiliang Zhang, Qi Tian, Larry S. Davis, Wen Gao

Since attributes are generally correlated, we introduce a low rank attribute embedding into the MTL formulation to embed original binary attributes to a continuous attribute space, where incorrect and incomplete attributes are rectified and recovered to better describe people.

Attribute Multi-Task Learning +1

Searching for Objects using Structure in Indoor Scenes

no code implementations 24 Nov 2015 Varun K. Nagaraja, Vlad I. Morariu, Larry S. Davis

However, we can use structure in the scene to search for objects without processing the entire image.

Imitation Learning Object

Selecting Relevant Web Trained Concepts for Automated Event Retrieval

no code implementations ICCV 2015 Bharat Singh, Xintong Han, Zhe Wu, Vlad I. Morariu, Larry S. Davis

Given a text description of an event, event retrieval is performed by selecting concepts linguistically related to the event description and fusing the concept responses on unseen videos.

Domain Adaptation Retrieval

On Large-Scale Retrieval: Binary or n-ary Coding?

no code implementations 20 Sep 2015 Mahyar Najibi, Mohammad Rastegari, Larry S. Davis

To make large-scale search feasible, Distance Estimation and Subset Indexing are the main approaches.

Image Retrieval Quantization +1

Class Consistent Multi-Modal Fusion With Binary Features

no code implementations CVPR 2015 Ashish Shrivastava, Mohammad Rastegari, Sumit Shekhar, Rama Chellappa, Larry S. Davis

Many existing recognition algorithms combine different modalities based on training accuracy but do not consider the possibility of noise at test time.

SHOE: Supervised Hashing with Output Embeddings

no code implementations 30 Jan 2015 Sravanthi Bondugula, Varun Manjunatha, Larry S. Davis, David Doermann

We present a supervised binary encoding scheme for image retrieval that learns projections by taking into account similarity between classes obtained from output embeddings.

Attribute Image Retrieval +2

Comparing apples to apples in the evaluation of binary coding methods

no code implementations 5 May 2014 Mohammad Rastegari, Shobeir Fakhraei, Jonghyun Choi, David Jacobs, Larry S. Davis

We discuss methodological issues related to the evaluation of unsupervised binary code construction methods for nearest neighbor search.

Multi-Directional Multi-Level Dual-Cross Patterns for Robust Face Recognition

1 code implementation 21 Jan 2014 Changxing Ding, Jonghyun Choi, DaCheng Tao, Larry S. Davis

To perform unconstrained face recognition robust to variations in illumination, pose and expression, this paper presents a new scheme to extract "Multi-Directional Multi-Level Dual-Cross Patterns" (MDML-DCPs) from face images.

Face Identification Face Recognition +2

Adding Unlabeled Samples to Categories by Learned Attributes

no code implementations CVPR 2013 Jonghyun Choi, Mohammad Rastegari, Ali Farhadi, Larry S. Davis

We propose a method to expand the visual coverage of training sets that consist of a small number of labeled examples using learned attributes.

Submodular Salient Region Detection

no code implementations CVPR 2013 Zhuolin Jiang, Larry S. Davis

The problem of salient region detection is formulated as the well-studied facility location problem from operations research.

Saliency Detection

A "Shape Aware" Model for semi-supervised Learning of Objects and its Context

no code implementations NeurIPS 2008 Abhinav Gupta, Jianbo Shi, Larry S. Davis

Using an analogous reasoning, we present an approach that combines bag-of-words and spatial models to perform semantic and syntactic analysis for recognition of an object based on its internal appearance and its context.

Object Object Recognition

Automatic online tuning for fast Gaussian summation

no code implementations NeurIPS 2008 Vlad I. Morariu, Balaji V. Srinivasan, Vikas C. Raykar, Ramani Duraiswami, Larry S. Davis

To solve the second problem, we present an online tuning approach that results in a black box method that automatically chooses the evaluation method and its parameters to yield the best performance for the input data, desired accuracy, and bandwidth.

BIG-bench Machine Learning
