no code implementations • 25 Mar 2023 • Vinoj Jayasundara, Amit Agrawal, Nicolas Heron, Abhinav Shrivastava, Larry S. Davis
We present FlexNeRF, a method for photorealistic freeviewpoint rendering of humans in motion from monocular videos.
no code implementations • 12 Dec 2022 • Junke Wang, Zhenxin Li, Chao Zhang, Jingjing Chen, Zuxuan Wu, Larry S. Davis, Yu-Gang Jiang
Online media data, in the forms of images and videos, are becoming mainstream communication channels.
1 code implementation • 3 Aug 2022 • Jun Wang, Mingfei Gao, Yuqian Hu, Ramprasaath R. Selvaraju, Chetan Ramaiah, Ran Xu, Joseph F. JaJa, Larry S. Davis
To address this deficiency, we develop a new method to generate high-quality and diverse QA pairs by explicitly utilizing the existing rich text available in the scene context of each image.
no code implementations • 8 Dec 2021 • Partha Ghosh, Dominik Zietlow, Michael J. Black, Larry S. Davis, Xiaochen Hu
Our InvGAN, short for Invertible GAN, successfully embeds real images to the latent space of a high quality generative model.
no code implementations • CVPR 2021 • Pallabi Ghosh, Nirat Saini, Larry S. Davis, Abhinav Shrivastava
The standard paradigm is to utilize relationships in the input graph to transfer information using GCNs from training to testing nodes in the graph; for example, the semi-supervised, zero-shot, and few-shot learning setups.
no code implementations • 1 Jun 2021 • Hengduo Li, Zuxuan Wu, Abhinav Shrivastava, Larry S. Davis
In this paper, we introduce certainty-aware pseudo labels tailored for object detection, which can effectively estimate the classification and localization quality of derived pseudo labels.
Ranked #5 on Semi-Supervised Object Detection on COCO 100% labeled data (using extra training data)
4 code implementations • ICCV 2021 • Shiyi Lan, Zhiding Yu, Christopher Choy, Subhashree Radhakrishnan, Guilin Liu, Yuke Zhu, Larry S. Davis, Anima Anandkumar
We introduce DiscoBox, a novel framework that jointly learns instance segmentation and semantic correspondence using bounding box supervision.
Box-supervised Instance Segmentation
Semantic correspondence
no code implementations • ICCV 2021 • Moustafa Meshry, Saksham Suri, Larry S. Davis, Abhinav Shrivastava
In contrast, we propose to factorize the representation of a subject into its spatial and style components.
no code implementations • 25 Mar 2021 • Zuxuan Wu, Tom Goldstein, Larry S. Davis, Ser-Nam Lim
Many variants of adversarial training have been proposed, with most research focusing on problems with relatively few classes.
1 code implementation • 10 Feb 2021 • Bharat Singh, Mahyar Najibi, Abhishek Sharma, Larry S. Davis
The resulting algorithm is referred to as AutoFocus and results in a 2.5-5 times speed-up during inference when used with SNIP.
no code implementations • 26 Jan 2021 • Peng Zhou, Ning Yu, Zuxuan Wu, Larry S. Davis, Abhinav Shrivastava, Ser-Nam Lim
This paper studies video inpainting detection, which localizes an inpainted region in a video both spatially and temporally.
no code implementations • 1 Jan 2021 • Kamal Gupta, Vijay Mahadevan, Alessandro Achille, Justin Lazarow, Larry S. Davis, Abhinav Shrivastava
We address the problem of scene layout generation for diverse domains such as images, mobile applications, documents and 3D objects.
no code implementations • CVPR 2021 • Hengduo Li, Zuxuan Wu, Abhinav Shrivastava, Larry S. Davis
Then, only frames and convolutions that are selected by the selection network are used in the 3D model to generate predictions.
Ranked #10 on Action Recognition on ActivityNet
no code implementations • CVPR 2021 • Jiali Duan, Yen-Liang Lin, Son Tran, Larry S. Davis, C.-C. Jay Kuo
We first train a teacher model on the labeled data and use it to generate pseudo labels for the unlabeled data.
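The teacher-then-pseudo-label step described above can be sketched as follows. This is only an illustration: the nearest-centroid classifier stands in for the teacher model, and the function names and toy feature vectors are assumptions, not details taken from the paper.

```python
def train_centroid_teacher(features, labels):
    """Fit a toy 'teacher': one mean feature vector (centroid) per class.

    Assumption: a nearest-centroid classifier replaces the real teacher network.
    """
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def pseudo_label(teacher, unlabeled):
    """Assign each unlabeled sample the class of its nearest centroid."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [min(teacher, key=lambda y: sq_dist(teacher[y], x)) for x in unlabeled]


# Hypothetical toy data: two labeled classes, two unlabeled points.
teacher = train_centroid_teacher(
    [[0, 0], [0, 2], [10, 10], [10, 12]], ["a", "a", "b", "b"]
)
pseudo = pseudo_label(teacher, [[1, 1], [9, 9]])
```

The pseudo labels produced this way would then be combined with the labeled data in a later training stage; how that is done is not specified in the snippet above.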
no code implementations • 28 Aug 2020 • Pallabi Ghosh, Nirat Saini, Larry S. Davis, Abhinav Shrivastava
Current action recognition systems require large amounts of training data for recognizing an action.
Ranked #12 on Zero-Shot Action Recognition on Kinetics
no code implementations • ECCV 2020 • Jun Wang, Shiyi Lan, Mingfei Gao, Larry S. Davis
Results show that our framework achieves state-of-the-art performance at 31 FPS and improves our baseline significantly, by 9.0% mAP, on the nuScenes test set.
Ranked #319 on 3D Object Detection on nuScenes
no code implementations • CVPR 2020 • Mahyar Najibi, Guangda Lai, Abhijit Kundu, Zhichao Lu, Vivek Rathod, Thomas Funkhouser, Caroline Pantofaru, David Ross, Larry S. Davis, Alireza Fathi
In contrast, we propose a general-purpose method that works on both indoor and outdoor scenes.
1 code implementation • CVPR 2020 • Shiyi Lan, Zhou Ren, Yi Wu, Larry S. Davis, Gang Hua
Object detection is an essential step towards holistic scene understanding.
Ranked #206 on Object Detection on COCO test-dev
no code implementations • 25 Mar 2020 • Peng Zhou, Brian Price, Scott Cohen, Gregg Wilensky, Larry S. Davis
In this paper, we target refining the boundaries in high resolution images given low resolution masks.
no code implementations • 21 Jan 2020 • Pallabi Ghosh, Vibhav Vineet, Larry S. Davis, Abhinav Shrivastava, Sudipta Sinha, Neel Joshi
Given color images and noisy and incomplete target depth maps, we optimize a randomly-initialized CNN model to reconstruct a depth map restored by virtue of using the CNN network structure as a prior combined with a view-constrained photo-consistency loss.
1 code implementation • 30 Dec 2019 • Zhe Wu, Zuxuan Wu, Bharat Singh, Larry S. Davis
Deep neural networks have been shown to suffer from poor generalization when small perturbations are added (like Gaussian noise), yet little work has been done to evaluate their robustness to more natural image transformations like photo filters.
no code implementations • CVPR 2020 • Yen-Liang Lin, Son Tran, Larry S. Davis
We evaluate our method on the outfit compatibility, FITB and new retrieval tasks.
1 code implementation • CVPR 2020 • Hengduo Li, Zuxuan Wu, Chen Zhu, Caiming Xiong, Richard Socher, Larry S. Davis
State-of-the-art object detectors rely on regressing and classifying an extensive list of possible anchors, which are divided into positive and negative samples based on their intersection-over-union (IoU) with corresponding groundtruth objects.
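As a rough illustration of the IoU-based split described above, the sketch below computes IoU for axis-aligned boxes and labels each anchor positive or negative. The `(x1, y1, x2, y2)` box format, the function names, and the 0.5 threshold are assumptions made for the example, not values from the paper.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_anchors(anchors, gt_boxes, pos_thresh=0.5):
    """Label an anchor 1 (positive) if its best IoU with any ground-truth
    box reaches pos_thresh, else 0 (negative). Threshold is an assumption."""
    labels = []
    for anchor in anchors:
        best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
        labels.append(1 if best >= pos_thresh else 0)
    return labels
```

For example, `label_anchors([(0, 0, 2, 2), (10, 10, 12, 12)], [(0, 0, 2, 2)])` marks the first anchor positive and the second negative.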
no code implementations • NeurIPS 2019 • Zuxuan Wu, Caiming Xiong, Yu-Gang Jiang, Larry S. Davis
This paper presents LiteEval, a simple yet effective coarse-to-fine framework for resource efficient video recognition, suitable for both online and offline scenarios.
no code implementations • ECCV 2020 • Mingfei Gao, Zizhao Zhang, Guo Yu, Sercan O. Arik, Larry S. Davis, Tomas Pfister
Active learning (AL) combines data labeling and model training to minimize the labeling cost by prioritizing the selection of high value data that can best improve model performance.
no code implementations • 25 Sep 2019 • Mingfei Gao, Zizhao Zhang, Guo Yu, Sercan O. Arik, Larry S. Davis, Tomas Pfister
Active learning (AL) aims to integrate data labeling and model training in a unified way, and to minimize the labeling budget by prioritizing the selection of high value data that can best improve model performance.
no code implementations • ICCV 2019 • Wei Luo, Xitong Yang, Xianjie Mo, Yuheng Lu, Larry S. Davis, Jun Li, Jian Yang, Ser-Nam Lim
Recognizing objects from subcategories with very subtle differences remains a challenging task due to the large intra-class and small inter-class variation.
Ranked #15 on Fine-Grained Image Classification on NABirds (using extra training data)
Fine-Grained Image Classification
Fine-Grained Visual Categorization
no code implementations • 2 Sep 2019 • Pedro H. Bugatti, Priscila T. M. Saito, Larry S. Davis
To do so, we build and apply graphs to graph convolution networks with convolutional neural networks.
no code implementations • 31 Aug 2019 • Mingfei Gao, Larry S. Davis, Richard Socher, Caiming Xiong
We propose weakly supervised language localization networks (WSLLN) to detect events in long, untrimmed videos given language queries.
no code implementations • 2 Jun 2019 • Naiyang Guan, Tongliang Liu, Yangmuzi Zhang, DaCheng Tao, Larry S. Davis
Non-negative matrix factorization (NMF) minimizes the Euclidean distance between the data matrix and its low rank approximation, and it fails when applied to corrupted data because the loss function is sensitive to outliers.
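For reference, the standard Euclidean NMF objective referred to above can be written as follows. The symbols $X$ (data matrix), $W$ and $H$ (non-negative factors), and $r$ (rank) are notation assumed here for illustration; the snippet itself does not define them.

```latex
\min_{W \ge 0,\; H \ge 0} \; \lVert X - W H \rVert_F^2
  \;=\; \sum_{i,j} \bigl( X_{ij} - (W H)_{ij} \bigr)^2,
\qquad X \in \mathbb{R}^{m \times n},\; W \in \mathbb{R}^{m \times r},\; H \in \mathbb{R}^{r \times n}.
```

Because each residual is squared, a single heavily corrupted entry can dominate the sum, which is one way to see the sensitivity to outliers mentioned above.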
no code implementations • CVPR 2021 • Bor-Chun Chen, Zuxuan Wu, Larry S. Davis, Ser-Nam Lim
Detecting spliced images is one of the emerging challenges in computer vision.
6 code implementations • NeurIPS 2019 • Ali Shafahi, Mahyar Najibi, Amin Ghiasi, Zheng Xu, John Dickerson, Christoph Studer, Larry S. Davis, Gavin Taylor, Tom Goldstein
Adversarial training, in which a network is trained on adversarial examples, is one of the few defenses against adversarial attacks that withstands strong attacks.
no code implementations • ICCV 2019 • Zuxuan Wu, Xin Wang, Joseph E. Gonzalez, Tom Goldstein, Larry S. Davis
However, neural classifiers are often extremely brittle when confronted with domain shift, that is, changes in the input distribution that occur over time.
no code implementations • 11 Apr 2019 • Hengduo Li, Bharat Singh, Mahyar Najibi, Zuxuan Wu, Larry S. Davis
We analyze how well their features generalize to tasks like image classification, semantic segmentation and object detection on small datasets like PASCAL-VOC, Caltech-256, SUN-397, Flowers-102 etc.
no code implementations • WS 2019 • Peratham Wiriyathammabhum, Abhinav Shrivastava, Vlad I. Morariu, Larry S. Davis
This paper presents a new task, the grounding of spatio-temporal identifying descriptions in videos.
no code implementations • 3 Apr 2019 • Peng Zhou, Long Mai, Jianming Zhang, Ning Xu, Zuxuan Wu, Larry S. Davis
Instead of sequentially distilling knowledge only from the last model, we directly leverage all previous model snapshots.
no code implementations • ICCV 2019 • Mingfei Gao, Mingze Xu, Larry S. Davis, Richard Socher, Caiming Xiong
We propose StartNet to address Online Detection of Action Start (ODAS) where action starts and their associated categories are detected in untrimmed, streaming videos.
no code implementations • 4 Feb 2019 • Xintong Han, Zuxuan Wu, Weilin Huang, Matthew R. Scott, Larry S. Davis
The latent representations are jointly optimized with the corresponding generation network to condition the synthesis process, encouraging a diverse set of generated results that are visually compatible with existing fashion garments.
no code implementations • 14 Dec 2018 • Xiyang Dai, Bharat Singh, Joe Yue-Hei Ng, Larry S. Davis
We present Temporal Aggregation Network (TAN) which decomposes 3D convolutions into spatial and temporal aggregation blocks.
no code implementations • CVPR 2019 • Mahyar Najibi, Bharat Singh, Larry S. Davis
We propose a novel approach for generating region proposals for performing face-detection.
1 code implementation • ICCV 2019 • Mahyar Najibi, Bharat Singh, Larry S. Davis
Instead of processing an entire image pyramid, AutoFocus adopts a coarse to fine approach and only processes regions which are likely to contain small objects at finer scales.
no code implementations • CVPR 2019 • Da Zhang, Xiyang Dai, Xin Wang, Yuan-Fang Wang, Larry S. Davis
In this paper, we present Moment Alignment Network (MAN), a novel framework that unifies the candidate moment encoding and temporal structural reasoning in a single-shot feed-forward network.
no code implementations • CVPR 2019 • Zuxuan Wu, Caiming Xiong, Chih-Yao Ma, Richard Socher, Larry S. Davis
We present AdaFrame, a framework that adaptively selects relevant frames on a per-input basis for fast video recognition.
no code implementations • 27 Nov 2018 • Ali Shafahi, Mahyar Najibi, Zheng Xu, John Dickerson, Larry S. Davis, Tom Goldstein
Standard adversarial attacks change the predicted class label of a selected image by adding specially tailored small perturbations to its pixels.
no code implementations • 26 Nov 2018 • Pallabi Ghosh, Yi Yao, Larry S. Davis, Ajay Divakaran
We show results on CAD120 (which provides pre-computed node features and edge weights for fair performance comparison across algorithms) as well as a more complex real-world activity dataset, Charades.
1 code implementation • 24 Nov 2018 • Peng Zhou, Bor-Chun Chen, Xintong Han, Mahyar Najibi, Abhinav Shrivastava, Ser Nam Lim, Larry S. Davis
The advent of image sharing platforms and the easy availability of advanced photo editing software have resulted in large quantities of manipulated images being shared on the internet.
no code implementations • CVPR 2019 • Varun Manjunatha, Nirat Saini, Larry S. Davis
It is of interest to the community to explicitly discover such biases, both for understanding the behavior of such models, and towards debugging them.
2 code implementations • CVPR 2019 • Shiyi Lan, Ruichi Yu, Gang Yu, Larry S. Davis
This encourages the network to preserve the geometric structure in Euclidean space throughout the feature extraction hierarchy.
2 code implementations • ICCV 2019 • Mingze Xu, Mingfei Gao, Yi-Ting Chen, Larry S. Davis, David J. Crandall
Most work on temporal action detection is formulated as an offline problem, in which the start and end times of actions are determined after the entire video is fully observed.
Ranked #6 on Online Action Detection on TVSeries
2 code implementations • 19 Oct 2018 • Yaming Wang, Xiao Tan, Yi Yang, Ziyu Li, Xiao Liu, Feng Zhou, Larry S. Davis
Existing 3D pose datasets of object categories are limited to generic object types and lack fine-grained information.
1 code implementation • ECCV 2018 • Sijia Cai, WangMeng Zuo, Larry S. Davis, Lei Zhang
Video summarization is a challenging under-constrained problem because the underlying summary of a single video strongly depends on users' subjective understandings.
1 code implementation • 18 Jun 2018 • Zhe Wu, Navaneeth Bodla, Bharat Singh, Mahyar Najibi, Rama Chellappa, Larry S. Davis
Interestingly, we observe that after dropping 30% of the annotations (and labeling them as background), the performance of CNN-based object detectors like Faster-RCNN only drops by 5% on the PASCAL VOC dataset.
2 code implementations • 12 Jun 2018 • Yaming Wang, Xiao Tan, Yi Yang, Xiao Liu, Errui Ding, Feng Zhou, Larry S. Davis
The new dataset is available at www.umiacs.umd.edu/~wym/3dpose.html
no code implementations • CVPR 2018 • Bharat Singh, Larry S. Davis
On the COCO dataset, our single model performance is 45.7% and an ensemble of 3 networks obtains an mAP of 48.3%.
4 code implementations • NeurIPS 2018 • Bharat Singh, Mahyar Najibi, Larry S. Davis
Our implementation based on Faster-RCNN with a ResNet-101 backbone obtains an mAP of 47.6% on the COCO dataset for bounding box detection and can process 5 images per second during inference with a single GPU.
Ranked #117 on Object Detection on COCO test-dev
2 code implementations • CVPR 2018 • Peng Zhou, Xintong Han, Vlad I. Morariu, Larry S. Davis
Image manipulation detection is different from traditional semantic object detection because it pays more attention to tampering artifacts than to image content, which suggests that richer features need to be learned.
1 code implementation • 27 Apr 2018 • Sohil Shah, Pallabi Ghosh, Larry S. Davis, Tom Goldstein
Many imaging tasks require global information about all pixels in an image.
no code implementations • ECCV 2018 • Zuxuan Wu, Xintong Han, Yen-Liang Lin, Mustafa Gökhan Uzunbas, Tom Goldstein, Ser Nam Lim, Larry S. Davis
In particular, given an image from the source domain and unlabeled samples from the target domain, the generator synthesizes new images on-the-fly to resemble samples from the target domain in appearance and the segmentation network further refines high-level features before predicting semantic maps, both of which leverage feature statistics of sampled images from the target domain.
no code implementations • ICCV 2019 • Ruichi Yu, Hongcheng Wang, Ang Li, Jingxiao Zheng, Vlad I. Morariu, Larry S. Davis
We address the recognition of agent-in-place actions, which are associated with agents who perform them and places where they occur, in the context of outdoor home surveillance.
no code implementations • 29 Mar 2018 • Peng Zhou, Xintong Han, Vlad I. Morariu, Larry S. Davis
We propose a two-stream network for face tampering detection.
no code implementations • 6 Jan 2018 • Ruichi Yu, Hongcheng Wang, Larry S. Davis
To dramatically speed up relevant motion event detection and improve its performance, we propose ReMotENet, a unified, end-to-end data-driven network that uses spatial-temporal attention-based 3D ConvNets to jointly model the appearance and motion of objects-of-interest in a video.
no code implementations • 12 Dec 2017 • Zhe Wu, Bharat Singh, Larry S. Davis, V. S. Subrahmanian
We present a system for covert automated deception detection in real-life courtroom trial videos.
2 code implementations • CVPR 2018 • Bharat Singh, Hengduo Li, Abhishek Sharma, Larry S. Davis
Our approach is a modification of the R-FCN architecture in which position-sensitive filters are shared across different object classes for performing localization.
5 code implementations • CVPR 2018 • Xintong Han, Zuxuan Wu, Zhe Wu, Ruichi Yu, Larry S. Davis
We present an image-based VIrtual Try-On Network (VITON) without using 3D information in any form, which seamlessly transfers a desired clothing item onto the corresponding region of a person using a coarse-to-fine strategy.
no code implementations • 22 Nov 2017 • Bharat Singh, Larry S. Davis
On the COCO dataset, our single model performance is 45.7% and an ensemble of 3 networks obtains an mAP of 48.3%.
Ranked #132 on Object Detection on COCO test-dev
1 code implementation • CVPR 2018 • Zuxuan Wu, Tushar Nagarajan, Abhishek Kumar, Steven Rennie, Larry S. Davis, Kristen Grauman, Rogerio Feris
Very deep convolutional neural networks offer excellent recognition results, yet their computational expense limits their impact for many real-world applications.
no code implementations • CVPR 2018 • Ruichi Yu, Ang Li, Chun-Fu Chen, Jui-Hsin Lai, Vlad I. Morariu, Xintong Han, Mingfei Gao, Ching-Yung Lin, Larry S. Davis
In contrast, we argue that it is essential to prune neurons in the entire neural network jointly based on a unified goal: minimizing the reconstruction error of important responses in the "final response layer" (FRL), which is the second-to-last layer before classification, for a pruned network to retain its predictive power.
no code implementations • ECCV 2018 • Mingfei Gao, Ang Li, Ruichi Yu, Vlad I. Morariu, Larry S. Davis
We introduce count-guided weakly supervised localization (C-WSL), an approach that uses per-class object count as a new form of supervision to improve weakly supervised localization (WSL).
no code implementations • CVPR 2018 • Mingfei Gao, Ruichi Yu, Ang Li, Vlad I. Morariu, Larry S. Davis
We introduce a generic framework that reduces the computational cost of object detection while retaining accuracy for scenarios where objects with varied sizes appear in high resolution images.
no code implementations • 22 Sep 2017 • Fahimeh Rezazadegan, Sareh Shirazi, Mahsa Baktashmotlagh, Larry S. Davis
Anticipating future actions is a key component of intelligence, specifically when it applies to real-time systems, such as robots or autonomous cars.
no code implementations • ICCV 2017 • Xiyang Dai, Bharat Singh, Guyue Zhang, Larry S. Davis, Yan Qiu Chen
For each temporal segment inside a proposal, features are uniformly sampled at a pair of scales and are input to a temporal convolutional neural network for classification.
Ranked #7 on Action Recognition on THUMOS’14
1 code implementation • ICCV 2017 • Xintong Han, Zuxuan Wu, Phoenix X. Huang, Xiao Zhang, Menglong Zhu, Yuan Li, Yang Zhao, Larry S. Davis
This paper proposes an automatic spatially-aware concept discovery approach using weakly labeled image-text data from shopping websites.
no code implementations • ICCV 2017 • Ruichi Yu, Ang Li, Vlad I. Morariu, Larry S. Davis
Understanding visual relationships involves identifying the subject, the object, and a predicate relating them.
1 code implementation • 18 Jul 2017 • Xintong Han, Zuxuan Wu, Yu-Gang Jiang, Larry S. Davis
To this end, we propose to jointly learn a visual-semantic embedding and the compatibility relationships among fashion items in an end-to-end fashion.
no code implementations • CVPR 2017 • Xiyang Dai, Joe Yue-Hei Ng, Larry S. Davis
We then build a multi-level deep architecture to exploit the first and second order information within different convolutional layers.
8 code implementations • ICCV 2017 • Navaneeth Bodla, Bharat Singh, Rama Chellappa, Larry S. Davis
To this end, we propose Soft-NMS, an algorithm which decays the detection scores of all other objects as a continuous function of their overlap with M. Hence, no object is eliminated in this process.
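The score-decay idea described above can be sketched as follows, using a Gaussian penalty as the continuous function of overlap. The `(x1, y1, x2, y2)` box format, the function names, and the `sigma` and `score_thresh` defaults are assumptions for illustration; this is not the authors' reference implementation.

```python
import math


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.0):
    """Return (box, score) pairs in selection order with decayed scores.

    With the default score_thresh of 0.0 no detection is ever discarded;
    a positive threshold is an optional pruning step added here.
    """
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        # M is the remaining detection with the highest current score.
        m_idx = max(range(len(remaining)), key=lambda i: remaining[i][1])
        m_box, m_score = remaining.pop(m_idx)
        kept.append((m_box, m_score))
        decayed = []
        for box, score in remaining:
            overlap = iou(m_box, box)
            # Gaussian decay: larger overlap with M gives a stronger penalty.
            new_score = score * math.exp(-(overlap * overlap) / sigma)
            if new_score >= score_thresh:
                decayed.append((box, new_score))
        remaining = decayed
    return kept
```

Unlike hard NMS, a box that overlaps M heavily is kept with a reduced score rather than removed, so it can still be selected later if nothing better remains.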
no code implementations • 10 Apr 2017 • Zuxuan Wu, Larry S. Davis, Leonid Sigal
In particular, we propose spatial context networks that learn to predict a representation of one image patch from another image patch, within the same image, conditioned on their real-valued relative spatial offset.
no code implementations • CVPR 2017 • Seyed A. Esmaeili, Bharat Singh, Larry S. Davis
It is a fully-convolutional deep neural network, which learns specific filters for thumbnails of different sizes and aspect ratios.
1 code implementation • CVPR 2017 • Venkataraman Santhanam, Vlad I. Morariu, Larry S. Davis
We present a Deep Convolutional Neural Network architecture which serves as a generic image-to-image regressor that can be trained end-to-end without any further machinery.
no code implementations • 9 Dec 2016 • Joe Yue-Hei Ng, Jonghyun Choi, Jan Neumann, Larry S. Davis
Even with the recent advances in convolutional neural networks (CNN) in various visual recognition tasks, the state-of-the-art action recognition system still relies on hand-crafted motion features such as optical flow to achieve the best performance.
Ranked #66 on Action Recognition on HMDB-51
1 code implementation • CVPR 2018 • Yaming Wang, Vlad I. Morariu, Larry S. Davis
Compared to earlier multistage frameworks using CNN features, recent end-to-end deep approaches for fine-grained recognition essentially enhance the mid-level learning capability of CNNs.
Ranked #20 on Fine-Grained Image Classification on CUB-200-2011
no code implementations • CVPR 2017 • Ang Li, Jin Sun, Joe Yue-Hei Ng, Ruichi Yu, Vlad I. Morariu, Larry S. Davis
Since interactions between objects can be reduced to a limited set of atomic spatial relations in 3D, we study the possibility of inferring 3D structure from a text description rather than an image, applying physical relation models to synthesize holistic 3D abstract object layouts satisfying the spatial constraints present in a textual description.
no code implementations • 18 Nov 2016 • Hui Miao, Ang Li, Larry S. Davis, Amol Deshpande
The deep learning modeling lifecycle generates a rich set of data artifacts, such as learned parameters and training logs, and comprises several frequently conducted tasks, e.g., to understand the model behaviors and to try out new models.
no code implementations • 11 Oct 2016 • Xianzhi Du, Mostafa El-Khamy, Jungwon Lee, Larry S. Davis
A single shot deep convolutional network is trained as an object detector to generate all possible pedestrian candidates of different sizes and occlusions.
Ranked #19 on Pedestrian Detection on Caltech
no code implementations • 9 Sep 2016 • Ruichi Yu, Xi Chen, Vlad I. Morariu, Larry S. Davis
We investigate the reasons why context in object detection has limited utility by isolating and evaluating the predictive power of different context cues under ideal conditions in which context is provided by an oracle.
1 code implementation • 1 Aug 2016 • Varun K. Nagaraja, Vlad I. Morariu, Larry S. Davis
Our approach uses an LSTM to learn the probability of a referring expression, with input features from a region and a context region.
no code implementations • CVPR 2016 • Yaming Wang, Jonghyun Choi, Vlad I. Morariu, Larry S. Davis
Fine-grained classification involves distinguishing between similar sub-categories based on subtle differences in highly localized regions; therefore, accurate localization of discriminative regions remains a major challenge.
no code implementations • 25 Apr 2016 • Bahadir Ozdemir, Larry S. Davis
We propose a flexible procedure for large-scale image search by hash functions with kernels.
no code implementations • 25 Apr 2016 • Bahadir Ozdemir, Mahyar Najibi, Larry S. Davis
In the first stage of classification, binary codes are considered as class labels by a set of binary SVMs; each corresponds to one bit.
2 code implementations • CVPR 2016 • Mahmudul Hasan, Jonghyun Choi, Jan Neumann, Amit K. Roy-Chowdhury, Larry S. Davis
Perceiving meaningful activities in a long video sequence is a challenging problem due to the ambiguous definition of 'meaningfulness' as well as clutter in the scene.
Ranked #2 on Traffic Accident Detection on A3D
no code implementations • 11 Feb 2016 • Yangmuzi Zhang, Zhuolin Jiang, Xi Chen, Larry S. Davis
Based on the multi-scale nature of objects in images, our approach is built on top of a hierarchical segmentation.
no code implementations • 9 Feb 2016 • Xiyang Dai, Sameh Khamis, Yangmuzi Zhang, Larry S. Davis
Sparse representations have been successfully applied to signal processing, computer vision and machine learning.
no code implementations • CVPR 2016 • Mahyar Najibi, Mohammad Rastegari, Larry S. Davis
G-CNN starts with a multi-scale grid of fixed bounding boxes.
no code implementations • 13 Dec 2015 • Mahdyar Ravanbakhsh, Hossein Mousavi, Mohammad Rastegari, Vittorio Murino, Larry S. Davis
Action recognition tasks usually rely on complex handcrafted structures as features to represent the human action model.
no code implementations • 10 Dec 2015 • Xintong Han, Bharat Singh, Vlad I. Morariu, Larry S. Davis
VRFP is a real-time video retrieval framework based on short text input queries, which obtains weakly labeled training images from the web after the query is known.
no code implementations • ICCV 2015 • Chi Su, Fan Yang, Shiliang Zhang, Qi Tian, Larry S. Davis, Wen Gao
Since attributes are generally correlated, we introduce a low rank attribute embedding into the MTL formulation to embed original binary attributes to a continuous attribute space, where incorrect and incomplete attributes are rectified and recovered to better describe people.
1 code implementation • ICCV 2015 • Ang Li, Vlad Morariu, Larry S. Davis
Most existing face verification systems rely on precise face detection and registration.
no code implementations • 24 Nov 2015 • Varun K. Nagaraja, Vlad I. Morariu, Larry S. Davis
However, we can use structure in the scene to search for objects without processing the entire image.
no code implementations • ICCV 2015 • Bharat Singh, Xintong Han, Zhe Wu, Vlad I. Morariu, Larry S. Davis
Given a text description of an event, event retrieval is performed by selecting concepts linguistically related to the event description and fusing the concept responses on unseen videos.
no code implementations • 20 Sep 2015 • Mahyar Najibi, Mohammad Rastegari, Larry S. Davis
To make large-scale search feasible, Distance Estimation and Subset Indexing are the main approaches.
no code implementations • CVPR 2015 • Ashish Shrivastava, Mohammad Rastegari, Sumit Shekhar, Rama Chellappa, Larry S. Davis
Many existing recognition algorithms combine different modalities based on training accuracy but do not consider the possibility of noise at test time.
no code implementations • 20 Apr 2015 • Joe Yue-Hei Ng, Fan Yang, Larry S. Davis
Deep convolutional neural networks have been successfully applied to image classification tasks.
no code implementations • 30 Jan 2015 • Sravanthi Bondugula, Varun Manjunatha, Larry S. Davis, David Doermann
We present a supervised binary encoding scheme for image retrieval that learns projections by taking into account similarity between classes obtained from output embeddings.
no code implementations • NeurIPS 2014 • Bahadir Ozdemir, Larry S. Davis
We propose a multimodal retrieval procedure based on latent feature models.
no code implementations • 5 May 2014 • Mohammad Rastegari, Shobeir Fakhraei, Jonghyun Choi, David Jacobs, Larry S. Davis
We discuss methodological issues related to the evaluation of unsupervised binary code construction methods for nearest neighbor search.
1 code implementation • 21 Jan 2014 • Changxing Ding, Jonghyun Choi, DaCheng Tao, Larry S. Davis
To perform unconstrained face recognition robust to variations in illumination, pose and expression, this paper presents a new scheme to extract "Multi-Directional Multi-Level Dual-Cross Patterns" (MDML-DCPs) from face images.
no code implementations • ECCV 2014 • Ang Li, Vlad I. Morariu, Larry S. Davis
Image based geolocation aims to answer the question: where was this ground photograph taken?
no code implementations • CVPR 2013 • Arpit Jain, Abhinav Gupta, Mikel Rodriguez, Larry S. Davis
representation for videos based on mid-level discriminative spatio-temporal patches.
no code implementations • CVPR 2013 • Jonghyun Choi, Mohammad Rastegari, Ali Farhadi, Larry S. Davis
We propose a method to expand the visual coverage of training sets that consist of a small number of labeled examples using learned attributes.
no code implementations • CVPR 2013 • Yangmuzi Zhang, Zhuolin Jiang, Larry S. Davis
An approach to learn a structured low-rank representation for image classification is presented.
no code implementations • CVPR 2013 • Zhuolin Jiang, Larry S. Davis
The problem of salient region detection is formulated as the well-studied facility location problem from operations research.
no code implementations • NeurIPS 2008 • Vlad I. Morariu, Balaji V. Srinivasan, Vikas C. Raykar, Ramani Duraiswami, Larry S. Davis
To solve the second problem, we present an online tuning approach that results in a black box method that automatically chooses the evaluation method and its parameters to yield the best performance for the input data, desired accuracy, and bandwidth.
no code implementations • NeurIPS 2008 • Abhinav Gupta, Jianbo Shi, Larry S. Davis
Using an analogous reasoning, we present an approach that combines bag-of-words and spatial models to perform semantic and syntactic analysis for recognition of an object based on its internal appearance and its context.