no code implementations • CVPR 2023 • Bang-Dang Pham, Phong Tran, Anh Tran, Cuong Pham, Rang Nguyen, Minh Hoai
We consider the challenging task of training models for image-to-video deblurring, which aims to recover a sequence of sharp images corresponding to a given blurry image input.
1 code implementation • CVPR 2023 • Sounak Mondal, Zhibo Yang, Seoyoung Ahn, Dimitris Samaras, Gregory Zelinsky, Minh Hoai
In response, we pose a new task called ZeroGaze, a new variant of zero-shot learning where gaze is predicted for never-before-searched objects, and we develop a novel model, Gazeformer, to solve the ZeroGaze problem.
no code implementations • 16 Mar 2023 • Zhibo Yang, Sounak Mondal, Seoyoung Ahn, Gregory Zelinsky, Minh Hoai, Dimitris Samaras
Most models of visual attention are aimed at predicting either top-down or bottom-up control, as studied using different visual search and free-viewing tasks.
1 code implementation • CVPR 2023 • Zekun Zhang, Minh Hoai
This paper proposes a novel method to improve the performance of a trained object detector on scenes with fixed camera perspectives based on self-supervised adaptation.
no code implementations • 20 Nov 2022 • Qiaomu Miao, Minh Hoai, Dimitris Samaras
Gaze following aims to predict where a person is looking in a scene, by predicting the target location, or indicating that the target is located outside the image.
1 code implementation • 28 Oct 2022 • Bach Tran, Binh-Son Hua, Anh Tuan Tran, Minh Hoai
Inspired by the success of deep learning in the image domain, we devise a novel pre-training technique for better model initialization by utilizing the multi-view rendering of the 3D data.
no code implementations • 12 Oct 2022 • Sayontan Ghosh, Tanvi Aggarwal, Minh Hoai, Niranjan Balasubramanian
Anticipating future actions in a video is useful for many autonomous and assistive technologies.
1 code implementation • 22 Jul 2022 • Thanh Nguyen, Chau Pham, Khoi Nguyen, Minh Hoai
We tackle a new task of few-shot object counting and detection.
Ranked #7 on Object Counting on FSC147
1 code implementation • 4 Jul 2022 • Zhibo Yang, Sounak Mondal, Seoyoung Ahn, Gregory Zelinsky, Minh Hoai, Dimitris Samaras
In this paper, we propose the first data-driven computational model that addresses the search-termination problem and predicts the scanpath of search fixations made by people searching for targets that do not appear in images.
no code implementations • 27 May 2022 • Viresh Ranjan, Minh Hoai
We tackle the task of Class Agnostic Counting, which aims to count objects in a novel object category at test time without any access to labeled training data for that category.
1 code implementation • CVPR 2022 • Mingzhen Huang, Supreeth Narasimhaswamy, Saif Vazir, Haibin Ling, Minh Hoai
The first stage is Forward Propagation, where the features from frame t-1 are propagated to frame t based on previously detected hands and their estimated motion.
Ranked #1 on Multiple Object Tracking on YouTube-Hands (using extra training data)
1 code implementation • CVPR 2022 • Supreeth Narasimhaswamy, Thanh Nguyen, Mingzhen Huang, Minh Hoai
We also introduce a new challenging dataset called BodyHands, containing unconstrained images annotated with hand locations and their corresponding body locations.
no code implementations • 29 Sep 2021 • Viresh Ranjan, Minh Hoai
Given an image containing multiple objects of a novel visual category and few exemplar bounding boxes depicting the visual category of interest, we want to count all of the instances of the desired visual category in the image.
1 code implementation • ICCV 2021 • Long-Nhat Ho, Anh Tuan Tran, Quynh Phung, Minh Hoai
In this paper, we eliminate the symmetry requirement with a novel unsupervised algorithm that can learn a 3D reconstruction network from a multi-image dataset.
1 code implementation • CVPR 2021 • Phong Tran, Anh Tuan Tran, Quynh Phung, Minh Hoai
This paper introduces a method to encode the blur operators of an arbitrary dataset of sharp-blur image pairs into a blur kernel space.
1 code implementation • CVPR 2021 • Nguyen Nguyen, Thu Nguyen, Vinh Tran, Minh-Triet Tran, Thanh Duc Ngo, Thien Huu Nguyen, Minh Hoai
Language prior plays an important role in the way humans perceive and recognize text in the wild.
1 code implementation • CVPR 2021 • Viresh Ranjan, Udbhav Sharma, Thu Nguyen, Minh Hoai
We also present a novel adaptation strategy to adapt our network to any novel visual category at test time, using only a few exemplar objects from the novel category.
Ranked #9 on Object Counting on FSC147
1 code implementation • CVPR 2021 • Chuong Huynh, Anh Tran, Khoa Luu, Minh Hoai
In this work, we present MagNet, a multi-scale framework that resolves local ambiguity by looking at the image at multiple magnification levels.
Ranked #4 on Land Cover Classification on DeepGlobe
1 code implementation • CVPR 2021 • Thao Nguyen, Anh Tran, Minh Hoai
However, existing works overlooked the latter components and confined makeup transfer to color manipulation, focusing only on light makeup styles.
Ranked #1 on Facial Makeup Transfer on CPM-Synt-2
1 code implementation • 1 Apr 2021 • Phong Tran, Anh Tran, Quynh Phung, Minh Hoai
This paper introduces a method to encode the blur operators of an arbitrary dataset of sharp-blur image pairs into a blur kernel space.
Blind Image Deblurring
Facial Expression Recognition (FER)
no code implementations • 1 Mar 2021 • Phong Tran, Anh Tran, Thao Nguyen, Minh Hoai
The objective of this work is to deblur face videos.
1 code implementation • 23 Dec 2020 • Shahira Abousamra, Minh Hoai, Dimitris Samaras, Chao Chen
Due to various challenges, a localization method is prone to spatial semantic errors, i.e., predicting multiple dots within the same person or collapsing multiple dots in a cluttered region.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Thu Nguyen, Duy Phung, Minh Hoai, Thien Huu Nguyen
Personality image captioning (PIC) aims to describe an image with a natural language caption given a personality trait.
no code implementations • NeurIPS 2020 • Supreeth Narasimhaswamy, Trung Nguyen, Minh Hoai
The first attention mechanism is based on the hand and a region's affinity, enclosing the hand and the object, and densely pools features from this region to the hand region.
1 code implementation • 30 Sep 2020 • Viresh Ranjan, Boyu Wang, Mubarak Shah, Minh Hoai
We present sample selection strategies which make use of the density and uncertainty of predictions from the networks trained on one domain to select the informative images from a target domain of interest to acquire human annotation.
1 code implementation • NeurIPS 2020 • Boyu Wang, Huidong Liu, Dimitris Samaras, Minh Hoai
Existing crowd counting methods need to use a Gaussian to smooth each annotated dot or to estimate the likelihood of every pixel given the annotated point.
Ranked #2 on Crowd Counting on UCF-QNRF
no code implementations • 14 Sep 2020 • Raji Annadi, Yupei Chen, Viresh Ranjan, Dimitris Samaras, Gregory Zelinsky, Minh Hoai
Analyzing the collected gaze behavior of ten human participants on thirty crowd images, we observe some common approaches for visual counting.
no code implementations • 3 Sep 2020 • Xinyi Huang, Suphanut Jamonnak, Ye Zhao, Boyu Wang, Minh Hoai, Kevin Yager, Wei Xu
Existing interactive visualization tools for deep learning are mostly applied to the training, debugging, and refinement of neural network models working on natural images.
2 code implementations • CVPR 2020 • Zhibo Yang, Lihan Huang, Yupei Chen, Zijun Wei, Seoyoung Ahn, Gregory Zelinsky, Dimitris Samaras, Minh Hoai
These maps were learned by IRL and then used to predict behavioral scanpaths for multiple target categories.
no code implementations • 31 Jan 2020 • Gregory J. Zelinsky, Yupei Chen, Seoyoung Ahn, Hossein Adeli, Zhibo Yang, Lihan Huang, Dimitrios Samaras, Minh Hoai
Using machine learning and the psychologically-meaningful principle of reward, it is possible to learn the visual features used in goal-directed attention control.
no code implementations • 10 Oct 2019 • Xinyi Huang, Suphanut Jamonnak, Ye Zhao, Boyu Wang, Minh Hoai, Kevin Yager, Wei Xu
This extended abstract presents a visualization system, which is designed for domain scientists to visually understand their deep learning model of extracting multiple attributes in x-ray scattering images.
no code implementations • 10 Apr 2019 • Yang Wang, Vinh Tran, Gedas Bertasius, Lorenzo Torresani, Minh Hoai
This is a challenging task due to the subtlety of human actions in video and the co-occurrence of contextual elements.
no code implementations • 9 Apr 2019 • Vinh Tran, Yang Wang, Minh Hoai
In this paper, we propose a novel knowledge distillation framework that uses an action recognition network to supervise the training of an action anticipation network, guiding the latter to attend to the relevant information needed for correctly anticipating the future actions.
1 code implementation • ICCV 2019 • Supreeth Narasimhaswamy, Zhengwei Wei, Yang Wang, Justin Zhang, Minh Hoai
We also conduct ablation studies on hand detection to show the effectiveness of the proposed contextual attention module.
no code implementations • 19 Feb 2019 • Roy Shilkrot, Zhi Chai, Minh Hoai
Visual segmentation has seen tremendous advancement recently with ready solutions for a wide variety of scene types, including human hands and other body parts.
no code implementations • CVPR 2019 • Yang Wang, Haibin Huang, Chuan Wang, Tong He, Jue Wang, Minh Hoai
In this paper, we propose GIF2Video, the first learning-based method for enhancing the visual quality of GIFs in the wild.
no code implementations • ICLR 2019 • Viresh Ranjan, Heeyoung Kwon, Niranjan Balasubramanian, Minh Hoai
We automatically generate fake sentences by corrupting original sentences from a source collection and train the encoders to produce representations that are effective at detecting fake sentences.
no code implementations • ECCV 2018 • Viresh Ranjan, Hieu Le, Minh Hoai
In this work, we tackle the problem of crowd counting in images.
Ranked #10 on Crowd Counting on UCF CC 50
no code implementations • CVPR 2018 • Zijun Wei, Jianming Zhang, Xiaohui Shen, Zhe Lin, Radomír Mech, Minh Hoai, Dimitris Samaras
Finding views with good photo composition is a challenging task for machine learning methods.
no code implementations • CVPR 2018 • Yang Wang, Minh Hoai
The ability to recognize human actions in video has many potential applications.
1 code implementation • ECCV 2018 • Hieu Le, Tomas F. Yago Vicente, Vu Nguyen, Minh Hoai, Dimitris Samaras
The A-Net modifies the original training images constrained by a simplified physical shadow model and is focused on fooling the D-Net's shadow predictions.
Ranked #4 on Shadow Detection on SBU
no code implementations • ICCV 2017 • Vu Nguyen, Tomas F. Yago Vicente, Maozheng Zhao, Minh Hoai, Dimitris Samaras
We introduce scGAN, a novel extension of conditional Generative Adversarial Networks (GAN) tailored for the challenging problem of shadow detection in images.
Ranked #6 on RGB Salient Object Detection on ISTD
no code implementations • 17 Aug 2017 • Yang Wang, Vinh Tran, Minh Hoai
We introduce Eigen Evolution Pooling, an efficient method to aggregate a sequence of feature vectors.
no code implementations • 14 Feb 2017 • Yang Wang, Vinh Tran, Minh Hoai
Recently, Trajectory-pooled Deep-learning Descriptors were shown to achieve state-of-the-art human action recognition results on a number of datasets.
no code implementations • 10 Nov 2016 • Boyu Wang, Kevin Yager, Dantong Yu, Minh Hoai
In this paper, we explore the use of deep learning to develop methods for automatically analyzing x-ray scattering images.
no code implementations • CVPR 2016 • Zijun Wei, Minh Hoai
RRSVM exploits the correlation of local regions in an image, and it jointly learns a region evaluation function and a scheme for integrating multiple regions.
no code implementations • CVPR 2016 • Tomas F. Yago Vicente, Minh Hoai, Dimitris Samaras
However, shadow detection on broader image domains is still challenging due to the lack of annotated training data.
no code implementations • 31 May 2016 • Yang Liu, Minh Hoai, Mang Shao, Tae-Kyun Kim
LBSVM is based on Structured-Output SVM, but extends it to handle noisy video data and ensure consistency of the output decision throughout time.
no code implementations • CVPR 2016 • Yang Wang, Minh Hoai
In this paper we consider the task of recognizing human actions in realistic video where human actions are dominated by irrelevant factors.
no code implementations • ICCV 2015 • Tomas F. Yago Vicente, Minh Hoai, Dimitris Samaras
Optimizing the leave-one-out cross validation error is typically difficult, but it can be done efficiently in our framework.
no code implementations • CVPR 2014 • Minh Hoai, Andrew Zisserman
The objective of this work is to accurately and efficiently detect configurations of one or more people in edited TV material.
no code implementations • CVPR 2013 • Minh Hoai, Andrew Zisserman
The objective of this work is to learn sub-categories.