Search Results for author: Minh Hoai

Found 57 papers, 25 papers with code

Driver Attention Tracking and Analysis

no code implementations 10 Apr 2024 Dat Viet Thanh Nguyen, Anh Tran, Hoai Nam Vu, Cuong Pham, Minh Hoai

This network has a camera calibration module that can compute an embedding vector that represents the spatial configuration between the driver and the camera system.

Camera Calibration

Blur2Blur: Blur Conversion for Unsupervised Image Deblurring on Unknown Domains

no code implementations 24 Mar 2024 Bang-Dang Pham, Phong Tran, Anh Tran, Cuong Pham, Rang Nguyen, Minh Hoai

This algorithm works by transforming a blurry input image, which is challenging to deblur, into another blurry image that is more amenable to deblurring.

Deblurring, Image Deblurring

HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances

no code implementations 4 Mar 2024 Supreeth Narasimhaswamy, Uttaran Bhattacharya, Xiang Chen, Ishita Dasgupta, Saayan Mitra, Minh Hoai

To generate images with realistic hands, we propose a novel diffusion-based architecture called HanDiffuser that achieves realism by injecting hand embeddings in the generative process.

Text-to-Image Generation

Count What You Want: Exemplar Identification and Few-shot Counting of Human Actions in the Wild

1 code implementation 28 Dec 2023 Yifeng Huang, Duc Duy Nguyen, Lam Nguyen, Cuong Pham, Minh Hoai

To develop and evaluate our approach, we introduce a diverse and realistic dataset consisting of real-world data from 37 subjects and 50 action categories, encompassing both sensor and audio data.

Density Estimation

Interactive Class-Agnostic Object Counting

no code implementations ICCV 2023 Yifeng Huang, Viresh Ranjan, Minh Hoai

The user can provide feedback by selecting a region with obvious counting errors and specifying the range for the estimated number of objects within it.

Object, Object Counting

HyperCUT: Video Sequence from a Single Blurry Image using Unsupervised Ordering

1 code implementation CVPR 2023 Bang-Dang Pham, Phong Tran, Anh Tran, Cuong Pham, Rang Nguyen, Minh Hoai

We consider the challenging task of training models for image-to-video deblurring, which aims to recover a sequence of sharp images corresponding to a given blurry image input.

Deblurring

Gazeformer: Scalable, Effective and Fast Prediction of Goal-Directed Human Attention

1 code implementation CVPR 2023 Sounak Mondal, Zhibo Yang, Seoyoung Ahn, Dimitris Samaras, Gregory Zelinsky, Minh Hoai

In response, we pose a new task called ZeroGaze, a new variant of zero-shot learning where gaze is predicted for never-before-searched objects, and we develop a novel model, Gazeformer, to solve the ZeroGaze problem.

Gaze Prediction, Language Modelling, +2

Unifying Top-down and Bottom-up Scanpath Prediction Using Transformers

1 code implementation 16 Mar 2023 Zhibo Yang, Sounak Mondal, Seoyoung Ahn, Ruoyu Xue, Gregory Zelinsky, Minh Hoai, Dimitris Samaras

Most models of visual attention aim at predicting either top-down or bottom-up control, as studied using different visual search and free-viewing tasks.

Scanpath prediction

Object Detection With Self-Supervised Scene Adaptation

1 code implementation CVPR 2023 Zekun Zhang, Minh Hoai

This paper proposes a novel method to improve the performance of a trained object detector on scenes with fixed camera perspectives based on self-supervised adaptation.

Data Augmentation, Object, +2

Patch-level Gaze Distribution Prediction for Gaze Following

1 code implementation 20 Nov 2022 Qiaomu Miao, Minh Hoai, Dimitris Samaras

Gaze following aims to predict where a person is looking in a scene, by predicting the target location, or indicating that the target is located outside the image.

Binary Classification

Self-Supervised Learning with Multi-View Rendering for 3D Point Cloud Analysis

1 code implementation 28 Oct 2022 Bach Tran, Binh-Son Hua, Anh Tuan Tran, Minh Hoai

Inspired by the success of deep learning in the image domain, we devise a novel pre-training technique for better model initialization by utilizing the multi-view rendering of the 3D data.

Knowledge Distillation, Self-Supervised Learning

Target-absent Human Attention

1 code implementation 4 Jul 2022 Zhibo Yang, Sounak Mondal, Seoyoung Ahn, Gregory Zelinsky, Minh Hoai, Dimitris Samaras

In this paper, we propose the first data-driven computational model that addresses the search-termination problem and predicts the scanpath of search fixations made by people searching for targets that do not appear in images.

Imitation Learning

Exemplar Free Class Agnostic Counting

no code implementations 27 May 2022 Viresh Ranjan, Minh Hoai

We tackle the task of Class Agnostic Counting, which aims to count objects in a novel object category at test time without any access to labeled training data for that category.

Density Estimation, Region Proposal, +1

Forward Propagation, Backward Regression, and Pose Association for Hand Tracking in the Wild

1 code implementation CVPR 2022 Mingzhen Huang, Supreeth Narasimhaswamy, Saif Vazir, Haibin Ling, Minh Hoai

The first stage is Forward Propagation, where the features from frame t-1 are propagated to frame t based on previously detected hands and their estimated motion.

Ranked #1 on Multiple Object Tracking on YouTube-Hands (using extra training data)

Multiple Object Tracking, regression

Whose Hands Are These? Hand Detection and Hand-Body Association in the Wild

1 code implementation CVPR 2022 Supreeth Narasimhaswamy, Thanh Nguyen, Mingzhen Huang, Minh Hoai

We also introduce a new challenging dataset called BodyHands, containing unconstrained images annotated with hand locations and their corresponding body locations.

Hand Detection

Vicinal Counting Networks

no code implementations 29 Sep 2021 Viresh Ranjan, Minh Hoai

Given an image containing multiple objects of a novel visual category and few exemplar bounding boxes depicting the visual category of interest, we want to count all of the instances of the desired visual category in the image.

Crowd Counting, Data Augmentation

Toward Realistic Single-View 3D Object Reconstruction with Unsupervised Learning from Multiple Images

1 code implementation ICCV 2021 Long-Nhat Ho, Anh Tuan Tran, Quynh Phung, Minh Hoai

In this paper, we eliminate the symmetry requirement with a novel unsupervised algorithm that can learn a 3D reconstruction network from a multi-image dataset.

3D Object Reconstruction, 3D Reconstruction, +1

Explore Image Deblurring via Encoded Blur Kernel Space

1 code implementation CVPR 2021 Phong Tran, Anh Tuan Tran, Quynh Phung, Minh Hoai

This paper introduces a method to encode the blur operators of an arbitrary dataset of sharp-blur image pairs into a blur kernel space.

Blind Image Deblurring, Image Deblurring

Learning To Count Everything

1 code implementation CVPR 2021 Viresh Ranjan, Udbhav Sharma, Thu Nguyen, Minh Hoai

We also present a novel adaptation strategy to adapt our network to any novel visual category at test time, using only a few exemplar objects from the novel category.

Object Counting

Progressive Semantic Segmentation

1 code implementation CVPR 2021 Chuong Huynh, Anh Tran, Khoa Luu, Minh Hoai

In this work, we present MagNet, a multi-scale framework that resolves local ambiguity by looking at the image at multiple magnification levels.

Land Cover Classification, Segmentation, +1

Lipstick ain't enough: Beyond Color Matching for In-the-Wild Makeup Transfer

1 code implementation CVPR 2021 Thao Nguyen, Anh Tran, Minh Hoai

However, existing works overlooked the latter components and confined makeup transfer to color manipulation, focusing only on light makeup styles.

Color Manipulation, Facial Makeup Transfer, +2

Explore Image Deblurring via Blur Kernel Space

1 code implementation 1 Apr 2021 Phong Tran, Anh Tran, Quynh Phung, Minh Hoai

This paper introduces a method to encode the blur operators of an arbitrary dataset of sharp-blur image pairs into a blur kernel space.

Blind Image Deblurring, Facial Expression Recognition (FER), +1

Localization in the Crowd with Topological Constraints

1 code implementation 23 Dec 2020 Shahira Abousamra, Minh Hoai, Dimitris Samaras, Chao Chen

Due to various challenges, a localization method is prone to spatial semantic errors, i.e., predicting multiple dots within the same person or collapsing multiple dots in a cluttered region.

Crowd Counting

Detecting Hands and Recognizing Physical Contact in the Wild

no code implementations NeurIPS 2020 Supreeth Narasimhaswamy, Trung Nguyen, Minh Hoai

The first attention mechanism is based on the hand and a region's affinity, enclosing the hand and the object, and densely pools features from this region to the hand region.

Object

Uncertainty Estimation and Sample Selection for Crowd Counting

1 code implementation 30 Sep 2020 Viresh Ranjan, Boyu Wang, Mubarak Shah, Minh Hoai

We present sample selection strategies which make use of the density and uncertainty of predictions from the networks trained on one domain to select the informative images from a target domain of interest to acquire human annotation.

Crowd Counting

Distribution Matching for Crowd Counting

1 code implementation NeurIPS 2020 Boyu Wang, Huidong Liu, Dimitris Samaras, Minh Hoai

Existing crowd counting methods need to use a Gaussian to smooth each annotated dot or to estimate the likelihood of every pixel given the annotated point.

Crowd Counting

A Study of Human Gaze Behavior During Visual Crowd Counting

no code implementations 14 Sep 2020 Raji Annadi, Yupei Chen, Viresh Ranjan, Dimitris Samaras, Gregory Zelinsky, Minh Hoai

Analyzing the collected gaze behavior of ten human participants on thirty crowd images, we observe some common approaches for visual counting.

Crowd Counting

Interactive Visual Study of Multiple Attributes Learning Model of X-Ray Scattering Images

no code implementations 3 Sep 2020 Xinyi Huang, Suphanut Jamonnak, Ye Zhao, Boyu Wang, Minh Hoai, Kevin Yager, Wei Xu

Existing interactive visualization tools for deep learning are mostly applied to the training, debugging, and refinement of neural network models working on natural images.

Image Classification

Predicting Goal-directed Attention Control Using Inverse-Reinforcement Learning

no code implementations 31 Jan 2020 Gregory J. Zelinsky, Yupei Chen, Seoyoung Ahn, Hossein Adeli, Zhibo Yang, Lihan Huang, Dimitrios Samaras, Minh Hoai

Using machine learning and the psychologically-meaningful principle of reward, it is possible to learn the visual features used in goal-directed attention control.

BIG-bench Machine Learning, reinforcement-learning, +1

Visual Understanding of Multiple Attributes Learning Model of X-Ray Scattering Images

no code implementations 10 Oct 2019 Xinyi Huang, Suphanut Jamonnak, Ye Zhao, Boyu Wang, Minh Hoai, Kevin Yager, Wei Xu

This extended abstract presents a visualization system, which is designed for domain scientists to visually understand their deep learning model of extracting multiple attributes in x-ray scattering images.

Attentive Action and Context Factorization

no code implementations 10 Apr 2019 Yang Wang, Vinh Tran, Gedas Bertasius, Lorenzo Torresani, Minh Hoai

This is a challenging task due to the subtlety of human actions in video and the co-occurrence of contextual elements.

Action Recognition, Temporal Action Localization

Contextual Attention for Hand Detection in the Wild

1 code implementation ICCV 2019 Supreeth Narasimhaswamy, Zhengwei Wei, Yang Wang, Justin Zhang, Minh Hoai

We also conduct ablation studies on hand detection to show the effectiveness of the proposed contextual attention module.

Hand Detection, object-detection, +1

Knowledge Distillation for Human Action Anticipation

no code implementations 9 Apr 2019 Vinh Tran, Yang Wang, Minh Hoai

In this paper, we propose a novel knowledge distillation framework that uses an action recognition network to supervise the training of an action anticipation network, guiding the latter to attend to the relevant information needed for correctly anticipating the future actions.

Action Anticipation, Action Recognition, +3

BusyHands: A Hand-Tool Interaction Database for Assembly Tasks Semantic Segmentation

no code implementations 19 Feb 2019 Roy Shilkrot, Zhi Chai, Minh Hoai

Visual segmentation has seen tremendous advancement recently with ready solutions for a wide variety of scene types, including human hands and other body parts.

Segmentation, Semantic Segmentation

GIF2Video: Color Dequantization and Temporal Interpolation of GIF images

no code implementations CVPR 2019 Yang Wang, Haibin Huang, Chuan Wang, Tong He, Jue Wang, Minh Hoai

In this paper, we propose GIF2Video, the first learning-based method for enhancing the visual quality of GIFs in the wild.

Quantization

Fake Sentence Detection as a Training Task for Sentence Encoding

no code implementations ICLR 2019 Viresh Ranjan, Heeyoung Kwon, Niranjan Balasubramanian, Minh Hoai

We automatically generate fake sentences by corrupting original sentences from a source collection and train the encoders to produce representations that are effective at detecting fake sentences.

Binary Classification, Language Modelling, +1

A+D Net: Training a Shadow Detector with Adversarial Shadow Attenuation

1 code implementation ECCV 2018 Hieu Le, Tomas F. Yago Vicente, Vu Nguyen, Minh Hoai, Dimitris Samaras

The A-Net modifies the original training images constrained by a simplified physical shadow model and is focused on fooling the D-Net's shadow predictions.

Detecting Shadows, Shadow Detection

Shadow Detection With Conditional Generative Adversarial Networks

no code implementations ICCV 2017 Vu Nguyen, Tomas F. Yago Vicente, Maozheng Zhao, Minh Hoai, Dimitris Samaras

We introduce scGAN, a novel extension of conditional Generative Adversarial Networks (GAN) tailored for the challenging problem of shadow detection in images.

Shadow Detection

Eigen Evolution Pooling for Human Action Recognition

no code implementations 17 Aug 2017 Yang Wang, Vinh Tran, Minh Hoai

We introduce Eigen Evolution Pooling, an efficient method to aggregate a sequence of feature vectors.

Action Recognition, Temporal Action Localization

Evolution-Preserving Dense Trajectory Descriptors

no code implementations 14 Feb 2017 Yang Wang, Vinh Tran, Minh Hoai

Recently Trajectory-pooled Deep-learning Descriptors were shown to achieve state-of-the-art human action recognition results on a number of datasets.

Action Recognition, Temporal Action Localization

X-ray Scattering Image Classification Using Deep Learning

no code implementations 10 Nov 2016 Boyu Wang, Kevin Yager, Dantong Yu, Minh Hoai

In this paper, we explore the use of deep learning to develop methods for automatically analyzing x-ray scattering images.

Classification, General Classification, +1

Region Ranking SVM for Image Classification

no code implementations CVPR 2016 Zijun Wei, Minh Hoai

RRSVM exploits the correlation of local regions in an image, and it jointly learns a region evaluation function and a scheme for integrating multiple regions.

Classification, General Classification, +1

Latent Bi-constraint SVM for Video-based Object Recognition

no code implementations 31 May 2016 Yang Liu, Minh Hoai, Mang Shao, Tae-Kyun Kim

LBSVM is based on Structured-Output SVM, but extends it to handle noisy video data and ensure consistency of the output decision throughout time.

Object, Object Recognition

Improving Human Action Recognition by Non-action Classification

no code implementations CVPR 2016 Yang Wang, Minh Hoai

In this paper we consider the task of recognizing human actions in realistic video where human actions are dominated by irrelevant factors.

Action Classification, Action Recognition, +3

Leave-One-Out Kernel Optimization for Shadow Detection

no code implementations ICCV 2015 Tomas F. Yago Vicente, Minh Hoai, Dimitris Samaras

Optimizing the leave-one-out cross validation error is typically difficult, but it can be done efficiently in our framework.

Shadow Detection, Superpixels

Talking Heads: Detecting Humans and Recognizing Their Interactions

no code implementations CVPR 2014 Minh Hoai, Andrew Zisserman

The objective of this work is to accurately and efficiently detect configurations of one or more people in edited TV material.
