Search Results for author: Minh Hoai

Found 57 papers, 25 papers with code

Driver Attention Tracking and Analysis

no code implementations • 10 Apr 2024 • Dat Viet Thanh Nguyen, Anh Tran, Hoai Nam Vu, Cuong Pham, Minh Hoai

This network has a camera calibration module that can compute an embedding vector that represents the spatial configuration between the driver and the camera system.

Camera Calibration

Blur2Blur: Blur Conversion for Unsupervised Image Deblurring on Unknown Domains

no code implementations • 24 Mar 2024 • Bang-Dang Pham, Phong Tran, Anh Tran, Cuong Pham, Rang Nguyen, Minh Hoai

This algorithm works by transforming a blurry input image, which is challenging to deblur, into another blurry image that is more amenable to deblurring.

Deblurring Image Deblurring

HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances

no code implementations • 4 Mar 2024 • Supreeth Narasimhaswamy, Uttaran Bhattacharya, Xiang Chen, Ishita Dasgupta, Saayan Mitra, Minh Hoai

To generate images with realistic hands, we propose a novel diffusion-based architecture called HanDiffuser that achieves realism by injecting hand embeddings in the generative process.

Text-to-Image Generation

Count What You Want: Exemplar Identification and Few-shot Counting of Human Actions in the Wild

1 code implementation • 28 Dec 2023 • Yifeng Huang, Duc Duy Nguyen, Lam Nguyen, Cuong Pham, Minh Hoai

To develop and evaluate our approach, we introduce a diverse and realistic dataset consisting of real-world data from 37 subjects and 50 action categories, encompassing both sensor and audio data.

Density Estimation

Interactive Class-Agnostic Object Counting

no code implementations • ICCV 2023 • Yifeng Huang, Viresh Ranjan, Minh Hoai

The user can provide feedback by selecting a region with obvious counting errors and specifying the range for the estimated number of objects within it.

Object Object Counting

HyperCUT: Video Sequence from a Single Blurry Image using Unsupervised Ordering

1 code implementation • CVPR 2023 • Bang-Dang Pham, Phong Tran, Anh Tran, Cuong Pham, Rang Nguyen, Minh Hoai

We consider the challenging task of training models for image-to-video deblurring, which aims to recover a sequence of sharp images corresponding to a given blurry image input.

Deblurring

Gazeformer: Scalable, Effective and Fast Prediction of Goal-Directed Human Attention

1 code implementation • CVPR 2023 • Sounak Mondal, Zhibo Yang, Seoyoung Ahn, Dimitris Samaras, Gregory Zelinsky, Minh Hoai

In response, we pose a new task called ZeroGaze, a new variant of zero-shot learning where gaze is predicted for never-before-searched objects, and we develop a novel model, Gazeformer, to solve the ZeroGaze problem.

Gaze Prediction Language Modelling +2

Unifying Top-down and Bottom-up Scanpath Prediction Using Transformers

1 code implementation • 16 Mar 2023 • Zhibo Yang, Sounak Mondal, Seoyoung Ahn, Ruoyu Xue, Gregory Zelinsky, Minh Hoai, Dimitris Samaras

Most models of visual attention aim at predicting either top-down or bottom-up control, as studied using different visual search and free-viewing tasks.

Scanpath prediction

Object Detection With Self-Supervised Scene Adaptation

1 code implementation • CVPR 2023 • Zekun Zhang, Minh Hoai

This paper proposes a novel method to improve the performance of a trained object detector on scenes with fixed camera perspectives based on self-supervised adaptation.

Data Augmentation Object +2

Patch-level Gaze Distribution Prediction for Gaze Following

1 code implementation • 20 Nov 2022 • Qiaomu Miao, Minh Hoai, Dimitris Samaras

Gaze following aims to predict where a person is looking in a scene, by predicting the target location, or indicating that the target is located outside the image.

Binary Classification

Self-Supervised Learning with Multi-View Rendering for 3D Point Cloud Analysis

1 code implementation • 28 Oct 2022 • Bach Tran, Binh-Son Hua, Anh Tuan Tran, Minh Hoai

Inspired by the success of deep learning in the image domain, we devise a novel pre-training technique for better model initialization by utilizing the multi-view rendering of the 3D data.

Knowledge Distillation Self-Supervised Learning

Text-Derived Knowledge Helps Vision: A Simple Cross-modal Distillation for Video-based Action Anticipation

1 code implementation • 12 Oct 2022 • Sayontan Ghosh, Tanvi Aggarwal, Minh Hoai, Niranjan Balasubramanian

Anticipating future actions in a video is useful for many autonomous and assistive technologies.

Action Anticipation Transfer Learning

Few-shot Object Counting and Detection

1 code implementation • 22 Jul 2022 • Thanh Nguyen, Chau Pham, Khoi Nguyen, Minh Hoai

We tackle a new task of few-shot object counting and detection.

Ranked #10 on Object Counting on FSC147

Few-Shot Object Detection Object +1

Target-absent Human Attention

1 code implementation • 4 Jul 2022 • Zhibo Yang, Sounak Mondal, Seoyoung Ahn, Gregory Zelinsky, Minh Hoai, Dimitris Samaras

In this paper, we propose the first data-driven computational model that addresses the search-termination problem and predicts the scanpath of search fixations made by people searching for targets that do not appear in images.

Imitation Learning

Exemplar Free Class Agnostic Counting

no code implementations • 27 May 2022 • Viresh Ranjan, Minh Hoai

We tackle the task of Class Agnostic Counting, which aims to count objects in a novel object category at test time without any access to labeled training data for that category.

Density Estimation Region Proposal +1

Forward Propagation, Backward Regression, and Pose Association for Hand Tracking in the Wild

1 code implementation • CVPR 2022 • Mingzhen Huang, Supreeth Narasimhaswamy, Saif Vazir, Haibin Ling, Minh Hoai

The first stage is Forward Propagation, where the features from frame t-1 are propagated to frame t based on previously detected hands and their estimated motion.

Ranked #1 on Multiple Object Tracking on YouTube-Hands (using extra training data)

Multiple Object Tracking regression

Whose Hands Are These? Hand Detection and Hand-Body Association in the Wild

1 code implementation • CVPR 2022 • Supreeth Narasimhaswamy, Thanh Nguyen, Mingzhen Huang, Minh Hoai

We also introduce a new challenging dataset called BodyHands, containing unconstrained images with annotations of hands and their corresponding body locations.

Hand Detection

Vicinal Counting Networks

no code implementations • 29 Sep 2021 • Viresh Ranjan, Minh Hoai

Given an image containing multiple objects of a novel visual category and a few exemplar bounding boxes depicting the visual category of interest, we want to count all instances of the desired visual category in the image.

Crowd Counting Data Augmentation

Toward Realistic Single-View 3D Object Reconstruction with Unsupervised Learning from Multiple Images

1 code implementation • ICCV 2021 • Long-Nhat Ho, Anh Tuan Tran, Quynh Phung, Minh Hoai

In this paper, we eliminate the symmetry requirement with a novel unsupervised algorithm that can learn a 3D reconstruction network from a multi-image dataset.

3D Object Reconstruction 3D Reconstruction +1

Explore Image Deblurring via Encoded Blur Kernel Space

1 code implementation • CVPR 2021 • Phong Tran, Anh Tuan Tran, Quynh Phung, Minh Hoai

This paper introduces a method to encode the blur operators of an arbitrary dataset of sharp-blur image pairs into a blur kernel space.

Blind Image Deblurring Image Deblurring

Dictionary-Guided Scene Text Recognition

1 code implementation • CVPR 2021 • Nguyen Nguyen, Thu Nguyen, Vinh Tran, Minh-Triet Tran, Thanh Duc Ngo, Thien Huu Nguyen, Minh Hoai

Language prior plays an important role in the way humans perceive and recognize text in the wild.

Scene Text Detection Scene Text Recognition +2

Learning To Count Everything

1 code implementation • CVPR 2021 • Viresh Ranjan, Udbhav Sharma, Thu Nguyen, Minh Hoai

We also present a novel adaptation strategy to adapt our network to any novel visual category at test time, using only a few exemplar objects from the novel category.

Ranked #13 on Object Counting on FSC147

Object Counting

Progressive Semantic Segmentation

1 code implementation • CVPR 2021 • Chuong Huynh, Anh Tran, Khoa Luu, Minh Hoai

In this work, we present MagNet, a multi-scale framework that resolves local ambiguity by looking at the image at multiple magnification levels.

Ranked #3 on Land Cover Classification on DeepGlobe

Land Cover Classification Segmentation +1

Lipstick ain't enough: Beyond Color Matching for In-the-Wild Makeup Transfer

1 code implementation • CVPR 2021 • Thao Nguyen, Anh Tran, Minh Hoai

However, existing works overlooked the latter components and confined makeup transfer to color manipulation, focusing only on light makeup styles.

Ranked #1 on Facial Makeup Transfer on CPM-Synt-2

Color Manipulation Facial Makeup Transfer +2

Explore Image Deblurring via Blur Kernel Space

1 code implementation • 1 Apr 2021 • Phong Tran, Anh Tran, Quynh Phung, Minh Hoai

This paper introduces a method to encode the blur operators of an arbitrary dataset of sharp-blur image pairs into a blur kernel space.

Blind Image Deblurring Facial Expression Recognition (FER) +1

FineNet: Frame Interpolation and Enhancement for Face Video Deblurring

no code implementations • 1 Mar 2021 • Phong Tran, Anh Tran, Thao Nguyen, Minh Hoai

The objective of this work is to deblur face videos.

Deblurring

Localization in the Crowd with Topological Constraints

1 code implementation • 23 Dec 2020 • Shahira Abousamra, Minh Hoai, Dimitris Samaras, Chao Chen

Due to various challenges, a localization method is prone to spatial semantic errors, i.e., predicting multiple dots within the same person or collapsing multiple dots in a cluttered region.

Crowd Counting

Structural and Functional Decomposition for Personality Image Captioning in a Communication Game

no code implementations • Findings of the Association for Computational Linguistics 2020 • Thu Nguyen, Duy Phung, Minh Hoai, Thien Huu Nguyen

Personality image captioning (PIC) aims to describe an image with a natural language caption given a personality trait.

Caption Generation Image Captioning +1

Detecting Hands and Recognizing Physical Contact in the Wild

no code implementations • NeurIPS 2020 • Supreeth Narasimhaswamy, Trung Nguyen, Minh Hoai

The first attention mechanism is based on the affinity between the hand and a region enclosing both the hand and the object; it densely pools features from this region to the hand region.

Object

Uncertainty Estimation and Sample Selection for Crowd Counting

1 code implementation • 30 Sep 2020 • Viresh Ranjan, Boyu Wang, Mubarak Shah, Minh Hoai

We present sample selection strategies which make use of the density and uncertainty of predictions from the networks trained on one domain to select the informative images from a target domain of interest to acquire human annotation.

Crowd Counting

Distribution Matching for Crowd Counting

1 code implementation • NeurIPS 2020 • Boyu Wang, Huidong Liu, Dimitris Samaras, Minh Hoai

Existing crowd counting methods need to use a Gaussian to smooth each annotated dot or to estimate the likelihood of every pixel given the annotated point.

Ranked #3 on Crowd Counting on UCF CC 50

Crowd Counting
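
The Gaussian smoothing that the entry above attributes to existing crowd counting methods can be sketched as follows. This is a minimal illustration of the conventional density-map preprocessing, not the method proposed in the listed paper. The function name `gaussian_density_map`, the fixed `sigma`, and the sample dot coordinates are all assumptions made for the example.

```python
import math

def gaussian_density_map(dots, height, width, sigma=4.0):
    """Turn dot annotations (row, col) into a density map.

    Each dot is replaced by a 2D Gaussian normalized over the image,
    so every annotated person contributes a total mass of 1.
    sigma is a hypothetical fixed bandwidth chosen for illustration.
    """
    density = [[0.0] * width for _ in range(height)]
    for r0, c0 in dots:
        # Unnormalized Gaussian centered on the annotated dot.
        kernel = [
            [math.exp(-((r - r0) ** 2 + (c - c0) ** 2) / (2.0 * sigma ** 2))
             for c in range(width)]
            for r in range(height)
        ]
        total = sum(map(sum, kernel))
        # Normalize so this dot adds exactly one unit of mass to the map.
        for r in range(height):
            for c in range(width):
                density[r][c] += kernel[r][c] / total
    return density

# Three hypothetical annotated heads in a 64 x 64 image.
dots = [(20, 30), (22, 33), (50, 10)]
density = gaussian_density_map(dots, height=64, width=64)

# Summing the density map recovers the number of annotated dots.
estimated_count = sum(map(sum, density))
```

Because each kernel is normalized over the image, the sum of the map equals the number of annotated dots, which is what lets a network regress a count by predicting such a map. The choice of `sigma` is the kind of hand-set smoothing parameter the entry's abstract refers to.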

A Study of Human Gaze Behavior During Visual Crowd Counting

no code implementations • 14 Sep 2020 • Raji Annadi, Yupei Chen, Viresh Ranjan, Dimitris Samaras, Gregory Zelinsky, Minh Hoai

Analyzing the collected gaze behavior of ten human participants on thirty crowd images, we observe some common approaches for visual counting.

Crowd Counting

Interactive Visual Study of Multiple Attributes Learning Model of X-Ray Scattering Images

no code implementations • 3 Sep 2020 • Xinyi Huang, Suphanut Jamonnak, Ye Zhao, Boyu Wang, Minh Hoai, Kevin Yager, Wei Xu

Existing interactive visualization tools for deep learning are mostly applied to the training, debugging, and refinement of neural network models working on natural images.

Image Classification

Predicting Goal-directed Human Attention Using Inverse Reinforcement Learning

2 code implementations • CVPR 2020 • Zhibo Yang, Lihan Huang, Yupei Chen, Zijun Wei, Seoyoung Ahn, Gregory Zelinsky, Dimitris Samaras, Minh Hoai

These maps were learned by IRL and then used to predict behavioral scanpaths for multiple target categories.

Object reinforcement-learning +1

Predicting Goal-directed Attention Control Using Inverse-Reinforcement Learning

no code implementations • 31 Jan 2020 • Gregory J. Zelinsky, Yupei Chen, Seoyoung Ahn, Hossein Adeli, Zhibo Yang, Lihan Huang, Dimitrios Samaras, Minh Hoai

Using machine learning and the psychologically-meaningful principle of reward, it is possible to learn the visual features used in goal-directed attention control.

BIG-bench Machine Learning reinforcement-learning +1

Visual Understanding of Multiple Attributes Learning Model of X-Ray Scattering Images

no code implementations • 10 Oct 2019 • Xinyi Huang, Suphanut Jamonnak, Ye Zhao, Boyu Wang, Minh Hoai, Kevin Yager, Wei Xu

This extended abstract presents a visualization system designed to help domain scientists visually understand their deep learning model for extracting multiple attributes from x-ray scattering images.

Attentive Action and Context Factorization

no code implementations • 10 Apr 2019 • Yang Wang, Vinh Tran, Gedas Bertasius, Lorenzo Torresani, Minh Hoai

This is a challenging task due to the subtlety of human actions in video and the co-occurrence of contextual elements.

Action Recognition Temporal Action Localization

Contextual Attention for Hand Detection in the Wild

1 code implementation • ICCV 2019 • Supreeth Narasimhaswamy, Zhengwei Wei, Yang Wang, Justin Zhang, Minh Hoai

We also conduct ablation studies on hand detection to show the effectiveness of the proposed contextual attention module.

Hand Detection object-detection +1

Knowledge Distillation for Human Action Anticipation

no code implementations • 9 Apr 2019 • Vinh Tran, Yang Wang, Minh Hoai

In this paper, we propose a novel knowledge distillation framework that uses an action recognition network to supervise the training of an action anticipation network, guiding the latter to attend to the relevant information needed for correctly anticipating the future actions.

Action Anticipation Action Recognition +3

BusyHands: A Hand-Tool Interaction Database for Assembly Tasks Semantic Segmentation

no code implementations • 19 Feb 2019 • Roy Shilkrot, Zhi Chai, Minh Hoai

Visual segmentation has seen tremendous advancement recently with ready solutions for a wide variety of scene types, including human hands and other body parts.

Segmentation Semantic Segmentation

GIF2Video: Color Dequantization and Temporal Interpolation of GIF images

no code implementations • CVPR 2019 • Yang Wang, Haibin Huang, Chuan Wang, Tong He, Jue Wang, Minh Hoai

In this paper, we propose GIF2Video, the first learning-based method for enhancing the visual quality of GIFs in the wild.

Quantization

Fake Sentence Detection as a Training Task for Sentence Encoding

no code implementations • ICLR 2019 • Viresh Ranjan, Heeyoung Kwon, Niranjan Balasubramanian, Minh Hoai

We automatically generate fake sentences by corrupting original sentences from a source collection and train the encoders to produce representations that are effective at detecting fake sentences.

Binary Classification Language Modelling +1

Iterative Crowd Counting

no code implementations • ECCV 2018 • Viresh Ranjan, Hieu Le, Minh Hoai

In this work, we tackle the problem of crowd counting in images.

Ranked #10 on Crowd Counting on UCF CC 50

Crowd Counting Density Estimation

Good View Hunting: Learning Photo Composition From Dense View Pairs

no code implementations • CVPR 2018 • Zijun Wei, Jianming Zhang, Xiaohui Shen, Zhe Lin, Radomír Mech, Minh Hoai, Dimitris Samaras

Finding views with good photo composition is a challenging task for machine learning methods.

Image Cropping Transfer Learning

Pulling Actions out of Context: Explicit Separation for Effective Combination

no code implementations • CVPR 2018 • Yang Wang, Minh Hoai

The ability to recognize human actions in video has many potential applications.

Action Recognition Temporal Action Localization

A+D Net: Training a Shadow Detector with Adversarial Shadow Attenuation

1 code implementation • ECCV 2018 • Hieu Le, Tomas F. Yago Vicente, Vu Nguyen, Minh Hoai, Dimitris Samaras

The A-Net modifies the original training images constrained by a simplified physical shadow model and is focused on fooling the D-Net's shadow predictions.

Ranked #4 on Shadow Detection on SBU

Detecting Shadows Shadow Detection

Shadow Detection With Conditional Generative Adversarial Networks

no code implementations • ICCV 2017 • Vu Nguyen, Tomas F. Yago Vicente, Maozheng Zhao, Minh Hoai, Dimitris Samaras

We introduce scGAN, a novel extension of conditional Generative Adversarial Networks (GAN) tailored for the challenging problem of shadow detection in images.

Ranked #6 on RGB Salient Object Detection on ISTD

Shadow Detection

Eigen Evolution Pooling for Human Action Recognition

no code implementations • 17 Aug 2017 • Yang Wang, Vinh Tran, Minh Hoai

We introduce Eigen Evolution Pooling, an efficient method to aggregate a sequence of feature vectors.

Action Recognition Temporal Action Localization

Evolution-Preserving Dense Trajectory Descriptors

no code implementations • 14 Feb 2017 • Yang Wang, Vinh Tran, Minh Hoai

Recently, Trajectory-pooled Deep-learning Descriptors were shown to achieve state-of-the-art human action recognition results on a number of datasets.

Action Recognition Temporal Action Localization

X-ray Scattering Image Classification Using Deep Learning

no code implementations • 10 Nov 2016 • Boyu Wang, Kevin Yager, Dantong Yu, Minh Hoai

In this paper, we explore the use of deep learning to develop methods for automatically analyzing x-ray scattering images.

Classification General Classification +1

Noisy Label Recovery for Shadow Detection in Unfamiliar Domains

no code implementations • CVPR 2016 • Tomas F. Yago Vicente, Minh Hoai, Dimitris Samaras

However, shadow detection on broader image domains is still challenging due to the lack of annotated training data.

Detecting Shadows Shadow Detection

Region Ranking SVM for Image Classification

no code implementations • CVPR 2016 • Zijun Wei, Minh Hoai

RRSVM exploits the correlation of local regions in an image, and it jointly learns a region evaluation function and a scheme for integrating multiple regions.

Classification General Classification +1