Search Results for author: Huaizu Jiang

Found 27 papers, 12 papers with code

Face Detection with the Faster R-CNN

1 code implementation 10 Jun 2016 Huaizu Jiang, Erik Learned-Miller

The Faster R-CNN has recently demonstrated impressive results on various object detection benchmarks.

Face Detection object-detection +1

In Defense of Grid Features for Visual Question Answering

2 code implementations CVPR 2020 Huaizu Jiang, Ishan Misra, Marcus Rohrbach, Erik Learned-Miller, Xinlei Chen

Popularized as 'bottom-up' attention, bounding box (or region) based visual features have recently surpassed vanilla grid-based convolutional features as the de facto standard for vision and language tasks like visual question answering (VQA).

Image Captioning Question Answering +1

PlanarRecon: Real-time 3D Plane Detection and Reconstruction from Posed Monocular Videos

1 code implementation CVPR 2022 Yiming Xie, Matheus Gadelha, Fengting Yang, Xiaowei Zhou, Huaizu Jiang

We present PlanarRecon -- a novel framework for globally coherent detection and reconstruction of 3D planes from a posed monocular video.

3D Plane Detection

OmniControl: Control Any Joint at Any Time for Human Motion Generation

1 code implementation 12 Oct 2023 Yiming Xie, Varun Jampani, Lei Zhong, Deqing Sun, Huaizu Jiang

We present a novel approach named OmniControl for incorporating flexible spatial control signals into a text-conditioned human motion generation model based on the diffusion process.

DCVNet: Dilated Cost Volume Networks for Fast Optical Flow

1 code implementation 31 Mar 2021 Huaizu Jiang, Erik Learned-Miller

When sampling correspondences to build the cost volume, a large neighborhood radius is required to deal with large displacements, introducing a significant computational burden.

Optical Flow Estimation

SENSE: a Shared Encoder Network for Scene-flow Estimation

1 code implementation ICCV 2019 Huaizu Jiang, Deqing Sun, Varun Jampani, Zhaoyang Lv, Erik Learned-Miller, Jan Kautz

We introduce a compact network for holistic scene flow estimation, called SENSE, which shares common encoder features among four closely-related tasks: optical flow estimation, disparity estimation from stereo, occlusion estimation, and semantic segmentation.

Disparity Estimation Occlusion Estimation +3

RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning

1 code implementation ICLR 2022 Xiaojian Ma, Weili Nie, Zhiding Yu, Huaizu Jiang, Chaowei Xiao, Yuke Zhu, Song-Chun Zhu, Anima Anandkumar

This task remains challenging for current deep learning algorithms since it requires addressing three key technical problems jointly: 1) identifying object entities and their properties, 2) inferring semantic relations between pairs of entities, and 3) generalizing to novel object-relation combinations, i.e., systematic generalization.

Human-Object Interaction Detection Object +5

Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions

1 code implementation CVPR 2022 Huaizu Jiang, Xiaojian Ma, Weili Nie, Zhiding Yu, Yuke Zhu, Song-Chun Zhu, Anima Anandkumar

A significant gap remains between today's visual pattern recognition models and human-level visual cognition, especially when it comes to few-shot learning and compositional reasoning of novel concepts.

Benchmarking Few-Shot Image Classification +5

NeuFlow: Real-time, High-accuracy Optical Flow Estimation on Robots Using Edge Devices

1 code implementation 15 Mar 2024 Zhiyong Zhang, Huaizu Jiang, Hanumant Singh

Given the features of the input images extracted at different spatial resolutions, global matching is employed to estimate an initial optical flow on the 1/16 resolution, capturing large displacement, which is then refined on the 1/8 resolution with lightweight CNN layers for better accuracy.

Activity Recognition Edge-computing +2

Diagnosing Human-object Interaction Detectors

1 code implementation 16 Aug 2023 Fangrui Zhu, Yiming Xie, Weidi Xie, Huaizu Jiang

To address this issue, in this paper, we introduce a diagnosis toolbox to provide detailed quantitative break-down analysis of HOI detection models, inspired by the success of object detection diagnosis toolboxes.

Classification Human-Object Interaction Detection +3

Automatic adaptation of object detectors to new domains using self-training

1 code implementation CVPR 2019 Aruni RoyChowdhury, Prithvijit Chakrabarty, Ashish Singh, SouYoung Jin, Huaizu Jiang, Liangliang Cao, Erik Learned-Miller

Our results demonstrate the usefulness of incorporating hard examples obtained from tracking, the advantage of using soft-labels via distillation loss versus hard-labels, and show promising performance as a simple method for unsupervised domain adaptation of object detectors, with minimal dependence on hyper-parameters.

Knowledge Distillation Pedestrian Detection +1

Salient Object Detection: A Benchmark

no code implementations 5 Jan 2015 Ali Borji, Ming-Ming Cheng, Huaizu Jiang, Jia Li

We extensively compare, qualitatively and quantitatively, 40 state-of-the-art models (28 salient object detection, 10 fixation prediction, 1 objectness, and 1 baseline) over 6 challenging datasets for the purpose of benchmarking salient object detection and segmentation methods.

Benchmarking Object +3

Salient Object Detection: A Survey

no code implementations 18 Nov 2014 Ali Borji, Ming-Ming Cheng, Qibin Hou, Huaizu Jiang, Jia Li

Detecting and segmenting salient objects from natural scenes, often referred to as salient object detection, has attracted great interest in computer vision.

Object object-detection +4

Reasoning about Fine-grained Attribute Phrases using Reference Games

no code implementations ICCV 2017 Jong-Chyi Su, Chenyun Wu, Huaizu Jiang, Subhransu Maji

We collect a large dataset of such phrases by asking annotators to describe several visual differences between a pair of instances within a category.

Attribute Image Retrieval +1

Weakly Supervised Learning for Salient Object Detection

no code implementations 29 Jan 2015 Huaizu Jiang

Given a set of background images and salient object images, we propose a solution toward jointly addressing the salient object existence and detection tasks.

Object object-detection +4

Salient Object Detection: A Discriminative Regional Feature Integration Approach

no code implementations CVPR 2013 Huaizu Jiang, Zejian Yuan, Ming-Ming Cheng, Yihong Gong, Nanning Zheng, Jingdong Wang

Our method, which is based on multi-level image segmentation, utilizes the supervised learning approach to map the regional feature vector to a saliency score.

Image Segmentation Object +4

Unsupervised Hard Example Mining from Videos for Improved Object Detection

no code implementations ECCV 2018 SouYoung Jin, Aruni RoyChowdhury, Huaizu Jiang, Ashish Singh, Aditya Prasad, Deep Chakraborty, Erik Learned-Miller

In this work, we show how large numbers of hard negatives can be obtained automatically by analyzing the output of a trained detector on video sequences.

Face Detection object-detection +2

SportsSloMo: A New Benchmark and Baselines for Human-centric Video Frame Interpolation

no code implementations 31 Aug 2023 Jiaben Chen, Huaizu Jiang

We re-train several state-of-the-art methods on our benchmark, and the results show a decrease in their accuracy compared to other datasets.

Panoptic Segmentation Video Frame Interpolation

Zero-shot Referring Expression Comprehension via Structural Similarity Between Images and Captions

no code implementations 28 Nov 2023 Zeyu Han, Fangrui Zhu, Qianru Lao, Huaizu Jiang

Zero-shot referring expression comprehension aims at localizing bounding boxes in an image corresponding to the provided textual prompts, which requires: (i) a fine-grained disentanglement of complex visual scene and textual context, and (ii) a capacity to understand relationships among disentangled entities.

Disentanglement Referring Expression +2

HOI-Diff: Text-Driven Synthesis of 3D Human-Object Interactions using Diffusion Models

no code implementations 11 Dec 2023 Xiaogang Peng, Yiming Xie, Zizhao Wu, Varun Jampani, Deqing Sun, Huaizu Jiang

We also develop an affordance prediction diffusion model (APDM) to predict the contacting area between the human and object during the interactions driven by the textual prompt.

Human-Object Interaction Detection Object

ODTFormer: Efficient Obstacle Detection and Tracking with Stereo Cameras Based on Transformer

no code implementations 21 Mar 2024 Tianye Ding, Hongyu Li, Huaizu Jiang

In this paper, we propose ODTFormer, a Transformer-based model to address both obstacle detection and tracking problems.

Autonomous Navigation
