Search Results for author: Gang Hua

Found 111 papers, 49 papers with code

Topical Video Object Discovery from Key Frames by Modeling Word Co-occurrence Prior

no code implementations • CVPR 2013 • Gangqiang Zhao, Junsong Yuan, Gang Hua

We show that such data driven co-occurrence information from bottom-up can conveniently be incorporated in LDA with a Gaussian Markov prior, which combines top down probabilistic topic modeling with bottom up priors in a unified model.

Object Object Discovery +1

Probabilistic Elastic Matching for Pose Variant Face Verification

no code implementations • CVPR 2013 • Haoxiang Li, Gang Hua, Zhe Lin, Jonathan Brandt, Jianchao Yang

By augmenting each feature with its location, a Gaussian mixture model (GMM) is trained to capture the spatial-appearance distribution of all face images in the training corpus.

Face Recognition Face Verification

Hash-SVM: Scalable Kernel Machines for Large-Scale Visual Classification

no code implementations • CVPR 2014 • Yadong Mu, Gang Hua, Wei Fan, Shih-Fu Chang

This paper presents a novel algorithm which uses compact hash bits to greatly improve the efficiency of non-linear kernel SVM in very large scale visual classification problems.

Classification General Classification

Semi-supervised Relational Topic Model for Weakly Annotated Image Recognition in Social Media

no code implementations • CVPR 2014 • Zhenxing Niu, Gang Hua, Xinbo Gao, Qi Tian

In this way, we can efficiently leverage the loosely related tags and build an intermediate-level representation for a collection of weakly annotated images.

Unsupervised One-Class Learning for Automatic Outlier Removal

no code implementations • CVPR 2014 • Wei Liu, Gang Hua, John R. Smith

Outliers are pervasive in many computer vision and pattern recognition problems.

One-class classifier

Efficient Boosted Exemplar-based Face Detection

no code implementations • CVPR 2014 • Haoxiang Li, Zhe Lin, Jonathan Brandt, Xiaohui Shen, Gang Hua

Despite the fact that face detection has been studied intensively over the past several decades, the problem is still not completely solved.

Face Detection

Hierarchical-PEP Model for Real-World Face Recognition

no code implementations • CVPR 2015 • Haoxiang Li, Gang Hua

We apply the PEP model hierarchically to decompose a face image into face parts at different levels of details to build pose-invariant part-based face representations.

Face Recognition Face Verification

Similarity Learning on an Explicit Polynomial Kernel Feature Map for Person Re-Identification

no code implementations • CVPR 2015 • Dapeng Chen, Zejian Yuan, Gang Hua, Nanning Zheng, Jingdong Wang

We follow the learning-to-rank methodology and learn a similarity function to maximize the difference between the similarity scores of matched and unmatched images for the same person.

Learning-To-Rank Patch Matching +1

A Convolutional Neural Network Cascade for Face Detection

no code implementations • CVPR 2015 • Haoxiang Li, Zhe Lin, Xiaohui Shen, Jonathan Brandt, Gang Hua

To improve localization effectiveness, and reduce the number of candidates at later stages, we introduce a CNN-based calibration stage after each of the detection stages in the cascade.

Face Detection

Multi-Class Multi-Annotator Active Learning With Robust Gaussian Process for Visual Recognition

no code implementations • ICCV 2015 • Chengjiang Long, Gang Hua

Based on the EP approximation inference, a generalized Expectation Maximization (GEM) algorithm is derived to estimate both the parameters for instances and the quality of each individual annotator.

Active Learning Bayesian Inference +2

Learning Discriminative Reconstructions for Unsupervised Outlier Removal

no code implementations • ICCV 2015 • Yan Xia, Xudong Cao, Fang Wen, Gang Hua, Jian Sun

We study the problem of automatically removing outliers from noisy data, with application for removing outlier images from an image collection.

Neural Aggregation Network for Video Face Recognition

no code implementations • CVPR 2017 • Jiaolong Yang, Peiran Ren, Dong-Qing Zhang, Dong Chen, Fang Wen, Hongdong Li, Gang Hua

The network takes a face video or face image set of a person with a variable number of face images as its input, and produces a compact, fixed-dimension feature representation for recognition.

Ranked #7 on Face Verification on IJB-A

Face Recognition Face Verification

Counting Grid Aggregation for Event Retrieval and Recognition

no code implementations • 5 Apr 2016 • Zhanning Gao, Gang Hua, Dongqing Zhang, Jianru Xue, Nanning Zheng

Event retrieval and recognition in a large corpus of videos necessitates a holistic fixed-size visual representation at the video clip level that is comprehensive, compact, and yet discriminative.

Retrieval

Ordinal Regression With Multiple Output CNN for Age Estimation

no code implementations • CVPR 2016 • Zhenxing Niu, Mo Zhou, Le Wang, Xinbo Gao, Gang Hua

To address the non-stationary property of aging patterns, age estimation can be cast as an ordinal regression problem.

Age Estimation Binary Classification +3

A Multi-Level Contextual Model For Person Recognition in Photo Albums

no code implementations • CVPR 2016 • Haoxiang Li, Jonathan Brandt, Zhe Lin, Xiaohui Shen, Gang Hua

Our new framework enables efficient use of these complementary multi-level contextual cues to improve overall recognition rates on the photo album person recognition task, as demonstrated through state-of-the-art results on a challenging public dataset.

Person Recognition

Supervised Transformer Network for Efficient Face Detection

no code implementations • 19 Jul 2016 • Dong Chen, Gang Hua, Fang Wen, Jian Sun

For real-time performance, we run the cascaded network only on regions of interest produced by a boosting cascade face detector.

Ranked #5 on Face Detection on PASCAL Face

Face Detection Region Proposal +1

Revisiting Deep Intrinsic Image Decompositions

no code implementations • CVPR 2018 • Qingnan Fan, Jiaolong Yang, Gang Hua, Baoquan Chen, David Wipf

While invaluable for many computer vision applications, decomposing a natural image into intrinsic reflectance and shading layers represents a challenging, underdetermined inverse problem.

Collaborative Deep Reinforcement Learning for Joint Object Search

no code implementations • CVPR 2017 • Xiangyu Kong, Bo Xin, Yizhou Wang, Gang Hua

We examine the problem of joint top-down active search of multiple objects under interaction, e.g., a person riding a bicycle, cups held by the table, etc.

Active Object Localization Object +5

StyleBank: An Explicit Representation for Neural Image Style Transfer

1 code implementation • CVPR 2017 • Dongdong Chen, Lu Yuan, Jing Liao, Nenghai Yu, Gang Hua

It also enables us to conduct incremental learning to add a new image style by learning a new filter bank while holding the auto-encoder fixed.

Incremental Learning Style Transfer

Coherent Online Video Style Transfer

no code implementations • ICCV 2017 • Dongdong Chen, Jing Liao, Lu Yuan, Nenghai Yu, Gang Hua

Training a feed-forward network for fast neural style transfer of images has proven successful.

Image Stylization Video Style Transfer

CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training

3 code implementations • ICCV 2017 • Jianmin Bao, Dong Chen, Fang Wen, Houqiang Li, Gang Hua

Our approach models an image as a composition of label and latent attributes in a probabilistic model.

Attribute Data Augmentation +4

Visual Attribute Transfer through Deep Image Analogy

5 code implementations • 2 May 2017 • Jing Liao, Yuan Yao, Lu Yuan, Gang Hua, Sing Bing Kang

We propose a new technique for visual attribute transfer across images that may have very different appearance but have perceptually similar semantic structure.

Attribute

1,365

Hidden Talents of the Variational Autoencoder

1 code implementation • 16 Jun 2017 • Bin Dai, Yu Wang, John Aston, Gang Hua, David Wipf

Variational autoencoders (VAE) represent a popular, flexible form of deep generative model that can be stochastically fit to samples from a given random process using an information-theoretic variational bound on the true underlying distribution.

Dimensionality Reduction

ER3: A Unified Framework for Event Retrieval, Recognition and Recounting

no code implementations • CVPR 2017 • Zhanning Gao, Gang Hua, Dong-Qing Zhang, Nebojsa Jojic, Le Wang, Jianru Xue, Nanning Zheng

We develop a unified framework for complex event retrieval, recognition and recounting.

Language Modelling Retrieval

Correlational Gaussian Processes for Cross-Domain Visual Recognition

no code implementations • CVPR 2017 • Chengjiang Long, Gang Hua

A set of correlational tensors is adopted to model the relationship within a single domain as well as across multiple domains.

Gaussian Processes

Order-Preserving Wasserstein Distance for Sequence Matching

no code implementations • CVPR 2017 • Bing Su, Gang Hua

We present a new distance measure between sequences that can tackle local temporal distortion and periodic sequences with arbitrary starting points.

A Generic Deep Architecture for Single Image Reflection Removal and Image Smoothing

1 code implementation • ICCV 2017 • Qingnan Fan, Jiaolong Yang, Gang Hua, Baoquan Chen, David Wipf

This paper proposes a deep neural network structure that exploits edge information in addressing representative low-level vision tasks such as layer separation and image filtering.

image smoothing Reflection Removal +1

109

Fast, Accurate Thin-Structure Obstacle Detection for Autonomous Mobile Robots

no code implementations • 14 Aug 2017 • Chen Zhou, Jiaolong Yang, Chunshui Zhao, Gang Hua

This work is devoted to a task that is indispensable for safety yet was largely overlooked in the past -- detecting obstacles that are of very thin structures, such as wires, cables and tree branches.

Self-Driving Cars Visual Odometry

Hierarchical Multimodal LSTM for Dense Visual-Semantic Embedding

no code implementations • ICCV 2017 • Zhenxing Niu, Mo Zhou, Le Wang, Xinbo Gao, Gang Hua

We address the problem of dense visual-semantic embedding that maps not only full sentences and whole images but also phrases within sentences and salient regions within images into a multimodal embedding space.

Sentence

Understanding and Predicting The Attractiveness of Human Action Shot

no code implementations • 2 Nov 2017 • Bin Dai, Baoyuan Wang, Gang Hua

Selecting attractive photos from a human action shot sequence is quite challenging, because of the subjective nature of the "attractiveness", which is mainly a combined factor of human pose in action and the background.

Semi-supervised FusedGAN for Conditional Image Generation

no code implementations • ECCV 2018 • Navaneeth Bodla, Gang Hua, Rama Chellappa

We achieve this by fusing two generators: one for unconditional image generation, and the other for conditional image generation, where the two partly share a common latent space thereby disentangling the generation.

Attribute Conditional Image Generation +2

Stereoscopic Neural Style Transfer

no code implementations • CVPR 2018 • Dongdong Chen, Lu Yuan, Jing Liao, Nenghai Yu, Gang Hua

This paper presents the first attempt at stereoscopic neural style transfer, which responds to the emerging demand for 3D movies or AR/VR.

Style Transfer

Attention-based Temporal Weighted Convolutional Neural Network for Action Recognition

no code implementations • 19 Mar 2018 • Jinliang Zang, Le Wang, Ziyi Liu, Qilin Zhang, Zhenxing Niu, Gang Hua, Nanning Zheng

Research in human action recognition has accelerated significantly since the introduction of powerful machine learning tools such as Convolutional Neural Networks (CNNs).

Action Recognition Temporal Action Localization

Stacked Cross Attention for Image-Text Matching

6 code implementations • ECCV 2018 • Kuang-Huei Lee, Xi Chen, Gang Hua, Houdong Hu, Xiaodong He

Prior work either simply aggregates the similarity of all possible pairs of regions and words without attending differentially to more and less important words or regions, or uses a multi-step attentional process to capture a limited number of semantic alignments, which is less interpretable.

Ranked #4 on Image Retrieval on PhotoChat

Image Retrieval Image-text matching +5

521

Towards Open-Set Identity Preserving Face Synthesis

no code implementations • CVPR 2018 • Jianmin Bao, Dong Chen, Fang Wen, Houqiang Li, Gang Hua

We then recombine the identity vector and the attribute vector to synthesize a new face of the subject with the extracted attribute.

Attribute Face Generation

Decouple Learning for Parameterized Image Operators

1 code implementation • ECCV 2018 • Qingnan Fan, Dong-Dong Chen, Lu Yuan, Gang Hua, Nenghai Yu, Baoquan Chen

Many different deep networks have been used to approximate, accelerate or improve traditional image operators, such as image smoothing, super-resolution and denoising.

Denoising image smoothing +1

LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks

1 code implementation • ECCV 2018 • Dongqing Zhang, Jiaolong Yang, Dongqiangzi Ye, Gang Hua

Although weight and activation quantization is an effective approach for Deep Neural Network (DNN) compression and has great potential to increase inference speed by leveraging bit operations, there is still a noticeable gap in prediction accuracy between the quantized model and the full-precision model.

Quantization

235

Reinforced Pipeline Optimization: Behaving Optimally with Non-Differentiabilities

no code implementations • 27 Sep 2018 • Aijun Bai, Dongdong Chen, Gang Hua, Lu Yuan

Many machine learning systems are implemented as pipelines.

object-detection Object Detection

Conscious Inference for Object Detection

no code implementations • 27 Sep 2018 • Jiahuan Zhou, Nikolaos Karianakis, Ying Wu, Gang Hua

Current Convolutional Neural Network (CNN)-based object detection models adopt strictly feedforward inference to predict the final detection results.

6D Pose Estimation using RGB Object +2

Gated Context Aggregation Network for Image Dehazing and Deraining

1 code implementation • 21 Nov 2018 • Dongdong Chen, Mingming He, Qingnan Fan, Jing Liao, Liheng Zhang, Dongdong Hou, Lu Yuan, Gang Hua

Image dehazing aims to recover the uncorrupted content from a hazy image.

Ranked #1 on Rain Removal on DID-MDN

Image Dehazing Rain Removal

219

A Compositional Textual Model for Recognition of Imperfect Word Images

no code implementations • 27 Nov 2018 • Wei Tang, John Corring, Ying Wu, Gang Hua

Printed text recognition is an important problem for industrial OCR systems.

Optical Character Recognition (OCR) Printed Text Recognition

A General Decoupled Learning Framework for Parameterized Image Operators

no code implementations • 11 Jul 2019 • Qingnan Fan, Dong-Dong Chen, Lu Yuan, Gang Hua, Nenghai Yu, Baoquan Chen

To overcome this limitation, we propose a new decoupled learning algorithm to learn from the operator parameters to dynamically adjust the weights of a deep network for image operators, denoted as the base network.

Any-Precision Deep Neural Networks

2 code implementations • 17 Nov 2019 • Haichao Yu, Haoxiang Li, Honghui Shi, Thomas S. Huang, Gang Hua

When all layers are set to low-bits, we show that the model achieved accuracy comparable to dedicated models trained at the same precision.

Ladder Loss for Coherent Visual-Semantic Embedding

2 code implementations • 18 Nov 2019 • Mo Zhou, Zhenxing Niu, Le Wang, Zhanning Gao, Qilin Zhang, Gang Hua

For visual-semantic embedding, the existing methods normally treat the relevance between queries and candidates in a bipolar way -- relevant or irrelevant, and all "irrelevant" candidates are uniformly pushed away from the query by an equal margin in the embedding space, regardless of their various proximity to the query.

Retrieval

Calibrated Domain-Invariant Learning for Highly Generalizable Large Scale Re-Identification

1 code implementation • 26 Nov 2019 • Ye Yuan, Wuyang Chen, Tianlong Chen, Yang Yang, Zhou Ren, Zhangyang Wang, Gang Hua

Many real-world applications, such as city-scale traffic monitoring and control, require large-scale re-identification.

Adversarial Ranking Attack and Defense

3 code implementations • ECCV 2020 • Mo Zhou, Zhenxing Niu, Le Wang, Qilin Zhang, Gang Hua

In this paper, we propose two attacks against deep ranking systems, i.e., Candidate Attack and Query Attack, that can raise or lower the rank of chosen candidates by adversarial perturbations.

Adversarial Attack Image Retrieval

SaccadeNet: A Fast and Accurate Object Detector

1 code implementation • CVPR 2020 • Shiyi Lan, Zhou Ren, Yi Wu, Larry S. Davis, Gang Hua

Object detection is an essential step towards holistic scene understanding.

Ranked #194 on Object Detection on COCO test-dev

Object object-detection +2

gDLS*: Generalized Pose-and-Scale Estimation Given Scale and Gravity Priors

no code implementations • CVPR 2020 • Victor Fragoso, Joseph DeGol, Gang Hua

Many real-world applications in augmented reality (AR), 3D mapping, and robotics require both fast and accurate estimation of camera poses and scales from multiple images captured by multiple cameras or a single moving camera.

Few-Shot Open-Set Recognition using Meta-Learning

1 code implementation • CVPR 2020 • Bo Liu, Hao Kang, Haoxiang Li, Gang Hua, Nuno Vasconcelos

It is argued that the classic softmax classifier is a poor solution for open-set recognition, since it tends to overfit on the training classes.

Classification General Classification +3

Improving Person Re-identification with Iterative Impression Aggregation

no code implementations • 21 Sep 2020 • Dengpan Fu, Bo Xin, Jingdong Wang, Dong-Dong Chen, Jianmin Bao, Gang Hua, Houqiang Li

Not only does such a simple method improve the performance of the baseline models, it also achieves performance comparable to the latest advanced re-ranking methods.

Person Re-Identification Re-Ranking

Two-Stream Consensus Network for Weakly-Supervised Temporal Action Localization

no code implementations • ECCV 2020 • Yuanhao Zhai, Le Wang, Wei Tang, Qilin Zhang, Junsong Yuan, Gang Hua

Weakly-supervised Temporal Action Localization (W-TAL) aims to classify and localize all action instances in an untrimmed video under only video-level supervision.

Ranked #12 on Weakly Supervised Action Localization on THUMOS14

Vocal Bursts Valence Prediction Weakly Supervised Action Localization +2

Passport-aware Normalization for Deep Model Protection

1 code implementation • NeurIPS 2020 • Jie Zhang, Dongdong Chen, Jing Liao, Weiming Zhang, Gang Hua, Nenghai Yu

Only when the model IP is suspected to have been stolen is the private passport-aware branch added back for ownership verification.

Model Compression

LG-GAN: Label Guided Adversarial Network for Flexible Targeted Attack of Point Cloud-based Deep Networks

no code implementations • 1 Nov 2020 • Hang Zhou, Dongdong Chen, Jing Liao, Weiming Zhang, Kejiang Chen, Xiaoyi Dong, Kunlin Liu, Gang Hua, Nenghai Yu

To overcome these shortcomings, this paper proposes a novel label guided adversarial network (LG-GAN) for real-time flexible targeted point cloud attack.

Efficient Semantic Image Synthesis via Class-Adaptive Normalization

1 code implementation • 8 Dec 2020 • Zhentao Tan, Dongdong Chen, Qi Chu, Menglei Chai, Jing Liao, Mingming He, Lu Yuan, Gang Hua, Nenghai Yu

Spatially-adaptive normalization (SPADE) has recently been remarkably successful in conditional semantic image synthesis; it modulates the normalized activation with spatially-varying transformations learned from semantic layouts to prevent the semantic information from being washed away.

Image Generation

Practical Order Attack in Deep Ranking

no code implementations • 1 Jan 2021 • Mo Zhou, Le Wang, Zhenxing Niu, Qilin Zhang, Yinghui Xu, Nanning Zheng, Gang Hua

The objective of this paper is to formalize and practically implement a new adversarial attack against deep ranking systems, i.e., the Order Attack, which covertly alters the relative order of a selected set of candidates according to a permutation vector predefined by the attacker, with only limited interference to other unrelated candidates.

Adversarial Attack Image Retrieval

Meta Pairwise Relationship Distillation for Unsupervised Person Re-Identification

1 code implementation • ICCV 2021 • Haoxuanye Ji, Le Wang, Sanping Zhou, Wei Tang, Nanning Zheng, Gang Hua

Unsupervised person re-identification (Re-ID) remains challenging due to the lack of ground-truth labels.

Unsupervised Person Re-Identification

Deep Model Intellectual Property Protection via Deep Watermarking

1 code implementation • 8 Mar 2021 • Jie Zhang, Dongdong Chen, Jing Liao, Weiming Zhang, Huamin Feng, Gang Hua, Nenghai Yu

By jointly training the target model and watermark embedding, the extra barrier can even be absorbed into the target model.

Practical Relative Order Attack in Deep Ranking

2 code implementations • ICCV 2021 • Mo Zhou, Le Wang, Zhenxing Niu, Qilin Zhang, Yinghui Xu, Nanning Zheng, Gang Hua

In this paper, we formulate a new adversarial attack against deep ranking systems, i.e., the Order Attack, which covertly alters the relative order among a selected set of candidates according to an attacker-specified permutation, with limited interference to other unrelated candidates.

Adversarial Attack

Diverse Semantic Image Synthesis via Probability Distribution Modeling

1 code implementation • CVPR 2021 • Zhentao Tan, Menglei Chai, Dongdong Chen, Jing Liao, Qi Chu, Bin Liu, Gang Hua, Nenghai Yu

In this paper, we propose a novel diverse semantic image synthesis framework from the perspective of semantic class distributions, which naturally supports diverse generation at semantic or even instance level.

Ranked #1 on Image-to-Image Translation on ADE20K Labels-to-Photos (LPIPS metric)

Image-to-Image Translation

Beyond Visual Attractiveness: Physically Plausible Single Image HDR Reconstruction for Spherical Panoramas

no code implementations • 24 Mar 2021 • Wei Wei, Li Guan, Yue Liu, Hao Kang, Haoxiang Li, Ying Wu, Gang Hua

By the proposed physical regularization, our method can generate HDRs which are not only visually appealing but also physically plausible.

HDR Reconstruction Single-shot HDR Reconstruction

ACSNet: Action-Context Separation Network for Weakly Supervised Temporal Action Localization

no code implementations • 28 Mar 2021 • Ziyi Liu, Le Wang, Qilin Zhang, Wei Tang, Junsong Yuan, Nanning Zheng, Gang Hua

In this paper, we introduce an Action-Context Separation Network (ACSNet) that explicitly takes into account context for accurate action localization.

Ranked #7 on Weakly Supervised Action Localization on THUMOS’14

Video Polyp Segmentation Weakly Supervised Action Localization +2

Weakly Supervised Temporal Action Localization Through Learning Explicit Subspaces for Action and Context

no code implementations • 30 Mar 2021 • Ziyi Liu, Le Wang, Wei Tang, Junsong Yuan, Nanning Zheng, Gang Hua

To address this challenge, we introduce a framework that learns two feature subspaces respectively for actions and their context.

Action Recognition Weakly-supervised Temporal Action Localization +1

SGCN: Sparse Graph Convolution Network for Pedestrian Trajectory Prediction

3 code implementations • 4 Apr 2021 • Liushuai Shi, Le Wang, Chengjiang Long, Sanping Zhou, Mo Zhou, Zhenxing Niu, Gang Hua

Meanwhile, we use a sparse directed temporal graph to model the motion tendency, thereby facilitating prediction based on the observed direction.

Pedestrian Trajectory Prediction Trajectory Prediction

E2Style: Improve the Efficiency and Effectiveness of StyleGAN Inversion

2 code implementations • 15 Apr 2021 • Tianyi Wei, Dongdong Chen, Wenbo Zhou, Jing Liao, Weiming Zhang, Lu Yuan, Gang Hua, Nenghai Yu

This paper studies the problem of StyleGAN inversion, which plays an essential role in enabling the pretrained StyleGAN to be used for real image editing tasks.

Face Parsing

143

Sparse Pose Trajectory Completion

no code implementations • 1 May 2021 • Bo Liu, Mandar Dixit, Roland Kwitt, Gang Hua, Nuno Vasconcelos

In the absence of dense pose sampling in image space, these latent space trajectories provide cross-modal guidance for learning.

Novel View Synthesis Object

Breadcrumbs: Adversarial Class-Balanced Sampling for Long-tailed Recognition

no code implementations • 1 May 2021 • Bo Liu, Haoxiang Li, Hao Kang, Gang Hua, Nuno Vasconcelos

It is shown that, unlike class-balanced sampling, this is an adversarial augmentation strategy.

GistNet: a Geometric Structure Transfer Network for Long-Tailed Recognition

no code implementations • ICCV 2021 • Bo Liu, Haoxiang Li, Hao Kang, Gang Hua, Nuno Vasconcelos

A new learning algorithm is then proposed for GeometrIc Structure Transfer (GIST), with resort to a combination of loss functions that combine class-balanced and random sampling to guarantee that, while overfitting to the popular classes is restricted to geometric parameters, it is leveraged to transfer class geometry from popular to few-shot classes.

Transfer Learning

Semi-supervised Long-tailed Recognition using Alternate Sampling

no code implementations • 1 May 2021 • Bo Liu, Haoxiang Li, Hao Kang, Nuno Vasconcelos, Gang Hua

A consistency loss has been introduced to limit the impact from unlabeled data while leveraging them to update the feature embedding.

Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking

no code implementations • CVPR 2021 • Yiding Yang, Zhou Ren, Haoxiang Li, Chunluan Zhou, Xinchao Wang, Gang Hua

In this paper, we propose a novel online approach to learning the pose dynamics, which are independent of pose detections in the current frame, and hence may serve as a robust estimation even in challenging scenarios including occlusion.

Multi-Person Pose Estimation Multi-Person Pose Estimation and Tracking +1

Adversarial Attack and Defense in Deep Ranking

1 code implementation • 7 Jun 2021 • Mo Zhou, Le Wang, Zhenxing Niu, Qilin Zhang, Nanning Zheng, Gang Hua

In this paper, we propose two attacks against deep ranking systems, i.e., Candidate Attack and Query Attack, that can raise or lower the rank of chosen candidates by adversarial perturbations.

Adversarial Attack Adversarial Robustness

Video Imprint

no code implementations • 7 Jun 2021 • Zhanning Gao, Le Wang, Nebojsa Jojic, Zhenxing Niu, Nanning Zheng, Gang Hua

In the proposed framework, a dedicated feature alignment module is incorporated for redundancy removal across frames to produce the tensor representation, i.e., the video imprint.

Language Modelling Retrieval

Learning View Selection for 3D Scenes

no code implementations • CVPR 2021 • Yifan Sun, QiXing Huang, Dun-Yu Hsiao, Li Guan, Gang Hua

Efficient 3D space sampling to represent an underlying 3D object/scene is essential for 3D vision, robotics, and beyond.

SGCN: Sparse Graph Convolution Network for Pedestrian Trajectory Prediction

no code implementations • CVPR 2021 • Liushuai Shi, Le Wang, Chengjiang Long, Sanping Zhou, Mo Zhou, Zhenxing Niu, Gang Hua

Specifically, the SGCN explicitly models the sparse directed interaction with a sparse directed spatial graph to capture adaptive interaction pedestrians.

Pedestrian Trajectory Prediction Trajectory Prediction

Enriching Local and Global Contexts for Temporal Action Localization

1 code implementation • ICCV 2021 • Zixin Zhu, Wei Tang, Le Wang, Nanning Zheng, Gang Hua

We explore two existing models to be the P-Net in our experiments.

Action Classification Retrieval +2

Unlimited Neighborhood Interaction for Heterogeneous Trajectory Prediction

1 code implementation • ICCV 2021 • Fang Zheng, Le Wang, Sanping Zhou, Wei Tang, Zhenxing Niu, Nanning Zheng, Gang Hua

Specifically, the proposed unlimited neighborhood interaction module generates the fused-features of all agents involved in an interaction simultaneously, which is adaptive to any number of agents and any range of interaction area.

Graph Attention Trajectory Prediction

Poison Ink: Robust and Invisible Backdoor Attack

1 code implementation • 5 Aug 2021 • Jie Zhang, Dongdong Chen, Qidong Huang, Jing Liao, Weiming Zhang, Huamin Feng, Gang Hua, Nenghai Yu

As the image structure can keep its semantic meaning during data transformation, such a trigger pattern is inherently robust to data transformations.

Backdoor Attack Data Poisoning

Exploring Structure Consistency for Deep Model Watermarking

no code implementations • 5 Aug 2021 • Jie Zhang, Dongdong Chen, Jing Liao, Han Fang, Zehua Ma, Weiming Zhang, Gang Hua, Nenghai Yu

However, little attention has been devoted to the protection of DNNs in image processing tasks.

Data Augmentation

DSSL: Deep Surroundings-person Separation Learning for Text-based Person Retrieval

1 code implementation • 12 Sep 2021 • Aichun Zhu, Zijie Wang, Yifeng Li, Xili Wan, Jing Jin, Tian Wang, Fangqiang Hu, Gang Hua

Many previous methods on text-based person retrieval tasks are devoted to learning a latent common space mapping, with the purpose of extracting modality-invariant features from both visual and textual modality.

Ranked #6 on Text based Person Retrieval on RSTPReid

Person Retrieval Retrieval +2

Weakly-guided Self-supervised Pretraining for Temporal Activity Detection

1 code implementation • 26 Nov 2021 • Kumara Kahatapitiya, Zhou Ren, Haoxiang Li, Zhenyu Wu, Michael S. Ryoo, Gang Hua

However, such pretrained models are not ideal for downstream detection, due to the disparity between the pretraining and the downstream fine-tuning tasks.

Ranked #3 on Action Detection on Charades

Action Detection Activity Detection +2

Robust Pose Estimation in Crowded Scenes with Direct Pose-Level Inference

1 code implementation • NeurIPS 2021 • Dongkai Wang, Shiliang Zhang, Gang Hua

Instead of inferring individual keypoints, the Pose-level Inference Network (PINet) directly infers the complete pose cues for a person from his/her visible body parts.

Multi-Person Pose Estimation

Implicit Autoencoder for Point-Cloud Self-Supervised Representation Learning

1 code implementation • ICCV 2023 • Siming Yan, Zhenpei Yang, Haoxiang Li, Chen Song, Li Guan, Hao Kang, Gang Hua, QiXing Huang

The most popular and accessible 3D representation, i.e., point clouds, involves discrete samples of the underlying continuous 3D surface.

Ranked #5 on 3D Point Cloud Linear Classification on ModelNet40 (using extra training data)

3D Point Cloud Classification 3D Point Cloud Linear Classification +3

Progressive Backdoor Erasing via connecting Backdoor and Adversarial Attacks

no code implementations • CVPR 2023 • Bingxu Mu, Zhenxing Niu, Le Wang, Xue Wang, Rong Jin, Gang Hua

Deep neural networks (DNNs) are known to be vulnerable to both backdoor attacks and adversarial attacks.

backdoor defense

Paper
Add Code

E^2TAD: An Energy-Efficient Tracking-based Action Detector

1 code implementation • 9 Apr 2022 • Xin Hu, Zhenyu Wu, Hao-Yu Miao, Siqi Fan, Taiyu Long, Zhenyu Hu, Pengcheng Pi, Yi Wu, Zhou Ren, Zhangyang Wang, Gang Hua

Video action detection (spatio-temporal action localization) is usually the starting point for human-centric intelligent analysis of videos nowadays.

Fine-Grained Action Detection object-detection +3

Paper
Code

Social Interpretable Tree for Pedestrian Trajectory Prediction

1 code implementation • 26 May 2022 • Liushuai Shi, Le Wang, Chengjiang Long, Sanping Zhou, Fang Zheng, Nanning Zheng, Gang Hua

Understanding the multiple socially-acceptable future behaviors is an essential task for many vision applications.

Pedestrian Trajectory Prediction Trajectory Prediction

Paper
Code

PointCAT: Contrastive Adversarial Training for Robust Point Cloud Recognition

no code implementations • 16 Sep 2022 • Qidong Huang, Xiaoyi Dong, Dongdong Chen, Hang Zhou, Weiming Zhang, Kui Zhang, Gang Hua, Nenghai Yu

Notwithstanding the prominent performance achieved in various applications, point cloud recognition models have often suffered from natural corruptions and adversarial perturbations.

Paper
Add Code

Exploring Discrete Diffusion Models for Image Captioning

1 code implementation • 21 Nov 2022 • Zixin Zhu, Yixuan Wei, JianFeng Wang, Zhe Gan, Zheng Zhang, Le Wang, Gang Hua, Lijuan Wang, Zicheng Liu, Han Hu

The image captioning task is typically realized by an auto-regressive method that decodes the text tokens one by one.

Image Captioning Image Generation

Paper
Code

Boosted Dynamic Neural Networks

1 code implementation • 30 Nov 2022 • Haichao Yu, Haoxiang Li, Gang Hua, Gao Huang, Humphrey Shi

To optimize the model, these prediction heads together with the network backbone are trained on every batch of training data.

Paper
Code

DDM-NET: End-to-end learning of keypoint feature Detection, Description and Matching for 3D localization

1 code implementation • 8 Dec 2022 • Xiangyu Xu, Li Guan, Enrique Dunn, Haoxiang Li, Gang Hua

In this paper, we propose an end-to-end framework that jointly learns keypoint detection, descriptor representation and cross-frame matching for the task of image-based 3D localization.

Keypoint Detection

Paper
Code

Sparse Instance Conditioned Multimodal Trajectory Prediction

no code implementations • ICCV 2023 • Yonghao Dong, Le Wang, Sanping Zhou, Gang Hua

Specifically, SICNet learns comprehensive sparse instances, i.e., representative points of the future trajectory, through a mask generated by a long short-term memory encoder and uses the memory mechanism to store and retrieve such sparse instances.

Future prediction Pedestrian Trajectory Prediction +1

Paper
Add Code

Trajectory Unified Transformer for Pedestrian Trajectory Prediction

1 code implementation • ICCV 2023 • Liushuai Shi, Le Wang, Sanping Zhou, Gang Hua

Pedestrian trajectory prediction is an essential link to understanding human behavior.

Pedestrian Trajectory Prediction Trajectory Prediction

Paper
Code

Learning from Noisy Pseudo Labels for Semi-Supervised Temporal Action Localization

1 code implementation • ICCV 2023 • Kun Xia, Le Wang, Sanping Zhou, Gang Hua, Wei Tang

To this end, we propose a unified framework, termed Noisy Pseudo-Label Learning, to handle both location biases and category errors.

Pseudo Label Temporal Action Localization

Paper
Code

Parallel Attention Interaction Network for Few-Shot Skeleton-Based Action Recognition

no code implementations • ICCV 2023 • Xingyu Liu, Sanping Zhou, Le Wang, Gang Hua

Learning discriminative features from very few labeled samples to identify novel classes has received increasing attention in skeleton-based action recognition.

Action Recognition Skeleton Based Action Recognition

Paper
Add Code

Diversity-Aware Meta Visual Prompting

1 code implementation • CVPR 2023 • Qidong Huang, Xiaoyi Dong, Dongdong Chen, Weiming Zhang, Feifei Wang, Gang Hua, Nenghai Yu

We present Diversity-Aware Meta Visual Prompting (DAM-VP), an efficient and effective prompting method for transferring pre-trained models to downstream tasks with frozen backbone.

Visual Prompting

Paper
Code

MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking

no code implementations • CVPR 2023 • Zheng Qin, Sanping Zhou, Le Wang, Jinghai Duan, Gang Hua, Wei Tang

For dense crowds, we design a novel Interaction Module to learn interaction-aware motions from short-term trajectories, which can estimate the complex movement of each target.

motion prediction Multi-Object Tracking

Paper
Add Code

Regularizing Second-Order Influences for Continual Learning

1 code implementation • CVPR 2023 • Zhicheng Sun, Yadong Mu, Gang Hua

Continual learning aims to learn on non-stationary data streams without catastrophically forgetting previous knowledge.

Continual Learning

Paper
Code

Designing a Better Asymmetric VQGAN for StableDiffusion

2 code implementations • 7 Jun 2023 • Zixin Zhu, Xuelu Feng, Dongdong Chen, Jianmin Bao, Le Wang, Yinpeng Chen, Lu Yuan, Gang Hua

The training cost of our asymmetric VQGAN is cheap, and we only need to retrain a new asymmetric decoder while keeping the vanilla VQGAN encoder and StableDiffusion unchanged.

Image Inpainting

Paper
Code

HQ-50K: A Large-scale, High-quality Dataset for Image Restoration

1 code implementation • 8 Jun 2023 • Qinhong Yang, Dongdong Chen, Zhentao Tan, Qiankun Liu, Qi Chu, Jianmin Bao, Lu Yuan, Gang Hua, Nenghai Yu

This paper introduces a new large-scale image restoration dataset, called HQ-50K, which contains 50,000 high-quality images with rich texture details and semantic diversity.

Denoising Image Restoration +2

Paper
Code

Improving Adversarial Robustness of Masked Autoencoders via Test-time Frequency-domain Prompting

1 code implementation • ICCV 2023 • Qidong Huang, Xiaoyi Dong, Dongdong Chen, Yinpeng Chen, Lu Yuan, Gang Hua, Weiming Zhang, Nenghai Yu

Based on our analysis, we provide a simple yet effective way to boost the adversarial robustness of MAE.

Adversarial Robustness

Paper
Code

SOAR: Scene-debiasing Open-set Action Recognition

1 code implementation • ICCV 2023 • Yuanhao Zhai, Ziyi Liu, Zhenyu Wu, Yi Wu, Chunluan Zhou, David Doermann, Junsong Yuan, Gang Hua

The former prevents the decoder from reconstructing the video background given video features, and thus helps reduce the background information in feature learning.

Open Set Action Recognition Scene Classification

Paper
Code

Flexible Visual Recognition by Evidential Modeling of Confusion and Ignorance

no code implementations • ICCV 2023 • Lei Fan, Bo Liu, Haoxiang Li, Ying Wu, Gang Hua

First, prediction uncertainty should be separately quantified as confusion depicting inter-class uncertainties and ignorance identifying out-of-distribution samples.

Decision Making

Paper
Add Code

HairCLIPv2: Unifying Hair Editing via Proxy Feature Blending

1 code implementation • ICCV 2023 • Tianyi Wei, Dongdong Chen, Wenbo Zhou, Jing Liao, Weiming Zhang, Gang Hua, Nenghai Yu

Even though they can enable very fine-grained local control, such interaction modes are inefficient for the editing conditions that can be easily specified by language descriptions or reference images.

Attribute

Paper
Code

Evidential Active Recognition: Intelligent and Prudent Open-World Embodied Perception

no code implementations • 23 Nov 2023 • Lei Fan, Mingfu Liang, Yunxuan Li, Gang Hua, Ying Wu

Active recognition enables robots to intelligently explore novel observations, thereby acquiring more information while circumventing undesired viewing conditions.

Uncertainty Quantification

Paper
Add Code

Sparse Pedestrian Character Learning for Trajectory Prediction

no code implementations • 27 Nov 2023 • Yonghao Dong, Le Wang, Sanpin Zhou, Gang Hua, Changyin Sun

Specifically, TSNet learns the negative-removed characters in the sparse character representation stream to improve the trajectory embedding obtained in the trajectory representation stream.

Autonomous Driving Pedestrian Trajectory Prediction +1

Paper
Add Code

UGG: Unified Generative Grasping

1 code implementation • 28 Nov 2023 • Jiaxin Lu, Hao Kang, Haoxiang Li, Bo Liu, Yiding Yang, QiXing Huang, Gang Hua

Generation-based methods that generate grasping postures conditioned on the object can often produce diverse grasping, but they are insufficient for high grasping success due to lack of discriminative information.

Grasp Generation Object

Paper
Code

DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision

no code implementations • 26 Dec 2023 • Lu Ling, Yichen Sheng, Zhi Tu, Wentian Zhao, Cheng Xin, Kun Wan, Lantao Yu, Qianyu Guo, Zixun Yu, Yawen Lu, Xuanmao Li, Xingpeng Sun, Rohan Ashok, Aniruddha Mukherjee, Hao Kang, Xiangrui Kong, Gang Hua, Tianyi Zhang, Bedrich Benes, Aniket Bera

We have witnessed significant progress in deep learning-based 3D vision, ranging from neural radiance field (NeRF) based 3D representation learning to applications in novel view synthesis (NVS).

Novel View Synthesis Representation Learning

Paper
Add Code

Jailbreaking Attack against Multimodal Large Language Model

1 code implementation • 4 Feb 2024 • Zhenxing Niu, Haodong Ren, Xinbo Gao, Gang Hua, Rong Jin

This paper focuses on jailbreaking attacks against multi-modal large language models (MLLMs), seeking to elicit MLLMs to generate objectionable responses to harmful user queries.

Language Modelling Large Language Model

Paper
Code

Deployment Prior Injection for Run-time Calibratable Object Detection

no code implementations • 27 Feb 2024 • Mo Zhou, Yiding Yang, Haoxiang Li, Vishal M. Patel, Gang Hua

With a strong alignment between the training and test distributions, object relation as a context prior facilitates object detection.

Object object-detection +1

Paper
Add Code

Recurrent Aligned Network for Generalized Pedestrian Trajectory Prediction

no code implementations • 9 Mar 2024 • Yonghao Dong, Le Wang, Sanping Zhou, Gang Hua, Changyin Sun

Previous studies have tried to tackle this problem by leveraging a portion of the trajectory data from the target domain to adapt the model.

Domain Adaptation Pedestrian Trajectory Prediction +1

Paper
Add Code

Boosting Semi-Supervised Temporal Action Localization by Learning from Non-Target Classes

no code implementations • 17 Mar 2024 • Kun Xia, Le Wang, Sanping Zhou, Gang Hua, Wei Tang

To this end, we first devise innovative strategies to adaptively select high-quality positive and negative classes from the label space, by modeling both the confidence and rank of a class in relation to those of the target class.

Temporal Action Localization

Paper
Add Code

Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation

1 code implementation • 18 Mar 2024 • Zixin Zhu, Xuelu Feng, Dongdong Chen, Junsong Yuan, Chunming Qiao, Gang Hua

We hypothesize that the latent representation learned from a pretrained generative T2V model encapsulates rich semantics and coherent temporal correspondences, thereby naturally facilitating video understanding.

Referring Video Object Segmentation Semantic Segmentation +2

Paper
Code

Transformer based Pluralistic Image Completion with Reduced Information Loss

1 code implementation • 31 Mar 2024 • Qiankun Liu, Yuqi Jiang, Zhentao Tan, Dongdong Chen, Ying Fu, Qi Chu, Gang Hua, Nenghai Yu

The indices of quantized pixels are used as tokens for the inputs and prediction targets of the transformer.

Image Inpainting Quantization

Paper
Code
