Search Results for author: Qingming Huang

Found 70 papers, 32 papers with code

Interpretable Visual Reasoning via Probabilistic Formulation under Natural Supervision

no code implementations ECCV 2020 Xinzhe Han, Shuhui Wang, Chi Su, Weigang Zhang, Qingming Huang, Qi Tian

In this paper, we rethink implicit reasoning process in VQA, and propose a new formulation which maximizes the log-likelihood of joint distribution for the observed question and predicted answer.

Question Answering Visual Question Answering +1

Weakly-Supervised Crowd Counting Learns from Sorting rather than Locations

no code implementations ECCV 2020 Yifan Yang, Guorong Li, Zhe Wu, Li Su, Qingming Huang, Nicu Sebe

We propose a soft-label sorting network along with the counting network, which sorts the given images by their crowd numbers.

Crowd Counting

When False Positive is Intolerant: End-to-End Optimization with Low FPR for Multipartite Ranking

no code implementations NeurIPS 2021 Peisong Wen, Qianqian Xu, Zhiyong Yang, Yuan He, Qingming Huang

To leverage high performance under low FPRs, we consider an alternative metric for multipartite ranking evaluating the True Positive Rate (TPR) at a given FPR, denoted as TPR@FPR.

Hierarchical Modular Network for Video Captioning

no code implementations 24 Nov 2021 Hanhua Ye, Guorong Li, Yuankai Qi, Shuhui Wang, Qingming Huang, Ming-Hsuan Yang

(II) Predicate level, which learns the actions conditioned on highlighted objects and is supervised by the predicate in captions.

Representation Learning Video Captioning

Modeling Temporal Concept Receptive Field Dynamically for Untrimmed Video Analysis

1 code implementation 23 Nov 2021 Zhaobo Qi, Shuhui Wang, Chi Su, Li Su, Weigang Zhang, Qingming Huang

Based on TDC, we propose the temporal dynamic concept modeling network (TDCMN) to learn an accurate and complete concept representation for efficient untrimmed video analysis.

Image Categorization

DVCFlow: Modeling Information Flow Towards Human-like Video Captioning

no code implementations 19 Nov 2021 Xu Yan, Zhengcong Fei, Shuhui Wang, Qingming Huang, Qi Tian

Dense video captioning (DVC) aims to generate multi-sentence descriptions to elucidate the multiple events in the video, which is challenging and demands visual consistency, discoursal coherence, and linguistic diversity.

Dense Video Captioning

Semi-Autoregressive Image Captioning

1 code implementation 11 Oct 2021 Xu Yan, Zhengcong Fei, Zekang Li, Shuhui Wang, Qingming Huang, Qi Tian

Non-autoregressive image captioning with continuous iterative refinement, which eliminates the sequential dependence in a sentence generation, can achieve comparable performance to the autoregressive counterparts with a considerable acceleration.

Image Captioning

Edge-featured Graph Neural Architecture Search

no code implementations 3 Sep 2021 Shaofei Cai, Liang Li, Xinzhe Han, Zheng-Jun Zha, Qingming Huang

Recently, researchers study neural architecture search (NAS) to reduce the dependence of human expertise and explore better GNN architectures, but they over-emphasize entity features and ignore latent relation information concealed in the edges.

Neural Architecture Search

Learning with Multiclass AUC: Theory and Algorithms

no code implementations 28 Jul 2021 Zhiyong Yang, Qianqian Xu, Shilong Bao, Xiaochun Cao, Qingming Huang

Our foundation is based on the M metric, which is a well-known multiclass extension of AUC.

Recommendation Systems

Greedy Gradient Ensemble for Robust Visual Question Answering

1 code implementation ICCV 2021 Xinzhe Han, Shuhui Wang, Chi Su, Qingming Huang, Qi Tian

Language bias is a critical issue in Visual Question Answering (VQA), where models often exploit dataset biases for the final decision without considering the image information.

Question Answering Visual Question Answering

Fast Batch Nuclear-norm Maximization and Minimization for Robust Domain Adaptation

1 code implementation 13 Jul 2021 Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian

Due to the domain discrepancy in visual domain adaptation, the performance of source model degrades when bumping into the high data density near decision boundary in target domain.

Domain Adaptation

Poisoning Attack against Estimating from Pairwise Comparisons

1 code implementation 5 Jul 2021 Ke Ma, Qianqian Xu, Jinshan Zeng, Xiaochun Cao, Qingming Huang

In this paper, to the best of our knowledge, we initiate the first systematic investigation of data poisoning attacks on pairwise ranking algorithms, which can be formalized as the dynamic and static games between the ranker and the attacker and can be modeled as certain kinds of integer programming problems.

Data Poisoning

Location-Sensitive Visual Recognition with Cross-IOU Loss

1 code implementation 11 Apr 2021 Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian

Object detection, instance segmentation, and pose estimation are popular visual recognition tasks which require localizing the object by internal or boundary landmarks.

2D Object Detection Instance Segmentation +2

Rethinking Graph Neural Architecture Search from Message-passing

1 code implementation CVPR 2021 Shaofei Cai, Liang Li, Jincan Deng, Beichen Zhang, Zheng-Jun Zha, Li Su, Qingming Huang

Inspired by the strong searching capability of neural architecture search (NAS) in CNN, this paper proposes Graph Neural Architecture Search (GNAS) with novel-designed search space.

Feature Selection Neural Architecture Search

Seeking the Shape of Sound: An Adaptive Framework for Learning Voice-Face Association

1 code implementation CVPR 2021 Peisong Wen, Qianqian Xu, Yangbangyan Jiang, Zhiyong Yang, Yuan He, Qingming Huang

Targeting at (a), we propose a two-level modality alignment loss where both global and local information are considered.

Exploiting Sample Correlation for Crowd Counting With Multi-Expert Network

no code implementations ICCV 2021 Xinyan Liu, Guorong Li, Zhenjun Han, Weigang Zhang, Yifan Yang, Qingming Huang, Nicu Sebe

Specifically, we propose a task-driven similarity metric based on sample's mutual enhancement, referred as co-fine-tune similarity, which can find a more efficient subset of data for training the expert network.

Crowd Counting Fine-tuning

Heuristic Domain Adaptation

1 code implementation NeurIPS 2020 Shuhao Cui, Xuan Jin, Shuhui Wang, Yuan He, Qingming Huang

In visual domain adaptation (DA), separating the domain-specific characteristics from the domain-invariant representations is an ill-posed problem.

Domain Adaptation

Semantic Editing On Segmentation Map Via Multi-Expansion Loss

no code implementations 16 Oct 2020 Jianfeng He, Xuchao Zhang, Shuo Lei, Shuhui Wang, Qingming Huang, Chang-Tien Lu, Bei Xiao

Each MEx area has the mask area of the generation as the majority and the boundary of original context as the minority.

Image Inpainting

Label Decoupling Framework for Salient Object Detection

1 code implementation CVPR 2020 Jun Wei, Shuhui Wang, Zhe Wu, Chi Su, Qingming Huang, Qi Tian

Though remarkable progress has been achieved, we observe that the closer the pixel is to the edge, the more difficult it is to be predicted, because edge pixels have a very imbalance distribution.

Ranked #1 on Salient Object Detection on DUTS-TE (MAE metric)

RGB Salient Object Detection Saliency Detection +1

Corner Proposal Network for Anchor-free, Two-stage Object Detection

1 code implementation ECCV 2020 Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian

On the MS-COCO dataset, CPN achieves an AP of 49.2%, which is competitive among state-of-the-art object detection methods.

Object Detection

Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction

no code implementations 29 Apr 2020 Zhiyong Yang, Qianqian Xu, Xiaochun Cao, Qingming Huang

To this end, we propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).

Multi-Task Learning

State-Relabeling Adversarial Active Learning

1 code implementation CVPR 2020 Beichen Zhang, Liang Li, Shijie Yang, Shuhui Wang, Zheng-Jun Zha, Qingming Huang

In this paper, we propose a state relabeling adversarial active learning model (SRAAL), that leverages both the annotation and the labeled/unlabeled state information for deriving the most informative unlabeled samples.

Active Learning

Gradually Vanishing Bridge for Adversarial Domain Adaptation

1 code implementation CVPR 2020 Shuhao Cui, Shuhui Wang, Junbao Zhuo, Chi Su, Qingming Huang, Qi Tian

On the discriminator, GVB contributes to enhance the discriminating ability, and balance the adversarial training process.

Unsupervised Domain Adaptation

Towards Discriminability and Diversity: Batch Nuclear-norm Maximization under Label Insufficient Situations

1 code implementation CVPR 2020 Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian

We find by theoretical analysis that the prediction discriminability and diversity could be separately measured by the Frobenius-norm and rank of the batch output matrix.

Domain Adaptation

DPANet: Depth Potentiality-Aware Gated Attention Network for RGB-D Salient Object Detection

1 code implementation 19 Mar 2020 Zuyao Chen, Runmin Cong, Qianqian Xu, Qingming Huang

There are two main issues in RGB-D salient object detection: (1) how to effectively integrate the complementarity from the cross-modal RGB-D data; (2) how to prevent the contamination effect from the unreliable depth map.

RGB-D Salient Object Detection RGB Salient Object Detection +1

Global Context-Aware Progressive Aggregation Network for Salient Object Detection

2 code implementations 2 Mar 2020 Zuyao Chen, Qianqian Xu, Runmin Cong, Qingming Huang

Deep convolutional neural networks have achieved competitive performance in salient object detection, in which how to learn effective and comprehensive features plays a critical role.

RGB Salient Object Detection Salient Object Detection

DM2C: Deep Mixed-Modal Clustering

1 code implementation NeurIPS 2019 Yangbangyan Jiang, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang

Instead of transforming all the samples into a joint modality-independent space, our framework learns the mappings across individual modal spaces by virtue of cycle-consistency.

Generalized Block-Diagonal Structure Pursuit: Learning Soft Latent Task Assignment against Negative Transfer

1 code implementation NeurIPS 2019 Zhiyong Yang, Qianqian Xu, Yangbangyan Jiang, Xiaochun Cao, Qingming Huang

Different from most of the previous work, pursuing the Block-Diagonal structure of LTAM (assigning latent tasks to output tasks) alleviates negative transfer via collaboratively grouping latent tasks and output tasks such that inter-group knowledge transfer and sharing is suppressed.

Multi-Task Learning

F3Net: Fusion, Feedback and Focus for Salient Object Detection

4 code implementations 26 Nov 2019 Jun Wei, Shuhui Wang, Qingming Huang

Furthermore, different from binary cross entropy, the proposed PPA loss doesn't treat pixels equally, which can synthesize the local structure information of a pixel to guide the network to focus more on local details.

RGB Salient Object Detection Salient Object Detection

iSplit LBI: Individualized Partial Ranking with Ties via Split LBI

1 code implementation NeurIPS 2019 Qianqian Xu, Xinwei Sun, Zhiyong Yang, Xiaochun Cao, Qingming Huang, Yuan YAO

In this paper, instead of learning a global ranking which is agreed with the consensus, we pursue the tie-aware partial ranking from an individualized perspective.

Knowledge-guided Pairwise Reconstruction Network for Weakly Supervised Referring Expression Grounding

1 code implementation 5 Sep 2019 Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Li Su, Qingming Huang

Weakly supervised referring expression grounding (REG) aims at localizing the referential entity in an image according to linguistic query, where the mapping between the image region (proposal) and the query is unknown in the training stage.

Region Proposal

Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding

1 code implementation ICCV 2019 Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Dechao Meng, Qingming Huang

It builds the correspondence between image region proposal and query in an adaptive manner: adaptive grounding and collaborative reconstruction.

Region Proposal

Learning Personalized Attribute Preference via Multi-task AUC Optimization

no code implementations 18 Jun 2019 Zhiyong Yang, Qianqian Xu, Xiaochun Cao, Qingming Huang

Traditionally, most of the existing attribute learning methods are trained based on the consensus of annotations aggregated from a limited number of annotators.

Multimodal Transformer with Multi-View Visual Representation for Image Captioning

no code implementations 20 May 2019 Jun Yu, Jing Li, Zhou Yu, Qingming Huang

Despite the success of existing studies, current methods only model the co-attention that characterizes the inter-modal interactions while neglecting the self-attention that characterizes the intra-modal interactions.

Image Captioning Machine Translation

Unsupervised Open Domain Recognition by Semantic Discrepancy Minimization

1 code implementation CVPR 2019 Junbao Zhuo, Shuhui Wang, Shuhao Cui, Qingming Huang

We address the unsupervised open domain recognition (UODR) problem, where categories in labeled source domain S is only a subset of those in unlabeled target domain T. The task is to correctly classify all samples in T including known and unknown categories.

Classification General Classification

CenterNet: Keypoint Triplets for Object Detection

10 code implementations ICCV 2019 Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian

In object detection, keypoint-based approaches often suffer a large number of incorrect object bounding boxes, arguably due to the lack of an additional look into the cropped regions.

Object Detection

Spatiotemporal CNN for Video Object Segmentation

1 code implementation CVPR 2019 Kai Xu, Longyin Wen, Guorong Li, Liefeng Bo, Qingming Huang

Specifically, the temporal coherence branch pretrained in an adversarial fashion from unlabeled video data, is designed to capture the dynamic appearance and motion cues of video sequences to guide object segmentation.

Semantic Segmentation Semi-Supervised Video Object Segmentation +3

Deep Robust Subjective Visual Property Prediction in Crowdsourcing

no code implementations CVPR 2019 Qianqian Xu, Zhiyong Yang, Yangbangyan Jiang, Xiaochun Cao, Qingming Huang, Yuan YAO

The problem of estimating subjective visual properties (SVP) of images (e.g., Shoes A is more comfortable than B) is gaining rising attention.

HSCS: Hierarchical Sparsity Based Co-saliency Detection for RGBD Images

no code implementations 16 Nov 2018 Runmin Cong, Jianjun Lei, Huazhu Fu, Qingming Huang, Xiaochun Cao, Nam Ling

In this paper, we propose a novel co-saliency detection method for RGBD images based on hierarchical sparsity reconstruction and energy function refinement.

Co-Salient Object Detection

Person Re-Identification by Semantic Region Representation and Topology Constraint

no code implementations 20 Aug 2018 Jianjun Lei, Lijie Niu, Huazhu Fu, Bo Peng, Qingming Huang, Chunping Hou

In this paper, we propose a novel person re-identification method, which consists of a reliable representation called Semantic Region Representation (SRR), and an effective metric learning with Mapping Space Topology Constraint (MSTC).

Metric Learning Person Re-Identification

Weakly Supervised Bilinear Attention Network for Fine-Grained Visual Classification

no code implementations 6 Aug 2018 Tao Hu, Jizheng Xu, Cong Huang, Honggang Qi, Qingming Huang, Yan Lu

Besides, we propose attention regularization and attention dropout to weakly supervise the generating process of attention maps.

Classification Fine-Grained Image Classification +1

A Margin-based MLE for Crowdsourced Partial Ranking

no code implementations 29 Jul 2018 Qianqian Xu, Jiechao Xiong, Xinwei Sun, Zhiyong Yang, Xiaochun Cao, Qingming Huang, Yuan YAO

A preference order or ranking aggregated from pairwise comparison data is commonly understood as a strict total order.

RAM: A Region-Aware Deep Model for Vehicle Re-Identification

no code implementations 25 Jun 2018 Xiaobin Liu, Shiliang Zhang, Qingming Huang, Wen Gao

Specifically, in addition to extracting global features, RAM also extracts features from a series of local regions.

Vehicle Re-Identification

The Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking

no code implementations ECCV 2018 Dawei Du, Yuankai Qi, Hongyang Yu, Yifan Yang, Kaiwen Duan, Guorong Li, Weigang Zhang, Qingming Huang, Qi Tian

Selected from 10 hours raw videos, about 80,000 representative frames are fully annotated with bounding boxes as well as up to 14 kinds of attributes (e.g., weather condition, flying altitude, camera view, vehicle category, and occlusion) for three fundamental computer vision tasks: object detection, single object tracking, and multiple object tracking.

Multiple Object Tracking Object Detection +1

Facial Landmarks Detection by Self-Iterative Regression based Landmarks-Attention Network

no code implementations 18 Mar 2018 Tao Hu, Honggang Qi, Jizheng Xu, Qingming Huang

Only one self-iterative regressor is trained to learn the descent directions for samples from coarse stages to fine stages, and parameters are iteratively updated by the same regressor.

Face Alignment

Review of Visual Saliency Detection with Comprehensive Information

no code implementations 9 Mar 2018 Runmin Cong, Jianjun Lei, Huazhu Fu, Ming-Ming Cheng, Weisi Lin, Qingming Huang

With the acquisition technology development, more comprehensive information, such as depth cue, inter-image correspondence, or temporal relationship, is available to extend image saliency detection to RGBD saliency detection, co-saliency detection, or video saliency detection.

Co-Salient Object Detection Video Saliency Detection

From Social to Individuals: a Parsimonious Path of Multi-level Models for Crowdsourced Preference Aggregation

no code implementations 8 Mar 2018 Qianqian Xu, Jiechao Xiong, Xiaochun Cao, Qingming Huang, Yuan YAO

In crowdsourced preference aggregation, it is often assumed that all the annotators are subject to a common preference or social utility function which generates their comparison behaviors in experiments.

Towards Realistic Face Photo-Sketch Synthesis via Composition-Aided GANs

2 code implementations 4 Dec 2017 Jun Yu, Xingxin Xu, Fei Gao, Shengjie Shi, Meng Wang, DaCheng Tao, Qingming Huang

Experimental results show that our method is capable of generating both visually comfortable and identity-preserving face sketches/photos over a wide range of challenging data.

Ranked #1 on Face Sketch Synthesis on CUFS (FID metric)

Face Sketch Synthesis

From Common to Special: When Multi-Attribute Learning Meets Personalized Opinions

no code implementations 18 Nov 2017 Zhiyong Yang, Qianqian Xu, Xiaochun Cao, Qingming Huang

However, both categories ignore the joint effect of the two mentioned factors: the personal diversity with respect to the global consensus; and the intrinsic correlation among multiple attributes.

Feature Selection

HodgeRank with Information Maximization for Crowdsourced Pairwise Ranking Aggregation

no code implementations 16 Nov 2017 Qianqian Xu, Jiechao Xiong, Xi Chen, Qingming Huang, Yuan YAO

Recently, crowdsourcing has emerged as an effective paradigm for human-powered large scale problem solving in various domains.

An Iterative Co-Saliency Framework for RGBD Images

no code implementations 4 Nov 2017 Runmin Cong, Jianjun Lei, Huazhu Fu, Weisi Lin, Qingming Huang, Xiaochun Cao, Chunping Hou

In this paper, we propose an iterative RGBD co-saliency framework, which utilizes the existing single saliency maps as the initialization, and generates the final RGBD cosaliency map by using a refinement-cycle model.

Co-Salient Object Detection

Co-saliency Detection for RGBD Images Based on Multi-constraint Feature Matching and Cross Label Propagation

no code implementations 14 Oct 2017 Runmin Cong, Jianjun Lei, Huazhu Fu, Qingming Huang, Xiaochun Cao, Chunping Hou

Different from the most existing co-saliency methods focusing on RGB images, this paper proposes a novel co-saliency detection model for RGBD images, which utilizes the depth information to enhance identification of co-saliency.

Co-Salient Object Detection

Exploring Outliers in Crowdsourced Ranking for QoE

no code implementations 18 Jul 2017 Qianqian Xu, Ming Yan, Chendi Huang, Jiechao Xiong, Qingming Huang, Yuan YAO

Outlier detection is a crucial part of robust evaluation for crowdsourceable assessment of Quality of Experience (QoE) and has attracted much attention in recent years.

Outlier Detection

Online Asymmetric Similarity Learning for Cross-Modal Retrieval

no code implementations CVPR 2017 Yiling Wu, Shuhui Wang, Qingming Huang

In this paper, we propose an online learning method to learn the similarity function between heterogeneous modalities by preserving the relative similarity in the training data, which is modeled as a set of bi-directional hinge loss constraints on the cross-modal training triplets.

Cross-Modal Retrieval Semantic Similarity +1

Hedged Deep Tracking

no code implementations CVPR 2016 Yuankai Qi, Shengping Zhang, Lei Qin, Hongxun Yao, Qingming Huang, Jongwoo Lim, Ming-Hsuan Yang

In recent years, several methods have been developed to utilize hierarchical features learned from a deep convolutional neural network (CNN) for visual tracking.

Visual Tracking

Geometric Hypergraph Learning for Visual Tracking

no code implementations 18 Mar 2016 Dawei Du, Honggang Qi, Longyin Wen, Qi Tian, Qingming Huang, Siwei Lyu

Graph based representation is widely used in visual tracking field by finding correct correspondences between target parts in consecutive frames.

Visual Tracking

Similarity Gaussian Process Latent Variable Model for Multi-Modal Data Analysis

no code implementations ICCV 2015 Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian

Data from real applications involve multiple modalities representing content with the same semantics and deliver rich information from complementary aspects.

Evaluating Visual Properties via Robust HodgeRank

no code implementations 15 Aug 2014 Qianqian Xu, Jiechao Xiong, Xiaochun Cao, Qingming Huang, Yuan YAO

In this paper we study the problem of how to estimate such visual properties from a ranking perspective with the help of the annotators from online crowdsourcing platforms.

Graph Sampling Outlier Detection

Multi-level Discriminative Dictionary Learning towards Hierarchical Visual Categorization

no code implementations CVPR 2013 Li Shen, Shuhui Wang, Gang Sun, Shuqiang Jiang, Qingming Huang

For each internode of the hierarchical category structure, a discriminative dictionary and a set of classification models are learnt for visual categorization, and the dictionaries in different layers are learnt to exploit the discriminative visual properties of different granularity.

Dictionary Learning
