no code implementations • ECCV 2020 • Xinzhe Han, Shuhui Wang, Chi Su, Weigang Zhang, Qingming Huang, Qi Tian
In this paper, we rethink the implicit reasoning process in VQA, and propose a new formulation which maximizes the log-likelihood of the joint distribution for the observed question and predicted answer.
no code implementations • ECCV 2020 • Yifan Yang, Guorong Li, Zhe Wu, Li Su, Qingming Huang, Nicu Sebe
We propose a soft-label sorting network along with the counting network, which sorts the given images by their crowd numbers.
no code implementations • 17 Jun 2025 • Siran Dai, Qianqian Xu, Peisong Wen, Yang Liu, Qingming Huang
Image-based cell profiling aims to create informative representations of cell images.
1 code implementation • 14 Jun 2025 • Zonghao Ying, Siyang Wu, Run Hao, Peng Ying, Shixuan Sun, PengYu Chen, Junze Chen, Hao Du, Kaiwen Shen, Shangkun Wu, Jiwei Wei, Shiyuan He, Yang Yang, Xiaohai Xu, Ke Ma, Qianqian Xu, Qingming Huang, Shi Lin, Xun Wang, Changting Lin, Meng Han, Yilei Jiang, Siqi Lai, Yaozhi Zheng, Yifei Song, Xiangyu Yue, Zonglei Jing, Tianyuan Zhang, Zhilei Zhu, Aishan Liu, Jiakai Wang, Siyuan Liang, Xianglong Kong, Hainan Li, Junjie Mu, Haotong Qin, Yue Yu, Lei Chen, Felix Juefei-Xu, Qing Guo, Xinyun Chen, Yew Soon Ong, Xianglong Liu, Dawn Song, Alan Yuille, Philip Torr, DaCheng Tao
Multimodal Large Language Models (MLLMs) have enabled transformative advancements across diverse applications but remain susceptible to safety threats, especially jailbreak attacks that induce harmful outputs.
1 code implementation • 16 May 2025 • Feiran Li, Qianqian Xu, Shilong Bao, Zhiyong Yang, Xiaochun Cao, Qingming Huang
This approach effectively bypasses the knowledge gap between text and image, significantly enhancing erasure efficacy.
1 code implementation • 12 May 2025 • Shixi Qin, Zhiyong Yang, Shilong Bao, Shi Wang, Qianqian Xu, Qingming Huang
This paper focuses on implanting multiple heterogeneous backdoor triggers in bridge-based diffusion models designed for complex and arbitrary input distributions.
1 code implementation • 8 May 2025 • Cong Hua, Qianqian Xu, Zhiyong Yang, Zitai Wang, Shilong Bao, Qingming Huang
This practical challenge has spurred the development of open-world prompt tuning, which demands a unified evaluation of two stages: 1) detecting whether an input belongs to the base or new domain (P1), and 2) classifying the sample into its correct class (P2).
1 code implementation • 7 May 2025 • Guanghui Wang, Zhiyong Yang, Zitai Wang, Shi Wang, Qianqian Xu, Qingming Huang
In contrast, both are too strong in RKLD, causing the student to overly emphasize the target class while ignoring the broader distributional information from the teacher.
no code implementations • 3 May 2025 • Haoming Yang, Ke Ma, Xiaojun Jia, Yingfei Sun, Qianqian Xu, Qingming Huang
Despite the remarkable performance of Large Language Models (LLMs), they remain vulnerable to jailbreak attacks, which can compromise their safety mechanisms.
no code implementations • 3 May 2025 • Sicong Li, Qianqian Xu, Zhiyong Yang, Zitai Wang, Linchao Zhang, Xiaochun Cao, Qingming Huang
Recent methods resorted to long-tail variants of Sharpness-Aware Minimization (SAM), such as ImbSAM and CC-SAM, to improve generalization by flattening the loss landscape.
no code implementations • 2 May 2025 • Gaozheng Pei, Ke Ma, Yingfei Sun, Qianqian Xu, Qingming Huang
Specifically, at each time step during the reverse process, for the amplitude spectrum, we replace the low-frequency components of the estimated image's amplitude spectrum with the corresponding parts of the adversarial image.
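The amplitude-spectrum replacement described in this snippet can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the function names, the `cutoff` parameter, the use of 1D signals instead of images, and the choice to keep the estimated signal's phase unchanged are all assumptions made for illustration, and a naive DFT is used so that the example needs only the standard library.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a sequence of numbers."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def replace_low_freq_amplitude(estimated, adversarial, cutoff):
    """Swap the low-frequency amplitudes of `estimated` for those of `adversarial`.

    Hypothetical helper: bins whose frequency index is <= cutoff take the
    adversarial amplitude; all bins keep the phase of the estimated signal
    (an assumption, since the snippet only describes the amplitude step).
    """
    n = len(estimated)
    est_spec = dft(estimated)
    adv_spec = dft(adversarial)
    mixed = []
    for k in range(n):
        freq = min(k, n - k)  # bins k and n - k carry the same frequency
        amplitude = abs(adv_spec[k]) if freq <= cutoff else abs(est_spec[k])
        phase = cmath.phase(est_spec[k])
        mixed.append(cmath.rect(amplitude, phase))
    # For real inputs the imaginary parts are rounding noise, so drop them.
    return [value.real for value in inverse_dft(mixed)]


estimated = [1.0, 2.0, 3.0, 4.0]
adversarial = [2.0, 2.0, 4.0, 4.0]
print(replace_low_freq_amplitude(estimated, adversarial, cutoff=0))
```

With `cutoff=0` only the constant (DC) component is replaced, so the output is the estimated signal shifted to the adversarial signal's mean, up to floating-point rounding.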
no code implementations • 2 May 2025 • Gaoxiang Cong, Liang Li, Jiadong Pan, Zhedong Zhang, Amin Beheshti, Anton Van Den Hengel, Yuankai Qi, Qingming Huang
Movie Dubbing aims to convert scripts into speeches that align with the given movie clip in both temporal and emotional aspects while preserving the vocal timbre of a given brief reference audio.
no code implementations • 31 Mar 2025 • Mingkai Tian, Guorong Li, Yuankai Qi, Amin Beheshti, Javen Qinfeng Shi, Anton Van Den Hengel, Qingming Huang
Zero-shot video captioning requires that a model generate high-quality captions without human-annotated video-text pairs for training.
1 code implementation • 25 Mar 2025 • Hongcheng Gao, Jiashu Qu, Jingyi Tang, Baolong Bi, Yue Liu, Hongyu Chen, Li Liang, Li Su, Qingming Huang
The hallucination of large multimodal models (LMMs), providing responses that appear correct but are actually incorrect, limits their reliability and applicability.
no code implementations • 22 Mar 2025 • Zhuo Tao, Liang Li, Qi Chen, Yunbin Tu, Zheng-Jun Zha, Ming-Hsuan Yang, Yuankai Qi, Qingming Huang
To address this problem, we propose a new COllaborative Temporal consistEncy Learning (COTEL) framework that leverages the synergy between saliency detection and moment localization to strengthen the video-language alignment.
1 code implementation • CVPR 2025 • Yang Liu, Qianqian Xu, Peisong Wen, Siran Dai, Qingming Huang
For challenge 1), we propose a sandwich sampling strategy that selects two auxiliary frames to reduce reconstruction uncertainty in a two-side-squeezing manner.
1 code implementation • 18 Mar 2025 • Shengping Zhang, Xiaoyu Han, Weigang Zhang, Xiangyuan Lan, Hongxun Yao, Qingming Huang
Finally, we introduce Limb-aware Texture Fusion (LTF) that focuses on generating realistic details in limb regions, where a coarse try-on result is first generated by fusing the warped clothing image with the person image, then limb textures are further fused with the coarse result under limb-aware guidance to refine limb details.
no code implementations • 13 Mar 2025 • Jiaqi Wu, Junbiao Pang, Qingming Huang
We further model the utility of pseudo-labels as long-tailed weights to avoid the open problem of setting the threshold.
no code implementations • 20 Feb 2025 • Dengchao Jin, Jianjun Lei, Bo Peng, Zhaoqing Pan, Nam Ling, Qingming Huang
2D image coding for machines (ICM) has achieved great success in coding efficiency, while less effort has been devoted to stereo image fields.
no code implementations • 10 Feb 2025 • Lv Tang, Jun Zhu, Xinfeng Zhang, Li Zhang, Siwei Ma, Qingming Huang
Furthermore, to enhance the capture of dynamics between frames within a sequence, we implement a dynamic frame-level adjustment (DFA).
1 code implementation • 19 Jan 2025 • Zhipeng Yu, Qianqian Xu, Yangbangyan Jiang, Yingfei Sun, Qingming Huang
Existing noisy label learning methods designed for DML mainly discard suspicious noisy samples, resulting in a waste of the training data.
no code implementations • CVPR 2025 • Zhen Yang, Zhuo Tao, Qi Chen, Liang Li, Yuankai Qi, Anton Van Den Hengel, Qingming Huang
To train such a model, we utilize GPT-4 to build a corresponding high-quality question-aware caption dataset on top of existing KBVQA datasets.
no code implementations • CVPR 2025 • Gaozheng Pei, Shaojie Lyu, Gong Chen, Ke Ma, Qianqian Xu, Yingfei Sun, Qingming Huang
Existing diffusion-based purification methods aim to disrupt adversarial perturbations by introducing a certain amount of noise through a forward diffusion process, followed by a reverse process to recover clean examples.
no code implementations • CVPR 2025 • Yue Wu, Zhaobo Qi, Junshu Sun, YaoWei Wang, Qingming Huang, Shuhui Wang
The development of self-supervised video-language models based on mask learning has significantly advanced downstream video tasks.
no code implementations • 20 Dec 2024 • Jiadong Pan, Hongcheng Gao, Liang Li, Zheng-Jun Zha, Qingming Huang, Jiebo Luo
Experimental results show that by incorporating HGR, images generated by diffusion models achieve both high quality and strong safety, and safe DMs trained through unsupervised methods according to the harmfulness detected by HGR also exhibit good safety performance.
no code implementations • 18 Dec 2024 • Yunbin Tu, Liang Li, Li Su, Qingming Huang
In this paper, guided by the shallow-to-deep principle, we propose a query-centric audio-visual cognition (QUAG) network to construct a reliable multi-modal representation for moment retrieval, segmentation and step-captioning.
1 code implementation • 18 Dec 2024 • Xingyu Lyu, Qianqian Xu, Zhiyong Yang, Shaojie Lyu, Qingming Huang
Our experiments confirm that SSE-SAM is better at escaping saddles on both head and tail classes, and shows performance improvements.
1 code implementation • 17 Dec 2024 • Zhiguang Lu, Qianqian Xu, Shilong Bao, Zhiyong Yang, Qingming Huang
This paper addresses the challenge of Granularity Competition in fine-grained classification tasks, which arises due to the semantic gap between multi-granularity labels.
1 code implementation • 4 Nov 2024 • Shufan Shen, Junshu Sun, Xiangyang Ji, Qingming Huang, Shuhui Wang
In this paper, we propose a method named SNELL (Sparse tuning with kerNELized LoRA) for sparse tuning with low memory usage.
1 code implementation • 31 Oct 2024 • Junshu Sun, Chenxue Yang, Xiangyang Ji, Qingming Huang, Shuhui Wang
With nodes moving in the space, their evolving relations facilitate flexible pathway construction for a dynamic message-passing process.
1 code implementation • 12 Oct 2024 • Ting Yu, Kunhao Fu, Jian Zhang, Qingming Huang, Jun Yu
Long-term Video Question Answering (VideoQA) is a challenging vision-and-language bridging task focusing on semantic understanding of untrimmed long-term videos and diverse free-form questions, simultaneously emphasizing comprehensive cross-modal reasoning to yield precise answers.
no code implementations • 12 Oct 2024 • Ting Yu, Kunhao Fu, Shuhui Wang, Qingming Huang, Jun Yu
Video Question Answering (VideoQA) represents a crucial intersection between video understanding and language processing, requiring both discriminative unimodal comprehension and sophisticated cross-modal interaction for accurate inference.
1 code implementation • 9 Oct 2024 • Benyuan Meng, Qianqian Xu, Zitai Wang, Zhiyong Yang, Xiaochun Cao, Qingming Huang
We locate the cause of content shift as one inherent characteristic of diffusion models, which suggests the broad existence of this phenomenon in diffusion features.
1 code implementation • 4 Oct 2024 • Benyuan Meng, Qianqian Xu, Zitai Wang, Xiaochun Cao, Qingming Huang
To this end, the early study of this field performs a large-scale quantitative comparison of the discriminative ability of the activations.
1 code implementation • 30 Sep 2024 • Boyu Han, Qianqian Xu, Zhiyong Yang, Shilong Bao, Peisong Wen, Yangbangyan Jiang, Qingming Huang
On one hand, AUC optimization in a pixel-level task involves complex coupling across loss terms, with structured inner-image and pairwise inter-image dependencies, complicating theoretical analysis.
no code implementations • 19 Sep 2024 • Junbiao Pang, Anjing Hu, Qingming Huang
A state-of-the-art solution first organizes webpages into a large volume of multi-granularity topic candidates; hot topics are then identified by estimating their interestingness.
1 code implementation • 2 Sep 2024 • Shilong Bao, Qianqian Xu, Zhiyong Yang, Yuan He, Xiaochun Cao, Qingming Huang
Under this setting, the unique user representation might induce preference bias, especially when the item category distribution is imbalanced.
no code implementations • 8 Aug 2024 • Jiaqi Wu, Junbiao Pang, Qingming Huang
This allows DSA to be easily extensible to architecture-agnostic networks for a range of computer vision tasks.
no code implementations • 2 Aug 2024 • Yijia Wang, Qianqian Xu, Yangbangyan Jiang, Siran Dai, Qingming Huang
In recent years, multi-view outlier detection (MVOD) methods have advanced significantly, aiming to identify outliers within multi-view datasets.
no code implementations • 26 Jul 2024 • Junbiao Pang, Qingming Huang
Discovering popular topics from the web faces a sea of noisy webpages which never evolve into popular topics.
1 code implementation • 26 Jul 2024 • Junshu Sun, Shuhui Wang, Chenxue Yang, Qingming Huang
Previous methods of designing optimal pathways are limited by information loss on the input features.
no code implementations • 22 Jul 2024 • Yang Liu, Qianqian Xu, Peisong Wen, Siran Dai, Qingming Huang
For the former challenge, we develop the TopK-Chamfer Similarity and QuadLinear-AP loss to measure and optimize video-level similarities in terms of AP.
no code implementations • 20 Jul 2024 • Beichen Zhang, Liang Li, Zheng-Jun Zha, Jiebo Luo, Qingming Huang
DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator.
1 code implementation • 16 Jul 2024 • Yunbin Tu, Liang Li, Li Su, Chenggang Yan, Qingming Huang
However, most existing methods directly capture the difference between them, which risks obtaining error-prone difference features.
no code implementations • 9 Jul 2024 • Zitai Wang, Qianqian Xu, Zhiyong Yang, Peisong Wen, Yuan He, Xiaochun Cao, Qingming Huang
On the other hand, we establish a sharp generalization bound for the proposed framework based on a novel technique named data-dependent contraction.
no code implementations • 2 Jul 2024 • Ke Ma, Qianqian Xu, Jinshan Zeng, Wei Liu, Xiaochun Cao, Yingfei Sun, Qingming Huang
Since it is independent of rank aggregation and lacks effective protection mechanisms, we disrupt the data collection process by fabricating pairwise comparisons without knowledge of the future data or the true distribution.
1 code implementation • 31 May 2024 • Yunbin Tu, Liang Li, Li Su, Zheng-Jun Zha, Chenggang Yan, Qingming Huang
Given an image pair, CARD first decouples context features that aggregate all similar/dissimilar semantics, termed common/difference context features.
1 code implementation • 16 May 2024 • Feiran Li, Qianqian Xu, Shilong Bao, Zhiyong Yang, Runmin Cong, Xiaochun Cao, Qingming Huang
This paper explores the size-invariance of evaluation metrics in Salient Object Detection (SOD), especially when multiple targets of diverse sizes co-exist in the same image.
1 code implementation • 15 May 2024 • Cong Hua, Qianqian Xu, Shilong Bao, Zhiyong Yang, Qingming Huang
This paper explores a novel multi-modal alternating learning paradigm pursuing a reconciliation between the exploitation of uni-modal features and the exploration of cross-modal interactions.
1 code implementation • 13 May 2024 • Zhiyong Yang, Qianqian Xu, Zitai Wang, Sicong Li, Boyu Han, Shilong Bao, Xiaochun Cao, Qingming Huang
Traditional methods predominantly use a Mixture-of-Expert (MoE) approach, targeting a few fixed test label distributions that exhibit substantial global variations.
Ranked #1 on Test Agnostic Long-Tailed Learning on CIFAR-10-LT
no code implementations • 11 May 2024 • Yunchuan Ma, Laiyun Qing, Guorong Li, Yuankai Qi, Amin Beheshti, Quan Z. Sheng, Qingming Huang
Specifically, we bridge video and text using four key models: a general video-text retrieval model XCLIP, a general image-text matching model CLIP, a text alignment model AnglE, and a text generation model GPT-2, due to their source-code availability.
1 code implementation • 29 Apr 2024 • Zhaobo Qi, Shuhui Wang, Weigang Zhang, Qingming Huang
Video activity anticipation aims to predict what will happen in the future, embracing a broad application prospect ranging from robot vision to autonomous driving.
no code implementations • 27 Mar 2024 • Jiaqi Wu, Junbiao Pang, Baochang Zhang, Qingming Huang
Semi-supervised learning (SSL) is a practical challenge in computer vision.
no code implementations • 12 Mar 2024 • Ting Yu, Xiaojun Lin, Shuhui Wang, Weiguo Sheng, Qingming Huang, Jun Yu
Three-Dimensional (3D) dense captioning is an emerging vision-language bridging task that aims to generate multiple detailed and accurate descriptions for 3D scenes.
no code implementations • 11 Mar 2024 • Runmin Cong, Hang Xiong, Jinpeng Chen, Wei Zhang, Qingming Huang, Yao Zhao
To address this, we present the Query-guided Prototype Evolution Network (QPENet), a new method that integrates query features into the generation process of foreground and background prototypes, thereby yielding customized prototypes attuned to specific queries.
1 code implementation • 20 Feb 2024 • Gaoxiang Cong, Yuankai Qi, Liang Li, Amin Beheshti, Zhedong Zhang, Anton Van Den Hengel, Ming-Hsuan Yang, Chenggang Yan, Qingming Huang
Given a script, the challenge in Movie Dubbing (Visual Voice Cloning, V2C) is to generate speech that aligns well with the video in both time and emotion, based on the tone of a reference audio track.
no code implementations • 30 Jan 2024 • Henglei Lv, Jiayu Xiao, Liang Li, Qingming Huang
To this end, we propose Pick-and-Draw, a training-free semantic guidance approach to boost identity consistency and generative diversity for personalization methods.
1 code implementation • 15 Jan 2024 • Zhaobo Qi, Yibo Yuan, Xiaowen Ruan, Shuhui Wang, Weigang Zhang, Qingming Huang
Temporal Sentence Grounding in Video (TSGV) is troubled by the dataset bias issue, which is caused by the uneven temporal distribution of the target moments for samples with similar semantic components in input videos or query texts.
no code implementations • CVPR 2024 • Junxi Chen, Liang Li, Li Su, Zheng-Jun Zha, Qingming Huang
The detector can utilize the semantic-rich features to capture diverse abnormal patterns.
1 code implementation • CVPR 2024 • Xinyan Liu, Guorong Li, Yuankai Qi, Ziheng Yan, Zhenjun Han, Anton Van Den Hengel, Ming-Hsuan Yang, Qingming Huang
To provide a more realistic reflection of the underlying practical challenge we introduce a weakly supervised VIC task wherein trajectory labels are not provided.
1 code implementation • 22 Dec 2023 • Junwei He, Qianqian Xu, Yangbangyan Jiang, Zitai Wang, Qingming Huang
We pretrain graph autoencoders on these augmented graphs at multiple levels, which enables the graph autoencoders to capture normal patterns.
no code implementations • 20 Dec 2023 • Chang Teng, Yunchuan Ma, Guorong Li, Yuankai Qi, Laiyu Qing, Qingming Huang
To address this problem, we propose a new video captioning task, Subject-Oriented Video Captioning (SOVC), which aims to allow users to specify the describing target via a bounding box.
1 code implementation • 10 Dec 2023 • Xinyan Liu, Guorong Li, Yuankai Qi, Ziheng Yan, Zhenjun Han, Anton Van Den Hengel, Ming-Hsuan Yang, Qingming Huang
To provide a more realistic reflection of the underlying practical challenge, we introduce a weakly supervised VIC task, wherein trajectory labels are not provided.
1 code implementation • 4 Dec 2023 • Chen Zhang, Guorong Li, Yuankai Qi, Hanhua Ye, Laiyun Qing, Ming-Hsuan Yang, Qingming Huang
To address these limitations, we propose a Dynamic Erasing Network (DE-Net) for weakly supervised video anomaly detection, which learns multi-scale temporal features.
1 code implementation • NeurIPS 2023 • Siran Dai, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang
To tackle this challenge, we propose an instance-wise surrogate loss of Distributionally Robust AUC (DRAUC) and build our optimization framework on top of it.
1 code implementation • 3 Nov 2023 • Jiaqi Wu, Junbiao Pang, Qingming Huang
Both semi-supervised classification and regression are practically challenging tasks for computer vision.
no code implementations • 3 Nov 2023 • Jiaqi Wu, Junbiao Pang, Qingming Huang
Semi-supervised pose estimation is a practically challenging task for computer vision.
no code implementations • 13 Oct 2023 • Jiayu Xiao, Henglei Lv, Liang Li, Shuhui Wang, Qingming Huang
Recent text-to-image (T2I) diffusion models have achieved remarkable progress in generating high-quality images given text-prompts as input.
no code implementations • 12 Oct 2023 • Peifeng Gao, Qianqian Xu, Yibo Yang, Peisong Wen, Huiyang Shao, Zhiyong Yang, Bernard Ghanem, Qingming Huang
While there have been extensive studies on optimization characteristics showing the global optimality of neural collapse, little research has been done on the generalization behaviors during the occurrence of NC.
1 code implementation • 12 Oct 2023 • Jingru Gan, Xinzhe Han, Shuhui Wang, Qingming Huang
Given an image and an associated textual question, the purpose of Knowledge-Based Visual Question Answering (KB-VQA) is to provide a correct answer to the question with the aid of external knowledge bases.
1 code implementation • NeurIPS 2023 • Zitai Wang, Qianqian Xu, Zhiyong Yang, Yuan He, Xiaochun Cao, Qingming Huang
However, existing generalization analysis of such losses is still coarse-grained and fragmented, failing to explain some empirical results.
1 code implementation • 7 Oct 2023 • Zitai Wang, Qianqian Xu, Zhiyong Yang, Yuan He, Xiaochun Cao, Qingming Huang
However, existing generalization analysis of such losses is still coarse-grained and fragmented, failing to explain some empirical results.
Ranked #6 on Long-tail Learning on CIFAR-10-LT (ρ=10)
1 code implementation • journal 2023 • Junyu Chen, Qianqian Xu, Zhiyong Yang, Ke Ma, Xiaochun Cao, Qingming Huang
For the motif-based node representation learning process, we propose a Motif Coarsening strategy for incorporating motif structure into the graph representation learning process.
1 code implementation • ICCV 2023 • Yunbin Tu, Liang Li, Li Su, Zheng-Jun Zha, Chenggang Yan, Qingming Huang
Change captioning aims to describe the difference between a pair of similar images.
1 code implementation • TPAMI 2023 • Zhiyong Yang, Qianqian Xu, Shilong Bao, Peisong Wen, Xiaochun Cao, Qingming Huang
We propose a new result that not only addresses the interdependency issue but also brings a much sharper bound with weaker assumptions about the loss function.
2 code implementations • TPAMI 2023 • Zhiyong Yang, Qianqian Xu, Wenzheng Hou, Shilong Bao, Yuan He, Xiaochun Cao, Qingming Huang
On top of this, we can show that: 1) Under mild conditions, AdAUC can be optimized equivalently with score-based or instance-wise-loss-based perturbations, which is compatible with most of the popular adversarial example generation methods.
1 code implementation • 27 Jul 2023 • Yuchen Sun, Qianqian Xu, Zitai Wang, Qingming Huang
However, existing adversarial attacks toward multi-label learning only pursue the traditional visual imperceptibility but ignore the new perceptible problem coming from measures such as Precision@$k$ and mAP@$k$.
1 code implementation • 15 Jun 2023 • Runmin Cong, Wenyu Yang, Wei Zhang, Chongyi Li, Chun-Le Guo, Qingming Huang, Sam Kwong
Among existing UIE methods, Generative Adversarial Networks (GANs) based methods perform well in visual aesthetics, while the physical model-based methods have better scene adaptability.
no code implementations • 13 May 2023 • Ke Zhang, Yan Yang, Jun Yu, Hanliang Jiang, Jianping Fan, Qingming Huang, Weidong Han
To address this limitation, we propose a unified Med-VLP framework based on Multi-task Paired Masking with Alignment (MPMA) that integrates the cross-modal alignment task into the joint image-text reconstruction framework to achieve more comprehensive cross-modal interaction, while a Global and Local Alignment (GLA) module is designed to assist the self-supervised paradigm in obtaining semantic representations with rich domain knowledge.
no code implementations • 18 Apr 2023 • Peifeng Gao, Qianqian Xu, Peisong Wen, Huiyang Shao, Zhiyong Yang, Qingming Huang
Out of curiosity about the symmetry of Grassmannian Frame, we conduct experiments to explore if models with different Grassmannian Frames have different performance.
1 code implementation • 6 Mar 2023 • Yunbin Tu, Liang Li, Li Su, Ke Lu, Qingming Huang
Change captioning aims to describe the semantic change between a pair of similar images in natural language.
1 code implementation • 1 Feb 2023 • Guanqi Ding, Xinzhe Han, Shuhui Wang, Xin Jin, Dandan Tu, Qingming Huang
SAGE makes use of all given few-shot images and estimates a class center embedding based on the category-relevant attribute dictionary.
no code implementations • ICCV 2023 • Huiyang Shao, Qianqian Xu, Peisong Wen, Peifeng Gao, Zhiyong Yang, Qingming Huang
Finally, experimental results support the effectiveness of the proposed framework in terms of both mural synthesis and restoration.
1 code implementation • ICCV 2023 • Zhenhuan Liu, Liang Li, Jiayu Xiao, Zheng-Jun Zha, Qingming Huang
The experiments demonstrate the effectiveness of our method in preserving the diversity of the source domain and generating high-fidelity target images.
no code implementations • 23 Dec 2022 • Runmin Cong, Ke Huang, Jianjun Lei, Yao Zhao, Qingming Huang, Sam Kwong
Salient object detection (SOD) aims to determine the most visually attractive objects in an image.
no code implementations • CVPR 2023 • Chen Zhang, Guorong Li, Yuankai Qi, Shuhui Wang, Laiyun Qing, Qingming Huang, Ming-Hsuan Yang
Weakly supervised video anomaly detection aims to identify abnormal events in videos using only video-level labels.
1 code implementation • 8 Dec 2022 • Ziheng Yan, Yuankai Qi, Guorong Li, Xinyan Liu, Weigang Zhang, Qingming Huang, Ming-Hsuan Yang
Crowd counting is usually handled in a density map regression fashion, which is supervised via an L2 loss between the predicted density map and ground truth.
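The density map regression setup mentioned in this snippet can be sketched in a few lines. This is a generic illustration rather than the paper's code: the function names are hypothetical, density maps are plain nested lists, and averaging the squared error over pixels (rather than summing it) is an assumption.

```python
def l2_density_loss(predicted, target):
    """Mean squared error between two density maps given as lists of rows."""
    total = 0.0
    num_pixels = 0
    for pred_row, target_row in zip(predicted, target):
        for pred_value, target_value in zip(pred_row, target_row):
            total += (pred_value - target_value) ** 2
            num_pixels += 1
    return total / num_pixels


def estimated_count(density_map):
    """The crowd count is read off a density map by summing all of its values."""
    return sum(sum(row) for row in density_map)


predicted = [[0.0, 1.0], [0.5, 0.5]]
ground_truth = [[0.0, 1.0], [1.0, 0.0]]
print(l2_density_loss(predicted, ground_truth))
print(estimated_count(predicted), estimated_count(ground_truth))
```

In this toy example both maps sum to the same count even though the loss is non-zero, which shows that a pixel-wise L2 loss penalizes where density is placed, not only how much of it there is.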
no code implementations • 8 Dec 2022 • Xinyan Liu, Guorong Li, Yuankai Qi, Zhenjun Han, Qingming Huang, Ming-Hsuan Yang, Nicu Sebe
Crowd localization aims to predict the spatial position of humans in a crowd scenario.
1 code implementation • CVPR 2023 • Gaoxiang Cong, Liang Li, Yuankai Qi, ZhengJun Zha, Qi Wu, Wenyu Wang, Bin Jiang, Ming-Hsuan Yang, Qingming Huang
Given a piece of text, a video clip and a reference audio, the movie dubbing (also known as visual voice cloning, V2C) task aims to generate speech that matches the speaker's emotion presented in the video, using the desired speaker's voice as reference.
1 code implementation • CVPR 2022 • Yunrui Zhao, Qianqian Xu, Yangbangyan Jiang, Peisong Wen, Qingming Huang
Positive-Unlabeled (PU) learning tries to learn binary classifiers from a few labeled positive examples with many unlabeled ones.
1 code implementation • 22 Oct 2022 • Zitai Wang, Qianqian Xu, Zhiyong Yang, Yuan He, Xiaochun Cao, Qingming Huang
In this paper, a systematic analysis reveals that most existing metrics are essentially inconsistent with the aforementioned goal of OSR: (1) For metrics extended from close-set classification, such as Open-set F-score, Youden's index, and Normalized Accuracy, a poor open-set prediction can escape from a low performance score with a superior close-set prediction.
2 code implementations • 9 Oct 2022 • Yao Zhu, Yuefeng Chen, Xiaodan Li, Kejiang Chen, Yuan He, Xiang Tian, Bolun Zheng, Yaowu Chen, Qingming Huang
We conduct comprehensive transferable attacks against multiple DNNs to demonstrate the effectiveness of the proposed method.
2 code implementations • 9 Oct 2022 • Runmin Cong, Kepu Zhang, Chen Zhang, Feng Zheng, Yao Zhao, Qingming Huang, Sam Kwong
In addition, considering the role of thermal modality, we set up different cross-modality interaction mechanisms in the encoding phase and the decoding phase.
2 code implementations • NeurIPS 2022 • Huiyang Shao, Qianqian Xu, Zhiyong Yang, Shilong Bao, Qingming Huang
sample size and a slow convergence rate, especially for TPAUC.
3 code implementations • 6 Oct 2022 • Runmin Cong, Qinwei Lin, Chen Zhang, Chongyi Li, Xiaochun Cao, Qingming Huang, Yao Zhao
Focusing on the issue of how to effectively capture and utilize cross-modality information in the RGB-D salient object detection (SOD) task, we present a convolutional neural network (CNN) model, named CIR-Net, based on novel cross-modality interaction and refinement.
2 code implementations • Proceedings of the 30th ACM International Conference on Multimedia 2022 • Junyu Chen, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang
We develop a multi-class AUC optimization work to deal with the class imbalance problem.
1 code implementation • Conference 2022 • Junyu Chen, Qianqian Xu, Zhiyong Yang, Ke Ma, Xiaochun Cao, Qingming Huang
To attack this problem, we propose a recursive meta-learning model with the user's behavior sequence prediction as a separate training task.
1 code implementation • NeurIPS 2023 • Shilong Bao, Qianqian Xu, Zhiyong Yang, Yuan He, Xiaochun Cao, Qingming Huang
Collaborative Metric Learning (CML) has recently emerged as a popular method in recommendation systems (RS), closing the gap between metric learning and Collaborative Filtering.
1 code implementation • 27 Sep 2022 • Peisong Wen, Qianqian Xu, Zhiyong Yang, Yuan He, Qingming Huang
Stochastic optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning.
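For readers unfamiliar with the metric itself, AUPRC is commonly estimated by average precision: the mean of the precision values at the ranks where a positive example appears. The sketch below shows only that metric, not the stochastic optimization method of the paper; the function name is hypothetical, labels are assumed to be 0 or 1 with at least one positive, and ties between scores are not treated specially.

```python
def average_precision(scores, labels):
    """Non-interpolated average precision, a common estimator of AUPRC.

    `labels` holds 1 for positives and 0 for negatives; higher scores rank first.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    num_positives = sum(labels)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this positive's rank
    return precision_sum / num_positives


print(average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]))
```

Here the two positives sit at ranks 1 and 3, so the value is the mean of 1/1 and 2/3. Because the result depends on the full ranking rather than on per-example terms, the metric does not decompose over individual samples, which is one reason optimizing it stochastically is non-trivial.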
no code implementations • 26 Sep 2022 • Yangbangyan Jiang, Xiaodan Li, Yuefeng Chen, Yuan He, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang
In recent years, great progress has been made to incorporate unlabeled data to overcome the inefficiently supervised problem via semi-supervised learning (SSL).
1 code implementation • 13 Sep 2022 • Ke Ma, Qianqian Xu, Jinshan Zeng, Guorong Li, Xiaochun Cao, Qingming Huang
From the perspective of the dynamical system, the attack behavior with a target ranking list is a fixed point belonging to the composition of the adversary and the victim.
1 code implementation • 3 Sep 2022 • Zitai Wang, Qianqian Xu, Zhiyong Yang, Yuan He, Xiaochun Cao, Qingming Huang
Finally, the experimental results on four benchmark datasets validate the effectiveness of our proposed framework.
1 code implementation • 26 Jul 2022 • Weidong Chen, Dexiang Hong, Yuankai Qi, Zhenjun Han, Shuhui Wang, Laiyun Qing, Qingming Huang, Guorong Li
To address this problem, we propose a multi-attention network which consists of a dual-path dual-attention module and a query-based cross-modal Transformer module.
Ranked #5 on Referring Expression Segmentation on A2D Sentences
1 code implementation • 18 Jul 2022 • Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Zechao Li, Qi Tian, Qingming Huang
Second, most previous weakly supervised REG methods ignore the discriminative location and context of the referent, causing difficulties in distinguishing the target from other same-category objects.
no code implementations • 28 Jun 2022 • Tianwei Cao, Qianqian Xu, Zhiyong Yang, Qingming Huang
In this paper, we regard user interest modeling as a feature selection problem, which we call user interest selection.
1 code implementation • 24 Jun 2022 • Zongsheng Cao, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang
Knowledge graph (KG) embeddings have shown great power in learning representations of entities and relations for link prediction tasks.
1 code implementation • 24 Jun 2022 • Zongsheng Cao, Qianqian Xu, Zhiyong Yang, Qingming Huang
To address this issue, we propose a new regularizer, namely, Equivariance Regularizer (ER), which can suppress overfitting by leveraging the implicit semantic information.
no code implementations • ICML 2022 • Wenzheng Hou, Qianqian Xu, Zhiyong Yang, Shilong Bao, Yuan He, Qingming Huang
Our analysis differs from the existing studies since the algorithm is asked to generate adversarial examples by calculating the gradient of a min-max problem.
1 code implementation • TPAMI 2022 • Shilong Bao, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang
However, in this work, through a theoretical analysis, we find that negative sampling would lead to a biased estimation of the generalization error.
1 code implementation • TPAMI 2022 • Zhiyong Yang, Qianqian Xu, Shilong Bao, Yuan He, Xiaochun Cao, Qingming Huang
The critical challenge along this course lies in the difficulty of performing gradient-based optimization with end-to-end stochastic training, even with a proper choice of surrogate loss.
1 code implementation • CVPR 2022 • Shaofei Cai, Liang Li, Xinzhe Han, Jiebo Luo, Zheng-Jun Zha, Qingming Huang
However, the currently used graph search space overemphasizes learning node features and neglects mining hierarchical relational information.
Ranked #2 on Link Prediction on TSP/HCP Benchmark set
2 code implementations • 19 Apr 2022 • Runmin Cong, Ning Yang, Chongyi Li, Huazhu Fu, Yao Zhao, Qingming Huang, Sam Kwong
In this paper, we propose a global-and-local collaborative learning architecture, which includes global correspondence modeling (GCM) and local correspondence modeling (LCM) to capture comprehensive inter-image correspondences among different images from the global and local perspectives.
3 code implementations • 18 Apr 2022 • Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian
Our approach, named CenterNet, detects each object as a triplet of keypoints (top-left and bottom-right corners and the center keypoint).
Ranked #38 on Object Detection on COCO test-dev
no code implementations • 2 Apr 2022 • Zhenhuan Liu, Jincan Deng, Liang Li, Shaofei Cai, Qianqian Xu, Shuhui Wang, Qingming Huang
Conditional image generation is an active research topic including text2image and image translation.
Conditional Image Generation
Generative Adversarial Network
1 code implementation • CVPR 2022 • Guanqi Ding, Xinzhe Han, Shuhui Wang, Shuzhe Wu, Xin Jin, Dandan Tu, Qingming Huang
Few-shot image generation is a challenging task even using the state-of-the-art Generative Adversarial Networks (GANs).
2 code implementations • CVPR 2022 • Jiayu Xiao, Liang Li, Chaofei Wang, Zheng-Jun Zha, Qingming Huang
A feasible solution is to start with a GAN well-trained on a large-scale source domain and adapt it to the target domain with a few samples, termed few-shot generative model adaption.
1 code implementation • 20 Dec 2021 • Xinzhe Han, Shuhui Wang, Chi Su, Qingming Huang, Qi Tian
Existing de-bias learning frameworks try to capture specific dataset bias by annotations but they fail to handle complicated OOD scenarios.
no code implementations • NeurIPS 2021 • Peisong Wen, Qianqian Xu, Zhiyong Yang, Yuan He, Qingming Huang
To leverage high performance under low FPRs, we consider an alternative metric for multipartite ranking evaluating the True Positive Rate (TPR) at a given FPR, denoted as TPR@FPR.
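As a rough illustration of the TPR@FPR metric named above, the sketch below computes it for plain binary scores. The function name `tpr_at_fpr` and the binary setting are assumptions made for the example; the paper targets multipartite ranking, which this sketch does not cover.

```python
def tpr_at_fpr(scores, labels, max_fpr):
    """Return the highest TPR reachable while keeping FPR <= max_fpr.

    scores: higher means "more likely positive".
    labels: 1 for positive samples, 0 for negative samples.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")

    best_tpr = 0.0
    # Each distinct score is a candidate threshold: predict positive
    # when score >= threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        if fp / negatives <= max_fpr:
            best_tpr = max(best_tpr, tp / positives)
    return best_tpr


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
    labels = [1, 1, 0, 1, 0, 0]
    print(tpr_at_fpr(scores, labels, max_fpr=0.0))
    print(tpr_at_fpr(scores, labels, max_fpr=0.34))
```

The quadratic scan over thresholds keeps the example short; a sorted single pass would be the usual choice for large inputs.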
1 code implementation • CVPR 2022 • Hanhua Ye, Guorong Li, Yuankai Qi, Shuhui Wang, Qingming Huang, Ming-Hsuan Yang
(II) Predicate level, which learns the actions conditioned on highlighted objects and is supervised by the predicate in captions.
1 code implementation • 23 Nov 2021 • Zhaobo Qi, Shuhui Wang, Chi Su, Li Su, Weigang Zhang, Qingming Huang
Based on TDC, we propose the temporal dynamic concept modeling network (TDCMN) to learn an accurate and complete concept representation for efficient untrimmed video analysis.
1 code implementation • 23 Nov 2021 • Zhaobo Qi, Shuhui Wang, Chi Su, Li Su, Qingming Huang, Qi Tian
Future activity anticipation is a challenging problem in egocentric vision.
no code implementations • 19 Nov 2021 • Xu Yan, Zhengcong Fei, Shuhui Wang, Qingming Huang, Qi Tian
Dense video captioning (DVC) aims to generate multi-sentence descriptions to elucidate the multiple events in the video, which is challenging and demands visual consistency, discoursal coherence, and linguistic diversity.
1 code implementation • ACM MM 2021 • Zitai Wang, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang
As the core of the framework, the iterative relabeling module exploits the self-training principle to dynamically generate pseudo labels for user preferences.
1 code implementation • ACM MM 2021 • Qianxiu Hao, Qianqian Xu, Zhiyong Yang, Qingming Huang
Heterogeneous information networks (HINs) have become a popular tool to capture complicated user-item relationships in recommendation problems in recent years.
1 code implementation • 11 Oct 2021 • Xu Yan, Zhengcong Fei, Zekang Li, Shuhui Wang, Qingming Huang, Qi Tian
Non-autoregressive image captioning with continuous iterative refinement, which eliminates the sequential dependence in sentence generation, can achieve performance comparable to its autoregressive counterparts with considerable acceleration.
3 code implementations • MM '21: Proceedings of the 29th ACM International Conference on Multimedia 2021 • Qianxiu Hao, Qianqian Xu, Zhiyong Yang, Qingming Huang
To balance overall recommendation performance and fairness, prevalent solutions apply fairness constraints or regularizations to enforce equality of certain performance across different subgroups.
no code implementations • 3 Sep 2021 • Shaofei Cai, Liang Li, Xinzhe Han, Zheng-Jun Zha, Qingming Huang
Recently, researchers have studied neural architecture search (NAS) to reduce the dependence on human expertise and explore better GNN architectures, but they over-emphasize entity features and ignore latent relation information concealed in the edges.
no code implementations • TPAMI 2021 • Zhiyong Yang, Qianqian Xu, Shilong Bao, Xiaochun Cao, Qingming Huang
Our foundation is based on the M metric, which is a well-known multiclass extension of AUC.
1 code implementation • ICCV 2021 • Xinzhe Han, Shuhui Wang, Chi Su, Qingming Huang, Qi Tian
Language bias is a critical issue in Visual Question Answering (VQA), where models often exploit dataset biases for the final decision without considering the image information.
Ranked #2 on Visual Question Answering (VQA) on VQA-CP
1 code implementation • ICML 2021 • Zhiyong Yang, Qianqian Xu, Shilong Bao, Yuan He, Xiaochun Cao, Qingming Huang
The critical challenge along this course lies in the difficulty of performing gradient-based optimization with end-to-end stochastic training, even with a proper choice of surrogate loss.
1 code implementation • 13 Jul 2021 • Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian
Due to the domain discrepancy in visual domain adaptation, the performance of the source model degrades when it encounters high data density near the decision boundary in the target domain.
1 code implementation • 5 Jul 2021 • Ke Ma, Qianqian Xu, Jinshan Zeng, Xiaochun Cao, Qingming Huang
In this paper, to the best of our knowledge, we initiate the first systematic investigation of data poisoning attacks on pairwise ranking algorithms, which can be formalized as the dynamic and static games between the ranker and the attacker and can be modeled as certain kinds of integer programming problems.
1 code implementation • 11 Apr 2021 • Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian
Object detection, instance segmentation, and pose estimation are popular visual recognition tasks which require localizing the object by internal or boundary landmarks.
Ranked #60 on Object Detection on COCO test-dev
1 code implementation • CVPR 2021 • Shaofei Cai, Liang Li, Jincan Deng, Beichen Zhang, Zheng-Jun Zha, Li Su, Qingming Huang
Inspired by the strong searching capability of neural architecture search (NAS) in CNNs, this paper proposes Graph Neural Architecture Search (GNAS) with a newly designed search space.
1 code implementation • CVPR 2021 • Peisong Wen, Qianqian Xu, Yangbangyan Jiang, Zhiyong Yang, Yuan He, Qingming Huang
Targeting (a), we propose a two-level modality alignment loss where both global and local information are considered.
1 code implementation • IJCV 2021 • Shangzhi Teng, Shiliang Zhang, Qingming Huang, Nicu Sebe
Moreover, our method also achieves competitive performance compared with recent works on existing vehicle ReID datasets including VehicleID, VeRi-776 and VERI-Wild.
no code implementations • ICCV 2021 • Xinyan Liu, Guorong Li, Zhenjun Han, Weigang Zhang, Yifan Yang, Qingming Huang, Nicu Sebe
Specifically, we propose a task-driven similarity metric based on samples' mutual enhancement, referred to as co-fine-tune similarity, which can find a more efficient subset of data for training the expert network.
1 code implementation • NeurIPS 2020 • Shuhao Cui, Xuan Jin, Shuhui Wang, Yuan He, Qingming Huang
In visual domain adaptation (DA), separating the domain-specific characteristics from the domain-invariant representations is an ill-posed problem.
no code implementations • 16 Oct 2020 • Jianfeng He, Xuchao Zhang, Shuo Lei, Shuhui Wang, Qingming Huang, Chang-Tien Lu, Bei Xiao
Each MEx area has the mask area of the generation as the majority and the boundary of original context as the minority.
1 code implementation • CVPR 2020 • Jun Wei, Shuhui Wang, Zhe Wu, Chi Su, Qingming Huang, Qi Tian
Though remarkable progress has been achieved, we observe that the closer a pixel is to the edge, the more difficult it is to predict, because edge pixels have a very imbalanced distribution.
Ranked #1 on Saliency Detection on HKU-IS
1 code implementation • ECCV 2020 • Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian
On the MS-COCO dataset, CPN achieves an AP of 49.2%, which is competitive among state-of-the-art object detection methods.
Ranked #98 on Object Detection on COCO test-dev
no code implementations • 29 Apr 2020 • Zhiyong Yang, Qianqian Xu, Xiaochun Cao, Qingming Huang
To this end, we propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
1 code implementation • CVPR 2020 • Dechao Meng, Liang Li, Xuejing Liu, Yadong Li, Shijie Yang, Zheng-Jun Zha, Xingyu Gao, Shuhui Wang, Qingming Huang
Vehicle Re-Identification is to find images of the same vehicle from various views in the cross-camera scenario.
1 code implementation • CVPR 2020 • Beichen Zhang, Liang Li, Shijie Yang, Shuhui Wang, Zheng-Jun Zha, Qingming Huang
In this paper, we propose a state relabeling adversarial active learning model (SRAAL), that leverages both the annotation and the labeled/unlabeled state information for deriving the most informative unlabeled samples.
2 code implementations • CVPR 2020 • Shuhao Cui, Shuhui Wang, Junbao Zhuo, Chi Su, Qingming Huang, Qi Tian
On the discriminator, GVB helps enhance the discriminating ability and balance the adversarial training process.
2 code implementations • CVPR 2020 • Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian
We find by theoretical analysis that the prediction discriminability and diversity could be separately measured by the Frobenius-norm and rank of the batch output matrix.
1 code implementation • 19 Mar 2020 • Zuyao Chen, Runmin Cong, Qianqian Xu, Qingming Huang
There are two main issues in RGB-D salient object detection: (1) how to effectively integrate the complementarity from the cross-modal RGB-D data; (2) how to prevent the contamination effect from the unreliable depth map.
Ranked #22 on Thermal Image Segmentation on RGB-T-Glass-Segmentation
2 code implementations • 2 Mar 2020 • Zuyao Chen, Qianqian Xu, Runmin Cong, Qingming Huang
Deep convolutional neural networks have achieved competitive performance in salient object detection, in which how to learn effective and comprehensive features plays a critical role.
Ranked #21 on Dichotomous Image Segmentation on DIS-TE1
1 code implementation • NeurIPS 2019 • Yangbangyan Jiang, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang
Instead of transforming all the samples into a joint modality-independent space, our framework learns the mappings across individual modal spaces by virtue of cycle-consistency.
1 code implementation • NeurIPS 2019 • Zhiyong Yang, Qianqian Xu, Yangbangyan Jiang, Xiaochun Cao, Qingming Huang
Different from most of the previous work, pursuing the Block-Diagonal structure of LTAM (assigning latent tasks to output tasks) alleviates negative transfer via collaboratively grouping latent tasks and output tasks such that inter-group knowledge transfer and sharing is suppressed.
4 code implementations • 26 Nov 2019 • Jun Wei, Shuhui Wang, Qingming Huang
Furthermore, different from binary cross entropy, the proposed PPA loss doesn't treat pixels equally, which can synthesize the local structure information of a pixel to guide the network to focus more on local details.
Ranked #6 on Salient Object Detection on DUT-OMRON
Camouflaged Object Segmentation
Dichotomous Image Segmentation
1 code implementation • ACM MM 2019 • Shilong Bao, Qianqian Xu, Ke Ma, Zhiyong Yang, Xiaochun Cao, Qingming Huang
From the margin theory point-of-view, we then propose a generalization enhancement scheme for sparse and insufficient labels via optimizing the margin distribution.
1 code implementation • NeurIPS 2019 • Qianqian Xu, Xinwei Sun, Zhiyong Yang, Xiaochun Cao, Qingming Huang, Yuan YAO
In this paper, instead of learning a global ranking that agrees with the consensus, we pursue the tie-aware partial ranking from an individualized perspective.
1 code implementation • ACMMM 2019 • Yiling Wu, Shuhui Wang, Guoli Song, Qingming Huang
In this paper, we propose Self-Attention Embeddings (SAEM) to exploit fragment relations in images or texts through a self-attention mechanism, and aggregate fragment information into visual and textual embeddings.
1 code implementation • 5 Sep 2019 • Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Li Su, Qingming Huang
Weakly supervised referring expression grounding (REG) aims at localizing the referential entity in an image according to a linguistic query, where the mapping between the image region (proposal) and the query is unknown in the training stage.
1 code implementation • ICCV 2019 • Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Dechao Meng, Qingming Huang
It builds the correspondence between image region proposal and query in an adaptive manner: adaptive grounding and collaborative reconstruction.
1 code implementation • 14 Aug 2019 • Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian
Multimodal learning aims to discover the relationship between multiple modalities.
no code implementations • 18 Jun 2019 • Zhiyong Yang, Qianqian Xu, Xiaochun Cao, Qingming Huang
Traditionally, most of the existing attribute learning methods are trained based on the consensus of annotations aggregated from a limited number of annotators.
no code implementations • 20 May 2019 • Jun Yu, Jing Li, Zhou Yu, Qingming Huang
Despite the success of existing studies, current methods only model the co-attention that characterizes the inter-modal interactions while neglecting the self-attention that characterizes the intra-modal interactions.
1 code implementation • CVPR 2019 • Junbao Zhuo, Shuhui Wang, Shuhao Cui, Qingming Huang
We address the unsupervised open domain recognition (UODR) problem, where categories in the labeled source domain S are only a subset of those in the unlabeled target domain T. The task is to correctly classify all samples in T, including known and unknown categories.
1 code implementation • CVPR 2019 • Zhe Wu, Li Su, Qingming Huang
In this paper, we propose a novel Cascaded Partial Decoder (CPD) framework for fast and accurate salient object detection.
Ranked #1 on RGB Salient Object Detection on ISTD
20 code implementations • ICCV 2019 • Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian
In object detection, keypoint-based approaches often suffer from a large number of incorrect object bounding boxes, arguably due to the lack of an additional look into the cropped regions.
Ranked #121 on Object Detection on COCO test-dev
1 code implementation • CVPR 2019 • Kai Xu, Longyin Wen, Guorong Li, Liefeng Bo, Qingming Huang
Specifically, the temporal coherence branch, pretrained in an adversarial fashion from unlabeled video data, is designed to capture the dynamic appearance and motion cues of video sequences to guide object segmentation.
Ranked #1 on Semi-Supervised Video Object Segmentation on YouTube
no code implementations • CVPR 2019 • Qianqian Xu, Zhiyong Yang, Yangbangyan Jiang, Xiaochun Cao, Qingming Huang, Yuan YAO
The problem of estimating subjective visual properties (SVP) of images (e.g., Shoes A is more comfortable than B) is gaining rising attention.
4 code implementations • 26 Jan 2019 • Tao Hu, Honggang Qi, Qingming Huang, Yan Lu
Specifically, for each training image, we first generate attention maps to represent the object's discriminative parts by weakly supervised learning.
Ranked #16 on Fine-Grained Image Classification on CUB-200-2011
no code implementations • 16 Nov 2018 • Runmin Cong, Jianjun Lei, Huazhu Fu, Qingming Huang, Xiaochun Cao, Nam Ling
In this paper, we propose a novel co-saliency detection method for RGBD images based on hierarchical sparsity reconstruction and energy function refinement.
no code implementations • 20 Aug 2018 • Jianjun Lei, Lijie Niu, Huazhu Fu, Bo Peng, Qingming Huang, Chunping Hou
In this paper, we propose a novel person re-identification method, which consists of a reliable representation called Semantic Region Representation (SRR), and an effective metric learning with Mapping Space Topology Constraint (MSTC).
no code implementations • 6 Aug 2018 • Tao Hu, Jizheng Xu, Cong Huang, Honggang Qi, Qingming Huang, Yan Lu
Besides, we propose attention regularization and attention dropout to weakly supervise the generating process of attention maps.
no code implementations • 29 Jul 2018 • Qianqian Xu, Jiechao Xiong, Xinwei Sun, Zhiyong Yang, Xiaochun Cao, Qingming Huang, Yuan YAO
A preference order or ranking aggregated from pairwise comparison data is commonly understood as a strict total order.
no code implementations • 25 Jun 2018 • Xiaobin Liu, Shiliang Zhang, Qingming Huang, Wen Gao
Specifically, in addition to extracting global features, RAM also extracts features from a series of local regions.
no code implementations • ECCV 2018 • Dawei Du, Yuankai Qi, Hongyang Yu, Yifan Yang, Kaiwen Duan, Guorong Li, Weigang Zhang, Qingming Huang, Qi Tian
Selected from 10 hours of raw videos, about 80,000 representative frames are fully annotated with bounding boxes as well as up to 14 kinds of attributes (e.g., weather condition, flying altitude, camera view, vehicle category, and occlusion) for three fundamental computer vision tasks: object detection, single object tracking, and multiple object tracking.
Ranked #5 on Object Detection on UAVDT
no code implementations • 18 Mar 2018 • Tao Hu, Honggang Qi, Jizheng Xu, Qingming Huang
Only one self-iterative regressor is trained to learn the descent directions for samples from coarse stages to fine stages, and parameters are iteratively updated by the same regressor.
Ranked #16 on Face Alignment on 300W (NME_inter-pupil (%, Common) metric)
no code implementations • 9 Mar 2018 • Runmin Cong, Jianjun Lei, Huazhu Fu, Ming-Ming Cheng, Weisi Lin, Qingming Huang
With the development of acquisition technology, more comprehensive information, such as depth cues, inter-image correspondence, or temporal relationships, is available to extend image saliency detection to RGBD saliency detection, co-saliency detection, or video saliency detection.
no code implementations • 8 Mar 2018 • Qianqian Xu, Jiechao Xiong, Xiaochun Cao, Qingming Huang, Yuan YAO
In crowdsourced preference aggregation, it is often assumed that all the annotators are subject to a common preference or social utility function which generates their comparison behaviors in experiments.
no code implementations • ECCV 2018 • Yangyu Chen, Shuhui Wang, Weigang Zhang, Qingming Huang
We propose a plug-and-play PickNet to perform informative frame picking in video captioning.
2 code implementations • 4 Dec 2017 • Jun Yu, Xingxin Xu, Fei Gao, Shengjie Shi, Meng Wang, DaCheng Tao, Qingming Huang
Experimental results show that our method is capable of generating both visually comfortable and identity-preserving face sketches/photos over a wide range of challenging data.
Ranked #1 on Face Sketch Synthesis on CUFS (FID metric)
no code implementations • 18 Nov 2017 • Zhiyong Yang, Qianqian Xu, Xiaochun Cao, Qingming Huang
However, both categories ignore the joint effect of the two mentioned factors: the personal diversity with respect to the global consensus; and the intrinsic correlation among multiple attributes.
no code implementations • 16 Nov 2017 • Qianqian Xu, Jiechao Xiong, Xi Chen, Qingming Huang, Yuan YAO
Recently, crowdsourcing has emerged as an effective paradigm for human-powered large scale problem solving in various domains.
no code implementations • 4 Nov 2017 • Runmin Cong, Jianjun Lei, Huazhu Fu, Weisi Lin, Qingming Huang, Xiaochun Cao, Chunping Hou
In this paper, we propose an iterative RGBD co-saliency framework, which utilizes existing single saliency maps as the initialization and generates the final RGBD co-saliency map by using a refinement-cycle model.
no code implementations • 14 Oct 2017 • Runmin Cong, Jianjun Lei, Changqing Zhang, Qingming Huang, Xiaochun Cao, Chunping Hou
Stereoscopic perception is an important part of the human visual system that allows the brain to perceive depth.
no code implementations • 14 Oct 2017 • Runmin Cong, Jianjun Lei, Huazhu Fu, Qingming Huang, Xiaochun Cao, Chunping Hou
Different from most existing co-saliency methods, which focus on RGB images, this paper proposes a novel co-saliency detection model for RGBD images, which utilizes the depth information to enhance identification of co-saliency.
no code implementations • ICCV 2017 • Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian
We incorporate the harmonization mechanism into the learning process of multimodal GPLVMs.
no code implementations • 18 Jul 2017 • Qianqian Xu, Ming Yan, Chendi Huang, Jiechao Xiong, Qingming Huang, Yuan YAO
Outlier detection is a crucial part of robust evaluation for crowdsourceable assessment of Quality of Experience (QoE) and has attracted much attention in recent years.
1 code implementation • CVPR 2017 • Shijie Yang, Liang Li, Shuhui Wang, Weigang Zhang, Qingming Huang
Deep Auto-Encoder (DAE) has shown its promising power in high-level representation learning.
no code implementations • CVPR 2017 • Yiling Wu, Shuhui Wang, Qingming Huang
In this paper, we propose an online learning method to learn the similarity function between heterogeneous modalities by preserving the relative similarity in the training data, which is modeled as a set of bi-directional hinge loss constraints on the cross-modal training triplets.
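A bi-directional hinge loss over cross-modal triplets, as mentioned above, might be sketched as follows. The dot-product similarity, the function names, and the default margin are assumptions chosen for illustration; they are not taken from the paper, which learns its own similarity function.

```python
def dot(u, v):
    """Plain dot product, used here as a stand-in similarity (assumption)."""
    return sum(a * b for a, b in zip(u, v))


def hinge(pos_sim, neg_sim, margin=1.0):
    """Penalize a triplet when the matching pair does not beat the
    non-matching pair by at least `margin`."""
    return max(0.0, margin - pos_sim + neg_sim)


def bidirectional_triplet_loss(image, text, neg_text, neg_image, margin=1.0):
    """Sum of the hinge losses in both retrieval directions.

    image, text: a matching image/text pair (feature vectors).
    neg_text:    a text that does not match `image`.
    neg_image:   an image that does not match `text`.
    """
    # Image-to-text direction: the image should rank its own text higher.
    i2t = hinge(dot(image, text), dot(image, neg_text), margin)
    # Text-to-image direction: the text should rank its own image higher.
    t2i = hinge(dot(text, image), dot(text, neg_image), margin)
    return i2t + t2i


if __name__ == "__main__":
    loss = bidirectional_triplet_loss(
        image=[1.0, 0.0], text=[1.0, 0.0],
        neg_text=[0.0, 1.0], neg_image=[0.5, 0.0],
    )
    print(loss)
```

A triplet contributes zero loss once the relative similarity constraint holds with the required margin in both directions, which is what lets an online learner skip triplets that are already satisfied.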
no code implementations • CVPR 2016 • Yuankai Qi, Shengping Zhang, Lei Qin, Hongxun Yao, Qingming Huang, Jongwoo Lim, Ming-Hsuan Yang
In recent years, several methods have been developed to utilize hierarchical features learned from a deep convolutional neural network (CNN) for visual tracking.
no code implementations • 18 Mar 2016 • Dawei Du, Honggang Qi, Longyin Wen, Qi Tian, Qingming Huang, Siwei Lyu
Graph-based representations are widely used in visual tracking to find correct correspondences between target parts in consecutive frames.
1 code implementation • 18 Dec 2015 • Li Shen, Zhouchen Lin, Qingming Huang
Learning deeper convolutional neural networks has become a trend in recent years.
Ranked #8 on Long-tail Learning on VOC-MLT
no code implementations • ICCV 2015 • Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian
Data from real applications involve multiple modalities representing content with the same semantics and deliver rich information from complementary aspects.
no code implementations • 15 Aug 2014 • Qianqian Xu, Jiechao Xiong, Xiaochun Cao, Qingming Huang, Yuan YAO
In this paper we study the problem of how to estimate such visual properties from a ranking perspective with the help of the annotators from online crowdsourcing platforms.
no code implementations • CVPR 2013 • Li Shen, Shuhui Wang, Gang Sun, Shuqiang Jiang, Qingming Huang
For each internode of the hierarchical category structure, a discriminative dictionary and a set of classification models are learnt for visual categorization, and the dictionaries in different layers are learnt to exploit the discriminative visual properties of different granularity.