Search Results for author: Shuhui Wang

Found 31 papers, 20 papers with code

Interpretable Visual Reasoning via Probabilistic Formulation under Natural Supervision

no code implementations ECCV 2020 Xinzhe Han, Shuhui Wang, Chi Su, Weigang Zhang, Qingming Huang, Qi Tian

In this paper, we rethink the implicit reasoning process in VQA and propose a new formulation that maximizes the log-likelihood of the joint distribution of the observed question and predicted answer.

Question Answering, Visual Question Answering, +1

Hierarchical Modular Network for Video Captioning

no code implementations 24 Nov 2021 Hanhua Ye, Guorong Li, Yuankai Qi, Shuhui Wang, Qingming Huang, Ming-Hsuan Yang

(II) Predicate level, which learns the actions conditioned on highlighted objects and is supervised by the predicate in captions.

Representation Learning, Video Captioning

Modeling Temporal Concept Receptive Field Dynamically for Untrimmed Video Analysis

1 code implementation 23 Nov 2021 Zhaobo Qi, Shuhui Wang, Chi Su, Li Su, Weigang Zhang, Qingming Huang

Based on TDC, we propose the temporal dynamic concept modeling network (TDCMN) to learn an accurate and complete concept representation for efficient untrimmed video analysis.

Image Categorization

DVCFlow: Modeling Information Flow Towards Human-like Video Captioning

no code implementations 19 Nov 2021 Xu Yan, Zhengcong Fei, Shuhui Wang, Qingming Huang, Qi Tian

Dense video captioning (DVC) aims to generate multi-sentence descriptions to elucidate the multiple events in the video, which is challenging and demands visual consistency, discoursal coherence, and linguistic diversity.

Dense Video Captioning

Semi-Autoregressive Image Captioning

1 code implementation 11 Oct 2021 Xu Yan, Zhengcong Fei, Zekang Li, Shuhui Wang, Qingming Huang, Qi Tian

Non-autoregressive image captioning with continuous iterative refinement, which eliminates the sequential dependence in sentence generation, can achieve performance comparable to its autoregressive counterparts with considerable acceleration.

Image Captioning

Greedy Gradient Ensemble for Robust Visual Question Answering

1 code implementation ICCV 2021 Xinzhe Han, Shuhui Wang, Chi Su, Qingming Huang, Qi Tian

Language bias is a critical issue in Visual Question Answering (VQA), where models often exploit dataset biases for the final decision without considering the image information.

Question Answering, Visual Question Answering

Fast Batch Nuclear-norm Maximization and Minimization for Robust Domain Adaptation

1 code implementation 13 Jul 2021 Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian

Due to the domain discrepancy in visual domain adaptation, the performance of the source model degrades when it encounters high data density near the decision boundary in the target domain.

Domain Adaptation

Learning Invariant Representation with Consistency and Diversity for Semi-supervised Source Hypothesis Transfer

1 code implementation 7 Jul 2021 Xiaodong Wang, Junbao Zhuo, Shuhao Cui, Shuhui Wang

Semi-supervised domain adaptation (SSDA) aims to solve tasks in the target domain by utilizing transferable information learned from the available source domain and a few labeled target data.

Domain Adaptation

Mining Latent Structures for Multimedia Recommendation

1 code implementation 19 Apr 2021 Jinghao Zhang, Yanqiao Zhu, Qiang Liu, Shu Wu, Shuhui Wang, Liang Wang

To be specific, in the proposed LATTICE model, we devise a novel modality-aware structure learning layer, which learns item-item structures for each modality and aggregates multiple modalities to obtain latent item graphs.

Collaborative Filtering, Recommendation Systems

QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval

no code implementations CVPR 2021 Xiaodan Li, Jinfeng Li, Yuefeng Chen, Shaokai Ye, Yuan He, Shuhui Wang, Hang Su, Hui Xue

Comprehensive experiments show that the proposed attack achieves a high attack success rate with few queries against the image retrieval systems under the black-box setting.

Image Classification, Image Retrieval

Composite Adversarial Attacks

1 code implementation 10 Dec 2020 Xiaofeng Mao, Yuefeng Chen, Shuhui Wang, Hang Su, Yuan He, Hui Xue

Adversarial attack is a technique for deceiving Machine Learning (ML) models, which provides a way to evaluate the adversarial robustness.

Adversarial Attack, Adversarial Robustness

Heuristic Domain Adaptation

1 code implementation NeurIPS 2020 Shuhao Cui, Xuan Jin, Shuhui Wang, Yuan He, Qingming Huang

In visual domain adaptation (DA), separating the domain-specific characteristics from the domain-invariant representations is an ill-posed problem.

Domain Adaptation

Semantic Editing On Segmentation Map Via Multi-Expansion Loss

no code implementations 16 Oct 2020 Jianfeng He, Xuchao Zhang, Shuo Lei, Shuhui Wang, Qingming Huang, Chang-Tien Lu, Bei Xiao

Each MEx area has the mask area of the generation as the majority and the boundary of original context as the minority.

Image Inpainting

Label Decoupling Framework for Salient Object Detection

1 code implementation CVPR 2020 Jun Wei, Shuhui Wang, Zhe Wu, Chi Su, Qingming Huang, Qi Tian

Though remarkable progress has been achieved, we observe that the closer a pixel is to the edge, the more difficult it is to predict, because edge pixels have a very imbalanced distribution.

Ranked #1 on Salient Object Detection on DUTS-TE (MAE metric)

RGB Salient Object Detection, Saliency Detection, +1

Sharp Multiple Instance Learning for DeepFake Video Detection

no code implementations 11 Aug 2020 Xiaodan Li, Yining Lang, Yuefeng Chen, Xiaofeng Mao, Yuan He, Shuhui Wang, Hui Xue, Quan Lu

A sharp MIL (S-MIL) is proposed which builds direct mapping from instance embeddings to bag prediction, rather than from instance embeddings to instance prediction and then to bag prediction in traditional MIL.

Face Swapping, Multiple Instance Learning

State-Relabeling Adversarial Active Learning

1 code implementation CVPR 2020 Beichen Zhang, Liang Li, Shijie Yang, Shuhui Wang, Zheng-Jun Zha, Qingming Huang

In this paper, we propose a state relabeling adversarial active learning model (SRAAL), that leverages both the annotation and the labeled/unlabeled state information for deriving the most informative unlabeled samples.

Active Learning

Gradually Vanishing Bridge for Adversarial Domain Adaptation

1 code implementation CVPR 2020 Shuhao Cui, Shuhui Wang, Junbao Zhuo, Chi Su, Qingming Huang, Qi Tian

On the discriminator, GVB helps enhance the discriminating ability and balance the adversarial training process.

Unsupervised Domain Adaptation

Towards Discriminability and Diversity: Batch Nuclear-norm Maximization under Label Insufficient Situations

1 code implementation CVPR 2020 Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian

We find by theoretical analysis that the prediction discriminability and diversity could be separately measured by the Frobenius-norm and rank of the batch output matrix.

Domain Adaptation

F3Net: Fusion, Feedback and Focus for Salient Object Detection

4 code implementations 26 Nov 2019 Jun Wei, Shuhui Wang, Qingming Huang

Furthermore, different from binary cross entropy, the proposed PPA loss doesn't treat pixels equally, which can synthesize the local structure information of a pixel to guide the network to focus more on local details.

RGB Salient Object Detection, Salient Object Detection

Knowledge-guided Pairwise Reconstruction Network for Weakly Supervised Referring Expression Grounding

1 code implementation 5 Sep 2019 Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Li Su, Qingming Huang

Weakly supervised referring expression grounding (REG) aims at localizing the referential entity in an image according to a linguistic query, where the mapping between the image region (proposal) and the query is unknown in the training stage.

Region Proposal

Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding

1 code implementation ICCV 2019 Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Dechao Meng, Qingming Huang

It builds the correspondence between image region proposal and query in an adaptive manner: adaptive grounding and collaborative reconstruction.

Region Proposal

Unsupervised Open Domain Recognition by Semantic Discrepancy Minimization

1 code implementation CVPR 2019 Junbao Zhuo, Shuhui Wang, Shuhao Cui, Qingming Huang

We address the unsupervised open domain recognition (UODR) problem, where the categories in the labeled source domain S are only a subset of those in the unlabeled target domain T. The task is to correctly classify all samples in T, including known and unknown categories.

Classification, General Classification

Online Asymmetric Similarity Learning for Cross-Modal Retrieval

no code implementations CVPR 2017 Yiling Wu, Shuhui Wang, Qingming Huang

In this paper, we propose an online learning method to learn the similarity function between heterogeneous modalities by preserving the relative similarity in the training data, which is modeled as a set of bi-directional hinge loss constraints on the cross-modal training triplets.

Cross-Modal Retrieval, Semantic Similarity, +1

Similarity Gaussian Process Latent Variable Model for Multi-Modal Data Analysis

no code implementations ICCV 2015 Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian

Data from real applications involve multiple modalities representing content with the same semantics and deliver rich information from complementary aspects.

Multi-level Discriminative Dictionary Learning towards Hierarchical Visual Categorization

no code implementations CVPR 2013 Li Shen, Shuhui Wang, Gang Sun, Shuqiang Jiang, Qingming Huang

For each internode of the hierarchical category structure, a discriminative dictionary and a set of classification models are learnt for visual categorization, and the dictionaries in different layers are learnt to exploit the discriminative visual properties of different granularity.

Dictionary Learning
