Search Results for author: Qianru Sun

Found 46 papers, 37 papers with code

Freestyle Layout-to-Image Synthesis

1 code implementation CVPR 2023 Han Xue, Zhiwu Huang, Qianru Sun, Li Song, Wenjun Zhang

In this work, we explore the freestyle capability of the model, i.e., how far can it generate unseen semantics (e.g., classes, attributes, and styles) onto a given layout, and call the task Freestyle LIS (FLIS).

Image Classification Layout-to-Image Generation +2

Class-Incremental Exemplar Compression for Class-Incremental Learning

1 code implementation CVPR 2023 Zilin Luo, Yaoyao Liu, Bernt Schiele, Qianru Sun

Exemplar-based class-incremental learning (CIL) finetunes the model with all samples of new classes but few-shot exemplars of old classes in each incremental phase, where the "few-shot" abides by the limited memory budget.

Bilevel Optimization class-incremental learning +2

Unbiased Multiple Instance Learning for Weakly Supervised Video Anomaly Detection

1 code implementation CVPR 2023 Hui Lv, Zhongqi Yue, Qianru Sun, Bin Luo, Zhen Cui, Hanwang Zhang

At each MIL training iteration, we use the current detector to divide the samples into two groups with different context biases: the most confident abnormal/normal snippets and the rest ambiguous ones.

Anomaly Detection Multiple Instance Learning +1

Extracting Class Activation Maps from Non-Discriminative Features as well

1 code implementation CVPR 2023 Zhaozheng Chen, Qianru Sun

Extracting class activation maps (CAM) from a classification model often results in poor coverage on foreground objects, i.e., only the discriminative region (e.g., the "head" of "sheep") is recognized and the rest (e.g., the "leg" of "sheep") mistakenly as background.

Weakly supervised Semantic Segmentation Weakly-Supervised Semantic Segmentation

Semantic Scene Completion with Cleaner Self

1 code implementation CVPR 2023 Fengyun Wang, Dong Zhang, Hanwang Zhang, Jinhui Tang, Qianru Sun

SSC is a well-known ill-posed problem as the prediction model has to "imagine" what is behind the visible surface, which is usually represented by Truncated Signed Distance Function (TSDF).

Compositional Prompt Tuning with Motion Cues for Open-vocabulary Video Relation Detection

1 code implementation 1 Feb 2023 Kaifeng Gao, Long Chen, Hanwang Zhang, Jun Xiao, Qianru Sun

Without bells and whistles, our RePro achieves a new state-of-the-art performance on two VidVRD benchmarks of not only the base training object and predicate categories, but also the unseen ones.

Video Visual Relation Detection

RMM: Reinforced Memory Management for Class-Incremental Learning

3 code implementations NeurIPS 2021 Yaoyao Liu, Bernt Schiele, Qianru Sun

Class-Incremental Learning (CIL) [40] trains classifiers under a strict memory budget: in each incremental phase, learning is done for new data, most of which is abandoned to free space for the next phase.

class-incremental learning Class Incremental Learning +2

Online Hyperparameter Optimization for Class-Incremental Learning

1 code implementation 11 Jan 2023 Yaoyao Liu, YingYing Li, Bernt Schiele, Qianru Sun

Class-incremental learning (CIL) aims to train a classification model while the number of classes increases phase-by-phase.

class-incremental learning Class Incremental Learning +3

Attention-based Class Activation Diffusion for Weakly-Supervised Semantic Segmentation

no code implementations 20 Nov 2022 Jianqiang Huang, Jian Wang, Qianru Sun, Hanwang Zhang

An intuitive solution is "coupling" the CAM with the long-range attention matrix of visual transformers (ViT). We find that the direct "coupling", e.g., pixel-wise multiplication of attention and activation, achieves a more global coverage (on the foreground), but unfortunately goes with a great increase of false positives, i.e., background pixels are mistakenly included.

Weakly supervised Semantic Segmentation Weakly-Supervised Semantic Segmentation

Efficient Cross-Modal Video Retrieval with Meta-Optimized Frames

1 code implementation 16 Oct 2022 Ning Han, Xun Yang, Ee-Peng Lim, Hao Chen, Qianru Sun

In turn, the frame-level optimization is through gradient descent using the meta loss of video retrieval model computed on the whole video.

Bilevel Optimization Retrieval +2

Class Is Invariant to Context and Vice Versa: On Learning Invariance for Out-Of-Distribution Generalization

1 code implementation 6 Aug 2022 Jiaxin Qi, Kaihua Tang, Qianru Sun, Xian-Sheng Hua, Hanwang Zhang

If the context in every class is evenly distributed, OOD would be trivial because the context can be easily removed due to an underlying principle: class is invariant to context.

Out-of-Distribution Generalization

On Mitigating Hard Clusters for Face Clustering

1 code implementation 25 Jul 2022 Yingjie Chen, Huasong Zhong, Chong Chen, Chen Shen, Jianqiang Huang, Tao Wang, Yun Liang, Qianru Sun

Face clustering is a promising way to scale up face recognition systems using large-scale unlabeled face images.

Face Clustering Face Recognition

Equivariance and Invariance Inductive Bias for Learning from Insufficient Data

1 code implementation 25 Jul 2022 Tan Wang, Qianru Sun, Sugiri Pranata, Karlekar Jayashree, Hanwang Zhang

We are interested in learning robust models from insufficient data, without the need for any externally pre-trained checkpoints.

Inductive Bias

Class Re-Activation Maps for Weakly-Supervised Semantic Segmentation

1 code implementation CVPR 2022 Zhaozheng Chen, Tan Wang, Xiongwei Wu, Xian-Sheng Hua, Hanwang Zhang, Qianru Sun

Specifically, due to the sum-over-class pooling nature of BCE, each pixel in CAM may be responsive to multiple classes co-occurring in the same receptive field.

Weakly supervised Semantic Segmentation Weakly-Supervised Semantic Segmentation

Deconfounded Visual Grounding

1 code implementation 31 Dec 2021 Jianqiang Huang, Yu Qin, Jiaxin Qi, Qianru Sun, Hanwang Zhang

We focus on the confounding bias between language and location in the visual grounding pipeline, where we find that the bias is the major visual reasoning bottleneck.

Referring Expression Visual Grounding +1

Self-Supervised Learning Disentangled Group Representation as Feature

1 code implementation NeurIPS 2021 Tan Wang, Zhongqi Yue, Jianqiang Huang, Qianru Sun, Hanwang Zhang

A good visual representation is an inference map from observations (images) to features (vectors) that faithfully reflects the hidden modularized generative factors (semantics).

Colorization Contrastive Learning +1

Wakening Past Concepts without Past Data: Class-incremental Learning from Placebos

no code implementations 29 Sep 2021 Yaoyao Liu, Bernt Schiele, Qianru Sun

However, we empirically observe that this both harms learning of new classes and also underperforms to distil old class knowledge from the previous phase model.

class-incremental learning Class Incremental Learning +2

Attention-based Feature Aggregation

no code implementations 29 Sep 2021 Xiongwei Wu, Ee-Peng Lim, Steven Hoi, Qianru Sun

To implement this module, we define two variants of attention: self-attention on the summed-up feature map, and cross-attention between two feature maps before summed up.

Instance Segmentation object-detection +2

Causal Attention for Unbiased Visual Recognition

1 code implementation ICCV 2021 Tan Wang, Chang Zhou, Qianru Sun, Hanwang Zhang

Attention module does not always help deep models learn causal features that are robust in any confounding context, e.g., a foreground object feature is invariant to different backgrounds.

COSY: COunterfactual SYntax for Cross-Lingual Understanding

1 code implementation ACL 2021 Sicheng Yu, Hao Zhang, Yulei Niu, Qianru Sun, Jing Jiang

Pre-trained multilingual language models, e.g., multilingual-BERT, are widely used in cross-lingual tasks, yielding the state-of-the-art performance.

Natural Language Inference POS +2

Transporting Causal Mechanisms for Unsupervised Domain Adaptation

1 code implementation ICCV 2021 Zhongqi Yue, Qianru Sun, Xian-Sheng Hua, Hanwang Zhang

However, the theoretical solution provided by transportability is far from practical for UDA, because it requires the stratification and representation of the unobserved confounder that is the cause of the domain gap.

Unsupervised Domain Adaptation

A Large-Scale Benchmark for Food Image Segmentation

2 code implementations 12 May 2021 Xiongwei Wu, Xin Fu, Ying Liu, Ee-Peng Lim, Steven C. H. Hoi, Qianru Sun

Existing food image segmentation models are underperforming due to two reasons: (1) there is a lack of high quality food image datasets with fine-grained ingredient labels and pixel-wise location masks -- the existing datasets either carry coarse ingredient labels or are small in size; and (2) the complex appearance of food makes it difficult to localize and recognize ingredients in food images, e.g., the ingredients may overlap one another in the same image, and the identical ingredient may appear distinctly in different food images.

Ranked #2 on Semantic Segmentation on FoodSeg103 (using extra training data)

Image Segmentation Semantic Segmentation

Revisiting Local Descriptor for Improved Few-Shot Classification

1 code implementation 30 Mar 2021 Jun He, Richang Hong, Xueliang Liu, Mingliang Xu, Qianru Sun

Few-shot classification studies the problem of quickly adapting a deep learner to understanding novel classes based on few support images.

Classification Decision Making +1

Counterfactual Zero-Shot and Open-Set Visual Recognition

1 code implementation CVPR 2021 Zhongqi Yue, Tan Wang, Hanwang Zhang, Qianru Sun, Xian-Sheng Hua

We show that the key reason is that the generation is not Counterfactual Faithful, and thus we propose a faithful one, whose generation is from the sample-specific counterfactual question: What would the sample look like, if we set its class attribute to a certain class, while keeping its sample attribute unchanged?

Binary Classification Open Set Learning +1

Counterfactual Variable Control for Robust and Interpretable Question Answering

1 code implementation 12 Oct 2020 Sicheng Yu, Yulei Niu, Shuohang Wang, Jing Jiang, Qianru Sun

We then conduct two novel CVC inference methods (on trained models) to capture the effect of comprehensive reasoning as the final prediction.

Causal Inference Multiple-choice +2

Adaptive Aggregation Networks for Class-Incremental Learning

2 code implementations CVPR 2021 Yaoyao Liu, Bernt Schiele, Qianru Sun

Class-Incremental Learning (CIL) aims to learn a classification model with the number of classes increasing phase-by-phase.

class-incremental learning Class Incremental Learning +1

Interventional Few-Shot Learning

1 code implementation NeurIPS 2020 Zhongqi Yue, Hanwang Zhang, Qianru Sun, Xian-Sheng Hua

Specifically, we develop three effective IFSL algorithmic implementations based on the backdoor adjustment, which is essentially a causal intervention towards the SCM of many-shot learning: the upper-bound of FSL in a causal view.

Few-Shot Learning

Feature Pyramid Transformer

1 code implementation ECCV 2020 Dong Zhang, Hanwang Zhang, Jinhui Tang, Meng Wang, Xiansheng Hua, Qianru Sun

Yet, the non-local spatial interactions are not across scales, and thus they fail to capture the non-local contexts of objects (or parts) residing in different scales.

Instance Segmentation object-detection +2

Visual Commonsense R-CNN

2 code implementations CVPR 2020 Tan Wang, Jianqiang Huang, Hanwang Zhang, Qianru Sun

We present a novel unsupervised feature representation learning method, Visual Commonsense Region-based Convolutional Neural Network (VC R-CNN), to serve as an improved visual region encoder for high-level tasks such as captioning and VQA.

Image Captioning Representation Learning +1

Meta-Transfer Learning through Hard Tasks

1 code implementation 7 Oct 2019 Qianru Sun, Yaoyao Liu, Zhaozheng Chen, Tat-Seng Chua, Bernt Schiele

In this paper, we propose a novel approach called meta-transfer learning (MTL) which learns to transfer the weights of a deep NN for few-shot learning tasks.

Few-Shot Learning Transfer Learning

Joint Visual Grounding with Language Scene Graphs

no code implementations 9 Jun 2019 Daqing Liu, Hanwang Zhang, Zheng-Jun Zha, Meng Wang, Qianru Sun

In this paper, we alleviate the missing-annotation problem and enable the joint reasoning by leveraging the language scene graph which covers both labeled referent and unlabeled contexts (other objects, attributes, and relationships).

Referring Expression Visual Grounding

Learning to Self-Train for Semi-Supervised Few-Shot Classification

1 code implementation NeurIPS 2019 Xinzhe Li, Qianru Sun, Yaoyao Liu, Shibao Zheng, Qin Zhou, Tat-Seng Chua, Bernt Schiele

On each task, we train a few-shot model to predict pseudo labels for unlabeled data, and then iterate the self-training steps on labeled and pseudo-labeled data with each step followed by fine-tuning.

Classification General Classification +1

A Novel BiLevel Paradigm for Image-to-Image Translation

no code implementations 18 Apr 2019 Liqian Ma, Qianru Sun, Bernt Schiele, Luc van Gool

Image-to-image (I2I) translation is a pixel-level mapping that requires a large number of paired training data and often suffers from the problems of high diversity and strong category bias in image scenes.

Image-to-Image Translation Translation

An Ensemble of Epoch-wise Empirical Bayes for Few-shot Learning

1 code implementation ECCV 2020 Yaoyao Liu, Bernt Schiele, Qianru Sun

"Empirical" means that the hyperparameters, e.g., used for learning and ensembling the epoch-wise models, are generated by hyperprior learners conditional on task-specific data.

Few-Shot Learning

Learning a Disentangled Embedding for Monocular 3D Shape Retrieval and Pose Estimation

no code implementations 24 Dec 2018 Kyaw Zaw Lin, Weipeng Xu, Qianru Sun, Christian Theobalt, Tat-Seng Chua

We propose a novel approach to jointly perform 3D shape retrieval and pose estimation from monocular images. In order to make the method robust to real-world image variations, e.g., complex textures and backgrounds, we learn an embedding space from 3D data that only includes the relevant information, namely the shape and pose.

3D Object Retrieval 3D Shape Classification +3

Meta-Transfer Learning for Few-Shot Learning

2 code implementations CVPR 2019 Qianru Sun, Yaoyao Liu, Tat-Seng Chua, Bernt Schiele

In this paper we propose a novel few-shot learning method called meta-transfer learning (MTL) which learns to adapt a deep NN for few shot learning tasks.

Few-Shot Image Classification Few-Shot Learning +1

A Hybrid Model for Identity Obfuscation by Face Replacement

no code implementations ECCV 2018 Qianru Sun, Ayush Tewari, Weipeng Xu, Mario Fritz, Christian Theobalt, Bernt Schiele

As more and more personal photos are shared and tagged in social media, avoiding privacy risks such as unintended recognition becomes increasingly challenging.

Face Generation

Disentangled Person Image Generation

1 code implementation CVPR 2018 Liqian Ma, Qianru Sun, Stamatios Georgoulis, Luc van Gool, Bernt Schiele, Mario Fritz

Generating novel, yet realistic, images of persons is a challenging task due to the complex interplay between the different image factors, such as the foreground, background and pose information.

Gesture-to-Gesture Translation Person Re-Identification +1

Natural and Effective Obfuscation by Head Inpainting

no code implementations CVPR 2018 Qianru Sun, Liqian Ma, Seong Joon Oh, Luc van Gool, Bernt Schiele, Mario Fritz

As more and more personal photos are shared online, being able to obfuscate identities in such photos is becoming a necessity for privacy protection.

Pose Guided Person Image Generation

2 code implementations NeurIPS 2017 Liqian Ma, Xu Jia, Qianru Sun, Bernt Schiele, Tinne Tuytelaars, Luc van Gool

This paper proposes the novel Pose Guided Person Generation Network (PG$^2$) that allows to synthesize person images in arbitrary poses, based on an image of that person and a novel pose.

Gesture-to-Gesture Translation Pose Transfer

Orientation Driven Bag of Appearances for Person Re-identification

2 code implementations 9 May 2016 Liqian Ma, Hong Liu, Liang Hu, Can Wang, Qianru Sun

Experimental results on three public datasets and two proposed datasets demonstrate the superiority of the proposed approach, indicating the effectiveness of body structure and orientation information for improving re-identification performance.

Person Re-Identification
