Large-Scale Few-Shot Learning via Multi-Modal Knowledge Discovery

ECCV 2020 Shuo Wang, Jun Yue, Jianzhuang Liu, Qi Tian, Meng Wang

It is a challenging problem since (1) the identification process is susceptible to over-fitting when only limited samples of an object are available, and (2) the sample imbalance between a base (known knowledge) category and a novel category can easily bias the recognition results.

Few-Shot Learning

Interpretable Visual Reasoning via Probabilistic Formulation under Natural Supervision

no code implementations ECCV 2020 Xinzhe Han, Shuhui Wang, Chi Su, Weigang Zhang, Qingming Huang, Qi Tian

In this paper, we rethink the implicit reasoning process in VQA, and propose a new formulation which maximizes the log-likelihood of the joint distribution for the observed question and predicted answer.

Question Answering Visual Question Answering +2

Extract and Merge: Superpixel Segmentation with Regional Attributes

ECCV 2020 Jianqiao An, Yucheng Shi, Yahong Han, Meijun Sun, Qi Tian

For a certain object in an image, the relationship between its central region and the peripheral region is not well utilized in existing superpixel segmentation methods.


FTL: A universal framework for training low-bit DNNs via Feature Transfer

ECCV 2020 Kunyuan Du, Ya Zhang, Haibing Guan, Qi Tian, Shenggan Cheng, James Lin

Compared with low-bit models trained directly, the proposed framework brings 0.5% to 3.4% accuracy gains to three different quantization schemes.

Quantization Transfer Learning

API-Net: Robust Generative Classifier via a Single Discriminator

ECCV 2020 Xinshuai Dong, Hong Liu, Rongrong Ji, Liujuan Cao, Qixiang Ye, Jianzhuang Liu, Qi Tian

On the contrary, a discriminative classifier only models the conditional distribution of labels given inputs, but benefits from effective optimization owing to its succinct structure.

Robust classification

Wavelet-Based Dual-Branch Network for Image Demoiréing

ECCV 2020 Lin Liu, Jianzhuang Liu, Shanxin Yuan, Gregory Slabaugh, Aleš Leonardis, Wengang Zhou, Qi Tian

When smartphone cameras are used to take photos of digital screens, moiré patterns usually result, severely degrading photo quality.

Image Restoration Rain Removal

Multi-modal Prompting for Low-Shot Temporal Action Localization

21 Mar 2023 Chen Ju, Zeqian Li, Peisen Zhao, Ya Zhang, Xiaopeng Zhang, Qi Tian, Yanfeng Wang, Weidi Xie

In this paper, we consider the problem of temporal action localization under the low-shot (zero-shot and few-shot) scenario, with the goal of detecting and classifying action instances from arbitrary categories within untrimmed videos, even those not seen at training time.

Action Classification Temporal Action Localization

LION: Implicit Vision Prompt Tuning

17 Mar 2023 Haixin Wang, Jianlong Chang, Xiao Luo, Jinan Sun, Zhouchen Lin, Qi Tian

Despite recent competitive performance across a range of vision tasks, vision Transformers still suffer from heavy computational costs.

Transfer Learning

Focus on Your Target: A Dual Teacher-Student Framework for Domain-adaptive Semantic Segmentation

no code implementations 16 Mar 2023 Xinyue Huo, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian

Currently, a popular UDA framework lies in self-training which endows the model with two-fold abilities: (i) learning reliable semantics from the labeled images in the source domain, and (ii) adapting to the target domain via generating pseudo labels on the unlabeled images.

Semantic Segmentation Unsupervised Domain Adaptation

Gradient-Regulated Meta-Prompt Learning for Generalizable Vision-Language Models

12 Mar 2023 Juncheng Li, Minghe Gao, Longhui Wei, Siliang Tang, Wenqiao Zhang, Mengze Li, Wei Ji, Qi Tian, Tat-Seng Chua, Yueting Zhuang

Prompt tuning, a recently emerging paradigm, enables the powerful vision-language pre-training models to adapt to downstream tasks in a parameter- and data-efficient way, by learning the "soft prompts" to condition frozen pre-training models.

Domain Generalization Few-Shot Learning

R-Tuning: Regularized Prompt Tuning in Open-Set Scenarios

9 Mar 2023 Ning Liao, Xiaopeng Zhang, Min Cao, Qi Tian, Junchi Yan

In realistic open-set scenarios where labels of a part of testing data are totally unknown, current prompt methods on vision-language (VL) models always predict the unknown classes as the downstream training classes.

Open Set Learning

Rethinking Visual Prompt Learning as Masked Visual Token Modeling

9 Mar 2023 Ning Liao, Bowen Shi, Min Cao, Xiaopeng Zhang, Qi Tian, Junchi Yan

To explore prompt learning on the generative pre-trained visual model while keeping task consistency, we propose Visual Prompt learning as masked visual Token Modeling (VPTM) to transform the downstream visual classification into the pre-trained masked visual token prediction.

Lformer: Text-to-Image Generation with L-shape Block Parallel Decoding

7 Mar 2023 Jiacheng Li, Longhui Wei, Zongyuan Zhan, Xin He, Siliang Tang, Qi Tian, Yueting Zhuang

To better accelerate the generative transformers while keeping good generation quality, we propose Lformer, a semi-autoregressive text-to-image generation model.

Text-to-Image Generation

Constraint and Union for Partially-Supervised Temporal Sentence Grounding

20 Feb 2023 Chen Ju, Haicheng Wang, Jinxiang Liu, Chaofan Ma, Ya Zhang, Peisen Zhao, Jianlong Chang, Qi Tian

Temporal sentence grounding aims to detect the event timestamps described by the natural language query from given untrimmed videos.

ShiftDDPMs: Exploring Conditional Diffusion Models by Shifting Diffusion Trajectories

no code implementations 5 Feb 2023 Zijian Zhang, Zhou Zhao, Jun Yu, Qi Tian

In this paper, we propose a novel and flexible conditional diffusion model by introducing conditions into the forward process.

Denoising Image Generation

Prototype-guided Cross-task Knowledge Distillation for Large-scale Models

no code implementations 26 Dec 2022 Deng Li, Aming Wu, Yahong Han, Qi Tian

Considering the complexity and variability of real scene tasks, we propose a Prototype-guided Cross-task Knowledge Distillation (ProC-KD) approach to transfer the intrinsic local-level object knowledge of a large-scale teacher network to various task scenarios.

Knowledge Distillation

FedSkip: Combatting Statistical Heterogeneity with Federated Skip Aggregation

1 code implementation 14 Dec 2022 Ziqing Fan, Yanfeng Wang, Jiangchao Yao, Lingjuan Lyu, Ya Zhang, Qi Tian

However, in addition to previous explorations for improvement in federated averaging, our analysis shows that another critical bottleneck is the poorer optima of client models in more heterogeneous conditions.

Federated Learning

Feature Calibration Network for Occluded Pedestrian Detection

12 Dec 2022 Tianliang Zhang, Qixiang Ye, Baochang Zhang, Jianzhuang Liu, Xiaopeng Zhang, Qi Tian

FC-Net is based on the observation that the visible parts of pedestrians are selective and decisive for detection, and is implemented as a self-paced feature learning framework with a self-activation (SA) module and a feature calibration (FC) module.

Pedestrian Detection

ConfounderGAN: Protecting Image Data Privacy with Causal Confounder

4 Dec 2022 Qi Tian, Kun Kuang, Kelu Jiang, Furui Liu, Zhihua Wang, Fei Wu

The success of deep learning is partly attributed to the availability of massive data downloaded freely from the Internet.

Image Classification

Pangu-Weather: A 3D High-Resolution Model for Fast and Accurate Global Weather Forecast

no code implementations 3 Nov 2022 Kaifeng Bi, Lingxi Xie, Hengheng Zhang, Xin Chen, Xiaotao Gu, Qi Tian

In this paper, we present Pangu-Weather, a deep learning based system for fast and accurate global weather forecast.

Being Comes from Not-being: Open-vocabulary Text-to-Motion Generation with Wordless Training

no code implementations 28 Oct 2022 Junfan Lin, Jianlong Chang, Lingbo Liu, Guanbin Li, Liang Lin, Qi Tian, Chang Wen Chen

During inference, instead of changing the motion generator, our method reformulates the input text into a masked motion as the prompt for the motion generator to "reconstruct" the motion.

Language Modelling Zero-Shot Learning

End-to-End Context-Aided Unicity Matching for Person Re-identification

no code implementations 20 Oct 2022 Min Cao, Cong Ding, Chen Chen, Junchi Yan, Qi Tian

Based on a natural assumption that images belonging to the same person identity should not match with images belonging to multiple different person identities across views, called the unicity of person matching on the identity level, we propose an end-to-end person unicity matching architecture for learning and refining the person matching relations.

Graph Matching Person Re-Identification

See Blue Sky: Deep Image Dehaze Using Paired and Unpaired Training Images

2 code implementations 14 Oct 2022 Xiaoyan Zhang, Gaoyang Tang, Yingying Zhu, Qi Tian

The issue of image haze removal has attracted wide attention in recent years.

Towards a Unified View on Visual Parameter-Efficient Transfer Learning

1 code implementation 3 Oct 2022 Bruce X. B. Yu, Jianlong Chang, Lingbo Liu, Qi Tian, Chang Wen Chen

Towards this goal, we propose a framework with a unified view of PETL called visual-PETL (V-PETL) to investigate the effects of different PETL techniques, data scales of downstream domains, positions of trainable parameters, and other aspects affecting the trade-off.

Action Recognition Image Classification +2

Learnable Distribution Calibration for Few-Shot Class-Incremental Learning

no code implementations 1 Oct 2022 Binghao Liu, Boyu Yang, Lingxi Xie, Ren Wang, Qi Tian, Qixiang Ye

LDC is built upon a parameterized calibration unit (PCU), which initializes biased distributions for all classes based on classifier vectors (memory-free) and a single covariance matrix.

Few-Shot Class-Incremental Learning Few-Shot Learning +2

Low-Light Video Enhancement with Synthetic Event Guidance

23 Aug 2022 Lin Liu, Junfeng An, Jianzhuang Liu, Shanxin Yuan, Xiangyu Chen, Wengang Zhou, Houqiang Li, Yanfeng Wang, Qi Tian

Low-light video enhancement (LLVE) is an important yet challenging task with many applications such as photography and autonomous driving.

Autonomous Driving Image Enhancement +1

Prompt-Matched Semantic Segmentation

22 Aug 2022 Lingbo Liu, Jianlong Chang, Bruce X. B. Yu, Liang Lin, Qi Tian, Chang-Wen Chen

Previous methods usually fine-tuned the entire network for each specific dataset, which makes storing the massive parameters of these networks burdensome.

Representation Learning Semantic Segmentation

T-Person-GAN: Text-to-Person Image Generation with Identity-Consistency and Manifold Mix-Up

1 code implementation 18 Aug 2022 Lin Wu, Yang Wang, Feng Zheng, Qi Tian, Meng Wang

Our architecture is orthogonal to StackGAN++ and focuses on person image generation; together, they enrich the spectrum of GANs for the image generation task.

Text-to-Image Generation

Dilated Context Integrated Network with Cross-Modal Consensus for Temporal Emotion Localization in Videos

3 Aug 2022 Juncheng Li, Junlin Xie, Linchao Zhu, Long Qian, Siliang Tang, Wenqiao Zhang, Haochen Shi, Shengyu Zhang, Longhui Wei, Qi Tian, Yueting Zhuang

In this paper, we introduce a new task, named Temporal Emotion Localization in videos (TEL), which aims to detect human emotions and localize their corresponding temporal boundaries in untrimmed videos with aligned subtitles.

Emotion Classification Temporal Action Localization +1

SdAE: Self-distillated Masked Autoencoder

31 Jul 2022 Yabo Chen, Yuchen Liu, Dongsheng Jiang, Xiaopeng Zhang, Wenrui Dai, Hongkai Xiong, Qi Tian

We also analyze how to build good views for the teacher branch to produce latent representation from the perspective of information bottleneck.

Self-Supervised Learning

Skeleton-Parted Graph Scattering Networks for 3D Human Motion Prediction

1 code implementation 31 Jul 2022 Maosen Li, Siheng Chen, Zijing Zhang, Lingxi Xie, Qi Tian, Ya Zhang

To address the first issue, we propose adaptive graph scattering, which leverages multiple trainable band-pass graph filters to decompose pose features into richer graph spectrum bands.

Human motion prediction motion prediction

Fine-grained Retrieval Prompt Tuning

29 Jul 2022 Shijie Wang, Jianlong Chang, Zhihui Wang, Haojie Li, Wanli Ouyang, Qi Tian

In this paper, we develop Fine-grained Retrieval Prompt Tuning (FRPT), which steers a frozen pre-trained model to perform the fine-grained retrieval task from the perspectives of sample prompting and feature adaptation.


Pro-tuning: Unified Prompt Tuning for Vision Tasks

28 Jul 2022 Xing Nie, Bolin Ni, Jianlong Chang, Gaofeng Meng, Chunlei Huo, Zhaoxiang Zhang, Shiming Xiang, Qi Tian, Chunhong Pan

To this end, we propose parameter-efficient Prompt tuning (Pro-tuning) to adapt frozen vision models to various downstream vision tasks.

Adversarial Robustness Image Classification +4

Visual Recognition by Request

28 Jul 2022 Chufeng Tang, Lingxi Xie, Xiaopeng Zhang, Xiaolin Hu, Qi Tian

Humans have the ability to recognize visual semantics at unlimited granularity, but existing visual recognition algorithms cannot achieve this goal.

Instance Segmentation Semantic Segmentation

Active Pointly-Supervised Instance Segmentation

23 Jul 2022 Chufeng Tang, Lingxi Xie, Gang Zhang, Xiaopeng Zhang, Qi Tian, Xiaolin Hu

In this paper, we present an economical active learning setting, named active pointly-supervised instance segmentation (APIS), which starts with box-level annotations and iteratively samples a point within the box and asks if it falls on the object.

Active Learning Instance Segmentation +1

Entity-enhanced Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding

1 code implementation 18 Jul 2022 Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Zechao Li, Qi Tian, Qingming Huang

Second, most previous weakly supervised REG methods ignore the discriminative location and context of the referent, causing difficulties in distinguishing the target from other same-category objects.

Referring Expression Semantic Similarity +1

A Survey on Label-efficient Deep Image Segmentation: Bridging the Gap between Weak Supervision and Dense Prediction

no code implementations 4 Jul 2022 Wei Shen, Zelin Peng, Xuehui Wang, Huayu Wang, Jiazhong Cen, Dongsheng Jiang, Lingxi Xie, Xiaokang Yang, Qi Tian

Next, we summarize the existing label-efficient image segmentation methods from a unified perspective that discusses an important question: how to bridge the gap between weak supervision and dense prediction -- the current methods are mostly based on heuristic priors, such as cross-pixel similarity, cross-label constraint, cross-view consistency, and cross-image relation.

Image Segmentation Instance Segmentation +1

Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition

1 code implementation 1 Jul 2022 Mingkun Yang, Minghui Liao, Pu Lu, Jing Wang, Shenggao Zhu, Hualin Luo, Qi Tian, Xiang Bai

Inspired by the observation that humans learn to recognize the texts through both reading and writing, we propose to learn discrimination and generation by integrating contrastive learning and masked image modeling in our self-supervised method.

Contrastive Learning Scene Text Recognition

Towards Generalizable Person Re-identification with a Bi-stream Generative Model

no code implementations 19 Jun 2022 Xin Xu, Wei Liu, Zheng Wang, Ruiming Hu, Qi Tian

Guided by original pedestrian images, one stream is employed to learn a camera-invariant global feature for the CC problem via filtering cross-camera interference factors.

Domain Generalization Generalizable Person Re-identification

Masked Autoencoders are Robust Data Augmentors

10 Jun 2022 Haohang Xu, Shuangrui Ding, Xiaopeng Zhang, Hongkai Xiong, Qi Tian

Specifically, MRA consistently enhances the performance on supervised, semi-supervised as well as few-shot classification.

Image Augmentation Image Classification +1

DE-Net: Dynamic Text-guided Image Editing Adversarial Networks

2 Jun 2022 Ming Tao, Bing-Kun Bao, Hao Tang, Fei Wu, Longhui Wei, Qi Tian

To solve these limitations, we propose: (i) a Dynamic Editing Block (DEBlock) which composes different editing modules dynamically for various editing requirements.


HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling

30 May 2022 Xiaosong Zhang, Yunjie Tian, Wei Huang, Qixiang Ye, Qi Dai, Lingxi Xie, Qi Tian

A key idea of efficient implementation is to discard the masked image patches (or tokens) throughout the target network (encoder), which requires the encoder to be a plain vision transformer (e.g., ViT), although hierarchical vision transformers (e.g., Swin Transformer) have potentially better properties in formulating vision inputs.

Transfer Learning

Fast Dynamic Radiance Fields with Time-Aware Neural Voxels

30 May 2022 Jiemin Fang, Taoran Yi, Xinggang Wang, Lingxi Xie, Xiaopeng Zhang, Wenyu Liu, Matthias Nießner, Qi Tian

A multi-distance interpolation method is proposed and applied on voxel features to model both small and large motions.

HiVLP: Hierarchical Vision-Language Pre-Training for Fast Image-Text Retrieval

no code implementations 24 May 2022 Feilong Chen, Xiuyi Chen, Jiaxin Shi, Duzhen Zhang, Jianlong Chang, Qi Tian

It also achieves gains of about +4.9 AR on COCO and +3.8 AR on Flickr30K over LightingDot, and performs comparably to the state-of-the-art (SOTA) fusion-based model METER.

Cross-Modal Retrieval Retrieval +1

GraphQ IR: Unifying the Semantic Parsing of Graph Query Languages with One Intermediate Representation

1 code implementation 24 May 2022 Lunyiu Nie, Shulin Cao, Jiaxin Shi, Jiuding Sun, Qi Tian, Lei Hou, Juanzi Li, Jidong Zhai

Subject to the huge semantic gap between natural and formal languages, neural semantic parsing is typically bottlenecked by its complexity of dealing with both input semantics and output syntax.

Few-Shot Learning Semantic Parsing

CenterNet++ for Object Detection

18 Apr 2022 Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian

Our approach, named CenterNet, detects each object as a triplet of keypoints (the top-left and bottom-right corners and the center keypoint).

object-detection Object Detection

HyperDet3D: Learning a Scene-conditioned 3D Object Detector

CVPR 2022 Yu Zheng, Yueqi Duan, Jiwen Lu, Jie Zhou, Qi Tian

A bathtub in a library, a sink in an office, a bed in a laundry room: the counter-intuitiveness of these examples suggests that the scene provides important prior knowledge for 3D object detection, which helps eliminate ambiguous detections of similar objects.

3D Object Detection object-detection

DATA: Domain-Aware and Task-Aware Self-supervised Learning

CVPR 2022 Qing Chang, Junran Peng, Lingxi Xie, Jiajun Sun, Haoran Yin, Qi Tian, Zhaoxiang Zhang

However, due to high training costs and unawareness of downstream usage, most self-supervised learning methods lack the capability to accommodate the diversity of downstream scenarios, as there are various data domains, different vision tasks, and latency constraints on models.

Image Classification Model Selection +5

Deep Class Incremental Learning from Decentralized Data

11 Mar 2022 Xiaohan Zhang, Songlin Dong, Jinjie Chen, Qi Tian, Yihong Gong, Xiaopeng Hong

In this paper, we focus on a new and challenging decentralized machine learning paradigm in which there are continuous inflows of data to be addressed and the data are stored in multiple repositories.

Class Incremental Learning Incremental Learning +1

MVP: Multimodality-guided Visual Pre-training

10 Mar 2022 Longhui Wei, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian

Recently, masked image modeling (MIM) has become a promising direction for visual pre-training.

Language Modelling

The KFIoU Loss for Rotated Object Detection

29 Jan 2022 Xue Yang, Yue Zhou, Gefan Zhang, Jirui Yang, Wentao Wang, Junchi Yan, Xiaopeng Zhang, Qi Tian

This is in contrast to recent Gaussian-modeling-based rotation detectors, e.g., GWD loss and KLD loss, which involve a human-specified distribution distance metric that requires additional hyperparameter tuning that varies across datasets and detectors.

object-detection Object Detection In Aerial Images

GhostNets on Heterogeneous Devices via Cheap Operations

10 Jan 2022 Kai Han, Yunhe Wang, Chang Xu, Jianyuan Guo, Chunjing Xu, Enhua Wu, Qi Tian

The proposed C-Ghost module can be taken as a plug-and-play component to upgrade existing convolutional neural networks.

DeeCap: Dynamic Early Exiting for Efficient Image Captioning

On one hand, the representation in shallow layers lacks high-level semantics and sufficient cross-modal fusion information for accurate prediction.

Image Captioning Imitation Learning

One-Bit Active Query With Contrastive Pairs

no code implementations CVPR 2022 Yuhang Zhang, Xiaopeng Zhang, Lingxi Xie, Jie Li, Robert C. Qiu, Hengtong Hu, Qi Tian

The Yes query is treated as positive pairs of the queried category for contrastive pulling, while the No query is treated as hard negative pairs for contrastive repelling.

Active Learning Contrastive Learning

Wnet: Audio-Guided Video Object Segmentation via Wavelet-Based Cross-Modal Denoising Networks

1 code implementation CVPR 2022 Wenwen Pan, Haonan Shi, Zhou Zhao, Jieming Zhu, Xiuqiang He, Zhigeng Pan, Lianli Gao, Jun Yu, Fei Wu, Qi Tian

Audio-Guided video semantic segmentation is a challenging problem in visual analysis and editing, which automatically separates foreground objects from background in a video sequence according to the referring audio expressions.

Denoising Semantic Segmentation +2

Partial Class Activation Attention for Semantic Segmentation

1 code implementation CVPR 2022 Sun-Ao Liu, Hongtao Xie, Hai Xu, Yongdong Zhang, Qi Tian

Current attention-based methods for semantic segmentation mainly model pixel relation through pairwise affinity and coarse segmentation.

Semantic Segmentation

Contextual Similarity Distillation for Asymmetric Image Retrieval

no code implementations CVPR 2022 Hui Wu, Min Wang, Wengang Zhou, Houqiang Li, Qi Tian

To this end, we propose a flexible contextual similarity distillation framework to enhance the small query model and keep its output feature compatible with that of the large gallery model, which is crucial for asymmetric retrieval.

Image Retrieval Retrieval

Learning To Learn by Jointly Optimizing Neural Architecture and Weights

no code implementations CVPR 2022 Yadong Ding, Yu Wu, Chengyue Huang, Siliang Tang, Yi Yang, Longhui Wei, Yueting Zhuang, Qi Tian

Existing NAS-based meta-learning methods apply a two-stage strategy, i.e., first searching architectures and then re-training meta-weights on the searched architecture.


General Greedy De-bias Learning

1 code implementation 20 Dec 2021 Xinzhe Han, Shuhui Wang, Chi Su, Qingming Huang, Qi Tian

Existing de-bias learning frameworks try to capture specific dataset bias by annotations but they fail to handle complicated OOD scenarios.

Image Classification Question Answering +2

SiamTrans: Zero-Shot Multi-Frame Image Restoration with Pre-Trained Siamese Transformers

no code implementations 17 Dec 2021 Lin Liu, Shanxin Yuan, Jianzhuang Liu, Xin Guo, Youliang Yan, Qi Tian

For zero-shot image restoration, we design a novel model, termed SiamTrans, which is constructed by Siamese transformers, encoders, and decoders.

Denoising Image Restoration +1

Frequency Spectrum Augmentation Consistency for Domain Adaptive Object Detection

no code implementations 16 Dec 2021 Rui Liu, Yahong Han, YaoWei Wang, Qi Tian

In the second stage, augmented source and target data with pseudo labels are adopted to perform the self-training for prediction consistency.

object-detection Object Detection

Mining Minority-class Examples With Uncertainty Estimates

no code implementations 15 Dec 2021 Gursimran Singh, Lingyang Chu, Lanjun Wang, Jian Pei, Qi Tian, Yong Zhang

In the real world, the frequency of occurrence of objects is naturally skewed forming long-tail class distributions, which results in poor performance on the statistically rare classes.

Exploring Complicated Search Spaces with Interleaving-Free Sampling

no code implementations 5 Dec 2021 Yunjie Tian, Lingxi Xie, Jiemin Fang, Jianbin Jiao, Qixiang Ye, Qi Tian

In this paper, we build the search algorithm upon a complicated search space with long-distance connections, and show that existing weight-sharing search algorithms mostly fail due to the existence of interleaved connections.

Neural Architecture Search

NeuSample: Neural Sample Field for Efficient View Synthesis

1 code implementation 30 Nov 2021 Jiemin Fang, Lingxi Xie, Xinggang Wang, Xiaopeng Zhang, Wenyu Liu, Qi Tian

Neural radiance fields (NeRF) have shown great potential in representing 3D scenes and synthesizing novel views, but the computational overhead of NeRF at the inference stage is still heavy.

Semantic-Aware Generation for Self-Supervised Visual Representation Learning

1 code implementation 25 Nov 2021 Yunjie Tian, Lingxi Xie, Xiaopeng Zhang, Jiemin Fang, Haohang Xu, Wei Huang, Jianbin Jiao, Qi Tian, Qixiang Ye

In this paper, we propose a self-supervised visual representation learning approach which involves both generative and discriminative proxies, where we focus on the former part by requiring the target network to recover the original image based on the mid-level features.

Representation Learning Semantic Segmentation

Consensus Synergizes with Memory: A Simple Approach for Anomaly Segmentation in Urban Scenes

no code implementations 24 Nov 2021 Jiazhong Cen, Zenkun Jiang, Lingxi Xie, Qi Tian, Xiaokang Yang, Wei Shen

Anomaly segmentation is a crucial task for safety-critical applications, such as autonomous driving in urban scenes, where the goal is to detect out-of-distribution (OOD) objects with categories which are unseen during training.

Anomaly Detection Autonomous Driving

DVCFlow: Modeling Information Flow Towards Human-like Video Captioning

no code implementations 19 Nov 2021 Xu Yan, Zhengcong Fei, Shuhui Wang, Qingming Huang, Qi Tian

Dense video captioning (DVC) aims to generate multi-sentence descriptions to elucidate the multiple events in the video, which is challenging and demands visual consistency, discoursal coherence, and linguistic diversity.

Dense Video Captioning

DocScanner: Robust Document Image Rectification with Progressive Learning

3 code implementations 28 Oct 2021 Hao Feng, Wengang Zhou, Jiajun Deng, Qi Tian, Houqiang Li

The iterative refinements make DocScanner converge to a robust and superior rectification performance, while the lightweight recurrent architecture ensures the running efficiency.

Optical Character Recognition (OCR)

CIPS-3D: A 3D-Aware Generator of GANs Based on Conditionally-Independent Pixel Synthesis

1 code implementation 19 Oct 2021 Peng Zhou, Lingxi Xie, Bingbing Ni, Qi Tian

The style-based GAN (StyleGAN) architecture achieved state-of-the-art results for generating high-quality images, but it lacks explicit and precise control over camera poses.

3D-Aware Image Synthesis Transfer Learning

Semi-Autoregressive Image Captioning

1 code implementation 11 Oct 2021 Xu Yan, Zhengcong Fei, Zekang Li, Shuhui Wang, Qingming Huang, Qi Tian

Non-autoregressive image captioning with continuous iterative refinement, which eliminates the sequential dependence in sentence generation, can achieve performance comparable to its autoregressive counterparts with considerable acceleration.

Image Captioning

Vibration-based Uncertainty Estimation for Learning from Limited Supervision

no code implementations 29 Sep 2021 Hengtong Hu, Lingxi Xie, Yinquan Wang, Richang Hong, Meng Wang, Qi Tian

We investigate the problem of estimating uncertainty for training data, so that deep neural networks can make use of the results for learning from limited supervision.

Active Learning

Deep Encryption: Protecting Pre-Trained Neural Networks with Confusion Neurons

no code implementations 29 Sep 2021 Mengbiao Zhao, Shixiong Xu, Jianlong Chang, Lingxi Xie, Jie Chen, Qi Tian

Having consumed huge amounts of training data and computational resources, large-scale pre-trained models are often considered key assets of AI service providers.

Differentiable Convolution Search for Point Cloud Processing

no code implementations ICCV 2021 Xing Nie, Yongcheng Liu, Shaohong Chen, Jianlong Chang, Chunlei Huo, Gaofeng Meng, Qi Tian, Weiming Hu, Chunhong Pan

It can work in a purely data-driven manner and thus is capable of auto-creating a group of suitable convolutions for geometric shape modeling.

Multiscale Spatio-Temporal Graph Neural Networks for 3D Skeleton-Based Motion Prediction

no code implementations 25 Aug 2021 Maosen Li, Siheng Chen, Yangheng Zhao, Ya Zhang, Yanfeng Wang, Qi Tian

The core of MST-GNN is a multiscale spatio-temporal graph that explicitly models the relations in motions at various spatial and temporal scales.

motion prediction

Pixel Difference Networks for Efficient Edge Detection

1 code implementation ICCV 2021 Zhuo Su, Wenzhe Liu, Zitong Yu, Dewen Hu, Qing Liao, Qi Tian, Matti Pietikäinen, Li Liu

A faster version of PiDiNet with less than 0.1M parameters can still achieve performance comparable to the state of the art at 200 FPS.

BSDS500 Edge Detection

Greedy Gradient Ensemble for Robust Visual Question Answering

1 code implementation ICCV 2021 Xinzhe Han, Shuhui Wang, Chi Su, Qingming Huang, Qi Tian

Language bias is a critical issue in Visual Question Answering (VQA), where models often exploit dataset biases for the final decision without considering the image information.

Question Answering Visual Question Answering +1

Revisiting Catastrophic Forgetting in Class Incremental Learning

no code implementations 26 Jul 2021 Zixuan Ni, Haizhou Shi, Siliang Tang, Longhui Wei, Qi Tian, Yueting Zhuang

After investigating existing strategies, we observe that there is a lack of study on how to prevent the inter-phase confusion.

Class Incremental Learning Contrastive Learning +2

Domain Adaptation without Model Transferring

no code implementations 21 Jul 2021 Kunhong Wu, Yucheng Shi, Yahong Han, Yunfeng Shao, Bingshuai Li, Qi Tian

Existing unsupervised domain adaptation (UDA) methods can achieve promising performance without transferring data from the source domain to the target domain.

Unsupervised Domain Adaptation

Fast Batch Nuclear-norm Maximization and Minimization for Robust Domain Adaptation

1 code implementation 13 Jul 2021 Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian

Due to the domain discrepancy in visual domain adaptation, the performance of the source model degrades when it encounters high data density near the decision boundary in the target domain.

Domain Adaptation

Bag of Instances Aggregation Boosts Self-supervised Distillation

1 code implementation ICLR 2022 Haohang Xu, Jiemin Fang, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian

Here, a bag of instances is a set of similar samples constructed by the teacher and grouped within a bag, and the goal of distillation is to aggregate compact representations over the student with respect to the instances in a bag.

Contrastive Learning Self-Supervised Learning

ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised Image Segmentation

no code implementations CVPR 2021 Xinyue Huo, Lingxi Xie, Jianzhong He, Zijie Yang, Wengang Zhou, Houqiang Li, Qi Tian

Semi-supervised learning is a useful tool for image segmentation, mainly due to its ability to extract knowledge from unlabeled data to assist learning from labeled data.

Continual Learning Image Segmentation +2

Multi-dataset Pretraining: A Unified Model for Semantic Segmentation

no code implementations 8 Jun 2021 Bowen Shi, Xiaopeng Zhang, Haohang Xu, Wenrui Dai, Junni Zou, Hongkai Xiong, Qi Tian

This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets regardless of their taxonomy labels, and then fine-tuning the pretrained model on a specific dataset as usual.

Semantic Segmentation

Learning High-Precision Bounding Box for Rotated Object Detection via Kullback-Leibler Divergence

2 code implementations NeurIPS 2021 Xue Yang, Xiaojiang Yang, Jirui Yang, Qi Ming, Wentao Wang, Qi Tian, Junchi Yan

Taking the perspective that horizontal detection is a special case of rotated object detection, in this paper we are motivated to change the design of the rotation regression loss from an induction paradigm to a deduction methodology, in terms of the relation between rotated and horizontal detection.

object-detection Object Detection In Aerial Images +1

Exploring the Diversity and Invariance in Yourself for Visual Pre-Training Task

no code implementations 1 Jun 2021 Longhui Wei, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian

By simply pulling the different augmented views of each image together or other novel mechanisms, they can learn much unsupervised knowledge and significantly improve the transfer performance of pre-training models.

Self-Supervised Learning

Large-Scale Spatio-Temporal Person Re-identification: Algorithms and Benchmark

1 code implementation 31 May 2021 Xiujun Shu, Xiao Wang, Xianghao Zang, Shiliang Zhang, Yuanqi Chen, Ge Li, Qi Tian

We also verified that models pre-trained on LaST can generalize well on existing datasets with short-term and cloth-changing scenarios.

Person Re-Identification

Analysis and Applications of Class-wise Robustness in Adversarial Training

no code implementations 29 May 2021 Qi Tian, Kun Kuang, Kelu Jiang, Fei Wu, Yisen Wang

Adversarial training is one of the most effective approaches to improve model robustness against adversarial examples.

A Fourier-based Framework for Domain Generalization

1 code implementation CVPR 2021 Qinwei Xu, Ruipeng Zhang, Ya Zhang, Yanfeng Wang, Qi Tian

Modern deep neural networks suffer from performance degradation when evaluated on testing data under different distributions from training data.

Data Augmentation Domain Generalization

Visformer: The Vision-friendly Transformer

3 code implementations ICCV 2021 Zhengsu Chen, Lingxi Xie, Jianwei Niu, Xuefeng Liu, Longhui Wei, Qi Tian

The past year has witnessed the rapid development of applying the Transformer module to vision problems.

Image Classification

Location-Sensitive Visual Recognition with Cross-IOU Loss

1 code implementation 11 Apr 2021 Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian

Object detection, instance segmentation, and pose estimation are popular visual recognition tasks which require localizing the object by internal or boundary landmarks.

2D Human Pose Estimation Instance Segmentation +4

Adaptive Mutual Supervision for Weakly-Supervised Temporal Action Localization

no code implementations 6 Apr 2021 Chen Ju, Peisen Zhao, Siheng Chen, Ya Zhang, Xiaoyun Zhang, Qi Tian

To solve this issue, we introduce an adaptive mutual supervision framework (AMS) with two branches, where the base branch adopts CAS to localize the most discriminative action regions, while the supplementary branch localizes the less discriminative action regions through a novel adaptive sampler.

Weakly Supervised Action Localization Weakly-supervised Temporal Action Localization +1

Unsupervised Domain Adaptation for Image Classification via Structure-Conditioned Adversarial Learning

no code implementations 4 Mar 2021 Hui Wang, Jian Tian, Songyuan Li, Hanbin Zhao, Qi Tian, Fei Wu, Xi Li

Unsupervised domain adaptation (UDA) typically carries out knowledge transfer from a label-rich source domain to an unlabeled target domain by adversarial learning.

General Classification Image Classification +2

Rethinking Rotated Object Detection with Gaussian Wasserstein Distance Loss

2 code implementations 28 Jan 2021 Xue Yang, Junchi Yan, Qi Ming, Wentao Wang, Xiaopeng Zhang, Qi Tian

Boundary discontinuity and its inconsistency with the final detection metric have been the bottleneck for rotating detection regression loss design.

object-detection Object Detection In Aerial Images +2

Foreground Activation Maps for Weakly Supervised Object Localization

no code implementations ICCV 2021 Meng Meng, Tianzhu Zhang, Qi Tian, Yongdong Zhang, Feng Wu

To the best of our knowledge, this is the first work that can achieve remarkable performance for both tasks by optimizing them jointly via FAM for WSOL.

Classification Weakly Supervised Object Localization +1

Intriguing class-wise properties of adversarial training

no code implementations 1 Jan 2021 Qi Tian, Kun Kuang, Fei Wu, Yisen Wang

Adversarial training is one of the most effective approaches to improve model robustness against adversarial examples.

Adversarial Robustness

Divide and Conquer for Single-Frame Temporal Action Localization

no code implementations ICCV 2021 Chen Ju, Peisen Zhao, Siheng Chen, Ya Zhang, Yanfeng Wang, Qi Tian

Single-frame temporal action localization (STAL) aims to localize actions in untrimmed videos with only one timestamp annotation for each action instance.

Temporal Action Localization

Shape Self-Correction for Unsupervised Point Cloud Understanding

no code implementations ICCV 2021 Ye Chen, Jinxian Liu, Bingbing Ni, Hang Wang, Jiancheng Yang, Ning Liu, Teng Li, Qi Tian

Then the destroyed shape and the normal shape are sent into a point cloud network to get representations, which are employed to segment points that belong to distorted parts and further reconstruct them to restore the shape to normal.

Self-Supervised Learning

Point-Level Temporal Action Localization: Bridging Fully-supervised Proposals to Weakly-supervised Losses

no code implementations 15 Dec 2020 Chen Ju, Peisen Zhao, Ya Zhang, Yanfeng Wang, Qi Tian

Point-Level temporal action localization (PTAL) aims to localize actions in untrimmed videos with only one timestamp annotation for each action instance.

Weakly Supervised Action Localization

ESAD: End-to-end Deep Semi-supervised Anomaly Detection

no code implementations 9 Dec 2020 Chaoqin Huang, Fei Ye, Peisen Zhao, Ya Zhang, Yan-Feng Wang, Qi Tian

This paper explores semi-supervised anomaly detection, a more practical setting for anomaly detection where a small additional set of labeled samples is provided.

Ranked #24 on Anomaly Detection on One-class CIFAR-10 (using extra training data)

Medical Diagnosis Semi-supervised Anomaly Detection +1

UnrealPerson: An Adaptive Pipeline towards Costless Person Re-identification

1 code implementation CVPR 2021 Tianyu Zhang, Lingxi Xie, Longhui Wei, Zijie Zhuang, Yongfei Zhang, Bo Li, Qi Tian

The main difficulty of person re-identification (ReID) lies in collecting annotated data and transferring the model across different domains.

Domain Adaptation Image Generation +1

Seed the Views: Hierarchical Semantic Alignment for Contrastive Representation Learning

no code implementations 4 Dec 2020 Haohang Xu, Xiaopeng Zhang, Hao Li, Lingxi Xie, Hongkai Xiong, Qi Tian

In this paper, we propose a hierarchical semantic alignment strategy that expands the views generated by a single image to Cross-samples and Multi-level representation, and models the invariance to semantically similar images in a hierarchical way.

Contrastive Learning Representation Learning +2

Self-Adaptively Learning to Demoiré from Focused and Defocused Image Pairs

no code implementations NeurIPS 2020 Lin Liu, Shanxin Yuan, Jianzhuang Liu, Liping Bao, Gregory Slabaugh, Qi Tian

In this paper, we propose a self-adaptive learning method for demoiréing a high-frequency image, with the help of an additional defocused moiré-free blur image.

Omni-GAN: On the Secrets of cGANs and Beyond

1 code implementation ICCV 2021 Peng Zhou, Lingxi Xie, Bingbing Ni, Cong Geng, Qi Tian

The conditional generative adversarial network (cGAN) is a powerful tool for generating high-quality images, but existing approaches mostly suffer from unsatisfying performance or the risk of mode collapse.

Conditional Image Generation

Heterogeneous Contrastive Learning: Encoding Spatial Information for Compact Visual Representations

no code implementations 19 Nov 2020 Xinyue Huo, Lingxi Xie, Longhui Wei, Xiaopeng Zhang, Hao Li, Zijie Yang, Wengang Zhou, Houqiang Li, Qi Tian

Contrastive learning has achieved great success in self-supervised visual representation learning, but existing approaches mostly ignored spatial information which is often crucial for visual representation.

Contrastive Learning Data Augmentation +1

Privileged Knowledge Distillation for Online Action Detection

no code implementations 18 Nov 2020 Peisen Zhao, Lingxi Xie, Ya Zhang, Yanfeng Wang, Qi Tian

Knowledge distillation is employed to transfer the privileged information from the offline teacher to the online student.

Knowledge Distillation Online Action Detection

Self-Adaptively Learning to Demoire from Focused and Defocused Image Pairs

1 code implementation 3 Nov 2020 Lin Liu, Shanxin Yuan, Jianzhuang Liu, Liping Bao, Gregory Slabaugh, Qi Tian

In this paper, we propose a self-adaptive learning method for demoireing a high-frequency image, with the help of an additional defocused moire-free blur image.


CooGAN: A Memory-Efficient Framework for High-Resolution Facial Attribute Editing

1 code implementation ECCV 2020 Xuanhong Chen, Bingbing Ni, Naiyuan Liu, Ziang Liu, Yiliu Jiang, Loc Truong, Qi Tian

In contrast to the great success of memory-consuming face editing methods at low resolution, manipulating high-resolution (HR) facial images, i.e., typically larger than 768×768 pixels, with very limited memory is still challenging.

Image Generation Translation

Loss re-scaling VQA: Revisiting the Language Prior Problem from a Class-imbalance View

1 code implementation 30 Oct 2020 Yangyang Guo, Liqiang Nie, Zhiyong Cheng, Qi Tian, Min Zhang

Concretely, we design a novel interpretation scheme whereby the loss of mis-predicted frequent and sparse answers of the same question type is distinctly exhibited during the late training phase.

Face Recognition Image Classification +3

One-bit Supervision for Image Classification

1 code implementation NeurIPS 2020 Hengtong Hu, Lingxi Xie, Zewei Du, Richang Hong, Qi Tian

Instead of training a model upon the accurate label of each sample, our setting requires the model to query with a predicted label of each sample and learn from the answer whether the guess is correct.

Classification General Classification +1
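
The query-and-answer setting described in the One-bit Supervision entry above can be illustrated with a minimal sketch. The function names and the candidate-set bookkeeping below are hypothetical and are not taken from the paper; the entry only states that the model guesses a label and learns whether the guess is correct.

```python
def one_bit_query(predicted_label, true_label):
    """Hypothetical oracle: returns only one bit, whether the guess is correct."""
    return predicted_label == true_label


def update_candidates(candidates, guess, is_correct):
    """Assumed bookkeeping: narrow the set of labels still possible for a sample.

    A correct guess pins the label down; an incorrect guess only rules
    that one label out.
    """
    if is_correct:
        return {guess}
    return candidates - {guess}


# Usage: a sample whose (hidden) true label is 2, with three possible classes.
candidates = {0, 1, 2}
answer = one_bit_query(1, 2)            # the model guesses class 1
candidates = update_candidates(candidates, 1, answer)
```

A wrong guess leaves the sample only partially labeled, which is why this form of supervision is weaker than an accurate label per sample.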

Label Decoupling Framework for Salient Object Detection

1 code implementation CVPR 2020 Jun Wei, Shuhui Wang, Zhe Wu, Chi Su, Qingming Huang, Qi Tian

Though remarkable progress has been achieved, we observe that the closer the pixel is to the edge, the more difficult it is to be predicted, because edge pixels have a very imbalanced distribution.

object-detection RGB Salient Object Detection +2

Dual Distribution Alignment Network for Generalizable Person Re-Identification

1 code implementation 27 Jul 2020 Peixian Chen, Pingyang Dai, Jianzhuang Liu, Feng Zheng, Qi Tian, Rongrong Ji

Domain generalization (DG) serves as a promising solution to handle person Re-Identification (Re-ID), which trains the model using labels from the source domain alone, and then directly applies the trained model to the target domain without model updating.

Domain Generalization Generalizable Person Re-identification

Polar Relative Positional Encoding for Video-Language Segmentation

no code implementations 20 Jul 2020 Ke Ning, Lingxi Xie, Fei Wu, Qi Tian

In this paper, we propose a novel Polar Relative Positional Encoding (PRPE) mechanism that represents spatial relations in a "linguistic" way, i.e., in terms of direction and range.

Referring Expression Segmentation
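
As a rough illustration of describing a spatial relation by direction and range, as in the Polar Relative Positional Encoding entry above, the sketch below converts the offset between two 2D positions into polar form. The function name and the raw (angle, distance) output are assumptions made for illustration; the entry does not specify how PRPE discretizes or embeds these quantities.

```python
import math


def polar_relative_position(src, dst):
    """Return (direction, range) of dst relative to src.

    direction is the angle of the offset in radians, in (-pi, pi];
    range is the Euclidean distance between the two positions.
    """
    dx = dst[0] - src[0]
    dy = dst[1] - src[1]
    direction = math.atan2(dy, dx)
    distance = math.hypot(dx, dy)
    return direction, distance


# Usage: a position 3 units along x and 4 units along y from the source.
direction, distance = polar_relative_position((0.0, 0.0), (3.0, 4.0))
```

Compared with raw (dx, dy) offsets, the polar form separates "which way" from "how far", which is closer to how such relations are phrased in language.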

Social Adaptive Module for Weakly-supervised Group Activity Recognition

no code implementations ECCV 2020 Rui Yan, Lingxi Xie, Jinhui Tang, Xiangbo Shu, Qi Tian

This paper presents a new task named weakly-supervised group activity recognition (GAR) which differs from conventional GAR tasks in that only video-level labels are available, yet the important persons within each frame are not provided even in the training data.

Group Activity Recognition

Wavelet-Based Dual-Branch Network for Image Demoireing

1 code implementation 14 Jul 2020 Lin Liu, Jianzhuang Liu, Shanxin Yuan, Gregory Slabaugh, Ales Leonardis, Wengang Zhou, Qi Tian

When smartphone cameras are used to take photos of digital screens, moire patterns usually result, severely degrading photo quality.

Demoire Image Restoration +1

Universal-to-Specific Framework for Complex Action Recognition

no code implementations 13 Jul 2020 Peisen Zhao, Lingxi Xie, Ya Zhang, Qi Tian

The U2S framework is composed of three subnetworks: a universal network, a category-specific network, and a mask network.

Action Recognition Decision Making

GOLD-NAS: Gradual, One-Level, Differentiable

1 code implementation 7 Jul 2020 Kaifeng Bi, Lingxi Xie, Xin Chen, Longhui Wei, Qi Tian

There is a large body of literature on neural architecture search, but most existing work made use of heuristic rules that largely constrained the search flexibility.

Image Classification Neural Architecture Search

MgSvF: Multi-Grained Slow vs. Fast Framework for Few-Shot Class-Incremental Learning

no code implementations 28 Jun 2020 Hanbin Zhao, Yongjian Fu, Mintong Kang, Qi Tian, Fei Wu, Xi Li

As a challenging problem, few-shot class-incremental learning (FSCIL) continually learns a sequence of tasks, confronting the dilemma between slow forgetting of old knowledge and fast adaptation to new knowledge.

Few-Shot Class-Incremental Learning Incremental Learning

Searching towards Class-Aware Generators for Conditional Generative Adversarial Networks

1 code implementation 25 Jun 2020 Peng Zhou, Lingxi Xie, Xiaopeng Zhang, Bingbing Ni, Qi Tian

To learn the sampling policy, a Markov decision process is embedded into the search algorithm and a moving average is applied for better stability.

Image Generation

ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised Medical Image Segmentation

no code implementations 24 Jun 2020 Xinyue Huo, Lingxi Xie, Jianzhong He, Zijie Yang, Qi Tian

This paper focuses on a popular pipeline known as self learning, and points out a weakness named lazy learning that refers to the difficulty for a model to learn from the pseudo labels generated by itself.

Autonomous Driving Image Segmentation +3

Distilling Object Detectors with Task Adaptive Regularization

no code implementations 23 Jun 2020 Ruoyu Sun, Fuhui Tang, Xiaopeng Zhang, Hongkai Xiong, Qi Tian

Knowledge distillation, which aims at training a smaller student network by transferring knowledge from a larger teacher model, is one of the promising solutions for model miniaturization.

Knowledge Distillation Region Proposal

Cascaded Regression Tracking: Towards Online Hard Distractor Discrimination

no code implementations 18 Jun 2020 Ning Wang, Wengang Zhou, Qi Tian, Houqiang Li

In the second stage, a discrete sampling based ridge regression is designed to double-check the remaining ambiguous hard samples, which serves as an alternative to fully-connected layers and benefits from the closed-form solver for efficient learning.

regression Visual Tracking
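
The closed-form solver mentioned in the Cascaded Regression Tracking entry above refers to the standard ridge regression solution w = (XᵀX + λI)⁻¹Xᵀy. The sketch below shows that generic solution for two-feature inputs only; it is not the paper's discrete sampling based variant, and the function name and toy data are hypothetical.

```python
def ridge_fit_2d(samples, targets, lam):
    """Closed-form ridge regression for 2-feature inputs (no bias term).

    Solves w = (X^T X + lam * I)^-1 X^T y with an explicit 2x2 inverse.
    """
    # Accumulate the entries of the Gram matrix X^T X = [[a, b], [b, d]]
    # and of the vector X^T y = [u, v].
    a = b = d = 0.0
    u = v = 0.0
    for (x1, x2), y in zip(samples, targets):
        a += x1 * x1
        b += x1 * x2
        d += x2 * x2
        u += x1 * y
        v += x2 * y
    # Add the ridge penalty to the diagonal.
    a += lam
    d += lam
    det = a * d - b * b  # strictly positive whenever lam > 0
    w1 = (d * u - b * v) / det
    w2 = (a * v - b * u) / det
    return w1, w2


# Usage: two orthogonal toy samples with regularization strength 1.
weights = ridge_fit_2d([(1.0, 0.0), (0.0, 1.0)], [2.0, 3.0], lam=1.0)
```

Because the solution is obtained in one step rather than by gradient descent, it can be recomputed cheaply as new samples arrive, which is the efficiency benefit the entry alludes to.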

Rethinking Performance Estimation in Neural Architecture Search

1 code implementation CVPR 2020 Xiawu Zheng, Rongrong Ji, Qiang Wang, Qixiang Ye, Zhenguo Li, Yonghong Tian, Qi Tian

In this paper, we provide a novel yet systematic rethinking of PE in a resource constrained regime, termed budgeted PE (BPE), which precisely and effectively estimates the performance of an architecture sampled from an architecture space.

Neural Architecture Search

A Semi-Supervised Assessor of Neural Architectures

no code implementations CVPR 2020 Yehui Tang, Yunhe Wang, Yixing Xu, Hanting Chen, Chunjing Xu, Boxin Shi, Chao Xu, Qi Tian, Chang Xu

A graph convolutional neural network is introduced to predict the performance of architectures based on the learned representations and their relation modeled by the graph.

Neural Architecture Search

Projection & Probability-Driven Black-Box Attack

1 code implementation CVPR 2020 Jie Li, Rongrong Ji, Hong Liu, Jianzhuang Liu, Bineng Zhong, Cheng Deng, Qi Tian

For reducing the solution space, we first model the adversarial perturbation optimization problem as a process of recovering frequency-sparse perturbations with compressed sensing, under the setting that random noise in the low-frequency space is more likely to be adversarial.

Deep Multimodal Neural Architecture Search

no code implementations 25 Apr 2020 Zhou Yu, Yuhao Cui, Jun Yu, Meng Wang, DaCheng Tao, Qi Tian

Most existing works focus on a single task and design neural architectures manually, which are highly task-specific and hard to generalize to different tasks.

Neural Architecture Search Question Answering +4

Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks

no code implementations 17 Apr 2020 Xin Chen, Lingxi Xie, Jun Wu, Longhui Wei, Yuhui Xu, Qi Tian

We alleviate this issue by training a graph convolutional network to fit the performance of sampled sub-networks so that the impact of random errors becomes minimal.

Neural Architecture Search

Creating Something from Nothing: Unsupervised Knowledge Distillation for Cross-Modal Hashing

1 code implementation CVPR 2020 Hengtong Hu, Lingxi Xie, Richang Hong, Qi Tian

In recent years, cross-modal hashing (CMH) has attracted increasing attention, mainly because of its potential ability to map content from different modalities, especially vision and language, into the same space, so that cross-modal data retrieval becomes efficient.

Knowledge Distillation Retrieval

Gradually Vanishing Bridge for Adversarial Domain Adaptation

2 code implementations CVPR 2020 Shuhao Cui, Shuhui Wang, Junbao Zhuo, Chi Su, Qingming Huang, Qi Tian

On the discriminator, GVB helps enhance the discriminating ability and balances the adversarial training process.

Unsupervised Domain Adaptation