no code implementations • ECCV 2020 • Shuo Wang, Jun Yue, Jianzhuang Liu, Qi Tian, Meng Wang
It is a challenging problem since (1) the identifying process is susceptible to over-fitting with limited samples of an object, and (2) the sample imbalance between a base (known knowledge) category and a novel category easily biases the recognition results.
no code implementations • ECCV 2020 • Xinzhe Han, Shuhui Wang, Chi Su, Weigang Zhang, Qingming Huang, Qi Tian
In this paper, we rethink the implicit reasoning process in VQA, and propose a new formulation which maximizes the log-likelihood of the joint distribution for the observed question and predicted answer.
no code implementations • ECCV 2020 • Jianqiao An, Yucheng Shi, Yahong Han, Meijun Sun, Qi Tian
For a certain object in an image, the relationship between its central region and the peripheral region is not well utilized in existing superpixel segmentation methods.
no code implementations • ECCV 2020 • Kunyuan Du, Ya zhang, Haibing Guan, Qi Tian, Shenggan Cheng, James Lin
Compared with low-bit models trained directly, the proposed framework brings 0.5% to 3.4% accuracy gains to three different quantization schemes.
1 code implementation • ECCV 2020 • Xinshuai Dong, Hong Liu, Rongrong Ji, Liujuan Cao, Qixiang Ye, Jianzhuang Liu, Qi Tian
On the contrary, a discriminative classifier only models the conditional distribution of labels given inputs, but benefits from effective optimization owing to its succinct structure.
no code implementations • ECCV 2020 • Lin Liu, Jianzhuang Liu, Shanxin Yuan, Gregory Slabaugh, Aleš Leonardis, Wengang Zhou, Qi Tian
When smartphone cameras are used to take photos of digital screens, moiré patterns usually result, severely degrading photo quality.
no code implementations • 21 Mar 2023 • Chen Ju, Zeqian Li, Peisen Zhao, Ya zhang, Xiaopeng Zhang, Qi Tian, Yanfeng Wang, Weidi Xie
In this paper, we consider the problem of temporal action localization under the low-shot (zero-shot & few-shot) scenario, with the goal of detecting and classifying action instances from arbitrary categories within untrimmed videos, even those not seen at training time.
no code implementations • 17 Mar 2023 • Haixin Wang, Jianlong Chang, Xiao Luo, Jinan Sun, Zhouchen Lin, Qi Tian
Despite recent competitive performance across a range of vision tasks, vision Transformers still suffer from heavy computational costs.
no code implementations • 16 Mar 2023 • Xinyue Huo, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian
Currently, a popular UDA framework is self-training, which endows the model with two abilities: (i) learning reliable semantics from the labeled images in the source domain, and (ii) adapting to the target domain by generating pseudo labels on the unlabeled images.
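As a rough illustration of step (ii), the sketch below assigns pseudo labels to unlabeled target samples from predicted class probabilities. The function name `pseudo_labels`, the confidence threshold of 0.9, and the toy probabilities are assumptions made for illustration; they are not taken from the paper, and real self-training pipelines may select pseudo labels differently.

```python
def pseudo_labels(probabilities, threshold=0.9):
    """Assign a pseudo label to each sample whose top class probability
    reaches the (assumed) confidence threshold; otherwise return None."""
    labels = []
    for probs in probabilities:
        # Index of the most likely class for this sample.
        best = max(range(len(probs)), key=probs.__getitem__)
        labels.append(best if probs[best] >= threshold else None)
    return labels


# Toy predictions on three unlabeled target-domain samples (hypothetical values).
target_probs = [
    [0.95, 0.05],  # confident: class 0
    [0.60, 0.40],  # not confident: left unlabeled
    [0.02, 0.98],  # confident: class 1
]
labels = pseudo_labels(target_probs)
```

Samples that receive `None` would simply be left out of the next training round in this simplified setup.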
no code implementations • 14 Mar 2023 • Zelin Peng, Guanchun Wang, Lingxi Xie, Dongsheng Jiang, Wei Shen, Qi Tian
Seed area generation is usually the starting point of weakly supervised semantic segmentation (WSSS).
Multi-Label Classification
Weakly supervised Semantic Segmentation
no code implementations • 12 Mar 2023 • Juncheng Li, Minghe Gao, Longhui Wei, Siliang Tang, Wenqiao Zhang, Mengze Li, Wei Ji, Qi Tian, Tat-Seng Chua, Yueting Zhuang
Prompt tuning, a recently emerging paradigm, enables the powerful vision-language pre-training models to adapt to downstream tasks in a parameter- and data-efficient way, by learning "soft prompts" to condition frozen pre-training models.
no code implementations • 9 Mar 2023 • Ning Liao, Xiaopeng Zhang, Min Cao, Qi Tian, Junchi Yan
In realistic open-set scenarios where labels of a part of testing data are totally unknown, current prompt methods on vision-language (VL) models always predict the unknown classes as the downstream training classes.
no code implementations • 9 Mar 2023 • Ning Liao, Bowen Shi, Min Cao, Xiaopeng Zhang, Qi Tian, Junchi Yan
To explore prompt learning on the generative pre-trained visual model while keeping task consistency, we propose Visual Prompt learning as masked visual Token Modeling (VPTM) to transform downstream visual classification into the pre-trained masked visual token prediction.
no code implementations • 7 Mar 2023 • Jiacheng Li, Longhui Wei, Zongyuan Zhan, Xin He, Siliang Tang, Qi Tian, Yueting Zhuang
To better accelerate the generative transformers while keeping good generation quality, we propose Lformer, a semi-autoregressive text-to-image generation model.
no code implementations • 20 Feb 2023 • Chen Ju, Haicheng Wang, Jinxiang Liu, Chaofan Ma, Ya zhang, Peisen Zhao, Jianlong Chang, Qi Tian
Temporal sentence grounding aims to detect the event timestamps described by the natural language query from given untrimmed videos.
no code implementations • 5 Feb 2023 • Zijian Zhang, Zhou Zhao, Jun Yu, Qi Tian
In this paper, we propose a novel and flexible conditional diffusion model by introducing conditions into the forward process.
no code implementations • 26 Dec 2022 • Deng Li, Aming Wu, Yahong Han, Qi Tian
Considering the complexity and variability of real scene tasks, we propose a Prototype-guided Cross-task Knowledge Distillation (ProC-KD) approach to transfer the intrinsic local-level object knowledge of a large-scale teacher network to various task scenarios.
no code implementations • 19 Dec 2022 • Chen Ju, Kunhao Zheng, Jinxiang Liu, Peisen Zhao, Ya zhang, Jianlong Chang, Yanfeng Wang, Qi Tian
As a result, the dual-branch complementarity is effectively fused to promote a strong alliance.
Weakly-supervised Temporal Action Localization
1 code implementation • 14 Dec 2022 • Ziqing Fan, Yanfeng Wang, Jiangchao Yao, Lingjuan Lyu, Ya zhang, Qi Tian
However, in addition to previous explorations for improvement in federated averaging, our analysis shows that another critical bottleneck is the poorer optima of client models in more heterogeneous conditions.
no code implementations • 12 Dec 2022 • Tianliang Zhang, Qixiang Ye, Baochang Zhang, Jianzhuang Liu, Xiaopeng Zhang, Qi Tian
FC-Net is based on the observation that the visible parts of pedestrians are selective and decisive for detection, and is implemented as a self-paced feature learning framework with a self-activation (SA) module and a feature calibration (FC) module.
no code implementations • 4 Dec 2022 • Qi Tian, Kun Kuang, Kelu Jiang, Furui Liu, Zhihua Wang, Fei Wu
The success of deep learning is partly attributed to the availability of massive data downloaded freely from the Internet.
no code implementations • 28 Nov 2022 • Qi Tian, Kun Kuang, Furui Liu, Baoxiang Wang
e.g., an agent is a random policy while other agents are medium policies.
1 code implementation • 23 Nov 2022 • Yunjie Tian, Lingxi Xie, Zhaozhi Wang, Longhui Wei, Xiaopeng Zhang, Jianbin Jiao, YaoWei Wang, Qi Tian, Qixiang Ye
In this paper, we present an integral pre-training framework based on masked image modeling (MIM).
no code implementations • 3 Nov 2022 • Kaifeng Bi, Lingxi Xie, Hengheng Zhang, Xin Chen, Xiaotao Gu, Qi Tian
In this paper, we present Pangu-Weather, a deep learning based system for fast and accurate global weather forecast.
no code implementations • 28 Oct 2022 • Junfan Lin, Jianlong Chang, Lingbo Liu, Guanbin Li, Liang Lin, Qi Tian, Chang Wen Chen
During inference, instead of changing the motion generator, our method reformulates the input text into a masked motion as the prompt for the motion generator to "reconstruct" the motion.
no code implementations • 20 Oct 2022 • Min Cao, Cong Ding, Chen Chen, Junchi Yan, Qi Tian
Based on a natural assumption that images belonging to the same person identity should not match with images belonging to multiple different person identities across views, called the unicity of person matching on the identity level, we propose an end-to-end person unicity matching architecture for learning and refining the person matching relations.
2 code implementations • 14 Oct 2022 • Xiaoyan Zhang, Gaoyang Tang, Yingying Zhu, Qi Tian
The issue of image haze removal has attracted wide attention in recent years.
1 code implementation • 3 Oct 2022 • Bruce X. B. Yu, Jianlong Chang, Lingbo Liu, Qi Tian, Chang Wen Chen
Towards this goal, we propose a framework with a unified view of PETL called visual-PETL (V-PETL) to investigate the effects of different PETL techniques, data scales of downstream domains, positions of trainable parameters, and other aspects affecting the trade-off.
no code implementations • 1 Oct 2022 • Binghao Liu, Boyu Yang, Lingxi Xie, Ren Wang, Qi Tian, Qixiang Ye
LDC is built upon a parameterized calibration unit (PCU), which initializes biased distributions for all classes based on classifier vectors (memory-free) and a single covariance matrix.
no code implementations • 1 Oct 2022 • Shuangrui Ding, Weidi Xie, Yabo Chen, Rui Qian, Xiaopeng Zhang, Hongkai Xiong, Qi Tian
In this paper, we consider the task of unsupervised object discovery in videos.
no code implementations • 23 Aug 2022 • Lin Liu, Junfeng An, Jianzhuang Liu, Shanxin Yuan, Xiangyu Chen, Wengang Zhou, Houqiang Li, Yanfeng Wang, Qi Tian
Low-light video enhancement (LLVE) is an important yet challenging task with many applications such as photography and autonomous driving.
no code implementations • 22 Aug 2022 • Lingbo Liu, Jianlong Chang, Bruce X. B. Yu, Liang Lin, Qi Tian, Chang-Wen Chen
Previous methods usually fine-tune the entire network for each specific dataset, which makes it burdensome to store the massive parameters of these networks.
1 code implementation • 18 Aug 2022 • Lin Wu, Yang Wang, Feng Zheng, Qi Tian, Meng Wang
Our architecture is orthogonal to StackGAN++ and focuses on person image generation; together, they enrich the spectrum of GANs for the image generation task.
1 code implementation • 4 Aug 2022 • Juncheng Li, Xin He, Longhui Wei, Long Qian, Linchao Zhu, Lingxi Xie, Yueting Zhuang, Qi Tian, Siliang Tang
Large-scale vision-language pre-training has shown impressive advances in a wide range of downstream tasks.
1 code implementation • 3 Aug 2022 • Juncheng Li, Junlin Xie, Linchao Zhu, Long Qian, Siliang Tang, Wenqiao Zhang, Haochen Shi, Shengyu Zhang, Longhui Wei, Qi Tian, Yueting Zhuang
In this paper, we introduce a new task, named Temporal Emotion Localization in videos (TEL), which aims to detect human emotions and localize their corresponding temporal boundaries in untrimmed videos with aligned subtitles.
1 code implementation • 31 Jul 2022 • Yabo Chen, Yuchen Liu, Dongsheng Jiang, Xiaopeng Zhang, Wenrui Dai, Hongkai Xiong, Qi Tian
We also analyze how to build good views for the teacher branch to produce latent representation from the perspective of information bottleneck.
1 code implementation • 31 Jul 2022 • Maosen Li, Siheng Chen, Zijing Zhang, Lingxi Xie, Qi Tian, Ya zhang
To address the first issue, we propose adaptive graph scattering, which leverages multiple trainable band-pass graph filters to decompose pose features into richer graph spectrum bands.
no code implementations • 29 Jul 2022 • Shijie Wang, Jianlong Chang, Zhihui Wang, Haojie Li, Wanli Ouyang, Qi Tian
In this paper, we develop Fine-grained Retrieval Prompt Tuning (FRPT), which steers a frozen pre-trained model to perform the fine-grained retrieval task from the perspectives of sample prompting and feature adaptation.
no code implementations • 28 Jul 2022 • Xing Nie, Bolin Ni, Jianlong Chang, Gaomeng Meng, Chunlei Huo, Zhaoxiang Zhang, Shiming Xiang, Qi Tian, Chunhong Pan
To this end, we propose parameter-efficient Prompt tuning (Pro-tuning) to adapt frozen vision models to various downstream vision tasks.
1 code implementation • 28 Jul 2022 • Chufeng Tang, Lingxi Xie, Xiaopeng Zhang, Xiaolin Hu, Qi Tian
Humans have the ability to recognize visual semantics at unlimited granularity, but existing visual recognition algorithms cannot achieve this goal.
1 code implementation • 23 Jul 2022 • Chufeng Tang, Lingxi Xie, Gang Zhang, Xiaopeng Zhang, Qi Tian, Xiaolin Hu
In this paper, we present an economical active learning setting, named active pointly-supervised instance segmentation (APIS), which starts with box-level annotations and iteratively samples a point within the box and asks if it falls on the object.
1 code implementation • 18 Jul 2022 • Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Zechao Li, Qi Tian, Qingming Huang
Second, most previous weakly supervised REG methods ignore the discriminative location and context of the referent, causing difficulties in distinguishing the target from other same-category objects.
no code implementations • 4 Jul 2022 • Wei Shen, Zelin Peng, Xuehui Wang, Huayu Wang, Jiazhong Cen, Dongsheng Jiang, Lingxi Xie, Xiaokang Yang, Qi Tian
Next, we summarize the existing label-efficient image segmentation methods from a unified perspective that discusses an important question: how to bridge the gap between weak supervision and dense prediction -- the current methods are mostly based on heuristic priors, such as cross-pixel similarity, cross-label constraint, cross-view consistency, and cross-image relation.
1 code implementation • 1 Jul 2022 • Mingkun Yang, Minghui Liao, Pu Lu, Jing Wang, Shenggao Zhu, Hualin Luo, Qi Tian, Xiang Bai
Inspired by the observation that humans learn to recognize the texts through both reading and writing, we propose to learn discrimination and generation by integrating contrastive learning and masked image modeling in our self-supervised method.
no code implementations • 19 Jun 2022 • Xin Xu, Wei Liu, Zheng Wang, Ruiming Hu, Qi Tian
Guided by original pedestrian images, one stream is employed to learn a camera-invariant global feature for the CC problem via filtering cross-camera interference factors.
Domain Generalization
Generalizable Person Re-identification
1 code implementation • 10 Jun 2022 • Haohang Xu, Shuangrui Ding, Xiaopeng Zhang, Hongkai Xiong, Qi Tian
Specifically, MRA consistently enhances the performance on supervised, semi-supervised as well as few-shot classification.
1 code implementation • 2 Jun 2022 • Ming Tao, Bing-Kun Bao, Hao Tang, Fei Wu, Longhui Wei, Qi Tian
To solve these limitations, we propose: (i) a Dynamic Editing Block (DEBlock) which composes different editing modules dynamically for various editing requirements.
no code implementations • 30 May 2022 • Xiaosong Zhang, Yunjie Tian, Wei Huang, Qixiang Ye, Qi Dai, Lingxi Xie, Qi Tian
A key idea of efficient implementation is to discard the masked image patches (or tokens) throughout the target network (encoder), which requires the encoder to be a plain vision transformer (e.g., ViT), although hierarchical vision transformers (e.g., Swin Transformer) have potentially better properties in formulating vision inputs.
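The idea of discarding masked tokens before the encoder can be sketched in a few lines. This is a minimal illustration in plain Python, not the paper's implementation: the helper name `keep_visible_tokens`, the 75% mask ratio, and the uniform random masking are assumptions chosen for the example.

```python
import random


def keep_visible_tokens(tokens, mask_ratio, seed=0):
    """Randomly mask a fraction of tokens and return only the visible ones,
    together with their original positions (needed later to restore order)."""
    n = len(tokens)
    n_masked = int(n * mask_ratio)
    rng = random.Random(seed)
    masked = set(rng.sample(range(n), n_masked))
    visible_idx = [i for i in range(n) if i not in masked]
    return [tokens[i] for i in visible_idx], visible_idx


# 16 patch tokens, e.g. from a 4x4 grid of image patches (hypothetical input).
tokens = [f"patch_{i}" for i in range(16)]
visible, idx = keep_visible_tokens(tokens, mask_ratio=0.75)
# An encoder would now process only `visible`, a quarter of the sequence.
```

Because a plain transformer treats its input as an unordered token sequence with position information attached, dropping tokens this way is straightforward; hierarchical designs that rely on a fixed spatial grid make the same trick harder to apply.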
1 code implementation • 30 May 2022 • Jiemin Fang, Taoran Yi, Xinggang Wang, Lingxi Xie, Xiaopeng Zhang, Wenyu Liu, Matthias Nießner, Qi Tian
A multi-distance interpolation method is proposed and applied on voxel features to model both small and large motions.
no code implementations • 24 May 2022 • Feilong Chen, Xiuyi Chen, Jiaxin Shi, Duzhen Zhang, Jianlong Chang, Qi Tian
It also achieves about +4.9 AR on COCO and +3.8 AR on Flickr30K over LightingDot and achieves comparable performance with the state-of-the-art (SOTA) fusion-based model METER.
1 code implementation • 24 May 2022 • Lunyiu Nie, Shulin Cao, Jiaxin Shi, Jiuding Sun, Qi Tian, Lei Hou, Juanzi Li, Jidong Zhai
Subject to the huge semantic gap between natural and formal languages, neural semantic parsing is typically bottlenecked by its complexity of dealing with both input semantics and output syntax.
1 code implementation • 18 Apr 2022 • Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian
Our approach, named CenterNet, detects each object as a triplet of keypoints (top-left and bottom-right corners and the center keypoint).
Ranked #31 on Object Detection on COCO test-dev
no code implementations • CVPR 2022 • Yu Zheng, Yueqi Duan, Jiwen Lu, Jie zhou, Qi Tian
A bathtub in a library, a sink in an office, a bed in a laundry room: such counter-intuitive pairings suggest that the scene provides important prior knowledge for 3D object detection, which helps eliminate ambiguous detections of similar objects.
no code implementations • CVPR 2022 • Xinyue Huo, Lingxi Xie, Hengtong Hu, Wengang Zhou, Houqiang Li, Qi Tian
Unsupervised domain adaptation (UDA) is an important topic in the computer vision community.
1 code implementation • 27 Mar 2022 • Yunjie Tian, Lingxi Xie, Jiemin Fang, Mengnan Shi, Junran Peng, Xiaopeng Zhang, Jianbin Jiao, Qi Tian, Qixiang Ye
The past year has witnessed a rapid development of masked image modeling (MIM).
1 code implementation • CVPR 2022 • Qing Chang, Junran Peng, Lingxie Xie, Jiajun Sun, Haoran Yin, Qi Tian, Zhaoxiang Zhang
However, due to the high training costs and the unawareness of downstream usage, most self-supervised learning methods lack the capability to accommodate the diversity of downstream scenarios, as there are various data domains, different vision tasks and latency constraints on models.
no code implementations • 11 Mar 2022 • Xiaohan Zhang, Songlin Dong, Jinjie Chen, Qi Tian, Yihong Gong, Xiaopeng Hong
In this paper, we focus on a new and challenging decentralized machine learning paradigm in which there are continuous inflows of data to be addressed and the data are stored in multiple repositories.
no code implementations • 11 Mar 2022 • Lin Liu, Lingxi Xie, Xiaopeng Zhang, Shanxin Yuan, Xiangyu Chen, Wengang Zhou, Houqiang Li, Qi Tian
In this paper, we propose a novel approach that embeds a task-agnostic prior into a transformer.
no code implementations • 10 Mar 2022 • Longhui Wei, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian
Recently, masked image modeling (MIM) has become a promising direction for visual pre-training.
3 code implementations • 29 Jan 2022 • Xue Yang, Yue Zhou, Gefan Zhang, Jirui Yang, Wentao Wang, Junchi Yan, Xiaopeng Zhang, Qi Tian
This is in contrast to recent Gaussian modeling based rotation detectors, e.g., GWD loss and KLD loss, which involve a human-specified distribution distance metric that requires additional hyperparameter tuning that varies across datasets and detectors.
Ranked #8 on Object Detection In Aerial Images on DOTA
2 code implementations • 10 Jan 2022 • Kai Han, Yunhe Wang, Chang Xu, Jianyuan Guo, Chunjing Xu, Enhua Wu, Qi Tian
The proposed C-Ghost module can be taken as a plug-and-play component to upgrade existing convolutional neural networks.
1 code implementation • CVPR 2022 • Zhengcong Fei, Xu Yan, Shuhui Wang, Qi Tian
On the one hand, the representation in shallow layers lacks high-level semantics and sufficient cross-modal fusion information for accurate prediction.
no code implementations • CVPR 2022 • Yuhang Zhang, Xiaopeng Zhang, Lingxi Xie, Jie Li, Robert C. Qiu, Hengtong Hu, Qi Tian
The Yes query is treated as positive pairs of the queried category for contrastive pulling, while the No query is treated as hard negative pairs for contrastive repelling.
1 code implementation • CVPR 2022 • Wenwen Pan, Haonan Shi, Zhou Zhao, Jieming Zhu, Xiuqiang He, Zhigeng Pan, Lianli Gao, Jun Yu, Fei Wu, Qi Tian
Audio-Guided video semantic segmentation is a challenging problem in visual analysis and editing, which automatically separates foreground objects from background in a video sequence according to the referring audio expressions.
1 code implementation • CVPR 2022 • Sun-Ao Liu, Hongtao Xie, Hai Xu, Yongdong Zhang, Qi Tian
Current attention-based methods for semantic segmentation mainly model pixel relation through pairwise affinity and coarse segmentation.
no code implementations • CVPR 2022 • Hui Wu, Min Wang, Wengang Zhou, Houqiang Li, Qi Tian
To this end, we propose a flexible contextual similarity distillation framework to enhance the small query model and keep its output feature compatible with that of the large gallery model, which is crucial for asymmetric retrieval.
no code implementations • CVPR 2022 • Yadong Ding, Yu Wu, Chengyue Huang, Siliang Tang, Yi Yang, Longhui Wei, Yueting Zhuang, Qi Tian
Existing NAS-based meta-learning methods apply a two-stage strategy, i.e., first searching architectures and then re-training meta-weights on the searched architecture.
no code implementations • 20 Dec 2021 • Qi Tian, Kun Kuang, Baoxiang Wang, Furui Liu, Fei Wu
The node information compression aims to address the problem of what to communicate via learning compact node representations.
Multi-agent Reinforcement Learning
reinforcement-learning
1 code implementation • 20 Dec 2021 • Xinzhe Han, Shuhui Wang, Chi Su, Qingming Huang, Qi Tian
Existing de-bias learning frameworks try to capture specific dataset bias by annotations but they fail to handle complicated OOD scenarios.
no code implementations • 17 Dec 2021 • Lin Liu, Shanxin Yuan, Jianzhuang Liu, Xin Guo, Youliang Yan, Qi Tian
For zero-shot image restoration, we design a novel model, termed SiamTrans, which is constructed by Siamese transformers, encoders, and decoders.
no code implementations • 16 Dec 2021 • Rui Liu, Yahong Han, YaoWei Wang, Qi Tian
In the second stage, augmented source and target data with pseudo labels are adopted to perform the self-training for prediction consistency.
no code implementations • 15 Dec 2021 • Gursimran Singh, Lingyang Chu, Lanjun Wang, Jian Pei, Qi Tian, Yong Zhang
In the real world, the frequency of occurrence of objects is naturally skewed forming long-tail class distributions, which results in poor performance on the statistically rare classes.
no code implementations • 5 Dec 2021 • Yunjie Tian, Lingxi Xie, Jiemin Fang, Jianbin Jiao, Qixiang Ye, Qi Tian
In this paper, we build the search algorithm upon a complicated search space with long-distance connections, and show that existing weight-sharing search algorithms mostly fail due to the existence of interleaved connections.
1 code implementation • 30 Nov 2021 • Jiemin Fang, Lingxi Xie, Xinggang Wang, Xiaopeng Zhang, Wenyu Liu, Qi Tian
Neural radiance fields (NeRF) have shown great potential in representing 3D scenes and synthesizing novel views, but the computational overhead of NeRF at the inference stage is still heavy.
1 code implementation • 25 Nov 2021 • Yunjie Tian, Lingxi Xie, Xiaopeng Zhang, Jiemin Fang, Haohang Xu, Wei Huang, Jianbin Jiao, Qi Tian, Qixiang Ye
In this paper, we propose a self-supervised visual representation learning approach which involves both generative and discriminative proxies, where we focus on the former part by requiring the target network to recover the original image based on the mid-level features.
Ranked #56 on Semantic Segmentation on Cityscapes test
no code implementations • 24 Nov 2021 • Jiazhong Cen, Zenkun Jiang, Lingxi Xie, Qi Tian, Xiaokang Yang, Wei Shen
Anomaly segmentation is a crucial task for safety-critical applications, such as autonomous driving in urban scenes, where the goal is to detect out-of-distribution (OOD) objects with categories which are unseen during training.
Ranked #6 on Anomaly Detection on Fishyscapes L&F
1 code implementation • 23 Nov 2021 • Zhaobo Qi, Shuhui Wang, Chi Su, Li Su, Qingming Huang, Qi Tian
Future activity anticipation is a challenging problem in egocentric vision.
no code implementations • 19 Nov 2021 • Xu Yan, Zhengcong Fei, Shuhui Wang, Qingming Huang, Qi Tian
Dense video captioning (DVC) aims to generate multi-sentence descriptions to elucidate the multiple events in the video, which is challenging and demands visual consistency, discoursal coherence, and linguistic diversity.
3 code implementations • 28 Oct 2021 • Hao Feng, Wengang Zhou, Jiajun Deng, Qi Tian, Houqiang Li
The iterative refinements make DocScanner converge to a robust and superior rectification performance, while the lightweight recurrent architecture ensures the running efficiency.
1 code implementation • 19 Oct 2021 • Peng Zhou, Lingxi Xie, Bingbing Ni, Qi Tian
The style-based GAN (StyleGAN) architecture achieved state-of-the-art results for generating high-quality images, but it lacks explicit and precise control over camera poses.
Ranked #1 on 3D-Aware Image Synthesis on FFHQ 256 x 256
1 code implementation • 11 Oct 2021 • Xu Yan, Zhengcong Fei, Zekang Li, Shuhui Wang, Qingming Huang, Qi Tian
Non-autoregressive image captioning with continuous iterative refinement, which eliminates the sequential dependence in sentence generation, can achieve comparable performance to the autoregressive counterparts with a considerable acceleration.
no code implementations • 29 Sep 2021 • Hengtong Hu, Lingxi Xie, Yinquan Wang, Richang Hong, Meng Wang, Qi Tian
We investigate the problem of estimating uncertainty for training data, so that deep neural networks can make use of the results for learning from limited supervision.
no code implementations • 29 Sep 2021 • Mengbiao Zhao, Shixiong Xu, Jianlong Chang, Lingxi Xie, Jie Chen, Qi Tian
Having consumed huge amounts of training data and computational resources, large-scale pre-trained models are often considered key assets of AI service providers.
no code implementations • 7 Sep 2021 • Xiaoman Zhang, Weidi Xie, Chaoqin Huang, Yanfeng Wang, Ya zhang, Xin Chen, Qi Tian
In this paper, we target self-supervised representation learning for zero-shot tumor segmentation.
no code implementations • ICCV 2021 • Xing Nie, Yongcheng Liu, Shaohong Chen, Jianlong Chang, Chunlei Huo, Gaofeng Meng, Qi Tian, Weiming Hu, Chunhong Pan
It can work in a purely data-driven manner and thus is capable of auto-creating a group of suitable convolutions for geometric shape modeling.
no code implementations • 25 Aug 2021 • Maosen Li, Siheng Chen, Yangheng Zhao, Ya zhang, Yanfeng Wang, Qi Tian
The core of MST-GNN is a multiscale spatio-temporal graph that explicitly models the relations in motions at various spatial and temporal scales.
1 code implementation • ICCV 2021 • Zhuo Su, Wenzhe Liu, Zitong Yu, Dewen Hu, Qing Liao, Qi Tian, Matti Pietikäinen, Li Liu
A faster version of PiDiNet with fewer than 0.1M parameters can still achieve performance comparable to the state of the art at 200 FPS.
Ranked #2 on Edge Detection on BRIND
1 code implementation • ICCV 2021 • Xinzhe Han, Shuhui Wang, Chi Su, Qingming Huang, Qi Tian
Language bias is a critical issue in Visual Question Answering (VQA), where models often exploit dataset biases for the final decision without considering the image information.
Ranked #2 on Visual Question Answering (VQA) on VQA-CP
no code implementations • 26 Jul 2021 • Zixuan Ni, Haizhou Shi, Siliang Tang, Longhui Wei, Qi Tian, Yueting Zhuang
After investigating existing strategies, we observe that there is a lack of study on how to prevent inter-phase confusion.
1 code implementation • 24 Jul 2021 • Xiujun Shu, Ge Li, Xiao Wang, Weijian Ruan, Qi Tian
The key to this task is to exploit cloth-irrelevant cues.
no code implementations • 21 Jul 2021 • Kunhong Wu, Yucheng Shi, Yahong Han, Yunfeng Shao, Bingshuai Li, Qi Tian
Existing unsupervised domain adaptation (UDA) methods can achieve promising performance without transferring data from source domain to target domain.
1 code implementation • NeurIPS 2021 • Xu Luo, Longhui Wei, Liangjian Wen, Jinrong Yang, Lingxi Xie, Zenglin Xu, Qi Tian
The category gap between training and evaluation has been characterised as one of the main obstacles to the success of Few-Shot Learning (FSL).
1 code implementation • 13 Jul 2021 • Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian
Due to the domain discrepancy in visual domain adaptation, the performance of source model degrades when bumping into the high data density near decision boundary in target domain.
1 code implementation • ICLR 2022 • Haohang Xu, Jiemin Fang, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian
Here, a bag of instances denotes a set of similar samples constructed by the teacher and grouped within a bag, and the goal of distillation is to aggregate compact representations over the student with respect to instances in a bag.
no code implementations • CVPR 2021 • Xinyue Huo, Lingxi Xie, Jianzhong He, Zijie Yang, Wengang Zhou, Houqiang Li, Qi Tian
Semi-supervised learning is a useful tool for image segmentation, mainly due to its ability to extract knowledge from unlabeled data to assist learning from labeled data.
no code implementations • 8 Jun 2021 • Bowen Shi, Xiaopeng Zhang, Haohang Xu, Wenrui Dai, Junni Zou, Hongkai Xiong, Qi Tian
This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets regardless of their taxonomy labels, and then fine-tuning the pretrained model on a specific dataset as usual.
2 code implementations • NeurIPS 2021 • Xue Yang, Xiaojiang Yang, Jirui Yang, Qi Ming, Wentao Wang, Qi Tian, Junchi Yan
Taking the perspective that horizontal detection is a special case for rotated object detection, in this paper, we are motivated to change the design of rotation regression loss from induction paradigm to deduction methodology, in terms of the relation between rotation and horizontal detection.
Ranked #11 on Object Detection In Aerial Images on DOTA
no code implementations • 1 Jun 2021 • Longhui Wei, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian
By simply pulling the different augmented views of each image together or other novel mechanisms, they can learn much unsupervised knowledge and significantly improve the transfer performance of pre-training models.
2 code implementations • CVPR 2022 • Jiemin Fang, Lingxi Xie, Xinggang Wang, Xiaopeng Zhang, Wenyu Liu, Qi Tian
Transformers have offered a new methodology of designing neural networks for visual recognition.
1 code implementation • 31 May 2021 • Xiujun Shu, Xiao Wang, Xianghao Zang, Shiliang Zhang, Yuanqi Chen, Ge Li, Qi Tian
We also verified that models pre-trained on LaST can generalize well on existing datasets with short-term and cloth-changing scenarios.
no code implementations • 29 May 2021 • Qi Tian, Kun Kuang, Kelu Jiang, Fei Wu, Yisen Wang
Adversarial training is one of the most effective approaches to improve model robustness against adversarial examples.
no code implementations • 28 May 2021 • Lingxi Xie, Xiaopeng Zhang, Longhui Wei, Jianlong Chang, Qi Tian
This is an opinion paper.
1 code implementation • CVPR 2021 • Yuchao Li, Shaohui Lin, Jianzhuang Liu, Qixiang Ye, Mengdi Wang, Fei Chao, Fan Yang, Jincheng Ma, Qi Tian, Rongrong Ji
Channel pruning and tensor decomposition have received extensive attention in convolutional neural network compression.
1 code implementation • CVPR 2021 • Qinwei Xu, Ruipeng Zhang, Ya zhang, Yanfeng Wang, Qi Tian
Modern deep neural networks suffer from performance degradation when evaluated on testing data under different distributions from training data.
no code implementations • 16 May 2021 • Yuhang Zhang, Xiaopeng Zhang, Robert C. Qiu, Jie Li, Haohang Xu, Qi Tian
Semi-supervised learning acts as an effective way to leverage massive unlabeled data.
4 code implementations • 12 May 2021 • Hu Cao, Yueyue Wang, Joy Chen, Dongsheng Jiang, Xiaopeng Zhang, Qi Tian, Manning Wang
In the past few years, convolutional neural networks (CNNs) have achieved milestones in medical image analysis.
Ranked #2 on Medical Image Segmentation on ACDC
3 code implementations • ICCV 2021 • Zhengsu Chen, Lingxi Xie, Jianwei Niu, Xuefeng Liu, Longhui Wei, Qi Tian
The past year has witnessed the rapid development of applying the Transformer module to vision problems.
Ranked #444 on Image Classification on ImageNet
1 code implementation • 11 Apr 2021 • Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian
Object detection, instance segmentation, and pose estimation are popular visual recognition tasks which require localizing the object by internal or boundary landmarks.
Ranked #54 on Object Detection on COCO test-dev
1 code implementation • CVPR 2021 • Le Yang, Haojun Jiang, Ruojin Cai, Yulin Wang, Shiji Song, Gao Huang, Qi Tian
Reusing features in deep networks through dense connectivity is an effective way to achieve high computational efficiency.
no code implementations • 6 Apr 2021 • Chen Ju, Peisen Zhao, Siheng Chen, Ya Zhang, Xiaoyun Zhang, Qi Tian
To solve this issue, we introduce an adaptive mutual supervision framework (AMS) with two branches, where the base branch adopts CAS to localize the most discriminative action regions, while the supplementary branch localizes the less discriminative action regions through a novel adaptive sampler.
Ranked #5 on Weakly Supervised Action Localization on THUMOS14
Weakly Supervised Action Localization
Weakly-supervised Temporal Action Localization
no code implementations • 30 Mar 2021 • Tianyu Zhang, Longhui Wei, Lingxi Xie, Zijie Zhuang, Yongfei Zhang, Bo Li, Qi Tian
Recently, the Transformer module has been transplanted from natural language processing to computer vision.
2 code implementations • ICCV 2021 • Wei Gao, Fang Wan, Xingjia Pan, Zhiliang Peng, Qi Tian, Zhenjun Han, Bolei Zhou, Qixiang Ye
TS-CAM finally couples the patch tokens with the semantic-agnostic attention map to achieve semantic-aware localization.
Weakly Supervised Object Localization
Weakly-Supervised Object Localization
no code implementations • 4 Mar 2021 • Hui Wang, Jian Tian, Songyuan Li, Hanbin Zhao, Qi Tian, Fei Wu, Xi Li
Unsupervised domain adaptation (UDA) typically carries out knowledge transfer from a label-rich source domain to an unlabeled target domain by adversarial learning.
2 code implementations • 28 Jan 2021 • Xue Yang, Junchi Yan, Qi Ming, Wentao Wang, Xiaopeng Zhang, Qi Tian
Boundary discontinuity and its inconsistency with the final detection metric have been the bottleneck for rotating detection regression loss design.
Ranked #13 on Object Detection In Aerial Images on DOTA
no code implementations • ICCV 2021 • Meng Meng, Tianzhu Zhang, Qi Tian, Yongdong Zhang, Feng Wu
To the best of our knowledge, this is the first work that can achieve remarkable performance for both tasks by optimizing them jointly via FAM for WSOL.
no code implementations • 1 Jan 2021 • Qi Tian, Kun Kuang, Fei Wu, Yisen Wang
Adversarial training is one of the most effective approaches to improve model robustness against adversarial examples.
no code implementations • ICCV 2021 • Chen Ju, Peisen Zhao, Siheng Chen, Ya Zhang, Yanfeng Wang, Qi Tian
Single-frame temporal action localization (STAL) aims to localize actions in untrimmed videos with only one timestamp annotation for each action instance.
no code implementations • ICCV 2021 • Ye Chen, Jinxian Liu, Bingbing Ni, Hang Wang, Jiancheng Yang, Ning Liu, Teng Li, Qi Tian
Then the destroyed shape and the normal shape are sent into a point cloud network to get representations, which are employed to segment points that belong to distorted parts and further reconstruct them to restore the shape to normal.
no code implementations • 15 Dec 2020 • Chen Ju, Peisen Zhao, Ya Zhang, Yanfeng Wang, Qi Tian
Point-Level temporal action localization (PTAL) aims to localize actions in untrimmed videos with only one timestamp annotation for each action instance.
Ranked #2 on Weakly Supervised Action Localization on BEOID
no code implementations • 9 Dec 2020 • Chaoqin Huang, Fei Ye, Peisen Zhao, Ya Zhang, Yan-Feng Wang, Qi Tian
This paper explores semi-supervised anomaly detection, a more practical setting for anomaly detection where a small additional set of labeled samples is provided.
Ranked #24 on Anomaly Detection on One-class CIFAR-10 (using extra training data)
1 code implementation • CVPR 2021 • Tianyu Zhang, Lingxi Xie, Longhui Wei, Zijie Zhuang, Yongfei Zhang, Bo Li, Qi Tian
The main difficulty of person re-identification (ReID) lies in collecting annotated data and transferring the model across different domains.
no code implementations • 4 Dec 2020 • Haohang Xu, Xiaopeng Zhang, Hao Li, Lingxi Xie, Hongkai Xiong, Qi Tian
In this paper, we propose a hierarchical semantic alignment strategy that expands the views generated by a single image to a Cross-samples and Multi-level representation, and models the invariance to semantically similar images in a hierarchical way.
no code implementations • NeurIPS 2020 • Lin Liu, Shanxin Yuan, Jianzhuang Liu, Liping Bao, Gregory Slabaugh, Qi Tian
In this paper, we propose a self-adaptive learning method for demoiréing a high-frequency image, with the help of an additional defocused moiré-free blur image.
1 code implementation • ICCV 2021 • Peng Zhou, Lingxi Xie, Bingbing Ni, Cong Geng, Qi Tian
The conditional generative adversarial network (cGAN) is a powerful tool for generating high-quality images, but existing approaches mostly suffer from unsatisfying performance or the risk of mode collapse.
Ranked #5 on Conditional Image Generation on ImageNet 128x128
no code implementations • 19 Nov 2020 • Xinyue Huo, Lingxi Xie, Longhui Wei, Xiaopeng Zhang, Hao Li, Zijie Yang, Wengang Zhou, Houqiang Li, Qi Tian
Contrastive learning has achieved great success in self-supervised visual representation learning, but existing approaches mostly ignored spatial information which is often crucial for visual representation.
no code implementations • 18 Nov 2020 • Peisen Zhao, Lingxi Xie, Ya Zhang, Yanfeng Wang, Qi Tian
Knowledge distillation is employed to transfer the privileged information from the offline teacher to the online student.
Ranked #5 on Online Action Detection on TVSeries
no code implementations • 17 Nov 2020 • Longhui Wei, Lingxi Xie, Jianzhong He, Jianlong Chang, Xiaopeng Zhang, Wengang Zhou, Houqiang Li, Qi Tian
Recently, contrastive learning has largely advanced the progress of unsupervised visual representation learning.
1 code implementation • 3 Nov 2020 • Lin Liu, Shanxin Yuan, Jianzhuang Liu, Liping Bao, Gregory Slabaugh, Qi Tian
In this paper, we propose a self-adaptive learning method for demoiréing a high-frequency image, with the help of an additional defocused moiré-free blur image.
1 code implementation • ECCV 2020 • Xuanhong Chen, Bingbing Ni, Naiyuan Liu, Ziang Liu, Yiliu Jiang, Loc Truong, Qi Tian
In contrast to the great success of memory-consuming face editing methods at a low resolution, manipulating high-resolution (HR) facial images, i.e., typically larger than 768² pixels, with very limited memory is still challenging.
1 code implementation • 30 Oct 2020 • Yangyang Guo, Liqiang Nie, Zhiyong Cheng, Qi Tian, Min Zhang
Concretely, we design a novel interpretation scheme whereby the loss of mis-predicted frequent and sparse answers of the same question type is distinctly exhibited during the late training phase.
1 code implementation • NeurIPS 2020 • Hengtong Hu, Lingxi Xie, Zewei Du, Richang Hong, Qi Tian
Instead of training a model upon the accurate label of each sample, our setting requires the model to query with a predicted label of each sample and learn from the answer whether the guess is correct.
no code implementations • ECCV 2020 • Lijie Liu, Chufan Wu, Jiwen Lu, Lingxi Xie, Jie Zhou, Qi Tian
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Ranked #16 on Vehicle Pose Estimation on KITTI Cars Hard
1 code implementation • CVPR 2020 • Jun Wei, Shuhui Wang, Zhe Wu, Chi Su, Qingming Huang, Qi Tian
Though remarkable progress has been achieved, we observe that the closer a pixel is to the edge, the more difficult it is to predict, because edge pixels have a very imbalanced distribution.
Ranked #1 on Saliency Detection on HKU-IS
no code implementations • 4 Aug 2020 • Lingxi Xie, Xin Chen, Kaifeng Bi, Longhui Wei, Yuhui Xu, Zhengsu Chen, Lanfei Wang, An Xiao, Jianlong Chang, Xiaopeng Zhang, Qi Tian
Neural architecture search (NAS) has attracted increasing attention in both academia and industry.
2 code implementations • ECCV 2020 • Takashi Isobe, Xu Jia, Shuhang Gu, Songjiang Li, Shengjin Wang, Qi Tian
Most video super-resolution methods super-resolve a single reference frame with the help of neighboring frames in a temporal sliding window.
1 code implementation • 27 Jul 2020 • Peixian Chen, Pingyang Dai, Jianzhuang Liu, Feng Zheng, Qi Tian, Rongrong Ji
Domain generalization (DG) serves as a promising solution for person re-identification (Re-ID): it trains the model using labels from the source domain alone, and then directly applies the trained model to the target domain without model updating.
Domain Generalization
Generalizable Person Re-identification
no code implementations • 27 Jul 2020 • Pingyang Dai, Peixian Chen, Qiong Wu, Xiaopeng Hong, Qixiang Ye, Qi Tian, Rongrong Ji
This drawback limits the flexibility of UDA in complicated open-set tasks where no labels are shared between domains.
1 code implementation • ECCV 2020 • Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian
On the MS-COCO dataset, CPN achieves an AP of 49.2%, which is competitive among state-of-the-art object detection methods.
Ranked #94 on Object Detection on COCO test-dev
1 code implementation • CVPR 2020 • Takashi Isobe, Songjiang Li, Xu Jia, Shanxin Yuan, Gregory Slabaugh, Chunjing Xu, Ya-Li Li, Shengjin Wang, Qi Tian
Video super-resolution, which aims at producing a high-resolution video from its corresponding low-resolution version, has recently drawn increasing attention.
no code implementations • 20 Jul 2020 • Ke Ning, Lingxi Xie, Fei Wu, Qi Tian
In this paper, we propose a novel Polar Relative Positional Encoding (PRPE) mechanism that represents spatial relations in a "linguistic" way, i.e., in terms of direction and range.
Ranked #8 on Referring Expression Segmentation on J-HMDB
no code implementations • ECCV 2020 • Rui Yan, Lingxi Xie, Jinhui Tang, Xiangbo Shu, Qi Tian
This paper presents a new task named weakly-supervised group activity recognition (GAR) which differs from conventional GAR tasks in that only video-level labels are available, yet the important persons within each frame are not provided even in the training data.
1 code implementation • 14 Jul 2020 • Lin Liu, Jianzhuang Liu, Shanxin Yuan, Gregory Slabaugh, Ales Leonardis, Wengang Zhou, Qi Tian
When smartphone cameras are used to take photos of digital screens, moiré patterns usually result, severely degrading photo quality.
no code implementations • 13 Jul 2020 • Peisen Zhao, Lingxi Xie, Ya Zhang, Qi Tian
The U2S framework is composed of three subnetworks: a universal network, a category-specific network, and a mask network.
1 code implementation • 7 Jul 2020 • Kaifeng Bi, Lingxi Xie, Xin Chen, Longhui Wei, Qi Tian
There is a large literature on neural architecture search, but most existing work makes use of heuristic rules that largely constrain the search flexibility.
no code implementations • 28 Jun 2020 • Hanbin Zhao, Yongjian Fu, Mintong Kang, Qi Tian, Fei Wu, Xi Li
As a challenging problem, few-shot class-incremental learning (FSCIL) continually learns a sequence of tasks, confronting the dilemma between slow forgetting of old knowledge and fast adaptation to new knowledge.
1 code implementation • 25 Jun 2020 • Peng Zhou, Lingxi Xie, Xiaopeng Zhang, Bingbing Ni, Qi Tian
To learn the sampling policy, a Markov decision process is embedded into the search algorithm and a moving average is applied for better stability.
no code implementations • 24 Jun 2020 • Xinyue Huo, Lingxi Xie, Jianzhong He, Zijie Yang, Qi Tian
This paper focuses on a popular pipeline known as self learning, and points out a weakness named lazy learning that refers to the difficulty for a model to learn from the pseudo labels generated by itself.
no code implementations • 23 Jun 2020 • Ruoyu Sun, Fuhui Tang, Xiaopeng Zhang, Hongkai Xiong, Qi Tian
Knowledge distillation, which aims at training a smaller student network by transferring knowledge from a larger teacher model, is one of the promising solutions for model miniaturization.
no code implementations • 18 Jun 2020 • Ning Wang, Wengang Zhou, Qi Tian, Houqiang Li
In the second stage, a discrete-sampling-based ridge regression is designed to double-check the remaining ambiguous hard samples, which serves as an alternative to fully-connected layers and benefits from the closed-form solver for efficient learning.
1 code implementation • CVPR 2020 • Xiawu Zheng, Rongrong Ji, Qiang Wang, Qixiang Ye, Zhenguo Li, Yonghong Tian, Qi Tian
In this paper, we provide a novel yet systematic rethinking of PE in a resource constrained regime, termed budgeted PE (BPE), which precisely and effectively estimates the performance of an architecture sampled from an architecture space.
no code implementations • CVPR 2020 • Yehui Tang, Yunhe Wang, Yixing Xu, Hanting Chen, Chunjing Xu, Boxin Shi, Chao Xu, Qi Tian, Chang Xu
A graph convolutional neural network is introduced to predict the performance of architectures based on the learned representations and their relation modeled by the graph.
1 code implementation • CVPR 2020 • Jie Li, Rongrong Ji, Hong Liu, Jianzhuang Liu, Bineng Zhong, Cheng Deng, Qi Tian
To reduce the solution space, we first model the adversarial perturbation optimization problem as a process of recovering frequency-sparse perturbations with compressed sensing, under the setting that random noise in the low-frequency space is more likely to be adversarial.
no code implementations • 25 Apr 2020 • Zhou Yu, Yuhao Cui, Jun Yu, Meng Wang, DaCheng Tao, Qi Tian
Most existing works focus on a single task and design neural architectures manually, which are highly task-specific and hard to generalize to different tasks.
Ranked #28 on Visual Question Answering (VQA) on VQA v2 test-std
no code implementations • 17 Apr 2020 • Xin Chen, Lingxi Xie, Jun Wu, Longhui Wei, Yuhui Xu, Qi Tian
We alleviate this issue by training a graph convolutional network to fit the performance of sampled sub-networks so that the impact of random errors becomes minimal.
1 code implementation • CVPR 2020 • Yutian Lin, Lingxi Xie, Yu Wu, Chenggang Yan, Qi Tian
Person re-identification (re-ID) is an important topic in computer vision.
1 code implementation • 6 Apr 2020 • Hao Li, Xiaopeng Zhang, Hongkai Xiong, Qi Tian
In this paper, we propose Attribute Mix, a data augmentation strategy at attribute level to expand the fine-grained samples.
Ranked #21 on Fine-Grained Image Classification on Stanford Cars
1 code implementation • CVPR 2020 • Zhengsu Chen, Jianwei Niu, Lingxi Xie, Xuefeng Liu, Longhui Wei, Qi Tian
Automatically designing computationally efficient neural networks has received much attention in recent years.
no code implementations • CVPR 2020 • Linjun Zhou, Peng Cui, Xu Jia, Shiqiang Yang, Qi Tian
Few-shot learning has attracted intensive research attention in recent years.
1 code implementation • CVPR 2020 • Hengtong Hu, Lingxi Xie, Richang Hong, Qi Tian
In recent years, cross-modal hashing (CMH) has attracted increasing attention, mainly because of its potential ability to map contents from different modalities, especially vision and language, into the same space, which makes cross-modal data retrieval efficient.
2 code implementations • CVPR 2020 • Shuhao Cui, Shuhui Wang, Junbao Zhuo, Chi Su, Qingming Huang, Qi Tian
On the discriminator, GVB helps enhance the discriminating ability and balance the adversarial training process.
1 code implementation • 30 Mar 2020 • Junyi Feng, Songyuan Li, Xi Li, Fei Wu, Qi Tian, Ming-Hsuan Yang, Haibin Ling
Real-time semantic video segmentati