34 code implementations • CVPR 2020 • Kai Han, Yunhe Wang, Qi Tian, Jianyuan Guo, Chunjing Xu, Chang Xu
Deploying convolutional neural networks (CNNs) on embedded devices is difficult due to the limited memory and computation resources.
Ranked #864 on Image Classification on ImageNet
5 code implementations • ICCV 2021 • Zhengsu Chen, Lingxi Xie, Jianwei Niu, Xuefeng Liu, Longhui Wei, Qi Tian
The past year has witnessed the rapid development of applying the Transformer module to vision problems.
Ranked #507 on Image Classification on ImageNet
1 code implementation • 26 Sep 2023 • Yuhui Xu, Lingxi Xie, Xiaotao Gu, Xin Chen, Heng Chang, Hengheng Zhang, Zhengsu Chen, Xiaopeng Zhang, Qi Tian
Recent years have witnessed the rapid development of large language models (LLMs).
8 code implementations • 10 Jan 2022 • Kai Han, Yunhe Wang, Chang Xu, Jianyuan Guo, Chunjing Xu, Enhua Wu, Qi Tian
The proposed C-Ghost module can be taken as a plug-and-play component to upgrade existing convolutional neural networks.
20 code implementations • ICCV 2019 • Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian
In object detection, keypoint-based approaches often suffer from a large number of incorrect object bounding boxes, arguably due to the lack of an additional look into the cropped regions.
Ranked #105 on Object Detection on COCO test-dev
2 code implementations • 28 Jan 2021 • Xue Yang, Junchi Yan, Qi Ming, Wentao Wang, Xiaopeng Zhang, Qi Tian
Boundary discontinuity and its inconsistency to the final detection metric have been the bottleneck for rotating detection regression loss design.
Ranked #16 on Object Detection In Aerial Images on DOTA (using extra training data)
2 code implementations • NeurIPS 2021 • Xue Yang, Xiaojiang Yang, Jirui Yang, Qi Ming, Wentao Wang, Qi Tian, Junchi Yan
Taking the perspective that horizontal detection is a special case for rotated object detection, in this paper, we are motivated to change the design of rotation regression loss from induction paradigm to deduction methodology, in terms of the relation between rotation and horizontal detection.
Ranked #14 on Object Detection In Aerial Images on DOTA (using extra training data)
3 code implementations • 29 Jan 2022 • Xue Yang, Yue Zhou, Gefan Zhang, Jirui Yang, Wentao Wang, Junchi Yan, Xiaopeng Zhang, Qi Tian
This is in contrast to recent Gaussian-modeling-based rotation detectors, e.g., GWD loss and KLD loss, which involve a human-specified distribution distance metric that requires additional hyperparameter tuning varying across datasets and detectors.
1 code implementation • 12 Oct 2023 • Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, Xinggang Wang
Representing and rendering dynamic scenes has been an important but challenging task.
5 code implementations • 12 May 2021 • Hu Cao, Yueyue Wang, Joy Chen, Dongsheng Jiang, Xiaopeng Zhang, Qi Tian, Manning Wang
In the past few years, convolutional neural networks (CNNs) have achieved milestones in medical image analysis.
Ranked #3 on Medical Image Segmentation on ACDC
3 code implementations • ICCV 2019 • Hanting Chen, Yunhe Wang, Chang Xu, Zhaohui Yang, Chuanjian Liu, Boxin Shi, Chunjing Xu, Chao Xu, Qi Tian
Learning portable neural networks is essential for computer vision so that pre-trained heavy deep models can be applied on edge devices such as mobile phones and micro sensors.
7 code implementations • CVPR 2020 • Hanting Chen, Yunhe Wang, Chunjing Xu, Boxin Shi, Chao Xu, Qi Tian, Chang Xu
The widely-used convolutions in deep neural networks are exactly cross-correlations that measure the similarity between input features and convolution filters, which involves massive multiplications between float values.
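The cross-correlation mentioned in this entry can be illustrated with a minimal sketch. The function name `cross_correlate2d` and the toy inputs are assumptions made for illustration and are not taken from the paper; the sketch only shows that every output value is a sum of elementwise products between an input patch and the filter, so a k×k filter costs k² multiplications per output position.

```python
import numpy as np

def cross_correlate2d(feature, kernel):
    """Valid-mode 2D cross-correlation (no kernel flip), written with explicit loops.

    Returns the response map and the number of scalar multiplications performed.
    """
    fh, fw = feature.shape
    kh, kw = kernel.shape
    out = np.zeros((fh - kh + 1, fw - kw + 1))
    multiplications = 0
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = feature[i:i + kh, j:j + kw]
            # Similarity between the patch and the filter: sum of products.
            out[i, j] = np.sum(patch * kernel)
            multiplications += kh * kw
    return out, multiplications

# Toy input: a 5x5 feature map and a 3x3 all-ones filter (hypothetical values).
feature = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3))
response, n_mult = cross_correlate2d(feature, kernel)
```

Even this toy case needs 81 multiplications for a 3×3 response map, which gives a feel for why the multiplication count matters at network scale.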
3 code implementations • 3 Nov 2022 • Kaifeng Bi, Lingxi Xie, Hengheng Zhang, Xin Chen, Xiaotao Gu, Qi Tian
In this paper, we present Pangu-Weather, a deep learning based system for fast and accurate global weather forecast.
1 code implementation • NeurIPS 2023 • Jiazhong Cen, Jiemin Fang, Zanwei Zhou, Chen Yang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian
The Segment Anything Model (SAM) emerges as a powerful vision foundation model to generate high-quality 2D segmentation results.
1 code implementation • 22 May 2023 • Yabo Zhang, Yuxiang Wei, Dongsheng Jiang, Xiaopeng Zhang, WangMeng Zuo, Qi Tian
Text-driven diffusion models have unlocked unprecedented abilities in image generation, whereas their video counterpart still lags behind due to the excessive training cost of temporal modeling.
1 code implementation • 15 Feb 2024 • Chen Yang, Sikuang Li, Jiemin Fang, Ruofan Liang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian
Then we construct a Gaussian repair model based on diffusion models to supplement the omitted object information, where Gaussians are further refined.
1 code implementation • 19 Oct 2021 • Peng Zhou, Lingxi Xie, Bingbing Ni, Qi Tian
The style-based GAN (StyleGAN) architecture achieved state-of-the-art results for generating high-quality images, but it lacks explicit and precise control over camera poses.
Ranked #1 on 3D-Aware Image Synthesis on FFHQ 256 x 256
1 code implementation • 6 Apr 2020 • Hao Li, Xiaopeng Zhang, Hongkai Xiong, Qi Tian
In this paper, we propose Attribute Mix, a data augmentation strategy at attribute level to expand the fine-grained samples.
Ranked #22 on Fine-Grained Image Classification on CUB-200-2011
1 code implementation • 12 Oct 2023 • Taoran Yi, Jiemin Fang, Junjie Wang, Guanjun Wu, Lingxi Xie, Xiaopeng Zhang, Wenyu Liu, Qi Tian, Xinggang Wang
In recent times, the generation of 3D assets from text prompts has shown impressive results.
29 code implementations • ECCV 2018 • Yifan Sun, Liang Zheng, Yi Yang, Qi Tian, Shengjin Wang
RPP re-assigns these outliers to the parts they are closest to, resulting in refined parts with enhanced within-part consistency.
Ranked #3 on Person Re-Identification on UAV-Human
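The re-assignment idea in the entry above can be sketched as follows. The listing only states that outliers are re-assigned to the parts they are closest to; using the mean feature of each part and cosine similarity as the notion of "closest" is an assumption of this sketch, and the function name `reassign_to_closest_part` is hypothetical. It should not be read as the authors' implementation.

```python
import numpy as np

def reassign_to_closest_part(features, part_ids, num_parts):
    """Reassign each feature vector to the part whose mean feature is most similar.

    Assumes every part has at least one member under the initial assignment.
    """
    # Mean feature of each part under the initial (e.g. uniform) partition.
    centroids = np.stack(
        [features[part_ids == p].mean(axis=0) for p in range(num_parts)]
    )
    # Cosine similarity between every feature vector and every part centroid.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    return np.argmax(f @ c.T, axis=1)

# Toy data: the third vector sits in part 0 but looks like part 1 (an outlier).
features = np.array([
    [1.0, 0.0], [0.9, 0.1], [0.0, 1.0],   # initial part 0
    [0.0, 1.0], [0.1, 0.9], [0.1, 1.0],   # initial part 1
])
initial = np.array([0, 0, 0, 1, 1, 1])
refined = reassign_to_closest_part(features, initial, num_parts=2)
```

After the call, the outlier moves to part 1 while the remaining vectors keep their parts, which is the "enhanced within-part consistency" effect in miniature.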
25 code implementations • CVPR 2018 • Longhui Wei, Shiliang Zhang, Wen Gao, Qi Tian
Although the performance of person Re-Identification (ReID) has been significantly boosted, many challenging issues in real scenarios have not been fully investigated, e.g., the complex scenes and lighting variations, viewpoint and pose changes, and the large number of identities in a camera network.
Ranked #11 on Unsupervised Person Re-Identification on DukeMTMC-reID (Rank-10 metric)
7 code implementations • CVPR 2019 • Zhou Yu, Jun Yu, Yuhao Cui, DaCheng Tao, Qi Tian
In this paper, we propose a deep Modular Co-Attention Network (MCAN) that consists of Modular Co-Attention (MCA) layers cascaded in depth.
Ranked #5 on Question Answering on SQA3D
8 code implementations • ICLR 2020 • Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Guo-Jun Qi, Qi Tian, Hongkai Xiong
Differentiable architecture search (DARTS) provided a fast solution in finding effective network architectures, but suffered from large memory and computing overheads in jointly training a super-network and searching for an optimal architecture.
Ranked #20 on Neural Architecture Search on CIFAR-10
2 code implementations • ICCV 2021 • Zhuo Su, Wenzhe Liu, Zitong Yu, Dewen Hu, Qing Liao, Qi Tian, Matti Pietikäinen, Li Liu
A faster version of PiDiNet with fewer than 0.1M parameters can still achieve performance comparable to the state of the art at 200 FPS.
Ranked #2 on Edge Detection on BRIND
4 code implementations • ICCV 2019 • Xin Chen, Lingxi Xie, Jun Wu, Qi Tian
Recently, differentiable search methods have made major progress in reducing the computational costs of neural architecture search.
4 code implementations • 23 Dec 2019 • Xin Chen, Lingxi Xie, Jun Wu, Qi Tian
With the rapid development of neural architecture search (NAS), researchers found powerful network architectures for a wide range of vision tasks.
3 code implementations • 28 Oct 2021 • Hao Feng, Wengang Zhou, Jiajun Deng, Qi Tian, Houqiang Li
The iterative refinements make DocScanner converge to a robust and superior rectification performance, while the lightweight recurrent architecture ensures the running efficiency.
2 code implementations • CVPR 2020 • Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian
We find by theoretical analysis that the prediction discriminability and diversity could be separately measured by the Frobenius-norm and rank of the batch output matrix.
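The two quantities named in this entry can be computed directly on a batch of class-probability rows. The sketch below is illustrative only: the helper name `batch_output_stats` and the three toy matrices are assumptions, and it shows the measurements themselves rather than any training objective from the paper.

```python
import numpy as np

def batch_output_stats(probs):
    """Return (Frobenius norm, matrix rank) of a batch output matrix.

    Each row of `probs` is one sample's predicted class distribution.
    """
    fro = np.linalg.norm(probs, ord="fro")
    rank = np.linalg.matrix_rank(probs)
    return fro, rank

# Three hypothetical batches of 3 samples over 3 classes.
confident_diverse = np.eye(3)                            # one-hot, three different classes
confident_collapsed = np.tile([1.0, 0.0, 0.0], (3, 1))   # one-hot, all predicted as class 0
uncertain = np.full((3, 3), 1.0 / 3.0)                   # uniform predictions

fro_div, rank_div = batch_output_stats(confident_diverse)
fro_col, rank_col = batch_output_stats(confident_collapsed)
fro_unc, rank_unc = batch_output_stats(uncertain)
```

In this toy setting the Frobenius norm separates confident from uncertain predictions, while the rank separates diverse from collapsed predictions, matching the roles the entry assigns to the two quantities.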
2 code implementations • CVPR 2020 • Shuhao Cui, Shuhui Wang, Junbao Zhuo, Chi Su, Qingming Huang, Qi Tian
On the discriminator, GVB contributes to enhancing the discriminating ability and balancing the adversarial training process.
1 code implementation • 30 May 2022 • Jiemin Fang, Taoran Yi, Xinggang Wang, Lingxi Xie, Xiaopeng Zhang, Wenyu Liu, Matthias Nießner, Qi Tian
A multi-distance interpolation method is proposed and applied on voxel features to model both small and large motions.
1 code implementation • CVPR 2019 • Maosen Li, Siheng Chen, Xu Chen, Ya Zhang, Yan-Feng Wang, Qi Tian
We validate AS-GCN in action recognition using two skeleton data sets, NTU-RGB+D and Kinetics.
1 code implementation • 13 Jul 2021 • Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian
Due to the domain discrepancy in visual domain adaptation, the performance of the source model degrades when it encounters high data density near the decision boundary in the target domain.
2 code implementations • ICCV 2019 • Han Shu, Yunhe Wang, Xu Jia, Kai Han, Hanting Chen, Chunjing Xu, Qi Tian, Chang Xu
Generative adversarial networks (GANs) have been successfully used for a considerable number of computer vision tasks, especially image-to-image translation.
1 code implementation • CVPR 2014 • Kang Zhang, Yuqiang Fang, Dongbo Min, Lifeng Sun, Shiqiang Yang, Shuicheng Yan, Qi Tian
We first reformulate cost aggregation from a unified optimization perspective and show that different cost aggregation methods essentially differ in the choices of similarity kernels.
1 code implementation • ICCV 2019 • Xiawu Zheng, Rongrong Ji, Lang Tang, Baochang Zhang, Jianzhuang Liu, Qi Tian
Therefore, NAS can be transformed to a multinomial distribution learning problem, i.e., the distribution is optimized to have a high expectation of the performance.
1 code implementation • ECCV 2020 • Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian
On the MS-COCO dataset, CPN achieves an AP of 49.2%, which is competitive among state-of-the-art object detection methods.
Ranked #83 on Object Detection on COCO test-dev
2 code implementations • 18 Apr 2022 • Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian
Our approach, named CenterNet, detects each object as a triplet of keypoints (top-left and bottom-right corners and the center keypoint).
Ranked #35 on Object Detection on COCO test-dev
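The triplet representation in the entry above can be written down as a small data structure. Everything beyond the three keypoints is an assumption of this sketch: the class name `KeypointTriplet`, the helper `center_in_central_region`, and the `fraction` parameter are hypothetical, and the check that the center keypoint falls near the middle of the corner-defined box is one plausible way to use the third keypoint, not a statement of the authors' exact rule.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]  # (x, y) in pixels

@dataclass
class KeypointTriplet:
    top_left: Point
    bottom_right: Point
    center: Point  # detected center keypoint, not the geometric center

def center_in_central_region(t: KeypointTriplet, fraction: float = 1 / 3) -> bool:
    """Check whether the center keypoint lies in the middle `fraction` of the box."""
    x0, y0 = t.top_left
    x1, y1 = t.bottom_right
    # Geometric center of the box spanned by the two corners.
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    # Half-extent of the central region along each axis.
    half_w = (x1 - x0) * fraction / 2
    half_h = (y1 - y0) * fraction / 2
    kx, ky = t.center
    return abs(kx - cx) <= half_w and abs(ky - cy) <= half_h

plausible = KeypointTriplet((0, 0), (90, 60), (50, 35))
implausible = KeypointTriplet((0, 0), (90, 60), (5, 5))
keep_first = center_in_central_region(plausible)
keep_second = center_in_central_region(implausible)
```

A box whose detected center keypoint lies far from its middle, like the second one here, would be a candidate for rejection under this assumed rule.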
1 code implementation • CVPR 2020 • Xiawu Zheng, Rongrong Ji, Qiang Wang, Qixiang Ye, Zhenguo Li, Yonghong Tian, Qi Tian
In this paper, we provide a novel yet systematic rethinking of PE in a resource constrained regime, termed budgeted PE (BPE), which precisely and effectively estimates the performance of an architecture sampled from an architecture space.
1 code implementation • 4 Dec 2019 • Minghao Xu, Jian Zhang, Bingbing Ni, Teng Li, Chengjie Wang, Qi Tian, Wenjun Zhang
In this paper, we present adversarial domain adaptation with domain mixup (DM-ADA), which guarantees domain-invariance in a more continuous latent space and guides the domain discriminator in judging samples' difference relative to source and target domains.
1 code implementation • 11 Apr 2021 • Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian
Object detection, instance segmentation, and pose estimation are popular visual recognition tasks which require localizing the object by internal or boundary landmarks.
Ranked #45 on Object Detection on COCO test-dev
1 code implementation • CVPR 2023 • Yunjie Tian, Lingxi Xie, Jihao Qiu, Jianbin Jiao, YaoWei Wang, Qi Tian, Qixiang Ye
iTPN is born with two elaborated designs: 1) The first pre-trained feature pyramid upon vision transformer (ViT).
1 code implementation • CVPR 2021 • Qinwei Xu, Ruipeng Zhang, Ya Zhang, Yanfeng Wang, Qi Tian
Modern deep neural networks suffer from performance degradation when evaluated on testing data under different distributions from training data.
1 code implementation • CVPR 2020 • Minghao Xu, Hang Wang, Bingbing Ni, Qi Tian, Wenjun Zhang
To mitigate these problems, we propose a Graph-induced Prototype Alignment (GPA) framework to seek category-level domain alignment via elaborate prototype representations.
1 code implementation • 17 Mar 2020 • Maosen Li, Siheng Chen, Yangheng Zhao, Ya Zhang, Yan-Feng Wang, Qi Tian
The core idea of DMGNN is to use a multiscale graph to comprehensively model the internal relations of a human body for motion feature learning.
2 code implementations • ICCV 2021 • Wei Gao, Fang Wan, Xingjia Pan, Zhiliang Peng, Qi Tian, Zhenjun Han, Bolei Zhou, Qixiang Ye
TS-CAM finally couples the patch tokens with the semantic-agnostic attention map to achieve semantic-aware localization.
1 code implementation • CVPR 2020 • Takashi Isobe, Songjiang Li, Xu Jia, Shanxin Yuan, Gregory Slabaugh, Chunjing Xu, Ya-Li Li, Shengjin Wang, Qi Tian
Video super-resolution, which aims at producing a high-resolution video from its corresponding low-resolution version, has recently drawn increasing attention.
1 code implementation • CVPR 2020 • Jun Wei, Shuhui Wang, Zhe Wu, Chi Su, Qingming Huang, Qi Tian
Though remarkable progress has been achieved, we observe that the closer the pixel is to the edge, the more difficult it is to be predicted, because edge pixels have a very imbalanced distribution.
Ranked #1 on Saliency Detection on HKU-IS
1 code implementation • ECCV 2020 • Zijie Zhuang, Longhui Wei, Lingxi Xie, Tianyu Zhang, Hengheng Zhang, Haozhe Wu, Haizhou Ai, Qi Tian
The fundamental difficulty in person re-identification (ReID) lies in learning the correspondence among individual cameras.
Ranked #16 on Unsupervised Domain Adaptation on Duke to Market
Direct Transfer Person Re-identification • Domain Adaptive Person Re-Identification
1 code implementation • CVPR 2020 • Zhaohui Yang, Yunhe Wang, Xinghao Chen, Boxin Shi, Chao Xu, Chunjing Xu, Qi Tian, Chang Xu
Architectures in the population that share parameters within one SuperNet in the latest generation will be tuned over the training dataset with a few epochs.
1 code implementation • NeurIPS 2021 • Xu Luo, Longhui Wei, Liangjian Wen, Jinrong Yang, Lingxi Xie, Zenglin Xu, Qi Tian
The category gap between training and evaluation has been characterised as one of the main obstacles to the success of Few-Shot Learning (FSL).
2 code implementations • ECCV 2020 • Takashi Isobe, Xu Jia, Shuhang Gu, Songjiang Li, Shengjin Wang, Qi Tian
Most video super-resolution methods super-resolve a single reference frame with the help of neighboring frames in a temporal sliding window.
1 code implementation • CVPR 2021 • Le Yang, Haojun Jiang, Ruojin Cai, Yulin Wang, Shiji Song, Gao Huang, Qi Tian
Reusing features in deep networks through dense connectivity is an effective way to achieve high computational efficiency.
3 code implementations • CVPR 2022 • Jiemin Fang, Lingxi Xie, Xinggang Wang, Xiaopeng Zhang, Wenyu Liu, Qi Tian
Transformers have offered a new methodology of designing neural networks for visual recognition.
3 code implementations • ICCV 2021 • Peng Zhou, Lingxi Xie, Bingbing Ni, Cong Geng, Qi Tian
The conditional generative adversarial network (cGAN) is a powerful tool for generating high-quality images, but existing approaches mostly suffer from unsatisfactory performance or the risk of mode collapse.
Ranked #8 on Conditional Image Generation on ImageNet 128x128
1 code implementation • CVPR 2021 • Tianyu Zhang, Lingxi Xie, Longhui Wei, Zijie Zhuang, Yongfei Zhang, Bo Li, Qi Tian
The main difficulty of person re-identification (ReID) lies in collecting annotated data and transferring the model across different domains.
1 code implementation • 25 Oct 2019 • Kaifeng Bi, Changping Hu, Lingxi Xie, Xin Chen, Longhui Wei, Qi Tian
Our approach bridges the gap from two aspects, namely, amending the estimation on the architectural gradients, and unifying the hyper-parameter settings in the search and re-training stages.
1 code implementation • 24 May 2022 • Lunyiu Nie, Shulin Cao, Jiaxin Shi, Jiuding Sun, Qi Tian, Lei Hou, Juanzi Li, Jidong Zhai
Subject to the huge semantic gap between natural and formal languages, neural semantic parsing is typically bottlenecked by its complexity of dealing with both input semantics and output syntax.
1 code implementation • 1 Jul 2022 • Mingkun Yang, Minghui Liao, Pu Lu, Jing Wang, Shenggao Zhu, Hualin Luo, Qi Tian, Xiang Bai
Inspired by the observation that humans learn to recognize the texts through both reading and writing, we propose to learn discrimination and generation by integrating contrastive learning and masked image modeling in our self-supervised method.
1 code implementation • 10 Jun 2022 • Haohang Xu, Shuangrui Ding, Xiaopeng Zhang, Hongkai Xiong, Qi Tian
Specifically, MRA consistently enhances the performance on supervised, semi-supervised as well as few-shot classification.
1 code implementation • 22 Apr 2023 • Haitao Li, Qingyao Ai, Jia Chen, Qian Dong, Yueyue Wu, Yiqun Liu, Chong Chen, Qi Tian
Moreover, in contrast to general retrieval, the relevance in the legal domain is sensitive to key legal elements.
1 code implementation • 23 Jan 2020 • Mingbao Lin, Liujuan Cao, Shaojie Li, Qixiang Ye, Yonghong Tian, Jianzhuang Liu, Qi Tian, Rongrong Ji
Our approach, referred to as FilterSketch, encodes the second-order information of pre-trained weights, which enables the representation capacity of pruned networks to be recovered with a simple fine-tuning procedure.
1 code implementation • 30 May 2022 • Xiaosong Zhang, Yunjie Tian, Wei Huang, Qixiang Ye, Qi Dai, Lingxi Xie, Qi Tian
A key idea of efficient implementation is to discard the masked image patches (or tokens) throughout the target network (encoder), which requires the encoder to be a plain vision transformer (e.g., ViT), although hierarchical vision transformers (e.g., Swin Transformer) have potentially better properties in formulating vision inputs.
1 code implementation • 31 Jul 2022 • Yabo Chen, Yuchen Liu, Dongsheng Jiang, Xiaopeng Zhang, Wenrui Dai, Hongkai Xiong, Qi Tian
We also analyze how to build good views for the teacher branch to produce latent representation from the perspective of information bottleneck.
1 code implementation • ECCV 2020 • Peisen Zhao, Lingxi Xie, Chen Ju, Ya Zhang, Yan-Feng Wang, Qi Tian
To alleviate this problem, we introduce two regularization terms to mutually regularize the learning procedure: the Intra-phase Consistency (IntraC) regularization is proposed to make the predictions verified inside each phase; and the Inter-phase Consistency (InterC) regularization is proposed to keep consistency between these phases.
1 code implementation • CVPR 2018 • Ning Wang, Wengang Zhou, Qi Tian, Richang Hong, Meng Wang, Houqiang Li
By combining different types of features, our approach constructs multiple experts through Discriminative Correlation Filter (DCF) and each of them tracks the target independently.
1 code implementation • CVPR 2023 • Chufeng Tang, Lingxi Xie, Xiaopeng Zhang, Xiaolin Hu, Qi Tian
Humans have the ability to recognize visual semantics at unlimited granularity, but existing visual recognition algorithms cannot achieve this goal.
1 code implementation • 24 Jan 2024 • Yunjie Tian, Tianren Ma, Lingxi Xie, Jihao Qiu, Xi Tang, Yuan Zhang, Jianbin Jiao, Qi Tian, Qixiang Ye
In this study, we establish a baseline for a new task named multimodal multi-round referring and grounding (MRG), opening up a promising direction for instance-level multimodal dialogues.
1 code implementation • 5 Sep 2019 • Yiming Wu, Omar El Farouk Bourahla, Xi Li, Fei Wu, Qi Tian, Xue Zhou
While correlations between parts are ignored in the previous methods, to leverage the relations of different parts, we propose an innovative adaptive graph representation learning scheme for video person Re-ID, which enables the contextual interactions between relevant regional features.
Ranked #3 on Person Re-Identification on PRID2011
Graph Representation Learning • Video-Based Person Re-Identification
2 code implementations • 31 May 2021 • Xiujun Shu, Xiao Wang, Xianghao Zang, Shiliang Zhang, Yuanqi Chen, Ge Li, Qi Tian
We also verified that models pre-trained on LaST can generalize well on existing datasets with short-term and cloth-changing scenarios.
1 code implementation • 4 Aug 2022 • Juncheng Li, Xin He, Longhui Wei, Long Qian, Linchao Zhu, Lingxi Xie, Yueting Zhuang, Qi Tian, Siliang Tang
Large-scale vision-language pre-training has shown impressive advances in a wide range of downstream tasks.
1 code implementation • 30 Nov 2021 • Jiemin Fang, Lingxi Xie, Xinggang Wang, Xiaopeng Zhang, Wenyu Liu, Qi Tian
Neural radiance fields (NeRF) have shown great potential in representing 3D scenes and synthesizing novel views, but the computational overhead of NeRF at the inference stage is still heavy.
1 code implementation • 14 Jul 2020 • Lin Liu, Jianzhuang Liu, Shanxin Yuan, Gregory Slabaugh, Ales Leonardis, Wengang Zhou, Qi Tian
When smartphone cameras are used to take photos of digital screens, usually moire patterns result, severely degrading photo quality.
1 code implementation • CVPR 2023 • Ruipeng Zhang, Qinwei Xu, Jiangchao Yao, Ya Zhang, Qi Tian, Yanfeng Wang
Federated Domain Generalization (FedDG) attempts to learn a global model in a privacy-preserving manner that generalizes well to new clients possibly with domain shift.
1 code implementation • 19 Sep 2018 • Jinhui Tang, Xiaoyu Du, Xiangnan He, Fajie Yuan, Qi Tian, Tat-Seng Chua
To this end, we propose a novel solution named Adversarial Multimedia Recommendation (AMR), which can lead to a more robust multimedia recommender model by using adversarial learning.
Information Retrieval • Multimedia
1 code implementation • ICLR 2022 • Haohang Xu, Jiemin Fang, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian
Here, a bag of instances is a set of similar samples constructed by the teacher and grouped within a bag, and the goal of distillation is to aggregate compact representations over the student with respect to instances in a bag.
1 code implementation • 16 Aug 2023 • Keqin Bao, Jizhi Zhang, Wenjie Wang, Yang Zhang, Zhengyi Yang, Yancheng Luo, Chong Chen, Fuli Feng, Qi Tian
As the focus on Large Language Models (LLMs) in the field of recommendation intensifies, the optimization of LLMs for recommendation purposes (referred to as LLM4Rec) assumes a crucial role in augmenting their effectiveness in providing recommendations.
1 code implementation • 20 Apr 2017 • Xun Yang, Meng Wang, Richang Hong, Qi Tian, Yong Rui
To address this problem, in this paper, we propose a self-trained subspace learning paradigm for person re-ID which effectively utilizes both labeled and unlabeled data to learn a discriminative subspace where person images across disjoint camera views can be easily matched.
1 code implementation • CVPR 2022 • Sun-Ao Liu, Hongtao Xie, Hai Xu, Yongdong Zhang, Qi Tian
Current attention-based methods for semantic segmentation mainly model pixel relation through pairwise affinity and coarse segmentation.
1 code implementation • 6 Dec 2023 • Xumeng Han, Longhui Wei, Xuehui Yu, Zhiyang Dou, Xin He, Kuiran Wang, Zhenjun Han, Qi Tian
The recent Segment Anything Model (SAM) has emerged as a new paradigmatic vision foundation model, showcasing potent zero-shot generalization and flexible prompting.
1 code implementation • 22 Nov 2023 • Qifan Yu, Juncheng Li, Longhui Wei, Liang Pang, Wentao Ye, Bosheng Qin, Siliang Tang, Qi Tian, Yueting Zhuang
Multi-modal Large Language Models (MLLMs) tuned on machine-generated instruction-following data have demonstrated remarkable performance in various multi-modal understanding and generation tasks.
1 code implementation • 25 Apr 2020 • Zhou Yu, Yuhao Cui, Jun Yu, Meng Wang, DaCheng Tao, Qi Tian
Most existing works focus on a single task and design neural architectures manually, which are highly task-specific and hard to generalize to different tasks.
Ranked #19 on Visual Question Answering (VQA) on VQA v2 test-std
1 code implementation • 25 Nov 2021 • Yunjie Tian, Lingxi Xie, Xiaopeng Zhang, Jiemin Fang, Haohang Xu, Wei Huang, Jianbin Jiao, Qi Tian, Qixiang Ye
In this paper, we propose a self-supervised visual representation learning approach which involves both generative and discriminative proxies, where we focus on the former part by requiring the target network to recover the original image based on the mid-level features.
Ranked #63 on Semantic Segmentation on Cityscapes test
1 code implementation • 31 Jul 2022 • Maosen Li, Siheng Chen, Zijing Zhang, Lingxi Xie, Qi Tian, Ya Zhang
To address the first issue, we propose adaptive graph scattering, which leverages multiple trainable band-pass graph filters to decompose pose features into richer graph spectrum bands.
1 code implementation • 3 Oct 2022 • Bruce X. B. Yu, Jianlong Chang, Lingbo Liu, Qi Tian, Chang Wen Chen
Towards this goal, we propose a framework with a unified view of PETL called visual-PETL (V-PETL) to investigate the effects of different PETL techniques, data scales of downstream domains, positions of trainable parameters, and other aspects affecting the trade-off.
1 code implementation • ECCV 2020 • Xuanhong Chen, Bingbing Ni, Naiyuan Liu, Ziang Liu, Yiliu Jiang, Loc Truong, Qi Tian
In contrast to the great success of memory-consuming face editing methods at low resolution, manipulating high-resolution (HR) facial images, i.e., typically larger than $768^2$ pixels, with very limited memory is still challenging.
1 code implementation • ICCV 2021 • Xinzhe Han, Shuhui Wang, Chi Su, Qingming Huang, Qi Tian
Language bias is a critical issue in Visual Question Answering (VQA), where models often exploit dataset biases for the final decision without considering the image information.
Ranked #2 on Visual Question Answering (VQA) on VQA-CP
1 code implementation • 27 Mar 2022 • Yunjie Tian, Lingxi Xie, Jiemin Fang, Mengnan Shi, Junran Peng, Xiaopeng Zhang, Jianbin Jiao, Qi Tian, Qixiang Ye
The past year has witnessed a rapid development of masked image modeling (MIM).
1 code implementation • 11 May 2019 • Mingbao Lin, Rongrong Ji, Hong Liu, Xiaoshuai Sun, Shen Chen, Qi Tian
We then treat the learning of hash functions as a set of binary classification problems to fit the assigned target code.
1 code implementation • 24 Sep 2019 • Tianyu Zhang, Lingxi Xie, Longhui Wei, Yongfei Zhang, Bo Li, Qi Tian
Differently, this paper investigates ReID in an unexplored single-camera-training (SCT) setting, where each person in the training set appears in only one camera.
1 code implementation • 30 Mar 2020 • Junyi Feng, Songyuan Li, Xi Li, Fei Wu, Qi Tian, Ming-Hsuan Yang, Haibin Ling
Real-time semantic video segmentation is a challenging task due to the strict requirements of inference speed.
1 code implementation • 7 Jul 2020 • Kaifeng Bi, Lingxi Xie, Xin Chen, Longhui Wei, Qi Tian
There is a large literature on neural architecture search, but most existing work makes use of heuristic rules that largely constrain the search flexibility.
1 code implementation • 24 Jul 2021 • Xiujun Shu, Ge Li, Xiao Wang, Weijian Ruan, Qi Tian
The key to this task is to exploit cloth-irrelevant cues.
1 code implementation • 2 Jun 2022 • Ming Tao, Bing-Kun Bao, Hao Tang, Fei Wu, Longhui Wei, Qi Tian
To solve these limitations, we propose: (i) a Dynamic Editing Block (DEBlock) which composes different editing modules dynamically for various editing requirements.
1 code implementation • 9 May 2018 • Zhou Yu, Jun Yu, Chenchao Xiang, Zhou Zhao, Qi Tian, DaCheng Tao
Visual grounding aims to localize an object in an image referred to by a textual query phrase.
Ranked #9 on Phrase Grounding on Flickr30k Entities Test
1 code implementation • CVPR 2019 • Jie Hu, Rongrong Ji, Hong Liu, Shengchuan Zhang, Cheng Deng, Qi Tian
In this paper, we make the first attempt towards visual feature translation to break through the barrier of using features across different visual search systems.
1 code implementation • CVPR 2019 • Chen Wei, Lingxi Xie, Xutong Ren, Yingda Xia, Chi Su, Jiaying Liu, Qi Tian, Alan L. Yuille
We consider spatial contexts, for which we solve so-called jigsaw puzzles, i.e., each image is cut into grids and then disordered, and the goal is to recover the correct configuration.
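The jigsaw setup described in this entry (cut into a grid, disorder, recover the configuration) can be sketched on a toy array. The helper names `cut_into_grid` and `assemble`, the 6×6 "image", and the 3×3 grid are assumptions for illustration; in the actual task a network has to predict the permutation, whereas here it is known and simply inverted to show what "the correct configuration" means.

```python
import numpy as np

def cut_into_grid(image, grid):
    """Split an image into grid*grid patches in row-major order.

    Assumes the image height and width are divisible by `grid`.
    """
    h, w = image.shape[:2]
    ph, pw = h // grid, w // grid
    return [
        image[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw]
        for r in range(grid)
        for c in range(grid)
    ]

def assemble(patches, grid):
    """Inverse of cut_into_grid: stitch row-major patches back into one image."""
    rows = [
        np.concatenate(patches[r * grid:(r + 1) * grid], axis=1)
        for r in range(grid)
    ]
    return np.concatenate(rows, axis=0)

rng = np.random.default_rng(0)
image = np.arange(36).reshape(6, 6)          # toy single-channel image
patches = cut_into_grid(image, grid=3)

perm = rng.permutation(len(patches))         # the disordering
shuffled = [patches[i] for i in perm]

inverse = np.argsort(perm)                   # the configuration to be recovered
restored = assemble([shuffled[i] for i in inverse], grid=3)
```

The array `perm` plays the role of the label a puzzle-solving model would need to predict from `shuffled` alone.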
1 code implementation • 25 Jan 2024 • Sihang Li, Zhiyuan Liu, Yanchen Luo, Xiang Wang, Xiangnan He, Kenji Kawaguchi, Tat-Seng Chua, Qi Tian
Through 3D molecule-text alignment and 3D molecule-centric instruction tuning, 3D-MoLM establishes an integration of 3D molecular encoder and LM.
1 code implementation • CVPR 2022 • Wenwen Pan, Haonan Shi, Zhou Zhao, Jieming Zhu, Xiuqiang He, Zhigeng Pan, Lianli Gao, Jun Yu, Fei Wu, Qi Tian
Audio-Guided video semantic segmentation is a challenging problem in visual analysis and editing, which automatically separates foreground objects from background in a video sequence according to the referring audio expressions.
1 code implementation • ICCV 2019 • Guo-Jun Qi, Liheng Zhang, Chang Wen Chen, Qi Tian
This ensures the resultant TERs of individual images contain the intrinsic information about their visual structures that would equivary extricably under various transformations in a generalized nonlinear case.
1 code implementation • 3 Nov 2020 • Lin Liu, Shanxin Yuan, Jianzhuang Liu, Liping Bao, Gregory Slabaugh, Qi Tian
In this paper, we propose a self-adaptive learning method for demoireing a high-frequency image, with the help of an additional defocused moire-free blur image.
1 code implementation • NeurIPS 2019 • Jie Hu, Rongrong Ji, Shengchuan Zhang, Xiaoshuai Sun, Qixiang Ye, Chia-Wen Lin, Qi Tian
Learning representations with diversified information remains an open problem.
1 code implementation • CVPR 2020 • Hengtong Hu, Lingxi Xie, Richang Hong, Qi Tian
In recent years, cross-modal hashing (CMH) has attracted increasing attention, mainly because of its potential ability to map contents from different modalities, especially vision and language, into the same space, so that it becomes efficient in cross-modal data retrieval.
1 code implementation • 27 Jul 2020 • Peixian Chen, Pingyang Dai, Jianzhuang Liu, Feng Zheng, Qi Tian, Rongrong Ji
Domain generalization (DG) serves as a promising solution to handle person Re-Identification (Re-ID), which trains the model using labels from the source domain alone, and then directly applies the trained model to the target domain without model updating.
Domain Generalization • Generalizable Person Re-identification
1 code implementation • CVPR 2022 • Qing Chang, Junran Peng, Lingxi Xie, Jiajun Sun, Haoran Yin, Qi Tian, Zhaoxiang Zhang
However, due to the high training costs and the unawareness of downstream usage, most self-supervised learning methods lack the capability to accommodate the diversity of downstream scenarios, as there are various data domains, different vision tasks and latency constraints on models.
1 code implementation • ICCV 2023 • Shuangrui Ding, Peisen Zhao, Xiaopeng Zhang, Rui Qian, Hongkai Xiong, Qi Tian
Based on the STA score, we are able to progressively prune the tokens without introducing any additional parameters or requiring further re-training.
1 code implementation • 23 Nov 2021 • Zhaobo Qi, Shuhui Wang, Chi Su, Li Su, Qingming Huang, Qi Tian
Future activity anticipation is a challenging problem in egocentric vision.
1 code implementation • CVPR 2022 • Zhengcong Fei, Xu Yan, Shuhui Wang, Qi Tian
On one hand, the representation in shallow layers lacks high-level semantic and sufficient cross-modal fusion information for accurate prediction.
1 code implementation • CVPR 2020 • Jie Li, Rongrong Ji, Hong Liu, Jianzhuang Liu, Bineng Zhong, Cheng Deng, Qi Tian
To reduce the solution space, we first model the adversarial perturbation optimization problem as a process of recovering frequency-sparse perturbations with compressed sensing, under the setting that random noise in the low-frequency space is more likely to be adversarial.
1 code implementation • CVPR 2020 • Zhengsu Chen, Jianwei Niu, Lingxi Xie, Xuefeng Liu, Longhui Wei, Qi Tian
Automatically designing computationally efficient neural networks has received much attention in recent years.
1 code implementation • 19 Feb 2017 • Hantao Yao, Feng Dai, Dongming Zhang, Yike Ma, Shiliang Zhang, Yongdong Zhang, Qi Tian
Accordingly, DR$^{2}$-Net consists of two components, i.e., a linear mapping network and a residual network.
1 code implementation • CVPR 2023 • Junfan Lin, Jianlong Chang, Lingbo Liu, Guanbin Li, Liang Lin, Qi Tian, Chang Wen Chen
During inference, instead of changing the motion generator, our method reformulates the input text into a masked motion as the prompt for the motion generator to "reconstruct" the motion.
1 code implementation • 25 Oct 2019 • Qiaokang Xie, Wengang Zhou, Guo-Jun Qi, Qi Tian, Houqiang Li
In our approach, we first collect tracklet data within each camera by automatic person detection and tracking.
1 code implementation • CVPR 2021 • Yuchao Li, Shaohui Lin, Jianzhuang Liu, Qixiang Ye, Mengdi Wang, Fei Chao, Fan Yang, Jincheng Ma, Qi Tian, Rongrong Ji
Channel pruning and tensor decomposition have received extensive attention in convolutional neural network compression.
1 code implementation • 7 Dec 2019 • Yiyi Zhou, Rongrong Ji, Gen Luo, Xiaoshuai Sun, Jinsong Su, Xinghao Ding, Chia-Wen Lin, Qi Tian
Referring Expression Comprehension (REC) is an emerging research topic in computer vision, which refers to detecting the target region in an image given a text description.
1 code implementation • ECCV 2020 • Longhui Wei, An Xiao, Lingxi Xie, Xin Chen, Xiaopeng Zhang, Qi Tian
AutoAugment has been a powerful algorithm that improves the accuracy of many vision tasks, yet it is sensitive to the operator space as well as hyper-parameters, and an improper setting may degrade network optimization.
Ranked #185 on Image Classification on ImageNet
1 code implementation • 25 Jun 2020 • Peng Zhou, Lingxi Xie, Xiaopeng Zhang, Bingbing Ni, Qi Tian
To learn the sampling policy, a Markov decision process is embedded into the search algorithm and a moving average is applied for better stability.
1 code implementation • 16 Dec 2023 • Yihang Zhai, Haixin Wang, Jianlong Chang, Xinlong Yang, Jinan Sun, Shikun Zhang, Qi Tian
Instruction tuning has shown promising potential for developing general-purpose AI capabilities by using large-scale pre-trained models, and has spurred growing research on integrating multimodal information for creative applications.
1 code implementation • 14 Aug 2019 • Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian
Multimodal learning aims to discover the relationship between multiple modalities.
1 code implementation • 20 Dec 2021 • Xinzhe Han, Shuhui Wang, Chi Su, Qingming Huang, Qi Tian
Existing de-bias learning frameworks try to capture specific dataset bias by annotations but they fail to handle complicated OOD scenarios.
1 code implementation • 23 Jul 2022 • Chufeng Tang, Lingxi Xie, Gang Zhang, Xiaopeng Zhang, Qi Tian, Xiaolin Hu
In this paper, we present an economic active learning setting, named active pointly-supervised instance segmentation (APIS), which starts with box-level annotations and iteratively samples a point within the box and asks if it falls on the object.
1 code implementation • NeurIPS 2023 • Jin Li, Yaoming Wang, Xiaopeng Zhang, Bowen Shi, Dongsheng Jiang, Chenglin Li, Wenrui Dai, Hongkai Xiong, Qi Tian
Specifically, at the intermediate layer of the ViT, we utilize a spatial-aware density-based clustering algorithm to select representative tokens from the token sequence.
1 code implementation • 9 Jan 2024 • Hongcheng Guo, Jian Yang, Jiaheng Liu, Jiaqi Bai, Boyang Wang, Zhoujun Li, Tieqiao Zheng, Bo Zhang, Junran Peng, Qi Tian
Log anomaly detection is a key component in the field of artificial intelligence for IT operations (AIOps).
1 code implementation • 14 Dec 2022 • Ziqing Fan, Yanfeng Wang, Jiangchao Yao, Lingjuan Lyu, Ya Zhang, Qi Tian
However, in addition to previous explorations for improvement in federated averaging, our analysis shows that another critical bottleneck is the poorer optima of client models in more heterogeneous conditions.
1 code implementation • 12 Apr 2023 • Liping Bao, Longhui Wei, Xiaoyu Qiu, Wengang Zhou, Houqiang Li, Qi Tian
Recent research on unsupervised person re-identification (reID) has demonstrated that pre-training on unlabeled person images achieves better performance on downstream reID tasks than pre-training on ImageNet.
1 code implementation • NeurIPS 2020 • Hengtong Hu, Lingxi Xie, Zewei Du, Richang Hong, Qi Tian
Instead of training a model upon the accurate label of each sample, our setting requires the model to query with a predicted label of each sample and learn from the answer whether the guess is correct.
1 code implementation • 3 Aug 2022 • Juncheng Li, Junlin Xie, Linchao Zhu, Long Qian, Siliang Tang, Wenqiao Zhang, Haochen Shi, Shengyu Zhang, Longhui Wei, Qi Tian, Yueting Zhuang
In this paper, we introduce a new task, named Temporal Emotion Localization in videos (TEL), which aims to detect human emotions and localize their corresponding temporal boundaries in untrimmed videos with aligned subtitles.
1 code implementation • 28 Mar 2024 • Zhicai Wang, Longhui Wei, Tan Wang, Heyu Chen, Yanbin Hao, Xiang Wang, Xiangnan He, Qi Tian
Text-to-image (T2I) generative models have recently emerged as a powerful tool, enabling the creation of photo-realistic images and giving rise to a multitude of applications.
1 code implementation • CVPR 2020 • Yutian Lin, Lingxi Xie, Yu Wu, Chenggang Yan, Qi Tian
Person re-identification (re-ID) is an important topic in computer vision.
1 code implementation • MM '19: Proceedings of the 27th ACM International Conference on Multimedia 2019 • Lu Chi, Guiyu Tian, Yadong Mu, Lingxi Xie, Qi Tian
We show its equivalence to conducting residual learning in some spectral domain and carefully re-formulate a variety of neural layers into their spectral forms, such as ReLU or convolutions.
2 code implementations • CVPR 2016 • Lingxi Xie, Jingdong Wang, Zhen Wei, Meng Wang, Qi Tian
For a long time, we have been combating over-fitting in the CNN training process with model regularization, including weight decay, model averaging, data augmentation, etc.
1 code implementation • CVPR 2023 • Yaoming Wang, Bowen Shi, Xiaopeng Zhang, Jin Li, Yuchen Liu, Wenrui Dai, Chenglin Li, Hongkai Xiong, Qi Tian
To mitigate the computational and storage demands, recent research has explored Parameter-Efficient Fine-Tuning (PEFT), which focuses on tuning a minimal number of parameters for efficient adaptation.
1 code implementation • 23 Nov 2023 • Shulin Cao, Jiajie Zhang, Jiaxin Shi, Xin Lv, Zijun Yao, Qi Tian, Juanzi Li, Lei Hou
During reasoning, for leaf nodes, LLMs choose the more confident answer between Closed-book QA, which employs parametric knowledge, and Open-book QA, which employs retrieved external knowledge, thus eliminating the negative retrieval problem.
1 code implementation • 30 Oct 2020 • Yangyang Guo, Liqiang Nie, Zhiyong Cheng, Qi Tian, Min Zhang
Concretely, we design a novel interpretation scheme whereby the loss of mis-predicted frequent and sparse answers of the same question type is distinctly exhibited during the late training phase.
1 code implementation • 17 Jan 2020 • Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Bowen Shi, Qi Tian, Hongkai Xiong
However, these methods suffer from difficulty in optimizing the network, so that the searched network is often unfriendly to hardware.
1 code implementation • 11 Oct 2021 • Xu Yan, Zhengcong Fei, Zekang Li, Shuhui Wang, Qingming Huang, Qi Tian
Non-autoregressive image captioning with continuous iterative refinement, which eliminates the sequential dependence in sentence generation, can achieve performance comparable to its autoregressive counterparts with considerable acceleration.
2 code implementations • 14 Oct 2022 • Xiaoyan Zhang, Gaoyang Tang, Yingying Zhu, Qi Tian
The issue of image haze removal has attracted wide attention in recent years.
1 code implementation • 26 Dec 2022 • Deng Li, Aming Wu, Yahong Han, Qi Tian
Considering the complexity and variability of real scene tasks, we propose a Prototype-guided Cross-task Knowledge Distillation (ProC-KD) approach to transfer the intrinsic local-level object knowledge of a large-scale teacher network to various task scenarios.
1 code implementation • 5 Aug 2016 • Liang Zheng, Yi Yang, Qi Tian
This survey presents milestones in modern instance retrieval, reviews a broad selection of previous works in different categories, and provides insights on the connection between SIFT and CNN-based methods.
1 code implementation • ECCV 2020 • Xinshuai Dong, Hong Liu, Rongrong Ji, Liujuan Cao, Qixiang Ye, Jianzhuang Liu, Qi Tian
On the contrary, a discriminative classifier only models the conditional distribution of labels given inputs, but benefits from effective optimization owing to its succinct structure.
1 code implementation • 18 Jul 2022 • Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Zechao Li, Qi Tian, Qingming Huang
Second, most previous weakly supervised REG methods ignore the discriminative location and context of the referent, causing difficulties in distinguishing the target from other same-category objects.
no code implementations • CVPR 2018 • Xiaopeng Zhang, Jiashi Feng, Hongkai Xiong, Qi Tian
Unlike them, we propose a zigzag learning strategy to simultaneously discover reliable object instances and prevent the model from overfitting initial seeds.
no code implementations • 12 Apr 2018 • Jinhui Tang, Xiangbo Shu, Zechao Li, Yu-Gang Jiang, Qi Tian
Recent approaches simultaneously explore visual, user and tag information to improve the performance of image retagging by constructing and exploring an image-tag-user graph.
no code implementations • 9 Apr 2018 • Mingxing Duan, Kenli Li, Qi Tian
In this paper, we propose a novel multi-attribute tensor correlation neural network (MTCN) for face attribute prediction.
no code implementations • ECCV 2018 • Dawei Du, Yuankai Qi, Hongyang Yu, Yifan Yang, Kaiwen Duan, Guorong Li, Weigang Zhang, Qingming Huang, Qi Tian
Selected from 10 hours of raw videos, about 80,000 representative frames are fully annotated with bounding boxes as well as up to 14 kinds of attributes (e.g., weather condition, flying altitude, camera view, vehicle category, and occlusion) for three fundamental computer vision tasks: object detection, single object tracking, and multiple object tracking.
Ranked #5 on Object Detection on UAVDT
no code implementations • 20 Dec 2017 • Jianing Li, Shiliang Zhang, Jingdong Wang, Wen Gao, Qi Tian
This paper mainly establishes a large-scale Long sequence Video database for person re-IDentification (LVreID).
no code implementations • 17 Nov 2017 • Fuqing Zhu, Xiangwei Kong, Haiyan Fu, Qi Tian
A small proportion of these retrieved samples are randomly selected as the Pseudo Positive samples and added to the target training set for the supervised CNN training.
no code implementations • 4 Jul 2017 • Hantao Yao, Shiliang Zhang, Yongdong Zhang, Jintao Li, Qi Tian
The representation learning risk is evaluated by the proposed part loss, which automatically generates several parts for an image, and computes the person classification loss on each part separately.
Ranked #97 on Person Re-Identification on Market-1501
no code implementations • CVPR 2019 • Linjun Zhou, Peng Cui, Shiqiang Yang, Wenwu Zhu, Qi Tian
We then propose an out-of-sample embedding method to learn the embedding of a new class represented by a few samples through its visual analogy with base classes and derive the classification parameters for the new class.
no code implementations • 27 Sep 2017 • Zhizhong Zhang, Yuan Xie, Wensheng Zhang, Qi Tian
In this paper, we propose a new multi-index fusion scheme for image retrieval.
no code implementations • ICCV 2017 • Chi Su, Jianing Li, Shiliang Zhang, Junliang Xing, Wen Gao, Qi Tian
Our deep architecture explicitly leverages the human part cues to alleviate the pose variations and learn robust feature representations from both the global image and different local parts.
Ranked #105 on Person Re-Identification on Market-1501
no code implementations • 18 Sep 2017 • Xiaobin Liu, Shiliang Zhang, Tiejun Huang, Qi Tian
To conquer these issues, we propose an End-to-End BoWs (E$^2$BoWs) model based on Deep Convolutional Neural Network (DCNN).
no code implementations • 13 Sep 2017 • Longhui Wei, Shiliang Zhang, Hantao Yao, Wen Gao, Qi Tian
To solve these problems, this work proposes a Global-Local-Alignment Descriptor (GLAD) and an efficient indexing and retrieval framework, respectively.
Ranked #93 on Person Re-Identification on Market-1501
no code implementations • 1 May 2016 • Song Bai, Xiang Bai, Longin Jan Latecki, Qi Tian
How to do multidimensional scaling on multiple input distance matrices is still unsolved to the best of our knowledge.
no code implementations • 4 Jul 2017 • Hantao Yao, Shiliang Zhang, Yongdong Zhang, Jintao Li, Qi Tian
Aiming to conquer this issue, we propose a retrieval task named One-Shot Fine-Grained Instance Retrieval (OSFGIR).
no code implementations • 29 May 2017 • Xiaopeng Zhang, Hongkai Xiong, Weiyao Lin, Qi Tian
Part-based representation has been proven to be effective for a variety of visual applications.
no code implementations • 5 May 2017 • Fuqing Zhu, Xiangwei Kong, Liang Zheng, Haiyan Fu, Qi Tian
In the experiment, we show that the proposed Part-based Deep Hashing method yields very competitive re-id accuracy on the large-scale Market-1501 and Market-1501+500K datasets.
no code implementations • CVPR 2017 • Liang Zheng, Hengheng Zhang, Shaoyan Sun, Manmohan Chandraker, Yi Yang, Qi Tian
Our baselines address three issues: the performance of various combinations of detectors and recognizers, mechanisms for pedestrian detection to help improve overall re-identification accuracy and assessing the effectiveness of different detectors for re-identification.
no code implementations • CVPR 2017 • Song Bai, Xiang Bai, Qi Tian
Most existing person re-identification algorithms either extract robust visual features or learn discriminative metrics for person images.
Ranked #100 on Person Re-Identification on Market-1501
no code implementations • 11 May 2016 • Chi Su, Shiliang Zhang, Junliang Xing, Wen Gao, Qi Tian
We also propose a semi-supervised attribute learning framework which progressively boosts the accuracy of attributes using only a limited amount of labeled data.
no code implementations • 21 Jul 2016 • Lingxi Xie, Qi Tian, John Flynn, Jingdong Wang, Alan Yuille
For this, we consider the neurons in the hidden layer as neural words, and construct a set of geometric neural phrases on top of them.
no code implementations • 4 Jul 2016 • Gaipeng Kong, Le Dong, Wenpu Dong, Liang Zheng, Qi Tian
Departing from previous methods that fuse multiple image descriptors simultaneously, C2F features a layered procedure composed of filtering and refining.
no code implementations • CVPR 2016 • Lingxi Xie, Liang Zheng, Jingdong Wang, Alan Yuille, Qi Tian
An increasing number of computer vision tasks can be tackled with deep features, which are the intermediate outputs of a pre-trained Convolutional Neural Network.
no code implementations • 1 Apr 2016 • Liang Zheng, Yali Zhao, Shengjin Wang, Jingdong Wang, Qi Tian
The objective of this paper is the effective transfer of the Convolutional Neural Network (CNN) feature in image search and classification.
no code implementations • 18 Mar 2016 • Dawei Du, Honggang Qi, Longyin Wen, Qi Tian, Qingming Huang, Siwei Lyu
Graph-based representation is widely used in the visual tracking field by finding correct correspondences between target parts in consecutive frames.
no code implementations • 7 Feb 2015 • Liang Zheng, Liyue Shen, Lu Tian, Shengjin Wang, Jiahao Bu, Qi Tian
In the light of recent advances in image search, this paper proposes to treat person re-identification as an image search problem.
no code implementations • 3 Jun 2014 • Ziqiong Liu, Shengjin Wang, Liang Zheng, Qi Tian
This paper introduces an improved reranking method for the Bag-of-Words (BoW) based image search.
no code implementations • 1 Jun 2014 • Liang Zheng, Shengjin Wang, Fei He, Qi Tian
Specifically, the Convolutional Neural Network (CNN) is employed to extract features from regional and global patches, leading to the so-called "Deep Embedding" framework.
no code implementations • CVPR 2014 • Liang Zheng, Shengjin Wang, Wengang Zhou, Qi Tian
Albeit simple, Bayes merging can be well applied in various merging tasks, and consistently improves the baselines on multi-vocabulary merging.
no code implementations • CVPR 2014 • Liang Zheng, Shengjin Wang, Ziqiong Liu, Qi Tian
Specifically, we exploit the fusion of local color feature into c-MI.
no code implementations • 19 Oct 2018 • Han Liu, Dan Zeng, Qi Tian
Secondly, a super-pixel level database is used to train our cloud detection models based on CNN and deep forest.
no code implementations • 29 Nov 2018 • Xutong Ren, Lingxi Xie, Chen Wei, Siyuan Qiao, Chi Su, Jiaying Liu, Qi Tian, Elliot K. Fishman, Alan L. Yuille
Computer vision is difficult, partly because the desired mathematical function connecting input and output data is often complex, fuzzy and thus hard to learn.
no code implementations • 28 Nov 2018 • Huangjie Zheng, Lingxi Xie, Tianwei Ni, Ya Zhang, Yan-Feng Wang, Qi Tian, Elliot K. Fishman, Alan L. Yuille
However, in medical image analysis, fusing predictions from two phases is often difficult, because (i) there is a domain gap between the two phases, and (ii) the semantic labels do not correspond pixel-wise even for images scanned from the same patient.
no code implementations • 30 Nov 2018 • Yexun Zhang, Ya Zhang, Yan-Feng Wang, Qi Tian
Unsupervised domain adaption aims to learn a powerful classifier for the target domain given a labeled source data set and an unlabeled target data set.
no code implementations • ICCV 2019 • Jie Li, Rongrong Ji, Hong Liu, Xiaopeng Hong, Yue Gao, Qi Tian
In this paper, we make the first attempt in attacking image retrieval systems.
no code implementations • ICCV 2019 • Yuefu Zhou, Ya Zhang, Yan-Feng Wang, Qi Tian
A new dropout-based measurement of redundancy, which facilitates the computation of the posterior assuming inter-layer dependency, is introduced.
no code implementations • NeurIPS 2012 • Jianqiu Ji, Jianmin Li, Shuicheng Yan, Bo Zhang, Qi Tian
Sign-random-projection locality-sensitive hashing (SRP-LSH) is a probabilistic dimension reduction method which provides an unbiased estimate of angular similarity, yet suffers from the large variance of its estimation.
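As a rough illustration of the baseline estimator this snippet refers to, the following Python sketch implements plain sign-random-projection hashing and recovers the angle between two vectors from the fraction of differing bits. It is not the variance-reduced method the paper goes on to propose; the function names, dimensions, and the use of NumPy are assumptions made for the example.

```python
import numpy as np

def srp_hash(x, projections):
    """Return the K-bit sign-random-projection code of vector x."""
    # Each bit records which side of a random hyperplane x falls on.
    return (projections @ x) >= 0

def estimate_angle(code_a, code_b):
    """Estimate the angle (radians) from the fraction of differing bits."""
    # For Gaussian hyperplanes, P[bits differ] = angle / pi,
    # so pi * (Hamming distance / K) is an unbiased estimate of the angle.
    hamming = np.count_nonzero(code_a != code_b)
    return np.pi * hamming / code_a.size

rng = np.random.default_rng(0)
dim, num_bits = 16, 4096  # illustrative sizes, not from the paper
projections = rng.standard_normal((num_bits, dim))

a = rng.standard_normal(dim)
b = rng.standard_normal(dim)

true_angle = np.arccos(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
est_angle = estimate_angle(srp_hash(a, projections), srp_hash(b, projections))
print(f"true angle: {true_angle:.3f} rad, SRP estimate: {est_angle:.3f} rad")
```

With independent Gaussian projections the estimate's variance shrinks only as 1/K, which is the limitation the snippet above points to.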
no code implementations • CVPR 2018 • Zhixiang Chen, Xin Yuan, Jiwen Lu, Qi Tian, Jie Zhou
This paper presents a discrepancy minimizing model to address the discrete optimization problem in hashing learning.
no code implementations • ECCV 2018 • Liangliang Ren, Jiwen Lu, Zifeng Wang, Qi Tian, Jie Zhou
To address this, we develop a deep prediction-decision network in our C-DRL, which simultaneously detects and predicts objects under a unified network via deep reinforcement learning.
no code implementations • CVPR 2013 • Lei Zhang, Yongdong Zhang, Jinhu Tang, Ke Lu, Qi Tian
In this paper, we propose a weighted Hamming distance ranking algorithm (WhRank) to rank the binary codes of hashing methods.
no code implementations • CVPR 2013 • Liang Zheng, Shengjin Wang, Ziqiong Liu, Qi Tian
Further, by accounting for the term frequency in each image, the proposed $L_p$-norm IDF helps to alleviate the visual word burstiness phenomenon.
no code implementations • CVPR 2014 • Lingxi Xie, Jingdong Wang, Baining Guo, Bo Zhang, Qi Tian
The novelty lies in that OPM uses the 3D orientations to form the pyramid and produce the pooling regions, which is unlike SPM that uses the spatial positions to form the pyramid.
no code implementations • CVPR 2014 • Zhenxing Niu, Gang Hua, Xinbo Gao, Qi Tian
In this way, we can efficiently leverage the loosely related tags, and build an intermediate level representation for a collection of weakly annotated images.
no code implementations • CVPR 2015 • Liang Zheng, Shengjin Wang, Lu Tian, Fei He, Ziqiong Liu, Qi Tian
However, in a more realistic situation, one does not know in advance whether a feature is effective or not for a given query.
no code implementations • CVPR 2015 • Yang Zhou, Bingbing Ni, Richang Hong, Meng Wang, Qi Tian
Secondly, these object regions are matched and tracked across frames to form a large spatio-temporal graph based on the appearance matching and the dense motion trajectories through them.
Fine-grained Action Recognition • Human-Object Interaction Detection
no code implementations • CVPR 2016 • Xiaopeng Zhang, Hongkai Xiong, Wengang Zhou, Weiyao Lin, Qi Tian
Recognizing fine-grained sub-categories such as birds and dogs is extremely challenging due to the highly localized and subtle differences in some specific parts.
no code implementations • CVPR 2016 • Yang Zhou, Bingbing Ni, Richang Hong, Xiaokang Yang, Qi Tian
Firstly, a novel EM-like learning framework is proposed to train the pixel-level deep convolutional neural network (DCNN) by seamlessly integrating weakly supervised data (i.e., massive bounding box annotations) with a small set of strongly supervised data (i.e., fully annotated hand segmentation maps) to achieve state-of-the-art hand segmentation performance.
no code implementations • CVPR 2017 • Xishan Zhang, Ke Gao, Yongdong Zhang, Dongming Zhang, Jintao Li, Qi Tian
This paper contributes: 1) the first in-depth study of the weakness inherent in data-driven static fusion methods for video captioning.
no code implementations • ICCV 2015 • Lingxi Xie, Jingdong Wang, Weiyao Lin, Bo Zhang, Qi Tian
In many fine-grained object recognition datasets, image orientation (left/right) might vary from sample to sample.
no code implementations • ICCV 2015 • Liang Zheng, Liyue Shen, Lu Tian, Shengjin Wang, Jingdong Wang, Qi Tian
As a minor contribution, inspired by recent advances in large-scale image search, this paper proposes an unsupervised Bag-of-Words descriptor.
Ranked #90 on Person Re-Identification on DukeMTMC-reID
no code implementations • ICCV 2015 • Chi Su, Fan Yang, Shiliang Zhang, Qi Tian, Larry S. Davis, Wen Gao
Since attributes are generally correlated, we introduce a low rank attribute embedding into the MTL formulation to embed original binary attributes to a continuous attribute space, where incorrect and incomplete attributes are rectified and recovered to better describe people.
no code implementations • ICCV 2015 • Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian
Data from real applications involve multiple modalities representing content with the same semantics and deliver rich information from complementary aspects.
no code implementations • ICCV 2017 • Song Bai, Zhichao Zhou, Jingdong Wang, Xiang Bai, Longin Jan Latecki, Qi Tian
This stimulates great research interest in considering similarity fusion in the framework of the diffusion process (i.e., fusion with diffusion) for robust retrieval.
no code implementations • ICCV 2017 • Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian
We incorporate the harmonization mechanism into the learning process of multimodal GPLVMs.
no code implementations • 28 Feb 2019 • Jiancheng Yang, Qiang Zhang, Rongyao Fang, Bingbing Ni, Jinxian Liu, Qi Tian
A set of novel 3D point cloud attack operations are proposed via pointwise gradient perturbation and adversarial point attachment / detachment.
no code implementations • CVPR 2019 • Jiancheng Yang, Qiang Zhang, Bingbing Ni, Linguo Li, Jinxian Liu, Mengdie Zhou, Qi Tian
Thereby, we for the first time propose an end-to-end learnable and task-agnostic sampling operation, named Gumbel Subset Sampling (GSS), to select a representative subset of input points.
no code implementations • CVPR 2019 • Wanhua Li, Jiwen Lu, Jianjiang Feng, Chunjing Xu, Jie Zhou, Qi Tian
Existing methods for age estimation usually apply a divide-and-conquer strategy to deal with heterogeneous data caused by the non-stationary aging process.
Ranked #2 on Age Estimation on FGNET
no code implementations • 10 Apr 2019 • Cong-Cong Li, Dawei Du, Libo Zhang, Tiejian Luo, Yanjun Wu, Qi Tian, Longyin Wen, Siwei Lyu
In this paper, we propose a new data priming method to solve the domain adaptation problem.
no code implementations • CVPR 2019 • Lijie Liu, Jiwen Lu, Chunjing Xu, Qi Tian, Jie Zhou
In this paper, we propose to learn a deep fitting degree scoring network for monocular 3D object detection, which aims to score the fitting degree between proposals and objects conclusively.
Ranked #7 on Vehicle Pose Estimation on KITTI Cars Hard
no code implementations • 30 Apr 2019 • Chuan Wen, Jie Chang, Ya Zhang, Siheng Chen, Yan-Feng Wang, Mei Han, Qi Tian
Automatic character generation is an appealing solution for new typeface design, especially for Chinese typefaces, which include over 3,700 of the most commonly used characters.
no code implementations • 3 May 2019 • Tao Yao, Xiangwei Kong, Lianshan Yan, Wenjing Tang, Qi Tian
In this paper, to address the above issues, we propose a supervised cross-modal hashing method based on matrix factorization dubbed Efficient Discrete Supervised Hashing (EDSH).