Search Results for author: Qi Tian

Found 327 papers, 140 papers with code

GhostNet: More Features from Cheap Operations

34 code implementations CVPR 2020 Kai Han, Yunhe Wang, Qi Tian, Jianyuan Guo, Chunjing Xu, Chang Xu

Deploying convolutional neural networks (CNNs) on embedded devices is difficult due to the limited memory and computation resources.

Image Classification

Visformer: The Vision-friendly Transformer

5 code implementations ICCV 2021 Zhengsu Chen, Lingxi Xie, Jianwei Niu, Xuefeng Liu, Longhui Wei, Qi Tian

The past year has witnessed the rapid development of applying the Transformer module to vision problems.

Image Classification

GhostNets on Heterogeneous Devices via Cheap Operations

8 code implementations 10 Jan 2022 Kai Han, Yunhe Wang, Chang Xu, Jianyuan Guo, Chunjing Xu, Enhua Wu, Qi Tian

The proposed C-Ghost module can be taken as a plug-and-play component to upgrade existing convolutional neural networks.

CenterNet: Keypoint Triplets for Object Detection

20 code implementations ICCV 2019 Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian

In object detection, keypoint-based approaches often suffer from a large number of incorrect object bounding boxes, arguably due to the lack of an additional look into the cropped regions.

Object object-detection +1

Rethinking Rotated Object Detection with Gaussian Wasserstein Distance Loss

2 code implementations 28 Jan 2021 Xue Yang, Junchi Yan, Qi Ming, Wentao Wang, Xiaopeng Zhang, Qi Tian

Boundary discontinuity and its inconsistency to the final detection metric have been the bottleneck for rotating detection regression loss design.

Ranked #16 on Object Detection In Aerial Images on DOTA (using extra training data)

object-detection Object Detection In Aerial Images +2

Learning High-Precision Bounding Box for Rotated Object Detection via Kullback-Leibler Divergence

2 code implementations NeurIPS 2021 Xue Yang, Xiaojiang Yang, Jirui Yang, Qi Ming, Wentao Wang, Qi Tian, Junchi Yan

Taking the perspective that horizontal detection is a special case for rotated object detection, in this paper, we are motivated to change the design of rotation regression loss from induction paradigm to deduction methodology, in terms of the relation between rotation and horizontal detection.

Ranked #14 on Object Detection In Aerial Images on DOTA (using extra training data)

object-detection Object Detection In Aerial Images +1

The KFIoU Loss for Rotated Object Detection

3 code implementations 29 Jan 2022 Xue Yang, Yue Zhou, Gefan Zhang, Jirui Yang, Wentao Wang, Junchi Yan, Xiaopeng Zhang, Qi Tian

This is in contrast to recent Gaussian-modeling-based rotation detectors, e.g., GWD loss and KLD loss, which involve a human-specified distribution distance metric that requires additional hyperparameter tuning that varies across datasets and detectors.

Object object-detection +1

4D Gaussian Splatting for Real-Time Dynamic Scene Rendering

1 code implementation 12 Oct 2023 Guanjun Wu, Taoran Yi, Jiemin Fang, Lingxi Xie, Xiaopeng Zhang, Wei Wei, Wenyu Liu, Qi Tian, Xinggang Wang

Representing and rendering dynamic scenes has been an important but challenging task.

Data-Free Learning of Student Networks

3 code implementations ICCV 2019 Hanting Chen, Yunhe Wang, Chang Xu, Zhaohui Yang, Chuanjian Liu, Boxin Shi, Chunjing Xu, Chao Xu, Qi Tian

Learning portable neural networks is essential in computer vision so that heavy pre-trained deep models can be applied on edge devices such as mobile phones and micro sensors.

Neural Network Compression

AdderNet: Do We Really Need Multiplications in Deep Learning?

7 code implementations CVPR 2020 Hanting Chen, Yunhe Wang, Chunjing Xu, Boxin Shi, Chao Xu, Qi Tian, Chang Xu

The widely-used convolutions in deep neural networks are exactly cross-correlation to measure the similarity between input feature and convolution filters, which involves massive multiplications between float values.

Pangu-Weather: A 3D High-Resolution Model for Fast and Accurate Global Weather Forecast

3 code implementations 3 Nov 2022 Kaifeng Bi, Lingxi Xie, Hengheng Zhang, Xin Chen, Xiaotao Gu, Qi Tian

In this paper, we present Pangu-Weather, a deep learning based system for fast and accurate global weather forecast.

Segment Anything in 3D with Radiance Fields

1 code implementation NeurIPS 2023 Jiazhong Cen, Jiemin Fang, Zanwei Zhou, Chen Yang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian

The Segment Anything Model (SAM) emerges as a powerful vision foundation model to generate high-quality 2D segmentation results.

Inverse Rendering Segmentation

ControlVideo: Training-free Controllable Text-to-Video Generation

1 code implementation 22 May 2023 Yabo Zhang, Yuxiang Wei, Dongsheng Jiang, Xiaopeng Zhang, WangMeng Zuo, Qi Tian

Text-driven diffusion models have unlocked unprecedented abilities in image generation, whereas their video counterpart still lags behind due to the excessive training cost of temporal modeling.

Image Generation Text-to-Video Generation +1

GaussianObject: Just Taking Four Images to Get A High-Quality 3D Object with Gaussian Splatting

1 code implementation 15 Feb 2024 Chen Yang, Sikuang Li, Jiemin Fang, Ruofan Liang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian

Then we construct a Gaussian repair model based on diffusion models to supplement the omitted object information, where Gaussians are further refined.

Neural Rendering Object

CIPS-3D: A 3D-Aware Generator of GANs Based on Conditionally-Independent Pixel Synthesis

1 code implementation 19 Oct 2021 Peng Zhou, Lingxi Xie, Bingbing Ni, Qi Tian

The style-based GAN (StyleGAN) architecture achieved state-of-the-art results for generating high-quality images, but it lacks explicit and precise control over camera poses.

3D-Aware Image Synthesis Transfer Learning

Attribute Mix: Semantic Data Augmentation for Fine Grained Recognition

1 code implementation 6 Apr 2020 Hao Li, Xiaopeng Zhang, Hongkai Xiong, Qi Tian

In this paper, we propose Attribute Mix, a data augmentation strategy at attribute level to expand the fine-grained samples.

Attribute Data Augmentation +1

Person Transfer GAN to Bridge Domain Gap for Person Re-Identification

25 code implementations CVPR 2018 Longhui Wei, Shiliang Zhang, Wen Gao, Qi Tian

Although the performance of person Re-Identification (ReID) has been significantly boosted, many challenging issues in real scenarios have not been fully investigated, e.g., the complex scenes and lighting variations, viewpoint and pose changes, and the large number of identities in a camera network.

Generative Adversarial Network Person Re-Identification +1

Deep Modular Co-Attention Networks for Visual Question Answering

7 code implementations CVPR 2019 Zhou Yu, Jun Yu, Yuhao Cui, DaCheng Tao, Qi Tian

In this paper, we propose a deep Modular Co-Attention Network (MCAN) that consists of Modular Co-Attention (MCA) layers cascaded in depth.

Question Answering Visual Question Answering

PC-DARTS: Partial Channel Connections for Memory-Efficient Architecture Search

8 code implementations ICLR 2020 Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Guo-Jun Qi, Qi Tian, Hongkai Xiong

Differentiable architecture search (DARTS) provided a fast solution in finding effective network architectures, but suffered from large memory and computing overheads in jointly training a super-network and searching for an optimal architecture.

Neural Architecture Search

Pixel Difference Networks for Efficient Edge Detection

2 code implementations ICCV 2021 Zhuo Su, Wenzhe Liu, Zitong Yu, Dewen Hu, Qing Liao, Qi Tian, Matti Pietikäinen, Li Liu

A faster version of PiDiNet with fewer than 0.1M parameters can still achieve performance comparable to the state of the art at 200 FPS.

Edge Detection

Progressive Differentiable Architecture Search: Bridging the Depth Gap between Search and Evaluation

4 code implementations ICCV 2019 Xin Chen, Lingxi Xie, Jun Wu, Qi Tian

Recently, differentiable search methods have made major progress in reducing the computational costs of neural architecture search.

Neural Architecture Search

Progressive DARTS: Bridging the Optimization Gap for NAS in the Wild

4 code implementations 23 Dec 2019 Xin Chen, Lingxi Xie, Jun Wu, Qi Tian

With the rapid development of neural architecture search (NAS), researchers found powerful network architectures for a wide range of vision tasks.

Neural Architecture Search

DocScanner: Robust Document Image Rectification with Progressive Learning

3 code implementations 28 Oct 2021 Hao Feng, Wengang Zhou, Jiajun Deng, Qi Tian, Houqiang Li

The iterative refinements make DocScanner converge to a robust and superior rectification performance, while the lightweight recurrent architecture ensures the running efficiency.

Optical Character Recognition (OCR)

Towards Discriminability and Diversity: Batch Nuclear-norm Maximization under Label Insufficient Situations

2 code implementations CVPR 2020 Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian

We find by theoretical analysis that the prediction discriminability and diversity could be separately measured by the Frobenius-norm and rank of the batch output matrix.

Domain Adaptation

Gradually Vanishing Bridge for Adversarial Domain Adaptation

2 code implementations CVPR 2020 Shuhao Cui, Shuhui Wang, Junbao Zhuo, Chi Su, Qingming Huang, Qi Tian

On the discriminator, GVB helps enhance the discriminating ability and balance the adversarial training process.

Unsupervised Domain Adaptation

Fast Dynamic Radiance Fields with Time-Aware Neural Voxels

1 code implementation 30 May 2022 Jiemin Fang, Taoran Yi, Xinggang Wang, Lingxi Xie, Xiaopeng Zhang, Wenyu Liu, Matthias Nießner, Qi Tian

A multi-distance interpolation method is proposed and applied on voxel features to model both small and large motions.

Fast Batch Nuclear-norm Maximization and Minimization for Robust Domain Adaptation

1 code implementation 13 Jul 2021 Shuhao Cui, Shuhui Wang, Junbao Zhuo, Liang Li, Qingming Huang, Qi Tian

Due to the domain discrepancy in visual domain adaptation, the performance of the source model degrades when it encounters the high data density near the decision boundary in the target domain.

Domain Adaptation

Co-Evolutionary Compression for Unpaired Image Translation

2 code implementations ICCV 2019 Han Shu, Yunhe Wang, Xu Jia, Kai Han, Hanting Chen, Chunjing Xu, Qi Tian, Chang Xu

Generative adversarial networks (GANs) have been successfully used for a considerable number of computer vision tasks, especially image-to-image translation.

Image-to-Image Translation Translation

Cross-Scale Cost Aggregation for Stereo Matching

1 code implementation CVPR 2014 Kang Zhang, Yuqiang Fang, Dongbo Min, Lifeng Sun, Shiqiang Yang, Shuicheng Yan, Qi Tian

We firstly reformulate cost aggregation from a unified optimization perspective and show that different cost aggregation methods essentially differ in the choices of similarity kernels.

Stereo Matching Stereo Matching Hand

Multinomial Distribution Learning for Effective Neural Architecture Search

1 code implementation ICCV 2019 Xiawu Zheng, Rongrong Ji, Lang Tang, Baochang Zhang, Jianzhuang Liu, Qi Tian

Therefore, NAS can be transformed to a multinomial distribution learning problem, i.e., the distribution is optimized to have a high expectation of the performance.

Neural Architecture Search

CenterNet++ for Object Detection

2 code implementations 18 Apr 2022 Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian

Our approach, named CenterNet, detects each object as a triplet of keypoints (top-left and bottom-right corners and the center keypoint).

Object object-detection +1

Rethinking Performance Estimation in Neural Architecture Search

1 code implementation CVPR 2020 Xiawu Zheng, Rongrong Ji, Qiang Wang, Qixiang Ye, Zhenguo Li, Yonghong Tian, Qi Tian

In this paper, we provide a novel yet systematic rethinking of PE in a resource constrained regime, termed budgeted PE (BPE), which precisely and effectively estimates the performance of an architecture sampled from an architecture space.

Neural Architecture Search

Adversarial Domain Adaptation with Domain Mixup

1 code implementation 4 Dec 2019 Minghao Xu, Jian Zhang, Bingbing Ni, Teng Li, Chengjie Wang, Qi Tian, Wenjun Zhang

In this paper, we present adversarial domain adaptation with domain mixup (DM-ADA), which guarantees domain-invariance in a more continuous latent space and guides the domain discriminator in judging samples' difference relative to source and target domains.

Domain Adaptation

Location-Sensitive Visual Recognition with Cross-IOU Loss

1 code implementation 11 Apr 2021 Kaiwen Duan, Lingxi Xie, Honggang Qi, Song Bai, Qingming Huang, Qi Tian

Object detection, instance segmentation, and pose estimation are popular visual recognition tasks which require localizing the object by internal or boundary landmarks.

2D Human Pose Estimation Instance Segmentation +5

A Fourier-based Framework for Domain Generalization

1 code implementation CVPR 2021 Qinwei Xu, Ruipeng Zhang, Ya Zhang, Yanfeng Wang, Qi Tian

Modern deep neural networks suffer from performance degradation when evaluated on testing data under different distributions from training data.

Data Augmentation Domain Generalization

Cross-domain Detection via Graph-induced Prototype Alignment

1 code implementation CVPR 2020 Minghao Xu, Hang Wang, Bingbing Ni, Qi Tian, Wenjun Zhang

To mitigate these problems, we propose a Graph-induced Prototype Alignment (GPA) framework to seek category-level domain alignment via elaborate prototype representations.

Domain Adaptation object-detection +1

Dynamic Multiscale Graph Neural Networks for 3D Skeleton-Based Human Motion Prediction

1 code implementation 17 Mar 2020 Maosen Li, Siheng Chen, Yangheng Zhao, Ya Zhang, Yan-Feng Wang, Qi Tian

The core idea of DMGNN is to use a multiscale graph to comprehensively model the internal relations of a human body for motion feature learning.

3D Human Pose Estimation 3D Pose Estimation +2

Label Decoupling Framework for Salient Object Detection

1 code implementation CVPR 2020 Jun Wei, Shuhui Wang, Zhe Wu, Chi Su, Qingming Huang, Qi Tian

Though remarkable progress has been achieved, we observe that the closer a pixel is to the edge, the more difficult it is to predict, because edge pixels have a very imbalanced distribution.

Object object-detection +3

CARS: Continuous Evolution for Efficient Neural Architecture Search

1 code implementation CVPR 2020 Zhaohui Yang, Yunhe Wang, Xinghao Chen, Boxin Shi, Chao Xu, Chunjing Xu, Qi Tian, Chang Xu

Architectures in the population that share parameters within one SuperNet in the latest generation will be tuned over the training dataset with a few epochs.

Neural Architecture Search

Omni-GAN: On the Secrets of cGANs and Beyond

3 code implementations ICCV 2021 Peng Zhou, Lingxi Xie, Bingbing Ni, Cong Geng, Qi Tian

The conditional generative adversarial network (cGAN) is a powerful tool for generating high-quality images, but existing approaches mostly suffer from unsatisfactory performance or the risk of mode collapse.

Conditional Image Generation Generative Adversarial Network

UnrealPerson: An Adaptive Pipeline towards Costless Person Re-identification

1 code implementation CVPR 2021 Tianyu Zhang, Lingxi Xie, Longhui Wei, Zijie Zhuang, Yongfei Zhang, Bo Li, Qi Tian

The main difficulty of person re-identification (ReID) lies in collecting annotated data and transferring the model across different domains.

Domain Adaptation Image Generation +1

Stabilizing DARTS with Amended Gradient Estimation on Architectural Parameters

1 code implementation 25 Oct 2019 Kaifeng Bi, Changping Hu, Lingxi Xie, Xin Chen, Longhui Wei, Qi Tian

Our approach bridges the gap from two aspects, namely, amending the estimation on the architectural gradients, and unifying the hyper-parameter settings in the search and re-training stages.

Neural Architecture Search

GraphQ IR: Unifying the Semantic Parsing of Graph Query Languages with One Intermediate Representation

1 code implementation 24 May 2022 Lunyiu Nie, Shulin Cao, Jiaxin Shi, Jiuding Sun, Qi Tian, Lei Hou, Juanzi Li, Jidong Zhai

Subject to the huge semantic gap between natural and formal languages, neural semantic parsing is typically bottlenecked by its complexity of dealing with both input semantics and output syntax.

Few-Shot Learning Semantic Parsing

Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition

1 code implementation 1 Jul 2022 Mingkun Yang, Minghui Liao, Pu Lu, Jing Wang, Shenggao Zhu, Hualin Luo, Qi Tian, Xiang Bai

Inspired by the observation that humans learn to recognize the texts through both reading and writing, we propose to learn discrimination and generation by integrating contrastive learning and masked image modeling in our self-supervised method.

Contrastive Learning Scene Text Recognition

Masked Autoencoders are Robust Data Augmentors

1 code implementation 10 Jun 2022 Haohang Xu, Shuangrui Ding, Xiaopeng Zhang, Hongkai Xiong, Qi Tian

Specifically, MRA consistently enhances the performance on supervised, semi-supervised as well as few-shot classification.

Image Augmentation Image Classification +1

SAILER: Structure-aware Pre-trained Language Model for Legal Case Retrieval

1 code implementation 22 Apr 2023 Haitao Li, Qingyao Ai, Jia Chen, Qian Dong, Yueyue Wu, Yiqun Liu, Chong Chen, Qi Tian

Moreover, in contrast to the general retrieval, the relevance in the legal domain is sensitive to key legal elements.

Language Modelling Retrieval

Filter Sketch for Network Pruning

1 code implementation 23 Jan 2020 Mingbao Lin, Liujuan Cao, Shaojie Li, Qixiang Ye, Yonghong Tian, Jianzhuang Liu, Qi Tian, Rongrong Ji

Our approach, referred to as FilterSketch, encodes the second-order information of pre-trained weights, which enables the representation capacity of pruned networks to be recovered with a simple fine-tuning procedure.

Network Pruning

HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling

1 code implementation 30 May 2022 Xiaosong Zhang, Yunjie Tian, Wei Huang, Qixiang Ye, Qi Dai, Lingxi Xie, Qi Tian

A key idea of efficient implementation is to discard the masked image patches (or tokens) throughout the target network (encoder), which requires the encoder to be a plain vision transformer (e.g., ViT), albeit hierarchical vision transformers (e.g., Swin Transformer) have potentially better properties in formulating vision inputs.

Transfer Learning

SdAE: Self-distillated Masked Autoencoder

1 code implementation 31 Jul 2022 Yabo Chen, Yuchen Liu, Dongsheng Jiang, Xiaopeng Zhang, Wenrui Dai, Hongkai Xiong, Qi Tian

We also analyze how to build good views for the teacher branch to produce latent representation from the perspective of information bottleneck.

Descriptive Self-Supervised Learning

Bottom-Up Temporal Action Localization with Mutual Regularization

1 code implementation ECCV 2020 Peisen Zhao, Lingxi Xie, Chen Ju, Ya Zhang, Yan-Feng Wang, Qi Tian

To alleviate this problem, we introduce two regularization terms to mutually regularize the learning procedure: the Intra-phase Consistency (IntraC) regularization is proposed to make the predictions verified inside each phase; and the Inter-phase Consistency (InterC) regularization is proposed to keep consistency between these phases.

Temporal Action Localization

Multi-Cue Correlation Filters for Robust Visual Tracking

1 code implementation CVPR 2018 Ning Wang, Wengang Zhou, Qi Tian, Richang Hong, Meng Wang, Houqiang Li

By combining different types of features, our approach constructs multiple experts through Discriminative Correlation Filter (DCF) and each of them tracks the target independently.

Visual Tracking

Visual Recognition by Request

1 code implementation CVPR 2023 Chufeng Tang, Lingxi Xie, Xiaopeng Zhang, Xiaolin Hu, Qi Tian

Humans have the ability to recognize visual semantics at an unlimited granularity, but existing visual recognition algorithms cannot achieve this goal.

Instance Segmentation Semantic Segmentation

ChatterBox: Multi-round Multimodal Referring and Grounding

1 code implementation 24 Jan 2024 Yunjie Tian, Tianren Ma, Lingxi Xie, Jihao Qiu, Xi Tang, Yuan Zhang, Jianbin Jiao, Qi Tian, Qixiang Ye

In this study, we establish a baseline for a new task named multimodal multi-round referring and grounding (MRG), opening up a promising direction for instance-level multimodal dialogues.

Language Modelling Visual Grounding

Adaptive Graph Representation Learning for Video Person Re-identification

1 code implementation 5 Sep 2019 Yiming Wu, Omar El Farouk Bourahla, Xi Li, Fei Wu, Qi Tian, Xue Zhou

While correlations between parts are ignored in the previous methods, to leverage the relations of different parts, we propose an innovative adaptive graph representation learning scheme for video person Re-ID, which enables the contextual interactions between relevant regional features.

Graph Representation Learning Video-Based Person Re-Identification

Large-Scale Spatio-Temporal Person Re-identification: Algorithms and Benchmark

2 code implementations 31 May 2021 Xiujun Shu, Xiao Wang, Xianghao Zang, Shiliang Zhang, Yuanqi Chen, Ge Li, Qi Tian

We also verified that models pre-trained on LaST can generalize well on existing datasets with short-term and cloth-changing scenarios.

Person Re-Identification

NeuSample: Neural Sample Field for Efficient View Synthesis

1 code implementation 30 Nov 2021 Jiemin Fang, Lingxi Xie, Xinggang Wang, Xiaopeng Zhang, Wenyu Liu, Qi Tian

Neural radiance fields (NeRF) have shown great potential in representing 3D scenes and synthesizing novel views, but the computational overhead of NeRF at the inference stage is still heavy.

Wavelet-Based Dual-Branch Network for Image Demoireing

1 code implementation 14 Jul 2020 Lin Liu, Jianzhuang Liu, Shanxin Yuan, Gregory Slabaugh, Ales Leonardis, Wengang Zhou, Qi Tian

When smartphone cameras are used to take photos of digital screens, moire patterns usually result, severely degrading photo quality.

Demoire Image Restoration +1

Federated Domain Generalization With Generalization Adjustment

1 code implementation CVPR 2023 Ruipeng Zhang, Qinwei Xu, Jiangchao Yao, Ya Zhang, Qi Tian, Yanfeng Wang

Federated Domain Generalization (FedDG) attempts to learn a global model in a privacy-preserving manner that generalizes well to new clients possibly with domain shift.

Domain Generalization Fairness +1

Adversarial Training Towards Robust Multimedia Recommender System

1 code implementation 19 Sep 2018 Jinhui Tang, Xiaoyu Du, Xiangnan He, Fajie Yuan, Qi Tian, Tat-Seng Chua

To this end, we propose a novel solution named Adversarial Multimedia Recommendation (AMR), which can lead to a more robust multimedia recommender model by using adversarial learning.

Information Retrieval Multimedia

Bag of Instances Aggregation Boosts Self-supervised Distillation

1 code implementation ICLR 2022 Haohang Xu, Jiemin Fang, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian

Here, a bag of instances denotes a set of similar samples constructed by the teacher and grouped within a bag, and the goal of distillation is to aggregate compact representations over the student with respect to instances in a bag.

Contrastive Learning Self-Supervised Learning

A Bi-Step Grounding Paradigm for Large Language Models in Recommendation Systems

1 code implementation 16 Aug 2023 Keqin Bao, Jizhi Zhang, Wenjie Wang, Yang Zhang, Zhengyi Yang, Yancheng Luo, Chong Chen, Fuli Feng, Qi Tian

As the focus on Large Language Models (LLMs) in the field of recommendation intensifies, the optimization of LLMs for recommendation purposes (referred to as LLM4Rec) assumes a crucial role in augmenting their effectiveness in providing recommendations.

Collaborative Filtering Recommendation Systems

Enhancing Person Re-identification in a Self-trained Subspace

1 code implementation 20 Apr 2017 Xun Yang, Meng Wang, Richang Hong, Qi Tian, Yong Rui

To address this problem, in this paper, we propose a self-trained subspace learning paradigm for person re-ID which effectively utilizes both labeled and unlabeled data to learn a discriminative subspace where person images across disjoint camera views can be easily matched.

Person Re-Identification

Partial Class Activation Attention for Semantic Segmentation

1 code implementation CVPR 2022 Sun-Ao Liu, Hongtao Xie, Hai Xu, Yongdong Zhang, Qi Tian

Current attention-based methods for semantic segmentation mainly model pixel relation through pairwise affinity and coarse segmentation.

Relation Segmentation +1

Boosting Segment Anything Model Towards Open-Vocabulary Learning

1 code implementation 6 Dec 2023 Xumeng Han, Longhui Wei, Xuehui Yu, Zhiyang Dou, Xin He, Kuiran Wang, Zhenjun Han, Qi Tian

The recent Segment Anything Model (SAM) has emerged as a new paradigmatic vision foundation model, showcasing potent zero-shot generalization and flexible prompting.

Object Object Localization +2

HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data

1 code implementation 22 Nov 2023 Qifan Yu, Juncheng Li, Longhui Wei, Liang Pang, Wentao Ye, Bosheng Qin, Siliang Tang, Qi Tian, Yueting Zhuang

Multi-modal Large Language Models (MLLMs) tuned on machine-generated instruction-following data have demonstrated remarkable performance in various multi-modal understanding and generation tasks.

Attribute counterfactual +3

Deep Multimodal Neural Architecture Search

1 code implementation 25 Apr 2020 Zhou Yu, Yuhao Cui, Jun Yu, Meng Wang, DaCheng Tao, Qi Tian

Most existing works focus on a single task and design neural architectures manually, which are highly task-specific and hard to generalize to different tasks.

Image-text matching Neural Architecture Search +4

Semantic-Aware Generation for Self-Supervised Visual Representation Learning

1 code implementation 25 Nov 2021 Yunjie Tian, Lingxi Xie, Xiaopeng Zhang, Jiemin Fang, Haohang Xu, Wei Huang, Jianbin Jiao, Qi Tian, Qixiang Ye

In this paper, we propose a self-supervised visual representation learning approach which involves both generative and discriminative proxies, where we focus on the former part by requiring the target network to recover the original image based on the mid-level features.

Representation Learning Semantic Segmentation

Skeleton-Parted Graph Scattering Networks for 3D Human Motion Prediction

1 code implementation 31 Jul 2022 Maosen Li, Siheng Chen, Zijing Zhang, Lingxi Xie, Qi Tian, Ya Zhang

To address the first issue, we propose adaptive graph scattering, which leverages multiple trainable band-pass graph filters to decompose pose features into richer graph spectrum bands.

Human motion prediction motion prediction

Towards a Unified View on Visual Parameter-Efficient Transfer Learning

1 code implementation 3 Oct 2022 Bruce X. B. Yu, Jianlong Chang, Lingbo Liu, Qi Tian, Chang Wen Chen

Towards this goal, we propose a framework with a unified view of PETL called visual-PETL (V-PETL) to investigate the effects of different PETL techniques, data scales of downstream domains, positions of trainable parameters, and other aspects affecting the trade-off.

Action Recognition Image Classification +2

CooGAN: A Memory-Efficient Framework for High-Resolution Facial Attribute Editing

1 code implementation ECCV 2020 Xuanhong Chen, Bingbing Ni, Naiyuan Liu, Ziang Liu, Yiliu Jiang, Loc Truong, Qi Tian

In contrast to the great success of memory-consuming face editing methods at a low resolution, manipulating high-resolution (HR) facial images, i.e., typically larger than 768² pixels, with very limited memory is still challenging.

Attribute Image Generation +2

Greedy Gradient Ensemble for Robust Visual Question Answering

1 code implementation ICCV 2021 Xinzhe Han, Shuhui Wang, Chi Su, Qingming Huang, Qi Tian

Language bias is a critical issue in Visual Question Answering (VQA), where models often exploit dataset biases for the final decision without considering the image information.

Question Answering Visual Question Answering

Hadamard Matrix Guided Online Hashing

1 code implementation 11 May 2019 Mingbao Lin, Rongrong Ji, Hong Liu, Xiaoshuai Sun, Shen Chen, Qi Tian

We then treat the learning of hash functions as a set of binary classification problems to fit the assigned target code.

Binary Classification

Single Camera Training for Person Re-identification

1 code implementation 24 Sep 2019 Tianyu Zhang, Lingxi Xie, Longhui Wei, Yongfei Zhang, Bo Li, Qi Tian

Differently, this paper investigates ReID in an unexplored single-camera-training (SCT) setting, where each person in the training set appears in only one camera.

Metric Learning Person Re-Identification

GOLD-NAS: Gradual, One-Level, Differentiable

1 code implementation 7 Jul 2020 Kaifeng Bi, Lingxi Xie, Xin Chen, Longhui Wei, Qi Tian

There is a large body of literature on neural architecture search, but most existing work made use of heuristic rules that largely constrained the search flexibility.

Image Classification Neural Architecture Search

DE-Net: Dynamic Text-guided Image Editing Adversarial Networks

1 code implementation 2 Jun 2022 Ming Tao, Bing-Kun Bao, Hao Tang, Fei Wu, Longhui Wei, Qi Tian

To solve these limitations, we propose: (i) a Dynamic Editing Block (DEBlock) which composes different editing modules dynamically for various editing requirements.

text-guided-image-editing

Towards Visual Feature Translation

1 code implementation CVPR 2019 Jie Hu, Rongrong Ji, Hong Liu, Shengchuan Zhang, Cheng Deng, Qi Tian

In this paper, we make the first attempt towards visual feature translation to break through the barrier of using features across different visual search systems.

Translation

Iterative Reorganization with Weak Spatial Constraints: Solving Arbitrary Jigsaw Puzzles for Unsupervised Representation Learning

1 code implementation CVPR 2019 Chen Wei, Lingxi Xie, Xutong Ren, Yingda Xia, Chi Su, Jiaying Liu, Qi Tian, Alan L. Yuille

We consider spatial contexts, for which we solve so-called jigsaw puzzles, i.e., each image is cut into grids and then disordered, and the goal is to recover the correct configuration.

General Classification Image Classification +4

Towards 3D Molecule-Text Interpretation in Language Models

1 code implementation 25 Jan 2024 Sihang Li, Zhiyuan Liu, Yanchen Luo, Xiang Wang, Xiangnan He, Kenji Kawaguchi, Tat-Seng Chua, Qi Tian

Through 3D molecule-text alignment and 3D molecule-centric instruction tuning, 3D-MoLM establishes an integration of 3D molecular encoder and LM.

Instruction Following Language Modelling +3

Wnet: Audio-Guided Video Object Segmentation via Wavelet-Based Cross-Modal Denoising Networks

1 code implementation CVPR 2022 Wenwen Pan, Haonan Shi, Zhou Zhao, Jieming Zhu, Xiuqiang He, Zhigeng Pan, Lianli Gao, Jun Yu, Fei Wu, Qi Tian

Audio-Guided video semantic segmentation is a challenging problem in visual analysis and editing, which automatically separates foreground objects from background in a video sequence according to the referring audio expressions.

Denoising Segmentation +3

AVT: Unsupervised Learning of Transformation Equivariant Representations by Autoencoding Variational Transformations

1 code implementation ICCV 2019 Guo-Jun Qi, Liheng Zhang, Chang Wen Chen, Qi Tian

This ensures the resultant TERs of individual images contain the intrinsic information about their visual structures that would equivary extricably under various transformations in a generalized nonlinear case.

Self-Adaptively Learning to Demoire from Focused and Defocused Image Pairs

1 code implementation 3 Nov 2020 Lin Liu, Shanxin Yuan, Jianzhuang Liu, Liping Bao, Gregory Slabaugh, Qi Tian

In this paper, we propose a self-adaptive learning method for demoireing a high-frequency image, with the help of an additional defocused moire-free blur image.

Demoire Test-time Adaptation

Creating Something from Nothing: Unsupervised Knowledge Distillation for Cross-Modal Hashing

1 code implementation CVPR 2020 Hengtong Hu, Lingxi Xie, Richang Hong, Qi Tian

In recent years, cross-modal hashing (CMH) has attracted increasing attention, mainly because of its potential ability to map contents from different modalities, especially vision and language, into the same space, so that it becomes efficient in cross-modal data retrieval.

Knowledge Distillation Retrieval

Dual Distribution Alignment Network for Generalizable Person Re-Identification

1 code implementation 27 Jul 2020 Peixian Chen, Pingyang Dai, Jianzhuang Liu, Feng Zheng, Qi Tian, Rongrong Ji

Domain generalization (DG) serves as a promising solution to handle person Re-Identification (Re-ID), which trains the model using labels from the source domain alone, and then directly adopts the trained model to the target domain without model updating.

Domain Generalization Generalizable Person Re-identification

DATA: Domain-Aware and Task-Aware Self-supervised Learning

1 code implementation CVPR 2022 Qing Chang, Junran Peng, Lingxie Xie, Jiajun Sun, Haoran Yin, Qi Tian, Zhaoxiang Zhang

However, due to the high training costs and the unconsciousness of downstream usages, most self-supervised learning methods lack the capability to correspond to the diversities of downstream scenarios, as there are various data domains, different vision tasks and latency constraints on models.

Image Classification Model Selection +5

Prune Spatio-temporal Tokens by Semantic-aware Temporal Accumulation

1 code implementation ICCV 2023 Shuangrui Ding, Peisen Zhao, Xiaopeng Zhang, Rui Qian, Hongkai Xiong, Qi Tian

Based on the STA score, we are able to progressively prune the tokens without introducing any additional parameters or requiring further re-training.

Video Recognition

DeeCap: Dynamic Early Exiting for Efficient Image Captioning

1 code implementation CVPR 2022 Zhengcong Fei, Xu Yan, Shuhui Wang, Qi Tian

On one hand, the representation in shallow layers lacks high-level semantic and sufficient cross-modal fusion information for accurate prediction.

Image Captioning Imitation Learning

Projection & Probability-Driven Black-Box Attack

1 code implementation CVPR 2020 Jie Li, Rongrong Ji, Hong Liu, Jianzhuang Liu, Bineng Zhong, Cheng Deng, Qi Tian

For reducing the solution space, we first model the adversarial perturbation optimization problem as a process of recovering frequency-sparse perturbations with compressed sensing, under the setting that random noise in the low-frequency space is more likely to be adversarial.

DR2-Net: Deep Residual Reconstruction Network for Image Compressive Sensing

1 code implementation 19 Feb 2017 Hantao Yao, Feng Dai, Dongming Zhang, Yike Ma, Shiliang Zhang, Yongdong Zhang, Qi Tian

Accordingly, DR$^{2}$-Net consists of two components, i.e., a linear mapping network and a residual network.

Compressive Sensing Image Reconstruction

Being Comes from Not-being: Open-vocabulary Text-to-Motion Generation with Wordless Training

1 code implementation CVPR 2023 Junfan Lin, Jianlong Chang, Lingbo Liu, Guanbin Li, Liang Lin, Qi Tian, Chang Wen Chen

During inference, instead of changing the motion generator, our method reformulates the input text into a masked motion as the prompt for the motion generator to "reconstruct" the motion.

Language Modelling Zero-Shot Learning

A Real-time Global Inference Network for One-stage Referring Expression Comprehension

1 code implementation 7 Dec 2019 Yiyi Zhou, Rongrong Ji, Gen Luo, Xiaoshuai Sun, Jinsong Su, Xinghao Ding, Chia-Wen Lin, Qi Tian

Referring Expression Comprehension (REC) is an emerging research spot in computer vision, which refers to detecting the target region in an image given a text description.

feature selection Referring Expression +1

Circumventing Outliers of AutoAugment with Knowledge Distillation

1 code implementation ECCV 2020 Longhui Wei, An Xiao, Lingxi Xie, Xin Chen, Xiaopeng Zhang, Qi Tian

AutoAugment has been a powerful algorithm that improves the accuracy of many vision tasks, yet it is sensitive to the operator space as well as hyper-parameters, and an improper setting may degenerate network optimization.

Data Augmentation General Classification +2

Searching towards Class-Aware Generators for Conditional Generative Adversarial Networks

1 code implementation 25 Jun 2020 Peng Zhou, Lingxi Xie, Xiaopeng Zhang, Bingbing Ni, Qi Tian

To learn the sampling policy, a Markov decision process is embedded into the search algorithm and a moving average is applied for better stability.

Image Generation

When Parameter-efficient Tuning Meets General-purpose Vision-language Models

1 code implementation 16 Dec 2023 Yihang Zhai, Haixin Wang, Jianlong Chang, Xinlong Yang, Jinan Sun, Shikun Zhang, Qi Tian

Instruction tuning has shown promising potential for developing general-purpose AI capabilities by using large-scale pre-trained models and boosts growing research to integrate multimodal information for creative applications.

General Greedy De-bias Learning

1 code implementation 20 Dec 2021 Xinzhe Han, Shuhui Wang, Chi Su, Qingming Huang, Qi Tian

Existing de-bias learning frameworks try to capture specific dataset bias by annotations but they fail to handle complicated OOD scenarios.

Image Classification Question Answering +1

Active Pointly-Supervised Instance Segmentation

1 code implementation 23 Jul 2022 Chufeng Tang, Lingxi Xie, Gang Zhang, Xiaopeng Zhang, Qi Tian, Xiaolin Hu

In this paper, we present an economic active learning setting, named active pointly-supervised instance segmentation (APIS), which starts with box-level annotations and iteratively samples a point within the box and asks if it falls on the object.

Active Learning Instance Segmentation +2

AiluRus: A Scalable ViT Framework for Dense Prediction

1 code implementation NeurIPS 2023 Jin Li, Yaoming Wang, Xiaopeng Zhang, Bowen Shi, Dongsheng Jiang, Chenglin Li, Wenrui Dai, Hongkai Xiong, Qi Tian

Specifically, at the intermediate layer of the ViT, we utilize a spatial-aware density-based clustering algorithm to select representative tokens from the token sequence.

object-detection Object Detection +1

FedSkip: Combatting Statistical Heterogeneity with Federated Skip Aggregation

1 code implementation 14 Dec 2022 Ziqing Fan, Yanfeng Wang, Jiangchao Yao, Lingjuan Lyu, Ya Zhang, Qi Tian

However, in addition to previous explorations for improvement in federated averaging, our analysis shows that another critical bottleneck is the poorer optima of client models in more heterogeneous conditions.

Federated Learning

Learning Transferable Pedestrian Representation from Multimodal Information Supervision

1 code implementation 12 Apr 2023 Liping Bao, Longhui Wei, Xiaoyu Qiu, Wengang Zhou, Houqiang Li, Qi Tian

Recent research on unsupervised person re-identification (reID) has demonstrated that pre-training on unlabeled person images achieves better performance on downstream reID tasks than pre-training on ImageNet.

Attribute Contrastive Learning +3

One-bit Supervision for Image Classification

1 code implementation NeurIPS 2020 Hengtong Hu, Lingxi Xie, Zewei Du, Richang Hong, Qi Tian

Instead of training a model upon the accurate label of each sample, our setting requires the model to query with a predicted label of each sample and learn from the answer whether the guess is correct.

Classification General Classification +1

Dilated Context Integrated Network with Cross-Modal Consensus for Temporal Emotion Localization in Videos

1 code implementation 3 Aug 2022 Juncheng Li, Junlin Xie, Linchao Zhu, Long Qian, Siliang Tang, Wenqiao Zhang, Haochen Shi, Shengyu Zhang, Longhui Wei, Qi Tian, Yueting Zhuang

In this paper, we introduce a new task, named Temporal Emotion Localization in videos (TEL), which aims to detect human emotions and localize their corresponding temporal boundaries in untrimmed videos with aligned subtitles.

Emotion Classification Temporal Action Localization +1

Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model

1 code implementation 28 Mar 2024 Zhicai Wang, Longhui Wei, Tan Wang, Heyu Chen, Yanbin Hao, Xiang Wang, Xiangnan He, Qi Tian

Text-to-image (T2I) generative models have recently emerged as a powerful tool, enabling the creation of photo-realistic images and giving rise to a multitude of applications.

Data Augmentation Image Classification

Fast Non-Local Neural Networks with Spectral Residual Learning

1 code implementation MM '19: Proceedings of the 27th ACM International Conference on Multimedia 2019 Lu Chi, Guiyu Tian, Yadong Mu, Lingxi Xie, Qi Tian

We show its equivalence to conducting residual learning in some spectral domain and carefully re-formulate a variety of neural layers into their spectral forms, such as ReLU or convolutions.

Pose Estimation Video Classification

DisturbLabel: Regularizing CNN on the Loss Layer

2 code implementations CVPR 2016 Lingxi Xie, Jingdong Wang, Zhen Wei, Meng Wang, Qi Tian

For a long time, we have been combating over-fitting in the CNN training process with model regularization, including weight decay, model averaging, data augmentation, etc.

Data Augmentation

Adapting Shortcut With Normalizing Flow: An Efficient Tuning Framework for Visual Recognition

1 code implementation CVPR 2023 Yaoming Wang, Bowen Shi, Xiaopeng Zhang, Jin Li, Yuchen Liu, Wenrui Dai, Chenglin Li, Hongkai Xiong, Qi Tian

To mitigate the computational and storage demands, recent research has explored Parameter-Efficient Fine-Tuning (PEFT), which focuses on tuning a minimal number of parameters for efficient adaptation.

Probabilistic Tree-of-thought Reasoning for Answering Knowledge-intensive Complex Questions

1 code implementation 23 Nov 2023 Shulin Cao, Jiajie Zhang, Jiaxin Shi, Xin Lv, Zijun Yao, Qi Tian, Juanzi Li, Lei Hou

During reasoning, for leaf nodes, LLMs choose a more confident answer from Closed-book QA that employs parametric knowledge and Open-book QA that employs retrieved external knowledge, thus eliminating the negative retrieval problem.

Retrieval

Loss re-scaling VQA: Revisiting the Language Prior Problem from a Class-imbalance View

1 code implementation 30 Oct 2020 Yangyang Guo, Liqiang Nie, Zhiyong Cheng, Qi Tian, Min Zhang

Concretely, we design a novel interpretation scheme whereby the loss of mis-predicted frequent and sparse answers of the same question type is distinctly exhibited during the late training phase.

Face Recognition Image Classification +2

Latency-Aware Differentiable Neural Architecture Search

1 code implementation 17 Jan 2020 Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Bowen Shi, Qi Tian, Hongkai Xiong

However, these methods suffer from difficulty in optimizing the network, so that the searched network is often unfriendly to hardware.

Neural Architecture Search

Semi-Autoregressive Image Captioning

1 code implementation 11 Oct 2021 Xu Yan, Zhengcong Fei, Zekang Li, Shuhui Wang, Qingming Huang, Qi Tian

Non-autoregressive image captioning with continuous iterative refinement, which eliminates the sequential dependence in a sentence generation, can achieve comparable performance to the autoregressive counterparts with a considerable acceleration.

Image Captioning Sentence

Prototype-guided Cross-task Knowledge Distillation for Large-scale Models

1 code implementation 26 Dec 2022 Deng Li, Aming Wu, Yahong Han, Qi Tian

Considering the complexity and variability of real scene tasks, we propose a Prototype-guided Cross-task Knowledge Distillation (ProC-KD) approach to transfer the intrinsic local-level object knowledge of a large-scale teacher network to various task scenarios.

Knowledge Distillation

SIFT Meets CNN: A Decade Survey of Instance Retrieval

1 code implementation 5 Aug 2016 Liang Zheng, Yi Yang, Qi Tian

This survey presents milestones in modern instance retrieval, reviews a broad selection of previous works in different categories, and provides insights on the connection between SIFT and CNN-based methods.

Content-Based Image Retrieval Retrieval

API-Net: Robust Generative Classifier via a Single Discriminator

1 code implementation ECCV 2020 Xinshuai Dong, Hong Liu, Rongrong Ji, Liujuan Cao, Qixiang Ye, Jianzhuang Liu, Qi Tian

On the contrary, a discriminative classifier only models the conditional distribution of labels given inputs, but benefits from effective optimization owing to its succinct structure.

Robust classification

Entity-enhanced Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding

1 code implementation 18 Jul 2022 Xuejing Liu, Liang Li, Shuhui Wang, Zheng-Jun Zha, Zechao Li, Qi Tian, Qingming Huang

Second, most previous weakly supervised REG methods ignore the discriminative location and context of the referent, causing difficulties in distinguishing the target from other same-category objects.

Attribute Referring Expression +2

Zigzag Learning for Weakly Supervised Object Detection

no code implementations CVPR 2018 Xiaopeng Zhang, Jiashi Feng, Hongkai Xiong, Qi Tian

Unlike them, we propose a zigzag learning strategy to simultaneously discover reliable object instances and prevent the model from overfitting initial seeds.

Object object-detection +1

Social Anchor-Unit Graph Regularized Tensor Completion for Large-Scale Image Retagging

no code implementations 12 Apr 2018 Jinhui Tang, Xiangbo Shu, Zechao Li, Yu-Gang Jiang, Qi Tian

Recent approaches simultaneously explore visual, user and tag information to improve the performance of image retagging by constructing and exploring an image-tag-user graph.

Graph Learning TAG

A Novel Multi-Task Tensor Correlation Neural Network for Facial Attribute Prediction

no code implementations 9 Apr 2018 Mingxing Duan, Kenli Li, Qi Tian

In this paper, we propose a novel multi-attribute tensor correlation neural network (MTCN) for face attribute prediction.

Attribute Multi-Task Learning

The Unmanned Aerial Vehicle Benchmark: Object Detection and Tracking

no code implementations ECCV 2018 Dawei Du, Yuankai Qi, Hongyang Yu, Yifan Yang, Kaiwen Duan, Guorong Li, Weigang Zhang, Qingming Huang, Qi Tian

Selected from 10 hours of raw videos, about 80,000 representative frames are fully annotated with bounding boxes as well as up to 14 kinds of attributes (e.g., weather condition, flying altitude, camera view, vehicle category, and occlusion) for three fundamental computer vision tasks: object detection, single object tracking, and multiple object tracking.

Multiple Object Tracking Object +3

LVreID: Person Re-Identification with Long Sequence Videos

no code implementations 20 Dec 2017 Jianing Li, Shiliang Zhang, Jingdong Wang, Wen Gao, Qi Tian

This paper mainly establishes a large-scale Long sequence Video database for person re-IDentification (LVreID).

Person Re-Identification

Pseudo-positive regularization for deep person re-identification

no code implementations 17 Nov 2017 Fuqing Zhu, Xiangwei Kong, Haiyan Fu, Qi Tian

A small proportion of these retrieved samples are randomly selected as the Pseudo Positive samples and added to the target training set for the supervised CNN training.

Data Augmentation Person Re-Identification

Deep Representation Learning with Part Loss for Person Re-Identification

no code implementations 4 Jul 2017 Hantao Yao, Shiliang Zhang, Yongdong Zhang, Jintao Li, Qi Tian

The representation learning risk is evaluated by the proposed part loss, which automatically generates several parts for an image, and computes the person classification loss on each part separately.

Classification General Classification +2

Learning to Learn Image Classifiers with Visual Analogy

no code implementations CVPR 2019 Linjun Zhou, Peng Cui, Shiqiang Yang, Wenwu Zhu, Qi Tian

We then propose an out-of-sample embedding method to learn the embedding of a new class represented by a few samples through its visual analogy with base classes and derive the classification parameters for the new class.

Classification General Classification +1

Pose-driven Deep Convolutional Model for Person Re-identification

no code implementations ICCV 2017 Chi Su, Jianing Li, Shiliang Zhang, Junliang Xing, Wen Gao, Qi Tian

Our deep architecture explicitly leverages the human part cues to alleviate the pose variations and learn robust feature representations from both the global image and different local parts.

Person Re-Identification

E$^2$BoWs: An End-to-End Bag-of-Words Model via Deep Convolutional Neural Network

no code implementations 18 Sep 2017 Xiaobin Liu, Shiliang Zhang, Tiejun Huang, Qi Tian

To conquer these issues, we propose an End-to-End BoWs (E$^2$BoWs) model based on Deep Convolutional Neural Network (DCNN).

Image Retrieval Quantization +1

GLAD: Global-Local-Alignment Descriptor for Pedestrian Retrieval

no code implementations 13 Sep 2017 Longhui Wei, Shiliang Zhang, Hantao Yao, Wen Gao, Qi Tian

Targeting to solve these problems, this work proposes a Global-Local-Alignment Descriptor (GLAD) and an efficient indexing and retrieval framework, respectively.

Person Re-Identification Representation Learning +1

Multidimensional Scaling on Multiple Input Distance Matrices

no code implementations 1 May 2016 Song Bai, Xiang Bai, Longin Jan Latecki, Qi Tian

How to do multidimensional scaling on multiple input distance matrices is still unsolved to our best knowledge.

One-Shot Fine-Grained Instance Retrieval

no code implementations 4 Jul 2017 Hantao Yao, Shiliang Zhang, Yongdong Zhang, Jintao Li, Qi Tian

Aiming to conquer this issue, we propose a retrieval task named One-Shot Fine-Grained Instance Retrieval (OSFGIR).

Fine-Grained Visual Categorization Image Retrieval +1

Part-based Deep Hashing for Large-scale Person Re-identification

no code implementations 5 May 2017 Fuqing Zhu, Xiangwei Kong, Liang Zheng, Haiyan Fu, Qi Tian

In the experiment, we show that the proposed Part-based Deep Hashing method yields very competitive re-id accuracy on the large-scale Market-1501 and Market-1501+500K datasets.

Deep Hashing Large-Scale Person Re-Identification

Person Re-identification in the Wild

no code implementations CVPR 2017 Liang Zheng, Hengheng Zhang, Shaoyan Sun, Manmohan Chandraker, Yi Yang, Qi Tian

Our baselines address three issues: the performance of various combinations of detectors and recognizers, mechanisms for pedestrian detection to help improve overall re-identification accuracy and assessing the effectiveness of different detectors for re-identification.

Benchmarking Pedestrian Detection +2

Scalable Person Re-identification on Supervised Smoothed Manifold

no code implementations CVPR 2017 Song Bai, Xiang Bai, Qi Tian

Most existing person re-identification algorithms either extract robust visual features or learn discriminative metrics for person images.

Person Re-Identification

Deep Attributes Driven Multi-Camera Person Re-identification

no code implementations 11 May 2016 Chi Su, Shiliang Zhang, Junliang Xing, Wen Gao, Qi Tian

And we propose a semi-supervised attribute learning framework which progressively boosts the accuracy of attributes only using a limited number of labeled data.

Attribute Metric Learning +1

Geometric Neural Phrase Pooling: Modeling the Spatial Co-occurrence of Neurons

no code implementations 21 Jul 2016 Lingxi Xie, Qi Tian, John Flynn, Jingdong Wang, Alan Yuille

For this, we consider the neurons in the hidden layer as neural words, and construct a set of geometric neural phrases on top of them.

Image Classification

Coarse2Fine: Two-Layer Fusion For Image Retrieval

no code implementations 4 Jul 2016 Gaipeng Kong, Le Dong, Wenpu Dong, Liang Zheng, Qi Tian

Departing from previous methods that fuse multiple image descriptors simultaneously, C2F features a layered procedure composed of filtering and refining.

Image Retrieval Retrieval +1

InterActive: Inter-Layer Activeness Propagation

no code implementations CVPR 2016 Lingxi Xie, Liang Zheng, Jingdong Wang, Alan Yuille, Qi Tian

An increasing number of computer vision tasks can be tackled with deep features, which are the intermediate outputs of a pre-trained Convolutional Neural Network.

Descriptive General Classification

Good Practice in CNN Feature Transfer

no code implementations 1 Apr 2016 Liang Zheng, Yali Zhao, Shengjin Wang, Jingdong Wang, Qi Tian

The objective of this paper is the effective transfer of the Convolutional Neural Network (CNN) feature in image search and classification.

General Classification Image Retrieval

Geometric Hypergraph Learning for Visual Tracking

no code implementations 18 Mar 2016 Dawei Du, Honggang Qi, Longyin Wen, Qi Tian, Qingming Huang, Siwei Lyu

Graph based representation is widely used in visual tracking field by finding correct correspondences between target parts in consecutive frames.

Visual Tracking

Person Re-identification Meets Image Search

no code implementations 7 Feb 2015 Liang Zheng, Liyue Shen, Lu Tian, Shengjin Wang, Jiahao Bu, Qi Tian

In the light of recent advances in image search, this paper proposes to treat person re-identification as an image search problem.

Image Retrieval Person Re-Identification

Visual Reranking with Improved Image Graph

no code implementations 3 Jun 2014 Ziqiong Liu, Shengjin Wang, Liang Zheng, Qi Tian

This paper introduces an improved reranking method for the Bag-of-Words (BoW) based image search.

Image Retrieval

Seeing the Big Picture: Deep Embedding with Contextual Evidences

no code implementations 1 Jun 2014 Liang Zheng, Shengjin Wang, Fei He, Qi Tian

Specifically, the Convolutional Neural Network (CNN) is employed to extract features from regional and global patches, leading to the so-called "Deep Embedding" framework.

Image Classification Image Retrieval +1

Bayes Merging of Multiple Vocabularies for Scalable Image Retrieval

no code implementations CVPR 2014 Liang Zheng, Shengjin Wang, Wengang Zhou, Qi Tian

Albeit simple, Bayes merging can be well applied in various merging tasks, and consistently improves the baselines on multi-vocabulary merging.

Image Retrieval Quantization +1

Super-pixel cloud detection using Hierarchical Fusion CNN

no code implementations 19 Oct 2018 Han Liu, Dan Zeng, Qi Tian

Secondly, super-pixel level database is used to train our cloud detection models based on CNN and deep forest.

Binary Classification Cloud Detection +2

Generalized Coarse-to-Fine Visual Recognition with Progressive Training

no code implementations 29 Nov 2018 Xutong Ren, Lingxi Xie, Chen Wei, Siyuan Qiao, Chi Su, Jiaying Liu, Qi Tian, Elliot K. Fishman, Alan L. Yuille

Computer vision is difficult, partly because the desired mathematical function connecting input and output data is often complex, fuzzy and thus hard to learn.

Image Classification Object Localization +1

Phase Collaborative Network for Two-Phase Medical Image Segmentation

no code implementations 28 Nov 2018 Huangjie Zheng, Lingxi Xie, Tianwei Ni, Ya Zhang, Yan-Feng Wang, Qi Tian, Elliot K. Fishman, Alan L. Yuille

However, in medical image analysis, fusing prediction from two phases is often difficult, because (i) there is a domain gap between two phases, and (ii) the semantic labels are not pixel-wise corresponded even for images scanned from the same patient.

Image Segmentation Medical Image Segmentation +3

Domain-Invariant Adversarial Learning for Unsupervised Domain Adaption

no code implementations 30 Nov 2018 Yexun Zhang, Ya Zhang, Yan-Feng Wang, Qi Tian

Unsupervised domain adaption aims to learn a powerful classifier for the target domain given a labeled source data set and an unlabeled target data set.

Domain Adaptation Generative Adversarial Network

Accelerate CNN via Recursive Bayesian Pruning

no code implementations ICCV 2019 Yuefu Zhou, Ya Zhang, Yan-Feng Wang, Qi Tian

A new dropout-based measurement of redundancy, which facilitates the computation of the posterior assuming inter-layer dependency, is introduced.

Super-Bit Locality-Sensitive Hashing

no code implementations NeurIPS 2012 Jianqiu Ji, Jianmin Li, Shuicheng Yan, Bo Zhang, Qi Tian

Sign-random-projection locality-sensitive hashing (SRP-LSH) is a probabilistic dimension reduction method which provides an unbiased estimate of angular similarity, yet suffers from the large variance of its estimation.

Dimensionality Reduction Retrieval

Deep Hashing via Discrepancy Minimization

no code implementations CVPR 2018 Zhixiang Chen, Xin Yuan, Jiwen Lu, Qi Tian, Jie Zhou

This paper presents a discrepancy minimizing model to address the discrete optimization problem in hashing learning.

Deep Hashing

Collaborative Deep Reinforcement Learning for Multi-Object Tracking

no code implementations ECCV 2018 Liangliang Ren, Jiwen Lu, Zifeng Wang, Qi Tian, Jie Zhou

To address this, we develop a deep prediction-decision network in our C-DRL, which simultaneously detects and predicts objects under a unified network via deep reinforcement learning.

Multi-Object Tracking Object +2

Binary Code Ranking with Weighted Hamming Distance

no code implementations CVPR 2013 Lei Zhang, Yongdong Zhang, Jinhu Tang, Ke Lu, Qi Tian

In this paper, we propose a weighted Hamming distance ranking algorithm (WhRank) to rank the binary codes of hashing methods.

Lp-Norm IDF for Large Scale Image Search

no code implementations CVPR 2013 Liang Zheng, Shengjin Wang, Ziqiong Liu, Qi Tian

Further, by accounting for the term frequency in each image, the proposed Lp-norm IDF helps to alleviate the visual word burstiness phenomenon.

Image Retrieval

Orientational Pyramid Matching for Recognizing Indoor Scenes

no code implementations CVPR 2014 Lingxi Xie, Jingdong Wang, Baining Guo, Bo Zhang, Qi Tian

The novelty lies in that OPM uses the 3D orientations to form the pyramid and produce the pooling regions, which is unlike SPM that uses the spatial positions to form the pyramid.

General Classification Scene Classification +1

Semi-supervised Relational Topic Model for Weakly Annotated Image Recognition in Social Media

no code implementations CVPR 2014 Zhenxing Niu, Gang Hua, Xinbo Gao, Qi Tian

In such way, we can efficiently leverage the loosely related tags, and build an intermediate level representation for a collection of weakly annotated images.

Interaction Part Mining: A Mid-Level Approach for Fine-Grained Action Recognition

no code implementations CVPR 2015 Yang Zhou, Bingbing Ni, Richang Hong, Meng Wang, Qi Tian

Secondly, these object regions are matched and tracked across frames to form a large spatio-temporal graph based on the appearance matching and the dense motion trajectories through them.

Fine-grained Action Recognition Human-Object Interaction Detection +2

Picking Deep Filter Responses for Fine-Grained Image Recognition

no code implementations CVPR 2016 Xiaopeng Zhang, Hongkai Xiong, Wengang Zhou, Weiyao Lin, Qi Tian

Recognizing fine-grained sub-categories such as birds and dogs is extremely challenging due to the highly localized and subtle differences in some specific parts.

Fine-Grained Image Recognition

Cascaded Interactional Targeting Network for Egocentric Video Analysis

no code implementations CVPR 2016 Yang Zhou, Bingbing Ni, Richang Hong, Xiaokang Yang, Qi Tian

Firstly, a novel EM-like learning framework is proposed to train the pixel-level deep convolutional neural network (DCNN) by seamlessly integrating weakly supervised data (i.e., massive bounding box annotations) with a small set of strongly supervised data (i.e., fully annotated hand segmentation maps) to achieve state-of-the-art hand segmentation performance.

Action Recognition Foreground Segmentation +4

Task-Driven Dynamic Fusion: Reducing Ambiguity in Video Description

no code implementations CVPR 2017 Xishan Zhang, Ke Gao, Yongdong Zhang, Dongming Zhang, Jintao Li, Qi Tian

This paper contributes: 1) The first in-depth study of the weakness inherent in data-driven static fusion methods for video captioning.

Video Captioning Video Description

RIDE: Reversal Invariant Descriptor Enhancement

no code implementations ICCV 2015 Lingxi Xie, Jingdong Wang, Weiyao Lin, Bo Zhang, Qi Tian

In many fine-grained object recognition datasets, image orientation (left/right) might vary from sample to sample.

Object Recognition

Scalable Person Re-Identification: A Benchmark

no code implementations ICCV 2015 Liang Zheng, Liyue Shen, Lu Tian, Shengjin Wang, Jingdong Wang, Qi Tian

As a minor contribution, inspired by recent advances in large-scale image search, this paper proposes an unsupervised Bag-of-Words descriptor.

Image Retrieval Person Re-Identification

Multi-Task Learning With Low Rank Attribute Embedding for Person Re-Identification

no code implementations ICCV 2015 Chi Su, Fan Yang, Shiliang Zhang, Qi Tian, Larry S. Davis, Wen Gao

Since attributes are generally correlated, we introduce a low rank attribute embedding into the MTL formulation to embed original binary attributes to a continuous attribute space, where incorrect and incomplete attributes are rectified and recovered to better describe people.

Attribute Multi-Task Learning +1

Similarity Gaussian Process Latent Variable Model for Multi-Modal Data Analysis

no code implementations ICCV 2015 Guoli Song, Shuhui Wang, Qingming Huang, Qi Tian

Data from real applications involve multiple modalities representing content with the same semantics and deliver rich information from complementary aspects.

Retrieval

Ensemble Diffusion for Retrieval

no code implementations ICCV 2017 Song Bai, Zhichao Zhou, Jingdong Wang, Xiang Bai, Longin Jan Latecki, Qi Tian

This stimulates a great research interest of considering similarity fusion in the framework of diffusion process (i.e., fusion with diffusion) for robust retrieval.

3D Shape Classification 3D Shape Retrieval +2

Adversarial Attack and Defense on Point Sets

no code implementations 28 Feb 2019 Jiancheng Yang, Qiang Zhang, Rongyao Fang, Bingbing Ni, Jinxian Liu, Qi Tian

A set of novel 3D point cloud attack operations are proposed via pointwise gradient perturbation and adversarial point attachment / detachment.

Adversarial Attack

Modeling Point Clouds with Self-Attention and Gumbel Subset Sampling

no code implementations CVPR 2019 Jiancheng Yang, Qiang Zhang, Bingbing Ni, Linguo Li, Jinxian Liu, Mengdie Zhou, Qi Tian

Thereby, we for the first time propose an end-to-end learnable and task-agnostic sampling operation, named Gumbel Subset Sampling (GSS), to select a representative subset of input points.

BridgeNet: A Continuity-Aware Probabilistic Network for Age Estimation

no code implementations CVPR 2019 Wanhua Li, Jiwen Lu, Jianjiang Feng, Chunjing Xu, Jie Zhou, Qi Tian

Existing methods for age estimation usually apply a divide-and-conquer strategy to deal with heterogeneous data caused by the non-stationary aging process.

Age Estimation MORPH

Deep Fitting Degree Scoring Network for Monocular 3D Object Detection

no code implementations CVPR 2019 Lijie Liu, Jiwen Lu, Chunjing Xu, Qi Tian, Jie Zhou

In this paper, we propose to learn a deep fitting degree scoring network for monocular 3D object detection, which aims to score fitting degree between proposals and object conclusively.

Monocular 3D Object Detection Object +2

Handwritten Chinese Font Generation with Collaborative Stroke Refinement

no code implementations 30 Apr 2019 Chuan Wen, Jie Chang, Ya Zhang, Siheng Chen, Yan-Feng Wang, Mei Han, Qi Tian

Automatic character generation is an appealing solution for new typeface design, especially for Chinese typefaces including over 3700 most commonly-used characters.

Font Generation

Efficient Discrete Supervised Hashing for Large-scale Cross-modal Retrieval

no code implementations 3 May 2019 Tao Yao, Xiangwei Kong, Lianshan Yan, Wenjing Tang, Qi Tian

In this paper, to address the above issues, we propose a supervised cross-modal hashing method based on matrix factorization, dubbed Efficient Discrete Supervised Hashing (EDSH).

Cross-Modal Retrieval Quantization +1
