Search Results for author: Qixiang Ye

Found 96 papers, 64 papers with code

API-Net: Robust Generative Classifier via a Single Discriminator

1 code implementation • ECCV 2020 • Xinshuai Dong, Hong Liu, Rongrong Ji, Liujuan Cao, Qixiang Ye, Jianzhuang Liu, Qi Tian

On the contrary, a discriminative classifier only models the conditional distribution of labels given inputs, but benefits from effective optimization owing to its succinct structure.

Robust classification

Paper
Code

Language-Driven Visual Consensus for Zero-Shot Semantic Segmentation

no code implementations • 13 Mar 2024 • ZiCheng Zhang, Tong Zhang, Yi Zhu, Jianzhuang Liu, Xiaodan Liang, Qixiang Ye, Wei Ke

To mitigate these issues, we propose a Language-Driven Visual Consensus (LDVC) approach, fostering improved alignment of semantic and visual information. Specifically, we leverage class embeddings as anchors due to their discrete and abstract nature, steering vision features toward class embeddings.

Language Modelling Semantic Segmentation +1

Paper
Add Code

Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning

no code implementations • 9 Mar 2024 • Bingqian Lin, Yanxin Long, Yi Zhu, Fengda Zhu, Xiaodan Liang, Qixiang Ye, Liang Lin

To encourage the agent to capture the difference brought by perturbation, a perturbation-aware contrastive learning mechanism is further developed by contrasting perturbation-free trajectory encodings and perturbation-based counterparts.

Contrastive Learning Navigate +1

Paper
Add Code

Ray Denoising: Depth-aware Hard Negative Sampling for Multi-view 3D Object Detection

2 code implementations • 6 Feb 2024 • Feng Liu, Tengteng Huang, Qianjing Zhang, Haotian Yao, Chi Zhang, Fang Wan, Qixiang Ye, Yanzhao Zhou

Multi-view 3D object detection systems often struggle with generating precise predictions due to the challenges in estimating depth from images, increasing redundant and incorrect detections.

Ranked #2 on 3D Object Detection on nuScenes Camera Only

3D Object Detection Denoising +1

Paper
Code

Virtual Classification: Modulating Domain-Specific Knowledge for Multidomain Crowd Counting

1 code implementation • 6 Feb 2024 • Mingyue Guo, Binghui Chen, Zhaoyi Yan, YaoWei Wang, Qixiang Ye

Multidomain crowd counting aims to learn a general model for multiple diverse datasets.

Crowd Counting

Paper
Code

ControlCap: Controllable Region-level Captioning

1 code implementation • 31 Jan 2024 • Yuzhong Zhao, Yue Liu, Zonghao Guo, Weijia Wu, Chen Gong, Fang Wan, Qixiang Ye

The multimodal model is constrained to generate captions within a few sub-spaces containing the control words, which increases the opportunity of hitting less frequent captions, alleviating the caption degeneration issue.

Ranked #1 on Dense Captioning on Visual Genome

Dense Captioning

Paper
Code

CPR++: Object Localization via Single Coarse Point Supervision

2 code implementations • 30 Jan 2024 • Xuehui Yu, Pengfei Chen, Kuiran Wang, Xumeng Han, Guorong Li, Zhenjun Han, Qixiang Ye, Jianbin Jiao

CPR reduces the semantic variance by selecting a semantic centre point in a neighbourhood region to replace the initial annotated point.

Object Object Localization

632

Paper
Code

ChatterBox: Multi-round Multimodal Referring and Grounding

1 code implementation • 24 Jan 2024 • Yunjie Tian, Tianren Ma, Lingxi Xie, Jihao Qiu, Xi Tang, Yuan Zhang, Jianbin Jiao, Qi Tian, Qixiang Ye

In this study, we establish a baseline for a new task named multimodal multi-round referring and grounding (MRG), opening up a promising direction for instance-level multimodal dialogues.

Language Modelling Visual Grounding

Paper
Code

VMamba: Visual State Space Model

2 code implementations • 18 Jan 2024 • Yue Liu, Yunjie Tian, Yuzhong Zhao, Hongtian Yu, Lingxi Xie, YaoWei Wang, Qixiang Ye, Yunfan Liu

Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have long been the predominant backbone networks for visual representation learning.

Computational Efficiency Representation Learning

1,377

Paper
Code

Regressor-Segmenter Mutual Prompt Learning for Crowd Counting

no code implementations • 4 Dec 2023 • Mingyue Guo, Li Yuan, Zhaoyi Yan, Binghui Chen, YaoWei Wang, Qixiang Ye

In this study, we propose mutual prompt learning (mPrompt), which leverages a regressor and a segmenter as guidance for each other, solving bias and inaccuracy caused by annotation variance while distinguishing foreground from background.

Crowd Counting

Paper
Add Code

Spatial Transform Decoupling for Oriented Object Detection

1 code implementation • 21 Aug 2023 • Hongtian Yu, Yunjie Tian, Qixiang Ye, Yunfan Liu

Vision Transformers (ViTs) have achieved remarkable success in computer vision tasks.

Ranked #1 on Object Detection In Aerial Images on HRSC2016 (using extra training data)

Object object-detection +2

Paper
Code

Generative Prompt Model for Weakly Supervised Object Localization

1 code implementation • ICCV 2023 • Yuzhong Zhao, Qixiang Ye, Weijia Wu, Chunhua Shen, Fang Wan

During training, GenPromp converts image category labels to learnable prompt embeddings which are fed to a generative model to conditionally recover the input image with noise and learn representative embeddings.

Ranked #1 on Weakly-Supervised Object Localization on CUB-200-2011 (Top-1 Localization Accuracy metric, using extra training data)

Image Denoising Language Modelling +2

Paper
Code

Album Storytelling with Iterative Story-aware Captioning and Large Language Models

no code implementations • 22 May 2023 • Munan Ning, Yujia Xie, Dongdong Chen, Zeyin Song, Lu Yuan, Yonghong Tian, Qixiang Ye, Li Yuan

One natural approach is to use caption models to describe each photo in the album, and then use LLMs to summarize and rewrite the generated captions into an engaging story.

Paper
Add Code

Generic-to-Specific Distillation of Masked Autoencoders

1 code implementation • CVPR 2023 • Wei Huang, Zhiliang Peng, Li Dong, Furu Wei, Jianbin Jiao, Qixiang Ye

Lightweight ViT models limited by the model capacity, however, benefit little from those pre-training mechanisms.

Image Classification Knowledge Distillation +3

Paper
Code

Spectral Aware Softmax for Visible-Infrared Person Re-Identification

no code implementations • 3 Feb 2023 • Lei Tan, Pingyang Dai, Qixiang Ye, Mingliang Xu, Yongjian Wu, Rongrong Ji

Based on the observation and analysis of SA-Softmax, we modify the SA-Softmax with the Feature Mask and Absolute-Similarity Term to alleviate the ambiguous optimization during model training.

Person Re-Identification

Paper
Add Code

Unsupervised Domain Adaptation on Person Re-Identification via Dual-level Asymmetric Mutual Learning

no code implementations • 29 Jan 2023 • Qiong Wu, Jiahan Li, Pingyang Dai, Qixiang Ye, Liujuan Cao, Yongjian Wu, Rongrong Ji

The knowledge transfer between two networks is based on an asymmetric mutual learning manner.

Person Re-Identification Pseudo Label +2

Paper
Add Code

DQnet: Cross-Model Detail Querying for Camouflaged Object Detection

no code implementations • 16 Dec 2022 • Wei Sun, Chengao Liu, Linyan Zhang, Yu Li, Pengxu Wei, Chang Liu, Jialing Zou, Jianbin Jiao, Qixiang Ye

Optimizing a convolutional neural network (CNN) for camouflaged object detection (COD) tends to activate local discriminative regions while ignoring complete object extent, causing the partial activation issue which inevitably leads to missing or redundant regions of objects.

Object object-detection +2

Paper
Add Code

Proposal Distribution Calibration for Few-Shot Object Detection

1 code implementation • 15 Dec 2022 • Bohao Li, Chang Liu, Mengnan Shi, Xiaozhong Chen, Xiangyang Ji, Qixiang Ye

Adapting object detectors learned with sufficient supervision to novel classes under low data regimes is charming yet challenging.

Few-Shot Object Detection Object +1

Paper
Code

CircleNet: Reciprocating Feature Adaptation for Robust Pedestrian Detection

no code implementations • 12 Dec 2022 • Tianliang Zhang, Zhenjun Han, Huijuan Xu, Baochang Zhang, Qixiang Ye

In this paper we propose a novel feature learning model, referred to as CircleNet, to achieve feature adaptation by mimicking how humans look at low-resolution and occluded objects: focusing on the object again, at a finer scale, if it cannot be identified clearly the first time.

object-detection Object Detection +1

Paper
Add Code

Feature Calibration Network for Occluded Pedestrian Detection

no code implementations • 12 Dec 2022 • Tianliang Zhang, Qixiang Ye, Baochang Zhang, Jianzhuang Liu, Xiaopeng Zhang, Qi Tian

FC-Net is based on the observation that the visible parts of pedestrians are selective and decisive for detection, and is implemented as a self-paced feature learning framework with a self-activation (SA) module and a feature calibration (FC) module.

Pedestrian Detection

Paper
Add Code

Fast-iTPN: Integrally Pre-Trained Transformer Pyramid Network with Token Migration

1 code implementation • CVPR 2023 • Yunjie Tian, Lingxi Xie, Jihao Qiu, Jianbin Jiao, YaoWei Wang, Qi Tian, Qixiang Ye

iTPN is born with two elaborated designs: 1) The first pre-trained feature pyramid upon vision transformer (ViT).

object-detection Object Detection +1

149

Paper
Code

Beyond Instance Discrimination: Relation-aware Contrastive Self-supervised Learning

no code implementations • 2 Nov 2022 • Yifei Zhang, Chang Liu, Yu Zhou, Weiping Wang, Qixiang Ye, Xiangyang Ji

In this paper, we present relation-aware contrastive self-supervised learning (ReCo) to integrate instance relations, i.e., global distribution relation and local interpolation relation, into the CSL framework in a plug-and-play fashion.

Relation Self-Supervised Learning

Paper
Add Code

A Unified View of Masked Image Modeling

1 code implementation • 19 Oct 2022 • Zhiliang Peng, Li Dong, Hangbo Bao, Qixiang Ye, Furu Wei

Masked image modeling has demonstrated great potential to eliminate the label-hungry problem of training large-scale vision Transformers, achieving impressive performance on various downstream tasks.

Image Classification Segmentation +1

Paper
Code

Multi-Agent Automated Machine Learning

no code implementations • CVPR 2023 • Zhaozhi Wang, Kefan Su, Jian Zhang, Huizhu Jia, Qixiang Ye, Xiaodong Xie, Zongqing Lu

In this paper, we propose multi-agent automated machine learning (MA2ML) with the aim to effectively handle joint optimization of modules in automated machine learning (AutoML).

Data Augmentation Multi-agent Reinforcement Learning +1

Paper
Add Code

Learnable Distribution Calibration for Few-Shot Class-Incremental Learning

no code implementations • 1 Oct 2022 • Binghao Liu, Boyu Yang, Lingxi Xie, Ren Wang, Qi Tian, Qixiang Ye

LDC is built upon a parameterized calibration unit (PCU), which initializes biased distributions for all classes based on classifier vectors (memory-free) and a single covariance matrix.

Few-Shot Class-Incremental Learning Few-Shot Learning +2

Paper
Add Code

BEiT v2: Masked Image Modeling with Vector-Quantized Visual Tokenizers

2 code implementations • 12 Aug 2022 • Zhiliang Peng, Li Dong, Hangbo Bao, Qixiang Ye, Furu Wei

The large-size BEiT v2 obtains 87.3% top-1 accuracy for ImageNet-1K (224 size) fine-tuning, and 56.7% mIoU on ADE20K for semantic segmentation.

Ranked #27 on Self-Supervised Image Classification on ImageNet

Knowledge Distillation Representation Learning +2

18,266

Paper
Code

Point-to-Box Network for Accurate Object Detection via Single Point Supervision

3 code implementations • 14 Jul 2022 • Pengfei Chen, Xuehui Yu, Xumeng Han, Najmul Hassan, Kai Wang, Jiachen Li, Jian Zhao, Humphrey Shi, Zhenjun Han, Qixiang Ye

However, the performance gap between point supervised object detection (PSOD) and bounding box supervised detection remains large.

Attribute Multiple Instance Learning +3

632

Paper
Code

HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling

1 code implementation • 30 May 2022 • Xiaosong Zhang, Yunjie Tian, Wei Huang, Qixiang Ye, Qi Dai, Lingxi Xie, Qi Tian

A key idea of efficient implementation is to discard the masked image patches (or tokens) throughout the target network (encoder), which requires the encoder to be a plain vision transformer (e.g., ViT), albeit hierarchical vision transformers (e.g., Swin Transformer) have potentially better properties in formulating vision inputs.

Transfer Learning

Paper
Code

Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection

3 code implementations • ICCV 2023 • Feng Liu, Xiaosong Zhang, Zhiliang Peng, Zonghao Guo, Fang Wan, Xiangyang Ji, Qixiang Ye

Except for the backbone networks, however, other components such as the detector head and the feature pyramid network (FPN) remain trained from scratch, which hinders fully tapping the potential of representation models.

Ranked #3 on Few-Shot Object Detection on MS-COCO (30-shot)

Few-Shot Object Detection Object +2

Paper
Code

Beyond Masking: Demystifying Token-Based Pre-Training for Vision Transformers

1 code implementation • 27 Mar 2022 • Yunjie Tian, Lingxi Xie, Jiemin Fang, Mengnan Shi, Junran Peng, Xiaopeng Zhang, Jianbin Jiao, Qi Tian, Qixiang Ye

The past year has witnessed a rapid development of masked image modeling (MIM).

Paper
Code

Object Localization under Single Coarse Point Supervision

2 code implementations • CVPR 2022 • Xuehui Yu, Pengfei Chen, Di Wu, Najmul Hassan, Guorong Li, Junchi Yan, Humphrey Shi, Qixiang Ye, Zhenjun Han

In this study, we propose a POL method using coarse point annotations, relaxing the supervision signals from accurate key points to freely spotted points.

Multiple Instance Learning Object +1

632

Paper
Code

Global2Local: A Joint-Hierarchical Attention for Video Captioning

no code implementations • 13 Mar 2022 • Chengpeng Dai, Fuhai Chen, Xiaoshuai Sun, Rongrong Ji, Qixiang Ye, Yongjian Wu

Recently, automatic video captioning has attracted increasing attention, where the core challenge lies in capturing the key semantic items, like objects and actions as well as their spatial-temporal correlations from the redundant frames and semantic content.

Video Captioning

Paper
Add Code

CrossRectify: Leveraging Disagreement for Semi-supervised Object Detection

1 code implementation • 26 Jan 2022 • Chengcheng Ma, Xingjia Pan, Qixiang Ye, Fan Tang, WeiMing Dong, Changsheng Xu

Semi-supervised object detection has recently achieved substantial progress.

Object object-detection +3

Paper
Code

P2P-Loc: Point to Point Tiny Person Localization

no code implementations • 31 Dec 2021 • Xuehui Yu, Di Wu, Qixiang Ye, Jianbin Jiao, Zhenjun Han

As a result, we propose a point self-refinement approach that iteratively updates point annotations in a self-paced way.

Object Object Localization

Paper
Add Code

Exploring Complicated Search Spaces with Interleaving-Free Sampling

no code implementations • 5 Dec 2021 • Yunjie Tian, Lingxi Xie, Jiemin Fang, Jianbin Jiao, Qixiang Ye, Qi Tian

In this paper, we build the search algorithm upon a complicated search space with long-distance connections, and show that existing weight-sharing search algorithms mostly fail due to the existence of interleaved connections.

Neural Architecture Search

Paper
Add Code

Feature-Gate Coupling for Dynamic Network Pruning

1 code implementation • 29 Nov 2021 • Mengnan Shi, Chang Liu, Qixiang Ye, Jianbin Jiao

Gating modules have been widely explored in dynamic network pruning to reduce the run-time computational cost of deep neural networks while preserving the representation of features.

Contrastive Learning Network Pruning

Paper
Code

Semantic-Aware Generation for Self-Supervised Visual Representation Learning

1 code implementation • 25 Nov 2021 • Yunjie Tian, Lingxi Xie, Xiaopeng Zhang, Jiemin Fang, Haohang Xu, Wei Huang, Jianbin Jiao, Qi Tian, Qixiang Ye

In this paper, we propose a self-supervised visual representation learning approach which involves both generative and discriminative proxies, where we focus on the former part by requiring the target network to recover the original image based on the mid-level features.

Ranked #63 on Semantic Segmentation on Cityscapes test

Representation Learning Semantic Segmentation

Paper
Code

Long-tailed Distribution Adaptation

1 code implementation • 6 Oct 2021 • Zhiliang Peng, Wei Huang, Zonghao Guo, Xiaosong Zhang, Jianbin Jiao, Qixiang Ye

We propose to jointly optimize empirical risks of the unbalanced and balanced domains and approximate their domain divergence by intra-class and inter-class distances, with the aim to adapt models trained on the long-tailed distribution to general distributions in an interpretable way.

Domain Adaptation Instance Segmentation +3

Paper
Code

GraFormer: Graph Convolution Transformer for 3D Pose Estimation

1 code implementation • 17 Sep 2021 • Weixi Zhao, Yunjie Tian, Qixiang Ye, Jianbin Jiao, Weiqiang Wang

Exploiting relations among 2D joints plays a crucial role yet remains semi-developed in 2D-to-3D pose estimation.

3D Pose Estimation Implicit Relations

Paper
Code

Adversarial Reinforced Instruction Attacker for Robust Vision-Language Navigation

1 code implementation • 23 Jul 2021 • Bingqian Lin, Yi Zhu, Yanxin Long, Xiaodan Liang, Qixiang Ye, Liang Lin

Specifically, we propose a Dynamic Reinforced Instruction Attacker (DR-Attacker), which learns to mislead the navigator to move to the wrong target by destroying the most instructive information in instructions at different timesteps.

Vision and Language Navigation Vision-Language Navigation

Paper
Code

Rethinking Sampling Strategies for Unsupervised Person Re-identification

2 code implementations • 7 Jul 2021 • Xumeng Han, Xuehui Yu, Guorong Li, Jian Zhao, Gang Pan, Qixiang Ye, Jianbin Jiao, Zhenjun Han

While extensive research has focused on the framework design and loss function, this paper shows that sampling strategy plays an equally important role.

Ranked #6 on Unsupervised Person Re-Identification on DukeMTMC-reID

Pseudo Label Representation Learning +1

Paper
Code

Cogradient Descent for Dependable Learning

no code implementations • 20 Jun 2021 • Runqi Wang, Baochang Zhang, Li'an Zhuo, Qixiang Ye, David Doermann

Conventional gradient descent methods compute the gradients for multiple variables through the partial derivative.

Image Inpainting Image Reconstruction +1

Paper
Add Code

Beyond Bounding-Box: Convex-Hull Feature Adaptation for Oriented and Densely Packed Object Detection

2 code implementations • CVPR 2021 • Zonghao Guo, Chang Liu, Xiaosong Zhang, Jianbin Jiao, Xiangyang Ji, Qixiang Ye

Detecting oriented and densely packed objects remains challenging for spatial feature aliasing caused by the intersection of reception fields between objects.

Ranked #34 on Object Detection In Aerial Images on DOTA (using extra training data)

object-detection Object Detection In Aerial Images

1,718

Paper
Code

Anti-aliasing Semantic Reconstruction for Few-Shot Semantic Segmentation

1 code implementation • CVPR 2021 • Binghao Liu, Yao Ding, Jianbin Jiao, Xiangyang Ji, Qixiang Ye

Encouraging progress in few-shot semantic segmentation has been made by leveraging features learned upon base classes with sufficient training data to represent novel classes with few-shot examples.

Ranked #68 on Few-Shot Semantic Segmentation on COCO-20i (1-shot)

Few-Shot Semantic Segmentation Segmentation +1

Paper
Code

Towards Compact CNNs via Collaborative Compression

1 code implementation • CVPR 2021 • Yuchao Li, Shaohui Lin, Jianzhuang Liu, Qixiang Ye, Mengdi Wang, Fei Chao, Fan Yang, Jincheng Ma, Qi Tian, Rongrong Ji

Channel pruning and tensor decomposition have received extensive attention in convolutional neural network compression.

Neural Network Compression Tensor Decomposition

Paper
Code

Conformer: Local Features Coupling Global Representations for Visual Recognition

4 code implementations • ICCV 2021 • Zhiliang Peng, Wei Huang, Shanzhi Gu, Lingxi Xie, YaoWei Wang, Jianbin Jiao, Qixiang Ye

Within a Convolutional Neural Network (CNN), the convolution operations are good at extracting local features but have difficulty capturing global representations.

Ranked #322 on Image Classification on ImageNet

Image Classification Instance Segmentation +4

3,137

Paper
Code

Multiple instance active learning for object detection

1 code implementation • CVPR 2021 • Tianning Yuan, Fang Wan, Mengying Fu, Jianzhuang Liu, Songcen Xu, Xiangyang Ji, Qixiang Ye

Despite the substantial progress of active learning for image recognition, an instance-level active learning method specific to object detection is still lacking.

Ranked #1 on Active Object Detection on MS COCO

Active Object Detection Multiple Instance Learning +3

323

Paper
Code

Learnable Expansion-and-Compression Network for Few-shot Class-Incremental Learning

no code implementations • 6 Apr 2021 • Boyu Yang, Mingbao Lin, Binghao Liu, Mengying Fu, Chang Liu, Rongrong Ji, Qixiang Ye

By tentatively expanding network nodes, LEC-Net enlarges the representation capacity of features, alleviating feature drift of old network from the perspective of model regularization.

Few-Shot Class-Incremental Learning Incremental Learning

Paper
Add Code

TS-CAM: Token Semantic Coupled Attention Map for Weakly Supervised Object Localization

2 code implementations • ICCV 2021 • Wei Gao, Fang Wan, Xingjia Pan, Zhiliang Peng, Qi Tian, Zhenjun Han, Bolei Zhou, Qixiang Ye

TS-CAM finally couples the patch tokens with the semantic-agnostic attention map to achieve semantic-aware localization.

Object Weakly-Supervised Object Localization

130

Paper
Code

Beyond Max-Margin: Class Margin Equilibrium for Few-shot Object Detection

2 code implementations • CVPR 2021 • Bohao Li, Boyu Yang, Chang Liu, Feng Liu, Rongrong Ji, Qixiang Ye

Few-shot object detection has made substantial progress by representing novel class objects using the feature representation learned upon a set of base class objects.

Ranked #14 on Few-Shot Object Detection on MS-COCO (10-shot)

Few-Shot Object Detection object-detection

Paper
Code

Harmonic Feature Activation for Few-Shot Semantic Segmentation

1 code implementation • IEEE Transactions on Image Processing 2021 • Binghao Liu, Jianbin Jiao, Qixiang Ye

HFA is formulated as a bilinear model, which takes charge of the pixel-wise dense correlation (bilinear feature activation) between query and support images in a systematic way.

Few-Shot Semantic Segmentation Segmentation +1

Paper
Code

Network Pruning using Adaptive Exemplar Filters

1 code implementation • 20 Jan 2021 • Mingbao Lin, Rongrong Ji, Shaojie Li, Yan Wang, Yongjian Wu, Feiyue Huang, Qixiang Ye

Inspired by the face recognition community, we use a message passing algorithm Affinity Propagation on the weight matrices to obtain an adaptive number of exemplars, which then act as the preserved filters.

Face Recognition Network Pruning

Paper
Code

Occlude Them All: Occlusion-Aware Attention Network for Occluded Person Re-ID

no code implementations • ICCV 2021 • Peixian Chen, Wenfeng Liu, Pingyang Dai, Jianzhuang Liu, Qixiang Ye, Mingliang Xu, Qi'an Chen, Rongrong Ji

To avoid such problematic models in occluded person ReID, we propose the Occlusion-Aware Mask Network (OAMN).

Person Re-Identification

Paper
Add Code

Self-Motivated Communication Agent for Real-World Vision-Dialog Navigation

no code implementations • ICCV 2021 • Yi Zhu, Yue Weng, Fengda Zhu, Xiaodan Liang, Qixiang Ye, Yutong Lu, Jianbin Jiao

Vision-Dialog Navigation (VDN) requires an agent to ask questions and navigate following the human responses to find target objects.

Imitation Learning Navigate

Paper
Add Code

Towards Spatio-Temporal Video Scene Text Detection via Temporal Clustering

no code implementations • 19 Nov 2020 • Yuanqiang Cai, Chang Liu, Weiqiang Wang, Qixiang Ye

With only bounding-box annotations in the spatial domain, existing video scene text detection (VSTD) benchmarks lack temporal relation of text instances among video frames, which hinders the development of video text-related applications.

Clustering Scene Text Detection +1

Paper
Add Code

Adaptive Linear Span Network for Object Skeleton Detection

1 code implementation • 8 Nov 2020 • Chang Liu, Yunjie Tian, Jianbin Jiao, Qixiang Ye

Conventional networks for object skeleton detection are usually hand-crafted.

Edge Detection Neural Architecture Search +2

116

Paper
Code

The 1st Tiny Object Detection Challenge: Methods and Results

1 code implementation • 16 Sep 2020 • Xuehui Yu, Zhenjun Han, Yuqi Gong, Nan Jiang, Jian Zhao, Qixiang Ye, Jie Chen, Yuan Feng, Bin Zhang, Xiaodi Wang, Ying Xin, Jingwei Liu, Mingyuan Mao, Sheng Xu, Baochang Zhang, Shumin Han, Cheng Gao, Wei Tang, Lizuo Jin, Mingbo Hong, Yuchao Yang, Shuiwang Li, Huan Luo, Qijun Zhao, Humphrey Shi

The 1st Tiny Object Detection (TOD) Challenge aims to encourage research in developing novel and accurate methods for tiny object detection in images which have wide views, with a current focus on tiny person detection.

Human Detection Object +2

632

Paper
Code

Prototype Mixture Models for Few-shot Semantic Segmentation

1 code implementation • ECCV 2020 • Boyu Yang, Chang Liu, Bohao Li, Jianbin Jiao, Qixiang Ye

Few-shot segmentation is challenging because objects within the support and query images could significantly differ in appearance and pose.

Ranked #4 on Few-Shot Semantic Segmentation on PASCAL-5i (10-Shot)

Few-Shot Semantic Segmentation Segmentation +1

163

Paper
Code

Component Divide-and-Conquer for Real-World Image Super-Resolution

1 code implementation • ECCV 2020 • Pengxu Wei, Ziwei Xie, Hannan Lu, Zongyuan Zhan, Qixiang Ye, WangMeng Zuo, Liang Lin

An SR model learned with a conventional pixel-wise loss is usually dominated by flat regions and edges, and fails to infer realistic details of complex textures.

Image Super-Resolution

176

Paper
Code

Learning Task-oriented Disentangled Representations for Unsupervised Domain Adaptation

no code implementations • 27 Jul 2020 • Pingyang Dai, Peixian Chen, Qiong Wu, Xiaopeng Hong, Qixiang Ye, Qi Tian, Rongrong Ji

This drawback limits the flexibility of UDA in complicated open-set tasks where no labels are shared between domains.

Retrieval Unsupervised Domain Adaptation

Paper
Add Code

Discretization-Aware Architecture Search

1 code implementation • 7 Jul 2020 • Yunjie Tian, Chang Liu, Lingxi Xie, Jianbin Jiao, Qixiang Ye

The search cost of neural architecture search (NAS) has been largely reduced by weight-sharing methods.

Image Classification Neural Architecture Search

Paper
Code

Progressive Cluster Purification for Unsupervised Feature Learning

1 code implementation • 6 Jul 2020 • Yifei Zhang, Chang Liu, Yu Zhou, Wei Wang, Weiping Wang, Qixiang Ye

In this work, we propose a novel clustering based method, which, by iteratively excluding class inconsistent samples during progressive cluster formation, alleviates the impact of noise samples in a simple-yet-effective manner.

Clustering Specificity

Paper
Code

Multiple Expert Brainstorming for Domain Adaptive Person Re-identification

2 code implementations • ECCV 2020 • Yunpeng Zhai, Qixiang Ye, Shijian Lu, Mengxi Jia, Rongrong Ji, Yonghong Tian

Often the best-performing deep neural models are ensembles of multiple base-level networks; nevertheless, ensemble learning with respect to domain adaptive person re-ID remains unexplored.

Domain Adaptive Person Re-Identification Ensemble Learning +1

103

Paper
Code

Domain Contrast for Domain Adaptive Object Detection

no code implementations • 26 Jun 2020 • Feng Liu, Xiaoxong Zhang, Fang Wan, Xiangyang Ji, Qixiang Ye

We present Domain Contrast (DC), a simple yet effective approach inspired by contrastive learning for training domain adaptive detectors.

Contrastive Learning Object +2

Paper
Add Code

iffDetector: Inference-aware Feature Filtering for Object Detection

1 code implementation • 23 Jun 2020 • Mingyuan Mao, Yuxin Tian, Baochang Zhang, Qixiang Ye, Wanquan Liu, Guodong Guo, David Doermann

In this paper, we propose a new feature optimization approach to enhance features and suppress background noise in both the training and inference stages.

Object object-detection +1

Paper
Code

Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning

1 code implementation • 20 Jun 2020 • Yuan Yao, Chang Liu, Dezhao Luo, Yu Zhou, Qixiang Ye

The generative perception model acts as a feature decoder to focus on comprehending high temporal resolution and short-term representation by introducing a motion-attention mechanism.

Action Recognition Representation Learning +2

Paper
Code

Cogradient Descent for Bilinear Optimization

no code implementations • CVPR 2020 • Li'an Zhuo, Baochang Zhang, Linlin Yang, Hanlin Chen, Qixiang Ye, David Doermann, Guodong Guo, Rongrong Ji

Conventional learning methods simplify the bilinear model by regarding two intrinsically coupled factors independently, which degrades the optimization procedure.

Image Reconstruction Network Pruning

Paper
Add Code

Rethinking Performance Estimation in Neural Architecture Search

1 code implementation • CVPR 2020 • Xiawu Zheng, Rongrong Ji, Qiang Wang, Qixiang Ye, Zhenguo Li, Yonghong Tian, Qi Tian

In this paper, we provide a novel yet systematic rethinking of PE in a resource constrained regime, termed budgeted PE (BPE), which precisely and effectively estimates the performance of an architecture sampled from an architecture space.

Neural Architecture Search

165

Paper
Code

AD-Cluster: Augmented Discriminative Clustering for Domain Adaptive Person Re-identification

no code implementations • CVPR 2020 • Yunpeng Zhai, Shijian Lu, Qixiang Ye, Xuebo Shan, Jie Chen, Rongrong Ji, Yonghong Tian

Domain adaptive person re-identification (re-ID) is a challenging task, especially when person identities in target domains are unknown.

Ranked #8 on Unsupervised Domain Adaptation on Duke to Market

Clustering Domain Adaptive Person Re-Identification +2

Paper
Add Code

Architecture Disentanglement for Deep Neural Networks

1 code implementation • ICCV 2021 • Jie Hu, Liujuan Cao, Qixiang Ye, Tong Tong, Shengchuan Zhang, Ke Li, Feiyue Huang, Rongrong Ji, Ling Shao

Based on the experimental results, we present three new findings that provide fresh insights into the inner logic of DNNs.

AutoML Disentanglement

Paper
Code

Self-Guided Adaptation: Progressive Representation Alignment for Domain Adaptive Object Detection

no code implementations • 19 Mar 2020 • Zongxian Li, Qixiang Ye, Chong Zhang, Jingjing Liu, Shijian Lu, Yonghong Tian

In this work, we propose a Self-Guided Adaptation (SGA) model, which targets aligning feature representations and transferring object detection models across domains while considering the instantaneous alignment difficulty.

object-detection Object Detection +1

Paper
Add Code

Filter Sketch for Network Pruning

1 code implementation • 23 Jan 2020 • Mingbao Lin, Liujuan Cao, Shaojie Li, Qixiang Ye, Yonghong Tian, Jianzhuang Liu, Qi Tian, Rongrong Ji

Our approach, referred to as FilterSketch, encodes the second-order information of pre-trained weights, which enables the representation capacity of pruned networks to be recovered with a simple fine-tuning procedure.

Network Pruning

Paper
Code

Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning

1 code implementation • 2 Jan 2020 • Dezhao Luo, Chang Liu, Yu Zhou, Dongbao Yang, Can Ma, Qixiang Ye, Weiping Wang

As a proxy task, it converts rich self-supervised representations into video clip operations (options), which enhances the flexibility and reduces the complexity of representation learning.

Ranked #11 on Self-supervised Video Retrieval on HMDB51

Representation Learning Retrieval +4

Paper
Code

Scale Match for Tiny Person Detection

2 code implementations • 23 Dec 2019 • Xuehui Yu, Yuqi Gong, Nan Jiang, Qixiang Ye, Zhenjun Han

In this paper, we introduce a new benchmark, referred to as TinyPerson, opening up a promising direction for tiny object detection in a long distance and with massive backgrounds.

Human Detection Object +2

632

Paper
Code

Multiple Anchor Learning for Visual Object Detection

3 code implementations • CVPR 2020 • Wei Ke, Tianliang Zhang, Zeyi Huang, Qixiang Ye, Jianzhuang Liu, Dong Huang

In this paper, we propose a Multiple Instance Learning (MIL) approach that selects anchors and jointly optimizes the two modules of a CNN-based object detector.

Ranked #116 on Object Detection on COCO test-dev

General Classification Multiple Instance Learning +3

Paper
Code

SPSTracker: Sub-Peak Suppression of Response Map for Robust Object Tracking

1 code implementation • 2 Dec 2019 • Qintao Hu, Lijun Zhou, Xiaoxiao Wang, Yao Mao, Jianlin Zhang, Qixiang Ye

Modern visual trackers usually construct online learning models under the assumption that the feature response has a Gaussian distribution with target-centered peak response.

Object Tracking

Paper
Code

FreeAnchor: Learning to Match Anchors for Visual Object Detection

4 code implementations • NeurIPS 2019 • Xiaosong Zhang, Fang Wan, Chang Liu, Rongrong Ji, Qixiang Ye

In this study, we propose a learning-to-match approach to break IoU restriction, allowing objects to match anchors in a flexible manner.

Ranked #136 on Object Detection on COCO test-dev

Object object-detection +1

27,708

Paper
Code

Information Competing Process for Learning Diversified Representations

1 code implementation • NeurIPS 2019 • Jie Hu, Rongrong Ji, Shengchuan Zhang, Xiaoshuai Sun, Qixiang Ye, Chia-Wen Lin, Qi Tian

Learning representations with diversified information remains an open problem.

General Classification Image Classification +2

Paper
Code

Attribute Guided Unpaired Image-to-Image Translation with Semi-supervised Learning

1 code implementation • 29 Apr 2019 • Xinyang Li, Jie Hu, Shengchuan Zhang, Xiaopeng Hong, Qixiang Ye, Chenglin Wu, Rongrong Ji

In particular, AGUIT benefits in two ways: (1) It adopts a novel semi-supervised learning process by translating attributes of labeled data to unlabeled data, and then reconstructing the unlabeled data by a cycle consistency operation.

Attribute Disentanglement +2

Paper
Code

C-MIL: Continuation Multiple Instance Learning for Weakly Supervised Object Detection

1 code implementation • CVPR 2019 • Fang Wan, Chang Liu, Wei Ke, Xiangyang Ji, Jianbin Jiao, Qixiang Ye

Weakly supervised object detection (WSOD) is a challenging task when provided with image category supervision but required to simultaneously learn object locations and object detectors.

Ranked #9 on Weakly Supervised Object Detection on PASCAL VOC 2007

Multiple Instance Learning Object +3

Paper
Code

Towards Optimal Structured CNN Pruning via Generative Adversarial Learning

1 code implementation • CVPR 2019 • Shaohui Lin, Rongrong Ji, Chenqian Yan, Baochang Zhang, Liujuan Cao, Qixiang Ye, Feiyue Huang, David Doermann

In this paper, we propose an effective structured pruning approach that jointly prunes filters as well as other structures in an end-to-end manner.

Paper
Code

Min-Entropy Latent Model for Weakly Supervised Object Detection

1 code implementation • CVPR 2018 • Fang Wan, Pengxu Wei, Zhenjun Han, Jianbin Jiao, Qixiang Ye

Weakly supervised object detection is a challenging task when provided with image category supervision but required to learn, at the same time, object locations and object detectors.

Ranked #19 on Weakly Supervised Object Detection on PASCAL VOC 2012 test

Image Classification Object +3

Paper
Code

SIXray : A Large-scale Security Inspection X-ray Benchmark for Prohibited Item Discovery in Overlapping Images

1 code implementation • 2 Jan 2019 • Caijing Miao, Lingxi Xie, Fang Wan, Chi Su, Hongye Liu, Jianbin Jiao, Qixiang Ye

In particular, the advantage of CHR is more significant in the scenarios with fewer positive training samples, which demonstrates its potential application in real-world security inspection.

Object Localization

116

Paper
Code

Similarity-preserving Image-image Domain Adaptation for Person Re-identification

no code implementations • 26 Nov 2018 • Weijian Deng, Liang Zheng, Qixiang Ye, Yi Yang, Jianbin Jiao

It first preserves two types of unsupervised similarity, namely, self-similarity of an image before and after translation, and domain-dissimilarity of a translated source image and a target image.

Domain Adaptation Generative Adversarial Network +2

Paper
Add Code

Linear Span Network for Object Skeleton Detection

no code implementations • ECCV 2018 • Chang Liu, Wei Ke, Fei Qin, Qixiang Ye

Hinted by this, we formalize a Linear Span framework, and propose Linear Span Network (LSN) modified by Linear Span Units (LSUs), which minimize the reconstruction error of convolutional network.

Object Object Skeleton Detection

Paper
Add Code

SRN: Side-output Residual Network for Object Reflection Symmetry Detection and Beyond

1 code implementation • 17 Jul 2018 • Wei Ke, Jie Chen, Jianbin Jiao, Guoying Zhao, Qixiang Ye

The end-to-end deep learning approach, referred to as a side-output residual network (SRN), leverages the output residual units (RUs) to fit the errors between the object ground-truth symmetry and the side-outputs of multiple stages.

Edge Detection Hand Pose Estimation +2

Paper
Code

Weakly Supervised Instance Segmentation using Class Peak Response

1 code implementation • CVPR 2018 • Yanzhao Zhou, Yi Zhu, Qixiang Ye, Qiang Qiu, Jianbin Jiao

Motivated by this, we first design a process to stimulate peaks to emerge from a class response map.

Ranked #11 on Image-level Supervised Instance Segmentation on PASCAL VOC 2012 val (using extra training data)

General Classification Image-level Supervised Instance Segmentation +3

345

Paper
Code

Image-Image Domain Adaptation with Preserved Self-Similarity and Domain-Dissimilarity for Person Re-identification

2 code implementations • CVPR 2018 • Weijian Deng, Liang Zheng, Qixiang Ye, Guoliang Kang, Yi Yang, Jianbin Jiao

To this end, we propose to preserve two types of unsupervised similarities, 1) self-similarity of an image before and after translation, and 2) domain-dissimilarity of a translated source image and a target image.

Ranked #3 on Unsupervised Person Re-Identification on MSMT17->DukeMTMC-reID

Generative Adversarial Network Person Re-Identification +2

3,126

Paper
Code

Soft Proposal Networks for Weakly Supervised Object Localization

1 code implementation • ICCV 2017 • Yi Zhu, Yanzhao Zhou, Qixiang Ye, Qiang Qiu, Jianbin Jiao

Weakly supervised object localization remains challenging, where only image labels instead of bounding boxes are available during training.

Ranked #2 on Weakly Supervised Object Detection on MS COCO

Object Weakly Supervised Object Detection +1

210

Paper
Code

Deep Spatio-temporal Manifold Network for Action Recognition

no code implementations • 9 May 2017 • Ce Li, Chen Chen, Baochang Zhang, Qixiang Ye, Jungong Han, Rongrong Ji

Visual data such as videos are often sampled from a complex manifold.

Action Recognition Temporal Action Localization

Paper
Add Code

SRN: Side-output Residual Network for Object Symmetry Detection in the Wild

1 code implementation • CVPR 2017 • Wei Ke, Jie Chen, Jianbin Jiao, Guoying Zhao, Qixiang Ye

By stacking RUs in a deep-to-shallow manner, SRN exploits the 'flow' of errors among multiple scales to ease the problems of fitting complex outputs with limited layers, suppressing the complex backgrounds, and effectively matching object symmetry of different scales.

Object Symmetry Detection

Paper
Code

A Graphical Social Topology Model for Multi-Object Tracking

no code implementations • 14 Feb 2017 • Shan Gao, Xiaogang Chen, Qixiang Ye, Junliang Xing, Arjan Kuijper, Xiangyang Ji

Inspired by the social affinity property of moving objects, we propose a Graphical Social Topology (GST) model, which estimates the group dynamics by jointly modeling the group structure and the states of objects using a topological representation.

Multi-Object Tracking Object

Paper
Add Code

Oriented Response Networks

1 code implementation • CVPR 2017 • Yanzhao Zhou, Qixiang Ye, Qiang Qiu, Jianbin Jiao

DCNNs using ARFs, referred to as Oriented Response Networks (ORNs), can produce within-class rotation-invariant deep features while maintaining inter-class discrimination for classification tasks.

Ranked #83 on Image Classification on CIFAR-100 (using extra training data)

General Classification Image Classification

224

Paper
Code

Output Constraint Transfer for Kernelized Correlation Filter in Tracking

no code implementations • 16 Dec 2016 • Baochang Zhang, Zhigang Li, Xian-Bin Cao, Qixiang Ye, Chen Chen, Linlin Shen, Alessandro Perina, Rongrong Ji

Kernelized Correlation Filter (KCF) is one of the state-of-the-art object trackers.

Bayesian Optimization

Paper
Add Code

Self-learning Scene-specific Pedestrian Detectors using a Progressive Latent Model

no code implementations • CVPR 2017 • Qixiang Ye, Tianliang Zhang, Qiang Qiu, Baochang Zhang, Jie Chen, Guillermo Sapiro

In this paper, a self-learning approach is proposed towards solving the scene-specific pedestrian detection problem without any human annotation involved.

Object Object Discovery +5

Paper
Add Code

A scalable convolutional neural network for task-specified scenarios via knowledge distillation

no code implementations • 19 Sep 2016 • Mengnan Shi, Fei Qin, Qixiang Ye, Zhenjun Han, Jianbin Jiao

In this paper, we explore the redundancy in convolutional neural networks, which scales with the complexity of vision tasks.

Knowledge Distillation

Paper
Add Code
