Search Results for author: Qixiang Ye

Found 104 papers, 69 papers with code

API-Net: Robust Generative Classifier via a Single Discriminator

1 code implementation ECCV 2020 Xinshuai Dong, Hong Liu, Rongrong Ji, Liujuan Cao, Qixiang Ye, Jianzhuang Liu, Qi Tian

On the contrary, a discriminative classifier only models the conditional distribution of labels given inputs, but benefits from effective optimization owing to its succinct structure.

Robust classification

Vision Calorimeter for Anti-neutron Reconstruction: A Baseline

1 code implementation 20 Aug 2024 Hongtian Yu, Yangu Li, Mingrui Wu, Letian Shen, Yue Liu, Yunxuan Song, Qixiang Ye, Xiaorui Lyu, Yajun Mao, Yangheng Zheng, Yunfan Liu

In this study, we introduce Vision Calorimeter (ViC), a baseline method for anti-neutron reconstruction that leverages deep learning detectors to analyze the implicit relationships between EMC responses and incident $\bar{n}$ characteristics.

Position

Depth-guided Texture Diffusion for Image Semantic Segmentation

no code implementations 17 Aug 2024 Wei Sun, Yuan Li, Qixiang Ye, Jianbin Jiao, Yanzhao Zhou

By integrating this enriched depth map with the original RGB image into a joint feature embedding, our method effectively bridges the disparity between the depth map and the image, enabling more accurate semantic segmentation.

Object object-detection +4

Correspondence-Guided SfM-Free 3D Gaussian Splatting for NVS

no code implementations 16 Aug 2024 Wei Sun, Xiaosong Zhang, Fang Wan, Yanzhao Zhou, Yuan Li, Qixiang Ye, Jianbin Jiao

In SfM-free methods, inaccurate initial poses lead to misalignment, which, under the constraints of per-pixel image loss functions, results in excessive gradients, causing unstable optimization and poor convergence for NVS.

Camera Pose Estimation Novel View Synthesis +1

Evaluation of Text-to-Video Generation Models: A Dynamics Perspective

1 code implementation 1 Jul 2024 Mingxiang Liao, Hannan Lu, Xinyu Zhang, Fang Wan, Tianyu Wang, Yuzhong Zhao, WangMeng Zuo, Qixiang Ye, Jingdong Wang

For this purpose, we establish a new benchmark comprising text prompts that fully reflect multiple dynamics grades, and define a set of dynamics scores corresponding to various temporal granularities to comprehensively evaluate the dynamics of each generated video.

Text-to-Video Generation Video Generation

ClawMachine: Fetching Visual Tokens as An Entity for Referring and Grounding

1 code implementation 17 Jun 2024 Tianren Ma, Lingxi Xie, Yunjie Tian, Boyu Yang, Yuan Zhang, David Doermann, Qixiang Ye

Existing methods, including proxy encoding and geometry encoding, incorporate additional syntax to encode the object's location, bringing extra burdens in training MLLMs to communicate between language and vision.

Decoder Visual Reasoning

vHeat: Building Vision Models upon Heat Conduction

1 code implementation 26 May 2024 Zhaozhi Wang, Yue Liu, Yunfan Liu, Hongtian Yu, YaoWei Wang, Qixiang Ye, Yunjie Tian

A fundamental problem in learning robust and expressive visual representations lies in efficiently estimating the spatial relationships of visual semantics throughout the entire image.

Computational Efficiency

DynRefer: Delving into Region-level Multi-modality Tasks via Dynamic Resolution

1 code implementation 25 May 2024 Yuzhong Zhao, Feng Liu, Yue Liu, Mingxiang Liao, Chen Gong, Qixiang Ye, Fang Wan

Unfortunately, most existing methods use fixed visual inputs and lack the resolution adaptability needed to find precise language descriptions.

Attribute

Language-Driven Visual Consensus for Zero-Shot Semantic Segmentation

no code implementations 13 Mar 2024 ZiCheng Zhang, Tong Zhang, Yi Zhu, Jianzhuang Liu, Xiaodan Liang, Qixiang Ye, Wei Ke

To mitigate these issues, we propose a Language-Driven Visual Consensus (LDVC) approach, fostering improved alignment of semantic and visual information. Specifically, we leverage class embeddings as anchors due to their discrete and abstract nature, steering vision features toward class embeddings.

Decoder Language Modelling +2

Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning

no code implementations 9 Mar 2024 Bingqian Lin, Yanxin Long, Yi Zhu, Fengda Zhu, Xiaodan Liang, Qixiang Ye, Liang Lin

To encourage the agent to capture the difference introduced by perturbation, a perturbation-aware contrastive learning mechanism is further developed by contrasting perturbation-free trajectory encodings with their perturbation-based counterparts.

Contrastive Learning Navigate +1

Ray Denoising: Depth-aware Hard Negative Sampling for Multi-view 3D Object Detection

2 code implementations 6 Feb 2024 Feng Liu, Tengteng Huang, Qianjing Zhang, Haotian Yao, Chi Zhang, Fang Wan, Qixiang Ye, Yanzhao Zhou

Multi-view 3D object detection systems often struggle with generating precise predictions due to the challenges in estimating depth from images, increasing redundant and incorrect detections.

3D Object Detection Denoising +1

ControlCap: Controllable Region-level Captioning

1 code implementation 31 Jan 2024 Yuzhong Zhao, Yue Liu, Zonghao Guo, Weijia Wu, Chen Gong, Fang Wan, Qixiang Ye

The multimodal model is constrained to generate captions within a few sub-spaces containing the control words, which increases the opportunity of hitting less frequent captions, alleviating the caption degeneration issue.

Dense Captioning

CPR++: Object Localization via Single Coarse Point Supervision

2 code implementations 30 Jan 2024 Xuehui Yu, Pengfei Chen, Kuiran Wang, Xumeng Han, Guorong Li, Zhenjun Han, Qixiang Ye, Jianbin Jiao

CPR reduces the semantic variance by selecting a semantic centre point in a neighbourhood region to replace the initial annotated point.

Object Object Localization

ChatterBox: Multi-round Multimodal Referring and Grounding

1 code implementation 24 Jan 2024 Yunjie Tian, Tianren Ma, Lingxi Xie, Jihao Qiu, Xi Tang, Yuan Zhang, Jianbin Jiao, Qi Tian, Qixiang Ye

In this study, we establish a baseline for a new task named multimodal multi-round referring and grounding (MRG), opening up a promising direction for instance-level multimodal dialogues.

Language Modelling Visual Grounding

VMamba: Visual State Space Model

8 code implementations 18 Jan 2024 Yue Liu, Yunjie Tian, Yuzhong Zhao, Hongtian Yu, Lingxi Xie, YaoWei Wang, Qixiang Ye, Yunfan Liu

Designing computationally efficient network architectures persists as an ongoing necessity in computer vision.

Computational Efficiency Language Modelling +1

Regressor-Segmenter Mutual Prompt Learning for Crowd Counting

no code implementations CVPR 2024 Mingyue Guo, Li Yuan, Zhaoyi Yan, Binghui Chen, YaoWei Wang, Qixiang Ye

In this study, we propose mutual prompt learning (mPrompt), which leverages a regressor and a segmenter as guidance for each other, solving bias and inaccuracy caused by annotation variance while distinguishing foreground from background.

Crowd Counting

Spatial Transform Decoupling for Oriented Object Detection

1 code implementation 21 Aug 2023 Hongtian Yu, Yunjie Tian, Qixiang Ye, Yunfan Liu

Vision Transformers (ViTs) have achieved remarkable success in computer vision tasks.

Ranked #2 on Object Detection In Aerial Images on DOTA (using extra training data)

Object object-detection +2

Generative Prompt Model for Weakly Supervised Object Localization

1 code implementation ICCV 2023 Yuzhong Zhao, Qixiang Ye, Weijia Wu, Chunhua Shen, Fang Wan

During training, GenPromp converts image category labels to learnable prompt embeddings which are fed to a generative model to conditionally recover the input image with noise and learn representative embeddings.

Ranked #1 on Weakly-Supervised Object Localization on CUB-200-2011 (Top-1 Localization Accuracy metric, using extra training data)

Image Denoising Language Modelling +2

Album Storytelling with Iterative Story-aware Captioning and Large Language Models

no code implementations 22 May 2023 Munan Ning, Yujia Xie, Dongdong Chen, Zeyin Song, Lu Yuan, Yonghong Tian, Qixiang Ye, Li Yuan

One natural approach is to use caption models to describe each photo in the album, and then use LLMs to summarize and rewrite the generated captions into an engaging story.

Generic-to-Specific Distillation of Masked Autoencoders

1 code implementation CVPR 2023 Wei Huang, Zhiliang Peng, Li Dong, Furu Wei, Jianbin Jiao, Qixiang Ye

Lightweight ViT models limited by the model capacity, however, benefit little from those pre-training mechanisms.

Decoder Image Classification +4

Spectral Aware Softmax for Visible-Infrared Person Re-Identification

no code implementations 3 Feb 2023 Lei Tan, Pingyang Dai, Qixiang Ye, Mingliang Xu, Yongjian Wu, Rongrong Ji

Based on the observation and analysis of SA-Softmax, we modify the SA-Softmax with the Feature Mask and Absolute-Similarity Term to alleviate the ambiguous optimization during model training.

Person Re-Identification

DQnet: Cross-Model Detail Querying for Camouflaged Object Detection

no code implementations 16 Dec 2022 Wei Sun, Chengao Liu, Linyan Zhang, Yu Li, Pengxu Wei, Chang Liu, Jialing Zou, Jianbin Jiao, Qixiang Ye

Optimizing a convolutional neural network (CNN) for camouflaged object detection (COD) tends to activate local discriminative regions while ignoring complete object extent, causing the partial activation issue which inevitably leads to missing or redundant regions of objects.

Object object-detection +2

Proposal Distribution Calibration for Few-Shot Object Detection

1 code implementation 15 Dec 2022 Bohao Li, Chang Liu, Mengnan Shi, Xiaozhong Chen, Xiangyang Ji, Qixiang Ye

Adapting object detectors learned with sufficient supervision to novel classes under low data regimes is charming yet challenging.

Few-Shot Object Detection Object +1

CircleNet: Reciprocating Feature Adaptation for Robust Pedestrian Detection

no code implementations 12 Dec 2022 Tianliang Zhang, Zhenjun Han, Huijuan Xu, Baochang Zhang, Qixiang Ye

In this paper we propose a novel feature learning model, referred to as CircleNet, to achieve feature adaptation by mimicking how humans look at low-resolution and occluded objects: focusing on the object again, at a finer scale, if it cannot be identified clearly the first time.

object-detection Object Detection +1

Feature Calibration Network for Occluded Pedestrian Detection

no code implementations 12 Dec 2022 Tianliang Zhang, Qixiang Ye, Baochang Zhang, Jianzhuang Liu, Xiaopeng Zhang, Qi Tian

FC-Net is based on the observation that the visible parts of pedestrians are selective and decisive for detection, and is implemented as a self-paced feature learning framework with a self-activation (SA) module and a feature calibration (FC) module.

Pedestrian Detection

Beyond Instance Discrimination: Relation-aware Contrastive Self-supervised Learning

no code implementations 2 Nov 2022 Yifei Zhang, Chang Liu, Yu Zhou, Weiping Wang, Qixiang Ye, Xiangyang Ji

In this paper, we present relation-aware contrastive self-supervised learning (ReCo) to integrate instance relations, i.e., global distribution relation and local interpolation relation, into the CSL framework in a plug-and-play fashion.

Relation Self-Supervised Learning

A Unified View of Masked Image Modeling

1 code implementation 19 Oct 2022 Zhiliang Peng, Li Dong, Hangbo Bao, Qixiang Ye, Furu Wei

Masked image modeling has demonstrated great potential to eliminate the label-hungry problem of training large-scale vision Transformers, achieving impressive performance on various downstream tasks.

Image Classification Segmentation +1

Multi-Agent Automated Machine Learning

no code implementations CVPR 2023 Zhaozhi Wang, Kefan Su, Jian Zhang, Huizhu Jia, Qixiang Ye, Xiaodong Xie, Zongqing Lu

In this paper, we propose multi-agent automated machine learning (MA2ML) with the aim to effectively handle joint optimization of modules in automated machine learning (AutoML).

Data Augmentation Multi-agent Reinforcement Learning +1

Learnable Distribution Calibration for Few-Shot Class-Incremental Learning

no code implementations 1 Oct 2022 Binghao Liu, Boyu Yang, Lingxi Xie, Ren Wang, Qi Tian, Qixiang Ye

LDC is built upon a parameterized calibration unit (PCU), which initializes biased distributions for all classes based on classifier vectors (memory-free) and a single covariance matrix.

Few-Shot Class-Incremental Learning Few-Shot Learning +2

BEiT v2: Masked Image Modeling with Vector-Quantized Visual Tokenizers

2 code implementations 12 Aug 2022 Zhiliang Peng, Li Dong, Hangbo Bao, Qixiang Ye, Furu Wei

The large-size BEiT v2 obtains 87.3% top-1 accuracy for ImageNet-1K (224 size) fine-tuning, and 56.7% mIoU on ADE20K for semantic segmentation.

Knowledge Distillation Representation Learning +2

HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling

1 code implementation 30 May 2022 Xiaosong Zhang, Yunjie Tian, Wei Huang, Qixiang Ye, Qi Dai, Lingxi Xie, Qi Tian

A key idea of efficient implementation is to discard the masked image patches (or tokens) throughout the target network (encoder), which requires the encoder to be a plain vision transformer (e.g., ViT), although hierarchical vision transformers (e.g., Swin Transformer) have potentially better properties in formulating vision inputs.

Transfer Learning

Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection

3 code implementations ICCV 2023 Feng Liu, Xiaosong Zhang, Zhiliang Peng, Zonghao Guo, Fang Wan, Xiangyang Ji, Qixiang Ye

Except for the backbone networks, however, other components such as the detector head and the feature pyramid network (FPN) remain trained from scratch, which hinders fully tapping the potential of representation models.

Decoder Few-Shot Object Detection +3

Object Localization under Single Coarse Point Supervision

2 code implementations CVPR 2022 Xuehui Yu, Pengfei Chen, Di Wu, Najmul Hassan, Guorong Li, Junchi Yan, Humphrey Shi, Qixiang Ye, Zhenjun Han

In this study, we propose a POL method using coarse point annotations, relaxing the supervision signals from accurate key points to freely spotted points.

Multiple Instance Learning Object +1

Global2Local: A Joint-Hierarchical Attention for Video Captioning

no code implementations 13 Mar 2022 Chengpeng Dai, Fuhai Chen, Xiaoshuai Sun, Rongrong Ji, Qixiang Ye, Yongjian Wu

Recently, automatic video captioning has attracted increasing attention, where the core challenge lies in capturing the key semantic items, like objects and actions as well as their spatial-temporal correlations from the redundant frames and semantic content.

Video Captioning

P2P-Loc: Point to Point Tiny Person Localization

no code implementations 31 Dec 2021 Xuehui Yu, Di Wu, Qixiang Ye, Jianbin Jiao, Zhenjun Han

As a result, we propose a point self-refinement approach that iteratively updates point annotations in a self-paced way.

Object Object Localization

Exploring Complicated Search Spaces with Interleaving-Free Sampling

no code implementations 5 Dec 2021 Yunjie Tian, Lingxi Xie, Jiemin Fang, Jianbin Jiao, Qixiang Ye, Qi Tian

In this paper, we build the search algorithm upon a complicated search space with long-distance connections, and show that existing weight-sharing search algorithms mostly fail due to the existence of interleaved connections.

Neural Architecture Search

Self-supervised Feature-Gate Coupling for Dynamic Network Pruning

1 code implementation 29 Nov 2021 Mengnan Shi, Chang Liu, Jianbin Jiao, Qixiang Ye

Gating modules have been widely explored in dynamic network pruning to reduce the run-time computational cost of deep neural networks while preserving the representation of features.

Contrastive Learning Network Pruning

Semantic-Aware Generation for Self-Supervised Visual Representation Learning

1 code implementation 25 Nov 2021 Yunjie Tian, Lingxi Xie, Xiaopeng Zhang, Jiemin Fang, Haohang Xu, Wei Huang, Jianbin Jiao, Qi Tian, Qixiang Ye

In this paper, we propose a self-supervised visual representation learning approach which involves both generative and discriminative proxies, where we focus on the former part by requiring the target network to recover the original image based on the mid-level features.

Representation Learning Semantic Segmentation

Long-tailed Distribution Adaptation

1 code implementation 6 Oct 2021 Zhiliang Peng, Wei Huang, Zonghao Guo, Xiaosong Zhang, Jianbin Jiao, Qixiang Ye

We propose to jointly optimize empirical risks of the unbalanced and balanced domains and approximate their domain divergence by intra-class and inter-class distances, with the aim to adapt models trained on the long-tailed distribution to general distributions in an interpretable way.

Domain Adaptation Instance Segmentation +3

GraFormer: Graph Convolution Transformer for 3D Pose Estimation

1 code implementation 17 Sep 2021 Weixi Zhao, Yunjie Tian, Qixiang Ye, Jianbin Jiao, Weiqiang Wang

Exploiting relations among 2D joints plays a crucial role yet remains semi-developed in 2D-to-3D pose estimation.

3D Pose Estimation Implicit Relations

Adversarial Reinforced Instruction Attacker for Robust Vision-Language Navigation

1 code implementation 23 Jul 2021 Bingqian Lin, Yi Zhu, Yanxin Long, Xiaodan Liang, Qixiang Ye, Liang Lin

Specifically, we propose a Dynamic Reinforced Instruction Attacker (DR-Attacker), which learns to mislead the navigator to move to the wrong target by destroying the most instructive information in instructions at different timesteps.

Vision and Language Navigation Vision-Language Navigation

Rethinking Sampling Strategies for Unsupervised Person Re-identification

2 code implementations 7 Jul 2021 Xumeng Han, Xuehui Yu, Guorong Li, Jian Zhao, Gang Pan, Qixiang Ye, Jianbin Jiao, Zhenjun Han

While extensive research has focused on the framework design and loss function, this paper shows that sampling strategy plays an equally important role.

Pseudo Label Representation Learning +1

Cogradient Descent for Dependable Learning

no code implementations 20 Jun 2021 Runqi Wang, Baochang Zhang, Li'an Zhuo, Qixiang Ye, David Doermann

Conventional gradient descent methods compute the gradients for multiple variables through the partial derivative.

Image Inpainting Image Reconstruction +1

Beyond Bounding-Box: Convex-Hull Feature Adaptation for Oriented and Densely Packed Object Detection

2 code implementations CVPR 2021 Zonghao Guo, Chang Liu, Xiaosong Zhang, Jianbin Jiao, Xiangyang Ji, Qixiang Ye

Detecting oriented and densely packed objects remains challenging because of spatial feature aliasing caused by the intersection of receptive fields between objects.

Ranked #34 on Object Detection In Aerial Images on DOTA (using extra training data)

object-detection Object Detection In Aerial Images

Anti-aliasing Semantic Reconstruction for Few-Shot Semantic Segmentation

1 code implementation CVPR 2021 Binghao Liu, Yao Ding, Jianbin Jiao, Xiangyang Ji, Qixiang Ye

Encouraging progress in few-shot semantic segmentation has been made by leveraging features learned upon base classes with sufficient training data to represent novel classes with few-shot examples.

Few-Shot Semantic Segmentation Segmentation +1

Conformer: Local Features Coupling Global Representations for Visual Recognition

4 code implementations ICCV 2021 Zhiliang Peng, Wei Huang, Shanzhi Gu, Lingxi Xie, YaoWei Wang, Jianbin Jiao, Qixiang Ye

Within a Convolutional Neural Network (CNN), convolution operations are good at extracting local features but have difficulty capturing global representations.

Image Classification Instance Segmentation +4

Multiple instance active learning for object detection

1 code implementation CVPR 2021 Tianning Yuan, Fang Wan, Mengying Fu, Jianzhuang Liu, Songcen Xu, Xiangyang Ji, Qixiang Ye

Despite the substantial progress of active learning for image recognition, there is still no instance-level active learning method designed specifically for object detection.

Active Object Detection Multiple Instance Learning +3

Learnable Expansion-and-Compression Network for Few-shot Class-Incremental Learning

no code implementations 6 Apr 2021 Boyu Yang, Mingbao Lin, Binghao Liu, Mengying Fu, Chang Liu, Rongrong Ji, Qixiang Ye

By tentatively expanding network nodes, LEC-Net enlarges the representation capacity of features, alleviating feature drift of old network from the perspective of model regularization.

Few-Shot Class-Incremental Learning Incremental Learning

Beyond Max-Margin: Class Margin Equilibrium for Few-shot Object Detection

2 code implementations CVPR 2021 Bohao Li, Boyu Yang, Chang Liu, Feng Liu, Rongrong Ji, Qixiang Ye

Few-shot object detection has made substantial progress by representing novel class objects using the feature representation learned upon a set of base class objects.

Few-Shot Object Detection object-detection

Harmonic Feature Activation for Few-Shot Semantic Segmentation

1 code implementation IEEE Transactions on Image Processing 2021 Binghao Liu, Jianbin Jiao, Qixiang Ye

HFA is formulated as a bilinear model, which takes charge of the pixel-wise dense correlation (bilinear feature activation) between query and support images in a systematic way.

Few-Shot Semantic Segmentation Segmentation +1

Network Pruning using Adaptive Exemplar Filters

1 code implementation 20 Jan 2021 Mingbao Lin, Rongrong Ji, Shaojie Li, Yan Wang, Yongjian Wu, Feiyue Huang, Qixiang Ye

Inspired by the face recognition community, we apply a message-passing algorithm, Affinity Propagation, to the weight matrices to obtain an adaptive number of exemplars, which then act as the preserved filters.

Face Recognition Network Pruning

Self-Motivated Communication Agent for Real-World Vision-Dialog Navigation

no code implementations ICCV 2021 Yi Zhu, Yue Weng, Fengda Zhu, Xiaodan Liang, Qixiang Ye, Yutong Lu, Jianbin Jiao

Vision-Dialog Navigation (VDN) requires an agent to ask questions and navigate following the human responses to find target objects.

Imitation Learning Navigate

Towards Spatio-Temporal Video Scene Text Detection via Temporal Clustering

no code implementations 19 Nov 2020 Yuanqiang Cai, Chang Liu, Weiqiang Wang, Qixiang Ye

With only bounding-box annotations in the spatial domain, existing video scene text detection (VSTD) benchmarks lack temporal relation of text instances among video frames, which hinders the development of video text-related applications.

Clustering Scene Text Detection +1

The 1st Tiny Object Detection Challenge: Methods and Results

1 code implementation 16 Sep 2020 Xuehui Yu, Zhenjun Han, Yuqi Gong, Nan Jiang, Jian Zhao, Qixiang Ye, Jie Chen, Yuan Feng, Bin Zhang, Xiaodi Wang, Ying Xin, Jingwei Liu, Mingyuan Mao, Sheng Xu, Baochang Zhang, Shumin Han, Cheng Gao, Wei Tang, Lizuo Jin, Mingbo Hong, Yuchao Yang, Shuiwang Li, Huan Luo, Qijun Zhao, Humphrey Shi

The 1st Tiny Object Detection (TOD) Challenge aims to encourage research in developing novel and accurate methods for tiny object detection in images which have wide views, with a current focus on tiny person detection.

Human Detection Object +2

Component Divide-and-Conquer for Real-World Image Super-Resolution

1 code implementation ECCV 2020 Pengxu Wei, Ziwei Xie, Hannan Lu, Zongyuan Zhan, Qixiang Ye, WangMeng Zuo, Liang Lin

An SR model learned with a conventional pixel-wise loss is usually dominated by flat regions and edges, and fails to infer realistic details of complex textures.

Image Super-Resolution

Discretization-Aware Architecture Search

1 code implementation 7 Jul 2020 Yunjie Tian, Chang Liu, Lingxi Xie, Jianbin Jiao, Qixiang Ye

The search cost of neural architecture search (NAS) has been largely reduced by weight-sharing methods.

Image Classification Neural Architecture Search

Progressive Cluster Purification for Unsupervised Feature Learning

1 code implementation 6 Jul 2020 Yifei Zhang, Chang Liu, Yu Zhou, Wei Wang, Weiping Wang, Qixiang Ye

In this work, we propose a novel clustering based method, which, by iteratively excluding class inconsistent samples during progressive cluster formation, alleviates the impact of noise samples in a simple-yet-effective manner.

Clustering Specificity

Multiple Expert Brainstorming for Domain Adaptive Person Re-identification

2 code implementations ECCV 2020 Yunpeng Zhai, Qixiang Ye, Shijian Lu, Mengxi Jia, Rongrong Ji, Yonghong Tian

Often the best-performing deep neural models are ensembles of multiple base-level networks; nevertheless, ensemble learning with respect to domain adaptive person re-ID remains unexplored.

Domain Adaptive Person Re-Identification Ensemble Learning +1

Domain Contrast for Domain Adaptive Object Detection

no code implementations 26 Jun 2020 Feng Liu, Xiaoxong Zhang, Fang Wan, Xiangyang Ji, Qixiang Ye

We present Domain Contrast (DC), a simple yet effective approach inspired by contrastive learning for training domain adaptive detectors.

Contrastive Learning Object +2

iffDetector: Inference-aware Feature Filtering for Object Detection

1 code implementation 23 Jun 2020 Mingyuan Mao, Yuxin Tian, Baochang Zhang, Qixiang Ye, Wanquan Liu, Guodong Guo, David Doermann

In this paper, we propose a new feature optimization approach to enhance features and suppress background noise in both the training and inference stages.

Object object-detection +1

Video Playback Rate Perception for Self-supervised Spatio-Temporal Representation Learning

1 code implementation 20 Jun 2020 Yuan Yao, Chang Liu, Dezhao Luo, Yu Zhou, Qixiang Ye

The generative perception model acts as a feature decoder to focus on comprehending high temporal resolution and short-term representation by introducing a motion-attention mechanism.

Action Recognition Decoder +3

Cogradient Descent for Bilinear Optimization

no code implementations CVPR 2020 Li'an Zhuo, Baochang Zhang, Linlin Yang, Hanlin Chen, Qixiang Ye, David Doermann, Guodong Guo, Rongrong Ji

Conventional learning methods simplify the bilinear model by regarding two intrinsically coupled factors independently, which degrades the optimization procedure.

Image Reconstruction Network Pruning

Rethinking Performance Estimation in Neural Architecture Search

1 code implementation CVPR 2020 Xiawu Zheng, Rongrong Ji, Qiang Wang, Qixiang Ye, Zhenguo Li, Yonghong Tian, Qi Tian

In this paper, we provide a novel yet systematic rethinking of PE in a resource constrained regime, termed budgeted PE (BPE), which precisely and effectively estimates the performance of an architecture sampled from an architecture space.

Neural Architecture Search

Architecture Disentanglement for Deep Neural Networks

1 code implementation ICCV 2021 Jie Hu, Liujuan Cao, Qixiang Ye, Tong Tong, Shengchuan Zhang, Ke Li, Feiyue Huang, Rongrong Ji, Ling Shao

Based on the experimental results, we present three new findings that provide fresh insights into the inner logic of DNNs.

AutoML Disentanglement

Self-Guided Adaptation: Progressive Representation Alignment for Domain Adaptive Object Detection

no code implementations 19 Mar 2020 Zongxian Li, Qixiang Ye, Chong Zhang, Jingjing Liu, Shijian Lu, Yonghong Tian

In this work, we propose a Self-Guided Adaptation (SGA) model, which aims at aligning feature representation and transferring object detection models across domains while considering the instantaneous alignment difficulty.

object-detection Object Detection +1

Filter Sketch for Network Pruning

1 code implementation 23 Jan 2020 Mingbao Lin, Liujuan Cao, Shaojie Li, Qixiang Ye, Yonghong Tian, Jianzhuang Liu, Qi Tian, Rongrong Ji

Our approach, referred to as FilterSketch, encodes the second-order information of pre-trained weights, which enables the representation capacity of pruned networks to be recovered with a simple fine-tuning procedure.

Network Pruning

Video Cloze Procedure for Self-Supervised Spatio-Temporal Learning

1 code implementation 2 Jan 2020 Dezhao Luo, Chang Liu, Yu Zhou, Dongbao Yang, Can Ma, Qixiang Ye, Weiping Wang

As a proxy task, it converts rich self-supervised representations into video clip operations (options), which enhances the flexibility and reduces the complexity of representation learning.

Representation Learning Retrieval +4

Scale Match for Tiny Person Detection

2 code implementations 23 Dec 2019 Xuehui Yu, Yuqi Gong, Nan Jiang, Qixiang Ye, Zhenjun Han

In this paper, we introduce a new benchmark, referred to as TinyPerson, opening up a promising direction for tiny object detection in a long distance and with massive backgrounds.

Human Detection Object +2

Multiple Anchor Learning for Visual Object Detection

3 code implementations CVPR 2020 Wei Ke, Tianliang Zhang, Zeyi Huang, Qixiang Ye, Jianzhuang Liu, Dong Huang

In this paper, we propose a Multiple Instance Learning (MIL) approach that selects anchors and jointly optimizes the two modules of a CNN-based object detector.

General Classification Multiple Instance Learning +3

SPSTracker: Sub-Peak Suppression of Response Map for Robust Object Tracking

1 code implementation 2 Dec 2019 Qintao Hu, Lijun Zhou, Xiaoxiao Wang, Yao Mao, Jianlin Zhang, Qixiang Ye

Modern visual trackers usually construct online learning models under the assumption that the feature response has a Gaussian distribution with target-centered peak response.

Object Tracking

FreeAnchor: Learning to Match Anchors for Visual Object Detection

4 code implementations NeurIPS 2019 Xiaosong Zhang, Fang Wan, Chang Liu, Rongrong Ji, Qixiang Ye

In this study, we propose a learning-to-match approach to break IoU restriction, allowing objects to match anchors in a flexible manner.

Object object-detection +1

Attribute Guided Unpaired Image-to-Image Translation with Semi-supervised Learning

1 code implementation 29 Apr 2019 Xinyang Li, Jie Hu, Shengchuan Zhang, Xiaopeng Hong, Qixiang Ye, Chenglin Wu, Rongrong Ji

In particular, AGUIT has two advantages: (1) It adopts a novel semi-supervised learning process by translating attributes of labeled data to unlabeled data, and then reconstructing the unlabeled data by a cycle consistency operation.

Attribute Disentanglement +2

C-MIL: Continuation Multiple Instance Learning for Weakly Supervised Object Detection

1 code implementation CVPR 2019 Fang Wan, Chang Liu, Wei Ke, Xiangyang Ji, Jianbin Jiao, Qixiang Ye

Weakly supervised object detection (WSOD) is a challenging task when provided with image category supervision but required to simultaneously learn object locations and object detectors.

Multiple Instance Learning Object +3

Towards Optimal Structured CNN Pruning via Generative Adversarial Learning

1 code implementation CVPR 2019 Shaohui Lin, Rongrong Ji, Chenqian Yan, Baochang Zhang, Liujuan Cao, Qixiang Ye, Feiyue Huang, David Doermann

In this paper, we propose an effective structured pruning approach that jointly prunes filters as well as other structures in an end-to-end manner.

Min-Entropy Latent Model for Weakly Supervised Object Detection

1 code implementation CVPR 2018 Fang Wan, Pengxu Wei, Zhenjun Han, Jianbin Jiao, Qixiang Ye

Weakly supervised object detection is a challenging task when provided with image category supervision but required to learn, at the same time, object locations and object detectors.

Image Classification Object +3

SIXray: A Large-scale Security Inspection X-ray Benchmark for Prohibited Item Discovery in Overlapping Images

1 code implementation 2 Jan 2019 Caijing Miao, Lingxi Xie, Fang Wan, Chi Su, Hongye Liu, Jianbin Jiao, Qixiang Ye

In particular, the advantage of CHR is more significant in the scenarios with fewer positive training samples, which demonstrates its potential application in real-world security inspection.

Object Localization

Similarity-preserving Image-image Domain Adaptation for Person Re-identification

no code implementations 26 Nov 2018 Weijian Deng, Liang Zheng, Qixiang Ye, Yi Yang, Jianbin Jiao

It first preserves two types of unsupervised similarity, namely, self-similarity of an image before and after translation, and domain-dissimilarity of a translated source image and a target image.

Domain Adaptation Generative Adversarial Network +2

Linear Span Network for Object Skeleton Detection

no code implementations ECCV 2018 Chang Liu, Wei Ke, Fei Qin, Qixiang Ye

Motivated by this, we formalize a Linear Span framework, and propose Linear Span Network (LSN) modified by Linear Span Units (LSUs), which minimize the reconstruction error of the convolutional network.

Object Object Skeleton Detection

SRN: Side-output Residual Network for Object Reflection Symmetry Detection and Beyond

1 code implementation 17 Jul 2018 Wei Ke, Jie Chen, Jianbin Jiao, Guoying Zhao, Qixiang Ye

The end-to-end deep learning approach, referred to as a side-output residual network (SRN), leverages the output residual units (RUs) to fit the errors between the object ground-truth symmetry and the side-outputs of multiple stages.

Edge Detection Hand Pose Estimation +2

Image-Image Domain Adaptation with Preserved Self-Similarity and Domain-Dissimilarity for Person Re-identification

2 code implementations CVPR 2018 Weijian Deng, Liang Zheng, Qixiang Ye, Guoliang Kang, Yi Yang, Jianbin Jiao

To this end, we propose to preserve two types of unsupervised similarities, 1) self-similarity of an image before and after translation, and 2) domain-dissimilarity of a translated source image and a target image.

Generative Adversarial Network Person Re-Identification +2

SRN: Side-output Residual Network for Object Symmetry Detection in the Wild

1 code implementation CVPR 2017 Wei Ke, Jie Chen, Jianbin Jiao, Guoying Zhao, Qixiang Ye

By stacking RUs in a deep-to-shallow manner, SRN exploits the 'flow' of errors among multiple scales to ease the problems of fitting complex outputs with limited layers, suppressing the complex backgrounds, and effectively matching object symmetry of different scales.

Diversity Object +1

A Graphical Social Topology Model for Multi-Object Tracking

no code implementations 14 Feb 2017 Shan Gao, Xiaogang Chen, Qixiang Ye, Junliang Xing, Arjan Kuijper, Xiangyang Ji

Inspired by the social affinity property of moving objects, we propose a Graphical Social Topology (GST) model, which estimates the group dynamics by jointly modeling the group structure and the states of objects using a topological representation.

Multi-Object Tracking Object

Oriented Response Networks

1 code implementation CVPR 2017 Yanzhao Zhou, Qixiang Ye, Qiang Qiu, Jianbin Jiao

DCNNs using ARFs, referred to as Oriented Response Networks (ORNs), can produce within-class rotation-invariant deep features while maintaining inter-class discrimination for classification tasks.

Ranked #83 on Image Classification on CIFAR-100 (using extra training data)

General Classification Image Classification

Self-learning Scene-specific Pedestrian Detectors using a Progressive Latent Model

no code implementations CVPR 2017 Qixiang Ye, Tianliang Zhang, Qiang Qiu, Baochang Zhang, Jie Chen, Guillermo Sapiro

In this paper, a self-learning approach is proposed for solving the scene-specific pedestrian detection problem without any human annotation involved.

Object Object Discovery +5

A scalable convolutional neural network for task-specified scenarios via knowledge distillation

no code implementations 19 Sep 2016 Mengnan Shi, Fei Qin, Qixiang Ye, Zhenjun Han, Jianbin Jiao

In this paper, we explore the redundancy in convolutional neural networks, which scales with the complexity of vision tasks.

Knowledge Distillation
