no code implementations • 18 Jul 2024 • Xin Li, Bingchen Li, Yeying Jin, Cuiling Lan, Hanxin Zhu, Yulin Ren, Zhibo Chen
Compressed Image Super-resolution (CSR) aims to simultaneously super-resolve the compressed images and tackle the challenging hybrid distortions caused by compression.
no code implementations • CVPR 2024 • Tianci Bi, Xiaoyi Zhang, Zhizheng Zhang, Wenxuan Xie, Cuiling Lan, Yan Lu, Nanning Zheng
Significant progress has been made in scene text detection models since the rise of deep learning, but scene text layout analysis, which aims to group detected text instances as paragraphs, has not kept pace.
no code implementations • 20 Feb 2024 • Jiaqi Xu, Cuiling Lan, Wenxuan Xie, Xuejin Chen, Yan Lu
A pivotal challenge is the development of an efficient method to encapsulate video content into a set of representative tokens to align with LLMs.
no code implementations • 15 Feb 2024 • Tao Yang, Cuiling Lan, Yan Lu, Nanning Zheng
Disentangled representation learning strives to extract the intrinsic factors within observed data.
no code implementations • 8 Dec 2023 • Jiaqi Xu, Cuiling Lan, Wenxuan Xie, Xuejin Chen, Yan Lu
To address these issues, we introduce a simple yet effective retrieval-based video language model (R-VLM) for efficient and interpretable long video QA.
1 code implementation • 4 Oct 2023 • Hongruixuan Chen, Cuiling Lan, Jian Song, Clifford Broni-Bediako, Junshi Xia, Naoto Yokoya
Optical high-resolution imagery and OSM data are two important data sources for change detection (CD).
1 code implementation • ICCV 2023 • Dongwon Kim, Namyup Kim, Cuiling Lan, Suha Kwak
Referring image segmentation, the task of segmenting any arbitrary entities described in free-form texts, opens up a variety of vision applications.
1 code implementation • 18 Aug 2023 • Xin Li, Yulin Ren, Xin Jin, Cuiling Lan, Xingrui Wang, Wenjun Zeng, Xinchao Wang, Zhibo Chen
Image restoration (IR) has been an indispensable and challenging task in the low-level vision field, which strives to improve the subjective quality of images distorted by various forms of degradation.
2 code implementations • ICCV 2023 • Zhipeng Huang, Zhizheng Zhang, Cuiling Lan, Zheng-Jun Zha, Yan Lu, Baining Guo
With this insight, we propose the Adaptive Frequency Filtering (AFF) token mixer.
no code implementations • 29 May 2023 • Tao Yang, Yuwang Wang, Cuiling Lan, Yan Lu, Nanning Zheng
In this paper, we study several typical disentangled representation learning works in terms of both disentanglement and compositional generalization abilities, and we provide an important insight: vector-based representation (using a vector instead of a scalar to represent a concept) is the key to enabling both good disentanglement and strong compositional generalization.
2 code implementations • CVPR 2023 • Xin Li, Bingchen Li, Xin Jin, Cuiling Lan, Zhibo Chen
In this paper, we are the first to propose a novel training strategy for image restoration from the causality perspective, to improve the generalization ability of DNNs for unknown degradations.
1 code implementation • 21 Jan 2023 • Zongyu Guo, Cuiling Lan, Zhizheng Zhang, Yan Lu, Zhibo Chen
In this paper, we propose an efficient NP framework dubbed Versatile Neural Processes (VNP), which largely increases the capability of approximating functions.
no code implementations • ICCV 2023 • Hewei Guo, Liping Ren, Jingjing Fu, Yuwang Wang, Zhizheng Zhang, Cuiling Lan, Haoqian Wang, Xinwen Hou
To detect anomalies of various sizes in the presence of complicated normal patterns, we propose a Template-guided Hierarchical Feature Restoration method, which introduces two key techniques, bottleneck compression and template-guided compensation, for anomaly-free feature restoration.
Ranked #16 on Anomaly Detection on MVTec LOCO AD
1 code implementation • 6 Dec 2022 • Xin Li, Cuiling Lan, Guoqiang Wei, Zhibo Chen
In this way, our message broadcasting encourages the group tokens to learn more informative and diverse information for effective domain alignment.
Ranked #1 on Unsupervised Domain Adaptation on DomainNet
no code implementations • CVPR 2022 • Namyup Kim, Dongwon Kim, Cuiling Lan, Wenjun Zeng, Suha Kwak
Most existing methods for this task rely heavily on convolutional neural networks, which, however, have trouble capturing long-range dependencies between entities in the language expression and are not flexible enough for modeling interactions between the two different modalities.
no code implementations • CVPR 2023 • Shiqi Lin, Zhizheng Zhang, Zhipeng Huang, Yan Lu, Cuiling Lan, Peng Chu, Quanzeng You, Jiang Wang, Zicheng Liu, Amey Parulkar, Viraj Navkal, Zhibo Chen
Improving the generalization ability of Deep Neural Networks (DNNs) is critical for their practical uses, which has been a longstanding challenge.
2 code implementations • 11 Mar 2022 • Guoqiang Wei, Zhizheng Zhang, Cuiling Lan, Yan Lu, Zhibo Chen
In this work, we propose an innovative token-mixer, dubbed Active Token Mixer (ATM), to actively incorporate flexible contextual information distributed across different channels from other tokens into the given query token.
Ranked #64 on Object Detection on COCO minival
1 code implementation • 28 Jan 2022 • Tao Yu, Zhizheng Zhang, Cuiling Lan, Yan Lu, Zhibo Chen
For deep reinforcement learning (RL) from pixels, learning effective state representations is crucial for achieving high performance.
no code implementations • CVPR 2022 • Zhipeng Huang, Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Peng Chu, Quanzeng You, Jiang Wang, Zicheng Liu, Zheng-Jun Zha
In this paper, to address more practical scenarios, we propose a new task, Lifelong Unsupervised Domain Adaptive (LUDA) person ReID.
no code implementations • 26 Nov 2021 • Xin Li, Zhizheng Zhang, Guoqiang Wei, Cuiling Lan, Wenjun Zeng, Xin Jin, Zhibo Chen
In this paper, we propose a novel Confounder Identification-free Causal Visual Feature Learning (CICF) method, which obviates the need for identifying confounders.
no code implementations • 7 Nov 2021 • Pengfei Zhang, Cuiling Lan, Wenjun Zeng, Junliang Xing, Jianru Xue, Nanning Zheng
Skeleton data is of low dimension.
no code implementations • 28 Oct 2021 • Liang Xu, Cuiling Lan, Wenjun Zeng, Cewu Lu
Skeleton data carries valuable motion information and is widely explored in human action recognition.
no code implementations • 29 Sep 2021 • Namyup Kim, Taeyoung Son, Jaehyun Pahk, Cuiling Lan, Wenjun Zeng, Suha Kwak
We also present a method which injects styles of the web-crawled images into training images on-the-fly during training, which enables the network to experience images of diverse styles with reliable labels for effective training.
no code implementations • 31 Jul 2021 • Kecheng Zheng, Cuiling Lan, Wenjun Zeng, Jiawei Liu, Zhizheng Zhang, Zheng-Jun Zha
Occluded person re-identification (ReID) aims to match person images with occlusion.
1 code implementation • NeurIPS 2021 • Guoqiang Wei, Cuiling Lan, Wenjun Zeng, Zhizheng Zhang, Zhibo Chen
Unsupervised domain adaptive classification intends to improve the classification performance on the unlabeled target domain.
2 code implementations • NeurIPS 2021 • Tao Yu, Cuiling Lan, Wenjun Zeng, Mingxiao Feng, Zhizheng Zhang, Zhibo Chen
In this work, we propose a novel method, dubbed PlayVirtual, which augments cycle-consistent virtual trajectories to enhance the data efficiency for RL feature representation learning.
no code implementations • 25 Mar 2021 • Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Quanzeng You, Zicheng Liu, Kecheng Zheng, Zhibo Chen
Each recomposed feature, obtained based on the domain-invariant feature (which enables a reliable inheritance of identity) and an enhancement from a domain specific feature (which enables the approximation of real distributions), is thus an "ideal" augmentation.
1 code implementation • CVPR 2021 • Guoqiang Wei, Cuiling Lan, Wenjun Zeng, Zhibo Chen
For unsupervised domain adaptation (UDA), to alleviate the effect of domain shift, many approaches align the source and target domains in the feature space by adversarial learning or by explicitly aligning their statistics.
no code implementations • ICCV 2021 • Xin Jin, Cuiling Lan, Wenjun Zeng, Zhibo Chen
Many unsupervised domain adaptation (UDA) methods exploit domain adversarial training to align the features to reduce domain gap, where a feature extractor is trained to fool a domain discriminator in order to have aligned feature distributions.
1 code implementation • 2 Mar 2021 • Jindong Wang, Cuiling Lan, Chang Liu, Yidong Ouyang, Tao Qin, Wang Lu, Yiqiang Chen, Wenjun Zeng, Philip S. Yu
Domain generalization deals with a challenging setting where one or several different but related domain(s) are given, and the goal is to learn a model that can generalize to an unseen test domain.
no code implementations • 7 Feb 2021 • Rodolfo Quispe, Cuiling Lan, Wenjun Zeng, Helio Pedrini
Vehicle Re-Identification (V-ReID) is a critical task that associates the same vehicle across images from different camera viewpoints.
Ranked #1 on Vehicle Re-Identification on VeRi-Wild Large
1 code implementation • 3 Jan 2021 • Xin Jin, Cuiling Lan, Wenjun Zeng, Zhibo Chen
In this paper, we design a novel Style Normalization and Restitution module (SNR) to simultaneously ensure both high generalization and discrimination capability of the networks.
1 code implementation • 16 Dec 2020 • Kecheng Zheng, Cuiling Lan, Wenjun Zeng, Zhizheng Zhang, Zheng-Jun Zha
Based on this finding, we propose to exploit the uncertainty (measured by consistency levels) to evaluate the reliability of the pseudo-label of a sample and incorporate the uncertainty to re-weight its contribution within various ReID losses, including the identity (ID) classification loss per sample, the triplet loss, and the contrastive loss.
no code implementations • 9 Oct 2020 • Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Zhibo Chen, Shih-Fu Chang
In this work, we propose an Uncertainty-Aware Few-Shot framework for image classification by modeling the uncertainty of the similarities of query-support pairs and performing uncertainty-aware optimization.
no code implementations • 22 Jun 2020 • Xin Jin, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen
To ensure high discrimination, we propose a Feature Restoration (FR) operation to distill task-relevant features from the residual information and use them to compensate for the aligned features.
Ranked #80 on Domain Generalization on PACS
no code implementations • 8 Jun 2020 • Zhizheng Zhang, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen, Shih-Fu Chang
There is a lack of loss design which enables the joint optimization of multiple instances (of multiple classes) within per-query optimization for person ReID.
no code implementations • ECCV 2020 • Xin Jin, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen
To address this problem, we introduce a global distance-distributions separation (GDS) constraint over the two distributions to encourage the clear separation of positive and negative samples from a global view.
1 code implementation • CVPR 2020 • Xin Jin, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen, Li Zhang
Existing fully-supervised person re-identification (ReID) methods usually suffer from poor generalization capability caused by domain gaps.
Ranked #9 on Unsupervised Domain Adaptation on Market to Duke
no code implementations • CVPR 2020 • Zhizheng Zhang, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen
In this paper, we propose an attentive feature aggregation module, namely Multi-Granularity Reference-aided Attentive Feature Aggregation (MG-RAFA), to delicately aggregate spatio-temporal features into a discriminative video-level feature representation.
no code implementations • 17 Jan 2020 • Xiaolin Song, Yuyang Zhao, Jingyu Yang, Cuiling Lan, Wenjun Zeng
To exploit such flexible and comprehensive information, we propose a semi-supervised Feature Pyramidal Correlation and Residual Reconstruction Network (FPCR-Net) for optical flow estimation from frame pairs.
no code implementations • 15 Jan 2020 • Xin Jin, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen
To the best of our knowledge, we are the first to make use of multi-shots of an object in a teacher-student learning manner for effectively boosting the single image based re-id.
no code implementations • 3 Sep 2019 • Pengfei Zhang, Jianru Xue, Cuiling Lan, Wen-Jun Zeng, Zhanning Gao, Nanning Zheng
For an RNN block, an EleAttG is used for adaptively modulating the input by assigning different levels of importance, i.e., attention, to each element/dimension of the input.
Ranked #3 on Skeleton Based Action Recognition on SYSU 3D
1 code implementation • 30 May 2019 • Xin Jin, Cuiling Lan, Wen-Jun Zeng, Guoqiang Wei, Zhibo Chen
Specifically, we build a Semantics Aligning Network (SAN) which consists of a base network as encoder (SA-Enc) for re-ID, and a decoder (SA-Dec) for reconstructing/regressing the densely semantics aligned full texture image.
no code implementations • 17 Apr 2019 • Xin Jin, Cuiling Lan, Wen-Jun Zeng, Zhizheng Zhang, Zhibo Chen
We achieve this by the context interaction among the features of different scales.
1 code implementation • CVPR 2020 • Zhizheng Zhang, Cuiling Lan, Wen-Jun Zeng, Xin Jin, Zhibo Chen
For person re-identification (re-id), attention mechanisms have become attractive as they aim at strengthening discriminative features and suppressing irrelevant ones, which matches well the key of re-id, i.e., discriminative feature learning.
no code implementations • 3 Apr 2019 • Wentong Liao, Cuiling Lan, Wen-Jun Zeng, Michael Ying Yang, Bodo Rosenhahn
We further explore more powerful representations by integrating language prior with the visual context in the transformation for the scene graph generation.
2 code implementations • CVPR 2020 • Pengfei Zhang, Cuiling Lan, Wen-Jun Zeng, Junliang Xing, Jianru Xue, Nanning Zheng
Skeleton-based human action recognition has attracted great interest thanks to the easy accessibility of the human skeleton data.
Ranked #1 on Skeleton Based Action Recognition on SYSU 3D
no code implementations • 30 Jan 2019 • Guoqiang Wei, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen
The diversity of capturing viewpoints and the flexibility of human poses, however, remain significant challenges.
no code implementations • CVPR 2019 • Zhizheng Zhang, Cuiling Lan, Wen-Jun Zeng, Zhibo Chen
We propose a densely semantically aligned person re-identification framework.
no code implementations • 11 Sep 2018 • Xiaolin Song, Cuiling Lan, Wen-Jun Zeng, Junliang Xing, Jingyu Yang, Xiaoyan Sun
We propose a video-level 2D feature representation by transforming the convolutional features of all frames to a 2D feature map, referred to as VideoMap.
Ranked #51 on Action Recognition on UCF101
no code implementations • ECCV 2018 • Pengfei Zhang, Jianru Xue, Cuiling Lan, Wen-Jun Zeng, Zhanning Gao, Nanning Zheng
We propose adding a simple yet effective Element-wise-Attention Gate (EleAttG) to an RNN block (e.g., all RNN neurons in a network layer) that empowers the RNN neurons to have the attentiveness capability.
Ranked #102 on Skeleton Based Action Recognition on NTU RGB+D
2 code implementations • 20 Apr 2018 • Pengfei Zhang, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Jianru Xue, Nanning Zheng
In order to alleviate the effects of view variations, this paper introduces a novel view adaptation scheme, which automatically determines the virtual observation viewpoints in a learning-based, data-driven manner.
Ranked #1 on Skeleton Based Action Recognition on UWA3D
no code implementations • ICCV 2017 • Ke Sun, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Dong Liu, Jingdong Wang
We present a two-stage normalization scheme, human body normalization and limb normalization, to make the distribution of the relative joint locations compact, resulting in easier learning of convolutional spatial models and more accurate pose estimation.
1 code implementation • ICCV 2017 • Pengfei Zhang, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Jianru Xue, Nanning Zheng
Rather than re-positioning the skeletons based on a human defined prior criterion, we design a view adaptive recurrent neural network (RNN) with LSTM architecture, which enables the network itself to adapt to the most suitable observation viewpoints from end to end.
Ranked #6 on Skeleton Based Action Recognition on SYSU 3D
no code implementations • 18 Nov 2016 • Sijie Song, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Jiaying Liu
In this work, we propose an end-to-end spatial and temporal attention model for human action recognition from skeleton data.
Ranked #112 on Skeleton Based Action Recognition on NTU RGB+D
1 code implementation • 19 Apr 2016 • Yanghao Li, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Chunfeng Yuan, Jiaying Liu
In this paper, we study the problem of online action detection from streaming skeleton data.
no code implementations • 24 Mar 2016 • Wentao Zhu, Cuiling Lan, Junliang Xing, Wen-Jun Zeng, Yanghao Li, Li Shen, Xiaohui Xie
Skeleton based action recognition distinguishes human actions using the trajectories of skeleton joints, which provide a very good representation for describing actions.