Search Results for author: Wenjie Pei

Found 51 papers, 21 papers with code

Commonality-Parsing Network across Shape and Appearance for Partially Supervised Instance Segmentation

1 code implementation • ECCV 2020 • Qi Fan, Lei Ke, Wenjie Pei, Chi-Keung Tang, Yu-Wing Tai

We propose to learn the underlying class-agnostic commonalities that can be generalized from mask-annotated categories to novel categories.

Ranked #79 on Instance Segmentation on COCO test-dev

Instance Segmentation Segmentation +1

Self-Support Few-Shot Semantic Segmentation

1 code implementation • 23 Jul 2022 • Qi Fan, Wenjie Pei, Yu-Wing Tai, Chi-Keung Tang

Motivated by the simple Gestalt principle that pixels belonging to the same object are more similar than those belonging to different objects of the same class, we propose a novel self-support matching strategy to alleviate this problem, which uses query prototypes to match query features, where the query prototypes are collected from high-confidence query predictions.

Ranked #12 on Few-Shot Semantic Segmentation on PASCAL-5i (5-Shot)

Few-Shot Semantic Segmentation Segmentation +1

Memory-Attended Recurrent Network for Video Captioning

1 code implementation • CVPR 2019 • Wenjie Pei, Jiyuan Zhang, Xiangrong Wang, Lei Ke, Xiaoyong Shen, Yu-Wing Tai

Typical techniques for video captioning follow the encoder-decoder framework, which can only focus on one source video being processed.

Video Captioning

CPGAN: Full-Spectrum Content-Parsing Generative Adversarial Networks for Text-to-Image Synthesis

1 code implementation • 18 Dec 2019 • Jiadong Liang, Wenjie Pei, Feng Lu

Typical methods for text-to-image synthesis seek to design effective generative architecture to model the text-to-image mapping directly.

Image Generation Semantic correspondence +1

Saliency-Associated Object Tracking

1 code implementation • ICCV 2021 • Zikun Zhou, Wenjie Pei, Xin Li, Hongpeng Wang, Feng Zheng, Zhenyu He

A potential limitation of such trackers is that not all patches are equally informative for tracking.

Object Object Tracking

Multi-Modal Sarcasm Detection via Cross-Modal Graph Convolutional Network

1 code implementation • ACL 2022 • Bin Liang, Chenwei Lou, Xiang Li, Min Yang, Lin Gui, Yulan He, Wenjie Pei, Ruifeng Xu

Then, the descriptions of the objects serve as a bridge to determine the importance of the association between the objects of the image modality and the contextual words of the text modality, so as to build a cross-modal graph for each multi-modal instance.

Sarcasm Detection

Multi-Faceted Distillation of Base-Novel Commonality for Few-shot Object Detection

1 code implementation • 22 Jul 2022 • Shuang Wu, Wenjie Pei, Dianwen Mei, Fanglin Chen, Jiandong Tian, Guangming Lu

Most existing methods for few-shot object detection follow the fine-tuning paradigm, which potentially assumes that the class-agnostic generalizable knowledge can be learned and transferred implicitly from base classes with abundant samples to novel classes with limited samples via such a two-stage training strategy.

Few-Shot Object Detection object-detection

Temporal Attention-Gated Model for Robust Sequence Classification

1 code implementation • CVPR 2017 • Wenjie Pei, Tadas Baltrušaitis, David M. J. Tax, Louis-Philippe Morency

An important advantage of our approach is interpretability since the temporal attention weights provide a meaningful value for the salience of each time step in the sequence.

Classification General Classification +1

Hierarchical Contrastive Learning for Pattern-Generalizable Image Corruption Detection

1 code implementation • ICCV 2023 • Xin Feng, Yifeng Xu, Guangming Lu, Wenjie Pei

By detecting corrupted regions through learning the contrastive distinctions rather than the semantic patterns of corruptions, our model generalizes well across different corruption patterns.

Contrastive Learning Image Inpainting +1

Reliability-Hierarchical Memory Network for Scribble-Supervised Video Object Segmentation

1 code implementation • 25 Mar 2023 • Zikun Zhou, Kaige Mao, Wenjie Pei, Hongpeng Wang, YaoWei Wang, Zhenyu He

To be specific, RHMNet first uses only the memory at the high-reliability level to locate the region with high reliability belonging to the target, which is highly similar to the initial target scribble.

Semantic Segmentation Video Object Segmentation +1

SiamCorners: Siamese Corner Networks for Visual Tracking

1 code implementation • 15 Apr 2021 • Kai Yang, Zhenyu He, Wenjie Pei, Zikun Zhou, Xin Li, Di Yuan, Haijun Zhang

By tracking a target as a pair of corners, we avoid the need to design anchor boxes.

Region Proposal Visual Tracking

Deep-Masking Generative Network: A Unified Framework for Background Restoration from Superimposed Images

1 code implementation • 9 Oct 2020 • Xin Feng, Wenjie Pei, Zihui Jia, Fanglin Chen, David Zhang, Guangming Lu

In this work we present the Deep-Masking Generative Network (DMGN), which is a unified framework for background restoration from the superimposed images and is able to cope with different types of noise.

Image Dehazing Image Generation +3

An Informative Tracking Benchmark

1 code implementation • 13 Dec 2021 • Xin Li, Qiao Liu, Wenjie Pei, Qiuhong Shen, YaoWei Wang, Huchuan Lu, Ming-Hsuan Yang

Along with the rapid progress of visual tracking, existing benchmarks become less informative due to redundancy of samples and weak discrimination between current trackers, making evaluations on all datasets extremely time-consuming.

Visual Tracking

Global Tracking via Ensemble of Local Trackers

1 code implementation • CVPR 2022 • Zikun Zhou, Jianqiu Chen, Wenjie Pei, Kaige Mao, Hongpeng Wang, Zhenyu He

While it can exploit the temporal context, such as historical appearances and locations of the target, a potential limitation of such a strategy is that the local tracker tends to misidentify a nearby distractor as the target instead of activating the re-detector when the real target is out of view.

Alleviating the Sample Selection Bias in Few-shot Learning by Removing Projection to the Centroid

2 code implementations • 30 Oct 2022 • Jing Xu, Xu Luo, Xinglin Pan, Wenjie Pei, Yanan Li, Zenglin Xu

In this paper, we find that this problem usually occurs when the positions of support samples are in the vicinity of the task centroid, i.e., the mean of all class centroids in the task.

Few-Shot Learning Selection bias

SA$^2$VP: Spatially Aligned-and-Adapted Visual Prompt

1 code implementation • 16 Dec 2023 • Wenjie Pei, Tongqi Xia, Fanglin Chen, Jinsong Li, Jiandong Tian, Guangming Lu

Typical methods for visual prompt tuning follow the sequential modeling paradigm stemming from NLP, which represents an input image as a flattened sequence of token embeddings and then learns a set of unordered parameterized tokens prefixed to the sequence representation as the visual prompts for task adaptation of large vision models.

Image Classification Visual Prompt Tuning

Boosting Few-shot 3D Point Cloud Segmentation via Query-Guided Enhancement

1 code implementation • 6 Aug 2023 • Zhenhua Ning, Zhuotao Tian, Guangming Lu, Wenjie Pei

Although extensive research has been conducted on 3D point cloud segmentation, effectively adapting generic models to novel categories remains a formidable challenge.

Point Cloud Segmentation Segmentation

Single Shot Self-Reliant Scene Text Spotter by Decoupled yet Collaborative Detection and Recognition

1 code implementation • 15 Jul 2022 • Jingjing Wu, Pengyuan Lyu, Guangming Lu, Chengquan Zhang, Wenjie Pei

Typical text spotters follow the two-stage spotting paradigm which detects the boundary for a text instance first and then performs text recognition within the detected regions.

Ranked #5 on Text Spotting on ICDAR 2015

Text Detection Text Spotting

D$^2$ST-Adapter: Disentangled-and-Deformable Spatio-Temporal Adapter for Few-shot Action Recognition

1 code implementation • 3 Dec 2023 • Wenjie Pei, Qizhong Tan, Guangming Lu, Jiandong Tian

In particular, we devise the anisotropic Deformable Spatio-Temporal Attention module as the core component of D$^2$ST-Adapter, which can be tailored with anisotropic sampling densities along spatial and temporal domains to learn spatial and temporal features specifically in corresponding pathways, allowing our D$^2$ST-Adapter to encode features in a global view in 3D spatio-temporal space while maintaining a lightweight design.

Few-Shot action recognition Few Shot Action Recognition +1

Global-Local Stepwise Generative Network for Ultra High-Resolution Image Restoration

1 code implementation • 16 Jul 2022 • Xin Feng, Haobo Ji, Wenjie Pei, Fanglin Chen, Guangming Lu

While the research on image background restoration from regular-sized degraded images has achieved remarkable progress, restoring ultra high-resolution (e.g., 4K) images remains an extremely challenging task due to the explosion of computational complexity and memory usage, as well as the deficiency of annotated data.

4k Image Dehazing +3

Learning Sequence Representations by Non-local Recurrent Neural Memory

1 code implementation • 20 Jul 2022 • Wenjie Pei, Xin Feng, Canmiao Fu, Qiong Cao, Guangming Lu, Yu-Wing Tai

The key challenge of sequence representation learning is to capture the long-range temporal dependencies.

Representation Learning

Unsupervised Learning of Sequence Representations by Autoencoders

no code implementations • 3 Apr 2018 • Wenjie Pei, David M. J. Tax

Sequence data is challenging for machine learning approaches, because the lengths of the sequences may vary between samples.

Attended End-to-end Architecture for Age Estimation from Facial Expression Videos

no code implementations • 23 Nov 2017 • Wenjie Pei, Hamdi Dibeklioğlu, Tadas Baltrušaitis, David M. J. Tax

In this paper, we present an end-to-end architecture for age estimation, called Spatially-Indexed Attention Model (SIAM), which is able to simultaneously learn both the appearance and dynamics of age from raw videos of facial expressions.

Age Estimation

Interacting Attention-gated Recurrent Networks for Recommendation

no code implementations • 5 Sep 2017 • Wenjie Pei, Jie Yang, Zhu Sun, Jie Zhang, Alessandro Bozzon, David M. J. Tax

In particular, we propose a novel attention scheme to learn the attention scores of user and item history in an interacting way, thereby accounting for the dependencies between user and item dynamics in shaping user-item interactions.

Modeling Time Series Similarity with Siamese Recurrent Networks

no code implementations • 15 Mar 2016 • Wenjie Pei, David M. J. Tax, Laurens van der Maaten

Traditional techniques for measuring similarities between time series are based on handcrafted similarity measures, whereas more recent learning-based approaches cannot exploit external supervision.

domain classification General Classification +5

Time Series Classification using the Hidden-Unit Logistic Model

no code implementations • 16 Jun 2015 • Wenjie Pei, Hamdi Dibeklioğlu, David M. J. Tax, Laurens van der Maaten

We present a new model for time series classification, called the hidden-unit logistic model, that uses binary stochastic hidden units to model latent structure in the data.

Action Recognition Action Unit Detection +9

Non-local Recurrent Neural Memory for Supervised Sequence Modeling

no code implementations • ICCV 2019 • Canmiao Fu, Wenjie Pei, Qiong Cao, Chaopeng Zhang, Yong Zhao, Xiaoyong Shen, Yu-Wing Tai

Typical methods for supervised sequence modeling are built upon recurrent neural networks to capture temporal dependencies.

Action Recognition Sentiment Analysis

Push for Center Learning via Orthogonalization and Subspace Masking for Person Re-Identification

no code implementations • 28 Aug 2019 • Weinong Wang, Wenjie Pei, Qiong Cao, Shu Liu, Yu-Wing Tai

Person re-identification aims to identify whether pairs of images belong to the same person or not.

Person Re-Identification

Reflective Decoding Network for Image Captioning

no code implementations • ICCV 2019 • Lei Ke, Wenjie Pei, Ruiyu Li, Xiaoyong Shen, Yu-Wing Tai

State-of-the-art image captioning methods mostly focus on improving visual features; less attention has been paid to utilizing the inherent properties of language to boost captioning performance.

Ranked #4 on Image Captioning on MS COCO

Image Captioning Position +1

Push for Quantization: Deep Fisher Hashing

no code implementations • 31 Aug 2019 • Yunqiang Li, Wenjie Pei, Yufei Zha, Jan van Gemert

In this paper we push for quantization: We optimize maximum class separability in the binary space.

Quantization Semantic Similarity +1

CPGAN: Content-Parsing Generative Adversarial Networks for Text-to-Image Synthesis

no code implementations • ECCV 2020 • Jiadong Liang, Wenjie Pei, Feng Lu

Typical methods for text-to-image synthesis seek to design effective generative architecture to model the text-to-image mapping directly.

Image Generation Semantic correspondence +1

Self-Supervised Tracking via Target-Aware Data Synthesis

no code implementations • 21 Jun 2021 • Xin Li, Wenjie Pei, YaoWei Wang, Zhenyu He, Huchuan Lu, Ming-Hsuan Yang

While deep-learning-based tracking methods have achieved substantial progress, they entail large-scale and high-quality annotated data for sufficient training.

Representation Learning Self-Supervised Learning +1

Audio2Gestures: Generating Diverse Gestures from Speech Audio with Conditional Variational Autoencoders

no code implementations • ICCV 2021 • Jing Li, Di Kang, Wenjie Pei, Xuefei Zhe, Ying Zhang, Zhenyu He, Linchao Bao

In order to overcome this problem, we propose a novel conditional variational autoencoder (VAE) that explicitly models one-to-many audio-to-motion mapping by splitting the cross-modal latent code into shared code and motion-specific code.

Ranked #3 on Gesture Generation on BEAT

Gesture Generation

Generative Memory-Guided Semantic Reasoning Model for Image Inpainting

no code implementations • 1 Oct 2021 • Xin Feng, Wenjie Pei, Fengjun Li, Fanglin Chen, David Zhang, Guangming Lu

Most existing methods for image inpainting focus on learning the intra-image priors from the known regions of the current input image to infer the content of the corrupted regions in the same image.

Image Inpainting

Stepwise-Refining Speech Separation Network via Fine-Grained Encoding in High-order Latent Domain

no code implementations • 10 Oct 2021 • Zengwei Yao, Wenjie Pei, Fanglin Chen, Guangming Lu, David Zhang

Existing methods for speech separation either transform the speech signals into frequency domain to perform separation or seek to learn a separable embedding space by constructing a latent domain based on convolutional filters.

Ranked #7 on Speech Separation on WHAMR!

speech-recognition Speech Recognition +1

Label-Aware Distribution Calibration for Long-tailed Classification

no code implementations • 9 Nov 2021 • Chaozheng Wang, Shuzheng Gao, Cuiyun Gao, Pengyun Wang, Wenjie Pei, Lujia Pan, Zenglin Xu

Real-world data usually present long-tailed distributions.

Classification

Pedestrian Detection by Exemplar-Guided Contrastive Learning

no code implementations • 17 Nov 2021 • Zebin Lin, Wenjie Pei, Fanglin Chen, David Zhang, Guangming Lu

Instead of learning each of these diverse pedestrian appearance features individually as most existing methods do, we propose to perform contrastive learning to guide the feature learning in such a way that the semantic distance between pedestrians with different appearances in the learned feature space is minimized to eliminate the appearance diversities, whilst the distance between pedestrians and background is maximized.

Ranked #1 on Pedestrian Detection on TJU-Ped-campus

Contrastive Learning Pedestrian Detection

U2-Former: A Nested U-shaped Transformer for Image Restoration

no code implementations • 4 Dec 2021 • Haobo Ji, Xin Feng, Wenjie Pei, Jinxing Li, Guangming Lu

While Transformer has achieved remarkable performance in various high-level vision tasks, it is still challenging to exploit the full potential of Transformer in image restoration.

Ranked #16 on Image Dehazing on SOTS Outdoor

Computational Efficiency Contrastive Learning +3

Exploring Category-correlated Feature for Few-shot Image Classification

no code implementations • 14 Dec 2021 • Jing Xu, Xinglin Pan, Xu Luo, Wenjie Pei, Zenglin Xu

To alleviate this problem, we present a simple yet effective feature rectification method by exploring the category correlation between novel and base classes as the prior knowledge.

Classification Few-Shot Image Classification

Few-Shot Object Detection by Knowledge Distillation Using Bag-of-Visual-Words Representations

no code implementations • 25 Jul 2022 • Wenjie Pei, Shuang Wu, Dianwen Mei, Fanglin Chen, Jiandong Tian, Guangming Lu

In this work we design a novel knowledge distillation framework to guide the learning of the object detector and thereby restrain overfitting in both the pre-training stage on base classes and the fine-tuning stage on novel classes.

Few-Shot Object Detection Knowledge Distillation +2

Learning Generalizable Latent Representations for Novel Degradations in Super Resolution

no code implementations • 25 Jul 2022 • Fengjun Li, Xin Feng, Fanglin Chen, Guangming Lu, Wenjie Pei

Real-world degradations can fall outside the scope simulated by handcrafted degradations; such degradations are referred to as novel degradations.

Blind Super-Resolution Image Super-Resolution +1

Layout-Bridging Text-to-Image Synthesis

no code implementations • 12 Aug 2022 • Jiadong Liang, Wenjie Pei, Feng Lu

Specifically, we formulate the text-to-layout generation as a sequence-to-sequence modeling task, and build our model upon Transformer to learn the spatial relationships between objects by modeling the sequential dependencies between them.

Image Generation

SSORN: Self-Supervised Outlier Removal Network for Robust Homography Estimation

no code implementations • 30 Aug 2022 • Yi Li, Wenjie Pei, Zhenyu He

In this paper, we attempt to build a deep learning model that mimics all four steps in the traditional homography estimation pipeline.

Denoising Homography Estimation

Semantic-Aware Local-Global Vision Transformer

no code implementations • 27 Nov 2022 • Jiatong Zhang, Zengwei Yao, Fanglin Chen, Guangming Lu, Wenjie Pei

Second, instead of only performing local self-attention within local windows as Swin Transformer does, the proposed SALG performs both 1) local intra-region self-attention for learning fine-grained features within each region and 2) global inter-region feature propagation for modeling global dependencies among all regions.

Ranked #854 on Image Classification on ImageNet

Image Classification Semantic Segmentation

Activating the Discriminability of Novel Classes for Few-shot Segmentation

no code implementations • 2 Dec 2022 • Dianwen Mei, Wei Zhuo, Jiandong Tian, Guangming Lu, Wenjie Pei

To circumvent these two challenges, we propose to activate the discriminability of novel classes explicitly in both the feature encoding stage and the prediction stage for segmentation.

Segmentation

Audio2Gestures: Generating Diverse Gestures from Audio

no code implementations • 17 Jan 2023 • Jing Li, Di Kang, Wenjie Pei, Xuefei Zhe, Ying Zhang, Linchao Bao, Zhenyu He

Finally, we demonstrate that our method can be readily used to generate motion sequences with user-specified motion clips on the timeline.

Gesture Generation

Feature Decoupling-Recycling Network for Fast Interactive Segmentation

no code implementations • 7 Aug 2023 • Huimin Zeng, Weinong Wang, Xin Tao, Zhiwei Xiong, Yu-Wing Tai, Wenjie Pei

First, our model decouples the learning of source image semantics from the encoding of user guidance to process two types of input domains separately.

Image Segmentation Interactive Segmentation +3

Scene-Generalizable Interactive Segmentation of Radiance Fields

no code implementations • 9 Aug 2023 • Songlin Tang, Wenjie Pei, Xin Tao, Tanghui Jia, Guangming Lu, Yu-Wing Tai

Existing methods for interactive segmentation in radiance fields entail scene-specific optimization and thus cannot generalize across different scenes, which greatly limits their applicability.

Interactive Segmentation Segmentation +1

Robust 3D Tracking with Quality-Aware Shape Completion

no code implementations • 17 Dec 2023 • Jingwen Zhang, Zikun Zhou, Guangming Lu, Jiandong Tian, Wenjie Pei

Considering that, we propose to construct a synthetic target representation composed of dense and complete point clouds depicting the target shape precisely by shape completion for robust 3D tracking.

3D Single Object Tracking Object Tracking

Saliency-Aware Regularized Graph Neural Network

no code implementations • 1 Jan 2024 • Wenjie Pei, Weina Xu, Zongze Wu, Weichao Li, Jinfan Wang, Guangming Lu, Xiangrong Wang

In this work, we propose the Saliency-Aware Regularized Graph Neural Network (SAR-GNN) for graph classification, which consists of two core modules: 1) a traditional graph neural network serving as the backbone for learning node features and 2) the Graph Neural Memory designed to distill a compact graph representation from node features of the backbone.

Graph Classification Representation Learning +2

Domain-Rectifying Adapter for Cross-Domain Few-Shot Segmentation

no code implementations • 16 Apr 2024 • Jiapeng Su, Qi Fan, Guangming Lu, Fanglin Chen, Wenjie Pei

Instead, our key idea is to adapt a small adapter for rectifying diverse target domain styles to the source domain.

Cross-Domain Few-Shot Few-Shot Semantic Segmentation +2
