Search Results for author: Wenjie Pei

Found 50 papers, 21 papers with code

CPGAN: Content-Parsing Generative Adversarial Networks for Text-to-Image Synthesis

no code implementations ECCV 2020 Jiadong Liang, Wenjie Pei, Feng Lu

Typical methods for text-to-image synthesis seek to design an effective generative architecture to model the text-to-image mapping directly.

Image Generation Semantic correspondence +1

Multi-Modal Sarcasm Detection via Cross-Modal Graph Convolutional Network

1 code implementation ACL 2022 Bin Liang, Chenwei Lou, Xiang Li, Min Yang, Lin Gui, Yulan He, Wenjie Pei, Ruifeng Xu

Then, the descriptions of the objects serve as a bridge to determine the importance of the association between the objects of the image modality and the contextual words of the text modality, so as to build a cross-modal graph for each multi-modal instance.

Sarcasm Detection

Saliency-Aware Regularized Graph Neural Network

no code implementations 1 Jan 2024 Wenjie Pei, Weina Xu, Zongze Wu, Weichao Li, Jinfan Wang, Guangming Lu, Xiangrong Wang

In this work, we propose the Saliency-Aware Regularized Graph Neural Network (SAR-GNN) for graph classification, which consists of two core modules: 1) a traditional graph neural network serving as the backbone for learning node features and 2) the Graph Neural Memory designed to distill a compact graph representation from node features of the backbone.

Graph Classification Representation Learning +2

Robust 3D Tracking with Quality-Aware Shape Completion

no code implementations 17 Dec 2023 Jingwen Zhang, Zikun Zhou, Guangming Lu, Jiandong Tian, Wenjie Pei

Considering that, we propose to construct a synthetic target representation composed of dense and complete point clouds depicting the target shape precisely by shape completion for robust 3D tracking.

3D Single Object Tracking Object Tracking

SA$^2$VP: Spatially Aligned-and-Adapted Visual Prompt

1 code implementation 16 Dec 2023 Wenjie Pei, Tongqi Xia, Fanglin Chen, Jinsong Li, Jiandong Tian, Guangming Lu

Typical methods for visual prompt tuning follow the sequential modeling paradigm stemming from NLP, which represents an input image as a flattened sequence of token embeddings and then learns a set of unordered parameterized tokens prefixed to the sequence representation as the visual prompts for task adaptation of large vision models.

Image Classification

D$^2$ST-Adapter: Disentangled-and-Deformable Spatio-Temporal Adapter for Few-shot Action Recognition

1 code implementation 3 Dec 2023 Wenjie Pei, Qizhong Tan, Guangming Lu, Jiandong Tian

Adapting large pre-trained image models to few-shot action recognition has proven to be an effective and efficient strategy for learning robust feature extractors, which is essential for few-shot learning.

Few-Shot action recognition Few Shot Action Recognition +1

Hierarchical Contrastive Learning for Pattern-Generalizable Image Corruption Detection

1 code implementation ICCV 2023 Xin Feng, Yifeng Xu, Guangming Lu, Wenjie Pei

Detecting corrupted regions by learning the contrastive distinctions rather than the semantic patterns of corruptions, our model generalizes well across different corruption patterns.

Contrastive Learning Image Inpainting +1

Scene-Generalizable Interactive Segmentation of Radiance Fields

no code implementations 9 Aug 2023 Songlin Tang, Wenjie Pei, Xin Tao, Tanghui Jia, Guangming Lu, Yu-Wing Tai

Existing methods for interactive segmentation in radiance fields entail scene-specific optimization and thus cannot generalize across different scenes, which greatly limits their applicability.

Interactive Segmentation Segmentation +1

Feature Decoupling-Recycling Network for Fast Interactive Segmentation

no code implementations 7 Aug 2023 Huimin Zeng, Weinong Wang, Xin Tao, Zhiwei Xiong, Yu-Wing Tai, Wenjie Pei

First, our model decouples the learning of source image semantics from the encoding of user guidance to process two types of input domains separately.

Image Segmentation Interactive Segmentation +3

Boosting Few-shot 3D Point Cloud Segmentation via Query-Guided Enhancement

1 code implementation 6 Aug 2023 Zhenhua Ning, Zhuotao Tian, Guangming Lu, Wenjie Pei

Although extensive research has been conducted on 3D point cloud segmentation, effectively adapting generic models to novel categories remains a formidable challenge.

Point Cloud Segmentation Segmentation

Reliability-Hierarchical Memory Network for Scribble-Supervised Video Object Segmentation

1 code implementation 25 Mar 2023 Zikun Zhou, Kaige Mao, Wenjie Pei, Hongpeng Wang, YaoWei Wang, Zhenyu He

To be specific, RHMNet first only uses the memory in the high-reliability level to locate the region with high reliability belonging to the target, which is highly similar to the initial target scribble.

Semantic Segmentation Video Object Segmentation +1

Audio2Gestures: Generating Diverse Gestures from Audio

no code implementations 17 Jan 2023 Jing Li, Di Kang, Wenjie Pei, Xuefei Zhe, Ying Zhang, Linchao Bao, Zhenyu He

Finally, we demonstrate that our method can be readily used to generate motion sequences with user-specified motion clips on the timeline.

Gesture Generation

Activating the Discriminability of Novel Classes for Few-shot Segmentation

no code implementations 2 Dec 2022 Dianwen Mei, Wei Zhuo, Jiandong Tian, Guangming Lu, Wenjie Pei

To circumvent these two challenges, we propose to activate the discriminability of novel classes explicitly in both the feature encoding stage and the prediction stage for segmentation.

Segmentation

Semantic-Aware Local-Global Vision Transformer

no code implementations 27 Nov 2022 Jiatong Zhang, Zengwei Yao, Fanglin Chen, Guangming Lu, Wenjie Pei

Second, instead of only performing local self-attention within local windows as Swin Transformer does, the proposed SALG performs both 1) local intra-region self-attention for learning fine-grained features within each region and 2) global inter-region feature propagation for modeling global dependencies among all regions.

Image Classification Semantic Segmentation

Alleviating the Sample Selection Bias in Few-shot Learning by Removing Projection to the Centroid

2 code implementations 30 Oct 2022 Jing Xu, Xu Luo, Xinglin Pan, Wenjie Pei, Yanan Li, Zenglin Xu

In this paper, we find that this problem usually occurs when the positions of support samples are in the vicinity of task centroid -- the mean of all class centroids in the task.

Few-Shot Learning Selection bias

SSORN: Self-Supervised Outlier Removal Network for Robust Homography Estimation

no code implementations 30 Aug 2022 Yi Li, Wenjie Pei, Zhenyu He

In this paper, we attempt to build a deep learning model that mimics all four steps in the traditional homography estimation pipeline.

Denoising Homography Estimation

Layout-Bridging Text-to-Image Synthesis

no code implementations 12 Aug 2022 Jiadong Liang, Wenjie Pei, Feng Lu

Specifically, we formulate the text-to-layout generation as a sequence-to-sequence modeling task, and build our model upon Transformer to learn the spatial relationships between objects by modeling the sequential dependencies between them.

Image Generation

Learning Generalizable Latent Representations for Novel Degradations in Super Resolution

no code implementations 25 Jul 2022 Fengjun Li, Xin Feng, Fanglin Chen, Guangming Lu, Wenjie Pei

Real-world degradations can fall outside the scope simulated by handcrafted degradations; such cases are referred to as novel degradations.

Blind Super-Resolution Image Super-Resolution +1

Few-Shot Object Detection by Knowledge Distillation Using Bag-of-Visual-Words Representations

no code implementations 25 Jul 2022 Wenjie Pei, Shuang Wu, Dianwen Mei, Fanglin Chen, Jiandong Tian, Guangming Lu

In this work we design a novel knowledge distillation framework to guide the learning of the object detector and thereby restrain the overfitting in both the pre-training stage on base classes and fine-tuning stage on novel classes.

Few-Shot Object Detection Knowledge Distillation +2

Self-Support Few-Shot Semantic Segmentation

1 code implementation 23 Jul 2022 Qi Fan, Wenjie Pei, Yu-Wing Tai, Chi-Keung Tang

Motivated by the simple Gestalt principle that pixels belonging to the same object are more similar than those belonging to different objects of the same class, we propose a novel self-support matching strategy to alleviate this problem, which uses query prototypes to match query features, where the query prototypes are collected from high-confidence query predictions.

Few-Shot Semantic Segmentation Segmentation +1

Multi-Faceted Distillation of Base-Novel Commonality for Few-shot Object Detection

1 code implementation 22 Jul 2022 Shuang Wu, Wenjie Pei, Dianwen Mei, Fanglin Chen, Jiandong Tian, Guangming Lu

Most existing methods for few-shot object detection follow the fine-tuning paradigm, which potentially assumes that the class-agnostic generalizable knowledge can be learned and transferred implicitly from base classes with abundant samples to novel classes with limited samples via such a two-stage training strategy.

Few-Shot Object Detection object-detection

Learning Sequence Representations by Non-local Recurrent Neural Memory

1 code implementation 20 Jul 2022 Wenjie Pei, Xin Feng, Canmiao Fu, Qiong Cao, Guangming Lu, Yu-Wing Tai

The key challenge of sequence representation learning is to capture the long-range temporal dependencies.

Representation Learning

Global-Local Stepwise Generative Network for Ultra High-Resolution Image Restoration

1 code implementation 16 Jul 2022 Xin Feng, Haobo Ji, Wenjie Pei, Fanglin Chen, Guangming Lu

While the research on image background restoration from regular size of degraded images has achieved remarkable progress, restoring ultra high-resolution (e.g., 4K) images remains an extremely challenging task due to the explosion of computational complexity and memory usage, as well as the deficiency of annotated data.

Image Dehazing Image Restoration +2

Single Shot Self-Reliant Scene Text Spotter by Decoupled yet Collaborative Detection and Recognition

1 code implementation 15 Jul 2022 Jingjing Wu, Pengyuan Lyu, Guangming Lu, Chengquan Zhang, Wenjie Pei

Typical text spotters follow the two-stage spotting paradigm which detects the boundary for a text instance first and then performs text recognition within the detected regions.

Text Detection Text Spotting

Global Tracking via Ensemble of Local Trackers

1 code implementation CVPR 2022 Zikun Zhou, Jianqiu Chen, Wenjie Pei, Kaige Mao, Hongpeng Wang, Zhenyu He

While it can exploit the temporal context like historical appearances and locations of the target, a potential limitation of such strategy is that the local tracker tends to misidentify a nearby distractor as the target instead of activating the re-detector when the real target is out of view.

Exploring Category-correlated Feature for Few-shot Image Classification

no code implementations 14 Dec 2021 Jing Xu, Xinglin Pan, Xu Luo, Wenjie Pei, Zenglin Xu

To alleviate this problem, we present a simple yet effective feature rectification method by exploring the category correlation between novel and base classes as the prior knowledge.

Classification Few-Shot Image Classification

An Informative Tracking Benchmark

1 code implementation 13 Dec 2021 Xin Li, Qiao Liu, Wenjie Pei, Qiuhong Shen, YaoWei Wang, Huchuan Lu, Ming-Hsuan Yang

Along with the rapid progress of visual tracking, existing benchmarks become less informative due to redundancy of samples and weak discrimination between current trackers, making evaluations on all datasets extremely time-consuming.

Visual Tracking

U2-Former: A Nested U-shaped Transformer for Image Restoration

no code implementations 4 Dec 2021 Haobo Ji, Xin Feng, Wenjie Pei, Jinxing Li, Guangming Lu

While Transformer has achieved remarkable performance in various high-level vision tasks, it is still challenging to exploit the full potential of Transformer in image restoration.

Computational Efficiency Contrastive Learning +3

Pedestrian Detection by Exemplar-Guided Contrastive Learning

no code implementations 17 Nov 2021 Zebin Lin, Wenjie Pei, Fanglin Chen, David Zhang, Guangming Lu

Instead of learning each of these diverse pedestrian appearance features individually as most existing methods do, we propose to perform contrastive learning to guide the feature learning in such a way that the semantic distance between pedestrians with different appearances in the learned feature space is minimized to eliminate the appearance diversities, whilst the distance between pedestrians and background is maximized.

Contrastive Learning Pedestrian Detection

Stepwise-Refining Speech Separation Network via Fine-Grained Encoding in High-order Latent Domain

no code implementations 10 Oct 2021 Zengwei Yao, Wenjie Pei, Fanglin Chen, Guangming Lu, David Zhang

Existing methods for speech separation either transform the speech signals into frequency domain to perform separation or seek to learn a separable embedding space by constructing a latent domain based on convolutional filters.

speech-recognition Speech Recognition +1

Generative Memory-Guided Semantic Reasoning Model for Image Inpainting

no code implementations 1 Oct 2021 Xin Feng, Wenjie Pei, Fengjun Li, Fanglin Chen, David Zhang, Guangming Lu

Most existing methods for image inpainting focus on learning the intra-image priors from the known regions of the current input image to infer the content of the corrupted regions in the same image.

Image Inpainting

Audio2Gestures: Generating Diverse Gestures from Speech Audio with Conditional Variational Autoencoders

no code implementations ICCV 2021 Jing Li, Di Kang, Wenjie Pei, Xuefei Zhe, Ying Zhang, Zhenyu He, Linchao Bao

In order to overcome this problem, we propose a novel conditional variational autoencoder (VAE) that explicitly models one-to-many audio-to-motion mapping by splitting the cross-modal latent code into shared code and motion-specific code.

Gesture Generation

Saliency-Associated Object Tracking

1 code implementation ICCV 2021 Zikun Zhou, Wenjie Pei, Xin Li, Hongpeng Wang, Feng Zheng, Zhenyu He

A potential limitation of such trackers is that not all patches are equally informative for tracking.

Object Object Tracking

Self-Supervised Tracking via Target-Aware Data Synthesis

no code implementations 21 Jun 2021 Xin Li, Wenjie Pei, YaoWei Wang, Zhenyu He, Huchuan Lu, Ming-Hsuan Yang

While deep-learning based tracking methods have achieved substantial progress, they entail large-scale and high-quality annotated data for sufficient training.

Representation Learning Self-Supervised Learning +1

Deep-Masking Generative Network: A Unified Framework for Background Restoration from Superimposed Images

1 code implementation 9 Oct 2020 Xin Feng, Wenjie Pei, Zihui Jia, Fanglin Chen, David Zhang, Guangming Lu

In this work we present the Deep-Masking Generative Network (DMGN), which is a unified framework for background restoration from the superimposed images and is able to cope with different types of noise.

Image Dehazing Image Generation +3

CPGAN: Full-Spectrum Content-Parsing Generative Adversarial Networks for Text-to-Image Synthesis

1 code implementation 18 Dec 2019 Jiadong Liang, Wenjie Pei, Feng Lu

Typical methods for text-to-image synthesis seek to design an effective generative architecture to model the text-to-image mapping directly.

Image Generation Semantic correspondence +1

Push for Quantization: Deep Fisher Hashing

no code implementations 31 Aug 2019 Yunqiang Li, Wenjie Pei, Yufei Zha, Jan van Gemert

In this paper we push for quantization: We optimize maximum class separability in the binary space.

Quantization Semantic Similarity +1

Reflective Decoding Network for Image Captioning

no code implementations ICCV 2019 Lei Ke, Wenjie Pei, Ruiyu Li, Xiaoyong Shen, Yu-Wing Tai

State-of-the-art image captioning methods mostly focus on improving visual features, while less attention has been paid to utilizing the inherent properties of language to boost captioning performance.

Image Captioning Position +1

Memory-Attended Recurrent Network for Video Captioning

1 code implementation CVPR 2019 Wenjie Pei, Jiyuan Zhang, Xiangrong Wang, Lei Ke, Xiaoyong Shen, Yu-Wing Tai

Typical techniques for video captioning follow the encoder-decoder framework, which can only focus on one source video being processed.

Video Captioning

Unsupervised Learning of Sequence Representations by Autoencoders

no code implementations 3 Apr 2018 Wenjie Pei, David M. J. Tax

Sequence data is challenging for machine learning approaches, because the lengths of the sequences may vary between samples.

Attended End-to-end Architecture for Age Estimation from Facial Expression Videos

no code implementations 23 Nov 2017 Wenjie Pei, Hamdi Dibeklioğlu, Tadas Baltrušaitis, David M. J. Tax

In this paper, we present an end-to-end architecture for age estimation, called Spatially-Indexed Attention Model (SIAM), which is able to simultaneously learn both the appearance and dynamics of age from raw videos of facial expressions.

Age Estimation

Interacting Attention-gated Recurrent Networks for Recommendation

no code implementations 5 Sep 2017 Wenjie Pei, Jie Yang, Zhu Sun, Jie Zhang, Alessandro Bozzon, David M. J. Tax

In particular, we propose a novel attention scheme to learn the attention scores of user and item history in an interacting way, thereby accounting for the dependencies between user and item dynamics in shaping user-item interactions.

Temporal Attention-Gated Model for Robust Sequence Classification

1 code implementation CVPR 2017 Wenjie Pei, Tadas Baltrušaitis, David M. J. Tax, Louis-Philippe Morency

An important advantage of our approach is interpretability since the temporal attention weights provide a meaningful value for the salience of each time step in the sequence.

Classification General Classification +1

Modeling Time Series Similarity with Siamese Recurrent Networks

no code implementations 15 Mar 2016 Wenjie Pei, David M. J. Tax, Laurens van der Maaten

Traditional techniques for measuring similarities between time series are based on handcrafted similarity measures, whereas more recent learning-based approaches cannot exploit external supervision.

domain classification General Classification +5

Time Series Classification using the Hidden-Unit Logistic Model

no code implementations 16 Jun 2015 Wenjie Pei, Hamdi Dibeklioğlu, David M. J. Tax, Laurens van der Maaten

We present a new model for time series classification, called the hidden-unit logistic model, that uses binary stochastic hidden units to model latent structure in the data.

Action Recognition Action Unit Detection +9
