Search Results for author: Wenjun Zeng

Found 61 papers, 30 papers with code

Real-Time Dynamic Robot-Assisted Hand-Object Interaction via Motion Primitives

no code implementations 29 May 2024 Mingqi Yuan, Huijiang Wang, Kai-Fung Chu, Fumiya Iida, Bo Li, Wenjun Zeng

These challenges arise from the need for accurate real-time perception of human actions, adaptive control algorithms for robots, and the effective coordination between human and robotic movements.

Hand Pose Estimation

Correlation-Embedded Transformer Tracking: A Single-Branch Framework

1 code implementation 23 Jan 2024 Fei Xie, Wankou Yang, Chunyu Wang, Lei Chu, Yue Cao, Chao Ma, Wenjun Zeng

Thus, we reformulate the two-branch Siamese tracking as a conceptually simple, fully transformer-based Single-Branch Tracking pipeline, dubbed SBT.

Feature Correlation Visual Object Tracking

Inter-X: Towards Versatile Human-Human Interaction Analysis

no code implementations CVPR 2024 Liang Xu, Xintao Lv, Yichao Yan, Xin Jin, Shuwen Wu, Congsheng Xu, Yifan Liu, Yizhou Zhou, Fengyun Rao, Xingdong Sheng, Yunhui Liu, Wenjun Zeng, Xiaokang Yang

We also equip Inter-X with versatile annotations of more than 34K fine-grained human part-level textual descriptions, semantic interaction categories, interaction order, and the relationship and personality of the subjects.

RLLTE: Long-Term Evolution Project of Reinforcement Learning

2 code implementations 28 Sep 2023 Mingqi Yuan, Zequn Zhang, Yang Xu, Shihao Luo, Bo Li, Xin Jin, Wenjun Zeng

We present RLLTE: a long-term evolution, extremely modular, and open-source framework for reinforcement learning (RL) research and application.

Language Modelling Large Language Model +2

Diffusion Models for Image Restoration and Enhancement -- A Comprehensive Survey

1 code implementation 18 Aug 2023 Xin Li, Yulin Ren, Xin Jin, Cuiling Lan, Xingrui Wang, Wenjun Zeng, Xinchao Wang, Zhibo Chen

Image restoration (IR) has been an indispensable and challenging task in the low-level vision field, which strives to improve the subjective quality of images distorted by various forms of degradation.

Deblurring Image Restoration +2

One at a Time: Progressive Multi-step Volumetric Probability Learning for Reliable 3D Scene Perception

no code implementations 22 Jun 2023 Bohan Li, Yasheng Sun, Jingxin Dong, Zheng Zhu, Jinming Liu, Xin Jin, Wenjun Zeng

Numerous studies have investigated the pivotal role of reliable 3D volume representation in scene perception tasks, such as multi-view stereo (MVS) and semantic scene completion (SSC).

Depth Estimation Representation Learning

Making Offline RL Online: Collaborative World Models for Offline Visual Reinforcement Learning

no code implementations 24 May 2023 Qi Wang, Junming Yang, Yunbo Wang, Xin Jin, Wenjun Zeng, Xiaokang Yang

Training offline reinforcement learning (RL) models using visual inputs poses two significant challenges, i.e., the overfitting problem in representation learning and the overestimation bias for expected future rewards.

Offline RL Reinforcement Learning (RL) +2

NaviNeRF: NeRF-based 3D Representation Disentanglement by Latent Semantic Navigation

1 code implementation ICCV 2023 Baao Xie, Bohan Li, Zequn Zhang, Junting Dong, Xin Jin, Jingyu Yang, Wenjun Zeng

They are complementary -- the outer navigation is to identify global-view semantic directions, and the inner refinement is dedicated to fine-grained attributes.

Disentanglement

[CLS] Token is All You Need for Zero-Shot Semantic Segmentation

no code implementations 13 Apr 2023 Letian Wu, Wenyao Zhang, Tengping Jiang, Wankou Yang, Xin Jin, Wenjun Zeng

Based on that, we build upon the CLIP model as a backbone which we extend with a One-Way [CLS] token navigation from text to the visual branch that enables zero-shot dense prediction, dubbed ClsCLIP.

Few-Shot Semantic Segmentation Language Modelling +4

Inpaint Anything: Segment Anything Meets Image Inpainting

1 code implementation 13 Apr 2023 Tao Yu, Runseng Feng, Ruoyu Feng, Jinming Liu, Xin Jin, Wenjun Zeng, Zhibo Chen

We are also very willing to help everyone share and promote new projects based on our Inpaint Anything (IA).

Image Inpainting

Bridging Stereo Geometry and BEV Representation with Reliable Mutual Interaction for Semantic Scene Completion

1 code implementation 24 Mar 2023 Bohan Li, Yasheng Sun, Zhujin Liang, Dalong Du, Zhuanghui Zhang, XiaoFeng Wang, Yunnan Wang, Xin Jin, Wenjun Zeng

However, due to the inherent representation gap between stereo geometry and BEV features, it is non-trivial to bridge them for the dense prediction task of SSC.

3D Semantic Scene Completion Hallucination +2

Automatic Intrinsic Reward Shaping for Exploration in Deep Reinforcement Learning

1 code implementation 26 Jan 2023 Mingqi Yuan, Bo Li, Xin Jin, Wenjun Zeng

We present AIRS: Automatic Intrinsic Reward Shaping that intelligently and adaptively provides high-quality intrinsic rewards to enhance exploration in reinforcement learning (RL).

Benchmarking reinforcement-learning +1

Rewarding Episodic Visitation Discrepancy for Exploration in Reinforcement Learning

no code implementations 19 Sep 2022 Mingqi Yuan, Bo Li, Xin Jin, Wenjun Zeng

Exploration is critical for deep reinforcement learning in complex environments with high-dimensional observations and sparse rewards.

Atari Games Benchmarking +3

A Nonparametric Contextual Bandit with Arm-level Eligibility Control for Customer Service Routing

no code implementations 8 Sep 2022 Ruofeng Wen, Wenjun Zeng, Yi Liu

Routing contacts to eligible SMEs turns out to be a non-trivial problem because SMEs' domain eligibility is subject to training quality and can change over time.

Thompson Sampling

Robust Multi-Object Tracking by Marginal Inference

no code implementations 7 Aug 2022 Yifu Zhang, Chunyu Wang, Xinggang Wang, Wenjun Zeng, Wenyu Liu

To address the problem, we present an efficient approach to compute a marginal probability for each pair of objects in real time.

Multi-Object Tracking Object

VirtualPose: Learning Generalizable 3D Human Pose Models from Virtual Data

1 code implementation 20 Jul 2022 Jiajun Su, Chunyu Wang, Xiaoxuan Ma, Wenjun Zeng, Yizhou Wang

While monocular 3D pose estimation seems to have achieved very accurate results on the public datasets, their generalization ability is largely overlooked.

3D Multi-Person Pose Estimation (absolute) 3D Pose Estimation

ReSTR: Convolution-free Referring Image Segmentation Using Transformers

no code implementations CVPR 2022 Namyup Kim, Dongwon Kim, Cuiling Lan, Wenjun Zeng, Suha Kwak

Most existing methods for this task rely heavily on convolutional neural networks, which, however, have trouble capturing long-range dependencies between entities in the language expression and are not flexible enough for modeling interactions between the two different modalities.

Image Segmentation Referring Expression Segmentation +2

ActFormer: A GAN-based Transformer towards General Action-Conditioned 3D Human Motion Generation

no code implementations ICCV 2023 Liang Xu, Ziyang Song, Dongliang Wang, Jing Su, Zhicheng Fang, Chenjing Ding, Weihao Gan, Yichao Yan, Xin Jin, Xiaokang Yang, Wenjun Zeng, Wei Wu

We present a GAN-based Transformer for general action-conditioned 3D human motion generation, including not only single-person actions but also multi-person interactive actions.

Correlation-Aware Deep Tracking

1 code implementation CVPR 2022 Fei Xie, Chunyu Wang, Guangting Wang, Yue Cao, Wankou Yang, Wenjun Zeng

In contrast to the Siamese-like feature extraction, our network deeply embeds cross-image feature correlation in multiple layers of the feature network.

Feature Correlation Visual Object Tracking

Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph

2 code implementations ICLR 2022 Dacheng Yin, Xuanchi Ren, Chong Luo, Yuwang Wang, Zhiwei Xiong, Wenjun Zeng

Last, an innovative link attention module serves as the decoder to reconstruct data from the decomposed content and style, with the help of the linking keys.

Decoder Quantization +2

Confounder Identification-free Causal Visual Feature Learning

no code implementations 26 Nov 2021 Xin Li, Zhizheng Zhang, Guoqiang Wei, Cuiling Lan, Wenjun Zeng, Xin Jin, Zhibo Chen

In this paper, we propose a novel Confounder Identification-free Causal Visual Feature Learning (CICF) method, which obviates the need for identifying confounders.

Domain Generalization Meta-Learning

WEDGE: Web-Image Assisted Domain Generalization for Semantic Segmentation

no code implementations 29 Sep 2021 Namyup Kim, Taeyoung Son, Jaehyun Pahk, Cuiling Lan, Wenjun Zeng, Suha Kwak

We also present a method which injects styles of the web-crawled images into training images on-the-fly during training, which enables the network to experience images of diverse styles with reliable labels for effective training.

Domain Generalization Segmentation +1

Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration

1 code implementation 12 Sep 2021 Chuanxin Tang, Chong Luo, Zhiyuan Zhao, Dacheng Yin, Yucheng Zhao, Wenjun Zeng

Given a piece of speech and its transcript text, text-based speech editing aims to generate speech that can be seamlessly inserted into the given speech by editing the transcript.

Decoder Voice Conversion

A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP

1 code implementation 30 Aug 2021 Yucheng Zhao, Guangting Wang, Chuanxin Tang, Chong Luo, Wenjun Zeng, Zheng-Jun Zha

Convolutional neural networks (CNN) are the dominant deep neural network (DNN) architecture for computer vision.

Self-Supervised Visual Representations Learning by Contrastive Mask Prediction

no code implementations ICCV 2021 Yucheng Zhao, Guangting Wang, Chong Luo, Wenjun Zeng, Zheng-Jun Zha

In this paper, we propose a novel contrastive mask prediction (CMP) task for visual representation learning and design a mask contrast (MaskCo) framework to implement the idea.

Representation Learning Self-Supervised Learning

ToAlign: Task-oriented Alignment for Unsupervised Domain Adaptation

1 code implementation NeurIPS 2021 Guoqiang Wei, Cuiling Lan, Wenjun Zeng, Zhizheng Zhang, Zhibo Chen

Unsupervised domain adaptive classification intends to improve the classification performance on the unlabeled target domain.

Unsupervised Domain Adaptation

Understanding Mobile GUI: from Pixel-Words to Screen-Sentences

no code implementations 25 May 2021 Jingwen Fu, Xiaoyi Zhang, Yuwang Wang, Wenjun Zeng, Sam Yang, Grayson Hilliard

A dataset, RICO-PW, of screenshots with Pixel-Words annotations is built based on the public RICO dataset, which will be released to help to address the lack of high-quality training data in this area.

Retrieval Sentence

Unsupervised Visual Representation Learning by Tracking Patches in Video

1 code implementation CVPR 2021 Guangting Wang, Yizhou Zhou, Chong Luo, Wenxuan Xie, Wenjun Zeng, Zhiwei Xiong

The proxy task is to estimate the position and size of the image patch in a sequence of video frames, given only the target bounding box in the first frame.

Action Classification Action Recognition +1

S2R-DepthNet: Learning a Generalizable Depth-specific Structural Representation

4 code implementations CVPR 2021 Xiaotian Chen, Yuwang Wang, Xuejin Chen, Wenjun Zeng

S2R-DepthNet consists of: a) a Structure Extraction (STE) module which extracts a domain-invariant structural representation from an image by disentangling the image into domain-invariant structure and domain-specific style components, b) a Depth-specific Attention (DSA) module, which learns task-specific knowledge to suppress depth-irrelevant structures for better depth estimation and generalization, and c) a depth prediction module (DP) to predict depth from the depth-specific representation.

Depth Prediction Domain Generalization +2

Disentanglement-based Cross-Domain Feature Augmentation for Effective Unsupervised Domain Adaptive Person Re-identification

no code implementations 25 Mar 2021 Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Quanzeng You, Zicheng Liu, Kecheng Zheng, Zhibo Chen

Each recomposed feature, obtained based on the domain-invariant feature (which enables a reliable inheritance of identity) and an enhancement from a domain specific feature (which enables the approximation of real distributions), is thus an "ideal" augmentation.

Disentanglement Domain Adaptive Person Re-Identification +2

MetaAlign: Coordinating Domain Alignment and Classification for Unsupervised Domain Adaptation

1 code implementation CVPR 2021 Guoqiang Wei, Cuiling Lan, Wenjun Zeng, Zhibo Chen

For unsupervised domain adaptation (UDA), to alleviate the effect of domain shift, many approaches align the source and target domains in the feature space by adversarial learning or by explicitly aligning their statistics.

Classification General Classification +5

Re-energizing Domain Discriminator with Sample Relabeling for Adversarial Domain Adaptation

no code implementations ICCV 2021 Xin Jin, Cuiling Lan, Wenjun Zeng, Zhibo Chen

Many unsupervised domain adaptation (UDA) methods exploit domain adversarial training to align the features to reduce domain gap, where a feature extractor is trained to fool a domain discriminator in order to have aligned feature distributions.

Unsupervised Domain Adaptation

Generalizing to Unseen Domains: A Survey on Domain Generalization

1 code implementation 2 Mar 2021 Jindong Wang, Cuiling Lan, Chang Liu, Yidong Ouyang, Tao Qin, Wang Lu, Yiqiang Chen, Wenjun Zeng, Philip S. Yu

Domain generalization deals with a challenging setting where one or several different but related domain(s) are given, and the goal is to learn a model that can generalize to an unseen test domain.

Domain Generalization Out-of-Distribution Generalization +1

Rethinking Content and Style: Exploring Bias for Unsupervised Disentanglement

1 code implementation 21 Feb 2021 Xuanchi Ren, Tao Yang, Yuwang Wang, Wenjun Zeng

From the unsupervised disentanglement perspective, we rethink content and style and propose a formulation for unsupervised C-S disentanglement based on our assumption that different factors are of different importance and popularity for image reconstruction, which serves as a data bias.

3D Reconstruction Disentanglement +4

Learning Disentangled Representation by Exploiting Pretrained Generative Models: A Contrastive Learning View

2 code implementations ICLR 2022 Xuanchi Ren, Tao Yang, Yuwang Wang, Wenjun Zeng

Based on this observation, we argue that it is possible to mitigate the trade-off by $(i)$ leveraging the pretrained generative models with high generation quality, $(ii)$ focusing on discovering the traversal directions as factors for disentangled representation learning.

Contrastive Learning Disentanglement

Towards Building A Group-based Unsupervised Representation Disentanglement Framework

1 code implementation ICLR 2022 Tao Yang, Xuanchi Ren, Yuwang Wang, Wenjun Zeng, Nanning Zheng

We then propose a model, based on existing VAE-based methods, to tackle the unsupervised learning problem of the framework.

Disentanglement

AttributeNet: Attribute Enhanced Vehicle Re-Identification

no code implementations 7 Feb 2021 Rodolfo Quispe, Cuiling Lan, Wenjun Zeng, Helio Pedrini

Vehicle Re-Identification (V-ReID) is a critical task that associates the same vehicle across images from different camera viewpoints.

Attribute Vehicle Re-Identification

VAE^2: Preventing Posterior Collapse of Variational Video Predictions in the Wild

no code implementations 28 Jan 2021 Yizhou Zhou, Chong Luo, Xiaoyan Sun, Zheng-Jun Zha, Wenjun Zeng

We believe that VAE$^2$ is also applicable to other stochastic sequence prediction problems where training data lack stochasticity.

Video Prediction

Style Normalization and Restitution for Domain Generalization and Adaptation

1 code implementation 3 Jan 2021 Xin Jin, Cuiling Lan, Wenjun Zeng, Zhibo Chen

In this paper, we design a novel Style Normalization and Restitution module (SNR) to simultaneously ensure both high generalization and discrimination capability of the networks.

Disentanglement Domain Generalization +4

Exploiting Sample Uncertainty for Domain Adaptive Person Re-Identification

1 code implementation 16 Dec 2020 Kecheng Zheng, Cuiling Lan, Wenjun Zeng, Zhizheng Zhang, Zheng-Jun Zha

Based on this finding, we propose to exploit the uncertainty (measured by consistency levels) to evaluate the reliability of the pseudo-label of a sample and incorporate the uncertainty to re-weight its contribution within various ReID losses, including the identity (ID) classification loss per sample, the triplet loss, and the contrastive loss.

Clustering Domain Adaptive Person Re-Identification +3

An Empirical Study of the Collapsing Problem in Semi-Supervised 2D Human Pose Estimation

1 code implementation ICCV 2021 Rongchang Xie, Chunyu Wang, Wenjun Zeng, Yizhou Wang

The state-of-the-art methods are consistency-based which learn about unlabeled images by encouraging the model to give consistent predictions for images under different augmentations.

Pose Estimation Semi-Supervised Human Pose Estimation

Re-identification = Retrieval + Verification: Back to Essence and Forward with a New Metric

1 code implementation 23 Nov 2020 Zheng Wang, Xin Yuan, Toshihiko Yamasaki, Yutian Lin, Xin Xu, Wenjun Zeng

In essence, current re-ID overemphasizes the importance of retrieval but underemphasizes that of verification, i.e., all returned images are considered as the target.

Image Retrieval Retrieval

AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in the Wild

2 code implementations 26 Oct 2020 Zhe Zhang, Chunyu Wang, Weichao Qiu, Wenhu Qin, Wenjun Zeng

To make the task truly unconstrained, we present AdaFuse, an adaptive multiview fusion method, which can enhance the features in occluded views by leveraging those in visible views.

3D Human Pose Estimation

Uncertainty-Aware Few-Shot Image Classification

no code implementations 9 Oct 2020 Zhizheng Zhang, Cuiling Lan, Wenjun Zeng, Zhibo Chen, Shih-Fu Chang

In this work, we propose an Uncertainty-Aware Few-Shot framework for image classification by modeling the uncertainty of the similarities of query-support pairs and performing uncertainty-aware optimization.

Classification Few-Shot Image Classification +3

FPCR-Net: Feature Pyramidal Correlation and Residual Reconstruction for Optical Flow Estimation

no code implementations 17 Jan 2020 Xiaolin Song, Yuyang Zhao, Jingyu Yang, Cuiling Lan, Wenjun Zeng

To exploit such flexible and comprehensive information, we propose a semi-supervised Feature Pyramidal Correlation and Residual Reconstruction Network (FPCR-Net) for optical flow estimation from frame pairs.

Optical Flow Estimation
