Search Results for author: Zhennan Wang

Found 12 papers, 6 papers with code

Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment

4 code implementations • 20 May 2023 • Peng Jin, Hao Li, Zesen Cheng, Jinfa Huang, Zhennan Wang, Li Yuan, Chang Liu, Jie Chen

In this paper, we propose the Disentangled Conceptualization and Set-to-set Alignment (DiCoSA) to simulate the conceptualizing and reasoning process of human beings.

Retrieval Video Retrieval

TG-VQA: Ternary Game of Video Question Answering

no code implementations • 17 May 2023 • Hao Li, Peng Jin, Zesen Cheng, Songyang Zhang, Kai Chen, Zhennan Wang, Chang Liu, Jie Chen

Video question answering aims at answering a question about the video content by reasoning about the alignment semantics within them.

Contrastive Learning Question Answering +2

Multi-granularity Interaction Simulation for Unsupervised Interactive Segmentation

no code implementations • ICCV 2023 • Kehan Li, Yian Zhao, Zhennan Wang, Zesen Cheng, Peng Jin, Xiangyang Ji, Li Yuan, Chang Liu, Jie Chen

Interactive segmentation enables users to segment as needed by providing cues of objects, which introduces human-computer interaction for many fields, such as image editing and medical image analysis.

Interactive Segmentation

LaPE: Layer-adaptive Position Embedding for Vision Transformers with Independent Layer Normalization

1 code implementation • ICCV 2023 • Runyi Yu, Zhennan Wang, Yinhuai Wang, Kehan Li, Chang Liu, Haoyi Duan, Xiangyang Ji, Jie Chen

A typical way to introduce position information is to add the absolute Position Embedding (PE) to the patch embedding before it enters Vision Transformers (VTs).

Image Classification Object Detection +3

Position Embedding Needs an Independent Layer Normalization

1 code implementation • 10 Dec 2022 • Runyi Yu, Zhennan Wang, Yinhuai Wang, Kehan Li, Yian Zhao, Jian Zhang, Guoli Song, Jie Chen

By analyzing the input and output of each encoder layer in VTs using reparameterization and visualization, we find that the default PE joining method (simply adding the PE and patch embedding together) applies the same affine transformation to the token embedding and the PE, which limits the expressiveness of the PE and hence constrains the performance of VTs.

Position

Fuzzy Positive Learning for Semi-supervised Semantic Segmentation

no code implementations • CVPR 2023 • Pengchong Qiao, Zhidan Wei, Yu Wang, Zhennan Wang, Guoli Song, Fan Xu, Xiangyang Ji, Chang Liu, Jie Chen

Semi-supervised learning (SSL) essentially pursues class boundary exploration with less dependence on human annotations.

Semi-Supervised Semantic Segmentation

ACSeg: Adaptive Conceptualization for Unsupervised Semantic Segmentation

no code implementations • CVPR 2023 • Kehan Li, Zhennan Wang, Zesen Cheng, Runyi Yu, Yian Zhao, Guoli Song, Chang Liu, Li Yuan, Jie Chen

Recently, self-supervised large-scale visual pre-training models have shown great promise in representing pixel-level semantic relationships, significantly promoting the development of unsupervised dense prediction tasks, e.g., unsupervised semantic segmentation (USS).

Image Segmentation Unsupervised Semantic Segmentation

Locality Guidance for Improving Vision Transformers on Tiny Datasets

1 code implementation • 20 Jul 2022 • Kehan Li, Runyi Yu, Zhennan Wang, Li Yuan, Guoli Song, Jie Chen

Our locality guidance approach is simple and efficient, and can serve as a basic performance enhancement method for VTs on tiny datasets.

$L_2$BN: Enhancing Batch Normalization by Equalizing the $L_2$ Norms of Features

no code implementations • 6 Jul 2022 • Zhennan Wang, Kehan Li, Runyi Yu, Yian Zhao, Pengchong Qiao, Chang Liu, Fan Xu, Xiangyang Ji, Guoli Song, Jie Chen

In this paper, we analyze batch normalization from the perspective of discriminability and identify a disadvantage overlooked by previous studies: differences in the $l_2$ norms of sample features can hinder batch normalization from obtaining more distinguished inter-class features and more compact intra-class features.

Acoustic Scene Classification Image Classification +1

DPR-CAE: Capsule Autoencoder with Dynamic Part Representation for Image Parsing

no code implementations • 30 Apr 2021 • Canqun Xiang, Zhennan Wang, Wenbin Zou, Chen Xu

Parsing an image into a hierarchy of objects, parts, and relations is important and also challenging in many computer vision tasks.

Decoder Translation

MMA Regularization: Decorrelating Weights of Neural Networks by Maximizing the Minimal Angles

1 code implementation • NeurIPS 2020 • Zhennan Wang, Canqun Xiang, Wenbin Zou, Chen Xu

Extensive experiments demonstrate that MMA regularization is able to enhance the generalization ability of various modern models and achieves considerable performance improvements on CIFAR100 and TinyImageNet datasets.

Face Verification

PR Product: A Substitute for Inner Product in Neural Networks

1 code implementation • ICCV 2019 • Zhennan Wang, Wenbin Zou, Chen Xu

In this paper, we analyze the inner product of weight vector $w$ and data vector $x$ in neural networks from the perspective of vector orthogonal decomposition and prove that the direction gradient of $w$ decreases as the angle between them approaches 0 or $\pi$.

General Classification Image Captioning +1
