no code implementations • 4 Mar 2024 • Yu Huang, Zixin Wen, Yuejie Chi, Yingbin Liang
Masked reconstruction, which predicts randomly masked patches from unmasked ones, has emerged as an important approach in self-supervised pretraining.
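To make the setup concrete, here is a minimal sketch of a masked-reconstruction training objective. The `encoder`, `decoder`, and mask ratio below are illustrative placeholders of my own, not the architecture or hyperparameters analyzed in the paper; the loss is measured only on the masked positions, which is the defining feature of this objective.

```python
import torch

# Minimal sketch of masked patch reconstruction (illustrative placeholders,
# not the paper's model). An encoder sees only the unmasked patches; a
# decoder predicts the masked ones, and the loss is computed on the masked
# positions only.
def masked_reconstruction_loss(patches, encoder, decoder, mask_ratio=0.75):
    # patches: (batch, num_patches, patch_dim)
    B, N, D = patches.shape
    num_masked = int(mask_ratio * N)

    # Randomly choose which patches to mask in each example.
    perm = torch.rand(B, N).argsort(dim=1)
    masked_idx, visible_idx = perm[:, :num_masked], perm[:, num_masked:]

    batch = torch.arange(B).unsqueeze(1)
    visible = patches[batch, visible_idx]   # encoder input: unmasked patches
    target = patches[batch, masked_idx]     # reconstruction target: masked patches

    latent = encoder(visible)               # encode the visible patches
    pred = decoder(latent, masked_idx)      # predict the masked patches
    return ((pred - target) ** 2).mean()    # MSE on masked positions only
```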
1 code implementation • 1 Mar 2024 • Ruiqian Nai, Zixin Wen, Ji Li, Yuanzhi Li, Yang Gao
This paper further investigates the necessity of disentangled representations in downstream applications.
1 code implementation • 7 Feb 2023 • Michael Santacroce, Zixin Wen, Yelong Shen, Yuanzhi Li
Auto-regressive large language models such as GPT-3 require enormous computational resources to use.
no code implementations • 12 May 2022 • Zixin Wen, Yuanzhi Li
The substitution effect occurs when learning the stronger features in some neurons substitutes for learning those same features in other neurons, via updates to the prediction head.
no code implementations • 21 Jun 2021 • Chenzhuang Du, Tingle Li, Yichen Liu, Zixin Wen, Tianyu Hua, Yue Wang, Hang Zhao
We name this problem Modality Failure, and hypothesize that the imbalance between modalities and the implicit bias of common objectives in fusion methods prevent the encoder of each modality from learning its features sufficiently.
Ranked #64 on Semantic Segmentation on NYU Depth v2
no code implementations • 31 May 2021 • Zixin Wen, Yuanzhi Li
We present an underlying principle called $\textbf{feature decoupling}$ to explain the effects of augmentations: we theoretically characterize how augmentations reduce the correlations of dense features between positive samples while keeping the correlations of sparse features intact, thereby forcing the neural networks to learn from the self-supervision of sparse features.
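A toy numerical illustration of the decoupling idea (my own construction, not the paper's data model): two positive views share a sparse signal but carry independent dense perturbations, so between the views the dense-feature correlation is near zero while the sparse-feature correlation stays high.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 512, 2000

# Shared sparse signal: a few large coordinates per sample.
sparse = np.zeros((n, d))
support = rng.integers(0, d, size=(n, 5))
np.put_along_axis(sparse, support, 5.0, axis=1)

def view():
    # Each "augmented view" keeps the sparse signal but draws an
    # independent dense perturbation (mimicking decoupled dense features).
    return sparse + rng.normal(size=(n, d))

v1, v2 = view(), view()

def mean_corr(a, b):
    # Average per-sample Pearson correlation between paired views.
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    return ((a * b).sum(1) /
            (np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))).mean()

on_support = sparse != 0
print("sparse-feature correlation:", mean_corr(v1 * on_support, v2 * on_support))  # high
print("dense-feature correlation:", mean_corr(v1 - sparse, v2 - sparse))           # ~0
```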
no code implementations • 17 Feb 2020 • Zixin Wen
Unsupervised contrastive learning has gained increasing attention in recent research and has proven to be a powerful method for learning representations from unlabeled data.