Search Results for author: Zedong Wang

Found 8 papers, 7 papers with code

LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory

no code implementations · 17 Apr 2024 · Zicheng Liu, Li Wang, Siyuan Li, Zedong Wang, Haitao Lin, Stan Z. Li

Transformer models have been successful in various sequence processing tasks, but the computational cost of the self-attention mechanism limits their practicality for long sequences.

Computational Efficiency · Language Modelling · +1
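The quadratic cost the abstract refers to comes from the L × L attention score matrix that vanilla self-attention materializes. A minimal NumPy sketch of single-head scaled dot-product attention (illustrative only, not the LongVQ method):

```python
import numpy as np

def attention(q, k, v):
    """Vanilla scaled dot-product attention for one head.

    The (L, L) score matrix is what makes the cost quadratic
    in sequence length L.
    """
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                   # shape (L, L)
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v                              # shape (L, d)

L, d = 512, 64
x = np.random.default_rng(0).standard_normal((L, d))
out = attention(x, x, x)  # self-attention: Q = K = V = x
```

Doubling L quadruples the size of the score matrix, which is the bottleneck that long-sequence models such as LongVQ aim to avoid.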

Switch EMA: A Free Lunch for Better Flatness and Sharpness

2 code implementations · 14 Feb 2024 · Siyuan Li, Zicheng Liu, Juanxi Tian, Ge Wang, Zedong Wang, Weiyang Jin, Di Wu, Cheng Tan, Tao Lin, Yang Liu, Baigui Sun, Stan Z. Li

Exponential Moving Average (EMA) is a widely used weight averaging (WA) regularization that learns flat optima for better generalization without extra cost in deep neural network (DNN) optimization.

Attribute · Image Classification · +7
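The EMA update the abstract describes keeps a running average of the online weights after each optimizer step; the "switch" in the title suggests periodically copying the averaged weights back into the model (my reading of the title, not stated in the snippet). A minimal NumPy sketch of the basic EMA step:

```python
import numpy as np

def ema_update(ema_weights, weights, decay=0.999):
    """One EMA step per tensor: ema <- decay * ema + (1 - decay) * w."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]

# Toy run: the EMA trails the raw weights, smoothing the noisy trajectory.
rng = np.random.default_rng(0)
weights = [np.zeros(3)]
ema = [np.zeros(3)]
for _ in range(200):
    # pretend an optimizer step moved the weights (drift + noise)
    weights = [w + 0.1 + 0.05 * rng.standard_normal(3) for w in weights]
    ema = ema_update(ema, weights, decay=0.9)
```

Because the averaged weights lag the raw trajectory, EMA acts as a free smoother: it needs no extra forward or backward passes, only one fused multiply-add per parameter tensor.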

Masked Modeling for Self-supervised Representation Learning on Vision and Beyond

1 code implementation · 31 Dec 2023 · Siyuan Li, Luyuan Zhang, Zedong Wang, Di Wu, Lirong Wu, Zicheng Liu, Jun Xia, Cheng Tan, Yang Liu, Baigui Sun, Stan Z. Li

As the deep learning revolution marches on, self-supervised learning has garnered increasing attention in recent years thanks to its remarkable representation learning ability and its low dependence on labeled data.

Representation Learning · Self-Supervised Learning
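Masked modeling, as surveyed here, hides a large fraction of the input (e.g. image patches) and trains the model to reconstruct the hidden part from the visible part. A minimal sketch of the random masking step (illustrative, not code from the survey's repository):

```python
import numpy as np

def random_mask(num_tokens, mask_ratio=0.75, rng=None):
    """Split token indices into visible and masked sets for masked modeling.

    The model sees only the visible tokens and is trained to
    reconstruct the masked ones.
    """
    rng = rng or np.random.default_rng(0)
    perm = rng.permutation(num_tokens)
    n_mask = int(num_tokens * mask_ratio)
    return perm[n_mask:], perm[:n_mask]  # (visible indices, masked indices)

# e.g. a 224x224 image split into 16x16 patches gives 14 * 14 = 196 tokens
visible, masked = random_mask(196, mask_ratio=0.75)
```

High mask ratios (such as the 0.75 used here) make the reconstruction task hard enough that the encoder must learn semantic structure rather than interpolate locally.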

OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning

2 code implementations · NeurIPS 2023 · Cheng Tan, Siyuan Li, Zhangyang Gao, Wenfei Guan, Zedong Wang, Zicheng Liu, Lirong Wu, Stan Z. Li

Spatio-temporal predictive learning is a learning paradigm that enables models to learn spatial and temporal patterns by predicting future frames from given past frames in an unsupervised manner.

Weather Forecasting

MogaNet: Multi-order Gated Aggregation Network

6 code implementations · 7 Nov 2022 · Siyuan Li, Zedong Wang, Zicheng Liu, Cheng Tan, Haitao Lin, Di Wu, ZhiYuan Chen, Jiangbin Zheng, Stan Z. Li

Notably, MogaNet hits 80.0% and 87.8% accuracy with 5.2M and 181M parameters on ImageNet-1K, outperforming ParC-Net and ConvNeXt-L while saving 59% FLOPs and 17M parameters, respectively.

3D Human Pose Estimation · Image Classification · +6

OpenMixup: A Comprehensive Mixup Benchmark for Visual Classification

1 code implementation · 11 Sep 2022 · Siyuan Li, Zedong Wang, Zicheng Liu, Di Wu, Cheng Tan, Weiyang Jin, Stan Z. Li

Data mixing, or mixup, is a data-dependent augmentation technique that has greatly enhanced the generalizability of modern deep neural networks.

Benchmarking · Classification · +3
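The data mixing the abstract describes takes a convex combination of two inputs and of their labels. A minimal NumPy sketch of the basic mixup operation (not the OpenMixup API; names here are illustrative):

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=1.0, rng=None):
    """Convexly combine two samples and their one-hot labels.

    lam ~ Beta(alpha, alpha) controls how strongly the pair is mixed.
    """
    rng = rng or np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y, lam

x_a = np.ones((4, 4))       # stand-in for an image
x_b = np.zeros((4, 4))
y_a = np.array([1.0, 0.0])  # one-hot labels, 2 classes
y_b = np.array([0.0, 1.0])
x_mix, y_mix, lam = mixup(x_a, y_a, x_b, y_b)
```

Training on such interpolated pairs encourages linear behavior between classes, which is the regularization effect the benchmark's many mixup variants build on.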
