Search Results for author: Zhiyu Zhao

Found 6 papers, 4 papers with code

Learning Macroeconomic Policies based on Microfoundations: A Stackelberg Mean Field Game Approach

no code implementations · 14 Mar 2024 · Qirui Mi, Zhiyu Zhao, Siyu Xia, Yan Song, Jun Wang, Haifeng Zhang

Effective macroeconomic policies play a crucial role in promoting economic growth and social stability.

Asymmetric Masked Distillation for Pre-Training Small Foundation Models

no code implementations · 6 Nov 2023 · Zhiyu Zhao, Bingkun Huang, Sen Xing, Gangshan Wu, Yu Qiao, LiMin Wang

AMD achieves 73.3% classification accuracy using the ViT-B model on the Something-Something V2 dataset, a 3.7% improvement over the original ViT-B model from VideoMAE.

Action Classification, Action Recognition, +3

MGMAE: Motion Guided Masking for Video Masked Autoencoding

1 code implementation · ICCV 2023 · Bingkun Huang, Zhiyu Zhao, Guozhen Zhang, Yu Qiao, LiMin Wang

Based on this masking volume, we can track the unmasked tokens in time and sample a set of temporally consistent cubes from videos.

Optical Flow Estimation, Representation Learning

VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking

1 code implementation · CVPR 2023 · LiMin Wang, Bingkun Huang, Zhiyu Zhao, Zhan Tong, Yinan He, Yi Wang, Yali Wang, Yu Qiao

Finally, we successfully train a video ViT model with a billion parameters, which achieves a new state-of-the-art performance on the datasets of Kinetics (90.0% on K400 and 89.9% on K600) and Something-Something (68.7% on V1 and 77.0% on V2).

 Ranked #1 on Self-Supervised Action Recognition on UCF101 (using extra training data)

Action Classification, Action Recognition In Videos, +3

InternVideo: General Video Foundation Models via Generative and Discriminative Learning

1 code implementation · 6 Dec 2022 · Yi Wang, Kunchang Li, Yizhuo Li, Yinan He, Bingkun Huang, Zhiyu Zhao, Hongjie Zhang, Jilan Xu, Yi Liu, Zun Wang, Sen Xing, Guo Chen, Junting Pan, Jiashuo Yu, Yali Wang, LiMin Wang, Yu Qiao

Specifically, InternVideo efficiently explores masked video modeling and video-language contrastive learning as the pretraining objectives, and selectively coordinates video representations of these two complementary frameworks in a learnable manner to boost various video applications.

 Ranked #1 on Action Recognition on Something-Something V1 (using extra training data)

Action Classification, Contrastive Learning, +8
