Search Results for author: Guanzhi Wang

Found 7 papers, 7 papers with code

RubiksNet: Learnable 3D-Shift for Efficient Video Action Recognition

1 code implementation • ECCV 2020 • Linxi Fan, Shyamal Buch, Guanzhi Wang, Ryan Cao, Yuke Zhu, Juan Carlos Niebles, Li Fei-Fei

We analyze the suitability of our new primitive for video action recognition and explore several novel variations of our approach to enable stronger representational flexibility while maintaining an efficient design.

Action Recognition Temporal Action Localization +1
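The shift primitive behind RubiksNet can be illustrated with a minimal integer shift along the time axis. This is a simplified stand-in only: the paper's 3D-shift is learnable and operates along all three spatiotemporal axes, while the function and argument names below are purely illustrative.

```python
def temporal_shift(frames, shifts):
    """Shift each channel along the time axis by an integer offset.

    frames: list over time of per-frame channel values, shape [T][C]
    shifts: one integer offset per channel (a toy stand-in for the
            learned, fractional 3D shifts in RubiksNet)
    Positions vacated by the shift are zero-filled.
    """
    T = len(frames)
    C = len(shifts)
    out = [[0.0] * C for _ in range(T)]
    for c, s in enumerate(shifts):
        for t in range(T):
            src = t - s  # a positive shift moves information forward in time
            if 0 <= src < T:
                out[t][c] = frames[src][c]
    return out
```

The appeal of shift-based primitives is that they mix temporal information with no multiplications at all; the heavy lifting is left to cheap pointwise convolutions that follow the shift.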

VIMA: General Robot Manipulation with Multimodal Prompts

2 code implementations • 6 Oct 2022 • Yunfan Jiang, Agrim Gupta, Zichen Zhang, Guanzhi Wang, Yongqiang Dou, Yanjun Chen, Li Fei-Fei, Anima Anandkumar, Yuke Zhu, Linxi Fan

This work shows that we can express a wide spectrum of robot manipulation tasks with multimodal prompts, interleaving textual and visual tokens.

Imitation Learning Language Modelling +2
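The interleaving of textual and visual tokens described above can be sketched as flattening a sequence of mixed segments into one token stream. This is a toy illustration of the prompt format, not VIMA's tokenizer; the segment kinds and token spellings are assumptions.

```python
def build_multimodal_prompt(segments):
    """Flatten interleaved text/image segments into one token stream.

    segments: list of ("text", str) or ("image", str identifier) pairs.
    Text is word-split; each image becomes a single placeholder token.
    """
    tokens = []
    for kind, payload in segments:
        if kind == "text":
            tokens.extend(payload.split())
        elif kind == "image":
            tokens.append(f"<img:{payload}>")  # hypothetical image-token spelling
        else:
            raise ValueError(f"unknown segment kind: {kind}")
    return tokens
```

A task such as "put this object into that container" then becomes a single prompt in which the demonstrative words are replaced by image tokens, so one sequence model can consume instructions, goal images, and demonstrations uniformly.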

SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies

1 code implementation • 17 Jun 2021 • Linxi Fan, Guanzhi Wang, De-An Huang, Zhiding Yu, Li Fei-Fei, Yuke Zhu, Anima Anandkumar

A student network then learns to mimic the expert policy by supervised learning with strong augmentations, making its representation more robust against visual variations compared to the expert.

Autonomous Driving Image Augmentation +2
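The student-cloning step described above can be sketched as supervised regression: the student sees a strongly augmented observation but is trained to reproduce the expert's action on the clean one. This toy uses single-weight linear policies and additive noise as the "augmentation"; all names and the squared-error objective are illustrative, not SECANT's actual architecture.

```python
import random

def secant_student_step(obs_batch, expert_w, student_w, lr=0.1, noise=0.5):
    """One toy behavior-cloning step in the SECANT spirit.

    expert_w, student_w: weights of single-parameter linear policies.
    noise: strength of the additive "augmentation" applied only to the
           student's input, so the student must become robust to it.
    Returns the updated student weight after one gradient step on the
    mean squared error between student and expert actions.
    """
    grad = 0.0
    for obs in obs_batch:
        target = expert_w * obs                      # expert acts on the clean view
        aug = obs + random.uniform(-noise, noise)    # student sees the augmented view
        pred = student_w * aug
        grad += 2.0 * (pred - target) * aug          # d/dw of (pred - target)^2
    return student_w - lr * grad / len(obs_batch)
```

The asymmetry is the point: because only the student's inputs are perturbed, its representation is pushed to be invariant to the perturbations while its outputs stay anchored to the expert.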

Deep Video Matting via Spatio-Temporal Alignment and Aggregation

1 code implementation • CVPR 2021 • Yanan Sun, Guanzhi Wang, Qiao Gu, Chi-Keung Tang, Yu-Wing Tai

Despite the significant progress made by deep learning in natural image matting, there has so far been no representative work on deep learning for video matting, due to the inherent technical challenges of reasoning in the temporal domain and the lack of large-scale video matting datasets.

Image Matting Optical Flow Estimation +1

LADN: Local Adversarial Disentangling Network for Facial Makeup and De-Makeup

1 code implementation • ICCV 2019 • Qiao Gu, Guanzhi Wang, Mang Tik Chiu, Yu-Wing Tai, Chi-Keung Tang

Central to our method are multiple overlapping local adversarial discriminators in a content-style disentangling network, which achieve local detail transfer between facial images and use asymmetric loss functions to handle dramatic makeup styles with high-frequency details.

Style Transfer
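The overlapping local discriminators above each judge one local window of the image. A minimal sketch of extracting such overlapping windows from a 2-D grid follows; the patch and stride values are illustrative, and LADN in fact places its discriminators over facial regions rather than a uniform grid.

```python
def local_patches(image, patch, stride):
    """Extract overlapping square windows from a 2-D grid.

    image: list of equal-length rows (a toy single-channel image).
    patch: side length of each window; stride < patch makes them overlap,
           so neighbouring "local discriminators" share pixels.
    Returns the list of windows in row-major order.
    """
    h, w = len(image), len(image[0])
    out = []
    for i in range(0, h - patch + 1, stride):
        for j in range(0, w - patch + 1, stride):
            out.append([row[j:j + patch] for row in image[i:i + patch]])
    return out
```

Overlap matters here: adjacent discriminators share pixels, so no local detail falls between windows and transfer artifacts at window borders are penalised by more than one critic.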
