Search Results for author: Mengyu Yang

Found 7 papers, 3 papers with code

AdaViPro: Region-based Adaptive Visual Prompt for Large-Scale Models Adapting

no code implementations • 20 Mar 2024 • Mengyu Yang, Ye Tian, Lanshan Zhang, Xiao Liang, Xuming Ran, Wendong Wang

Recently, prompt-based methods have emerged as a new alternative "parameter-efficient fine-tuning" paradigm, which only fine-tunes a small number of additional parameters while keeping the original model frozen.

Decision Making

View while Moving: Efficient Video Recognition in Long-untrimmed Videos

no code implementations • 9 Aug 2023 • Ye Tian, Mengyu Yang, Lanshan Zhang, Zhizhen Zhang, Yang Liu, Xiaohui Xie, Xirong Que, Wendong Wang

To this end, inspired by human cognition, we propose a novel recognition paradigm of "View while Moving" for efficient long-untrimmed video recognition.

Video Recognition

Improving Social Media Popularity Prediction with Multiple Post Dependencies

no code implementations • 28 Jul 2023 • Zhizhen Zhang, Xiaohui Xie, Mengyu Yang, Ye Tian, Yong Jiang, Yong Cui

Social Media Popularity Prediction has drawn a lot of attention because of its profound impact on many different applications, such as recommendation systems and multimedia advertising.

Recommendation Systems, Social Media Popularity Prediction

TriBERT: Human-centric Audio-visual Representation Learning

1 code implementation • NeurIPS 2021 • Tanzila Rahman, Mengyu Yang, Leonid Sigal

In this work, we introduce TriBERT -- a transformer-based architecture, inspired by ViLBERT, which enables contextual feature learning across three modalities: vision, pose, and audio, with the use of flexible co-attention.

Pose Retrieval Representation Learning +1

TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation

1 code implementation • 26 Oct 2021 • Tanzila Rahman, Mengyu Yang, Leonid Sigal

In this work, we introduce TriBERT -- a transformer-based architecture, inspired by ViLBERT, which enables contextual feature learning across three modalities: vision, pose, and audio, with the use of flexible co-attention.

Pose Retrieval Representation Learning +1

Mask-Guided Discovery of Semantic Manifolds in Generative Models

1 code implementation • 15 May 2021 • Mengyu Yang, David Rokeby, Xavier Snelgrove

Advances in the realm of Generative Adversarial Networks (GANs) have led to architectures capable of producing amazingly realistic images such as StyleGAN2, which, when trained on the FFHQ dataset, generates images of human faces from random vectors in a lower-dimensional latent space.

Disentanglement
