Search Results for author: Yuge Shi

Found 10 papers, 5 papers with code

Memory Consolidation Enables Long-Context Video Understanding

no code implementations • 8 Feb 2024 • Ivana Balažević, Yuge Shi, Pinelopi Papalampidi, Rahma Chaabouni, Skanda Koppula, Olivier J. Hénaff

Most transformer-based video encoders are limited to short temporal contexts due to their quadratic complexity.

Video Understanding
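
To make the complexity point concrete, here is a minimal sketch of one way to break the quadratic barrier: process the video segment by segment and let each segment attend to a small consolidated memory of past activations, so per-segment attention costs O(s·(s+m)) rather than O(T²) over the full video. This is an illustrative reading of the idea with hypothetical names, not the paper's implementation.

```python
import torch
import torch.nn as nn

class MemoryConsolidatingEncoder(nn.Module):
    """Sketch: stream a long video segment by segment, keeping a
    fixed-size memory of consolidated past activations instead of
    attending over the entire temporal context."""

    def __init__(self, dim=256, heads=4, memory_size=128):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.memory_size = memory_size

    def consolidate(self, memory, activations):
        # Naive consolidation: keep the most recent `memory_size` tokens.
        # (Smarter schemes, e.g. clustering past activations, are possible.)
        merged = torch.cat([memory, activations], dim=1)
        return merged[:, -self.memory_size:]

    def forward(self, segments):  # list of (B, s, dim) tensors
        memory = segments[0][:, :0]  # empty memory, shape (B, 0, dim)
        outputs = []
        for seg in segments:
            # Segment tokens attend to themselves plus the consolidated memory.
            context = torch.cat([memory, seg], dim=1)
            out, _ = self.self_attn(seg, context, context)
            memory = self.consolidate(memory, out.detach())
            outputs.append(out)
        return torch.cat(outputs, dim=1)
```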

Tuning computer vision models with task rewards

1 code implementation • 16 Feb 2023 • André Susano Pinto, Alexander Kolesnikov, Yuge Shi, Lucas Beyer, Xiaohua Zhai

Misalignment between model predictions and intended usage can be detrimental for the deployment of computer vision models.

Colorization • Image Captioning • +5
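
The generic pattern for tuning a model on a non-differentiable task reward is a REINFORCE-style update: sample predictions, score them with the task metric, and reinforce high-reward samples against a baseline. The sketch below shows this for a classification-style output; `model` and `reward_fn` are placeholders, and this is the general recipe rather than the authors' exact setup.

```python
import torch

def reward_tuning_step(model, optimizer, images, targets, reward_fn, n_samples=8):
    """One REINFORCE-style update: increase E[reward] for a model
    that outputs a distribution over discrete predictions."""
    logits = model(images)                        # (B, num_classes)
    dist = torch.distributions.Categorical(logits=logits)
    samples = dist.sample((n_samples,))           # (n_samples, B)
    # Per-sample task rewards, e.g. an accuracy- or mAP-like score.
    rewards = torch.stack([reward_fn(s, targets) for s in samples])
    baseline = rewards.mean(dim=0, keepdim=True)  # variance reduction
    log_probs = dist.log_prob(samples)            # (n_samples, B)
    loss = -((rewards - baseline) * log_probs).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```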

How Robust is Unsupervised Representation Learning to Distribution Shift?

no code implementations • 17 Jun 2022 • Yuge Shi, Imant Daunhawer, Julia E. Vogt, Philip H. S. Torr, Amartya Sanyal

As such, there is a lack of insight into the robustness of representations learned by unsupervised methods, such as self-supervised learning (SSL) and auto-encoder (AE) based algorithms, under distribution shift.

Representation Learning • Self-Supervised Learning
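
A standard protocol for measuring this kind of robustness (a reasonable reading of the setup, though the exact details here are assumptions) is to freeze the pretrained encoder, fit a linear probe on in-distribution data, and compare accuracy on in-distribution versus shifted test sets:

```python
import torch
from sklearn.linear_model import LogisticRegression

@torch.no_grad()
def extract_features(encoder, loader, device="cpu"):
    feats, labels = [], []
    for x, y in loader:
        feats.append(encoder(x.to(device)).cpu())
        labels.append(y)
    return torch.cat(feats).numpy(), torch.cat(labels).numpy()

def robustness_gap(encoder, train_loader, id_test_loader, shifted_test_loader):
    """Linear-probe accuracy in-distribution vs. under distribution shift."""
    Xtr, ytr = extract_features(encoder, train_loader)
    probe = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
    acc_id = probe.score(*extract_features(encoder, id_test_loader))
    acc_shift = probe.score(*extract_features(encoder, shifted_test_loader))
    return acc_id, acc_shift, acc_id - acc_shift
```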

Adversarial Masking for Self-Supervised Learning

1 code implementation • 31 Jan 2022 • Yuge Shi, N. Siddharth, Philip H. S. Torr, Adam R. Kosiorek

We propose ADIOS, a masked image model (MIM) framework for self-supervised learning, which simultaneously learns a masking function and an image encoder using an adversarial objective.

Representation Learning • Self-Supervised Learning • +1
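
The adversarial objective can be sketched as a min-max game: the encoder tries to make representations of masked and unmasked views agree, while the masking network is updated in the opposite direction to produce maximally challenging occlusions. Below is a minimal single-mask sketch, with hypothetical module names and the SSL loss simplified to a negative cosine similarity.

```python
import torch
import torch.nn.functional as F

def adios_style_step(encoder, masker, enc_opt, mask_opt, images):
    """One adversarial round: the encoder minimizes the SSL loss,
    the masking network maximizes it."""
    def ssl_loss():
        mask = torch.sigmoid(masker(images))       # (B, 1, H, W) soft mask
        z_full = encoder(images)
        z_masked = encoder(images * (1 - mask))
        return -F.cosine_similarity(z_full, z_masked, dim=-1).mean()

    # Encoder step: make masked and unmasked views agree.
    enc_opt.zero_grad()
    ssl_loss().backward()
    enc_opt.step()

    # Masker step: ascend the same loss to find harder occlusions.
    mask_opt.zero_grad()
    (-ssl_loss()).backward()
    mask_opt.step()
```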

Learning Multimodal VAEs through Mutual Supervision

1 code implementation • ICLR 2022 • Tom Joy, Yuge Shi, Philip H. S. Torr, Tom Rainforth, Sebastian M. Schmon, N. Siddharth

Here we introduce a novel alternative, MEME, that avoids such explicit combinations by repurposing semi-supervised VAEs to combine information between modalities implicitly through mutual supervision.
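
The mutual-supervision idea is that each modality's VAE treats the other modality as its "label", in the spirit of semi-supervised VAEs, so information flows between modalities without an explicit joint posterior. The sketch below is an illustrative simplification of that idea with toy Gaussian modules, not the paper's objective.

```python
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    """Toy per-modality VAE used only to illustrate the loss below."""
    def __init__(self, dim, z_dim=16):
        super().__init__()
        self.enc = nn.Linear(dim, 2 * z_dim)
        self.dec = nn.Linear(z_dim, dim)

    def encode(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1 - logvar).sum(-1)
        return z, kl

    def log_prob(self, x, z):  # unit-variance Gaussian likelihood
        return -0.5 * (x - self.dec(z)).pow(2).sum(-1)

def mutual_supervision_loss(vae_x, vae_y, x, y):
    zx, kl_x = vae_x.encode(x)
    zy, kl_y = vae_y.encode(y)
    within = vae_x.log_prob(x, zx) + vae_y.log_prob(y, zy)
    # Mutual supervision: each modality's latent must explain the other.
    cross = vae_y.log_prob(y, zx) + vae_x.log_prob(x, zy)
    return -(within + cross - kl_x - kl_y).mean()
```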

Gradient Matching for Domain Generalization

2 code implementations • ICLR 2022 • Yuge Shi, Jeffrey Seely, Philip H. S. Torr, N. Siddharth, Awni Hannun, Nicolas Usunier, Gabriel Synnaeve

We perform experiments both on the Wilds benchmark, which captures distribution shift in the real world, and on datasets from the DomainBed benchmark, which focuses more on synthetic-to-real transfer.

Domain Generalization
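
The gradient-matching objective is optimized with a first-order scheme: clone the model, take sequential SGD steps on minibatches from different domains, then move the original weights toward the clone, which implicitly rewards directions where per-domain gradients have positive inner products. A sketch of that loop, assuming generic `model` and per-domain loader objects:

```python
import copy
import torch

def gradient_matching_step(model, domain_loaders, loss_fn,
                           inner_lr=0.01, meta_lr=0.5):
    """First-order inter-domain gradient matching (Fish-style update)."""
    clone = copy.deepcopy(model)
    inner_opt = torch.optim.SGD(clone.parameters(), lr=inner_lr)
    for loader in domain_loaders:              # one minibatch per domain
        x, y = next(iter(loader))
        inner_opt.zero_grad()
        loss_fn(clone(x), y).backward()
        inner_opt.step()
    # Meta update: theta <- theta + meta_lr * (theta_clone - theta)
    with torch.no_grad():
        for p, p_clone in zip(model.parameters(), clone.parameters()):
            p.add_(meta_lr * (p_clone - p))
```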

Relating by Contrasting: A Data-efficient Framework for Multimodal Generative Models

no code implementations • ICLR 2021 • Yuge Shi, Brooks Paige, Philip H. S. Torr, N. Siddharth

Multimodal learning for generative models often refers to the learning of abstract concepts from the commonality of information in multiple modalities, such as vision and language.

Action Anticipation with RBF Kernelized Feature Mapping RNN

no code implementations • ECCV 2018 • Yuge Shi, Basura Fernando, Richard Hartley

We introduce a novel Recurrent Neural Network-based algorithm for future video feature generation and action anticipation called feature mapping RNN.

Action Anticipation
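
The core idea, an RNN that maps features of observed frames to the feature of a future frame through an RBF-kernelized mapping, with a classifier anticipating the action from the predicted feature, might be sketched as below. The architectural details here are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class RBFFeatureMappingRNN(nn.Module):
    """Sketch: RBF-kernelize observed frame features, run them through
    a GRU, generate a future feature, and classify it to anticipate
    the upcoming action."""
    def __init__(self, feat_dim, n_centers=64, hidden=256, n_classes=20):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(n_centers, feat_dim))
        self.log_gamma = nn.Parameter(torch.zeros(()))
        self.rnn = nn.GRU(n_centers, hidden, batch_first=True)
        self.to_feat = nn.Linear(hidden, feat_dim)   # future feature generator
        self.classifier = nn.Linear(feat_dim, n_classes)

    def rbf(self, x):  # (B, T, D) -> (B, T, n_centers)
        d2 = torch.cdist(x, self.centers.expand(x.size(0), -1, -1)).pow(2)
        return torch.exp(-self.log_gamma.exp() * d2)

    def forward(self, feats):           # observed frame features (B, T, D)
        h, _ = self.rnn(self.rbf(feats))
        future = self.to_feat(h[:, -1])  # predicted future frame feature
        return future, self.classifier(future)
```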

Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models

3 code implementations • NeurIPS 2019 • Yuge Shi, N. Siddharth, Brooks Paige, Philip H. S. Torr

In this work, we characterise successful learning of such models as the fulfillment of four criteria: i) implicit latent decomposition into shared and private subspaces, ii) coherent joint generation over all modalities, iii) coherent cross-generation across individual modalities, and iv) improved model learning for individual modalities through multi-modal integration.
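
The model's joint variational posterior is a mixture of the unimodal posteriors, q(z | x_1, ..., x_M) = (1/M) Σ_m q(z | x_m), which is what enables both coherent joint generation and cross-generation. A minimal sketch of sampling from that mixture-of-experts posterior, with the unimodal Gaussian encoders left as placeholders:

```python
import torch

def sample_moe_posterior(encoders, modalities):
    """Mixture-of-experts posterior: pick a modality uniformly, then
    sample z from that modality's Gaussian posterior q(z | x_m)."""
    B = modalities[0].size(0)
    # Unimodal posterior parameters, stacked over modalities: (M, B, z_dim)
    mus, logvars = zip(*[enc(x) for enc, x in zip(encoders, modalities)])
    mu, logvar = torch.stack(mus), torch.stack(logvars)
    m = torch.randint(len(encoders), (B,))   # expert chosen per example
    idx = (m, torch.arange(B))
    return mu[idx] + torch.randn_like(mu[idx]) * (0.5 * logvar[idx]).exp()
```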
