no code implementations • 23 Feb 2024 • Jake Bruce, Michael Dennis, Ashley Edwards, Jack Parker-Holder, Yuge Shi, Edward Hughes, Matthew Lai, Aditi Mavalankar, Richie Steigerwald, Chris Apps, Yusuf Aytar, Sarah Bechtle, Feryal Behbahani, Stephanie Chan, Nicolas Heess, Lucy Gonzalez, Simon Osindero, Sherjil Ozair, Scott Reed, Jingwei Zhang, Konrad Zolna, Jeff Clune, Nando de Freitas, Satinder Singh, Tim Rocktäschel
We introduce Genie, the first generative interactive environment trained in an unsupervised manner from unlabelled Internet videos.
no code implementations • 8 Feb 2024 • Ivana Balažević, Yuge Shi, Pinelopi Papalampidi, Rahma Chaabouni, Skanda Koppula, Olivier J. Hénaff
Most transformer-based video encoders are limited to short temporal contexts due to their quadratic complexity.
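The quadratic blow-up this snippet refers to comes from self-attention forming a T×T score matrix over a sequence of T tokens (here, video frames). A minimal illustration with assumed toy shapes — not the paper's model — showing the cost growing with the square of the temporal context:

```python
import numpy as np

def attention_scores(tokens):
    # Naive self-attention scores: every token attends to every other token,
    # so memory and compute for the score matrix grow as T^2 in length T.
    # (Learned query/key projections are omitted for brevity.)
    q = tokens
    k = tokens
    return q @ k.T / np.sqrt(tokens.shape[1])

short = attention_scores(np.ones((16, 8)))    # 16 frames  -> 16x16 score matrix
long_ = attention_scores(np.ones((256, 8)))   # 256 frames -> 256x256 score matrix
```

A 16× longer context costs 256× more score entries, which is why long-video encoders need sub-quadratic alternatives.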
1 code implementation • 16 Feb 2023 • André Susano Pinto, Alexander Kolesnikov, Yuge Shi, Lucas Beyer, Xiaohua Zhai
Misalignment between model predictions and intended usage can be detrimental for the deployment of computer vision models.
no code implementations • 17 Jun 2022 • Yuge Shi, Imant Daunhawer, Julia E. Vogt, Philip H. S. Torr, Amartya Sanyal
As such, there is a lack of insight into the robustness to distribution shift of representations learned by unsupervised methods, such as self-supervised learning (SSL) and auto-encoder (AE) based algorithms.
1 code implementation • 31 Jan 2022 • Yuge Shi, N. Siddharth, Philip H. S. Torr, Adam R. Kosiorek
We propose ADIOS, a masked image model (MIM) framework for self-supervised learning, which simultaneously learns a masking function and an image encoder using an adversarial objective.
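The adversarial objective described here pits a learned masking function against the encoder: the masker tries to occlude the parts of the image the encoder relies on, while the encoder tries to stay invariant to the occlusion. A toy numpy sketch of that min-max game — a linear "encoder", a soft per-pixel mask, and an invariance loss stand in for the deep networks and SSL objective of the actual ADIOS method, and all names here are illustrative, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

D, H = 16, 8                          # input and embedding dimensions (toy sizes)
x = rng.normal(size=D)                # a single "image", flattened
W = rng.normal(size=(H, D)) * 0.1     # linear stand-in for the image encoder
theta = np.zeros(D)                   # mask logits, one per pixel

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.05
for step in range(100):
    m = sigmoid(theta)                # soft occlusion mask in (0, 1)
    diff = W @ ((m - 1.0) * x)        # embedding gap: masked view vs. full view
    loss = float(diff @ diff)         # invariance loss the encoder wants small

    grad_W = 2.0 * np.outer(diff, (m - 1.0) * x)
    grad_theta = 2.0 * (W.T @ diff) * x * m * (1.0 - m)

    W -= lr * grad_W                  # encoder: gradient descent (minimise loss)
    theta += lr * grad_theta          # masker: gradient ascent (maximise loss)
```

The real method replaces the invariance loss with a self-supervised objective (which prevents the trivial collapse a plain linear encoder would exploit) and regularises the mask; the sketch only shows the alternating min-max structure.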
1 code implementation • ICLR 2022 • Tom Joy, Yuge Shi, Philip H. S. Torr, Tom Rainforth, Sebastian M. Schmon, N. Siddharth
Here we introduce a novel alternative, MEME, which avoids such explicit combinations by repurposing semi-supervised VAEs to combine information between modalities implicitly through mutual supervision.
2 code implementations • ICLR 2022 • Yuge Shi, Jeffrey Seely, Philip H. S. Torr, N. Siddharth, Awni Hannun, Nicolas Usunier, Gabriel Synnaeve
We perform experiments on both the Wilds benchmark, which captures distribution shift in the real world, and the DomainBed benchmark, which focuses more on synthetic-to-real transfer.
no code implementations • ICLR 2021 • Yuge Shi, Brooks Paige, Philip H. S. Torr, N. Siddharth
Multimodal learning for generative models often refers to the learning of abstract concepts from the commonality of information in multiple modalities, such as vision and language.
no code implementations • ECCV 2018 • Yuge Shi, Basura Fernando, Richard Hartley
We introduce a novel Recurrent Neural Network-based algorithm for future video feature generation and action anticipation called feature mapping RNN.
3 code implementations • NeurIPS 2019 • Yuge Shi, N. Siddharth, Brooks Paige, Philip H. S. Torr
In this work, we characterise successful learning of such models as the fulfillment of four criteria: i) implicit latent decomposition into shared and private subspaces, ii) coherent joint generation over all modalities, iii) coherent cross-generation across individual modalities, and iv) improved model learning for individual modalities through multi-modal integration.