1 code implementation • NeurIPS 2021 • Rishabh Kabra, Daniel Zoran, Goker Erdogan, Loic Matthey, Antonia Creswell, Matthew Botvinick, Alexander Lerchner, Christopher P. Burgess
Leveraging the shared structure that exists across different scenes, our model learns to infer two sets of latent representations from RGB video input alone: a set of "object" latents, corresponding to the time-invariant, object-level contents of the scene, as well as a set of "frame" latents, corresponding to global time-varying elements such as viewpoint.
no code implementations • 23 Sep 2014 • Ifeoma Nwogu, Goker Erdogan, Ilker Yildirim, Robert Jacobs
This paper presents a computational model of concept learning using Bayesian inference for a grammatically structured hypothesis space, and test the model on multisensory (visual and haptics) recognition of 3D objects.
no code implementations • 18 Jan 2024 • Ioana Bica, Anastasija Ilić, Matthias Bauer, Goker Erdogan, Matko Bošnjak, Christos Kaplanis, Alexey A. Gritsenko, Matthias Minderer, Charles Blundell, Razvan Pascanu, Jovana Mitrović
We introduce SPARse Fine-grained Contrastive Alignment (SPARC), a simple method for pretraining more fine-grained multimodal representations from image-text pairs.