Search Results for author: Sjoerd van Steenkiste

Found 26 papers, 11 papers with code

DreamSync: Aligning Text-to-Image Generation with Image Understanding Feedback

no code implementations 29 Nov 2023 Jiao Sun, Deqing Fu, Yushi Hu, Su Wang, Royi Rassin, Da-Cheng Juan, Dana Alon, Charles Herrmann, Sjoerd van Steenkiste, Ranjay Krishna, Cyrus Rashtchian

Then, it uses two VLMs to select the best generation: a Visual Question Answering model that measures the alignment of generated images to the text, and another that measures the generation's aesthetic quality.

Question Answering, Text-to-Image Generation, +1

A Systematic Comparison of Syllogistic Reasoning in Humans and Language Models

no code implementations 1 Nov 2023 Tiwalayo Eisape, MH Tessler, Ishita Dasgupta, Fei Sha, Sjoerd van Steenkiste, Tal Linzen

A central component of rational behavior is logical inference: the process of determining which conclusions follow from a set of premises.

Logical Fallacies

The Impact of Depth on Compositional Generalization in Transformer Language Models

no code implementations 30 Oct 2023 Jackson Petty, Sjoerd van Steenkiste, Ishita Dasgupta, Fei Sha, Dan Garrette, Tal Linzen

Because model latency is approximately linear in the number of layers, these results lead us to the recommendation that, with a given total parameter budget, transformers can be made shallower than is typical without sacrificing performance.

Language Modelling

DyST: Towards Dynamic Neural Scene Representations on Real-World Videos

no code implementations 9 Oct 2023 Maximilian Seitzer, Sjoerd van Steenkiste, Thomas Kipf, Klaus Greff, Mehdi S. M. Sajjadi

Our Dynamic Scene Transformer (DyST) model leverages recent work in neural scene representation to learn a latent decomposition of monocular real-world videos into scene content, per-view scene dynamics, and camera pose.

DORSal: Diffusion for Object-centric Representations of Scenes et al.

no code implementations 13 Jun 2023 Allan Jabri, Sjoerd van Steenkiste, Emiel Hoogeboom, Mehdi S. M. Sajjadi, Thomas Kipf

In this paper, we leverage recent progress in diffusion models to equip 3D scene representation learning models with the ability to render high-fidelity novel views, while retaining benefits such as object-level scene editing to a large degree.

Neural Rendering, Object, +3

Sensitivity of Slot-Based Object-Centric Models to their Number of Slots

no code implementations 30 May 2023 Roland S. Zimmermann, Sjoerd van Steenkiste, Mehdi S. M. Sajjadi, Thomas Kipf, Klaus Greff

Self-supervised methods for learning object-centric representations have recently been applied successfully to various datasets.

Invariant Slot Attention: Object Discovery with Slot-Centric Reference Frames

1 code implementation 9 Feb 2023 Ondrej Biza, Sjoerd van Steenkiste, Mehdi S. M. Sajjadi, Gamaleldin F. Elsayed, Aravindh Mahendran, Thomas Kipf

Automatically discovering composable abstractions from raw perceptual data is a long-standing challenge in machine learning.

Object, Object Discovery

Exploring through Random Curiosity with General Value Functions

1 code implementation 18 Nov 2022 Aditya Ramesh, Louis Kirsch, Sjoerd van Steenkiste, Jürgen Schmidhuber

Furthermore, RC-GVF significantly outperforms previous methods in the absence of ground-truth episodic counts in the partially observable MiniGrid environments.

Efficient Exploration

Object Scene Representation Transformer

no code implementations 14 Jun 2022 Mehdi S. M. Sajjadi, Daniel Duckworth, Aravindh Mahendran, Sjoerd van Steenkiste, Filip Pavetić, Mario Lučić, Leonidas J. Guibas, Klaus Greff, Thomas Kipf

A compositional understanding of the world in terms of objects and their geometry in 3D space is considered a cornerstone of human cognition.

Novel View Synthesis, Object, +1

Unsupervised Learning of Temporal Abstractions with Slot-based Transformers

1 code implementation 25 Mar 2022 Anand Gopalakrishnan, Kazuki Irie, Jürgen Schmidhuber, Sjoerd van Steenkiste

The discovery of reusable sub-routines simplifies decision-making and planning in complex reinforcement learning problems.

Decision Making

Test-time Adaptation with Slot-Centric Models

1 code implementation 21 Mar 2022 Mihir Prabhudesai, Anirudh Goyal, Sujoy Paul, Sjoerd van Steenkiste, Mehdi S. M. Sajjadi, Gaurav Aggarwal, Thomas Kipf, Deepak Pathak, Katerina Fragkiadaki

In our work, we find evidence that these losses are insufficient for the task of scene decomposition, without also considering architectural inductive biases.

Image Classification, Image Segmentation, +7

On the Binding Problem in Artificial Neural Networks

no code implementations 9 Dec 2020 Klaus Greff, Sjoerd van Steenkiste, Jürgen Schmidhuber

Contemporary neural networks still fall short of human-level generalization, which extends far beyond our direct experiences.

Hierarchical Relational Inference

no code implementations 7 Oct 2020 Aleksandar Stanić, Sjoerd van Steenkiste, Jürgen Schmidhuber

Common-sense physical reasoning in the real world requires learning about the interactions of objects and their dynamics.

Common Sense Reasoning

Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks

1 code implementation ICLR 2021 Róbert Csordás, Sjoerd van Steenkiste, Jürgen Schmidhuber

Neural networks (NNs) whose subnetworks implement reusable functions are expected to offer numerous advantages, including compositionality through efficient recombination of functional building blocks, interpretability, preventing catastrophic interference, etc.

Systematic Generalization

A Perspective on Objects and Systematic Generalization in Model-Based RL

no code implementations 3 Jun 2019 Sjoerd van Steenkiste, Klaus Greff, Jürgen Schmidhuber

In order to meet the diverse challenges in solving many real-world problems, an intelligent agent has to be able to dynamically construct a model of its environment.

Systematic Generalization

A Case for Object Compositionality in Deep Generative Models of Images

no code implementations ICLR 2019 Sjoerd van Steenkiste, Karol Kurach, Sylvain Gelly

In this work we propose to structure the generator of a GAN to consider objects and their relations explicitly, and generate images by means of composition.

FVD: A new Metric for Video Generation

no code implementations ICLR Workshop DeepGenStruct 2019 Thomas Unterthiner, Sjoerd van Steenkiste, Karol Kurach, Raphaël Marinier, Marcin Michalski, Sylvain Gelly

While recent generative models of video have had some success, current progress is hampered by the lack of qualitative metrics that consider visual quality, temporal coherence, and diversity of samples.

Representation Learning, Video Generation

Towards Accurate Generative Models of Video: A New Metric & Challenges

3 code implementations 3 Dec 2018 Thomas Unterthiner, Sjoerd van Steenkiste, Karol Kurach, Raphael Marinier, Marcin Michalski, Sylvain Gelly

To this end we propose Fréchet Video Distance (FVD), a new metric for generative models of video, and StarCraft 2 Videos (SCV), a benchmark of game play from custom StarCraft 2 scenarios that challenge the current capabilities of generative models of video.

Representation Learning, Starcraft, +1

Investigating Object Compositionality in Generative Adversarial Networks

no code implementations ICLR 2019 Sjoerd van Steenkiste, Karol Kurach, Jürgen Schmidhuber, Sylvain Gelly

We present a minimal modification of a standard generator to incorporate this inductive bias and find that it reliably learns to generate images as compositions of objects.

Image Generation, Inductive Bias, +5

Neural Expectation Maximization

1 code implementation NeurIPS 2017 Klaus Greff, Sjoerd van Steenkiste, Jürgen Schmidhuber

Many real world tasks such as reasoning and physical interaction require identification and manipulation of conceptual entities.

Clustering
