no code implementations • 13 Jun 2024 • Ziyi Wu, Yulia Rubanova, Rishabh Kabra, Drew A. Hudson, Igor Gilitschenski, Yusuf Aytar, Sjoerd van Steenkiste, Kelsey R. Allen, Thomas Kipf
By fine-tuning a pre-trained text-to-image diffusion model with this information, our approach enables fine-grained 3D pose and placement control of individual objects in a scene.
no code implementations • 2 Mar 2024 • Ziniu Hu, Ahmet Iscen, Aashi Jain, Thomas Kipf, Yisong Yue, David A. Ross, Cordelia Schmid, Alireza Fathi
SceneCraft first models a scene graph as a blueprint, detailing the spatial relationships among assets in the scene.
no code implementations • 8 Dec 2023 • William F. Whitney, Tatiana Lopez-Guevara, Tobias Pfaff, Yulia Rubanova, Thomas Kipf, Kimberly Stachenfeld, Kelsey R. Allen
Realistic simulation is critical for applications ranging from robotics to animation.
no code implementations • 9 Oct 2023 • Maximilian Seitzer, Sjoerd van Steenkiste, Thomas Kipf, Klaus Greff, Mehdi S. M. Sajjadi
Our Dynamic Scene Transformer (DyST) model leverages recent work in neural scene representation to learn a latent decomposition of monocular real-world videos into scene content, per-view scene dynamics, and camera pose.
no code implementations • ICCV 2023 • Georg Heigold, Matthias Minderer, Alexey Gritsenko, Alex Bewley, Daniel Keysers, Mario Lučić, Fisher Yu, Thomas Kipf
Our model is end-to-end trainable on video data and enjoys improved temporal consistency compared to tracking-by-detection baselines, while retaining the open-world capabilities of the backbone detector.
no code implementations • 21 Jun 2023 • Ondrej Biza, Skye Thompson, Kishore Reddy Pagidi, Abhinav Kumar, Elise van der Pol, Robin Walters, Thomas Kipf, Jan-Willem van de Meent, Lawson L. S. Wong, Robert Platt
We propose a new method, Interaction Warping, for learning SE(3) robotic manipulation policies from a single demonstration.
no code implementations • 13 Jun 2023 • Allan Jabri, Sjoerd van Steenkiste, Emiel Hoogeboom, Mehdi S. M. Sajjadi, Thomas Kipf
In this paper, we leverage recent progress in diffusion models to equip 3D scene representation learning models with the ability to render high-fidelity novel views, while retaining benefits such as object-level scene editing to a large degree.
no code implementations • 30 May 2023 • Roland S. Zimmermann, Sjoerd van Steenkiste, Mehdi S. M. Sajjadi, Thomas Kipf, Klaus Greff
Self-supervised methods for learning object-centric representations have recently been applied successfully to various datasets.
no code implementations • 9 May 2023 • Pradyumna Reddy, Scott Wisdom, Klaus Greff, John R. Hershey, Thomas Kipf
We discuss the results and limitations of our approach in detail, and further outline potential ways to overcome the limitations and directions for future work.
1 code implementation • 10 Feb 2023 • Mostafa Dehghani, Josip Djolonga, Basil Mustafa, Piotr Padlewski, Jonathan Heek, Justin Gilmer, Andreas Steiner, Mathilde Caron, Robert Geirhos, Ibrahim Alabdulmohsin, Rodolphe Jenatton, Lucas Beyer, Michael Tschannen, Anurag Arnab, Xiao Wang, Carlos Riquelme, Matthias Minderer, Joan Puigcerver, Utku Evci, Manoj Kumar, Sjoerd van Steenkiste, Gamaleldin F. Elsayed, Aravindh Mahendran, Fisher Yu, Avital Oliver, Fantine Huot, Jasmijn Bastings, Mark Patrick Collier, Alexey Gritsenko, Vighnesh Birodkar, Cristina Vasconcelos, Yi Tay, Thomas Mensink, Alexander Kolesnikov, Filip Pavetić, Dustin Tran, Thomas Kipf, Mario Lučić, Xiaohua Zhai, Daniel Keysers, Jeremiah Harmsen, Neil Houlsby
The scaling of Transformers has driven breakthrough capabilities for language models.
Ranked #1 on Zero-Shot Transfer Image Classification on ObjectNet
2 code implementations • 9 Feb 2023 • Ondrej Biza, Sjoerd van Steenkiste, Mehdi S. M. Sajjadi, Gamaleldin F. Elsayed, Aravindh Mahendran, Thomas Kipf
Automatically discovering composable abstractions from raw perceptual data is a long-standing challenge in machine learning.
no code implementations • CVPR 2023 • Mehdi S. M. Sajjadi, Aravindh Mahendran, Thomas Kipf, Etienne Pot, Daniel Duckworth, Mario Lucic, Klaus Greff
Our main insight is that one can train a Pose Encoder that peeks at the target image and learns a latent pose embedding which is used by the decoder for view synthesis.
1 code implementation • 1 Nov 2022 • Sadegh Mahdavi, Kevin Swersky, Thomas Kipf, Milad Hashemi, Christos Thrampoulidis, Renjie Liao
In this paper, we study the OOD generalization of neural algorithmic reasoning tasks, where the goal is to learn an algorithm (e. g., sorting, breadth-first search, and depth-first search) from input-output pairs using deep neural networks.
1 code implementation • 12 Oct 2022 • Ziyi Wu, Nikita Dvornik, Klaus Greff, Thomas Kipf, Animesh Garg
While recent object-centric models can successfully decompose a scene into objects, modeling their dynamics effectively still remains a challenge.
1 code implementation • 15 Jun 2022 • Gamaleldin F. Elsayed, Aravindh Mahendran, Sjoerd van Steenkiste, Klaus Greff, Michael C. Mozer, Thomas Kipf
The visual world can be parsimoniously characterized in terms of distinct entities with sparse interactions.
no code implementations • 14 Jun 2022 • Mehdi S. M. Sajjadi, Daniel Duckworth, Aravindh Mahendran, Sjoerd van Steenkiste, Filip Pavetić, Mario Lučić, Leonidas J. Guibas, Klaus Greff, Thomas Kipf
A compositional understanding of the world in terms of objects and their geometry in 3D space is considered a cornerstone of human cognition.
3 code implementations • 12 May 2022 • Matthias Minderer, Alexey Gritsenko, Austin Stone, Maxim Neumann, Dirk Weissenborn, Alexey Dosovitskiy, Aravindh Mahendran, Anurag Arnab, Mostafa Dehghani, Zhuoran Shen, Xiao Wang, Xiaohua Zhai, Thomas Kipf, Neil Houlsby
Combining simple architectures with large-scale pre-training has led to massive improvements in image classification.
Ranked #1 on One-Shot Object Detection on MS COCO
1 code implementation • 27 Apr 2022 • Ondrej Biza, Robert Platt, Jan-Willem van de Meent, Lawson L. S. Wong, Thomas Kipf
We study the problem of binding actions to objects in object-factored world models using action-attention mechanisms.
1 code implementation • 21 Mar 2022 • Mihir Prabhudesai, Anirudh Goyal, Sujoy Paul, Sjoerd van Steenkiste, Mehdi S. M. Sajjadi, Gaurav Aggarwal, Thomas Kipf, Deepak Pathak, Katerina Fragkiadaki
In our work, we find evidence that these losses are insufficient for the task of scene decomposition, without also considering architectural inductive biases.
no code implementations • 11 Mar 2022 • Alexander Bogatskiy, Sanmay Ganguly, Thomas Kipf, Risi Kondor, David W. Miller, Daniel Murnane, Jan T. Offermann, Mariel Pettee, Phiala Shanahan, Chase Shimmin, Savannah Thais
Physical theories grounded in mathematical symmetries are an essential component of our understanding of a wide range of properties of the universe.
1 code implementation • CVPR 2022 • Klaus Greff, Francois Belletti, Lucas Beyer, Carl Doersch, Yilun Du, Daniel Duckworth, David J. Fleet, Dan Gnanapragasam, Florian Golemo, Charles Herrmann, Thomas Kipf, Abhijit Kundu, Dmitry Lagun, Issam Laradji, Hsueh-Ti, Liu, Henning Meyer, Yishu Miao, Derek Nowrouzezahrai, Cengiz Oztireli, Etienne Pot, Noha Radwan, Daniel Rebain, Sara Sabour, Mehdi S. M. Sajjadi, Matan Sela, Vincent Sitzmann, Austin Stone, Deqing Sun, Suhani Vora, Ziyu Wang, Tianhao Wu, Kwang Moo Yi, Fangcheng Zhong, Andrea Tagliasacchi
Data is the driving force of machine learning, with the amount and quality of training data often being more important for the performance of a system than architecture and training details.
1 code implementation • 10 Feb 2022 • Ondrej Biza, Thomas Kipf, David Klee, Robert Platt, Jan-Willem van de Meent, Lawson L. S. Wong
In this paper, we learn to generalize over robotic pick-and-place tasks using object-factored world models, which combat the combinatorial explosion by ensuring that predictions are equivariant to permutations of objects.
3 code implementations • ICLR 2022 • Thomas Kipf, Gamaleldin F. Elsayed, Aravindh Mahendran, Austin Stone, Sara Sabour, Georg Heigold, Rico Jonschkowski, Alexey Dosovitskiy, Klaus Greff
Object-centric representations are a promising path toward more systematic generalization by providing flexible abstractions upon which compositional world models can be built.
1 code implementation • 24 Jul 2021 • Ondrej Biza, Elise van der Pol, Thomas Kipf
World models trained by contrastive learning are a compelling alternative to autoencoder-based world models, which learn by reconstructing pixel states.
no code implementations • 19 Jul 2021 • Petar Veličković, Matko Bošnjak, Thomas Kipf, Alexander Lerchner, Raia Hadsell, Razvan Pascanu, Charles Blundell
Neural networks leverage robust internal representations in order to generalise.
no code implementations • 20 Nov 2020 • Sindy Löwe, Klaus Greff, Rico Jonschkowski, Alexey Dosovitskiy, Thomas Kipf
We address this problem by introducing a global, set-based contrastive loss: instead of contrasting individual slot representations against one another, we aggregate the representations and contrast the joined sets against one another.
8 code implementations • NeurIPS 2020 • Francesco Locatello, Dirk Weissenborn, Thomas Unterthiner, Aravindh Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, Thomas Kipf
Learning object-centric representations of complex scenes is a promising step towards enabling efficient abstract reasoning from low-level perceptual features.
1 code implementation • 27 Feb 2020 • Elise van der Pol, Thomas Kipf, Frans A. Oliehoek, Max Welling
We introduce a contrastive loss function that enforces action equivariance on the learned representations.
3 code implementations • ICLR 2020 • Thomas Kipf, Elise van der Pol, Max Welling
Our experiments demonstrate that C-SWMs can overcome limitations of models based on pixel reconstruction and outperform typical representatives of this model class in highly structured environments, while learning interpretable object-based representations.
4 code implementations • 31 Oct 2019 • Davide Belli, Thomas Kipf
For this, we introduce the Toulouse Road Network dataset, based on real-world publicly-available data.
Ranked #1 on Graph Generation on Toulouse Road Network
1 code implementation • 17 Apr 2019 • Andreas Kipf, Dimitri Vorona, Jonas Müller, Thomas Kipf, Bernhard Radke, Viktor Leis, Peter Boncz, Thomas Neumann, Alfons Kemper
We introduce Deep Sketches, which are compact models of databases that allow us to estimate the result sizes of SQL queries.
Databases
3 code implementations • 4 Dec 2018 • Thomas Kipf, Yujia Li, Hanjun Dai, Vinicius Zambaldi, Alvaro Sanchez-Gonzalez, Edward Grefenstette, Pushmeet Kohli, Peter Battaglia
We introduce Compositional Imitation Learning and Execution (CompILE): a framework for learning reusable, variable-length segments of hierarchically-structured behavior from demonstration data.
1 code implementation • 21 Nov 2018 • Raghavendra Selvan, Thomas Kipf, Max Welling, Antonio Garcia-Uceda Juarez, Jesper H. Pedersen, Jens Petersen, Marleen de Bruijne
Graph refinement, or the task of obtaining subgraphs of interest from over-complete graphs, can have many varied applications.
1 code implementation • 3 Nov 2018 • Cătălina Cangea, Petar Veličković, Nikola Jovanović, Thomas Kipf, Pietro Liò
Recent advances in representation learning on graphs, mainly leveraging graph convolutional networks, have brought a substantial improvement on many graph-based benchmark tasks.
2 code implementations • 3 Sep 2018 • Andreas Kipf, Thomas Kipf, Bernhard Radke, Viktor Leis, Peter Boncz, Alfons Kemper
We describe a new deep learning approach to cardinality estimation.
Databases
10 code implementations • 30 May 2018 • Nicola De Cao, Thomas Kipf
Deep generative models for graph-structured data offer a new angle on the problem of chemical synthesis: by optimizing differentiable models that directly generate molecular graphs, it is possible to side-step expensive search procedures in the discrete and vast space of chemical structures.
Ranked #3 on Molecular Graph Generation on InterBioScreen
no code implementations • 12 Apr 2018 • Raghavendra Selvan, Thomas Kipf, Max Welling, Jesper H. Pedersen, Jens Petersen, Marleen de Bruijne
We present extraction of tree structures, such as airways, from image data as a graph refinement task.
9 code implementations • 3 Apr 2018 • Tim R. Davidson, Luca Falorsi, Nicola De Cao, Thomas Kipf, Jakub M. Tomczak
But although the default choice of a Gaussian distribution for both the prior and posterior represents a mathematically convenient distribution often leading to competitive results, we show that this parameterization fails to model data with a latent hyperspherical structure.
Ranked #6 on Link Prediction on Cora
9 code implementations • ICML 2018 • Thomas Kipf, Ethan Fetaya, Kuan-Chieh Wang, Max Welling, Richard Zemel
Interacting systems are prevalent in nature, from dynamical systems in physics to complex societal dynamics.