Search Results for author: Mehdi S. M. Sajjadi

Found 23 papers, 9 papers with code

DyST: Towards Dynamic Neural Scene Representations on Real-World Videos

no code implementations • 9 Oct 2023 • Maximilian Seitzer, Sjoerd van Steenkiste, Thomas Kipf, Klaus Greff, Mehdi S. M. Sajjadi

Our Dynamic Scene Transformer (DyST) model leverages recent work in neural scene representation to learn a latent decomposition of monocular real-world videos into scene content, per-view scene dynamics, and camera pose.

DORSal: Diffusion for Object-centric Representations of Scenes et al.

no code implementations • 13 Jun 2023 • Allan Jabri, Sjoerd van Steenkiste, Emiel Hoogeboom, Mehdi S. M. Sajjadi, Thomas Kipf

In this paper, we leverage recent progress in diffusion models to equip 3D scene representation learning models with the ability to render high-fidelity novel views, while retaining benefits such as object-level scene editing to a large degree.

Neural Rendering, Object, +3

Sensitivity of Slot-Based Object-Centric Models to their Number of Slots

no code implementations • 30 May 2023 • Roland S. Zimmermann, Sjoerd van Steenkiste, Mehdi S. M. Sajjadi, Thomas Kipf, Klaus Greff

Self-supervised methods for learning object-centric representations have recently been applied successfully to various datasets.

RePAST: Relative Pose Attention Scene Representation Transformer

no code implementations • 3 Apr 2023 • Aleksandr Safin, Daniel Duckworth, Mehdi S. M. Sajjadi

The Scene Representation Transformer (SRT) is a recent method to render novel views at interactive rates.

PaLM-E: An Embodied Multimodal Language Model

2 code implementations • 6 Mar 2023 • Danny Driess, Fei Xia, Mehdi S. M. Sajjadi, Corey Lynch, Aakanksha Chowdhery, Brian Ichter, Ayzaan Wahid, Jonathan Tompson, Quan Vuong, Tianhe Yu, Wenlong Huang, Yevgen Chebotar, Pierre Sermanet, Daniel Duckworth, Sergey Levine, Vincent Vanhoucke, Karol Hausman, Marc Toussaint, Klaus Greff, Andy Zeng, Igor Mordatch, Pete Florence

Large language models excel at a wide range of complex tasks.

Ranked #2 on Visual Question Answering (VQA) on OK-VQA

Language Modelling, Large Language Model, +2

201

Invariant Slot Attention: Object Discovery with Slot-Centric Reference Frames

1 code implementation • 9 Feb 2023 • Ondrej Biza, Sjoerd van Steenkiste, Mehdi S. M. Sajjadi, Gamaleldin F. Elsayed, Aravindh Mahendran, Thomas Kipf

Automatically discovering composable abstractions from raw perceptual data is a long-standing challenge in machine learning.

Object, Object Discovery

32,804

RUST: Latent Neural Scene Representations from Unposed Imagery

no code implementations • CVPR 2023 • Mehdi S. M. Sajjadi, Aravindh Mahendran, Thomas Kipf, Etienne Pot, Daniel Duckworth, Mario Lucic, Klaus Greff

Our main insight is that one can train a Pose Encoder that peeks at the target image and learns a latent pose embedding which is used by the decoder for view synthesis.

Novel View Synthesis

Object Scene Representation Transformer

no code implementations • 14 Jun 2022 • Mehdi S. M. Sajjadi, Daniel Duckworth, Aravindh Mahendran, Sjoerd van Steenkiste, Filip Pavetić, Mario Lučić, Leonidas J. Guibas, Klaus Greff, Thomas Kipf

A compositional understanding of the world in terms of objects and their geometry in 3D space is considered a cornerstone of human cognition.

Novel View Synthesis, Object, +1

Test-time Adaptation with Slot-Centric Models

1 code implementation • 21 Mar 2022 • Mihir Prabhudesai, Anirudh Goyal, Sujoy Paul, Sjoerd van Steenkiste, Mehdi S. M. Sajjadi, Gaurav Aggarwal, Thomas Kipf, Deepak Pathak, Katerina Fragkiadaki

In our work, we find evidence that these losses are insufficient for the task of scene decomposition, without also considering architectural inductive biases.

Image Classification, Image Segmentation, +7

Kubric: A scalable dataset generator

1 code implementation • CVPR 2022 • Klaus Greff, Francois Belletti, Lucas Beyer, Carl Doersch, Yilun Du, Daniel Duckworth, David J. Fleet, Dan Gnanapragasam, Florian Golemo, Charles Herrmann, Thomas Kipf, Abhijit Kundu, Dmitry Lagun, Issam Laradji, Hsueh-Ti Liu, Henning Meyer, Yishu Miao, Derek Nowrouzezahrai, Cengiz Oztireli, Etienne Pot, Noha Radwan, Daniel Rebain, Sara Sabour, Mehdi S. M. Sajjadi, Matan Sela, Vincent Sitzmann, Austin Stone, Deqing Sun, Suhani Vora, Ziyu Wang, Tianhao Wu, Kwang Moo Yi, Fangcheng Zhong, Andrea Tagliasacchi

Data is the driving force of machine learning, with the amount and quality of training data often being more important for the performance of a system than architecture and training details.

Fairness, Optical Flow Estimation

2,174

RegNeRF: Regularizing Neural Radiance Fields for View Synthesis from Sparse Inputs

no code implementations • CVPR 2022 • Michael Niemeyer, Jonathan T. Barron, Ben Mildenhall, Mehdi S. M. Sajjadi, Andreas Geiger, Noha Radwan

We observe that the majority of artifacts in sparse input scenarios are caused by errors in the estimated scene geometry, and by divergent behavior at the start of training.

Novel View Synthesis

NeSF: Neural Semantic Fields for Generalizable Semantic Segmentation of 3D Scenes

no code implementations • 25 Nov 2021 • Suhani Vora, Noha Radwan, Klaus Greff, Henning Meyer, Kyle Genova, Mehdi S. M. Sajjadi, Etienne Pot, Andrea Tagliasacchi, Daniel Duckworth

We present NeSF, a method for producing 3D semantic fields from posed RGB images alone.

3D Semantic Segmentation, Segmentation

Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations

1 code implementation • CVPR 2022 • Mehdi S. M. Sajjadi, Henning Meyer, Etienne Pot, Urs Bergmann, Klaus Greff, Noha Radwan, Suhani Vora, Mario Lucic, Daniel Duckworth, Alexey Dosovitskiy, Jakob Uszkoreit, Thomas Funkhouser, Andrea Tagliasacchi

In this work, we propose the Scene Representation Transformer (SRT), a method which processes posed or unposed RGB images of a new area, infers a "set-latent scene representation", and synthesises novel views, all in a single feed-forward pass.

Novel View Synthesis, Semantic Segmentation

198

NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections

1 code implementation • CVPR 2021 • Ricardo Martin-Brualla, Noha Radwan, Mehdi S. M. Sajjadi, Jonathan T. Barron, Alexey Dosovitskiy, Daniel Duckworth

We present a learning-based method for synthesizing novel views of complex scenes using only unstructured collections of in-the-wild photographs.

2,618

From Variational to Deterministic Autoencoders

4 code implementations • ICLR 2020 • Partha Ghosh, Mehdi S. M. Sajjadi, Antonio Vergari, Michael Black, Bernhard Schölkopf

Variational Autoencoders (VAEs) provide a theoretically-backed and popular framework for deep generative models.

Density Estimation

1,680

Spatio-temporal Transformer Network for Video Restoration

no code implementations • ECCV 2018 • Tae Hyun Kim, Mehdi S. M. Sajjadi, Michael Hirsch, Bernhard Schölkopf

State-of-the-art video restoration methods integrate optical flow estimation networks to utilize temporal information.

Deblurring, Optical Flow Estimation, +2

Perceptual Video Super Resolution with Enhanced Temporal Consistency

no code implementations • 20 Jul 2018 • Eduardo Pérez-Pellitero, Mehdi S. M. Sajjadi, Michael Hirsch, Bernhard Schölkopf

Together with a video discriminator, we also propose additional loss functions to further reinforce temporal consistency in the generated sequences.

Image Super-Resolution, Video Super-Resolution

Assessing Generative Models via Precision and Recall

4 code implementations • NeurIPS 2018 • Mehdi S. M. Sajjadi, Olivier Bachem, Mario Lucic, Olivier Bousquet, Sylvain Gelly

Recent advances in generative modeling have led to an increased interest in the study of statistical divergences as means of model comparison.

Tempered Adversarial Networks

no code implementations • ICML 2018 • Mehdi S. M. Sajjadi, Giambattista Parascandolo, Arash Mehrjou, Bernhard Schölkopf

A possible explanation for training instabilities is the inherent imbalance between the networks: While the discriminator is trained directly on both real and fake samples, the generator only has control over the fake samples it produces since the real data distribution is fixed by the choice of a given dataset.

Frame-Recurrent Video Super-Resolution

no code implementations • CVPR 2018 • Mehdi S. M. Sajjadi, Raviteja Vemulapalli, Matthew Brown

Recent advances in video super-resolution have shown that convolutional neural networks combined with motion compensation are able to merge information from multiple low-resolution (LR) frames to generate high-quality images.

Ranked #6 on Video Super-Resolution on MSU Video Upscalers: Quality Enhancement (VMAF metric)

Motion Compensation, Multi-Frame Super-Resolution, +1

EnhanceNet: Single Image Super-Resolution Through Automated Texture Synthesis

4 code implementations • ICCV 2017 • Mehdi S. M. Sajjadi, Bernhard Schölkopf, Michael Hirsch

Single image super-resolution is the task of inferring a high-resolution image from a single low-resolution input.

Ranked #3 on Image Super-Resolution on FFHQ 256 x 256 - 4x upscaling

Image Super-Resolution, Texture Synthesis

162

Depth Estimation Through a Generative Model of Light Field Synthesis

no code implementations • 6 Sep 2016 • Mehdi S. M. Sajjadi, Rolf Köhler, Bernhard Schölkopf, Michael Hirsch

Light field photography captures rich structural information that may facilitate a number of traditional image processing and computer vision tasks.

Depth Estimation

Peer Grading in a Course on Algorithms and Data Structures: Machine Learning Algorithms do not Improve over Simple Baselines

no code implementations • 2 Jun 2015 • Mehdi S. M. Sajjadi, Morteza Alamgir, Ulrike von Luxburg

Peer grading is the process of students reviewing each others' work, such as homework submissions, and has lately become a popular mechanism used in massive open online courses (MOOCs).
