Search Results for author: Igor Gilitschenski

Found 56 papers, 19 papers with code

Pseudo-Simulation for Autonomous Driving

1 code implementation • 4 Jun 2025 • Wei Cao, Marcel Hallgarten, Tianyu Li, Daniel Dauner, Xunjiang Gu, Caojun Wang, Yakov Miron, Marco Aiello, Hongyang Li, Igor Gilitschenski, Boris Ivanovic, Marco Pavone, Andreas Geiger, Kashyap Chitta

Our method then assigns a higher importance to synthetic observations that best match the AV's likely behavior using a novel proximity-based weighting scheme.

NavSim
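The proximity-based weighting idea described above can be sketched as follows. This is a minimal illustration, not the paper's exact scheme: the Gaussian kernel, the `sigma` parameter, and all names here are assumptions.

```python
import numpy as np

def proximity_weights(synthetic_poses, likely_pose, sigma=1.0):
    """Weight synthetic observations by proximity to the AV's likely pose.

    Gaussian kernel over Euclidean distance, normalized to sum to 1.
    Kernel choice and parameterization are illustrative assumptions.
    """
    d = np.linalg.norm(synthetic_poses - likely_pose, axis=1)
    w = np.exp(-0.5 * (d / sigma) ** 2)
    return w / w.sum()

# Observations closer to the likely pose receive higher weight.
weights = proximity_weights(np.array([[0.0, 0.0], [3.0, 4.0]]), np.array([0.0, 0.0]))
```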

Calibrated Value-Aware Model Learning with Stochastic Environment Models

no code implementations • 28 May 2025 • Claas Voelcker, Anastasiia Pedan, Arash Ahmadian, Romina Abachi, Igor Gilitschenski, Amir-Massoud Farahmand

The idea of value-aware model learning, that models should produce accurate value estimates, has gained prominence in model-based reinforcement learning.

Model-based Reinforcement Learning
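The core idea of value-aware model learning (training the model where its errors matter for value estimation) can be sketched in a few lines. This is an illustrative objective, not the paper's calibrated variant; the function names are assumptions.

```python
import numpy as np

def value_aware_model_loss(pred_next, true_next, value_fn):
    """Value-aware model objective (illustrative sketch): instead of matching
    the raw next state, penalize the mismatch between the value of the
    model's predicted next state and the value of the true next state.
    `value_fn` maps a batch of states to scalar values."""
    return np.mean((value_fn(pred_next) - value_fn(true_next)) ** 2)
```

Note the contrast with plain next-state regression: a model with large state-space error can still incur zero value-aware loss if its errors lie in directions the value function ignores.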

MORE: Mobile Manipulation Rearrangement Through Grounded Language Reasoning

no code implementations • 5 May 2025 • Mohammad Mohammadi, Daniel Honerkamp, Martin Büchner, Matteo Cassinelli, Tim Welschehold, Fabien Despinoy, Igor Gilitschenski, Abhinav Valada

To address these limitations, we propose MORE, a novel approach for enhancing the capabilities of language models to solve zero-shot mobile manipulation planning for rearrangement tasks.

Pippo: High-Resolution Multi-View Humans from a Single Image

no code implementations • CVPR 2025 • Yash Kant, Ethan Weber, Jin Kyu Kim, Rawal Khirodkar, Su Zhaoen, Julieta Martinez, Igor Gilitschenski, Shunsuke Saito, Timur Bagautdinov

Finally, we also introduce an improved metric to evaluate 3D consistency of multi-view generations, and show that Pippo outperforms existing works on multi-view human generation from a single image.

Mind the Time: Temporally-Controlled Multi-Event Video Generation

no code implementations • CVPR 2025 • Ziyi Wu, Aliaksandr Siarohin, Willi Menapace, Ivan Skorokhodov, Yuwei Fang, Varnith Chordia, Igor Gilitschenski, Sergey Tulyakov

Generating such sequences with precise temporal control is infeasible with existing video generators that rely on a single paragraph of text as input.

Video Generation

CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion

no code implementations • CVPR 2025 • Kai He, Chin-Hsuan Wu, Igor Gilitschenski

Our fine-tuning enables the model to "learn" the editing ability from a single edited reference image, transforming the complex task of dynamic scene editing into a simple 2D image editing process.

3D Scene Editing, Novel View Synthesis

GaussianCut: Interactive segmentation via graph cut for 3D Gaussian Splatting

no code implementations • 12 Nov 2024 • Umangi Jain, Ashkan Mirzaei, Igor Gilitschenski

Using 3D Gaussian Splatting (3DGS) as the underlying scene representation simplifies the extraction of objects of interest, which are treated as a subset of the scene's Gaussians.

3DGS, Graph Construction (+4 more)

MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL

no code implementations • 11 Oct 2024 • Claas A Voelcker, Marcel Hussing, Eric Eaton, Amir-Massoud Farahmand, Igor Gilitschenski

Our method, Model-Augmented Data for Temporal Difference learning (MAD-TD), uses small amounts of generated data to stabilize high UTD training and achieve competitive performance on the most challenging tasks in the DeepMind control suite.

Deep Reinforcement Learning, Reinforcement Learning (RL)
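The "small amounts of generated data" idea can be sketched as a batch sampler that mixes mostly real transitions with a small fraction of model-generated ones. A minimal sketch under assumed names; the buffers here are plain lists standing in for replay buffers, and the 5% default is illustrative, not the paper's tuned value.

```python
import numpy as np

rng = np.random.default_rng(0)

def mixed_batch(real_buffer, model_buffer, batch_size, model_fraction=0.05):
    """Sample a TD training batch that is mostly real data plus a small
    fraction of model-generated transitions (illustrating the idea of
    stabilizing high update-to-data training with limited synthetic data)."""
    n_model = int(batch_size * model_fraction)
    n_real = batch_size - n_model
    real_idx = rng.integers(0, len(real_buffer), n_real)
    model_idx = rng.integers(0, len(model_buffer), n_model)
    return [real_buffer[i] for i in real_idx] + [model_buffer[i] for i in model_idx]
```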

Towards Unsupervised Blind Face Restoration using Diffusion Prior

no code implementations • 6 Oct 2024 • Tianshu Kuai, Sina Honari, Igor Gilitschenski, Alex Levinshtein

The models trained on such synthetic degradations, however, cannot deal with inputs of unseen degradations.

Blind Face Restoration

Realistic Evaluation of Model Merging for Compositional Generalization

1 code implementation • 26 Sep 2024 • Derek Tam, Yash Kant, Brian Lester, Igor Gilitschenski, Colin Raffel

Merging has become a widespread way to cheaply combine individual models into a single model that inherits their capabilities and attains better performance.

Image Classification (+1 more)

When does Self-Prediction help? Understanding Auxiliary Tasks in Reinforcement Learning

1 code implementation • 25 Jun 2024 • Claas Voelcker, Tyler Kastner, Igor Gilitschenski, Amir-Massoud Farahmand

We provide a theoretical analysis of the learning dynamics of observation reconstruction, latent self-prediction, and TD learning in the presence of distractions and observation functions under linear model assumptions.

Auxiliary Learning, Prediction (+2 more)

NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking

2 code implementations • 21 Jun 2024 • Daniel Dauner, Marcel Hallgarten, Tianyu Li, Xinshuo Weng, Zhiyu Huang, Zetong Yang, Hongyang Li, Igor Gilitschenski, Boris Ivanovic, Marco Pavone, Andreas Geiger, Kashyap Chitta

On a large set of challenging scenarios, we observe that simple methods with moderate compute requirements such as TransFuser can match recent large-scale end-to-end driving architectures such as UniAD.

Benchmarking, NavSim

Neural Assets: 3D-Aware Multi-Object Scene Synthesis with Image Diffusion Models

no code implementations • 13 Jun 2024 • Ziyi Wu, Yulia Rubanova, Rishabh Kabra, Drew A. Hudson, Igor Gilitschenski, Yusuf Aytar, Sjoerd van Steenkiste, Kelsey R. Allen, Thomas Kipf

By fine-tuning a pre-trained text-to-image diffusion model with this information, our approach enables fine-grained 3D pose and placement control of individual objects in a scene.

Object

RefFusion: Reference Adapted Diffusion Models for 3D Scene Inpainting

no code implementations • 16 Apr 2024 • Ashkan Mirzaei, Riccardo de Lutio, Seung Wook Kim, David Acuna, Jonathan Kelly, Sanja Fidler, Igor Gilitschenski, Zan Gojcic

In this work, we propose an approach for 3D scene inpainting -- the task of coherently replacing parts of the reconstructed scene with desired content.

3D Inpainting, Image Inpainting

Producing and Leveraging Online Map Uncertainty in Trajectory Prediction

1 code implementation • CVPR 2024 • Xunjiang Gu, Guanyu Song, Igor Gilitschenski, Marco Pavone, Boris Ivanovic

High-definition (HD) maps have played an integral role in the development of modern autonomous vehicle (AV) stacks, albeit with high associated labeling and maintenance costs.

Autonomous Driving, Prediction (+1 more)

Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers

no code implementations • 19 Mar 2024 • Vidhi Jain, Maria Attarian, Nikhil J Joshi, Ayzaan Wahid, Danny Driess, Quan Vuong, Pannag R Sanketi, Pierre Sermanet, Stefan Welker, Christine Chan, Igor Gilitschenski, Yonatan Bisk, Debidatta Dwibedi

Vid2Robot uses cross-attention transformer layers between video features and the current robot state to produce the actions and perform the same task as shown in the video.

Dissecting Deep RL with High Update Ratios: Combatting Value Divergence

no code implementations • 9 Mar 2024 • Marcel Hussing, Claas Voelcker, Igor Gilitschenski, Amir-Massoud Farahmand, Eric Eaton

We show that deep reinforcement learning algorithms can retain their ability to learn without resetting network parameters in settings where the number of gradient updates greatly exceeds the number of environment samples by combatting value function divergence.

Deep Reinforcement Learning

Reconstructive Latent-Space Neural Radiance Fields for Efficient 3D Scene Representations

no code implementations • 27 Oct 2023 • Tristan Aumentado-Armstrong, Ashkan Mirzaei, Marcus A. Brubaker, Jonathan Kelly, Alex Levinshtein, Konstantinos G. Derpanis, Igor Gilitschenski

The resulting latent-space NeRF can produce novel views with higher quality than standard colour-space NeRFs, as the AE can correct certain visual artifacts, while rendering over three times faster.

Continual Learning, NeRF (+1 more)

Watch Your Steps: Local Image and Scene Editing by Text Instructions

no code implementations • 17 Aug 2023 • Ashkan Mirzaei, Tristan Aumentado-Armstrong, Marcus A. Brubaker, Jonathan Kelly, Alex Levinshtein, Konstantinos G. Derpanis, Igor Gilitschenski

A field is trained on relevance maps of training views, denoted as the relevance field, defining the 3D region within which modifications should be made.

Denoising, Image Generation (+1 more)

trajdata: A Unified Interface to Multiple Human Trajectory Datasets

3 code implementations • NeurIPS 2023 • Boris Ivanovic, Guanyu Song, Igor Gilitschenski, Marco Pavone

The field of trajectory forecasting has grown significantly in recent years, partially owing to the release of numerous large-scale, real-world human trajectory datasets for autonomous vehicles (AVs) and pedestrian motion tracking.

Autonomous Vehicles, Motion Forecasting (+1 more)

$\lambda$-models: Effective Decision-Aware Reinforcement Learning with Latent Models

no code implementations • 30 Jun 2023 • Claas A Voelcker, Arash Ahmadian, Romina Abachi, Igor Gilitschenski, Amir-Massoud Farahmand

The idea of decision-aware model learning, that models should be accurate where it matters for decision-making, has gained prominence in model-based reinforcement learning.

Continuous Control (+4 more)

EventCLIP: Adapting CLIP for Event-based Object Recognition

1 code implementation • 10 Jun 2023 • Ziyi Wu, Xudong Liu, Igor Gilitschenski

Recent advances in zero-shot and few-shot classification heavily rely on the success of pre-trained vision-language models (VLMs) such as CLIP.

Few-Shot Learning, Object (+3 more)

SlotDiffusion: Object-Centric Generative Modeling with Diffusion Models

no code implementations • NeurIPS 2023 • Ziyi Wu, Jingyu Hu, Wuyue Lu, Igor Gilitschenski, Animesh Garg

Finally, we demonstrate the scalability of SlotDiffusion to unconstrained real-world datasets such as PASCAL VOC and COCO, when integrated with self-supervised pre-trained image encoders.

Image Generation, Object (+5 more)

CAMM: Building Category-Agnostic and Animatable 3D Models from Monocular Videos

1 code implementation • 14 Apr 2023 • Tianshu Kuai, Akash Karthikeyan, Yash Kant, Ashkan Mirzaei, Igor Gilitschenski

Animating an object in 3D often requires an articulated structure, e.g., a kinematic chain or skeleton of the manipulated object with proper skinning weights, to obtain smooth movements and surface deformations.

Object, Surface Reconstruction

Invertible Neural Skinning

no code implementations • CVPR 2023 • Yash Kant, Aliaksandr Siarohin, Riza Alp Guler, Menglei Chai, Jian Ren, Sergey Tulyakov, Igor Gilitschenski

Next, we combine PIN with a differentiable LBS module to build an expressive and end-to-end Invertible Neural Skinning (INS) pipeline.

Building Scalable Video Understanding Benchmarks through Sports

no code implementations • 17 Jan 2023 • Aniket Agarwal, Alex Zhang, Karthik Narasimhan, Igor Gilitschenski, Vishvak Murahari, Yash Kant

Our human studies indicate that ASAP can align videos and annotations with high fidelity, precision, and speed.

Video Understanding

SparsePose: Sparse-View Camera Pose Regression and Refinement

no code implementations • CVPR 2023 • Samarth Sinha, Jason Y. Zhang, Andrea Tagliasacchi, Igor Gilitschenski, David B. Lindell

Camera pose estimation is a key step in standard 3D reconstruction pipelines that operate on a dense set of images of a single object or scene.

3D Reconstruction, Camera Pose Estimation (+2 more)

Solving Continuous Control via Q-learning

1 code implementation • 22 Oct 2022 • Tim Seyde, Peter Werner, Wilko Schwarting, Igor Gilitschenski, Martin Riedmiller, Daniela Rus, Markus Wulfmeier

While there has been substantial success for solving continuous control with actor-critic methods, simpler critic-only methods such as Q-learning find limited application in the associated high-dimensional action spaces.

Continuous Control (+2 more)
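One way a critic-only method can sidestep a high-dimensional continuous action space is to discretize each action dimension into a few bins and select greedily per dimension. The sketch below illustrates that decoupled-discretization idea under assumed names; it is not the paper's exact architecture.

```python
import numpy as np

def greedy_discrete_action(q_values, bins):
    """Per-dimension greedy action selection (illustrative sketch).

    q_values: (n_dims, n_bins) array of independent per-dimension Q estimates.
    bins:     (n_bins,) array of discretized action values shared by all dims.
    Each action dimension picks its own best bin, so the argmax cost grows
    linearly in n_dims instead of exponentially.
    """
    idx = np.argmax(q_values, axis=1)  # best bin index per dimension
    return bins[idx]
```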

LaTeRF: Label and Text Driven Object Radiance Fields

no code implementations • 4 Jul 2022 • Ashkan Mirzaei, Yash Kant, Jonathan Kelly, Igor Gilitschenski

In this paper we introduce LaTeRF, a method for extracting an object of interest from a scene given 2D images of the entire scene, known camera poses, a natural language description of the object, and a set of point-labels of object and non-object points in the input images.

NeRF, Object

Deep Latent Competition: Learning to Race Using Visual Control Policies in Latent Space

1 code implementation • 19 Feb 2021 • Wilko Schwarting, Tim Seyde, Igor Gilitschenski, Lucas Liebenwein, Ryan Sander, Sertac Karaman, Daniela Rus

We demonstrate the effectiveness of our algorithm in learning competitive behaviors on a novel multi-agent racing benchmark that requires planning from image observations.

Reinforcement Learning (RL)

Deep Orientation Uncertainty Learning based on a Bingham Loss

1 code implementation • ICLR 2020 • Igor Gilitschenski, Roshni Sahoo, Wilko Schwarting, Alexander Amini, Sertac Karaman, Daniela Rus

Reasoning about uncertain orientations is one of the core problems in many perception tasks such as object pose estimation or motion estimation.

Motion Estimation, Pose Estimation

SiPPing Neural Networks: Sensitivity-informed Provable Pruning of Neural Networks

2 code implementations • 11 Oct 2019 • Cenk Baykal, Lucas Liebenwein, Igor Gilitschenski, Dan Feldman, Daniela Rus

We introduce a pruning algorithm that provably sparsifies the parameters of a trained model in a way that approximately preserves the model's predictive accuracy.
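The general shape of sensitivity-informed pruning can be illustrated as follows. This is a deliberate simplification of the paper's provable, sampling-based scheme: here, each weight is scored by the magnitude of its average contribution over sample inputs, and the lowest-scoring weights are zeroed. All names and the scoring rule are illustrative assumptions.

```python
import numpy as np

def prune_by_sensitivity(weights, inputs, keep_fraction=0.5):
    """Zero out the least sensitive weights of one layer (illustrative sketch).

    weights: (out_dim, in_dim) weight matrix of a linear layer.
    inputs:  (n_samples, in_dim) sample inputs used to estimate sensitivity.
    Assumes 0 < keep_fraction <= 1 so at least one weight is kept.
    """
    # Score each weight by |w_ij| times the mean |x_j| over the samples.
    contrib = np.abs(weights) * np.mean(np.abs(inputs), axis=0)
    k = int(weights.size * keep_fraction)
    thresh = np.sort(contrib.ravel())[::-1][k - 1]  # k-th largest score
    mask = contrib >= thresh
    return weights * mask, mask
```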

LandmarkBoost: Efficient Visual Context Classifiers for Robust Localization

no code implementations • 12 Jul 2018 • Marcin Dymczyk, Igor Gilitschenski, Juan Nieto, Simon Lynen, Bernhard Zeisl, Roland Siegwart

We propose LandmarkBoost - an approach that, in contrast to the conventional 2D-3D matching methods, casts the search problem as a landmark classification task.

Pose Retrieval, Retrieval

Data-Dependent Coresets for Compressing Neural Networks with Applications to Generalization Bounds

no code implementations • ICLR 2019 • Cenk Baykal, Lucas Liebenwein, Igor Gilitschenski, Dan Feldman, Daniela Rus

We present an efficient coresets-based neural network compression algorithm that sparsifies the parameters of a trained fully-connected neural network in a manner that provably approximates the network's output.

Generalization Bounds, Neural Network Compression

Directional Statistics and Filtering Using libDirectional

no code implementations • 28 Dec 2017 • Gerhard Kurz, Igor Gilitschenski, Florian Pfaff, Lukas Drude, Uwe D. Hanebeck, Reinhold Haeb-Umbach, Roland Y. Siegwart

In this paper, we present libDirectional, a MATLAB library for directional statistics and directional estimation.

maplab: An Open Framework for Research in Visual-inertial Mapping and Localization

1 code implementation • 28 Nov 2017 • Thomas Schneider, Marcin Dymczyk, Marius Fehr, Kevin Egger, Simon Lynen, Igor Gilitschenski, Roland Siegwart

On the other hand, maplab provides the research community with a collection of multisession mapping tools that include map merging, visual-inertial batch optimization, and loop closure.

Robotics
