Search Results for author: Iro Laina

Found 34 papers, 13 papers with code

Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting

no code implementations 30 Apr 2024 Paul Engstler, Andrea Vedaldi, Iro Laina, Christian Rupprecht

These works often depend on pre-trained monocular depth estimators to lift the generated images into 3D, fusing them with the existing scene representation.

Benchmarking Depth Completion +2

DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing

no code implementations 29 Apr 2024 Minghao Chen, Iro Laina, Andrea Vedaldi

However, this is often slow, as it requires updating a computationally expensive 3D representation such as a neural radiance field, and doing so using contradictory guidance from a 2D model that is inherently not multi-view consistent.

N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields

no code implementations 16 Mar 2024 Yash Bhalgat, Iro Laina, João F. Henriques, Andrew Zisserman, Andrea Vedaldi

To address this, we introduce Nested Neural Feature Fields (N2F2), a novel approach that employs hierarchical supervision to learn a single feature field, wherein different dimensions within the same high-dimensional feature encode scene properties at varying granularities.

Scene Understanding

IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation

no code implementations 13 Feb 2024 Luke Melas-Kyriazi, Iro Laina, Christian Rupprecht, Natalia Neverova, Andrea Vedaldi, Oran Gafni, Filippos Kokkinos

A mitigation is to fine-tune the 2D generator to be multi-view aware, which can help distillation or can be combined with reconstruction networks to output 3D objects directly.

3D Generation 3D Reconstruction +1

SHAP-EDITOR: Instruction-guided Latent 3D Editing in Seconds

no code implementations CVPR 2024 Minghao Chen, Junyu Xie, Iro Laina, Andrea Vedaldi

In particular, we hypothesise that editing can be greatly simplified by first encoding 3D objects in a suitable latent space.

Diffusion Models for Zero-Shot Open-Vocabulary Segmentation

no code implementations 15 Jun 2023 Laurynas Karazija, Iro Laina, Andrea Vedaldi, Christian Rupprecht

This provides a distribution of appearances for a given text, circumventing the ambiguity problem.

EPIC Fields: Marrying 3D Geometry and Video Understanding

1 code implementation NeurIPS 2023 Vadim Tschernezki, Ahmad Darkhalil, Zhifan Zhu, David Fouhey, Iro Laina, Diane Larlus, Dima Damen, Andrea Vedaldi

Compared to other neural rendering datasets, EPIC Fields is better tailored to video understanding because it is paired with labelled action segments and the recent VISOR segment annotations.

Neural Rendering Video Understanding

Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion

1 code implementation NeurIPS 2023 Yash Bhalgat, Iro Laina, João F. Henriques, Andrew Zisserman, Andrea Vedaldi

Our approach outperforms the state-of-the-art on challenging scenes from the ScanNet, Hypersim, and Replica datasets, as well as on our newly created Messy Rooms dataset, demonstrating the effectiveness and scalability of our slow-fast clustering method.

Clustering Instance Segmentation +2

Training-Free Layout Control with Cross-Attention Guidance

1 code implementation 6 Apr 2023 Minghao Chen, Iro Laina, Andrea Vedaldi

We thoroughly evaluate our approach on three benchmarks and provide several qualitative examples and a comparative analysis of the two strategies that demonstrate the superiority of backward guidance compared to forward guidance, as well as prior work.

RealFusion: 360° Reconstruction of Any Object from a Single Image

3 code implementations 21 Feb 2023 Luke Melas-Kyriazi, Christian Rupprecht, Iro Laina, Andrea Vedaldi

We consider the problem of reconstructing a full 360° photographic model of an object from a single image of it.

3D Reconstruction Object

Neural Feature Fusion Fields: 3D Distillation of Self-Supervised 2D Image Representations

no code implementations 7 Sep 2022 Vadim Tschernezki, Iro Laina, Diane Larlus, Andrea Vedaldi

We present Neural Feature Fusion Fields (N3F), a method that improves dense 2D image feature extractors when the latter are applied to the analysis of multiple images reconstructible as a 3D scene.

Neural Rendering Retrieval

ClevrTex: A Texture-Rich Benchmark for Unsupervised Multi-Object Segmentation

1 code implementation 19 Nov 2021 Laurynas Karazija, Iro Laina, Christian Rupprecht

We benchmark a large set of recent unsupervised multi-object segmentation models on ClevrTex and find all state-of-the-art approaches fail to learn good representations in the textured setting, despite impressive performance on simpler data.

Segmentation Semantic Segmentation +1

The Curious Layperson: Fine-Grained Image Recognition without Expert Labels

1 code implementation 5 Nov 2021 Subhabrata Choudhury, Iro Laina, Christian Rupprecht, Andrea Vedaldi

We then train a fine-grained textual similarity model that matches image descriptions with documents on a sentence-level basis.

Cross-Modal Retrieval Fine-Grained Image Recognition +2

Quantifying Learnability and Describability of Visual Concepts Emerging in Representation Learning

no code implementations NeurIPS 2020 Iro Laina, Ruth C. Fong, Andrea Vedaldi

The increasing impact of black box models, and particularly of unsupervised ones, comes with an increasing interest in tools to understand and interpret them.

Clustering Representation Learning

Semantic Image Manipulation Using Scene Graphs

1 code implementation CVPR 2020 Helisa Dhamo, Azade Farshad, Iro Laina, Nassir Navab, Gregory D. Hager, Federico Tombari, Christian Rupprecht

In our work, we address the novel problem of image manipulation from scene graphs, in which a user can edit images by merely applying changes in the nodes or edges of a semantic graph that is generated from the image.

Image Inpainting Image Manipulation +1

2017 Robotic Instrument Segmentation Challenge

3 code implementations 18 Feb 2019 Max Allan, Alex Shvets, Thomas Kurmann, Zichen Zhang, Rahul Duggal, Yun-Hsuan Su, Nicola Rieke, Iro Laina, Niveditha Kalavakonda, Sebastian Bodenstedt, Luis Herrera, Wenqi Li, Vladimir Iglovikov, Huoling Luo, Jian Yang, Danail Stoyanov, Lena Maier-Hein, Stefanie Speidel, Mahdi Azizian

In mainstream computer vision and machine learning, public datasets such as ImageNet, COCO and KITTI have helped drive enormous improvements by enabling researchers to understand the strengths and limitations of different algorithms via performance comparison.

Benchmarking Person Re-Identification +2

Dealing with Ambiguity in Robotic Grasping via Multiple Predictions

no code implementations 2 Nov 2018 Ghazal Ghazaei, Iro Laina, Christian Rupprecht, Federico Tombari, Nassir Navab, Kianoush Nazarpour

Further, we reformulate the problem of robotic grasping by replacing conventional grasp rectangles with grasp belief maps, which hold more precise location information than a rectangle and account for the uncertainty inherent to the task.

Robotic Grasping

Peeking Behind Objects: Layered Depth Prediction from a Single Image

no code implementations 23 Jul 2018 Helisa Dhamo, Keisuke Tateno, Iro Laina, Nassir Navab, Federico Tombari

While conventional depth estimation can infer the geometry of a scene from a single RGB image, it fails to estimate scene regions that are occluded by foreground objects.

Depth Estimation Depth Prediction

Guide Me: Interacting with Deep Networks

no code implementations CVPR 2018 Christian Rupprecht, Iro Laina, Nassir Navab, Gregory D. Hager, Federico Tombari

Interaction and collaboration between humans and intelligent machines have become increasingly important as machine learning methods move into real-world applications that involve end users.

Image Captioning Image Generation

CNN-SLAM: Real-time dense monocular SLAM with learned depth prediction

1 code implementation CVPR 2017 Keisuke Tateno, Federico Tombari, Iro Laina, Nassir Navab

Given the recent advances in depth prediction from Convolutional Neural Networks (CNNs), this paper investigates how predicted depth maps from a deep neural network can be deployed for accurate and dense monocular reconstruction.

Depth Estimation Depth Prediction +1
