Search Results for author: Alessio Tonioni

Found 28 papers, 11 papers with code

BRAVE: Broadening the visual encoding of vision-language models

no code implementations • 10 Apr 2024 • Oğuzhan Fatih Kar, Alessio Tonioni, Petra Poklukar, Achin Kulshrestha, Amir Zamir, Federico Tombari

Our results highlight the potential of incorporating different visual biases for a broader and more contextualized visual understanding of VLMs.

Hallucination Language Modelling +1

Snap-it, Tap-it, Splat-it: Tactile-Informed 3D Gaussian Splatting for Reconstructing Challenging Surfaces

no code implementations • 29 Mar 2024 • Mauro Comi, Alessio Tonioni, Max Yang, Jonathan Tremblay, Valts Blukis, Yijiong Lin, Nathan F. Lepora, Laurence Aitchison

Touch and vision go hand in hand, mutually enhancing our ability to understand the world.

Novel View Synthesis Surface Reconstruction

InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes

no code implementations • 10 Jan 2024 • Mohamad Shahbazi, Liesbeth Claessens, Michael Niemeyer, Edo Collins, Alessio Tonioni, Luc van Gool, Federico Tombari

We introduce InseRF, a novel method for generative object insertion in the NeRF reconstructions of 3D scenes.

3D scene Editing Monocular Depth Estimation +2

Text-Conditioned Resampler For Long Form Video Understanding

no code implementations • 19 Dec 2023 • Bruno Korbar, Yongqin Xian, Alessio Tonioni, Andrew Zisserman, Federico Tombari

In this paper we present a text-conditioned video resampler (TCR) module that uses a pre-trained and frozen visual encoder and large language model (LLM) to process long video sequences for a task.

Ranked #5 on Video Question Answering on NExT-QA

Language Modelling Large Language Model +2

LIME: Localized Image Editing via Attention Regularization in Diffusion Models

no code implementations • 14 Dec 2023 • Enis Simsar, Alessio Tonioni, Yongqin Xian, Thomas Hofmann, Federico Tombari

Diffusion models (DMs) have gained prominence due to their ability to generate high-quality, varied images, with recent advancements in text-to-image generation.

Denoising Semantic Segmentation +1

TouchSDF: A DeepSDF Approach for 3D Shape Reconstruction using Vision-Based Tactile Sensing

no code implementations • 21 Nov 2023 • Mauro Comi, Yijiong Lin, Alex Church, Alessio Tonioni, Laurence Aitchison, Nathan F. Lepora

To address these challenges, we propose TouchSDF, a Deep Learning approach for tactile 3D shape reconstruction that leverages the rich information provided by a vision-based tactile sensor and the expressivity of the implicit neural representation DeepSDF.

3D Shape Reconstruction

TextMesh: Generation of Realistic 3D Meshes From Text Prompts

1 code implementation • 24 Apr 2023 • Christina Tsalicoglou, Fabian Manhardt, Alessio Tonioni, Michael Niemeyer, Federico Tombari

In addition, we propose a novel way to finetune the mesh texture, removing the effect of high saturation and improving the details of the output 3D mesh.

5,591

NeRF-Supervised Deep Stereo

2 code implementations • CVPR 2023 • Fabio Tosi, Alessio Tonioni, Daniele De Gregorio, Matteo Poggi

We introduce a novel framework for training deep stereo networks effortlessly and without any ground-truth.

Neural Rendering Zero-shot Generalization

329

NeRF-GAN Distillation for Efficient 3D-Aware Generation with Convolutions

1 code implementation • 22 Mar 2023 • Mohamad Shahbazi, Evangelos Ntavelis, Alessio Tonioni, Edo Collins, Danda Pani Paudel, Martin Danelljan, Luc van Gool

Pose-conditioned convolutional generative models struggle with high-quality 3D-consistent image generation from single-view datasets, due to their lack of sufficient 3D priors.

Image Generation Inductive Bias

Learning Good Features to Transfer Across Tasks and Domains

no code implementations • 26 Jan 2023 • Pierluigi Zama Ramirez, Adriano Cardace, Luca De Luigi, Alessio Tonioni, Samuele Salti, Luigi Di Stefano

Besides, we propose a set of strategies to constrain the learned feature spaces, to ease learning and increase the generalization capability of the mapping network, thereby considerably improving the final performance of our framework.

Monocular Depth Estimation Semantic Segmentation

LatentSwap3D: Semantic Edits on 3D Image GANs

no code implementations • 2 Dec 2022 • Enis Simsar, Alessio Tonioni, Evin Pınar Örnek, Federico Tombari

3D GANs have the ability to generate latent codes for entire 3D volumes rather than only 2D images.

Feature Importance

ParGAN: Learning Real Parametrizable Transformations

no code implementations • 9 Nov 2022 • Diego Martin Arroyo, Alessio Tonioni, Federico Tombari

Current methods for image-to-image translation produce compelling results; however, the applied transformation is difficult to control, since existing mechanisms are often limited and non-intuitive.

Image-to-Image Translation Translation

LegoFormer: Transformers for Block-by-Block Multi-view 3D Reconstruction

1 code implementation • 23 Jun 2021 • Farid Yagubbayli, Yida Wang, Alessio Tonioni, Federico Tombari

Most modern deep learning-based multi-view 3D reconstruction techniques use RNNs or fusion modules to combine information from multiple images after independently encoding them.

3D Reconstruction Multi-View 3D Reconstruction +1

Unsupervised Novel View Synthesis from a Single Image

no code implementations • 5 Feb 2021 • Pierluigi Zama Ramirez, Alessio Tonioni, Federico Tombari

Novel view synthesis from a single image aims at generating novel views from a single input image of an object.

Novel View Synthesis

Batch Normalization Embeddings for Deep Domain Generalization

no code implementations • 25 Nov 2020 • Mattia Segu, Alessio Tonioni, Federico Tombari

Several recent methods use multiple datasets to train models to extract domain-invariant features, hoping to generalize to unseen domains.

Ranked #63 on Domain Generalization on PACS

Domain Generalization

A Divide et Impera Approach for 3D Shape Reconstruction from Multiple Views

no code implementations • 17 Nov 2020 • Riccardo Spezialetti, David Joseph Tan, Alessio Tonioni, Keisuke Tateno, Federico Tombari

Estimating the 3D shape of an object from a single or multiple images has gained popularity thanks to the recent breakthroughs powered by deep learning.

3D Shape Reconstruction Object +1

Continual Adaptation for Deep Stereo

1 code implementation • 10 Jul 2020 • Matteo Poggi, Alessio Tonioni, Fabio Tosi, Stefano Mattoccia, Luigi Di Stefano

Thus, our network architecture and adaptation algorithms realize the first real-time self-adaptive deep stereo system and pave the way for a new paradigm that can facilitate practical deployment of end-to-end architectures for dense disparity regression.

Depth Estimation

416

Unsupervised Domain Adaptation for Depth Prediction from Images

1 code implementation • 9 Sep 2019 • Alessio Tonioni, Matteo Poggi, Stefano Mattoccia, Luigi Di Stefano

Extensive experimental results based on standard datasets and evaluation protocols prove that our technique can effectively address the domain shift issue with both stereo and monocular depth prediction architectures and outperforms other state-of-the-art unsupervised loss functions that may alternatively be deployed to pursue domain adaptation.

Depth Estimation Depth Prediction +1

Semi-Automatic Labeling for Deep Learning in Robotics

1 code implementation • 5 Aug 2019 • Daniele De Gregorio, Alessio Tonioni, Gianluca Palli, Luigi Di Stefano

In this paper, we propose Augmented Reality Semi-automatic labeling (ARS), a semi-automatic method which leverages a 2D camera moved by a robot, providing precise camera tracking, and an augmented reality pen to define the initial object bounding box, to create large labeled datasets with minimal human intervention.

Object object-detection +1

Real-Time Highly Accurate Dense Depth on a Power Budget using an FPGA-CPU Hybrid SoC

no code implementations • 17 Jul 2019 • Oscar Rahnama, Tommaso Cavallari, Stuart Golodetz, Alessio Tonioni, Thomas Joy, Luigi Di Stefano, Simon Walker, Philip H. S. Torr

Obtaining highly accurate depth from stereo images in real time has many applications across computer vision and robotics, but in some contexts, upper bounds on power consumption constrain the feasible hardware to embedded platforms such as FPGAs.

Learning Across Tasks and Domains

2 code implementations • ICCV 2019 • Pierluigi Zama Ramirez, Alessio Tonioni, Samuele Salti, Luigi Di Stefano

Recent works have proven that many relevant visual tasks are closely related to one another.

Domain Adaptation Monocular Depth Estimation +1

Learning to Adapt for Stereo

1 code implementation • CVPR 2019 • Alessio Tonioni, Oscar Rahnama, Thomas Joy, Luigi Di Stefano, Thalaiyasingam Ajanthan, Philip H. S. Torr

Real world applications of stereo depth estimation require models that are robust to dynamic variations in the environment.

Autonomous Driving Stereo Depth Estimation

Domain invariant hierarchical embedding for grocery products recognition

no code implementations • 2 Feb 2019 • Alessio Tonioni, Luigi Di Stefano

Moreover, there exists a significant domain shift between the images that should be recognized at test time, taken in stores by cheap cameras, and those available for training, usually just one or a few studio-quality images per product.

Exploiting Semantics in Adversarial Training for Image-Level Domain Adaptation

no code implementations • 13 Oct 2018 • Pierluigi Zama Ramirez, Alessio Tonioni, Luigi Di Stefano

To prove the effectiveness of our proposal, we show how a semantic segmentation CNN trained on images from the synthetic GTA dataset adapted by our method can improve performance by more than 16% mIoU with respect to the same model trained on synthetic images.

Domain Adaptation Segmentation +2

Real-time self-adaptive deep stereo

1 code implementation • CVPR 2019 • Alessio Tonioni, Fabio Tosi, Matteo Poggi, Stefano Mattoccia, Luigi Di Stefano

Deep convolutional neural networks trained end-to-end are the state-of-the-art methods to regress dense disparity maps from stereo pairs.

Stereo Depth Estimation

416

A deep learning pipeline for product recognition on store shelves

no code implementations • 3 Oct 2018 • Alessio Tonioni, Eugenio Serra, Luigi Di Stefano

Then, available product databases usually include just one or a few studio-quality images per product (referred to herein as reference images), whilst at test time recognition is performed on pictures displaying a portion of a shelf containing several products and taken in the store by cheap cameras (referred to as query images).

Image Retrieval object-detection +2

Unsupervised Adaptation for Deep Stereo

1 code implementation • ICCV 2017 • Alessio Tonioni, Matteo Poggi, Stefano Mattoccia, Luigi Di Stefano

Recent ground-breaking works have shown that deep neural networks can be trained end-to-end to regress dense disparity maps directly from image pairs.

Product recognition in store shelves as a sub-graph isomorphism problem

no code implementations • 26 Jul 2017 • Alessio Tonioni, Luigi Di Stefano

The arrangement of products in store shelves is carefully planned to maximize sales and keep customers happy.
