Search Results for author: Vittorio Ferrari

Found 88 papers, 23 papers with code

HAMMR: HierArchical MultiModal React agents for generic VQA

no code implementations • 8 Apr 2024 • Lluis Castrejon, Thomas Mensink, Howard Zhou, Vittorio Ferrari, Andre Araujo, Jasper Uijlings

We start from a multimodal ReAct-based system and make it hierarchical by enabling our HAMMR agents to call upon other specialized agents.

Optical Character Recognition (OCR) Question Answering +1

Paper
Add Code

Grounding Everything: Emerging Localization Properties in Vision-Language Transformers

1 code implementation • CVPR 2024 • Walid Bousselham, Felix Petersen, Vittorio Ferrari, Hilde Kuehne

To leverage those capabilities, we propose a Grounding Everything Module (GEM) that generalizes the idea of value-value attention introduced by CLIPSurgery to a self-self attention path.

Ranked #1 on Zero Shot Segmentation on ADE20K training-free zero-shot segmentation

Image Retrieval Object Localization +2

Paper
Code

Estimating Generic 3D Room Structures from 2D Annotations

1 code implementation • NeurIPS 2023 • Denys Rozumnyi, Stefan Popov, Kevis-Kokitsi Maninis, Matthias Nießner, Vittorio Ferrari

Based on these 2D annotations, we automatically reconstruct 3D plane equations for the structural elements and their spatial extent in the scene, and connect adjacent elements at the appropriate contact edges.

Scene Understanding

Paper
Code

CAD-Estate: Large-scale CAD Model Annotation in RGB Videos

1 code implementation • ICCV 2023 • Kevis-Kokitsi Maninis, Stefan Popov, Matthias Nießner, Vittorio Ferrari

We propose a method for annotating videos of complex multi-object scenes with a globally-consistent 3D representation of the objects.

3D Object Reconstruction Object +1

Paper
Code

Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories

1 code implementation • ICCV 2023 • Thomas Mensink, Jasper Uijlings, Lluis Castrejon, Arushi Goel, Felipe Cadar, Howard Zhou, Fei Sha, André Araujo, Vittorio Ferrari

Empirically, we show that our dataset poses a hard challenge for large vision+language models as they perform poorly on our dataset: PaLI [14] is state-of-the-art on OK-VQA [37], yet it only achieves 13. 0% accuracy on our dataset.

Question Answering Retrieval +1

33,215

Paper
Code

Tracking by 3D Model Estimation of Unknown Objects in Videos

no code implementations • ICCV 2023 • Denys Rozumnyi, Jiri Matas, Marc Pollefeys, Vittorio Ferrari, Martin R. Oswald

We argue that this representation is limited and instead propose to guide and improve 2D tracking with an explicit object representation, namely the textured 3D shape and 6DoF pose in each video frame.

Object Segmentation +1

Paper
Add Code

Agile Modeling: From Concept to Classifier in Minutes

no code implementations • ICCV 2023 • Otilia Stretcu, Edward Vendrow, Kenji Hata, Krishnamurthy Viswanathan, Vittorio Ferrari, Sasan Tavakkol, Wenlei Zhou, Aditya Avinash, Enming Luo, Neil Gordon Alldrin, Mohammadhossein Bateni, Gabriel Berger, Andrew Bunner, Chun-Ta Lu, Javier A Rey, Giulia Desalvo, Ranjay Krishna, Ariel Fuxman

In reaction, we introduce the problem of Agile Modeling: the process of turning any subjective visual concept into a computer vision model through a real-time user-in-the-loop interactions.

Image Classification

Paper
Add Code

Connecting Vision and Language with Video Localized Narratives

1 code implementation • CVPR 2023 • Paul Voigtlaender, Soravit Changpinyo, Jordi Pont-Tuset, Radu Soricut, Vittorio Ferrari

We propose Video Localized Narratives, a new form of multimodal video annotations connecting vision and language.

Question Answering Video Narrative Grounding +1

Paper
Code

Beyond SOT: Tracking Multiple Generic Objects at Once

1 code implementation • 22 Dec 2022 • Christoph Mayer, Martin Danelljan, Ming-Hsuan Yang, Vittorio Ferrari, Luc van Gool, Alina Kuznetsova

Our approach achieves a 4x faster run-time in case of 10 concurrent objects compared to tracking each object independently and outperforms existing single object trackers on our new benchmark.

Attribute Object +1

3,135

Paper
Code

From colouring-in to pointillism: revisiting semantic segmentation supervision

no code implementations • 25 Oct 2022 • Rodrigo Benenson, Vittorio Ferrari

The prevailing paradigm for producing semantic segmentation training data relies on densely labelling each pixel of each image in the training set, akin to colouring-in books.

Active Learning Segmentation +1

Paper
Add Code

Multi-View Photometric Stereo Revisited

no code implementations • 14 Oct 2022 • Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc van Gool

The proposed approach in this paper exploits the benefit of uncertainty modeling in a deep neural network for a reliable fusion of photometric stereo (PS) and multi-view stereo (MVS) network predictions.

3D Shape Representation

Paper
Add Code

The Missing Link: Finding label relations across datasets

no code implementations • 9 Jun 2022 • Jasper Uijlings, Thomas Mensink, Vittorio Ferrari

To find relations between labels across datasets, we propose methods based on language, on vision, and on their combination.

Specificity Transfer Learning

Paper
Add Code

How stable are Transferability Metrics evaluations?

no code implementations • 4 Apr 2022 • Andrea Agostinelli, Michal Pándy, Jasper Uijlings, Thomas Mensink, Vittorio Ferrari

Transferability metrics is a maturing field with increasing interest, which aims at providing heuristics for selecting the most suitable source models to transfer to a given target dataset, without fine-tuning them all.

Image Classification Semantic Segmentation

Paper
Add Code

RayTran: 3D pose estimation and shape reconstruction of multiple objects from videos with ray-traced transformers

no code implementations • 24 Mar 2022 • Michał J. Tyszkiewicz, Kevis-Kokitsi Maninis, Stefan Popov, Vittorio Ferrari

We propose a transformer-based neural network architecture for multi-object 3D reconstruction from RGB videos.

3D Pose Estimation 3D Reconstruction

Paper
Add Code

Uncertainty-Aware Deep Multi-View Photometric Stereo

no code implementations • CVPR 2022 • Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc van Gool

At each pixel, our approach either selects or discards deep-PS and deep-MVS network prediction depending on the prediction uncertainty measure.

Surface Reconstruction

Paper
Add Code

Motion-from-Blur: 3D Shape and Motion Estimation of Motion-blurred Objects in Videos

1 code implementation • CVPR 2022 • Denys Rozumnyi, Martin R. Oswald, Vittorio Ferrari, Marc Pollefeys

We propose a method for jointly estimating the 3D motion, 3D shape, and appearance of highly motion-blurred objects from a video.

3D Reconstruction Deblurring +1

Paper
Code

Urban Radiance Fields

no code implementations • CVPR 2022 • Konstantinos Rematas, Andrew Liu, Pratul P. Srinivasan, Jonathan T. Barron, Andrea Tagliasacchi, Thomas Funkhouser, Vittorio Ferrari

The goal of this work is to perform 3D reconstruction and novel view synthesis from data captured by scanning platforms commonly deployed for world mapping in urban outdoor environments (e. g., Street View).

3D Reconstruction Novel View Synthesis

Paper
Add Code

Transferability Metrics for Selecting Source Model Ensembles

no code implementations • CVPR 2022 • Andrea Agostinelli, Jasper Uijlings, Thomas Mensink, Vittorio Ferrari

We address the problem of ensemble selection in transfer learning: Given a large pool of source models we want to select an ensemble of models which, after fine-tuning on the target training set, yields the best performance on the target test set.

Semantic Segmentation Transfer Learning

Paper
Add Code

Transferability Estimation using Bhattacharyya Class Separability

no code implementations • CVPR 2022 • Michal Pándy, Andrea Agostinelli, Jasper Uijlings, Vittorio Ferrari, Thomas Mensink

Then, we estimate their pairwise class separability using the Bhattacharyya coefficient, yielding a simple and effective measure of how well the source model transfers to the target task.

Classification Image Classification +2

Paper
Add Code

Neural Radiance Fields Approach to Deep Multi-View Photometric Stereo

no code implementations • 11 Oct 2021 • Berk Kaya, Suryansh Kumar, Francesco Sarno, Vittorio Ferrari, Luc van Gool

Our method performs neural rendering of multi-view images while utilizing surface normals estimated by a deep photometric stereo network.

3D Reconstruction Neural Rendering

Paper
Add Code

Neural Architecture Search for Efficient Uncalibrated Deep Photometric Stereo

no code implementations • 11 Oct 2021 • Francesco Sarno, Suryansh Kumar, Berk Kaya, Zhiwu Huang, Vittorio Ferrari, Luc van Gool

We then perform a continuous relaxation of this search space and present a gradient-based optimization strategy to find an efficient light calibration and normal estimation network.

Neural Architecture Search

Paper
Add Code

Shape from Blur: Recovering Textured 3D Shape and Motion of Fast Moving Objects

1 code implementation • NeurIPS 2021 • Denys Rozumnyi, Martin R. Oswald, Vittorio Ferrari, Marc Pollefeys

We address the novel task of jointly reconstructing the 3D shape, texture, and motion of an object from a single motion-blurred image.

Deblurring Object +2

111

Paper
Code

A Step Toward More Inclusive People Annotations for Fairness

no code implementations • 5 May 2021 • Candice Schumann, Susanna Ricco, Utsav Prabhu, Vittorio Ferrari, Caroline Pantofaru

In this paper, we present a new set of annotations on a subset of the Open Images dataset called the MIAP (More Inclusive Annotations for People) subset, containing bounding boxes and attributes for all of the people visible in those images.

Attribute Fairness

Paper
Add Code

Factors of Influence for Transfer Learning across Diverse Appearance Domains and Task Types

no code implementations • 24 Mar 2021 • Thomas Mensink, Jasper Uijlings, Alina Kuznetsova, Michael Gygli, Vittorio Ferrari

Our study leads to several insights and concrete recommendations: (1) for most tasks there exists a source which significantly outperforms ILSVRC'12 pre-training; (2) the image domain is the most important factor for achieving positive transfer; (3) the source dataset should \emph{include} the image domain of the target dataset to achieve best results; (4) at the same time, we observe only small negative effects when the image domain of the source task is much broader than that of the target; (5) transfer across task types can be beneficial, but its success is heavily dependent on both the source and target task types.

Autonomous Driving Depth Estimation +6

Paper
Add Code

ShaRF: Shape-conditioned Radiance Fields from a Single View

no code implementations • 17 Feb 2021 • Konstantinos Rematas, Ricardo Martin-Brualla, Vittorio Ferrari

We demonstrate in several experiments the effectiveness of our approach in both synthetic and real images.

Disentanglement Object

Paper
Add Code

Telling the What while Pointing to the Where: Multimodal Queries for Image Retrieval

no code implementations • ICCV 2021 • Soravit Changpinyo, Jordi Pont-Tuset, Vittorio Ferrari, Radu Soricut

Most existing image retrieval systems use text queries as a way for the user to express what they are looking for.

Image Retrieval Retrieval

Paper
Add Code

From Points to Multi-Object 3D Reconstruction

no code implementations • CVPR 2021 • Francis Engelmann, Konstantinos Rematas, Bastian Leibe, Vittorio Ferrari

We propose a method to detect and reconstruct multiple 3D objects from a single RGB image.

3D Reconstruction 3D Shape Reconstruction +1

Paper
Add Code

Uncalibrated Neural Inverse Rendering for Photometric Stereo of General Surfaces

no code implementations • CVPR 2021 • Berk Kaya, Suryansh Kumar, Carlos Oliveira, Vittorio Ferrari, Luc van Gool

This paper presents an uncalibrated deep neural network framework for the photometric stereo problem.

Image Reconstruction Inverse Rendering

Paper
Add Code

Vid2CAD: CAD Model Alignment using Multi-View Constraints from Videos

1 code implementation • 8 Dec 2020 • Kevis-Kokitsi Maninis, Stefan Popov, Matthias Nießner, Vittorio Ferrari

We address the task of aligning CAD models to a video sequence of a complex scene containing multiple objects.

Paper
Code

DeFMO: Deblurring and Shape Recovery of Fast Moving Objects

5 code implementations • CVPR 2021 • Denys Rozumnyi, Martin R. Oswald, Vittorio Ferrari, Jiri Matas, Marc Pollefeys

We propose a method that, given a single image with its estimated background, outputs the object's appearance and position in a series of sub-frames as if captured by a high-speed camera (i. e. temporal super-resolution).

Ranked #1 on Video Super-Resolution on Falling Objects

Deblurring Object Tracking +1

9,539

Paper
Code

Efficient Full Image Interactive Segmentation by Leveraging Within-image Appearance Similarity

no code implementations • 16 Jul 2020 • Mykhaylo Andriluka, Stefano Pellegrini, Stefan Popov, Vittorio Ferrari

We leverage a key observation: propagation from labeled to unlabeled pixels does not necessarily require class-specific knowledge, but can be done purely based on appearance similarity within an image.

Interactive Segmentation Semantic Segmentation

Paper
Add Code

CoReNet: Coherent 3D scene reconstruction from a single RGB image

3 code implementations • ECCV 2020 • Stefan Popov, Pablo Bauszat, Vittorio Ferrari

Furthermore, we adapt our model to address the harder task of reconstructing multiple objects from a single image.

3D Scene Reconstruction Decoder +2

100

Paper
Code

Towards Reusable Network Components by Learning Compatible Representations

no code implementations • 8 Apr 2020 • Michael Gygli, Jasper Uijlings, Vittorio Ferrari

This paper proposes to make a first step towards compatible and hence reusable network components.

Computational Efficiency Image Classification +2

Paper
Add Code

C-Flow: Conditional Generative Flow Models for Images and 3D Point Clouds

no code implementations • CVPR 2020 • Albert Pumarola, Stefan Popov, Francesc Moreno-Noguer, Vittorio Ferrari

Flow-based generative models have highly desirable properties like exact log-likelihood evaluation and exact latent-variable inference, however they are still in their infancy and have not received as much attention as alternative generative models.

3D Reconstruction Image Manipulation +1

Paper
Add Code

Neural Voxel Renderer: Learning an Accurate and Controllable Rendering Tool

1 code implementation • CVPR 2020 • Konstantinos Rematas, Vittorio Ferrari

Finally, we show how our neural rendering framework can capture and faithfully render objects from real images and from a diverse set of classes.

Image Generation Neural Rendering

Paper
Code

Connecting Vision and Language with Localized Narratives

1 code implementation • ECCV 2020 • Jordi Pont-Tuset, Jasper Uijlings, Soravit Changpinyo, Radu Soricut, Vittorio Ferrari

We ask annotators to describe an image with their voice while simultaneously hovering their mouse over the region they are describing.

Ranked #2 on Image Captioning on Localized Narratives

Image Captioning Image Generation +1

Paper
Code

Training Object Detectors from Few Weakly-Labeled and Many Unlabeled Images

no code implementations • arXiv 2019 • Zhaohui Yang, Miaojing Shi, Chao Xu, Vittorio Ferrari, Yannis Avrithis

Weakly-supervised object detection attempts to limit the amount of supervision by dispensing the need for bounding boxes, but still assumes image-level labels on the entire training set.

Ranked #23 on Weakly Supervised Object Detection on PASCAL VOC 2012 test (using extra training data)

object-detection Weakly Supervised Object Detection

Paper
Add Code

Continuous Adaptation for Interactive Object Segmentation by Learning from Corrections

no code implementations • ECCV 2020 • Theodora Kontogianni, Michael Gygli, Jasper Uijlings, Vittorio Ferrari

Our approach enables the adaptation to a particular object and its background, to distributions shifts in a test set, to specific object classes, and even to large domain changes, where the imaging modality changes between training and testing.

Ranked #1 on Interactive Segmentation on DRIONS-DB

Interactive Segmentation Object +1

Paper
Add Code

Panoptic Image Annotation with a Collaborative Assistant

no code implementations • 17 Jun 2019 • Jasper R. R. Uijlings, Mykhaylo Andriluka, Vittorio Ferrari

This paper aims to reduce the time to annotate images for panoptic segmentation, which requires annotating segmentation masks and class labels for all object instances and stuff regions.

Panoptic Segmentation Segmentation

Paper
Add Code

Natural Vocabulary Emerges from Free-Form Annotations

no code implementations • 4 Jun 2019 • Jordi Pont-Tuset, Michael Gygli, Vittorio Ferrari

This vocabulary represents the natural distribution of objects well and is learned directly from data, instead of being an educated guess done before collecting any labels.

Paper
Add Code

Efficient Object Annotation via Speaking and Pointing

no code implementations • 25 May 2019 • Michael Gygli, Vittorio Ferrari

We then combine the two stages: annotators draw an object bounding box via the mouse and simultaneously provide its class label via speech.

Object

Paper
Add Code

Large-scale interactive object segmentation with human annotators

no code implementations • CVPR 2019 • Rodrigo Benenson, Stefan Popov, Vittorio Ferrari

Manually annotating object segmentation masks is very time consuming.

Instance Segmentation Interactive Segmentation +3

Paper
Add Code

Learning single-image 3D reconstruction by generative modelling of shape, pose and shading

1 code implementation • 19 Jan 2019 • Paul Henderson, Vittorio Ferrari

Importantly, it can be trained purely from 2D images, without pose annotations, and with only a single view per instance.

3D Reconstruction

311

Paper
Code

Interactive Full Image Segmentation by Considering All Regions Jointly

no code implementations • CVPR 2019 • Eirikur Agustsson, Jasper R. R. Uijlings, Vittorio Ferrari

We propose an interactive, scribble-based annotation framework which operates on the whole image to produce segmentations for all regions.

Image Segmentation Interactive Segmentation +2

Paper
Add Code

Fast Object Class Labelling via Speech

no code implementations • CVPR 2019 • Michael Gygli, Vittorio Ferrari

Modern approaches rely on a hierarchical organization of the vocabulary to reduce annotation time, but remain expensive (several minutes per image for the 200 classes in ILSVRC).

Object

Paper
Add Code

The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale

1 code implementation • 2 Nov 2018 • Alina Kuznetsova, Hassan Rom, Neil Alldrin, Jasper Uijlings, Ivan Krasin, Jordi Pont-Tuset, Shahab Kamali, Stefan Popov, Matteo Malloci, Alexander Kolesnikov, Tom Duerig, Vittorio Ferrari

We present Open Images V4, a dataset of 9. 2M images with unified annotations for image classification, object detection and visual relationship detection.

General Classification Image Classification +5

Paper
Code

Learning to Generate and Reconstruct 3D Meshes with only 2D Supervision

no code implementations • 24 Jul 2018 • Paul Henderson, Vittorio Ferrari

Importantly, it can be trained purely from 2D images, without ground-truth pose annotations, and with a single view per instance.

3D Reconstruction

Paper
Add Code

Detecting Visual Relationships Using Box Attention

no code implementations • 5 Jul 2018 • Alexander Kolesnikov, Alina Kuznetsova, Christoph H. Lampert, Vittorio Ferrari

We propose a new model for detecting visual relationships, such as "person riding motorcycle" or "bottle on table".

object-detection Object Detection

Paper
Add Code

Fluid Annotation: A Human-Machine Collaboration Interface for Full Image Annotation

no code implementations • 20 Jun 2018 • Mykhaylo Andriluka, Jasper R. R. Uijlings, Vittorio Ferrari

As opposed to performing a series of small annotation tasks in isolation, we propose a unified interface for full image annotation in a single pass.

Paper
Add Code

Learning Intelligent Dialogs for Bounding Box Annotation

1 code implementation • CVPR 2018 • Ksenia Konyushkova, Jasper Uijlings, Christoph Lampert, Vittorio Ferrari

We demonstrate that (1) our agents are able to learn efficient annotation strategies in several scenarios, automatically adapting to the image difficulty, the desired quality of the boxes, and the detector strength; (2) in all scenarios the resulting annotation dialogs speed up annotation compared to manual box drawing alone and box verification alone, while also outperforming any fixed combination of verification and drawing in most scenarios; (3) in a realistic scenario where the detector is iteratively re-trained, our agents evolve a series of strategies that reflect the shifting trade-off between verification and drawing as the detector grows stronger.

Paper
Code

Automatic Generation of Constrained Furniture Layouts

no code implementations • 29 Nov 2017 • Paul Henderson, Kartic Subr, Vittorio Ferrari

Efficient authoring of vast virtual environments hinges on algorithms that are able to automatically generate content while also being controllable.

Paper
Add Code

Joint Learning of Object and Action Detectors

no code implementations • ICCV 2017 • Vicky Kalogeiton, Philippe Weinzaepfel, Vittorio Ferrari, Cordelia Schmid

dog and cat jumping, enabling to detect actions of an object without training with these object-actions pairs.

Action Detection Object +1

Paper
Add Code

Active Learning for Human Pose Estimation

no code implementations • ICCV 2017 • Buyu Liu, Vittorio Ferrari

Annotating human poses in realistic scenes is very time consuming, yet necessary for training human pose estimators.

Active Learning Pose Estimation

Paper
Add Code

Revisiting knowledge transfer for training object class detectors

no code implementations • CVPR 2018 • Jasper Uijlings, Stefan Popov, Vittorio Ferrari

We propose to revisit knowledge transfer for training object detectors on target classes from weakly supervised training images, helped by a set of source classes with bounding-box annotations.

Object Transfer Learning

Paper
Add Code

Extreme clicking for efficient object annotation

no code implementations • ICCV 2017 • Dim P. Papadopoulos, Jasper R. R. Uijlings, Frank Keller, Vittorio Ferrari

We crowd-source extreme point annotations for PASCAL VOC 2007 and 2012 and show that (1) annotation time is only 7s per box, 5x faster than the traditional way of drawing boxes [62]; (2) the quality of the boxes is as good as the original ground-truth drawn the traditional way; (3) detectors trained on our annotations are as accurate as those trained on the original ground-truth.

Object

Paper
Add Code

The Devil is in the Decoder: Classification, Regression and GANs

1 code implementation • 18 Jul 2017 • Zbigniew Wojna, Vittorio Ferrari, Sergio Guadarrama, Nathan Silberman, Liang-Chieh Chen, Alireza Fathi, Jasper Uijlings

Many machine vision applications, such as semantic segmentation and depth prediction, require predictions for every pixel of the input image.

Boundary Detection Decoder +5

Paper
Code

How hard can it be? Estimating the difficulty of visual search in an image

no code implementations • CVPR 2016 • Radu Tudor Ionescu, Bogdan Alexe, Marius Leordeanu, Marius Popescu, Dim P. Papadopoulos, Vittorio Ferrari

We address the problem of estimating image difficulty defined as the human response time for solving a visual search task.

Weakly-Supervised Object Localization

Paper
Add Code

Action Tubelet Detector for Spatio-Temporal Action Localization

2 code implementations • ICCV 2017 • Vicky Kalogeiton, Philippe Weinzaepfel, Vittorio Ferrari, Cordelia Schmid

We propose the ACtion Tubelet detector (ACT-detector) that takes as input a sequence of frames and outputs tubelets, i. e., sequences of bounding boxes with associated scores.

Spatio-Temporal Action Localization Temporal Action Localization

104

Paper
Code

Training object class detectors with click supervision

no code implementations • CVPR 2017 • Dim P. Papadopoulos, Jasper R. R. Uijlings, Frank Keller, Vittorio Ferrari

Training object class detectors typically requires a large set of images with objects annotated by bounding boxes.

Multiple Instance Learning Object +1

Paper
Add Code

Objects as context for detecting their semantic parts

no code implementations • CVPR 2018 • Abel Gonzalez-Garcia, Davide Modolo, Vittorio Ferrari

We present a semantic part detection approach that effectively leverages object information. We use the object appearance and its class as indicators of what parts to expect.

Object Semantic Part Detection

Paper
Add Code

Weakly Supervised Object Localization Using Things and Stuff Transfer

no code implementations • ICCV 2017 • Miaojing Shi, Holger Caesar, Vittorio Ferrari

We propose to help weakly supervised object localization for classes where location annotations are not available, by transferring things and stuff knowledge from a source set with available annotations.

Multiple Instance Learning Object +2

Paper
Add Code

COCO-Stuff: Thing and Stuff Classes in Context

10 code implementations • CVPR 2018 • Holger Caesar, Jasper Uijlings, Vittorio Ferrari

To understand stuff and things in context we introduce COCO-Stuff, which augments all 164K images of the COCO 2017 dataset with pixel-wise annotations for 91 stuff classes.

Ranked #1 on Semantic Segmentation on COCO-Stuff

Image Captioning Semantic Segmentation +1

1,084

Paper
Code

Learning Semantic Part-Based Models from Google Images

no code implementations • 11 Sep 2016 • Davide Modolo, Vittorio Ferrari

We evaluate our models on the challenging PASCAL-Part dataset [1] and show how their performance increases at every step of the learning, with the final models more than doubling the performance of directly training from images retrieved by querying for part names (from 12. 9 to 27. 2 AP).

Object object-detection +1

Paper
Add Code

Weakly Supervised Object Localization Using Size Estimates

no code implementations • 15 Aug 2016 • Miaojing Shi, Vittorio Ferrari

We present a technique for weakly supervised object localization (WSOL), building on the observation that WSOL algorithms usually work better on images with bigger objects.

Object Weakly-Supervised Object Localization

Paper
Add Code

Region-based semantic segmentation with end-to-end training

1 code implementation • 26 Jul 2016 • Holger Caesar, Jasper Uijlings, Vittorio Ferrari

We propose a novel method for semantic segmentation, the task of labeling each pixel in an image with a semantic class.

Ranked #1 on Semantic Segmentation on SIFT-flow

Segmentation Semantic Segmentation

Paper
Code

Do semantic parts emerge in Convolutional Neural Networks?

no code implementations • 13 Jul 2016 • Abel Gonzalez-Garcia, Davide Modolo, Vittorio Ferrari

We also investigate the other direction: we determine which semantic parts are the most discriminative and whether they correspond to those parts emerging in the network.

Object Recognition

Paper
Add Code

End-to-end training of object class detectors for mean average precision

no code implementations • 12 Jul 2016 • Paul Henderson, Vittorio Ferrari

We present a method for training CNN-based object class detectors directly using mean average precision (mAP) as the training loss, in a truly end-to-end fashion that includes non-maximum suppression (NMS) at training time.

General Classification

Paper
Add Code

Discovering the Physical Parts of an Articulated Object Class From Multiple Videos

no code implementations • CVPR 2016 • Luca Del Pero, Susanna Ricco, Rahul Sukthankar, Vittorio Ferrari

We propose a motion-based method to discover the physical parts of an articulated object class (e. g. head/torso/leg of a horse) from multiple videos.

Motion Segmentation Object +1

Paper
Add Code

We don't need no bounding-boxes: Training object class detectors using only human verification

1 code implementation • CVPR 2016 • Dim P. Papadopoulos, Jasper R. R. Uijlings, Frank Keller, Vittorio Ferrari

Training object class detectors typically requires a large set of images in which objects are annotated by bounding-boxes.

796

Paper
Code

Behavior Discovery and Alignment of Articulated Object Classes from Unstructured Video

no code implementations • 30 Nov 2015 • Luca Del Pero, Susanna Ricco, Rahul Sukthankar, Vittorio Ferrari

On behavior discovery, we outperform the state-of-the-art Improved DTF descriptor.

Retrieval

Paper
Add Code

Automatically selecting inference algorithms for discrete energy minimisation

no code implementations • 19 Nov 2015 • Paul Henderson, Vittorio Ferrari

Minimisation of discrete energies defined over factors is an important problem in computer vision, and a vast number of MAP inference algorithms have been proposed.

Paper
Add Code

Joint Calibration for Semantic Segmentation

no code implementations • 6 Jul 2015 • Holger Caesar, Jasper Uijlings, Vittorio Ferrari

Semantic segmentation is the task of assigning a class-label to each pixel in an image.

Ranked #2 on Semantic Segmentation on SIFT-flow

Segmentation Semantic Segmentation

Paper
Add Code

What's the Point: Semantic Segmentation with Point Supervision

1 code implementation • 6 Jun 2015 • Amy Bearman, Olga Russakovsky, Vittorio Ferrari, Li Fei-Fei

The semantic image segmentation task presents a trade-off between test time accuracy and training-time annotation cost.

Image Segmentation Semantic Segmentation

Paper
Code

Situational Object Boundary Detection

no code implementations • CVPR 2015 • Jasper Uijlings, Vittorio Ferrari

Intuitively, the appearance of true object boundaries varies from image to image.

Boundary Detection Contour Detection +1

Paper
Add Code

Context Forest for efficient object detection with large mixture models

no code implementations • 3 Mar 2015 • Davide Modolo, Alexander Vezhnevets, Vittorio Ferrari

We present Context Forest (ConF), a technique for predicting properties of the objects in an image based on its global appearance.

object-detection Object Detection

Paper
Add Code

Joint calibration of Ensemble of Exemplar SVMs

no code implementations • CVPR 2015 • Davide Modolo, Alexander Vezhnevets, Olga Russakovsky, Vittorio Ferrari

We formulate joint calibration as a constrained optimization problem and devise an efficient optimization algorithm to find its global optimum.

object-detection Object Detection

Paper
Add Code

Object localization in ImageNet by looking out of the window

no code implementations • 6 Jan 2015 • Alexander Vezhnevets, Vittorio Ferrari

We propose a method for annotating the location of objects in ImageNet.

General Classification Object Localization

Paper
Add Code

Analysing domain shift factors between videos and images for object detection

1 code implementation • 6 Jan 2015 • Vicky Kalogeiton, Vittorio Ferrari, Cordelia Schmid

Object detection is one of the most important challenges in computer vision.

Object object-detection +1

Paper
Code

An active search strategy for efficient object class detection

no code implementations • CVPR 2015 • Abel Gonzalez-Garcia, Alexander Vezhnevets, Vittorio Ferrari

First, we exploit context as the statistical relation between the appearance of a window and its location relative to the object, as observed in the training set.

Object

Paper
Add Code

Recovering Spatiotemporal Correspondence between Deformable Objects by Exploiting Consistent Foreground Motion in Video

no code implementations • 1 Dec 2014 • Luca Del Pero, Susanna Ricco, Rahul Sukthankar, Vittorio Ferrari

Given unstructured videos of deformable objects, we automatically recover spatiotemporal correspondences to map one object to another (such as animals in the wild).

Object

Paper
Add Code

Articulated motion discovery using pairs of trajectories

no code implementations • CVPR 2015 • Luca Del Pero, Susanna Ricco, Rahul Sukthankar, Vittorio Ferrari

We propose an unsupervised approach for discovering characteristic motion patterns in videos of highly articulated objects performing natural, unscripted behaviors, such as tigers in the wild.

Paper
Add Code

Closed-Form Training of Conditional Random Fields for Large Scale Image Segmentation

no code implementations • 27 Mar 2014 • Alexander Kolesnikov, Matthieu Guillaumin, Vittorio Ferrari, Christoph H. Lampert

It is inspired by existing closed-form expressions for the maximum likelihood parameters of a generative graphical model with tree topology.

Image Segmentation Segmentation +1

Paper
Add Code

Associative embeddings for large-scale knowledge transfer with self-assessment

no code implementations • CVPR 2014 • Alexander Vezhnevets, Vittorio Ferrari

By transferring knowledge from the images that have bounding-box annotations to the others, our method is capable of automatically populating ImageNet with many more bounding-boxes and even pixel-level segmentations.

Gaussian Processes Object Localization +1

Paper
Add Code

Models of Semantic Representation with Visual Attributes

no code implementations • ACL 2013 • Carina Silberer, Vittorio Ferrari, Mirella Lapata

Image Retrieval Language Acquisition +3

Paper
Add Code

Fast Energy Minimization Using Learned State Filters

no code implementations • CVPR 2013 • Matthieu Guillaumin, Luc van Gool, Vittorio Ferrari

However, when the graph is fully connected and the pairwise potentials are arbitrary, the complexity of even approximate minimization algorithms such as TRW-S grows quadratically both in the number of nodes and in the number of states a node can take.

Paper
Add Code

Searching for objects driven by context

no code implementations • NeurIPS 2012 • Bogdan Alexe, Nicolas Heess, Yee W. Teh, Vittorio Ferrari

The dominant visual search paradigm for object class detection is sliding windows.

Paper
Add Code

Exploiting spatial overlap to efficiently compute appearance distances between image windows

no code implementations • NeurIPS 2011 • Bogdan Alexe, Viviana Petrescu, Vittorio Ferrari

We present a computationally efficient technique to compute the distance of high-dimensional appearance descriptor vectors between image windows.

Paper
Add Code

Who’s Doing What: Joint Modeling of Names and Verbs for Simultaneous Face and Pose Annotation

no code implementations • NeurIPS 2009 • Jie Luo, Barbara Caputo, Vittorio Ferrari

Given a corpus of news items consisting of images accompanied by text captions, we want to find out ``whos doing what, i. e. associate names and action verbs in the captions to the face and body pose of the persons in the images.

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.