Search Results for author: Sanja Fidler

Found 136 papers, 42 papers with code

Don’t Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence

no code implementations NeurIPS 2021 Tianshi Cao, Alex Bie, Arash Vahdat, Sanja Fidler, Karsten Kreis

Generative models trained with privacy constraints on private data can sidestep this challenge, providing indirect access to private data instead.

Scalable Neural Data Server: A Data Recommender for Transfer Learning

no code implementations NeurIPS 2021 Tianshi Cao, Sasha (Alexandre) Doubov, David Acuna, Sanja Fidler

Thus, the computational cost to each user grows with the number of sources and requires an expensive training step for each data provider. To address these issues, we propose Scalable Neural Data Server (SNDS), a large-scale search engine that can theoretically index thousands of datasets to serve relevant ML data to end users.

Transfer Learning

Neural Fields as Learnable Kernels for 3D Reconstruction

no code implementations 26 Nov 2021 Francis Williams, Zan Gojcic, Sameh Khamis, Denis Zorin, Joan Bruna, Sanja Fidler, Or Litany

We present Neural Kernel Fields: a novel method for reconstructing implicit 3D shapes based on a learned kernel ridge regression.

3D Reconstruction

Extracting Triangular 3D Models, Materials, and Lighting From Images

no code implementations 24 Nov 2021 Jacob Munkberg, Jon Hasselgren, Tianchang Shen, Jun Gao, Wenzheng Chen, Alex Evans, Thomas Müller, Sanja Fidler

We present an efficient method for joint optimization of topology, materials and lighting from multi-view image observations.

Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation

no code implementations NeurIPS 2021 David Acuna, Jonah Philion, Sanja Fidler

Alternative solutions seek to exploit driving simulators that can generate large amounts of labeled data with a plethora of content variations.

Autonomous Driving Domain Adaptation

Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis

no code implementations NeurIPS 2021 Tianchang Shen, Jun Gao, Kangxue Yin, Ming-Yu Liu, Sanja Fidler

The core of DMTet includes a deformable tetrahedral grid that encodes a discretized signed distance function and a differentiable marching tetrahedra layer that converts the implicit signed distance representation to the explicit surface mesh representation.

EditGAN: High-Precision Semantic Image Editing

no code implementations NeurIPS 2021 Huan Ling, Karsten Kreis, Daiqing Li, Seung Wook Kim, Antonio Torralba, Sanja Fidler

EditGAN builds on a GAN framework that jointly models images and their semantic segmentations, requiring only a handful of labeled examples, making it a scalable tool for editing.

Semantic Segmentation

Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence

no code implementations 1 Nov 2021 Tianshi Cao, Alex Bie, Arash Vahdat, Sanja Fidler, Karsten Kreis

Generative models trained with privacy constraints on private data can sidestep this challenge, providing indirect access to private data instead.

DIB-R++: Learning to Predict Lighting and Material with a Hybrid Differentiable Renderer

no code implementations NeurIPS 2021 Wenzheng Chen, Joey Litalien, Jun Gao, Zian Wang, Clement Fuji Tsang, Sameh Khamis, Or Litany, Sanja Fidler

We consider the challenging problem of predicting intrinsic object properties from a single image by exploiting differentiable renderers.

ATISS: Autoregressive Transformers for Indoor Scene Synthesis

no code implementations NeurIPS 2021 Despoina Paschalidou, Amlan Kar, Maria Shugrina, Karsten Kreis, Andreas Geiger, Sanja Fidler

The ability to synthesize realistic and diverse indoor furniture layouts automatically or based on partial input, unlocks many applications, from better interactive 3D tools to data synthesis for training and simulation.

Indoor Scene Synthesis

Causal Scene BERT: Improving object detection by searching for challenging groups

no code implementations 29 Sep 2021 Cinjon Resnick, Or Litany, Amlan Kar, Karsten Kreis, James Lucas, Kyunghyun Cho, Sanja Fidler

We verify that the prioritized groups found via intervention are challenging for the object detector and show that retraining with data collected from these groups helps inordinately compared to adding more IID data.

Autonomous Vehicles Object Detection

Physics-based Human Motion Estimation and Synthesis from Videos

no code implementations ICCV 2021 Kevin Xie, Tingwu Wang, Umar Iqbal, Yunrong Guo, Sanja Fidler, Florian Shkurti

We demonstrate both qualitatively and quantitatively significantly improved motion estimation, synthesis quality and physical plausibility achieved by our method on the large-scale Human3.6m dataset as compared to prior kinematic and physics-based methods.

Motion Capture Motion Estimation +2

Learning Indoor Inverse Rendering with 3D Spatially-Varying Lighting

no code implementations ICCV 2021 Zian Wang, Jonah Philion, Sanja Fidler, Jan Kautz

In this paper, we propose a unified, learning-based inverse rendering framework that formulates 3D spatially-varying lighting.

Image-to-Image Translation Translation

3DStyleNet: Creating 3D Shapes with Geometric and Texture Style Variations

no code implementations ICCV 2021 Kangxue Yin, Jun Gao, Maria Shugrina, Sameh Khamis, Sanja Fidler

Given a small set of high-quality textured objects, our method can create many novel stylized shapes, resulting in effortless 3D content creation and style-aware data augmentation.

3D Reconstruction Affine Transformation +2

NP-DRAW: A Non-Parametric Structured Latent Variable Model for Image Generation

1 code implementation 25 Jun 2021 Xiaohui Zeng, Raquel Urtasun, Richard Zemel, Sanja Fidler, Renjie Liao

1) We propose a non-parametric prior distribution over the appearance of image parts so that the latent variable "what-to-draw" per step becomes a categorical random variable.

Image Generation

f-Domain-Adversarial Learning: Theory and Algorithms

no code implementations 21 Jun 2021 David Acuna, Guojun Zhang, Marc T. Law, Sanja Fidler

Unsupervised domain adaptation is used in many machine learning applications where, during training, a model has access to unlabeled data in the target domain, and a related labeled dataset.

Learning Theory Unsupervised Domain Adaptation

Low Budget Active Learning via Wasserstein Distance: An Integer Programming Approach

no code implementations 5 Jun 2021 Rafid Mahmood, Sanja Fidler, Marc T. Law

Given restrictions on the availability of data, active learning is the process of training a model with limited labeled data by selecting a core subset of an unlabeled data pool to label.

Active Learning

DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort

1 code implementation CVPR 2021 Yuxuan Zhang, Huan Ling, Jun Gao, Kangxue Yin, Jean-Francois Lafleche, Adela Barriuso, Antonio Torralba, Sanja Fidler

To showcase the power of our approach, we generated datasets for 7 image segmentation tasks which include pixel-level labels for 34 human face parts, and 32 car parts.

Semantic Segmentation

Image-Level or Object-Level? A Tale of Two Resampling Strategies for Long-Tailed Detection

1 code implementation 12 Apr 2021 Nadine Chang, Zhiding Yu, Yu-Xiong Wang, Anima Anandkumar, Sanja Fidler, Jose M. Alvarez

As a result, image resampling alone is not enough to yield a sufficiently balanced distribution at the object level.

Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks

1 code implementation CVPR 2021 Despoina Paschalidou, Angelos Katharopoulos, Andreas Geiger, Sanja Fidler

The INN allows us to compute the inverse mapping of the homeomorphism, which in turn, enables the efficient computation of both the implicit surface function of a primitive and its mesh, without any additional post-processing.

Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes

1 code implementation CVPR 2021 Towaki Takikawa, Joey Litalien, Kangxue Yin, Karsten Kreis, Charles Loop, Derek Nowrouzezahrai, Alec Jacobson, Morgan McGuire, Sanja Fidler

We introduce an efficient neural representation that, for the first time, enables real-time rendering of high-fidelity neural SDFs, while achieving state-of-the-art geometry reconstruction quality.

Differentially Private Generative Models Through Optimal Transport

no code implementations 1 Jan 2021 Tianshi Cao, Alex Bie, Karsten Kreis, Sanja Fidler

Generative models trained with privacy constraints on private data can sidestep this challenge and provide indirect access to the private data instead.

f-Domain-Adversarial Learning: Theory and Algorithms for Unsupervised Domain Adaptation with Neural Networks

no code implementations 1 Jan 2021 David Acuna, Guojun Zhang, Marc T Law, Sanja Fidler

We provide empirical results for several f-divergences and show that some, not considered previously in domain-adversarial learning, achieve state-of-the-art results in practice.

Generalization Bounds Learning Theory +1

Personalized Federated Learning with First Order Model Optimization

no code implementations ICLR 2021 Michael Zhang, Karan Sapra, Sanja Fidler, Serena Yeung, Jose M. Alvarez

While federated learning traditionally aims to train a single global model across decentralized local datasets, one model may not always be ideal for all participating clients.

Personalized Federated Learning

Variational Amodal Object Completion

no code implementations NeurIPS 2020 Huan Ling, David Acuna, Karsten Kreis, Seung Wook Kim, Sanja Fidler

In images of complex scenes, objects are often occluding each other which makes perception tasks such as object detection and tracking, or robotic control tasks such as planning, challenging.

Object Detection

UniCon: Universal Neural Controller For Physics-based Character Motion

no code implementations 30 Nov 2020 Tingwu Wang, Yunrong Guo, Maria Shugrina, Sanja Fidler

The field of physics-based animation is gaining importance due to the increasing demand for realism in video games and films, and has recently seen wide adoption of data-driven techniques, such as deep reinforcement learning (RL), which learn control from (human) demonstrations.

Emergent Road Rules In Multi-Agent Driving Environments

1 code implementation ICLR 2021 Avik Pal, Jonah Philion, Yuan-Hong Liao, Sanja Fidler

For autonomous vehicles to safely share the road with human drivers, autonomous vehicles must abide by specific "road rules" that human drivers have agreed to follow.

Autonomous Vehicles

Learning Deformable Tetrahedral Meshes for 3D Reconstruction

1 code implementation NeurIPS 2020 Jun Gao, Wenzheng Chen, Tommy Xiang, Clement Fuji Tsang, Alec Jacobson, Morgan McGuire, Sanja Fidler

We introduce Deformable Tetrahedral Meshes (DefTet) as a particular parameterization that utilizes volumetric tetrahedral meshes for the reconstruction problem.

3D Reconstruction

Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering

no code implementations ICLR 2021 Yuxuan Zhang, Wenzheng Chen, Huan Ling, Jun Gao, Yinan Zhang, Antonio Torralba, Sanja Fidler

Key to our approach is to exploit GANs as a multi-view data generator to train an inverse graphics network using an off-the-shelf differentiable renderer, and the trained inverse graphics network as a teacher to disentangle the GAN's latent code into interpretable 3D properties.

Neural Rendering

Fed-Sim: Federated Simulation for Medical Imaging

no code implementations 1 Sep 2020 Daiqing Li, Amlan Kar, Nishant Ravikumar, Alejandro F. Frangi, Sanja Fidler

Since the model of geometry and material is disentangled from the imaging sensor, it can effectively be trained across multiple medical centers.

Federated Learning

Expressive Telepresence via Modular Codec Avatars

no code implementations ECCV 2020 Hang Chu, Shugao Ma, Fernando de la Torre, Sanja Fidler, Yaser Sheikh

It is important to note that traditional person-specific CAs are learned from few training samples, and typically lack robustness and have limited expressiveness when transferring facial expressions.

Interactive Annotation of 3D Object Geometry using 2D Scribbles

no code implementations ECCV 2020 Tianchang Shen, Jun Gao, Amlan Kar, Sanja Fidler

We implement our framework as a web service and conduct a user study, where we show that user annotated data using our method effectively facilitates real-world learning tasks.

ScribbleBox: Interactive Annotation Framework for Video Object Segmentation

no code implementations ECCV 2020 Bo-Wen Chen, Huan Ling, Xiaohui Zeng, Gao Jun, Ziyue Xu, Sanja Fidler

Our approach tolerates a modest amount of noise in the box placements, thus typically only a few clicks are needed to annotate tracked boxes to a sufficient accuracy.

Semantic Segmentation Video Object Segmentation +1

Beyond Fixed Grid: Learning Geometric Image Representation with a Deformable Grid

no code implementations ECCV 2020 Jun Gao, Zian Wang, Jinchen Xuan, Sanja Fidler

We also utilize DefGrid at the output layers for the task of object mask annotation, and show that reasoning about object boundaries on our predicted polygonal grid leads to more accurate results over existing pixel-wise and curve-based approaches.

Semantic Segmentation

Meta-Sim2: Unsupervised Learning of Scene Structure for Synthetic Data Generation

no code implementations ECCV 2020 Jeevan Devaranjan, Amlan Kar, Sanja Fidler

In Meta-Sim2, we aim to learn the scene structure in addition to parameters, which is a challenging problem due to its discrete nature.

Synthetic Data Generation

Learning to Generate Diverse Dance Motions with Transformer

no code implementations 18 Aug 2020 Jiaman Li, Yihang Yin, Hang Chu, Yi Zhou, Tingwu Wang, Sanja Fidler, Hao Li

We also introduce new evaluation metrics for the quality of synthesized dance motions, and demonstrate that our system can outperform state-of-the-art methods.

Motion Capture motion synthesis

Lift, Splat, Shoot: Encoding Images From Arbitrary Camera Rigs by Implicitly Unprojecting to 3D

1 code implementation ECCV 2020 Jonah Philion, Sanja Fidler

By training on the entire camera rig, we provide evidence that our model is able to learn not only how to represent images but how to fuse predictions from all cameras into a single cohesive representation of the scene while being robust to calibration error.

Autonomous Vehicles Motion Planning +1

Efficient and Information-Preserving Future Frame Prediction and Beyond

no code implementations ICLR 2020 Wei Yu, Yichao Lu, Steve Easterbrook, Sanja Fidler

Applying resolution-preserving blocks is a common practice to maximize information preservation in video prediction, yet their high memory consumption greatly limits their application scenarios.

Object Detection Self-Supervised Learning +1

The EPIC-KITCHENS Dataset: Collection, Challenges and Baselines

2 code implementations 29 Apr 2020 Dima Damen, Hazel Doughty, Giovanni Maria Farinella, Sanja Fidler, Antonino Furnari, Evangelos Kazakos, Davide Moltisanti, Jonathan Munro, Toby Perrett, Will Price, Michael Wray

Our dataset features 55 hours of video consisting of 11.5M frames, which we densely labelled for a total of 39.6K action segments and 454.2K object bounding boxes.

Learning to Evaluate Perception Models Using Planner-Centric Metrics

no code implementations CVPR 2020 Jonah Philion, Amlan Kar, Sanja Fidler

The downside of these metrics is that, at worst, they penalize all incorrect detections equally without conditioning on the task or scene, and at best, heuristics need to be chosen to ensure that different mistakes count differently.

3D Object Detection

Neural Data Server: A Large-Scale Search Engine for Transfer Learning Data

no code implementations CVPR 2020 Xi Yan, David Acuna, Sanja Fidler

NDS consists of a dataserver which indexes several large popular image datasets, and aims to recommend data to a client, an end-user with a target application with its own small labeled dataset.

Image Classification Instance Segmentation +3

The Shmoop Corpus: A Dataset of Stories with Loosely Aligned Summaries

1 code implementation 30 Dec 2019 Atef Chaudhury, Makarand Tapaswi, Seung Wook Kim, Sanja Fidler

Understanding stories is a challenging reading comprehension problem for machines as it requires reading a large volume of text and following long-range dependencies.

Abstractive Text Summarization Question Answering +1

CrevNet: Conditionally Reversible Video Prediction

no code implementations 25 Oct 2019 Wei Yu, Yichao Lu, Steve Easterbrook, Sanja Fidler

Applying resolution-preserving blocks is a common practice to maximize information preservation in video prediction, yet their high memory consumption greatly limits their application scenarios.

Video Prediction

Neural Turtle Graphics for Modeling City Road Layouts

no code implementations ICCV 2019 Hang Chu, Daiqing Li, David Acuna, Amlan Kar, Maria Shugrina, Xinkai Wei, Ming-Yu Liu, Antonio Torralba, Sanja Fidler

We propose Neural Turtle Graphics (NTG), a novel generative model for spatial graphs, and demonstrate its applications in modeling city road layouts.

A Theoretical Analysis of the Number of Shots in Few-Shot Learning

no code implementations ICLR 2020 Tianshi Cao, Marc Law, Sanja Fidler

We introduce a theoretical analysis of the impact of the shot number on Prototypical Networks, a state-of-the-art few-shot classification method.

Classification Few-Shot Learning +1

Video Face Clustering with Unknown Number of Clusters

1 code implementation ICCV 2019 Makarand Tapaswi, Marc T. Law, Sanja Fidler

Understanding videos such as TV series and movies requires analyzing who the characters are and what they are doing.

Face Clustering Metric Learning

Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer

1 code implementation NeurIPS 2019 Wenzheng Chen, Jun Gao, Huan Ling, Edward J. Smith, Jaakko Lehtinen, Alec Jacobson, Sanja Fidler

Many machine learning models operate on images, but ignore the fact that images are 2D projections formed by 3D geometry interacting with light, in a process called rendering.

Gated-SCNN: Gated Shape CNNs for Semantic Segmentation

3 code implementations ICCV 2019 Towaki Takikawa, David Acuna, Varun Jampani, Sanja Fidler

Here, we propose a new two-stream CNN architecture for semantic segmentation that explicitly wires shape information as a separate processing branch, i.e. shape stream, that processes information in parallel to the classical stream.

Semantic Segmentation

Neural Graph Evolution: Towards Efficient Automatic Robot Design

1 code implementation 12 Jun 2019 Tingwu Wang, Yuhao Zhou, Sanja Fidler, Jimmy Ba

To address the two challenges, we formulate automatic robot design as a graph search problem and perform evolution search in graph space.

EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis

1 code implementation 15 May 2019 Chaoqi Wang, Roger Grosse, Sanja Fidler, Guodong Zhang

Reducing the test time resource requirements of a neural network while preserving test accuracy is crucial for running inference on resource-constrained devices.

Network Pruning

Neural Graph Evolution: Automatic Robot Design

no code implementations ICLR 2019 Tingwu Wang, Yuhao Zhou, Sanja Fidler, Jimmy Ba

To address the two challenges, we formulate automatic robot design as a graph search problem and perform evolution search in graph space.

ACTRCE: Augmenting Experience via Teacher’s Advice

no code implementations ICLR 2019 Yuhuai Wu, Harris Chan, Jamie Kiros, Sanja Fidler, Jimmy Ba

Sparse reward is one of the most challenging problems in reinforcement learning (RL).

Meta-Sim: Learning to Generate Synthetic Datasets

no code implementations ICCV 2019 Amlan Kar, Aayush Prakash, Ming-Yu Liu, Eric Cameracci, Justin Yuan, Matt Rusiniak, David Acuna, Antonio Torralba, Sanja Fidler

Training models to high-end performance requires availability of large labeled datasets, which are expensive to get.

Devil is in the Edges: Learning Semantic Boundaries from Noisy Annotations

1 code implementation CVPR 2019 David Acuna, Amlan Kar, Sanja Fidler

We further reason about true object boundaries during training using a level set formulation, which allows the network to learn from misaligned labels in an end-to-end fashion.

Semantic Segmentation

Action Recognition from Single Timestamp Supervision in Untrimmed Videos

1 code implementation CVPR 2019 Davide Moltisanti, Sanja Fidler, Dima Damen

We propose a method that is supervised by single timestamps located around each action instance, in untrimmed videos.

Action Recognition

Mimicking the In-Camera Color Pipeline for Camera-Aware Object Compositing

no code implementations 27 Mar 2019 Jun Gao, Xiao Li, Li-Wei Wang, Sanja Fidler, Stephen Lin

We present a method for compositing virtual objects into a photograph such that the object colors appear to have been processed by the photo's camera imaging pipeline.

Fast Interactive Object Annotation with Curve-GCN

2 code implementations CVPR 2019 Huan Ling, Jun Gao, Amlan Kar, Wenzheng Chen, Sanja Fidler

Our model runs at 29.3ms in automatic, and 2.6ms in interactive mode, making it 10x and 100x faster than Polygon-RNN++.

ACTRCE: Augmenting Experience via Teacher's Advice For Multi-Goal Reinforcement Learning

no code implementations 12 Feb 2019 Harris Chan, Yuhuai Wu, Jamie Kiros, Sanja Fidler, Jimmy Ba

We first analyze the differences among goal representation, and show that ACTRCE can efficiently solve difficult reinforcement learning problems in challenging 3D navigation tasks, whereas HER with non-language goal representation failed to learn.

Multi-Goal Reinforcement Learning

A Face-to-Face Neural Conversation Model

no code implementations CVPR 2018 Hang Chu, Daiqing Li, Sanja Fidler

The decoder consists of two layers, where the lower layer aims at generating the verbal response and coarse facial expressions, while the second layer fills in the subtle gestures, making the generated output more smooth and natural.

SurfConv: Bridging 3D and 2D Convolution for RGBD Images

1 code implementation CVPR 2018 Hang Chu, Wei-Chiu Ma, Kaustav Kundu, Raquel Urtasun, Sanja Fidler

On the other hand, 3D convolution wastes a large amount of memory on mostly unoccupied 3D space, which consists of only the surface visible to the sensor.

3D Semantic Segmentation

Learning to Caption Images through a Lifetime by Asking Questions

1 code implementation 1 Dec 2018 Kevin Shen, Amlan Kar, Sanja Fidler

In order to bring artificial agents into our lives, we will need to go beyond supervised learning on closed datasets to having the ability to continuously expand knowledge.

Active Learning Image Captioning

A Neural Compositional Paradigm for Image Captioning

1 code implementation NeurIPS 2018 Bo Dai, Sanja Fidler, Dahua Lin

Mainstream captioning models often follow a sequential structure to generate captions, leading to issues such as introduction of irrelevant semantics, lack of diversity in the generated captions, and inadequate generalization performance.

Image Captioning

Pose Estimation for Objects with Rotational Symmetry

no code implementations 13 Oct 2018 Enric Corona, Kaustav Kundu, Sanja Fidler

In particular, our aim is to infer poses for objects not seen at training time, but for which their 3D CAD models are available at test time.

Pose Estimation

VirtualHome: Simulating Household Activities via Programs

2 code implementations CVPR 2018 Xavier Puig, Kevin Ra, Marko Boben, Jiaman Li, Tingwu Wang, Sanja Fidler, Antonio Torralba

We then implement the most common atomic (inter)actions in the Unity3D game engine, and use our programs to "drive" an artificial agent to execute tasks in a simulated household environment.

Video Understanding

Color Sails: Discrete-Continuous Palettes for Deep Color Exploration

no code implementations 7 Jun 2018 Maria Shugrina, Amlan Kar, Karan Singh, Sanja Fidler

Then, the user can adjust color sail parameters to change the base colors, their blending behavior and the number of colors, exploring a wide range of options for the original design.

Visual Reasoning by Progressive Module Networks

1 code implementation ICLR 2019 Seung Wook Kim, Makarand Tapaswi, Sanja Fidler

Thus, a module for a new task learns to query existing modules and composes their outputs in order to produce its own output.

Visual Reasoning

Now You Shake Me: Towards Automatic 4D Cinema

no code implementations CVPR 2018 Yuhao Zhou, Makarand Tapaswi, Sanja Fidler

We are interested in enabling automatic 4D cinema by parsing physical and special effects from untrimmed movies.

MovieGraphs: Towards Understanding Human-Centric Situations from Videos

no code implementations CVPR 2018 Paul Vicol, Makarand Tapaswi, Lluis Castrejon, Sanja Fidler

Towards this goal, we introduce a novel dataset called MovieGraphs which provides detailed, graph-based annotations of social situations depicted in movie clips.

Common Sense Reasoning

Be Your Own Prada: Fashion Synthesis with Structural Coherence

no code implementations ICCV 2017 Shizhan Zhu, Sanja Fidler, Raquel Urtasun, Dahua Lin, Chen Change Loy

In the second stage, a generative model with a newly proposed compositional mapping layer is used to render the final image with precise regions and textures conditioned on this map.

Fashion Synthesis Semantic Segmentation

3D Graph Neural Networks for RGBD Semantic Segmentation

2 code implementations ICCV 2017 Xiaojuan Qi, Renjie Liao, Jiaya Jia, Sanja Fidler, Raquel Urtasun

Each node in the graph corresponds to a set of points and is associated with a hidden representation vector initialized with an appearance feature extracted by a unary CNN from 2D images.

Semantic Segmentation

SGN: Sequential Grouping Networks for Instance Segmentation

no code implementations ICCV 2017 Shu Liu, Jiaya Jia, Sanja Fidler, Raquel Urtasun

By exploiting two-directional information, the second network groups horizontal and vertical lines into connected components.

Instance Segmentation Semantic Segmentation

Sports Field Localization via Deep Structured Models

no code implementations CVPR 2017 Namdar Homayounfar, Sanja Fidler, Raquel Urtasun

In this work, we propose a novel way of efficiently localizing a sports field from a single broadcast image of the game.

Semantic Segmentation

Scene Parsing Through ADE20K Dataset

no code implementations CVPR 2017 Bolei Zhou, Hang Zhao, Xavier Puig, Sanja Fidler, Adela Barriuso, Antonio Torralba

A novel network design called Cascade Segmentation Module is proposed to parse a scene into stuff, objects, and object parts in a cascade and improve over the baselines.

Scene Parsing

Annotating Object Instances with a Polygon-RNN

1 code implementation CVPR 2017 Lluis Castrejon, Kaustav Kundu, Raquel Urtasun, Sanja Fidler

We show that our approach speeds up the annotation process by a factor of 4.7 across all classes in Cityscapes, while achieving 78.4% agreement in IoU with original ground-truth, matching the typical agreement between human annotators.

Semantic Segmentation

Open Vocabulary Scene Parsing

no code implementations ICCV 2017 Hang Zhao, Xavier Puig, Bolei Zhou, Sanja Fidler, Antonio Torralba

Recognizing arbitrary objects in the wild has been a challenging problem due to the limitations of existing classification models and datasets.

General Classification Scene Parsing

Towards Diverse and Natural Image Descriptions via a Conditional GAN

1 code implementation ICCV 2017 Bo Dai, Sanja Fidler, Raquel Urtasun, Dahua Lin

Despite the substantial progress in recent years, the image captioning techniques are still far from being perfect. Sentences produced by existing methods, e.g. those based on RNNs, are often overly rigid and lacking in variability.

Image Captioning

Proximal Deep Structured Models

no code implementations NeurIPS 2016 Shenlong Wang, Sanja Fidler, Raquel Urtasun

Many problems in real-world applications involve predicting continuous-valued random variables that are statistically related.

Image Denoising Optical Flow Estimation

TorontoCity: Seeing the World with a Million Eyes

no code implementations ICCV 2017 Shenlong Wang, Min Bai, Gellert Mattyus, Hang Chu, Wenjie Luo, Bin Yang, Justin Liang, Joel Cheverie, Sanja Fidler, Raquel Urtasun

In this paper we introduce the TorontoCity benchmark, which covers the full greater Toronto area (GTA) with 712.5 km² of land, 8439 km of road and around 400,000 buildings.

Instance Segmentation Semantic Segmentation

Efficient Summarization with Read-Again and Copy Mechanism

no code implementations 10 Nov 2016 Wenyuan Zeng, Wenjie Luo, Sanja Fidler, Raquel Urtasun

Towards this goal, we first introduce a simple mechanism that first reads the input sequence before committing to a representation of each word.

Semantic Understanding of Scenes through the ADE20K Dataset

20 code implementations 18 Aug 2016 Bolei Zhou, Hang Zhao, Xavier Puig, Tete Xiao, Sanja Fidler, Adela Barriuso, Antonio Torralba

Scene parsing, or recognizing and segmenting objects and stuff in an image, is one of the key problems in computer vision.

Scene Parsing Semantic Segmentation

Find your Way by Observing the Sun and Other Semantic Cues

no code implementations 23 Jun 2016 Wei-Chiu Ma, Shenlong Wang, Marcus A. Brubaker, Sanja Fidler, Raquel Urtasun

In this paper we present a robust, efficient and affordable approach to self-localization which requires neither GPS nor knowledge about the appearance of the world.

HD Maps: Fine-Grained Road Segmentation by Parsing Ground and Aerial Images

no code implementations CVPR 2016 Gellert Mattyus, Shenlong Wang, Sanja Fidler, Raquel Urtasun

In this paper we present an approach to enhance existing maps with fine grained segmentation categories such as parking spots and sidewalk, as well as the number and location of road lanes.

Soccer Field Localization from a Single Image

no code implementations 10 Apr 2016 Namdar Homayounfar, Sanja Fidler, Raquel Urtasun

In this work, we propose a novel way of efficiently localizing a soccer field from a single broadcast image of the game.

Instance-Level Segmentation for Autonomous Driving with Deep Densely Connected MRFs

no code implementations CVPR 2016 Ziyu Zhang, Sanja Fidler, Raquel Urtasun

Our aim is to provide a pixel-wise instance-level labeling of a monocular image in the context of autonomous driving.

Autonomous Driving

Enhancing Road Maps by Parsing Aerial Images Around the World

no code implementations ICCV 2015 Gellert Mattyus, Shenlong Wang, Sanja Fidler, Raquel Urtasun

In recent years, contextual models that exploit maps have been shown to be very effective for many recognition and localization tasks.

Semantic Segmentation

Learning to Combine Mid-Level Cues for Object Proposal Generation

no code implementations ICCV 2015 Tom Lee, Sanja Fidler, Sven Dickinson

In this paper, we introduce Parametric Min-Loss (PML), a novel structured learning framework for parametric energy functions.

Object Proposal Generation Object Recognition

Lost Shopping! Monocular Localization in Large Indoor Spaces

no code implementations ICCV 2015 Shenlong Wang, Sanja Fidler, Raquel Urtasun

In this paper we propose a novel approach to localization in very large indoor spaces (i.e., 200+ store shopping malls) that takes a single image and a floor plan of the environment as input.

Translation

Order-Embeddings of Images and Language

1 code implementation 19 Nov 2015 Ivan Vendrov, Ryan Kiros, Sanja Fidler, Raquel Urtasun

Hypernymy, textual entailment, and image captioning can be seen as special cases of a single visual-semantic hierarchy over words, sentences, and images.

Cross-Modal Retrieval Image Captioning +1

Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books

3 code implementations ICCV 2015 Yukun Zhu, Ryan Kiros, Richard Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, Sanja Fidler

Books are a rich source of both fine-grained information (what a character, an object or a scene looks like) and high-level semantics (what someone is thinking and feeling, and how these states evolve through a story).

Sentence Embedding

Skip-Thought Vectors

16 code implementations NeurIPS 2015 Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, Richard S. Zemel, Antonio Torralba, Raquel Urtasun, Sanja Fidler

The end result is an off-the-shelf encoder that can produce highly generic sentence representations that are robust and perform well in practice.

Rent3D: Floor-Plan Priors for Monocular Layout Estimation

no code implementations CVPR 2015 Chenxi Liu, Alexander G. Schwing, Kaustav Kundu, Raquel Urtasun, Sanja Fidler

What sets us apart from past work in layout estimation is the use of floor plans as a source of prior knowledge, as well as localization of each image within a bigger space (apartment).

Predicting Deep Zero-Shot Convolutional Neural Networks using Textual Descriptions

no code implementations ICCV 2015 Jimmy Ba, Kevin Swersky, Sanja Fidler, Ruslan Salakhutdinov

One of the main challenges in Zero-Shot Learning of visual categories is gathering semantic attributes to accompany images.

Zero-Shot Learning

Generating Multi-Sentence Lingual Descriptions of Indoor Scenes

no code implementations 28 Feb 2015 Dahua Lin, Chen Kong, Sanja Fidler, Raquel Urtasun

This paper proposes a novel framework for generating lingual descriptions of indoor scenes.

Text Generation

A Framework for Symmetric Part Detection in Cluttered Scenes

no code implementations 5 Feb 2015 Tom Lee, Sanja Fidler, Alex Levinshtein, Cristian Sminchisescu, Sven Dickinson

The role of symmetry in computer vision has waxed and waned in importance during the evolution of the field from its earliest days.

Neuroaesthetics in Fashion: Modeling the Perception of Fashionability

no code implementations Conference 2015 Edgar Simo-Serra, Sanja Fidler, Francesc Moreno-Noguer, Raquel Urtasun

Importantly, our model is able to give rich feedback to the user, conveying which garments or even scenery she/he should change in order to improve fashionability.

Learning a Hierarchical Compositional Shape Vocabulary for Multi-class Object Representation

no code implementations 23 Aug 2014 Sanja Fidler, Marko Boben, Ales Leonardis

At the top level of the vocabulary, the compositions are sufficiently large and complex to represent the whole shapes of the objects.

Human-Machine CRFs for Identifying Bottlenecks in Holistic Scene Understanding

no code implementations 16 Jun 2014 Roozbeh Mottaghi, Sanja Fidler, Alan Yuille, Raquel Urtasun, Devi Parikh

Recent trends in image understanding have pushed for holistic scene understanding models that jointly reason about various tasks such as object detection, scene recognition, shape analysis, contextual reasoning, and local appearance based classifiers.

Object Detection Scene Recognition +2

Beat the MTurkers: Automatic Image Labeling from Weak 3D Supervision

1 code implementation CVPR 2014 Liang-Chieh Chen, Sanja Fidler, Alan L. Yuille, Raquel Urtasun

Labeling large-scale datasets with very accurate object segmentations is an elaborate task that requires a high degree of quality control and a budget of tens or hundreds of thousands of dollars.

Autonomous Driving

Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs

no code implementations CVPR 2013 Roozbeh Mottaghi, Sanja Fidler, Jian Yao, Raquel Urtasun, Devi Parikh

Recent trends in semantic image segmentation have pushed for holistic scene understanding models that jointly reason about various tasks such as object detection, scene recognition, shape analysis, and contextual reasoning.

Object Detection Scene Recognition +2

A Sentence Is Worth a Thousand Pixels

no code implementations CVPR 2013 Sanja Fidler, Abhishek Sharma, Raquel Urtasun

We are interested in holistic scene understanding where images are accompanied by text in the form of complex sentential descriptions.

Re-Ranking Scene Understanding +2

Bottom-Up Segmentation for Top-Down Detection

no code implementations CVPR 2013 Sanja Fidler, Roozbeh Mottaghi, Alan Yuille, Raquel Urtasun

When employing the parts, we outperform the original DPM [14] in 19 out of 20 classes, achieving an improvement of 8% AP.

Object Detection Semantic Segmentation

3D Object Detection and Viewpoint Estimation with a Deformable 3D Cuboid Model

no code implementations NeurIPS 2012 Sanja Fidler, Sven Dickinson, Raquel Urtasun

We demonstrate the effectiveness of our approach in indoor and outdoor scenarios, and show that our approach outperforms the state-of-the-art in both 2D [Felz09] and 3D object detection [Hedau12].

3D Object Detection Viewpoint Estimation

Evaluating multi-class learning strategies in a generative hierarchical framework for object detection

no code implementations NeurIPS 2009 Sanja Fidler, Marko Boben, Ales Leonardis

We explore and compare their computational behavior (space and time) and detection performance as a function of the number of learned classes on several recognition data sets.

Object Detection
