3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera

1 code implementation ICCV 2019 Iro Armeni, Zhi-Yang He, JunYoung Gwak, Amir R. Zamir, Martin Fischer, Jitendra Malik, Silvio Savarese

Given a 3D mesh and registered panoramic images, we construct a graph that spans the entire building and includes semantics on objects (e.g., class, material, and other attributes), rooms (e.g., scene category, volume, etc.)

An Information-Theoretic Metric of Transferability for Task Transfer Learning

1 code implementation ICLR 2019 Yajie Bao, Yang Li, Shao-Lun Huang, Lin Zhang, Amir R. Zamir, Leonidas J. Guibas

An important question in task transfer learning is to determine task transferability, i.e., given a common input domain, estimating to what extent representations learned from a source task can help in learning a target task.

General Classification, Scene Understanding +1

Mid-Level Visual Representations Improve Generalization and Sample Efficiency for Learning Visuomotor Policies

1 code implementation 31 Dec 2018 Alexander Sax, Bradley Emi, Amir R. Zamir, Leonidas Guibas, Silvio Savarese, Jitendra Malik

This skill set (hereafter mid-level perception) provides the policy with a more processed state of the world compared to raw images.

Object Detection

Generic 3D Representation via Pose Estimation and Matching

1 code implementation 23 Oct 2017 Amir R. Zamir, Tilman Wekel, Pulkit Agrawal, Colin Weil, Jitendra Malik, Silvio Savarese

Though a large body of computer vision research has investigated developing generic semantic representations, efforts towards developing a similar representation for 3D have been limited.

Object Pose Estimation +1

Joint 2D-3D-Semantic Data for Indoor Scene Understanding

3 code implementations 3 Feb 2017 Iro Armeni, Sasha Sax, Amir R. Zamir, Silvio Savarese

We present a dataset of large-scale indoor spaces that provides a variety of mutually registered modalities from 2D, 2.5D and 3D domains, with instance-level semantic and geometric annotations.

Scene Understanding

Feedback Networks

1 code implementation CVPR 2017 Amir R. Zamir, Te-Lin Wu, Lin Sun, William Shen, Jitendra Malik, Silvio Savarese

Currently, the most successful learning models in computer vision are based on learning successive representations followed by a decision layer.

3D Semantic Parsing of Large-Scale Indoor Spaces

no code implementations CVPR 2016 Iro Armeni, Ozan Sener, Amir R. Zamir, Helen Jiang, Ioannis Brilakis, Martin Fischer, Silvio Savarese

In this paper, we propose a method for semantic parsing of the 3D point cloud of an entire building using a hierarchical approach: first, the raw data is parsed into semantically meaningful spaces (e.g., rooms, etc.) that are aligned into a canonical reference coordinate system.

Semantic Parsing

The THUMOS Challenge on Action Recognition for Videos "in the Wild"

no code implementations 21 Apr 2016 Haroon Idrees, Amir R. Zamir, Yu-Gang Jiang, Alex Gorban, Ivan Laptev, Rahul Sukthankar, Mubarak Shah

Additionally, we include a comprehensive empirical study evaluating the differences in action recognition between trimmed and untrimmed videos, and how well methods trained on trimmed videos generalize to untrimmed videos.

Action Classification, Action Recognition +3

Structural-RNN: Deep Learning on Spatio-Temporal Graphs

2 code implementations CVPR 2016 Ashesh Jain, Amir R. Zamir, Silvio Savarese, Ashutosh Saxena

The proposed method is generic and principled, as it can be used to transform any spatio-temporal graph by applying a certain set of well-defined steps.

Human Pose Forecasting, Skeleton Based Action Recognition

