Search Results for author: Amir R. Zamir

Found 12 papers, 10 papers with code

Robust Learning Through Cross-Task Consistency

1 code implementation • CVPR 2020 • Amir R. Zamir, Alexander Sax, Nikhil Cheerla, Rohan Suri, Zhangjie Cao, Jitendra Malik, Leonidas J. Guibas

Visual perception entails solving a wide set of tasks (e.g., object detection, depth estimation, etc.).

Ranked #1 on Surface Normals Estimation on Taskonomy

3D Reconstruction Monocular Depth Estimation +4

176

3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera

1 code implementation • ICCV 2019 • Iro Armeni, Zhi-Yang He, JunYoung Gwak, Amir R. Zamir, Martin Fischer, Jitendra Malik, Silvio Savarese

Given a 3D mesh and registered panoramic images, we construct a graph that spans the entire building and includes semantics on objects (e.g., class, material, and other attributes), rooms (e.g., scene category, volume, etc.)

213

Which Tasks Should Be Learned Together in Multi-task Learning?

1 code implementation • ICML 2020 • Trevor Standley, Amir R. Zamir, Dawn Chen, Leonidas Guibas, Jitendra Malik, Silvio Savarese

Many computer vision applications require solving multiple tasks in real-time.

Multi-Task Learning

An Information-Theoretic Metric of Transferability for Task Transfer Learning

1 code implementation • ICLR 2019 • Yajie Bao, Yang Li, Shao-Lun Huang, Lin Zhang, Amir R. Zamir, Leonidas J. Guibas

An important question in task transfer learning is to determine task transferability, i.e., given a common input domain, estimating to what extent representations learned from a source task can help in learning a target task.

General Classification Scene Understanding +1

Mid-Level Visual Representations Improve Generalization and Sample Efficiency for Learning Visuomotor Policies

1 code implementation • 31 Dec 2018 • Alexander Sax, Bradley Emi, Amir R. Zamir, Leonidas Guibas, Silvio Savarese, Jitendra Malik

This skill set (hereafter mid-level perception) provides the policy with a more processed state of the world compared to raw images.

Object Detection

107

On Evaluation of Embodied Navigation Agents

9 code implementations • 18 Jul 2018 • Peter Anderson, Angel Chang, Devendra Singh Chaplot, Alexey Dosovitskiy, Saurabh Gupta, Vladlen Koltun, Jana Kosecka, Jitendra Malik, Roozbeh Mottaghi, Manolis Savva, Amir R. Zamir

Skillful mobile operation in three-dimensional environments is a primary topic of study in Artificial Intelligence.

Benchmarking

1,765

Generic 3D Representation via Pose Estimation and Matching

1 code implementation • 23 Oct 2017 • Amir R. Zamir, Tilman Wekel, Pulkit Agrawal, Colin Weil, Jitendra Malik, Silvio Savarese

Though a large body of computer vision research has investigated developing generic semantic representations, efforts towards developing a similar representation for 3D have been limited.

Object Pose Estimation +1

425

Joint 2D-3D-Semantic Data for Indoor Scene Understanding

3 code implementations • 3 Feb 2017 • Iro Armeni, Sasha Sax, Amir R. Zamir, Silvio Savarese

We present a dataset of large-scale indoor spaces that provides a variety of mutually registered modalities from 2D, 2.5D and 3D domains, with instance-level semantic and geometric annotations.

Scene Understanding

460

Feedback Networks

1 code implementation • CVPR 2017 • Amir R. Zamir, Te-Lin Wu, Lin Sun, William Shen, Jitendra Malik, Silvio Savarese

Currently, the most successful learning models in computer vision are based on learning successive representations followed by a decision layer.

3D Semantic Parsing of Large-Scale Indoor Spaces

no code implementations • CVPR 2016 • Iro Armeni, Ozan Sener, Amir R. Zamir, Helen Jiang, Ioannis Brilakis, Martin Fischer, Silvio Savarese

In this paper, we propose a method for semantic parsing of the 3D point cloud of an entire building using a hierarchical approach: first, the raw data is parsed into semantically meaningful spaces (e.g., rooms, etc.) that are aligned into a canonical reference coordinate system.

Semantic Parsing

The THUMOS Challenge on Action Recognition for Videos "in the Wild"

no code implementations • 21 Apr 2016 • Haroon Idrees, Amir R. Zamir, Yu-Gang Jiang, Alex Gorban, Ivan Laptev, Rahul Sukthankar, Mubarak Shah

Additionally, we include a comprehensive empirical study evaluating the differences in action recognition between trimmed and untrimmed videos, and how well methods trained on trimmed videos generalize to untrimmed videos.

Action Classification Action Recognition +3

Structural-RNN: Deep Learning on Spatio-Temporal Graphs

2 code implementations • CVPR 2016 • Ashesh Jain, Amir R. Zamir, Silvio Savarese, Ashutosh Saxena

The proposed method is generic and principled as it can be used for transforming any spatio-temporal graph through employing a certain set of well-defined steps.

Ranked #4 on Skeleton Based Action Recognition on CAD-120

Human Pose Forecasting Skeleton Based Action Recognition

256
