Search Results for author: Georgia Gkioxari

Found 25 papers, 16 papers with code

Differentiable Stereopsis: Meshes from multiple views using differentiable rendering

no code implementations11 Oct 2021 Shubham Goel, Georgia Gkioxari, Jitendra Malik

We propose Differentiable Stereopsis, a multi-view stereo approach that reconstructs shape and texture from few input views and noisy cameras.

Compressed Object Detection

1 code implementation4 Feb 2021 Gedeon Muhawenayo, Georgia Gkioxari

With our approach, we are able to compress a state-of-the-art object detection model by 30. 0% without a loss in performance.

Model Compression Object Detection +2

Accelerating 3D Deep Learning with PyTorch3D

3 code implementations16 Jul 2020 Nikhila Ravi, Jeremy Reizenstein, David Novotny, Taylor Gordon, Wan-Yen Lo, Justin Johnson, Georgia Gkioxari

We address these challenges by introducing PyTorch3D, a library of modular, efficient, and differentiable operators for 3D deep learning.

Autonomous Vehicles

3D Shape Reconstruction from Vision and Touch

1 code implementation NeurIPS 2020 Edward J. Smith, Roberto Calandra, Adriana Romero, Georgia Gkioxari, David Meger, Jitendra Malik, Michal Drozdzal

When a toddler is presented a new toy, their instinctual behaviour is to pick it upand inspect it with their hand and eyes in tandem, clearly searching over its surface to properly understand what they are playing with.

3D Shape Reconstruction

SynSin: End-to-end View Synthesis from a Single Image

3 code implementations CVPR 2020 Olivia Wiles, Georgia Gkioxari, Richard Szeliski, Justin Johnson

Single image view synthesis allows for the generation of new views of a scene given a single input image.

Novel View Synthesis

Bayesian Relational Memory for Semantic Visual Navigation

1 code implementation ICCV 2019 Yi Wu, Yuxin Wu, Aviv Tamar, Stuart Russell, Georgia Gkioxari, Yuandong Tian

We introduce a new memory architecture, Bayesian Relational Memory (BRM), to improve the generalization ability for semantic visual navigation agents in unseen environments, where an agent is given a semantic target to navigate towards.

Visual Navigation

Mesh R-CNN

5 code implementations ICCV 2019 Georgia Gkioxari, Jitendra Malik, Justin Johnson

We propose a system that detects objects in real-world images and produces a triangle mesh giving the full 3D shape of each detected object.

3D Shape Modeling

Multi-Target Embodied Question Answering

1 code implementation CVPR 2019 Licheng Yu, Xinlei Chen, Georgia Gkioxari, Mohit Bansal, Tamara L. Berg, Dhruv Batra

To address this, we propose a modular architecture composed of a program generator, a controller, a navigator, and a VQA module.

Embodied Question Answering Question Answering

Embodied Question Answering in Photorealistic Environments with Point Cloud Perception

no code implementations CVPR 2019 Erik Wijmans, Samyak Datta, Oleksandr Maksymets, Abhishek Das, Georgia Gkioxari, Stefan Lee, Irfan Essa, Devi Parikh, Dhruv Batra

To help bridge the gap between internet vision-style problems and the goal of vision for embodied perception we instantiate a large-scale navigation task -- Embodied Question Answering [1] in photo-realistic environments (Matterport 3D).

Embodied Question Answering Question Answering

Neural Modular Control for Embodied Question Answering

2 code implementations26 Oct 2018 Abhishek Das, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra

We use imitation learning to warm-start policies at each level of the hierarchy, dramatically increasing sample efficiency, followed by reinforcement learning.

Embodied Question Answering Imitation Learning +1

Learning and Planning with a Semantic Model

no code implementations ICLR 2019 Yi Wu, Yuxin Wu, Aviv Tamar, Stuart Russell, Georgia Gkioxari, Yuandong Tian

Building deep reinforcement learning agents that can generalize and adapt to unseen environments remains a fundamental challenge for AI.

Visual Navigation

Building Generalizable Agents with a Realistic and Rich 3D Environment

5 code implementations ICLR 2018 Yi Wu, Yuxin Wu, Georgia Gkioxari, Yuandong Tian

To generalize to unseen environments, an agent needs to be robust to low-level variations (e. g. color, texture, object changes), and also high-level variations (e. g. layout changes of the environment).

Data Augmentation

Data Distillation: Towards Omni-Supervised Learning

4 code implementations CVPR 2018 Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, Kaiming He

We investigate omni-supervised learning, a special regime of semi-supervised learning in which the learner exploits all available labeled data plus internet-scale sources of unlabeled data.

Keypoint Detection Object Detection

Embodied Question Answering

4 code implementations CVPR 2018 Abhishek Das, Samyak Datta, Georgia Gkioxari, Stefan Lee, Devi Parikh, Dhruv Batra

We present a new AI task -- Embodied Question Answering (EmbodiedQA) -- where an agent is spawned at a random location in a 3D environment and asked a question ("What color is the car?").

Embodied Question Answering Question Answering

Detecting and Recognizing Human-Object Interactions

2 code implementations CVPR 2018 Georgia Gkioxari, Ross Girshick, Piotr Dollár, Kaiming He

Our hypothesis is that the appearance of a person -- their pose, clothing, action -- is a powerful cue for localizing the objects they are interacting with.

Human-Object Interaction Detection

Mask R-CNN

150 code implementations ICCV 2017 Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick

Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance.

3D Instance Segmentation Human Part Segmentation +7

Chained Predictions Using Convolutional Neural Networks

no code implementations8 May 2016 Georgia Gkioxari, Alexander Toshev, Navdeep Jaitly

In this model the output variables for a given input are predicted sequentially using neural networks.

Pose Estimation

Contextual Action Recognition with R*CNN

2 code implementations ICCV 2015 Georgia Gkioxari, Ross Girshick, Jitendra Malik

In this work, we exploit the simple observation that actions are accompanied by contextual cues to build a strong action recognition system.

Action Recognition General Classification +1

R-CNNs for Pose Estimation and Action Detection

no code implementations19 Jun 2014 Georgia Gkioxari, Bharath Hariharan, Ross Girshick, Jitendra Malik

We present convolutional neural networks for the tasks of keypoint (pose) prediction and action classification of people in unconstrained images.

Action Classification Action Detection +3

Using k-Poselets for Detecting People and Localizing Their Keypoints

no code implementations CVPR 2014 Georgia Gkioxari, Bharath Hariharan, Ross Girshick, Jitendra Malik

A k-poselet is a deformable part model (DPM) with k parts, where each of the parts is a poselet, aligned to a specific configuration of keypoints based on ground-truth annotations.

Human Detection

Articulated Pose Estimation Using Discriminative Armlet Classifiers

no code implementations CVPR 2013 Georgia Gkioxari, Pablo Arbelaez, Lubomir Bourdev, Jitendra Malik

We propose a novel approach for human pose estimation in real-world cluttered scenes, and focus on the challenging problem of predicting the pose of both arms for each person in the image.

Pose Estimation

Cannot find the paper you are looking for? You can Submit a new open access paper.