Search Results for author: Katerina Fragkiadaki

Found 36 papers, 12 papers with code

TIDEE: Tidying Up Novel Rooms using Visuo-Semantic Commonsense Priors

no code implementations 21 Jul 2022 Gabriel Sarch, Zhaoyuan Fang, Adam W. Harley, Paul Schydlo, Michael J. Tarr, Saurabh Gupta, Katerina Fragkiadaki

We introduce TIDEE, an embodied agent that tidies up a disordered scene based on learned commonsense object placement and room arrangement priors.

A Simple Baseline for BEV Perception Without LiDAR

no code implementations 16 Jun 2022 Adam W. Harley, Zhaoyuan Fang, Jie Li, Rares Ambrus, Katerina Fragkiadaki

Current methods use multi-view RGB data collected from cameras around the vehicle and neurally "lift" features from the perspective images to the 2D ground plane, yielding a "bird's eye view" (BEV) feature representation of the 3D space around the vehicle.

Autonomous Vehicles Data Augmentation
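The "lifting" step described in the abstract can be sketched, under strong simplifying assumptions (a single camera, a flat ground plane, nearest-neighbour sampling instead of learned or bilinear sampling), as projecting each BEV grid cell into the image and copying the feature found there. The function name and interface below are illustrative, not the paper's implementation:

```python
import numpy as np

def lift_to_bev(feat, K, bev_xs, bev_zs, y_ground=0.0):
    """Sample perspective-image features onto a ground-plane (BEV) grid.

    feat: (H, W, C) feature map from one camera.
    K:    (3, 3) camera intrinsics.
    Returns a (len(bev_zs), len(bev_xs), C) BEV feature map,
    zeros where a grid cell projects outside the image.
    """
    H, W, C = feat.shape
    bev = np.zeros((len(bev_zs), len(bev_xs), C), dtype=feat.dtype)
    for i, z in enumerate(bev_zs):
        for j, x in enumerate(bev_xs):
            p = K @ np.array([x, y_ground, z])   # project 3D ground point
            if p[2] <= 0:
                continue                          # behind the camera
            u, v = p[0] / p[2], p[1] / p[2]       # perspective divide
            ui, vi = int(round(u)), int(round(v))
            if 0 <= ui < W and 0 <= vi < H:
                bev[i, j] = feat[vi, ui]          # nearest-neighbour sample
    return bev
```

A learned model would replace the nearest-neighbour copy with differentiable sampling and fuse multiple cameras, but the geometry of the lift is the same.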

Particle Video Revisited: Tracking Through Occlusions Using Point Trajectories

1 code implementation 8 Apr 2022 Adam W. Harley, Zhaoyuan Fang, Katerina Fragkiadaki

In this paper, we revisit Sand and Teller's "particle video" approach, and study pixel tracking as a long-range motion estimation problem, where every pixel is described with a trajectory that locates it in multiple future frames.

Motion Estimation Object Tracking +1
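The trajectory view of pixel tracking can be illustrated with a toy sketch: describe the query pixel by its feature in the first frame, then locate it independently in every later frame by nearest-feature search, rather than chaining frame-to-frame flow (which accumulates drift and loses occluded points). This is a simplification for illustration, not the paper's method:

```python
import numpy as np

def track_point(feats, y0, x0):
    """Toy long-range point tracking via per-frame feature matching.

    feats: (T, H, W, C) per-frame feature maps.
    Returns a list of (y, x) positions, one per frame.
    """
    T, H, W, C = feats.shape
    query = feats[0, y0, x0]                      # appearance descriptor
    traj = [(y0, x0)]
    for t in range(1, T):
        diff = feats[t] - query                   # (H, W, C)
        cost = np.sum(diff * diff, axis=-1)       # squared feature distance
        y, x = np.unravel_index(np.argmin(cost), cost.shape)
        traj.append((int(y), int(x)))
    return traj
```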

Generating Fast and Slow: Scene Decomposition via Reconstruction

no code implementations 21 Mar 2022 Mihir Prabhudesai, Anirudh Goyal, Deepak Pathak, Katerina Fragkiadaki

We show the proposed curriculum suffices to break the reconstruction-segmentation trade-off, and slow inference greatly improves segmentation in out-of-distribution scenes.

Object Discovery Scene Segmentation

Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds

1 code implementation 16 Dec 2021 Ayush Jain, Nikolaos Gkanatsios, Ishita Mediratta, Katerina Fragkiadaki

We propose a language grounding model that attends on the referential utterance and on the object proposal pool computed from a pre-trained detector to decode referenced objects with a detection head, without selecting them from the pool.

Object Detection +1

Language Modulated Detection and Detection Modulated Language Grounding in 2D and 3D Scenes

no code implementations 29 Sep 2021 Ayush Jain, Nikolaos Gkanatsios, Ishita Mediratta, Katerina Fragkiadaki

Object detectors are typically trained on a fixed vocabulary of objects and attributes that is often too restrictive for open-domain language grounding, where the language utterance may refer to visual entities in various levels of abstraction, such as a cat, the leg of a cat, or the stain on the front leg of the chair.

Object Detection

CoCoNets: Continuous Contrastive 3D Scene Representations

1 code implementation CVPR 2021 Shamit Lal, Mihir Prabhudesai, Ishita Mediratta, Adam W. Harley, Katerina Fragkiadaki

This paper explores self-supervised learning of amodal 3D feature representations from RGB and RGB-D posed images and videos, agnostic to object and scene semantic content, and evaluates the resulting scene representations in the downstream tasks of visual correspondence, object tracking, and object detection.

3D Object Detection Contrastive Learning +3

HyperDynamics: Meta-Learning Object and Agent Dynamics with Hypernetworks

no code implementations 17 Mar 2021 Zhou Xian, Shamit Lal, Hsiao-Yu Tung, Emmanouil Antonios Platanios, Katerina Fragkiadaki

We propose HyperDynamics, a dynamics meta-learning framework that conditions on an agent's interactions with the environment and optionally its visual observations, and generates the parameters of neural dynamics models based on inferred properties of the dynamical system.


HyperDynamics: Generating Expert Dynamics Models by Observation

no code implementations ICLR 2021 Zhou Xian, Shamit Lal, Hsiao-Yu Tung, Emmanouil Antonios Platanios, Katerina Fragkiadaki

We propose HyperDynamics, a framework that conditions on an agent’s interactions with the environment and optionally its visual observations, and generates the parameters of neural dynamics models based on inferred properties of the dynamical system.

Move to See Better: Self-Improving Embodied Object Detection

1 code implementation 30 Nov 2020 Zhaoyuan Fang, Ayush Jain, Gabriel Sarch, Adam W. Harley, Katerina Fragkiadaki

Experiments on both indoor and outdoor datasets show that (1) our method obtains high-quality 2D and 3D pseudo-labels from multi-view RGB-D data; (2) fine-tuning with these pseudo-labels improves the 2D detector significantly in the test environment; (3) training a 3D detector with our pseudo-labels outperforms a prior self-supervised method by a large margin; (4) given weak supervision, our method can generate better pseudo-labels for novel objects.

Object Detection

3D-OES: Viewpoint-Invariant Object-Factorized Environment Simulators

no code implementations 12 Nov 2020 Hsiao-Yu Fish Tung, Zhou Xian, Mihir Prabhudesai, Shamit Lal, Katerina Fragkiadaki

Object motion predictions are computed by a graph neural network that operates over the object features extracted from the 3D neural scene representation.
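The idea of predicting object motion with a graph network over per-object features can be sketched as one round of message passing on a fully connected object graph. The weight matrices below stand in for learned parameters; this is a toy illustration, not the paper's architecture:

```python
import numpy as np

def gnn_motion_step(obj_feats, W_msg, W_upd):
    """One message-passing round over per-object feature vectors.

    Every object receives the mean of the other objects' messages,
    then updates its own state through a non-linearity.
    obj_feats: (n, d) object features; W_msg, W_upd: (d, d) weights.
    """
    n, d = obj_feats.shape
    msgs = obj_feats @ W_msg                              # per-object messages
    agg = (msgs.sum(axis=0, keepdims=True) - msgs) / max(n - 1, 1)
    return np.tanh(obj_feats @ W_upd + agg)               # updated states
```

Stacking several such rounds, with learned weights, lets each object's predicted motion depend on the other objects in the scene.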

Disentangling 3D Prototypical Networks For Few-Shot Concept Learning

1 code implementation ICLR 2021 Mihir Prabhudesai, Shamit Lal, Darshan Patil, Hsiao-Yu Tung, Adam W Harley, Katerina Fragkiadaki

We present neural architectures that disentangle RGB-D images into objects' shapes and styles and a map of the background scene, and explore their applications for few-shot 3D object detection and few-shot concept classification.

3D Object Detection Object Detection +2

3D Object Recognition By Corresponding and Quantizing Neural 3D Scene Representations

no code implementations 30 Oct 2020 Mihir Prabhudesai, Shamit Lal, Hsiao-Yu Fish Tung, Adam W. Harley, Shubhankar Potdar, Katerina Fragkiadaki

We can compare the 3D feature maps of two objects by searching alignment across scales and 3D rotations, and, as a result of the operation, we can estimate pose and scale changes without the need for 3D pose annotations.

3D Object Recognition Pose Estimation

Tracking Emerges by Looking Around Static Scenes, with Neural 3D Mapping

no code implementations ECCV 2020 Adam W. Harley, Shrinidhi K. Lakshmikanth, Paul Schydlo, Katerina Fragkiadaki

We propose to leverage multiview data of static points in arbitrary scenes (static or dynamic), to learn a neural 3D mapping module which produces features that are correspondable across time.

3D Object Tracking Object Tracking

Epipolar Transformers

1 code implementation CVPR 2020 Yihui He, Rui Yan, Katerina Fragkiadaki, Shoou-I Yu

The intuition is: given a 2D location p in the current view, we would like to first find its corresponding point p' in a neighboring view, and then combine the features at p' with the features at p, thus leading to a 3D-aware feature at p. Inspired by stereo matching, the epipolar transformer leverages epipolar constraints and feature matching to approximate the features at p'.

3D Hand Pose Estimation 3D Human Pose Estimation +2
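The matching step described in the abstract can be sketched as soft feature matching along a precomputed epipolar line: score each candidate feature in the neighbouring view against the query feature at p, softmax the scores, and fuse the weighted match back into p's feature. The interface below (with epipolar-line pixel coordinates supplied as a list) is an illustrative simplification, not the paper's implementation:

```python
import numpy as np

def epipolar_fuse(feat_cur, feat_nb, line_pts, p):
    """Toy epipolar feature fusion.

    feat_cur, feat_nb: (H, W, C) feature maps of two views.
    line_pts: [(v, u), ...] pixel coordinates on the epipolar line of p.
    Returns a fused, 3D-aware feature at p.
    """
    y, x = p
    q = feat_cur[y, x]                                     # (C,) query feature
    cand = np.array([feat_nb[v, u] for v, u in line_pts])  # (N, C) candidates
    logits = cand @ q                                      # matching scores
    w = np.exp(logits - logits.max())
    w /= w.sum()                                           # softmax weights
    matched = w @ cand                                     # estimated feature at p'
    return q + matched                                     # fused feature at p
```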

Image Disentanglement and Uncooperative Re-Entanglement for High-Fidelity Image-to-Image Translation

no code implementations 11 Jan 2019 Adam W. Harley, Shih-En Wei, Jason Saragih, Katerina Fragkiadaki

Cross-domain image-to-image translation should satisfy two requirements: (1) preserve the information that is common to both domains, and (2) generate convincing images covering variations that appear in the target domain.

Disentanglement Image-to-Image Translation +1

Reinforcement Learning of Active Vision for Manipulating Objects under Occlusions

1 code implementation 20 Nov 2018 Ricson Cheng, Arpit Agarwal, Katerina Fragkiadaki

We propose hand/eye controllers that learn to move the camera to keep the object within the field of view and visible, in coordination with manipulating it to achieve the desired goal, e.g., pushing it to a target location.


Model Learning for Look-ahead Exploration in Continuous Control

1 code implementation 20 Nov 2018 Arpit Agarwal, Katharina Muelling, Katerina Fragkiadaki

We propose an exploration method that incorporates look-ahead search over basic learnt skills and their dynamics, and use it for reinforcement learning (RL) of manipulation policies.

Continuous Control

Geometry-Aware Recurrent Neural Networks for Active Visual Recognition

no code implementations NeurIPS 2018 Ricson Cheng, Ziyan Wang, Katerina Fragkiadaki

We present recurrent geometry-aware neural networks that integrate visual information across multiple views of a scene into 3D latent feature tensors, while maintaining a one-to-one mapping between 3D physical locations in the world scene and latent feature locations.

3D Reconstruction Object Detection +2

Reward Learning from Narrated Demonstrations

no code implementations CVPR 2018 Hsiao-Yu Fish Tung, Adam W. Harley, Liang-Kang Huang, Katerina Fragkiadaki

Humans effortlessly "program" one another by communicating goals and desires in natural language.

Depth-Adaptive Computational Policies for Efficient Visual Tracking

no code implementations 1 Jan 2018 Chris Ying, Katerina Fragkiadaki

Current convolutional neural network algorithms for video object tracking spend the same amount of computation for each object and video frame.

Video Object Tracking Visual Tracking

Adversarial Inverse Graphics Networks: Learning 2D-to-3D Lifting and Image-to-Image Translation from Unpaired Supervision

no code implementations ICCV 2017 Hsiao-Yu Fish Tung, Adam W. Harley, William Seto, Katerina Fragkiadaki

Researchers have developed excellent feed-forward models that learn to map images to desired outputs, such as to the images' latent factors, or to other images, using supervised learning.

3D Human Pose Estimation Image-to-Image Translation +2

Motion Prediction Under Multimodality with Conditional Stochastic Networks

no code implementations 5 May 2017 Katerina Fragkiadaki, Jonathan Huang, Alex Alemi, Sudheendra Vijayanarasimhan, Susanna Ricco, Rahul Sukthankar

In this work, we present stochastic neural network architectures that handle such multimodality through stochasticity: future trajectories of objects, body joints or frames are represented as deep, non-linear transformations of random (as opposed to deterministic) variables.

Motion Prediction Optical Flow Estimation +2
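The core idea, futures as non-linear transformations of random rather than deterministic variables, can be sketched in a few lines: condition on the current state, draw a fresh latent per sample, and get several distinct plausible futures instead of one blurred average. The weight matrices below stand in for learned parameters; this is an illustrative toy, not the paper's architecture:

```python
import numpy as np

def sample_futures(state, W_noise, W_state, n_samples=5, rng=None):
    """Toy conditional stochastic predictor.

    Each sampled future is a non-linear transformation of the current
    state plus a random latent z, so repeated calls yield a set of
    distinct hypotheses rather than a single point estimate.
    state: (d,); W_noise, W_state: (d, d).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    d = W_noise.shape[0]
    futures = []
    for _ in range(n_samples):
        z = rng.standard_normal(d)                    # stochastic latent
        futures.append(np.tanh(state @ W_state + z @ W_noise))
    return np.array(futures)                          # (n_samples, d)
```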

SfM-Net: Learning of Structure and Motion from Video

no code implementations 25 Apr 2017 Sudheendra Vijayanarasimhan, Susanna Ricco, Cordelia Schmid, Rahul Sukthankar, Katerina Fragkiadaki

We propose SfM-Net, a geometry-aware neural network for motion estimation in videos that decomposes frame-to-frame pixel motion in terms of scene and object depth, camera motion and 3D object rotations and translations.

Motion Estimation Optical Flow Estimation

Learning Visual Predictive Models of Physics for Playing Billiards

no code implementations 23 Nov 2015 Katerina Fragkiadaki, Pulkit Agrawal, Sergey Levine, Jitendra Malik

The ability to plan and execute goal specific actions in varied, unexpected settings is a central requirement of intelligent agents.

Recurrent Network Models for Human Dynamics

no code implementations ICCV 2015 Katerina Fragkiadaki, Sergey Levine, Panna Felsen, Jitendra Malik

We propose the Encoder-Recurrent-Decoder (ERD) model for recognition and prediction of human body pose in videos and motion capture.

Human Dynamics Human Pose Forecasting +2

Human Pose Estimation with Iterative Error Feedback

1 code implementation CVPR 2016 Joao Carreira, Pulkit Agrawal, Katerina Fragkiadaki, Jitendra Malik

Hierarchical feature extractors such as Convolutional Networks (ConvNets) have achieved impressive performance on a variety of classification tasks using purely feedforward processing.

Pose Estimation Semantic Segmentation

Learning to Segment Moving Objects in Videos

no code implementations CVPR 2015 Katerina Fragkiadaki, Pablo Arbelaez, Panna Felsen, Jitendra Malik

We segment moving objects in videos by ranking spatio-temporal segment proposals according to "moving objectness": how likely they are to contain a moving object.

Video Segmentation Video Semantic Segmentation

Grouping-Based Low-Rank Trajectory Completion and 3D Reconstruction

no code implementations NeurIPS 2014 Katerina Fragkiadaki, Marta Salas, Pablo Arbelaez, Jitendra Malik

Furthermore, NRSfM needs to be robust to noise in both segmentation and tracking, e.g., drifting, segmentation "leaking", optical flow "bleeding", etc.

3D Reconstruction Optical Flow Estimation +2

Pose from Flow and Flow from Pose

no code implementations CVPR 2013 Katerina Fragkiadaki, Han Hu, Jianbo Shi

The pose labeled segments and corresponding articulated joints are used to improve the motion flow fields by proposing kinematically constrained affine displacements on body parts.

Motion Estimation
