Search Results for author: Jana Kosecka

Found 34 papers, 10 papers with code

Fingerspelling PoseNet: Enhancing Fingerspelling Translation with Pose-Based Transformer Models

1 code implementation 20 Nov 2023 Pooya Fayyazsanavi, Negar Nejatishahidin, Jana Kosecka

We also propose a novel two-stage inference approach that re-ranks the hypotheses using the language model capabilities of the decoder.

Hand Pose Estimation Language Modelling +3

Labeling Indoor Scenes with Fusion of Out-of-the-Box Perception Models

no code implementations 17 Nov 2023 Yimeng Li, Navid Rajabi, Sulabh Shrestha, Md Alimoor Reza, Jana Kosecka

We aim to develop a cost-effective labeling approach to obtain pseudo-labels for semantic segmentation and object instance detection in indoor environments, with the ultimate goal of facilitating the training of lightweight models for various downstream tasks.

Object object-detection +3

Towards Grounded Visual Spatial Reasoning in Multi-Modal Vision Language Models

no code implementations 18 Aug 2023 Navid Rajabi, Jana Kosecka

With pre-training of vision-and-language models (VLMs) on large-scale datasets of image-text pairs, several recent works showed that these pre-trained models lack fine-grained understanding, such as the ability to count and recognize verbs, attributes, or relationships.

Image-text matching Question Answering +3

Graph-CoVis: GNN-based Multi-view Panorama Global Pose Estimation

no code implementations 26 Apr 2023 Negar Nejatishahidin, Will Hutchcroft, Manjunath Narayana, Ivaylo Boyadzhiev, Yuguang Li, Naji Khosravan, Jana Kosecka, Sing Bing Kang

In this paper, we address the problem of wide-baseline camera pose estimation from a group of 360$^\circ$ panoramas under upright-camera assumption.

Pose Estimation

U2RLE: Uncertainty-Guided 2-Stage Room Layout Estimation

no code implementations 17 Apr 2023 Pooya Fayyazsanavi, Zhiqiang Wan, Will Hutchcroft, Ivaylo Boyadzhiev, Yuguang Li, Jana Kosecka, Sing Bing Kang

While the existing deep learning-based room layout estimation techniques demonstrate good overall accuracy, they are less effective for distant floor-wall boundary.

Room Layout Estimation

Comparison of Model-Free and Model-Based Learning-Informed Planning for PointGoal Navigation

1 code implementation 17 Dec 2022 Yimeng Li, Arnab Debnath, Gregory J. Stein, Jana Kosecka

In this work, we compare the state-of-the-art Deep Reinforcement Learning based approaches with Partially Observable Markov Decision Process (POMDP) formulation of the point goal navigation problem.

PointGoal Navigation Problem Decomposition +3

Learning-Augmented Model-Based Planning for Visual Exploration

no code implementations 15 Nov 2022 Yimeng Li, Arnab Debnath, Gregory Stein, Jana Kosecka

Our approach surpasses the greedy strategies by 2.1% and the RL-based exploration methods by 8.4% in terms of coverage.

Self-supervised Pre-training for Semantic Segmentation in an Indoor Scene

no code implementations 4 Oct 2022 Sulabh Shrestha, Yimeng Li, Jana Kosecka

Given the spatial and temporal consistency cues used for pixel level data association, we use a variant of contrastive learning to train a DCNN model for predicting semantic segmentation from RGB views in the target environment.

Contrastive Learning Segmentation +1

Object Pose Estimation using Mid-level Visual Representations

1 code implementation 2 Mar 2022 Negar Nejatishahidin, Pooya Fayyazsanavi, Jana Kosecka

The deep convolutional network models (CNN) for pose estimation are typically trained and evaluated on datasets specifically curated for object detection, pose estimation, or 3D reconstruction, which requires large amounts of training data.

3D Reconstruction Object +5

Uncertainty Aware Proposal Segmentation for Unknown Object Detection

no code implementations 25 Nov 2021 Yimeng Li, Jana Kosecka

Recent efforts in deploying Deep Neural Networks for object detection in real world applications, such as autonomous driving, assume that all relevant object classes have been observed during training.

Autonomous Driving Object +5

Generative Multi-Stream Architecture For American Sign Language Recognition

no code implementations 9 Mar 2020 Dom Huh, Sai Gurrapu, Frederick Olson, Huzefa Rangwala, Parth Pathak, Jana Kosecka

With advancements in deep model architectures, tasks in computer vision can reach optimal convergence provided proper data preprocessing and model parameter initialization.

Sign Language Recognition

Hierarchical Kinematic Human Mesh Recovery

no code implementations ECCV 2020 Georgios Georgakis, Ren Li, Srikrishna Karanam, Terrence Chen, Jana Kosecka, Ziyan Wu

In this work, we address this gap by proposing a new technique for regression of human parametric model that is explicitly informed by the known hierarchical structure, including joint interdependencies of the model.

Human Mesh Recovery regression

Learning View and Target Invariant Visual Servoing for Navigation

1 code implementation 4 Mar 2020 Yimeng Li, Jana Kosecka

The advances in deep reinforcement learning recently revived interest in data-driven learning based approaches to navigation.

Robot Navigation

FineHand: Learning Hand Shapes for American Sign Language Recognition

no code implementations 4 Mar 2020 Al Amin Hosain, Panneer Selvam Santhalingam, Parth Pathak, Huzefa Rangwala, Jana Kosecka

American Sign Language recognition is a difficult gesture recognition problem, characterized by fast, highly articulate gestures.

Gesture Recognition Sign Language Recognition

Simultaneous Mapping and Target Driven Navigation

2 code implementations 18 Nov 2019 Georgios Georgakis, Yimeng Li, Jana Kosecka

This work presents a modular architecture for simultaneous mapping and target driven navigation in indoor environments.

Semantic Segmentation

Sign Language Recognition Analysis using Multimodal Data

no code implementations 24 Sep 2019 Al Amin Hosain, Panneer Selvam Santhalingam, Parth Pathak, Jana Kosecka, Huzefa Rangwala

Despite having similarity with the well-studied human activity recognition, the use of 3D skeleton data in sign language recognition is rare.

Human Activity Recognition Sign Language Recognition

Learning Local RGB-to-CAD Correspondences for Object Pose Estimation

1 code implementation ICCV 2019 Georgios Georgakis, Srikrishna Karanam, Ziyan Wu, Jana Kosecka

In this paper, we solve this key problem of existing methods requiring expensive 3D pose annotations by proposing a new method that matches RGB images to CAD models for object pose estimation.

Object Pose Estimation

Self-supervisory Signals for Object Discovery and Detection

no code implementations 8 Jun 2018 Etienne Pot, Alexander Toshev, Jana Kosecka

In robotic applications, we often face the challenge of discovering new objects while having very little or no labelled training data.

Clustering Object +1

Visual Representations for Semantic Target Driven Navigation

3 code implementations 15 May 2018 Arsalan Mousavian, Alexander Toshev, Marek Fiser, Jana Kosecka, Ayzaan Wahid, James Davidson

We propose using high-level semantic and contextual features, including segmentation and detection masks obtained by off-the-shelf state-of-the-art vision, as observations and using a deep network to learn the navigation policy.

Domain Adaptation Visual Navigation

Target Driven Instance Detection

1 code implementation 13 Mar 2018 Phil Ammirato, Cheng-Yang Fu, Mykhailo Shvets, Jana Kosecka, Alexander C. Berg

While state-of-the-art general object detectors are getting better and better, there are not many systems specifically designed to take advantage of the instance detection problem.


End-to-end learning of keypoint detector and descriptor for pose invariant 3D matching

no code implementations CVPR 2018 Georgios Georgakis, Srikrishna Karanam, Ziyan Wu, Jan Ernst, Jana Kosecka

Finding correspondences between images or 3D scans is at the heart of many computer vision and image retrieval applications and is often enabled by matching local keypoint descriptors.

Image Retrieval Keypoint Detection +2

Dense Piecewise Planar RGB-D SLAM for Indoor Environments

no code implementations 1 Aug 2017 Phi-Hung Le, Jana Kosecka

The paper exploits weak Manhattan constraints to parse the structure of indoor environments from RGB-D video sequences in an online setting.

Visual Odometry

A Dataset for Developing and Benchmarking Active Vision

no code implementations 27 Feb 2017 Phil Ammirato, Patrick Poirson, Eunbyung Park, Jana Kosecka, Alexander C. Berg

We present a new public dataset with a focus on simulating robotic vision tasks in everyday indoor environments using real imagery.

Benchmarking General Classification +5

Synthesizing Training Data for Object Detection in Indoor Scenes

no code implementations 25 Feb 2017 Georgios Georgakis, Arsalan Mousavian, Alexander C. Berg, Jana Kosecka

In this work we explore the ability of using synthetically generated composite images for training state-of-the-art object detectors, especially for object instance detection.

Object object-detection +1

3D Bounding Box Estimation Using Deep Learning and Geometry

11 code implementations CVPR 2017 Arsalan Mousavian, Dragomir Anguelov, John Flynn, Jana Kosecka

In contrast to current techniques that only regress the 3D orientation of an object, our method first regresses relatively stable 3D object properties using a deep convolutional neural network and then combines these estimates with geometric constraints provided by a 2D object bounding box to produce a complete 3D bounding box.

3D Object Detection Object +4

Multiview RGB-D Dataset for Object Instance Detection

no code implementations 26 Sep 2016 Georgios Georgakis, Md. Alimoor Reza, Arsalan Mousavian, Phi-Hung Le, Jana Kosecka

This paper presents a new multi-view RGB-D dataset of nine kitchen scenes, each containing several objects in realistic cluttered environments including a subset of objects from the BigBird dataset.

Object object-detection +1

Fast Single Shot Detection and Pose Estimation

no code implementations 19 Sep 2016 Patrick Poirson, Phil Ammirato, Cheng-Yang Fu, Wei Liu, Jana Kosecka, Alexander C. Berg

For applications in navigation and robotics, estimating the 3D pose of objects is as important as detection.

Object Tracking Pose Estimation

Semantic Image Based Geolocation Given a Map

no code implementations 1 Sep 2016 Arsalan Mousavian, Jana Kosecka

In this work we present an approach for geo-locating a novel view and determining camera location and orientation using a map and a sparse set of geo-tagged reference views.

Visual Place Recognition

Reinforcement Learning for Semantic Segmentation in Indoor Scenes

no code implementations 3 Jun 2016 Md. Alimoor Reza, Jana Kosecka

Future advancements in robot autonomy and sophistication of robotics tasks rest on robust, efficient, and task-dependent semantic understanding of the environment.

reinforcement-learning Reinforcement Learning (RL) +2

Joint Semantic Segmentation and Depth Estimation with Deep Convolutional Networks

no code implementations 25 Apr 2016 Arsalan Mousavian, Hamed Pirsiavash, Jana Kosecka

The proposed model is trained and evaluated on NYUDepth V2 dataset outperforming the state of the art methods on semantic segmentation and achieving comparable results on the task of depth estimation.

Depth Estimation Segmentation +1

Deep Convolutional Features for Image Based Retrieval and Scene Categorization

no code implementations 20 Sep 2015 Arsalan Mousavian, Jana Kosecka

Several recent approaches showed how the representations learned by Convolutional Neural Networks can be repurposed for novel tasks.

Image Retrieval Retrieval

Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context

no code implementations CVPR 2013 Gautam Singh, Jana Kosecka

This paper presents a nonparametric approach to semantic parsing using small patches and simple gradient, color and location features.

Retrieval Scene Parsing +2
