Search Results for author: Gregory D. Hager

Found 62 papers, 24 papers with code

SAGE: SLAM with Appearance and Geometry Prior for Endoscopy

1 code implementation 19 Feb 2022 Xingtong Liu, Zhaoshuo Li, Masaru Ishii, Gregory D. Hager, Russell H. Taylor, Mathias Unberath

In endoscopy, many applications (e.g., surgical navigation) would benefit from a real-time method that can simultaneously track the endoscope and reconstruct the dense 3D geometry of the observed anatomy from a monocular endoscopic video.

Anatomy Simultaneous Localization and Mapping

"Good Robot! Now Watch This!": Repurposing Reinforcement Learning for Task-to-Task Transfer

1 code implementation Conference On Robot Learning (CoRL) 2021 Andrew Hundt, Aditya Murali, Priyanka Hubli, Ran Liu, Nakul Gopalan, Matthew Gombolay, Gregory D. Hager

Based upon this insight, we propose See-SPOT-Run (SSR), a new computational approach to robot learning that enables a robot to complete a variety of real robot tasks in novel problem domains without task-specific training.

Few-Shot Learning Meta Reinforcement Learning +3

Learn Proportional Derivative Controllable Latent Space from Pixels

no code implementations 15 Oct 2021 Weiyao Wang, Marin Kobilarov, Gregory D. Hager

Recent advances in latent space dynamics model from pixels show promising progress in vision-based model predictive control (MPC).

Neighborhood Normalization for Robust Geometric Feature Learning

1 code implementation CVPR 2021 Xingtong Liu, Benjamin D. Killeen, Ayushi Sinha, Masaru Ishii, Gregory D. Hager, Russell H. Taylor, Mathias Unberath

Extracting geometric features from 3D models is a common first step in applications such as 3D registration, tracking, and scene flow estimation.

Scene Flow Estimation

Localization and Control of Magnetic Suture Needles in Cluttered Surgical Site with Blood and Tissue

no code implementations 20 May 2021 Will Pryor, Yotam Barnoy, Suraj Raval, Xiaolong Liu, Lamar Mair, Daniel Lerner, Onder Erin, Gregory D. Hager, Yancy Diaz-Mercado, Axel Krieger

Our localization method combines neural network-based segmentation and classical techniques, and we are able to consistently locate our needle with 0.73 mm RMS error in clean environments and 2.72 mm RMS error in challenging environments with blood and occlusion.

Visual Localization

Single View Geocentric Pose in the Wild

1 code implementation 18 May 2021 Gordon Christie, Kevin Foster, Shea Hagstrom, Gregory D. Hager, Myron Z. Brown

Current methods for Earth observation tasks such as semantic mapping, map alignment, and change detection rely on near-nadir images; however, often the first available images in response to dynamic world events such as natural disasters are oblique.

Change Detection

"Train one, Classify one, Teach one" -- Cross-surgery transfer learning for surgical step recognition

no code implementations 24 Feb 2021 Daniel Neimark, Omri Bar, Maya Zohar, Gregory D. Hager, Dotan Asselmann

Such pre-training enables TSAN to learn workflow steps of a new laparoscopic procedure type from only a small number of labeled samples from the target procedure.

BIG-bench Machine Learning Self-Supervised Learning +3

SAFCAR: Structured Attention Fusion for Compositional Action Recognition

no code implementations 3 Dec 2020 Tae Soo Kim, Gregory D. Hager

We present a general framework for compositional action recognition, i.e., action recognition where the labels are composed out of simpler components such as subjects, atomic-actions and objects.

Action Recognition Time Series +1

Fine-grained activity recognition for assembly videos

no code implementations 2 Dec 2020 Jonathan D. Jones, Cathryn Cortesa, Amy Shelton, Barbara Landau, Sanjeev Khudanpur, Gregory D. Hager

In this paper we address the task of recognizing assembly actions as a structure (e.g., a piece of furniture or a toy block tower) is built up from a set of primitive objects.

Action Recognition

Nothing But Geometric Constraints: A Model-Free Method for Articulated Object Pose Estimation

no code implementations 30 Nov 2020 Qihao Liu, Weichao Qiu, Weiyao Wang, Gregory D. Hager, Alan L. Yuille

We propose an unsupervised vision-based system to estimate the joint configurations of the robot arm from a sequence of RGB or RGB-D images without knowing the model a priori, and then adapt it to the task of category-independent articulated object pose estimation.

Optical Flow Estimation Pose Estimation

Autonomously Navigating a Surgical Tool Inside the Eye by Learning from Demonstration

no code implementations 16 Nov 2020 Ji Woong Kim, Changyan He, Muller Urias, Peter Gehlbach, Gregory D. Hager, Iulian Iordachita, Marin Kobilarov

We show that the network can reliably navigate a needle surgical tool to various desired locations within 137 microns accuracy in physical experiments and 94 microns in simulation on average, and generalizes well to unseen situations such as in the presence of auxiliary surgical tools, variable eye backgrounds, and brightness conditions.

Autonomous Navigation Depth Estimation +1

Learning Representations of Endoscopic Videos to Detect Tool Presence Without Supervision

1 code implementation 27 Aug 2020 David Z. Li, Masaru Ishii, Russell H. Taylor, Gregory D. Hager, Ayushi Sinha

We use three different methods to manipulate these latent representations in order to predict tool presence in each frame.

Anatomy-Aware Siamese Network: Exploiting Semantic Asymmetry for Accurate Pelvic Fracture Detection in X-ray Images

no code implementations ECCV 2020 Haomin Chen, Yirui Wang, Kang Zheng, Weijian Li, Chi-Tung Cheng, Adam P. Harrison, Jing Xiao, Gregory D. Hager, Le Lu, Chien-Hung Liao, Shun Miao

A new contrastive feature learning component in our Siamese network is designed to make the deep image features more salient with respect to the underlying semantic asymmetries (caused by pelvic fracture occurrences).


Learning Geocentric Object Pose in Oblique Monocular Images

1 code implementation CVPR 2020 Gordon Christie, Rodrigo Rene Rai Munoz Abujder, Kevin Foster, Shea Hagstrom, Gregory D. Hager, Myron Z. Brown

An object's geocentric pose, defined as the height above ground and orientation with respect to gravity, is a powerful representation of real-world structure for object detection, segmentation, and localization tasks using RGBD images.

Object Detection +2

Artificial Intelligence-based Clinical Decision Support for COVID-19 -- Where Art Thou?

no code implementations 5 Jun 2020 Mathias Unberath, Kimia Ghobadi, Scott Levin, Jeremiah Hinson, Gregory D. Hager

The COVID-19 crisis has brought about new clinical questions, new workflows, and accelerated distributed healthcare needs.

Semantic Image Manipulation Using Scene Graphs

1 code implementation CVPR 2020 Helisa Dhamo, Azade Farshad, Iro Laina, Nassir Navab, Gregory D. Hager, Federico Tombari, Christian Rupprecht

In our work, we address the novel problem of image manipulation from scene graphs, in which a user can edit images by merely applying changes in the nodes or edges of a semantic graph that is generated from the image.

Image Inpainting Image Manipulation +1

Reconstructing Sinus Anatomy from Endoscopic Video -- Towards a Radiation-free Approach for Quantitative Longitudinal Assessment

1 code implementation 18 Mar 2020 Xingtong Liu, Maia Stiber, Jindan Huang, Masaru Ishii, Gregory D. Hager, Russell H. Taylor, Mathias Unberath

Reconstructing accurate 3D surface models of sinus anatomy directly from an endoscopic video is a promising avenue for cross-sectional and longitudinal analysis to better understand the relationship between sinus anatomy and surgical outcomes.

3D Reconstruction Anatomy

Extremely Dense Point Correspondences using a Learned Feature Descriptor

1 code implementation CVPR 2020 Xingtong Liu, Yiping Zheng, Benjamin Killeen, Masaru Ishii, Gregory D. Hager, Russell H. Taylor, Mathias Unberath

In direct comparison to recent local and dense descriptors on an in-house sinus endoscopy dataset, we demonstrate that our proposed dense descriptor can generalize to unseen patients and scopes, thereby largely improving the performance of Structure from Motion (SfM) in terms of model density and completeness.

3D Reconstruction Anatomy +2

Car Pose in Context: Accurate Pose Estimation with Ground Plane Constraints

no code implementations 9 Dec 2019 Pengfei Li, Weichao Qiu, Michael Peven, Gregory D. Hager, Alan L. Yuille

Scene context is a powerful constraint on the geometry of objects within the scene in cases, such as surveillance, where the camera geometry is unknown and image quality may be poor.

Car Pose Estimation

DASZL: Dynamic Action Signatures for Zero-shot Learning

no code implementations 8 Dec 2019 Tae Soo Kim, Jonathan D. Jones, Michael Peven, Zihao Xiao, Jin Bai, Yi Zhang, Weichao Qiu, Alan Yuille, Gregory D. Hager

There are many realistic applications of activity recognition where the set of potential activity descriptions is combinatorially large.

Action Detection Activity Detection +3

RSA: Randomized Simulation as Augmentation for Robust Human Action Recognition

no code implementations 3 Dec 2019 Yi Zhang, Xinyue Wei, Weichao Qiu, Zihao Xiao, Gregory D. Hager, Alan Yuille

In this paper, we propose the Randomized Simulation as Augmentation (RSA) framework which augments real-world training data with synthetic data to improve the robustness of action recognition networks.

Action Recognition Temporal Action Localization

Action Recognition Using Volumetric Motion Representations

1 code implementation 19 Nov 2019 Michael Peven, Gregory D. Hager, Austin Reiter

In this work, we introduce a novel representation of motion as a voxelized 3D vector field and demonstrate how it can be used to improve performance of action recognition networks.

Action Recognition Data Augmentation +2

"Good Robot!": Efficient Reinforcement Learning for Multi-Step Visual Tasks with Sim to Real Transfer

1 code implementation 25 Sep 2019 Andrew Hundt, Benjamin Killeen, Nicholas Greene, Hongtao Wu, Heeyeon Kwon, Chris Paxton, Gregory D. Hager

We are able to create real stacks in 100% of trials with 61% efficiency and real rows in 100% of trials with 59% efficiency by directly loading the simulation-trained model on the real robot with no additional real-world fine-tuning.

Reinforcement Learning (RL)

Self-supervised Dense 3D Reconstruction from Monocular Endoscopic Video

no code implementations 6 Sep 2019 Xingtong Liu, Ayushi Sinha, Masaru Ishii, Gregory D. Hager, Russell H. Taylor, Mathias Unberath

We present a self-supervised learning-based pipeline for dense 3D reconstruction from full-length monocular endoscopic videos without a priori modeling of anatomy or shading.

3D Reconstruction Anatomy +1

Automated Surgical Activity Recognition with One Labeled Sequence

no code implementations 20 Jul 2019 Robert DiPietro, Gregory D. Hager

Prior work has demonstrated the feasibility of automated activity recognition in robot-assisted surgery from motion data.

Activity Recognition

sharpDARTS: Faster and More Accurate Differentiable Architecture Search

1 code implementation 23 Mar 2019 Andrew Hundt, Varun Jain, Gregory D. Hager

We have performed an in-depth analysis to identify limitations in a widely used search space and a recent architecture search method, Differentiable Architecture Search (DARTS).

Hyperparameter Optimization Image Classification +1

Dense Depth Estimation in Monocular Endoscopy with Self-supervised Learning Methods

1 code implementation 20 Feb 2019 Xingtong Liu, Ayushi Sinha, Masaru Ishii, Gregory D. Hager, Austin Reiter, Russell H. Taylor, Mathias Unberath

We present a self-supervised approach to training convolutional neural networks for dense depth estimation from monocular endoscopy data without a priori modeling of anatomy or shading.

Anatomy Computed Tomography (CT) +2

Semantic Stereo for Incidental Satellite Images

1 code implementation 21 Nov 2018 Marc Bosch, Kevin Foster, Gordon Christie, Sean Wang, Gregory D. Hager, Myron Brown

The increasingly common use of incidental satellite images for stereo reconstruction versus rigidly tasked binocular or trinocular coincident collection is helping to enable timely global-scale 3D mapping; however, reliable stereo correspondence from multi-date image pairs remains very challenging due to seasonal appearance differences and scene change.

3D Reconstruction Scene Segmentation

The CoSTAR Block Stacking Dataset: Learning with Workspace Constraints

3 code implementations 27 Oct 2018 Andrew Hundt, Varun Jain, Chia-Hung Lin, Chris Paxton, Gregory D. Hager

We show that a mild relaxation of the task and workspace constraints implicit in existing object grasping datasets can cause neural network based grasping algorithms to fail on even a simple block stacking task when executed under more realistic circumstances.

6D Pose Estimation using RGBD Industrial Robots +5

Unsupervised Learning for Surgical Motion by Learning to Predict the Future

no code implementations 8 Jun 2018 Robert DiPietro, Gregory D. Hager

We show that it is possible to learn meaningful representations of surgical motion, without supervision, by learning to predict the future.

Future prediction Information Retrieval +1

Endoscopic navigation in the absence of CT imaging

no code implementations 8 Jun 2018 Ayushi Sinha, Xingtong Liu, Austin Reiter, Masaru Ishii, Gregory D. Hager, Russell H. Taylor

Clinical examinations that involve endoscopic exploration of the nasal cavity and sinuses often do not have a reference image to provide structural context to the clinician.

Computed Tomography (CT)

Visual Robot Task Planning

1 code implementation 30 Mar 2018 Chris Paxton, Yotam Barnoy, Kapil Katyal, Raman Arora, Gregory D. Hager

In this work, we propose a neural network architecture and associated planning algorithm that (1) learns a representation of the world useful for generating prospective futures after the application of high-level actions, (2) uses this generative model to simulate the result of sequences of high-level actions in a variety of environments, and (3) uses this same representation to evaluate these actions and perform tree search to find a sequence of high-level actions in a new environment.

Imitation Learning Robot Task Planning

Guide Me: Interacting with Deep Networks

no code implementations CVPR 2018 Christian Rupprecht, Iro Laina, Nassir Navab, Gregory D. Hager, Federico Tombari

Interaction and collaboration between humans and intelligent machines have become increasingly important as machine learning methods move into real-world applications that involve end users.

Image Captioning Image Generation

A Unified Framework for Multi-View Multi-Class Object Pose Estimation

no code implementations ECCV 2018 Chi Li, Jin Bai, Gregory D. Hager

To learn discriminative pose features, we integrate three new capabilities into a deep Convolutional Neural Network (CNN): an inference scheme that combines both classification and pose regression based on a uniform tessellation of the Special Euclidean group in three dimensions (SE(3)), the fusion of class priors into the training process via a tiled class map, and an additional regularization using deep supervision with an object mask.

Pose Estimation

Occupancy Map Prediction Using Generative and Fully Convolutional Networks for Vehicle Navigation

no code implementations 6 Mar 2018 Kapil Katyal, Katie Popek, Chris Paxton, Joseph Moore, Kevin Wolfe, Philippe Burlina, Gregory D. Hager

In these situations, the robot's ability to reason about its future motion is often severely limited by sensor field of view (FOV).

Navigate SSIM

Deep Supervision with Intermediate Concepts

no code implementations 8 Jan 2018 Chi Li, M. Zeeshan Zia, Quoc-Huy Tran, Xiang Yu, Gregory D. Hager, Manmohan Chandraker

In this work, we explore an approach for injecting prior domain structure into neural network training by supervising hidden layers of a CNN with intermediate concepts that normally are not observed in practice.

Image Classification

Learning to Imagine Manipulation Goals for Robot Task Planning

no code implementations 8 Nov 2017 Chris Paxton, Kapil Katyal, Christian Rupprecht, Raman Arora, Gregory D. Hager

Ideally, we would combine the ability of machine learning to leverage big data for learning the semantics of a task with techniques from task planning that reliably generalize to new environments.

Robot Task Planning

Temporal and Physical Reasoning for Perception-Based Robotic Manipulation

2 code implementations 11 Oct 2017 Felix Jonathan, Chris Paxton, Gregory D. Hager

Accurate knowledge of object poses is crucial to successful robotic manipulation tasks, and yet most current approaches only work in laboratory settings.


Combining Neural Networks and Tree Search for Task and Motion Planning in Challenging Environments

no code implementations 22 Mar 2017 Chris Paxton, Vasumathi Raman, Gregory D. Hager, Marin Kobilarov

This paper investigates the ability of neural networks to learn both LTL constraints and control policies in order to generate task plans in complex environments.


Analyzing and Exploiting NARX Recurrent Neural Networks for Long-Term Dependencies

no code implementations ICLR 2018 Robert DiPietro, Christian Rupprecht, Nassir Navab, Gregory D. Hager

Recurrent neural networks (RNNs) have achieved state-of-the-art performance on many diverse tasks, from machine translation to surgical activity recognition, yet training RNNs to capture long-term dependencies remains difficult.

Activity Recognition Machine Translation +1

Anatomically Constrained Video-CT Registration via the V-IMLOP Algorithm

no code implementations 25 Oct 2016 Seth D. Billings, Ayushi Sinha, Austin Reiter, Simon Leonard, Masaru Ishii, Gregory D. Hager, Russell H. Taylor

Functional endoscopic sinus surgery (FESS) is a surgical procedure used to treat acute cases of sinusitis and other sinus diseases.

Temporal Convolutional Networks: A Unified Approach to Action Segmentation

1 code implementation 29 Aug 2016 Colin Lea, Rene Vidal, Austin Reiter, Gregory D. Hager

The dominant paradigm for video-based action segmentation is composed of two steps: first, for each frame, compute low-level features using Dense Trajectories or a Convolutional Neural Network that encode spatiotemporal information locally, and second, input these features into a classifier that captures high-level temporal relationships, such as a Recurrent Neural Network (RNN).

Action Segmentation

SANTIAGO: Spine Association for Neuron Topology Improvement and Graph Optimization

no code implementations 8 Aug 2016 William Gray Roncal, Colin Lea, Akira Baruah, Gregory D. Hager

Our automated approach improves the local subgraph score by more than four times and the full graph score by 60 percent.

Recognizing Surgical Activities with Recurrent Neural Networks

3 code implementations 20 Jun 2016 Robert DiPietro, Colin Lea, Anand Malpani, Narges Ahmidi, S. Swaroop Vedula, Gyusung I. Lee, Mija R. Lee, Gregory D. Hager

In contrast, we work on recognizing both gestures and longer, higher-level activities, or maneuvers, and we model the mapping from kinematics to gestures/maneuvers with recurrent neural networks.

Gesture Recognition

Segmental Spatiotemporal CNNs for Fine-grained Action Segmentation

no code implementations 9 Feb 2016 Colin Lea, Austin Reiter, Rene Vidal, Gregory D. Hager

We propose a model for action segmentation which combines low-level spatiotemporal features with a high-level segmental classifier.

Action Classification Action Segmentation +3

Automated Objective Surgical Skill Assessment in the Operating Room Using Unstructured Tool Motion

no code implementations 18 Dec 2014 Piyush Poddar, Narges Ahmidi, S. Swaroop Vedula, Lisa Ishii, Gregory D. Hager, Masaru Ishii

Previous work on surgical skill assessment using intraoperative tool motion in the operating room (OR) has focused on highly-structured surgical tasks such as cholecystectomy.


Hierarchical Sparse and Collaborative Low-Rank Representation for Emotion Recognition

1 code implementation 7 Oct 2014 Xiang Xiang, Minh Dao, Gregory D. Hager, Trac D. Tran

In this paper, we design a Collaborative-Hierarchical Sparse and Low-Rank (C-HiSLR) model that is natural for recognizing human emotion in visual data.

Emotion Recognition General Classification +1

VESICLE: Volumetric Evaluation of Synaptic Interfaces using Computer vision at Large Scale

no code implementations 14 Mar 2014 William Gray Roncal, Michael Pekala, Verena Kaynig-Fittkau, Dean M. Kleissas, Joshua T. Vogelstein, Hanspeter Pfister, Randal Burns, R. Jacob Vogelstein, Mark A. Chevillet, Gregory D. Hager

An open challenge problem at the forefront of modern neuroscience is to obtain a comprehensive mapping of the neural pathways that underlie human brain function; an enhanced understanding of the wiring diagram of the brain promises to lead to new breakthroughs in diagnosing and treating neurological disorders.

Object Detection
