Search Results for author: Yoichi Sato

Found 44 papers, 9 papers with code

Surgical Skill Assessment via Video Semantic Aggregation

no code implementations 4 Aug 2022 Zhenqiang Li, Lin Gu, Weimin Wang, Ryosuke Nakamura, Yoichi Sato

Automated video-based assessment of surgical skills is a promising approach to assisting young surgical trainees, especially in resource-poor areas.

Representation Learning

CompNVS: Novel View Synthesis with Scene Completion

no code implementations 23 Jul 2022 Zuoyue Li, Tianxing Fan, Zhenqiang Li, Zhaopeng Cui, Yoichi Sato, Marc Pollefeys, Martin R. Oswald

We introduce a scalable framework for novel view synthesis from RGB-D images with largely incomplete scene coverage.

Novel View Synthesis Scene Understanding

Compound Prototype Matching for Few-shot Action Recognition

no code implementations 12 Jul 2022 Lijin Yang, Yifei Huang, Yoichi Sato

Each global prototype is encouraged to summarize a specific aspect from the entire video, for example, the start/evolution of the action.

Few Shot Action Recognition Video Similarity

Precise Affordance Annotation for Egocentric Action Video Datasets

no code implementations 11 Jun 2022 Zecheng Yu, Yifei Huang, Ryosuke Furuta, Takuma Yagi, Yusuke Goutsu, Yoichi Sato

Object affordance is an important concept in human-object interaction, providing information on action possibilities based on human motor capacity and objects' physical properties, thus benefiting tasks such as action anticipation and robot imitation learning.

Action Anticipation Affordance Recognition +1

Object Instance Identification in Dynamic Environments

1 code implementation 10 Jun 2022 Takuma Yagi, Md Tasnimul Hasan, Yoichi Sato

We study the problem of identifying object instances in a dynamic environment where people interact with the objects.

Feature Selection

Efficient Annotation and Learning for 3D Hand Pose Estimation: A Survey

no code implementations 5 Jun 2022 Takehiko Ohkawa, Ryosuke Furuta, Yoichi Sato

In this survey, we present a comprehensive analysis of 3D hand pose estimation from the perspective of efficient annotation and learning.

3D Hand Pose Estimation Domain Adaptation +1

Domain Adaptive Hand Keypoint and Pixel Localization in the Wild

no code implementations 16 Mar 2022 Takehiko Ohkawa, Yu-Jhe Li, Qichen Fu, Ryosuke Furuta, Kris M. Kitani, Yoichi Sato

We aim to improve the performance of regressing hand keypoints and segmenting pixel-level hand masks under new imaging conditions (e.g., outdoors) when we only have labeled images taken under very different conditions (e.g., indoors).

Domain Adaptation Knowledge Distillation

Background Mixup Data Augmentation for Hand and Object-in-Contact Detection

no code implementations 28 Feb 2022 Koya Tango, Takehiko Ohkawa, Ryosuke Furuta, Yoichi Sato

Detecting the positions of human hands and objects-in-contact (hand-object detection) in each video frame is vital for understanding human activities from videos.

Data Augmentation object-detection +1

Interact Before Align: Leveraging Cross-Modal Knowledge for Domain Adaptive Action Recognition

no code implementations CVPR 2022 Lijin Yang, Yifei Huang, Yusuke Sugano, Yoichi Sato

Different from previous works, we find that the cross-domain alignment can be more effectively done by using cross-modal interaction first.

Action Recognition

Leveraging Human Selective Attention for Medical Image Analysis with Limited Training Data

no code implementations 2 Dec 2021 Yifei Huang, Xiaoxiao Li, Lijin Yang, Lin Gu, Yingying Zhu, Hirofumi Seo, Qiuming Meng, Tatsuya Harada, Yoichi Sato

Then we design a novel Auxiliary Attention Block (AAB) to allow information from SAN to be utilized by the backbone encoder to focus on selective areas.

Tumor Segmentation

Stacked Temporal Attention: Improving First-person Action Recognition by Emphasizing Discriminative Clips

no code implementations 2 Dec 2021 Lijin Yang, Yifei Huang, Yusuke Sugano, Yoichi Sato

Previous works attempted to address this problem by applying temporal attention, but failed to consider the global context of the full video, which is critical for determining the relatively significant parts.

Action Recognition Video Understanding

Neural Routing by Memory

no code implementations NeurIPS 2021 Kaipeng Zhang, Zhenqiang Li, Zhifeng Li, Wei Liu, Yoichi Sato

However, they use the same procedure sequence for all inputs, regardless of the intermediate features. This paper proffers a simple yet effective idea of constructing parallel procedures and assigning similar intermediate features to the same specialized procedures in a divide-and-conquer fashion.
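The divide-and-conquer idea in the snippet above can be illustrated with a toy sketch (purely hypothetical names and values, not the paper's implementation): intermediate features are matched against stored centroids in a memory, and each input is dispatched to the specialized procedure whose centroid is nearest.

```python
import numpy as np

# Toy illustration of routing by memory: route each input to one of
# several specialized "procedures" based on its intermediate feature.
# The procedures and centroids here are hypothetical placeholders.

# Two specialized procedures, one per feature cluster.
procedures = [lambda x: x * 2.0, lambda x: x + 10.0]

# Memory: one centroid per procedure (fixed here; learned in practice).
centroids = np.array([[0.0, 0.0], [5.0, 5.0]])

def route(feature):
    """Pick the procedure whose centroid is nearest to the feature."""
    dists = np.linalg.norm(centroids - feature, axis=1)
    return int(np.argmin(dists))

features = np.array([[0.2, -0.1], [4.8, 5.3]])
for f in features:
    idx = route(f)
    print(idx, procedures[idx](f))
```

Inputs near the first centroid take the first procedure; inputs near the second take the other, so similar intermediate features share a specialized path.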

Hand-Object Contact Prediction via Motion-Based Pseudo-Labeling and Guided Progressive Label Correction

1 code implementation 19 Oct 2021 Takuma Yagi, Md Tasnimul Hasan, Yoichi Sato

In this study, we introduce a video-based method for predicting contact between a hand and an object.

Ego4D: Around the World in 3,000 Hours of Egocentric Video

1 code implementation CVPR 2022 Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei Huang, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik

We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite.

De-identification Ethics

Spatio-Temporal Perturbations for Video Attribution

1 code implementation 1 Sep 2021 Zhenqiang Li, Weimin Wang, Zuoyue Li, Yifei Huang, Yoichi Sato

Attribution methods provide a way to interpret opaque neural networks visually by identifying and visualizing the input regions/pixels that dominate the output of a network.

Video Understanding
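The general perturbation-style attribution idea described above can be sketched in a few lines (an illustrative occlusion baseline, not the paper's spatio-temporal method; the toy "model" and values are hypothetical): mask one input region at a time and score it by how much the output changes.

```python
import numpy as np

def model(x):
    # Hypothetical "network": the output depends only on region 2.
    return x[2] * 3.0

x = np.array([1.0, 1.0, 5.0, 1.0])
baseline = model(x)

# Occlude each region in turn and record the output change.
attribution = np.zeros_like(x)
for i in range(len(x)):
    perturbed = x.copy()
    perturbed[i] = 0.0  # mask out one region
    attribution[i] = abs(baseline - model(perturbed))

print(attribution)  # only region 2 receives a nonzero score
```

Regions whose occlusion changes the output most are the ones that dominate the prediction; video attribution extends this idea to perturbations over space and time.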

EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition 2021: Team M3EM Technical Report

no code implementations 18 Jun 2021 Lijin Yang, Yifei Huang, Yusuke Sugano, Yoichi Sato

In this report, we describe the technical details of our submission to the 2021 EPIC-KITCHENS-100 Unsupervised Domain Adaptation Challenge for Action Recognition.

Action Recognition Unsupervised Domain Adaptation

Towards Visually Explaining Video Understanding Networks with Perturbation

2 code implementations 1 May 2020 Zhenqiang Li, Weimin Wang, Zuoyue Li, Yifei Huang, Yoichi Sato

"Making black box models explainable" is a vital problem that accompanies the development of deep learning networks.

Video Understanding

Manipulation-skill Assessment from Videos with Spatial Attention Network

no code implementations 9 Jan 2019 Zhenqiang Li, Yifei Huang, Minjie Cai, Yoichi Sato

Recent advances in computer vision have made it possible to automatically assess from videos the manipulation skills of humans in performing a task, which enables many important applications in domains such as health rehabilitation and manufacturing.

Mutual Context Network for Jointly Estimating Egocentric Gaze and Actions

no code implementations 7 Jan 2019 Yifei Huang, Zhenqiang Li, Minjie Cai, Yoichi Sato

In this work, we address two coupled tasks of gaze prediction and action recognition in egocentric videos by exploring their mutual context.

Action Recognition Gaze Prediction

Understanding hand-object manipulation by modeling the contextual relationship between actions, grasp types and object attributes

no code implementations 22 Jul 2018 Minjie Cai, Kris Kitani, Yoichi Sato

In the proposed model, we explore various semantic relationships between actions, grasp types and object attributes, and show how the context can be used to boost the recognition of each component.

Predicting Gaze in Egocentric Video by Learning Task-dependent Attention Transition

2 code implementations ECCV 2018 Yifei Huang, Minjie Cai, Zhenqiang Li, Yoichi Sato

We present a new computational model for gaze prediction in egocentric videos by exploring patterns in temporal shift of gaze fixations (attention transition) that are dependent on egocentric manipulation tasks.

Gaze Prediction Saliency Prediction

Future Person Localization in First-Person Videos

1 code implementation CVPR 2018 Takuma Yagi, Karttikeya Mangalam, Ryo Yonetani, Yoichi Sato

We present a new task that predicts future locations of people observed in first-person videos.

From RGB to Spectrum for Natural Scenes via Manifold-Based Mapping

no code implementations ICCV 2017 Yan Jia, Yinqiang Zheng, Lin Gu, Art Subpa-Asa, Antony Lam, Yoichi Sato, Imari Sato

Spectral analysis of natural scenes can provide much more detailed information about the scene than an ordinary RGB camera.

Dimensionality Reduction

Fast Multi-frame Stereo Scene Flow with Motion Segmentation

no code implementations CVPR 2017 Tatsunori Taniai, Sudipta N. Sinha, Yoichi Sato

This unified framework benefits all four tasks - stereo, optical flow, visual odometry, and motion segmentation - leading to overall higher accuracy and efficiency.

Motion Segmentation Optical Flow Estimation +3

Hierarchical Gaussian Descriptors with Application to Person Re-Identification

no code implementations 14 Jun 2017 Tetsu Matsukawa, Takahiro Okabe, Einoshin Suzuki, Yoichi Sato

To solve this problem, we describe a local region in an image via hierarchical Gaussian distribution in which both means and covariances are included in their parameters.

Image Classification Person Re-Identification

Ego-Surfing: Person Localization in First-Person Videos Using Ego-Motion Signatures

no code implementations 15 Jun 2016 Ryo Yonetani, Kris M. Kitani, Yoichi Sato

We envision a future time when wearable cameras are worn by the masses and recording first-person point-of-view videos of everyday life.

Video Retrieval

Exploiting Spectral-Spatial Correlation for Coded Hyperspectral Image Restoration

no code implementations CVPR 2016 Ying Fu, Yinqiang Zheng, Imari Sato, Yoichi Sato

In this paper, we propose an effective method for coded hyperspectral image restoration, which exploits extensive structure sparsity in the hyperspectral image.

Image Restoration

Joint Recovery of Dense Correspondence and Cosegmentation in Two Images

no code implementations CVPR 2016 Tatsunori Taniai, Sudipta N. Sinha, Yoichi Sato

We propose a new technique to jointly recover cosegmentation and dense per-pixel correspondence in two images.

Recognizing Micro-Actions and Reactions From Paired Egocentric Videos

no code implementations CVPR 2016 Ryo Yonetani, Kris M. Kitani, Yoichi Sato

We aim to understand the dynamics of social interactions between two people by recognizing their actions and reactions using a head-mounted camera.

Video Summarization

Hierarchical Gaussian Descriptor for Person Re-Identification

no code implementations CVPR 2016 Tetsu Matsukawa, Takahiro Okabe, Einoshin Suzuki, Yoichi Sato

In both steps, unlike the hierarchical covariance descriptor, the proposed descriptor can model both the mean and the covariance information of pixel features properly.

Image Classification Person Re-Identification

Continuous 3D Label Stereo Matching using Local Expansion Moves

2 code implementations 28 Mar 2016 Tatsunori Taniai, Yasuyuki Matsushita, Yoichi Sato, Takeshi Naemura

The local expansion moves extend traditional expansion moves in two ways: localization and spatial propagation.

Patch Matching Stereo Matching +1

Adaptive Spatial-Spectral Dictionary Learning for Hyperspectral Image Denoising

no code implementations ICCV 2015 Ying Fu, Antony Lam, Imari Sato, Yoichi Sato

Hyperspectral imaging is beneficial in a diverse range of applications, from diagnostic medicine to agriculture to surveillance, to name a few.

Dictionary Learning Hyperspectral Image Denoising +1

Separating Fluorescent and Reflective Components by Using a Single Hyperspectral Image

no code implementations ICCV 2015 Yinqiang Zheng, Ying Fu, Antony Lam, Imari Sato, Yoichi Sato

This paper introduces a novel method to separate fluorescent and reflective components in the spectral domain.

Ego-Surfing First-Person Videos

no code implementations CVPR 2015 Ryo Yonetani, Kris M. Kitani, Yoichi Sato

We incorporate this feature into our proposed approach that computes the motion correlation over supervoxel hierarchies to localize target instances in observer videos.

Uncalibrated Photometric Stereo Based on Elevation Angle Recovery From BRDF Symmetry of Isotropic Materials

no code implementations CVPR 2015 Feng Lu, Imari Sato, Yoichi Sato

This sort of symmetry can be observed in a 1D BRDF slice from a subset of surface normals with the same azimuth angle, and we use it to devise an efficient modeling and solution method to constrain and recover the elevation angles of surface normals accurately.

Illumination and Reflectance Spectra Separation of a Hyperspectral Image Meets Low-Rank Matrix Factorization

no code implementations CVPR 2015 Yinqiang Zheng, Imari Sato, Yoichi Sato

This paper addresses the illumination and reflectance spectra separation (IRSS) problem of a hyperspectral image captured under general spectral illumination.

Learning-by-Synthesis for Appearance-based 3D Gaze Estimation

no code implementations CVPR 2014 Yusuke Sugano, Yasuyuki Matsushita, Yoichi Sato

Unlike existing appearance-based methods that assume person-specific training data, we use a large amount of cross-subject training data to train a 3D gaze estimator.

3D Reconstruction Gaze Estimation

Reflectance and Fluorescent Spectra Recovery based on Fluorescent Chromaticity Invariance under Varying Illumination

no code implementations CVPR 2014 Ying Fu, Antony Lam, Yasuyuki Kobashi, Imari Sato, Takahiro Okabe, Yoichi Sato

We then show that given the spectral reflectance and fluorescent chromaticity, the fluorescence absorption and emission spectra can also be estimated.

3D Reconstruction

Shape-Preserving Half-Projective Warps for Image Stitching

no code implementations CVPR 2014 Che-Han Chang, Yoichi Sato, Yung-Yu Chuang

It provides alignment accuracy as good as projective warps while preserving the perspective of individual images as similarity warps do.

Image Stitching

Uncalibrated Photometric Stereo for Unknown Isotropic Reflectances

no code implementations CVPR 2013 Feng Lu, Yasuyuki Matsushita, Imari Sato, Takahiro Okabe, Yoichi Sato

We propose an uncalibrated photometric stereo method that works with general and unknown isotropic reflectances.
