Search Results for author: Satoshi Tsutsui

Found 22 papers, 11 papers with code

A Data Driven Approach for Compound Figure Separation Using Convolutional Neural Networks

no code implementations • 15 Mar 2017 • Satoshi Tsutsui, David Crandall

CNNs eliminate the need for manually designing features and separation rules, but require a large amount of annotated training data.

Transfer Learning

Using Artificial Tokens to Control Languages for Multilingual Image Caption Generation

1 code implementation • 20 Jun 2017 • Satoshi Tsutsui, David Crandall

Recent work in computer vision has yielded impressive results in automatically describing images with natural language.

Caption Generation

Distantly Supervised Road Segmentation

no code implementations • 21 Aug 2017 • Satoshi Tsutsui, Tommi Kerola, Shunta Saito

We present an approach for road segmentation that only requires image-level annotations at training time.

Road Segmentation, Segmentation

Minimizing Supervision for Free-space Segmentation

1 code implementation • 16 Nov 2017 • Satoshi Tsutsui, Tommi Kerola, Shunta Saito, David J. Crandall

Our work demonstrates the potential for performing free-space segmentation without tedious and costly manual annotation, which will be important for adapting autonomous driving systems to different types of vehicles and environments.

Autonomous Driving, Autonomous Navigation, +3

edge2vec: Representation learning using edge semantics for biomedical knowledge discovery

1 code implementation • 7 Sep 2018 • Zheng Gao, Gang Fu, Chunping Ouyang, Satoshi Tsutsui, Xiaozhong Liu, Jeremy Yang, Christopher Gessner, Brian Foote, David Wild, Qi Yu, Ying Ding

We propose this method for its added value relative to existing graph analytical methodology, and for its applicability in the real-world context of biomedical knowledge discovery.

Biomedical Information Retrieval, Information Retrieval, +3

Active Object Manipulation Facilitates Visual Object Learning: An Egocentric Vision Study

no code implementations • 4 Jun 2019 • Satoshi Tsutsui, Dian Zhi, Md. Alimoor Reza, David Crandall, Chen Yu

Inspired by the remarkable ability of the infant visual learning system, a recent study collected first-person images from children to analyze the 'training data' that they receive.

Few-Shot Learning, Object

A Computational Model of Early Word Learning from the Infant's Point of View

1 code implementation • 4 Jun 2020 • Satoshi Tsutsui, Arjun Chandrasekaran, Md. Alimoor Reza, David Crandall, Chen Yu

Human infants have the remarkable ability to learn the associations between object names and visual objects from inherently ambiguous experiences.

Whose hand is this? Person Identification from Egocentric Hand Gestures

no code implementations • 17 Nov 2020 • Satoshi Tsutsui, Yanwei Fu, David Crandall

But while one's own face is not frequently visible, their hands are: in fact, hands are among the most common objects in one's own field of view.

Gesture Recognition, Person Identification

How You Move Your Head Tells What You Do: Self-supervised Video Representation Learning with Egocentric Cameras and IMU Sensors

no code implementations • 4 Oct 2021 • Satoshi Tsutsui, Ruta Desai, Karl Ridgeway

We are particularly interested in learning egocentric video representations benefiting from the head-motion generated by users' daily activities, which can be easily obtained from IMU sensors embedded in AR/VR devices.

Representation Learning, Self-Supervised Learning

AVA-AVD: Audio-Visual Speaker Diarization in the Wild

7 code implementations • 29 Nov 2021 • Eric Zhongcong Xu, Zeyang Song, Satoshi Tsutsui, Chao Feng, Mang Ye, Mike Zheng Shou

Audio-visual speaker diarization aims at detecting "who spoke when" using both auditory and visual signals.

Relation Network, speaker-diarization, +1

Novel View Synthesis for High-fidelity Headshot Scenes

1 code implementation • 31 May 2022 • Satoshi Tsutsui, Weijia Mao, Sijing Lin, Yunyi Zhu, Murong Ma, Mike Zheng Shou

Based on these observations, we propose a method to use both NeRF and 3DMM to synthesize a high-fidelity novel view of a scene with a face.

Generative Adversarial Network, Novel View Synthesis, +1

Action Recognition based on Cross-Situational Action-object Statistics

1 code implementation • 15 Aug 2022 • Satoshi Tsutsui, Xizi Wang, Guangyuan Weng, Yayun Zhang, David Crandall, Chen Yu

We set out to identify properties of training data that lead to action recognition models with greater generalization ability.

Action Recognition, Object, +1

Single-Stage Open-world Instance Segmentation with Cross-task Consistency Regularization

1 code implementation • 18 Aug 2022 • Xizhe Xue, Dongdong Yu, Lingqiao Liu, Yu Liu, Satoshi Tsutsui, Ying Li, Zehuan Yuan, Ping Song, Mike Zheng Shou

Based on the single-stage instance segmentation framework, we propose a regularization model to predict foreground pixels and use its relation to instance segmentation to construct a cross-task consistency loss.

Autonomous Driving, Object, +3

Benchmarking White Blood Cell Classification Under Domain Shift

1 code implementation • 3 Mar 2023 • Satoshi Tsutsui, Zhengyang Su, Bihan Wen

Recognizing the types of white blood cells (WBCs) in microscopic images of human blood smears is a fundamental task in the fields of pathology and hematology.

Benchmarking, Classification

Mover: Mask and Recovery based Facial Part Consistency Aware Method for Deepfake Video Detection

no code implementations • 3 Mar 2023 • Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Qian Wang, Zheng Qin, Mike Zheng Shou

Specifically, given a real face image, we first pretrain a masked autoencoder to learn facial part consistency by dividing faces into three parts and randomly masking ROIs, which are then recovered based on the unmasked facial parts.

DeepFake Detection, Face Swapping

Recap: Detecting Deepfake Video with Unpredictable Tampered Traces via Recovering Faces and Mapping Recovered Faces

no code implementations • 19 Aug 2023 • Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Qian Wang, Zheng Qin, Mike Zheng Shou

In the recovering stage, the model focuses on randomly masking regions of interest (ROIs) and reconstructing real faces without unpredictable tampered traces, resulting in a relatively good recovery effect for real faces while a poor recovery effect for fake faces.

DeepFake Detection, Face Swapping
