Search Results for author: Akihiro Sugimoto

Found 19 papers, 5 papers with code

Region-Based Multiscale Spatiotemporal Saliency for Video

no code implementations 4 Aug 2017 Trung-Nghia Le, Akihiro Sugimoto

Detecting salient objects from a video requires exploiting both spatial and temporal knowledge included in the video.

Saliency Detection

Video Salient Object Detection Using Spatiotemporal Deep Features

no code implementations 4 Aug 2017 Trung-Nghia Le, Akihiro Sugimoto

STCRF is our extension of CRF to the temporal domain and describes the relationships among neighboring regions both in a frame and over frames.

Object object-detection +4

Semantic Instance Meets Salient Object: Study on Video Semantic Salient Instance Segmentation

no code implementations 4 Jul 2018 Trung-Nghia Le, Akihiro Sugimoto

In addition, to tackle the task of VSSIS, we augment the DAVIS-2017 benchmark dataset by assigning semantic ground-truth for salient instance labels, obtaining SEmantic Salient Instance Video (SESIV) dataset.

Instance Segmentation Robot Navigation +3

Linear solution to the minimal absolute pose rolling shutter problem

no code implementations 30 Dec 2018 Zuzana Kukelova, Cenek Albl, Akihiro Sugimoto, Tomas Pajdla

Our best 6-point solver, based on the new alternation technique, shows an identical or even better performance than the state-of-the-art R6P solver and is two orders of magnitude faster.

Visual-Relation Conscious Image Generation from Structured-Text

no code implementations ECCV 2020 Duc Minh Vo, Akihiro Sugimoto

We also use individual relation separately to predict from the initial bounding-boxes relation-units for all the relations in the input text.

Relation Text-to-Image Generation

Two-Stream FCNs to Balance Content and Style for Style Transfer

no code implementations 19 Nov 2019 Duc Minh Vo, Akihiro Sugimoto

The semantic content feature and the style representation feature are then concatenated adaptively and fed into the decoder to generate style-transferred (stylized) images.

Style Transfer Vocal Bursts Valence Prediction

TetraTSDF: 3D human reconstruction from a single image with a tetrahedral outer shell

1 code implementation CVPR 2020 Hayato Onizuka, Zehra Hayirci, Diego Thomas, Akihiro Sugimoto, Hideaki Uchiyama, Rin-ichiro Taniguchi

In this paper, we propose the tetrahedral outer shell volumetric truncated signed distance function (TetraTSDF) model for the human body, and its corresponding part connection network (PCN) for 3D human body shape regression.

3D Human Reconstruction regression

Anabranch Network for Camouflaged Object Segmentation

2 code implementations Computer Vision and Image Understanding 2019 Trung-Nghia Le, Tam V. Nguyen, Zhongliang Nie, Minh-Triet Tran, Akihiro Sugimoto

Different from existing networks for segmentation, our proposed network possesses the second branch for classification to predict the probability of containing camouflaged object(s) in an image, which is then fused into the main branch for segmentation to boost up the segmentation accuracy.

Benchmarking Camouflaged Object Segmentation +3

Agent-Environment Network for Temporal Action Proposal Generation

no code implementations 17 Jul 2021 Viet-Khoa Vo-Ho, Ngan Le, Kashu Yamazaki, Akihiro Sugimoto, Minh-Triet Tran

Temporal action proposal generation is an essential and challenging task that aims at localizing temporal intervals containing human actions in untrimmed videos.

Temporal Action Proposal Generation

PPCD-GAN: Progressive Pruning and Class-Aware Distillation for Large-Scale Conditional GANs Compression

no code implementations 16 Mar 2022 Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama

We push forward neural network compression research by exploiting a novel challenging task of large-scale conditional generative adversarial networks (GANs) compression.

Neural Network Compression

ABN: Agent-Aware Boundary Networks for Temporal Action Proposal Generation

1 code implementation 16 Mar 2022 Khoa Vo, Kashu Yamazaki, Sang Truong, Minh-Triet Tran, Akihiro Sugimoto, Ngan Le

Temporal action proposal generation (TAPG) aims to estimate temporal intervals of actions in untrimmed videos, which is challenging yet plays an important role in many tasks of video analysis and understanding.

Action Detection Temporal Action Proposal Generation

NOC-REK: Novel Object Captioning with Retrieved Vocabulary from External Knowledge

no code implementations CVPR 2022 Duc Minh Vo, Hong Chen, Akihiro Sugimoto, Hideki Nakayama

We propose an end-to-end Novel Object Captioning with Retrieved vocabulary from External Knowledge method (NOC-REK), which simultaneously learns vocabulary retrieval and caption generation, successfully describing novel objects outside of the training dataset.

Object object-detection +2

Video Sparse Transformer With Attention-Guided Memory for Video Object Detection

1 code implementation IEEE Access 2022 Masato Fujitake, Akihiro Sugimoto

In this paper, we enhance features element-wisely before the object candidate region detection, proposing Video Sparse Transformer with Attention-guided Memory (VSTAM).

Object object-detection +3

EVCap: Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension

no code implementations 27 Nov 2023 Jiaxuan Li, Duc Minh Vo, Akihiro Sugimoto, Hideki Nakayama

Large language models (LLMs)-based image captioning has the capability of describing objects not explicitly observed in training data; yet novel objects occur frequently, necessitating the requirement of sustaining up-to-date object knowledge for open-world comprehension.

Image Captioning Object +1
