Search Results for author: Takayuki Okatani

Found 61 papers, 20 papers with code

Temporal Insight Enhancement: Mitigating Temporal Hallucination in Multimodal Large Language Models

no code implementations 18 Jan 2024 Li Sun, Liuan Wang, Jun Sun, Takayuki Okatani

This study introduces an innovative method to address event-level hallucinations in MLLMs, focusing on specific temporal understanding in video content.

Hallucination

SBCFormer: Lightweight Network Capable of Full-size ImageNet Classification at 1 FPS on Single Board Computers

1 code implementation 7 Nov 2023 Xiangyong Lu, Masanori Suganuma, Takayuki Okatani

For the first time, it achieves an ImageNet-1K top-1 accuracy of around 80% at a speed of 1.0 frame/sec on the SBC.

Image Classification

Visual Abductive Reasoning Meets Driving Hazard Prediction

1 code implementation 7 Oct 2023 Korawat Charoenpitaks, Van-Quang Nguyen, Masanori Suganuma, Masahiro Takahashi, Ryoma Niihara, Takayuki Okatani

To enable research in this understudied area, a new dataset named the DHPR (Driving Hazard Prediction and Reasoning) dataset is created.

Anomaly Detection Visual Abductive Reasoning

That's BAD: Blind Anomaly Detection by Implicit Local Feature Clustering

no code implementations 6 Jul 2023 Jie Zhang, Masanori Suganuma, Takayuki Okatani

Previous methods consider an unsupervised setting, specifically the one-class setting, which assumes the availability of a set of normal (i.e., anomaly-free) images for training.

Anomaly Detection Clustering +1

Reference-based Motion Blur Removal: Learning to Utilize Sharpness in the Reference Image

no code implementations 6 Jul 2023 Han Zou, Masanori Suganuma, Takayuki Okatani

We can use another shot of the same scene, as in video deblurring, or even an image of a different scene.

Deblurring Image Deblurring

RefVSR++: Exploiting Reference Inputs for Reference-based Video Super-resolution

no code implementations 6 Jul 2023 Han Zou, Masanori Suganuma, Takayuki Okatani

Then, we propose an improved method, RefVSR++, which aggregates two feature streams in parallel along the temporal direction: one aggregating the fused LR and Ref inputs, and the other aggregating the Ref inputs over time (see the sketch after this entry).

Reference-based Video Super-Resolution Video Super-Resolution
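
A minimal sketch of the parallel temporal aggregation idea from the RefVSR++ entry above, assuming a generic recurrent formulation; the module and variable names here are illustrative, not the authors' implementation.

    # Illustrative only: a generic two-stream recurrent aggregator, not the
    # authors' RefVSR++ architecture.
    import torch
    import torch.nn as nn

    class ParallelTemporalAggregator(nn.Module):
        def __init__(self, channels: int = 64):
            super().__init__()
            self.fuse = nn.Conv2d(2 * channels, channels, 3, padding=1)          # fuse LR and Ref
            self.update_fused = nn.Conv2d(2 * channels, channels, 3, padding=1)  # stream 1 update
            self.update_ref = nn.Conv2d(2 * channels, channels, 3, padding=1)    # stream 2 update

        def forward(self, lr_feats, ref_feats):
            # lr_feats, ref_feats: lists of per-frame feature maps, each (B, C, H, W)
            h_fused = torch.zeros_like(lr_feats[0])
            h_ref = torch.zeros_like(ref_feats[0])
            outputs = []
            for lr, ref in zip(lr_feats, ref_feats):
                fused = torch.relu(self.fuse(torch.cat([lr, ref], dim=1)))
                # the two hidden states are propagated in parallel along time
                h_fused = torch.relu(self.update_fused(torch.cat([h_fused, fused], dim=1)))
                h_ref = torch.relu(self.update_ref(torch.cat([h_ref, ref], dim=1)))
                outputs.append(h_fused + h_ref)
            return outputs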

Contextual Affinity Distillation for Image Anomaly Detection

no code implementations 6 Jul 2023 Jie Zhang, Masanori Suganuma, Takayuki Okatani

The local student, which is used in previous studies, mainly focuses on structural anomaly detection, while the global student attends to logical anomalies.

Anomaly Detection Knowledge Distillation

Bridge Damage Cause Estimation Using Multiple Images Based on Visual Question Answering

no code implementations 18 Feb 2023 Tatsuro Yamane, Pang-jo Chun, Ji Dang, Takayuki Okatani

For this, a VQA model was developed that uses bridge images for dataset creation and outputs the damage type or member name and whether it is present, given the images and questions.

Question Answering Visual Question Answering

SuperGF: Unifying Local and Global Features for Visual Localization

no code implementations 23 Dec 2022 Wenzheng Song, Ran Yan, Boshu Lei, Takayuki Okatani

In this study, we present a novel method called SuperGF, which effectively unifies local and global features for visual localization, achieving a better trade-off between localization accuracy and computational efficiency.

Computational Efficiency Image Retrieval +4

GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features

2 code implementations 20 Jul 2022 Van-Quang Nguyen, Masanori Suganuma, Takayuki Okatani

Current state-of-the-art methods for image captioning employ region-based features, as they provide object-level information that is essential to describe the content of images; they are usually extracted by an object detector such as Faster R-CNN.

Image Captioning

Rectifying Open-set Object Detection: A Taxonomy, Practical Applications, and Proper Evaluation

no code implementations 20 Jul 2022 Yusuke Hosoya, Masanori Suganuma, Takayuki Okatani

In this paper, we first point out that the recent studies' formalization of OSOD, which generalizes open-set recognition (OSR) and thus considers an unlimited variety of unknown objects, has a fundamental issue.

Image Classification Object +3

Single-image Defocus Deblurring by Integration of Defocus Map Prediction Tracing the Inverse Problem Computation

no code implementations 7 Jul 2022 Qian Ye, Masanori Suganuma, Takayuki Okatani

Considering the spatially variant nature of defocus blur and the blur level indicated in the defocus map, we employ the defocus map as conditional guidance to adjust the features of the input blurred images, rather than simply concatenating it (see the sketch below).

Deblurring Image Deblurring +1
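
A minimal sketch of the conditional-guidance idea described above, under the assumption of a FiLM-style modulation; the block and parameter names are hypothetical, not taken from the paper: the defocus map predicts spatially varying scale and shift terms that adjust the image features, rather than being concatenated with them.

    # Hypothetical illustration of defocus-map-conditioned feature modulation;
    # not the paper's actual network.
    import torch
    import torch.nn as nn

    class DefocusConditionedBlock(nn.Module):
        def __init__(self, channels: int = 64):
            super().__init__()
            self.to_scale = nn.Conv2d(1, channels, 3, padding=1)  # from the 1-channel defocus map
            self.to_shift = nn.Conv2d(1, channels, 3, padding=1)
            self.body = nn.Conv2d(channels, channels, 3, padding=1)

        def forward(self, feats: torch.Tensor, defocus_map: torch.Tensor) -> torch.Tensor:
            # the blur level at each pixel controls how strongly the features are adjusted
            scale = torch.sigmoid(self.to_scale(defocus_map))
            shift = self.to_shift(defocus_map)
            return torch.relu(self.body(feats * scale + shift))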

Learning Regularized Multi-Scale Feature Flow for High Dynamic Range Imaging

no code implementations 6 Jul 2022 Qian Ye, Masanori Suganuma, Jun Xiao, Takayuki Okatani

Reconstructing ghosting-free high dynamic range (HDR) images of dynamic scenes from a set of multi-exposure images is a challenging task, especially with large object motion and occlusions, which lead to visible artifacts in existing methods.

High Dynamic Range Imaging

Rethinking Unsupervised Domain Adaptation for Semantic Segmentation

2 code implementations 30 Jun 2022 Zhijie Wang, Masanori Suganuma, Takayuki Okatani

Due to the high annotation cost of semantic segmentation, researchers have developed many UDA methods for this task, which assume that no labeled sample is available in the target domain.

Semantic Segmentation Unsupervised Domain Adaptation

Symmetry-aware Neural Architecture for Embodied Visual Navigation

no code implementations 17 Dec 2021 Shuang Liu, Takayuki Okatani

We then propose a network design for the actor and the critic to inherently attain these symmetries.

Reinforcement Learning (RL) Visual Navigation

Improved Few-shot Segmentation by Redefinition of the Roles of Multi-level CNN Features

no code implementations 14 Sep 2021 Zhijie Wang, Masanori Suganuma, Takayuki Okatani

This study is concerned with few-shot segmentation, i.e., segmenting the region of an unseen object class in a query image, given support image(s) of its instances.

Cross-Region Domain Adaptation for Class-level Alignment

no code implementations 14 Sep 2021 Zhijie Wang, Xing Liu, Masanori Suganuma, Takayuki Okatani

To cope with this, we propose a method that applies adversarial training to align two feature distributions in the target domain.

Semantic Segmentation Synthetic-to-Real Translation +1

Matching in the Dark: A Dataset for Matching Image Pairs of Low-light Scenes

no code implementations ICCV 2021 Wenzheng Song, Masanori Suganuma, Xing Liu, Noriyuki Shimobayashi, Daisuke Maruta, Takayuki Okatani

To examine whether and how well such information stored in RAW-format images can be utilized for image matching, we have created a new dataset named MID (Matching in the Dark).

Look Wide and Interpret Twice: Improving Performance on Interactive Instruction-following Tasks

1 code implementation 1 Jun 2021 Van-Quang Nguyen, Masanori Suganuma, Takayuki Okatani

It then integrates the prediction with the visual information and other inputs, yielding the final prediction of an action and an object.

Instruction Following

Pushing the Envelope of Thin Crack Detection

no code implementations 9 Jan 2021 Liang Xu, Taro Hatsutani, Xing Liu, Engkarat Techapanurak, Han Zou, Takayuki Okatani

We experimentally show that this makes it possible to detect cracks, with about the same accuracy, from an image at one-third the resolution of the images used for annotation.

Bridging In- and Out-of-distribution Samples for Their Better Discriminability

no code implementations 7 Jan 2021 Engkarat Techapanurak, Anh-Chuong Dang, Takayuki Okatani

We estimate where samples generated by a single image transformation lie between ID and OOD, using a network trained on clean ID samples.

Out of Distribution (OOD) Detection

How Can CNNs Use Image Position for Segmentation?

no code implementations 7 May 2020 Rito Murase, Masanori Suganuma, Takayuki Okatani

We draw a mixed conclusion from the experimental results: positional encoding certainly works in some cases, but absolute image position may not be as important for segmentation tasks as is often assumed.

Image Segmentation Medical Image Segmentation +4

Efficient Attention Mechanism for Visual Dialog that can Handle All the Interactions between Multiple Inputs

1 code implementation ECCV 2020 Van-Quang Nguyen, Masanori Suganuma, Takayuki Okatani

It has been a primary concern in recent studies of vision and language tasks to design an effective attention mechanism dealing with interactions between the two modalities.

Visual Dialog

Analysis of Deep Networks for Monocular Depth Estimation Through Adversarial Attacks with Proposal of a Defense Method

no code implementations 20 Nov 2019 Junjie Hu, Takayuki Okatani

However, the prediction of saliency maps is itself vulnerable to the attacks, even though it is not the direct target of the attacks.

Monocular Depth Estimation

Analysis and a Solution of Momentarily Missed Detection for Anchor-based Object Detectors

no code implementations 21 Oct 2019 Yusuke Hosoya, Masanori Suganuma, Takayuki Okatani

The employment of convolutional neural networks has led to significant performance improvement on the task of object detection.

object-detection Object Detection

Restoring Images with Unknown Degradation Factors by Recurrent Use of a Multi-branch Network

1 code implementation 10 Jul 2019 Xing Liu, Masanori Suganuma, Xiyang Luo, Takayuki Okatani

The employment of convolutional neural networks has achieved unprecedented performance in the task of image restoration for a variety of degradation factors.

Deblurring JPEG Artifact Removal +1

Evaluating Artificial Systems for Pairwise Ranking Tasks Sensitive to Individual Differences

no code implementations 30 May 2019 Xing Liu, Takayuki Okatani

There is another type of task in which the prediction target is human perception itself, where individual differences are often present.

Hyperparameter-Free Out-of-Distribution Detection Using Softmax of Scaled Cosine Similarity

1 code implementation 25 May 2019 Engkarat Techapanurak, Masanori Suganuma, Takayuki Okatani

The ability to detect out-of-distribution (OOD) samples is vital to secure the reliability of deep neural networks in real-world applications.

Metric Learning Out-of-Distribution Detection
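
The title above states the core idea: class logits are scaled cosine similarities, with the scale learned rather than hand-tuned, and the softmax over them scores in-distribution samples. A minimal sketch under that reading; layer names and the specific scoring function are assumptions, not taken from the paper.

    # Sketch of a cosine-similarity classification head with a learned scale;
    # the max softmax probability then serves as an in-distribution score.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ScaledCosineHead(nn.Module):
        def __init__(self, feat_dim: int, num_classes: int):
            super().__init__()
            self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))
            self.scale = nn.Parameter(torch.tensor(10.0))  # learned, so no hand-tuned temperature

        def forward(self, features: torch.Tensor) -> torch.Tensor:
            cos = F.linear(F.normalize(features, dim=1), F.normalize(self.weight, dim=1))
            return self.scale * cos  # logits = scaled cosine similarities

    def in_distribution_score(logits: torch.Tensor) -> torch.Tensor:
        # lower values suggest the input is more likely out-of-distribution
        return logits.softmax(dim=1).max(dim=1).values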

Visualization of Convolutional Neural Networks for Monocular Depth Estimation

1 code implementation ICCV 2019 Junjie Hu, Yan Zhang, Takayuki Okatani

We formulate it as an optimization problem of identifying the smallest number of image pixels from which the CNN can estimate a depth map with the minimum difference from the estimate from the entire image.

Interpretable Machine Learning
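
The stated formulation can be written compactly. The notation below is one plausible reading, not taken from the paper: M is a binary pixel mask, f the trained depth network, I the input image, and epsilon a tolerance on the deviation from the full-image estimate.

    \min_{M \in \{0,1\}^{H \times W}} \; \|M\|_0
    \quad \text{subject to} \quad
    \big\| f(I \odot M) - f(I) \big\| \le \epsilon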

Toward Explainable Fashion Recommendation

no code implementations 15 Jan 2019 Pongsate Tangseng, Takayuki Okatani

For this purpose, we propose a method for quantifying how influential each feature of each item is to the score.

Multi-task Learning of Hierarchical Vision-Language Representation

no code implementations CVPR 2019 Duy-Kien Nguyen, Takayuki Okatani

The representation is hierarchical, and prediction for each task is computed from the representation at its corresponding level of the hierarchy.

Multi-Task Learning Question Answering +3

Feature Quantization for Defending Against Distortion of Images

no code implementations CVPR 2018 Zhun Sun, Mete Ozay, Yan Zhang, Xing Liu, Takayuki Okatani

In this work, we address the problem of improving robustness of convolutional neural networks (CNNs) to image distortion.

Quantization

Recommending Outfits from Personal Closet

no code implementations 26 Apr 2018 Pongsate Tangseng, Kota Yamaguchi, Takayuki Okatani

We consider grading a fashion outfit for recommendation, where we assume that users have a closet of items and we aim at producing a score for an arbitrary combination of items in the closet.

General Classification

Improved Fusion of Visual and Language Representations by Dense Symmetric Co-Attention for Visual Question Answering

1 code implementation CVPR 2018 Duy-Kien Nguyen, Takayuki Okatani

A key solution to visual question answering (VQA) exists in how to fuse visual and language features extracted from an input image and question.

Visual Question Answering
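
A minimal sketch of symmetric co-attention fusion between visual and language features, in the spirit of the entry above; this is a generic formulation, and the paper's dense co-attention layers may differ in detail.

    # Generic symmetric co-attention: each modality attends to the other
    # through a shared affinity matrix. Illustrative, not the paper's exact design.
    import torch
    import torch.nn as nn

    class SymmetricCoAttention(nn.Module):
        def __init__(self, dim: int = 512):
            super().__init__()
            self.affinity = nn.Linear(dim, dim, bias=False)

        def forward(self, vis: torch.Tensor, lang: torch.Tensor):
            # vis: (B, Nv, D) image-region features; lang: (B, Nl, D) word features
            A = torch.bmm(self.affinity(vis), lang.transpose(1, 2))           # (B, Nv, Nl)
            vis_attended = torch.bmm(A.softmax(dim=2), lang)                  # words -> regions
            lang_attended = torch.bmm(A.softmax(dim=1).transpose(1, 2), vis)  # regions -> words
            return vis + vis_attended, lang + lang_attended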

Revisiting Single Image Depth Estimation: Toward Higher Resolution Maps with Accurate Object Boundaries

4 code implementations 23 Mar 2018 Junjie Hu, Mete Ozay, Yan Zhang, Takayuki Okatani

Experimental results show that these two improvements enable higher accuracy than the current state of the art, achieved through finer-resolution reconstruction, for example, of small objects and object boundaries.

Monocular Depth Estimation

Exploiting the Potential of Standard Convolutional Autoencoders for Image Restoration by Evolutionary Search

1 code implementation ICML 2018 Masanori Suganuma, Mete Ozay, Takayuki Okatani

Researchers have applied deep neural networks to image restoration tasks, in which they proposed various network architectures, loss functions, and training methods.

Image Restoration

Deep Structured Energy-Based Image Inpainting

1 code implementation 24 Jan 2018 Fazil Altinel, Mete Ozay, Takayuki Okatani

In this paper, we propose a structured image inpainting method employing an energy-based model.

Image Inpainting Structured Prediction

A vision based system for underwater docking

no code implementations 12 Dec 2017 Shuang Liu, Mete Ozay, Takayuki Okatani, Hongli Xu, Kai Sun, Yang Lin

In the experiments, we first evaluate the performance of the proposed detection module on UDID and its deformed variations.

Pose Estimation Position

HyperNetworks with statistical filtering for defending adversarial examples

no code implementations 6 Nov 2017 Zhun Sun, Mete Ozay, Takayuki Okatani

This problem was addressed by employing several defense methods for detection and rejection of particular types of attacks.

General Classification Image Classification

End-to-end learning potentials for structured attribute prediction

no code implementations 6 Aug 2017 Kota Yamaguchi, Takayuki Okatani, Takayuki Umeda, Kazuhiko Murasaki, Kyoko Sudo

We present a structured inference approach in deep neural networks for multiple attribute prediction.

Attribute

Linear Discriminant Generative Adversarial Networks

no code implementations 25 Jul 2017 Zhun Sun, Mete Ozay, Takayuki Okatani

We develop a novel method for training of GANs for unsupervised and class conditional generation of images, called Linear Discriminant GAN (LD-GAN).

Improving Robustness of Feature Representations to Image Deformations using Powered Convolution in CNNs

no code implementations 25 Jul 2017 Zhun Sun, Mete Ozay, Takayuki Okatani

In this work, we address the problem of improving the robustness of feature representations learned using convolutional neural networks (CNNs) to image deformations.

object-detection Object Detection +1

Information Potential Auto-Encoders

no code implementations 14 Jun 2017 Yan Zhang, Mete Ozay, Zhun Sun, Takayuki Okatani

In order to estimate the entropy of the encoding variables and the mutual information, we propose a non-parametric method.

Representation Learning

Truncating Wide Networks using Binary Tree Architectures

1 code implementation ICCV 2017 Yan Zhang, Mete Ozay, Shuo-Hao Li, Takayuki Okatani

By employing the proposed architecture on a baseline wide network, we can construct and train a new network with the same depth but a considerably smaller number of parameters.

General Classification Image Classification

Optimization on Product Submanifolds of Convolution Kernels

no code implementations 22 Jan 2017 Mete Ozay, Takayuki Okatani

The results show that the geometry-adaptive step-size computation methods of G-SGD can improve the training loss and convergence properties of CNNs.

Optimization on Submanifolds of Convolution Kernels in CNNs

no code implementations 22 Oct 2016 Mete Ozay, Takayuki Okatani

Following our theoretical results, we propose an SGD algorithm that is guaranteed to converge almost surely to a solution at a single minimum of the classification loss of CNNs.

General Classification Image Classification

Automatic Attribute Discovery with Neural Activations

1 code implementation 25 Jul 2016 Sirion Vittayakorn, Takayuki Umeda, Kazuhiko Murasaki, Kyoko Sudo, Takayuki Okatani, Kota Yamaguchi

This paper proposes an automatic approach to discover and analyze visual attributes from a noisy collection of image-text data on the Web.

Attribute

Design of Kernels in Convolutional Neural Networks for Image Classification

1 code implementation 30 Nov 2015 Zhun Sun, Mete Ozay, Takayuki Okatani

Despite the effectiveness of Convolutional Neural Networks (CNNs) for image classification, our understanding of the relationship between shape of convolution kernels and learned representations is limited.

Classification General Classification +1

Integrating Deep Features for Material Recognition

no code implementations 20 Nov 2015 Yan Zhang, Mete Ozay, Xing Liu, Takayuki Okatani

We propose a method for material recognition that integrates features extracted from deep representations of Convolutional Neural Networks (CNNs), each of which is learned on a different image dataset of objects and materials.

feature selection Material Recognition +1

Transformation of Markov Random Fields for Marginal Distribution Estimation

no code implementations CVPR 2015 Masaki Saito, Takayuki Okatani

Although downsizing MRFs should directly reduce the computational cost, there is no systematic way of doing this, since it is unclear how to obtain the MRF energy for the downsized MRFs and also how to translate the estimates of their marginal distributions to those of the original MRFs.

Detecting Changes in 3D Structure of a Scene from Multi-view Images Captured by a Vehicle-Mounted Camera

no code implementations CVPR 2013 Ken Sakurada, Takayuki Okatani, Koichiro Deguchi

The proposed method is compared with the methods that use multi-view stereo (MVS) to reconstruct the scene structures of the two time points and then differentiate them to detect changes.

Change Detection
