Search Results for author: Masatoshi Okutomi

Found 52 papers, 12 papers with code

Disparity Estimation Using a Quad-Pixel Sensor

1 code implementation 1 Sep 2024 Zhuofeng Wu, Doehyung Lee, Zihua Liu, Kazunori Yoshizaki, Yusuke Monno, Masatoshi Okutomi

Similar to a dual-pixel (DP) sensor, the phase shifting can be regarded as stereo disparity and utilized for depth estimation.

Depth Estimation Disparity Estimation +1

Diffusion-Based Adaptation for Classification of Unknown Degraded Images

1 code implementation CVPRW 2024 Dinesh Daultani, Masayuki Tanaka, Masatoshi Okutomi, Kazuki Endo

To address the issue of imperfect adapted clean images from diffusion models for the classification of degraded images, we propose a novel Diffusion-based Adaptation for Unknown Degraded images (DiffAUD) method based on robust classifiers trained on a few known degradations.

Ranked #6 on Domain Generalization on ImageNet-C (Top 1 Accuracy metric, using extra training data)

Classification Domain Generalization +2

Neural Radiance Fields for Novel View Synthesis in Monocular Gastroscopy

no code implementations 29 May 2024 Zijie Jiang, Yusuke Monno, Masatoshi Okutomi, Sho Suzuki, Kenji Miki

Enabling the synthesis of arbitrarily novel viewpoint images within a patient's stomach from pre-captured monocular gastroscopic images is a promising topic in stomach diagnosis.

3D Reconstruction Novel View Synthesis +1

VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection

1 code implementation CVPR 2024 Zihua Liu, Hiroki Sakuma, Masatoshi Okutomi

In the auto-labeling stage, we represent the surface of each instance as a signed distance field (SDF) and render its silhouette as an instance mask through our proposed instance-aware volumetric silhouette rendering.

Monocular 3D Object Detection Monocular Depth Estimation +4

CFDNet: A Generalizable Foggy Stereo Matching Network with Contrastive Feature Distillation

no code implementations 28 Feb 2024 Zihua Liu, Yizhou Li, Masatoshi Okutomi

Stereo matching under foggy scenes remains a challenging task since the scattering effect degrades the visibility and results in less distinctive features for dense correspondence matching.

Contrastive Learning Depth Estimation +1

Digging Into Normal Incorporated Stereo Matching

1 code implementation ACM International Conference on Multimedia 2022 Zihua Liu, Songyan Zhang, Zhicheng Wang, Masatoshi Okutomi

To enhance geometric consistency, especially in low-texture regions, the estimated normal map is then leveraged to calculate a local affinity matrix, providing the residual learning with information about where the correction should refer and thus improving the residual learning efficiency.

Disparity Estimation Stereo Matching

Self-Supervised Spatially Variant PSF Estimation for Aberration-Aware Depth-from-Defocus

no code implementations 28 Feb 2024 Zhuofeng Wu, Yusuke Monno, Masatoshi Okutomi

In this paper, we address the task of aberration-aware depth-from-defocus (DfD), which takes account of spatially variant point spread functions (PSFs) of a real camera.

Depth Estimation Self-Supervised Learning

Reflection Removal Using Recurrent Polarization-to-Polarization Network

1 code implementation 28 Feb 2024 Wenjiao Bian, Yusuke Monno, Masatoshi Okutomi

This paper addresses reflection removal, which is the task of separating reflection components from a captured image and deriving the image with only transmission components.

Reflection Removal

Global Occlusion-Aware Transformer for Robust Stereo Matching

1 code implementation 22 Dec 2023 Zihua Liu, Yizhou Li, Masatoshi Okutomi

Despite the remarkable progress facilitated by learning-based stereo-matching algorithms, the performance in the ill-conditioned regions, such as the occluded regions, remains a bottleneck.

Disparity Estimation Occlusion Estimation +1

Polarimetric PatchMatch Multi-View Stereo

no code implementations 11 Nov 2023 Jinyu Zhao, Jumpei Oishi, Yusuke Monno, Masatoshi Okutomi

PatchMatch Multi-View Stereo (PatchMatch MVS) is one of the popular MVS approaches, owing to its balanced accuracy and efficiency.

Stereo Matching

EMR-MSF: Self-Supervised Recurrent Monocular Scene Flow Exploiting Ego-Motion Rigidity

no code implementations ICCV 2023 Zijie Jiang, Masatoshi Okutomi

In this paper, we propose a superior model named EMR-MSF by borrowing the advantages of network architecture design under the scope of supervised learning.

Motion Estimation Scene Flow Estimation +1

Polarimetric Multi-View Inverse Rendering

no code implementations 24 Dec 2022 Jinyu Zhao, Yusuke Monno, Masatoshi Okutomi

We then refine the initial model by optimizing photometric rendering errors and polarimetric errors using multi-view RGB, AoP, and DoP images, where we propose a novel polarimetric cost function that enables an effective constraint on the estimated surface normal of each vertex, while considering four possible ambiguous azimuth angles revealed from the AoP measurement.

3D Reconstruction Inverse Rendering

Dual-Pixel Raindrop Removal

no code implementations 24 Oct 2022 Yizhou Li, Yusuke Monno, Masatoshi Okutomi

In this paper, we propose the first method using a Dual-Pixel (DP) sensor to better address raindrop removal.

Raindrop Removal Rain Removal

Two-Step Color-Polarization Demosaicking Network

no code implementations 13 Sep 2022 Vy Nguyen, Masayuki Tanaka, Yusuke Monno, Masatoshi Okutomi

Polarization information of light in a scene is valuable for various image processing and computer vision tasks.

Demosaicking Vocal Bursts Valence Prediction

Deep Hyperspectral-Depth Reconstruction Using Single Color-Dot Projection

no code implementations CVPR 2022 Chunyu Li, Yusuke Monno, Masatoshi Okutomi

To jointly reconstruct the depth and the hyperspectral reflectance from a single color-dot image, we propose a novel end-to-end network architecture that effectively incorporates a geometric color-dot pattern loss and a photometric hyperspectral reflectance loss.

Self-Supervised Ego-Motion Estimation Based on Multi-Layer Fusion of RGB and Inferred Depth

1 code implementation 3 Mar 2022 Zijie Jiang, Hajime Taira, Naoyuki Miyashita, Masatoshi Okutomi

In this paper, we investigate the effect of different fusion strategies for ego-motion estimation and propose a new framework for self-supervised learning of depth and ego-motion estimation, which performs ego-motion estimation by leveraging RGB and inferred depth information in a Multi-Layer Fusion manner.

Motion Estimation Self-Supervised Learning

Single Image Deraining Network with Rain Embedding Consistency and Layered LSTM

1 code implementation 5 Nov 2021 Yizhou Li, Yusuke Monno, Masatoshi Okutomi

For this purpose, an encoder-decoder network draws wide attention, where the encoder is required to encode a high-quality rain embedding which determines the performance of the subsequent decoding stage to reconstruct the rain layer.

Decoder Single Image Deraining

Learning-Based Depth and Pose Estimation for Monocular Endoscope with Loss Generalization

no code implementations 28 Jul 2021 Aji Resindra Widya, Yusuke Monno, Masatoshi Okutomi, Sho Suzuki, Takuji Gotoda, Kenji Miki

In addition, we propose a novel generalized photometric loss function to avoid the complicated process of finding proper weights for balancing the depth and the pose loss terms, which is required for existing direct depth and pose supervision approaches.

3D Reconstruction Pose Estimation

Multi-Modal Pedestrian Detection with Large Misalignment Based on Modal-Wise Regression and Multi-Modal IoU

no code implementations 23 Jul 2021 Napat Wanchaitanawong, Masayuki Tanaka, Takashi Shibata, Masatoshi Okutomi

Due to this assumption's breakdown, the position of the bounding boxes does not match between the two modalities, resulting in a significant decrease in detection accuracy, especially in regions where the amount of misalignment is large.

Pedestrian Detection regression

Geometric Data Augmentation Based on Feature Map Ensemble

no code implementations 22 Jul 2021 Takashi Shibata, Masayuki Tanaka, Masatoshi Okutomi

The performance of CNN is dramatically degraded by geometric transformation, such as large rotations.

Data Augmentation

Video-Based Camera Localization Using Anchor View Detection and Recursive 3D Reconstruction

no code implementations 7 Jul 2021 Hajime Taira, Koki Onbe, Naoyuki Miyashita, Masatoshi Okutomi

In this paper we introduce a new camera localization strategy designed for image sequences captured in challenging industrial situations such as industrial parts inspection.

3D Reconstruction Camera Localization

Spectral MVIR: Joint Reconstruction of 3D Shape and Spectral Reflectance

no code implementations 15 Apr 2021 Chunyu Li, Yusuke Monno, Masatoshi Okutomi

Reconstructing an object's high-quality 3D shape with inherent spectral reflectance property, beyond typical device-dependent RGB albedos, opens the door to applications requiring a high-fidelity 3D model in terms of both geometry and photometry.

Inverse Rendering

VIO-Aided Structure from Motion Under Challenging Environments

no code implementations 24 Jan 2021 Zijie Jiang, Hajime Taira, Naoyuki Miyashita, Masatoshi Okutomi

In this paper, we present a robust and efficient Structure from Motion pipeline for accurate 3D reconstruction under challenging environments by leveraging the camera pose information from a visual-inertial odometry.

3D Reconstruction Image Registration

Spectral Reflectance Estimation Using Projector with Unknown Spectral Power Distribution

no code implementations 18 Dec 2020 Hironori Hidaka, Yusuke Monno, Masatoshi Okutomi

A lighting-based multispectral imaging system using an RGB camera and a projector is one of the most practical and low-cost systems to acquire multispectral observations for estimating the scene's spectral reflectance information.

Deep Snapshot HDR Imaging Using Multi-Exposure Color Filter Array

no code implementations 20 Nov 2020 Takeru Suda, Masayuki Tanaka, Yusuke Monno, Masatoshi Okutomi

In this paper, we propose a deep snapshot high dynamic range (HDR) imaging framework that can effectively reconstruct an HDR image from the RAW data captured using a multi-exposure color filter array (ME-CFA), which consists of a mosaic pattern of RGB filters with different exposure levels.

Image Reconstruction

Adaptive Future Frame Prediction with Ensemble Network

no code implementations 13 Nov 2020 Wonjik Kim, Masayuki Tanaka, Masatoshi Okutomi, Yoko SASAKI

A common limitation of the existing learning-based approaches is a mismatch of training data and test data.

Human Segmentation with Dynamic LiDAR Data

1 code implementation 16 Oct 2020 Tao Zhong, Wonjik Kim, Masayuki Tanaka, Masatoshi Okutomi

It has a two-branch structure, i.e., the spatial segmentation branch and the temporal velocity estimation branch.

Segmentation

Monochrome and Color Polarization Demosaicking Using Edge-Aware Residual Interpolation

no code implementations 28 Jul 2020 Miki Morimatsu, Yusuke Monno, Masayuki Tanaka, Masatoshi Okutomi

Since the polarimeter consists of an image sensor equipped with a monochrome or color polarization filter array (MPFA or CPFA), the demosaicking process to interpolate missing pixel values plays a crucial role in obtaining high-quality polarization images.

Demosaicking

Polarimetric Multi-View Inverse Rendering

no code implementations ECCV 2020 Jinyu Zhao, Yusuke Monno, Masatoshi Okutomi

A polarization camera has great potential for 3D reconstruction since the angle of polarization (AoP) of reflected light is related to an object's surface normal.

3D Reconstruction Inverse Rendering

Classifying degraded images over various levels of degradation

no code implementations 15 Jun 2020 Kazuki Endo, Masayuki Tanaka, Masatoshi Okutomi

Classification for degraded images having various levels of degradation is very important in practical applications.

Classification Ensemble Learning +1

Stomach 3D Reconstruction Based on Virtual Chromoendoscopic Image Generation

no code implementations 26 Apr 2020 Aji Resindra Widya, Yusuke Monno, Masatoshi Okutomi, Sho Suzuki, Takuji Gotoda, Kenji Miki

Gastric endoscopy is a standard clinical process that enables medical practitioners to diagnose various lesions inside a patient's stomach.

3D Reconstruction Image Generation +1

Learning-Based Human Segmentation and Velocity Estimation Using Automatic Labeled LiDAR Sequence for Training

no code implementations 11 Mar 2020 Wonjik Kim, Masayuki Tanaka, Masatoshi Okutomi, Yoko SASAKI

In this paper, we propose an automatic labeled sequential data generation pipeline for human segmentation and velocity estimation with point clouds.

Segmentation

Pro-Cam SSfM: Projector-Camera System for Structure and Spectral Reflectance from Motion

no code implementations ICCV 2019 Chunyu Li, Yusuke Monno, Hironori Hidaka, Masatoshi Okutomi

In this paper, we propose a novel projector-camera system for practical and low-cost acquisition of a dense object 3D model with the spectral reflectance property.

3D Reconstruction

3D Reconstruction of Whole Stomach from Endoscope Video Using Structure-from-Motion

no code implementations 30 May 2019 Aji Resindra Widya, Yusuke Monno, Kosuke Imahori, Masatoshi Okutomi, Sho Suzuki, Takuji Gotoda, Kenji Miki

In this work, we investigated how to enable structure-from-motion (SfM) to reconstruct the whole shape of a stomach from a standard endoscope video.

3D Reconstruction

Improving Transparency of Deep Neural Inference Process

no code implementations 13 Mar 2019 Hiroshi Kuwajima, Masayuki Tanaka, Masatoshi Okutomi

However, the inference process of deep learning is black-box, and not very suitable to safety-critical systems which must exhibit high transparency.

Gradient-Based Low-Light Image Enhancement

no code implementations 25 Sep 2018 Masayuki Tanaka, Takashi Shibata, Masatoshi Okutomi

Low-light image enhancement is a highly demanded image processing technique, especially for consumer digital cameras and cameras on mobile phones.

Low-Light Image Enhancement

Non-blind Image Restoration Based on Convolutional Neural Network

no code implementations 11 Sep 2018 Kazutaka Uchida, Masayuki Tanaka, Masatoshi Okutomi

Blind image restoration processors based on convolutional neural network (CNN) are intensively researched because of their high performance.

Image Restoration

Joint optimization for compressive video sensing and reconstruction under hardware constraints

no code implementations ECCV 2018 Michitaka Yoshida, Akihiko Torii, Masatoshi Okutomi, Kenta Endo, Yukinobu Sugiyama, Rin-ichiro Taniguchi, Hajime Nagahara

Compressive video sensing is the process of encoding multiple sub-frames into a single frame with controlled sensor exposures and reconstructing the sub-frames from the single compressed frame.

Compressive Sensing

Structure-from-Motion using Dense CNN Features with Keypoint Relocalization

no code implementations 10 May 2018 Aji Resindra Widya, Akihiko Torii, Masatoshi Okutomi

Then, we demonstrate on the Aachen Day-Night dataset that the proposed SfM using dense CNN features with the keypoint relocalization outperforms a state-of-the-art SfM (COLMAP using RootSIFT) by a large margin.

Misalignment-Robust Joint Filter for Cross-Modal Image Pairs

no code implementations ICCV 2017 Takashi Shibata, Masayuki Tanaka, Masatoshi Okutomi

Next, the joint-filter cost volume and a set of filtered images are computed from the target image and the set of the translated guidances.

Gradient-Domain Image Reconstruction Framework With Intensity-Range and Base-Structure Constraints

no code implementations CVPR 2016 Takashi Shibata, Masayuki Tanaka, Masatoshi Okutomi

To generate detail-preserving and artifact-free output images, we combine the benefits of the two approaches into the proposed framework by introducing the intensity-range constraint and the base-structure constraint.

Image Reconstruction Image Restoration +1

24/7 Place Recognition by View Synthesis

no code implementations CVPR 2015 Akihiko Torii, Relja Arandjelovic, Josef Sivic, Masatoshi Okutomi, Tomas Pajdla

We address the problem of large-scale visual place recognition for situations where the scene undergoes a major change in appearance, for example, due to illumination (day/night), change of seasons, aging, or structural modifications over time such as buildings built or destroyed.

Visual Place Recognition

A General and Simple Method for Camera Pose and Focal Length Determination

no code implementations CVPR 2014 Yinqiang Zheng, Shigeki Sugimoto, Imari Sato, Masatoshi Okutomi

In this paper, we revisit the pose determination problem of a partially calibrated camera with unknown focal length, hereafter referred to as the PnPf problem, by using n (n ≥ 4) 3D-to-2D point correspondences.

Triplet

Visual Place Recognition with Repetitive Structures

no code implementations CVPR 2013 Akihiko Torii, Josef Sivic, Tomas Pajdla, Masatoshi Okutomi

Even more importantly, they violate the feature independence assumed in the bag-of-visual-words representation which often leads to over-counting evidence and significant degradation of retrieval performance.

Retrieval Visual Place Recognition
