Search Results for author: Kiyoharu Aizawa

Found 56 papers, 19 papers with code

Can Pre-trained Networks Detect Familiar Out-of-Distribution Data?

1 code implementation · 2 Oct 2023 · Atsuyuki Miyai, Qing Yu, Go Irie, Kiyoharu Aizawa

We consider that such data may significantly affect the performance of large pre-trained networks because the discriminability of these OOD data depends on the pre-training algorithm.

Out-of-Distribution (OOD) Detection

Open-Set Domain Adaptation with Visual-Language Foundation Models

no code implementations · 30 Jul 2023 · Qing Yu, Go Irie, Kiyoharu Aizawa

Unsupervised domain adaptation (UDA) has proven to be very effective in transferring knowledge obtained from a source domain with labeled data to a target domain with unlabeled data.

Unsupervised Domain Adaptation

Manga109Dialog: A Large-scale Dialogue Dataset for Comics Speaker Detection

1 code implementation · 30 Jun 2023 · Yingxuan Li, Kiyoharu Aizawa, Yusuke Matsui

For further understanding of comics, an automated approach is needed to link text in comics to characters speaking the words.

Graph Generation, Scene Graph Generation

Guided Image Synthesis via Initial Image Editing in Diffusion Model

no code implementations · 5 May 2023 · Jiafeng Mao, Xueting Wang, Kiyoharu Aizawa

Diffusion models have the ability to generate high quality images by denoising pure Gaussian noise images.

Denoising, Image Manipulation, +1

Zero-Shot In-Distribution Detection in Multi-Object Settings Using Vision-Language Foundation Models

2 code implementations · 10 Apr 2023 · Atsuyuki Miyai, Qing Yu, Go Irie, Kiyoharu Aizawa

First, images should be collected using only the name of the ID class without training on the ID data.

Comprehensive Comparisons of Uniform Quantization in Deep Image Compression

1 code implementation · 1 Mar 2023 · Koki Tsubota, Kiyoharu Aizawa

The experimental results reveal that the best approximated quantization differs across network architectures, and the best approximations of the three are different from the original ones used for the architectures.

Image Compression, Quantization

Non-uniform Sampling Strategies for NeRF on 360° images

no code implementations · 7 Dec 2022 · Takashi Otonari, Satoshi Ikehata, Kiyoharu Aizawa

We propose two non-uniform ray sampling schemes for NeRF to suit 360° images: distortion-aware ray sampling and content-aware ray sampling.

Novel View Synthesis

A Structure-Guided Diffusion Model for Large-Hole Image Completion

1 code implementation · 18 Nov 2022 · Daichi Horita, Jiaolong Yang, Dong Chen, Yuki Koyama, Kiyoharu Aizawa, Nicu Sebe

The structure generator generates an edge image representing plausible structures within the holes, which is then used for guiding the texture generation process.

Denoising, Texture Synthesis

Universal Deep Image Compression via Content-Adaptive Optimization with Adapters

1 code implementation · 2 Nov 2022 · Koki Tsubota, Hiroaki Akutsu, Kiyoharu Aizawa

This task aims to compress images belonging to arbitrary domains, such as natural images, line drawings, and comics.

Image Compression

Rethinking Rotation in Self-Supervised Contrastive Learning: Adaptive Positive or Negative Data Augmentation

1 code implementation · 23 Oct 2022 · Atsuyuki Miyai, Qing Yu, Daiki Ikami, Go Irie, Kiyoharu Aizawa

The semantics of an image can be rotation-invariant or rotation-variant, so whether the rotated image is treated as positive or negative should be determined based on the content of the image.

Contrastive Learning, Data Augmentation

Evaluating the Stability of Deep Image Quality Assessment With Respect to Image Scaling

no code implementations · 20 Jul 2022 · Koki Tsubota, Hiroaki Akutsu, Kiyoharu Aizawa

We comprehensively evaluate four deep IQAs on the same five datasets, and the experimental results show that image scale significantly influences IQA performance.

Image Quality Assessment, SSIM

COO: Comic Onomatopoeia Dataset for Recognizing Arbitrary or Truncated Texts

1 code implementation · 11 Jul 2022 · Jeonghun Baek, Yusuke Matsui, Kiyoharu Aizawa

To encourage research on this topic, we provide a novel comic onomatopoeia dataset (COO), which consists of onomatopoeia texts in Japanese comics.

Link Prediction, Text Detection

SVG Vector Font Generation for Chinese Characters with Transformer

no code implementations · 21 Jun 2022 · Haruka Aoki, Kiyoharu Aizawa

Designing fonts for Chinese characters is highly labor-intensive and time-consuming.

Font Generation

Intersection Prediction from Single 360° Image via Deep Detection of Possible Direction of Travel

no code implementations · 10 Apr 2022 · Naoki Sugimoto, Satoshi Ikehata, Kiyoharu Aizawa

We constructed a large-scale 360° Image Intersection Identification (iii360) dataset for training and evaluation, where 360° videos were collected from various areas such as a school campus, downtown, a suburb, and Chinatown, and demonstrate that our PDoT-based method achieves 88% accuracy, which is significantly better than that achieved by the direct naive binary classification-based method.

Binary Classification

Distortion-Aware Self-Supervised 360° Depth Estimation from A Single Equirectangular Projection Image

no code implementations · 3 Apr 2022 · Yuya Hasegawa, Satoshi Ikehata, Kiyoharu Aizawa

We propose a framework of direct use of ERP with coordinate conversion of correspondences and distortion-aware upsampling module to deal with the ERP related problems and extend a self-supervised learning method for open environments.

Depth Estimation, Depth Prediction, +1

Noisy Annotation Refinement for Object Detection

no code implementations · 20 Oct 2021 · Jiafeng Mao, Qing Yu, Yoko Yamakata, Kiyoharu Aizawa

In this study, we propose a new problem setting of training object detectors on datasets with entangled noises of annotations of class labels and bounding boxes.

Object Detection

A Novel Perspective for Positive-Unlabeled Learning via Noisy Labels

no code implementations · 8 Mar 2021 · Daiki Tanaka, Daiki Ikami, Kiyoharu Aizawa

Positive-unlabeled learning refers to the process of training a binary classifier using only positive and unlabeled data.

What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels

1 code implementation · CVPR 2021 · Jeonghun Baek, Yusuke Matsui, Kiyoharu Aizawa

To the best of our knowledge, this is the first study that 1) shows sufficient performance by only using real labels and 2) introduces semi- and self-supervised methods into STR with fewer labels.

Data Augmentation, Scene Text Recognition

Building Movie Map -- A Tool for Exploring Areas in a City -- and its Evaluation

no code implementations · 17 Nov 2020 · Naoki Sugimoto, Yoshihito Ebine, Kiyoharu Aizawa

Frames of the video are localized on the map, intersections are detected, and videos are segmented.


Few-Shot Font Generation with Deep Metric Learning

no code implementations · 4 Nov 2020 · Haruka Aoki, Koki Tsubota, Hikaru Ikuta, Kiyoharu Aizawa

Designing fonts for languages with a large number of characters, such as Japanese and Chinese, is an extremely labor-intensive and time-consuming task.

Font Generation, Metric Learning

SLGAN: Style- and Latent-guided Generative Adversarial Network for Desirable Makeup Transfer and Removal

no code implementations · 16 Sep 2020 · Daichi Horita, Kiyoharu Aizawa

Furthermore, we show that our proposal can interpolate facial makeup images to determine the unique features, compare existing methods, and help users find desirable makeup configurations.

Multi-Task Curriculum Framework for Open-Set Semi-Supervised Learning

no code implementations · ECCV 2020 · Qing Yu, Daiki Ikami, Go Irie, Kiyoharu Aizawa

Semi-supervised learning (SSL) has been proposed to leverage unlabeled data for training powerful models when only limited labeled data is available.

Channel-Level Variable Quantization Network for Deep Image Compression

1 code implementation · 15 Jul 2020 · Zhisheng Zhong, Hiroaki Akutsu, Kiyoharu Aizawa

In this paper, we propose a channel-level variable quantization network to dynamically allocate more bitrates for significant channels and withdraw bitrates for negligible channels.

Image Compression, Quantization

Building a Manga Dataset "Manga109" with Annotations for Multimedia Applications

3 code implementations · 9 May 2020 · Kiyoharu Aizawa, Azuma Fujimoto, Atsushi Otsubo, Toru Ogawa, Yusuke Matsui, Koki Tsubota, Hikaru Ikuta

Manga, or comics, which are a type of multimodal artwork, have been left behind in the recent trend of deep learning applications because of the lack of a proper dataset.


Unsupervised Out-of-Distribution Detection by Maximum Classifier Discrepancy

1 code implementation · ICCV 2019 · Qing Yu, Kiyoharu Aizawa

Unlike previous methods, we also utilize unlabeled data for unsupervised training and we use these unlabeled data to maximize the discrepancy between the decision boundaries of two classifiers to push OOD samples outside the manifold of the in-distribution (ID) samples, which enables us to detect OOD samples that are far from the support of the ID samples.

Out-of-Distribution (OOD) Detection

TriDepth: Triangular Patch-based Deep Depth Prediction

no code implementations · 3 May 2019 · Masaya Kaneko, Ken Sakurada, Kiyoharu Aizawa

We propose a novel and efficient representation for single-view depth estimation using Convolutional Neural Networks (CNNs).

3D Scene Reconstruction, Depth Estimation, +1

Computational Attention System for Children, Adults and Elderly

no code implementations · 18 Apr 2019 · Onkar Krishna, Kiyoharu Aizawa, Go Irie

Observers of different age groups have shown different scene-viewing tendencies independent of the class of the image viewed.

Recognition of Multiple Food Items in a Single Photo for Use in a Buffet-Style Restaurant

no code implementations · 3 Mar 2019 · Masashi Anzawa, Sosuke Amano, Yoko Yamakata, Keiko Motonaga, Akiko Kamei, Kiyoharu Aizawa

We investigate image recognition of multiple food items in a single photo, focusing on a buffet restaurant application, where the menu changes at every meal and only a few images per class are available.

Context-Patch Face Hallucination Based on Thresholding Locality-constrained Representation and Reproducing Learning

2 code implementations · 3 Sep 2018 · Junjun Jiang, Yi Yu, Suhua Tang, Jiayi Ma, Akiko Aizawa, Kiyoharu Aizawa

To this end, this study incorporates the contextual information of image patch and proposes a powerful and efficient context-patch based face hallucination approach, namely Thresholding Locality-constrained Representation and Reproducing learning (TLcR-RL).

Face Hallucination

Scale Drift Correction of Camera Geo-Localization using Geo-Tagged Images

no code implementations · 26 Aug 2018 · Kazuya Iwami, Satoshi Ikehata, Kiyoharu Aizawa

Camera geo-localization from a monocular video is a fundamental task for video analysis and autonomous navigation.

3D Reconstruction, Autonomous Navigation, +1

Fast and Robust Estimation for Unit-Norm Constrained Linear Fitting Problems

no code implementations · CVPR 2018 · Daiki Ikami, Toshihiko Yamasaki, Kiyoharu Aizawa

M-estimator using iteratively reweighted least squares (IRLS) is one of the best-known methods for robust estimation.

Local and Global Optimization Techniques in Graph-Based Clustering

no code implementations · CVPR 2018 · Daiki Ikami, Toshihiko Yamasaki, Kiyoharu Aizawa

We propose a local optimization method, which is widely applicable to graph-based clustering cost functions.


Category-Based Deep CCA for Fine-Grained Venue Discovery from Multimodal Data

no code implementations · 8 May 2018 · Yi Yu, Suhua Tang, Kiyoharu Aizawa, Akiko Aizawa

Given a photo as input, this model performs (i) exact venue search (find the venue where the photo was taken), and (ii) group venue search (find relevant venues with the same category as that of the photo), by the cross-modal correlation between the input photo and textual description of venues.

Cross-Modal Retrieval, Retrieval

Personalized Classifier for Food Image Recognition

no code implementations · 8 Apr 2018 · Shota Horiguchi, Sosuke Amano, Makoto Ogawa, Kiyoharu Aizawa

In this paper, we address the personalization problem, which involves adapting to the user's domain incrementally using a very limited number of samples.

Parallel Grid Pooling for Data Augmentation

1 code implementation · 30 Mar 2018 · Akito Takeki, Daiki Ikami, Go Irie, Kiyoharu Aizawa

Convolutional neural network (CNN) architectures utilize downsampling layers, which restrict the subsequent layers to learn spatially invariant features while reducing computational costs.

General Classification, Image Augmentation, +1

Object Detection for Comics using Manga109 Annotations

5 code implementations · 23 Mar 2018 · Toru Ogawa, Atsushi Otsubo, Rei Narita, Yusuke Matsui, Toshihiko Yamasaki, Kiyoharu Aizawa

We annotated an existing image dataset of comics and created the largest annotation dataset, named Manga109-annotations.

Object Detection

Significance of Softmax-based Features in Comparison to Distance Metric Learning-based Features

no code implementations · 29 Dec 2017 · Shota Horiguchi, Daiki Ikami, Kiyoharu Aizawa

However, in these DML studies, there were no equitable comparisons between features extracted from a DML-based network and those from a softmax-based network.

Metric Learning

Spatio-Temporal Vector of Locally Max Pooled Features for Action Recognition in Videos

no code implementations · CVPR 2017 · Ionut Cosmin Duta, Bogdan Ionescu, Kiyoharu Aizawa, Nicu Sebe

The proposed method addresses an important problem of video understanding: how to build a video representation that incorporates the CNN features over the entire video.

Action Recognition In Videos, Temporal Action Localization, +1

cGAN-based Manga Colorization Using a Single Training Image

1 code implementation · 21 Jun 2017 · Paulina Hensman, Kiyoharu Aizawa

The final results are sharp, clear, and in high resolution, and stay true to the character's original color scheme.


Gaze Distribution Analysis and Saliency Prediction Across Age Groups

no code implementations · 20 May 2017 · Onkar Krishna, Kiyoharu Aizawa, Andrea Helo, Rama Pia

In this paper, we investigated how visual scene processing changes with age and we propose an age-adapted framework that helps to develop a computational model that can predict saliency across different age groups.

Saliency Prediction

PQTable: Non-exhaustive Fast Search for Product-quantized Codes using Hash Tables

no code implementations · 21 Apr 2017 · Yusuke Matsui, Toshihiko Yamasaki, Kiyoharu Aizawa

In this paper, we propose a product quantization table (PQTable), a fast search method for product-quantized codes via hash tables.


Uncalibrated Photometric Stereo by Stepwise Optimization Using Principal Components of Isotropic BRDFs

no code implementations · CVPR 2016 · Keisuke Midorikawa, Toshihiko Yamasaki, Kiyoharu Aizawa

We propose a model that represents various isotropic reflectance functions by using the principal components of items in a dataset, and formulate the uncalibrated photometric stereo as a regression problem.

PQTable: Fast Exact Asymmetric Distance Neighbor Search for Product Quantization Using Hash Tables

no code implementations · ICCV 2015 · Yusuke Matsui, Toshihiko Yamasaki, Kiyoharu Aizawa

We propose the product quantization table (PQTable), a product quantization-based hash table that is fast and requires neither parameter tuning nor training steps.


Sketch-based Manga Retrieval using Manga109 Dataset

no code implementations · 15 Oct 2015 · Yusuke Matsui, Kota Ito, Yuji Aramaki, Toshihiko Yamasaki, Kiyoharu Aizawa

From the experiments, we verified that: (1) the retrieval accuracy of the proposed method is higher than those of previous methods; (2) the proposed method can localize an object instance with reasonable runtime and accuracy; and (3) sketch querying is useful for manga search.

Quantization, Retrieval, +1

Photometric Stereo using Constrained Bivariate Regression for General Isotropic Surfaces

no code implementations · CVPR 2014 · Satoshi Ikehata, Kiyoharu Aizawa

This paper presents a photometric stereo method that is purely pixelwise and handles general isotropic surfaces in a stable manner.

