Search Results for author: Tatsuya Harada

Found 137 papers, 47 papers with code

Bounding-box Channels for Visual Relationship Detection

no code implementations ECCV 2020 Sho Inayoshi, Keita Otani, Antonio Tejero-de-Pablos, Tatsuya Harada

In this paper, we propose the bounding-box channels, a novel architecture capable of relating the semantic, spatial, and image features strongly.

Relationship Detection Visual Relationship Detection

Plug-and-Play Controller for Story Completion: A Pilot Study toward Emotion-aware Story Writing Assistance

no code implementations In2Writing (ACL) 2022 Yusuke Mori, Hiroaki Yamane, Ryohei Shimizu, Tatsuya Harada

Emotions are essential for storytelling and narrative generation, and as such, the relationship between stories and emotions has been extensively studied.

Story Completion

Finding and Generating a Missing Part for Story Completion

1 code implementation COLING (LaTeCHCLfL, CLFL, LaTeCH) 2020 Yusuke Mori, Hiroaki Yamane, Yusuke Mukuta, Tatsuya Harada

We first conduct an experiment focusing on MPP, and our analysis shows that highly accurate predictions can be obtained when the missing part of a story is the beginning or the end.

Position Story Completion

Find n' Propagate: Open-Vocabulary 3D Object Detection in Urban Environments

no code implementations 20 Mar 2024 Djamahl Etchegaray, Zi Huang, Tatsuya Harada, Yadan Luo

In this work, we tackle the limitations of current LiDAR-based 3D object detection systems, which are hindered by a restricted class vocabulary and the high costs associated with annotating new object classes.

3D Object Detection object-detection

HyperVQ: MLR-based Vector Quantization in Hyperbolic Space

no code implementations 18 Mar 2024 Nabarun Goswami, Yusuke Mukuta, Tatsuya Harada

However, since the VQVAE is trained with a reconstruction objective, there is no constraint for the embeddings to be well disentangled, a crucial aspect for using them in discriminative tasks.

Quantization Representation Learning

Symmetric Q-learning: Reducing Skewness of Bellman Error in Online Reinforcement Learning

no code implementations 12 Mar 2024 Motoki Omura, Takayuki Osa, Yusuke Mukuta, Tatsuya Harada

In deep reinforcement learning, estimating the value function to evaluate the quality of states and actions is essential.

Continuous Control Q-Learning +1

Robustifying a Policy in Multi-Agent RL with Diverse Cooperative Behaviors and Adversarial Style Sampling for Assistive Tasks

no code implementations 1 Mar 2024 Takayuki Osa, Tatsuya Harada

We demonstrate that policies trained with a popular deep RL method are vulnerable to changes in policies of other agents and that the proposed framework improves the robustness against such changes.

Reinforcement Learning (RL)

Advancing Large Multi-modal Models with Explicit Chain-of-Reasoning and Visual Question Generation

no code implementations 18 Jan 2024 Kohei Uehara, Nabarun Goswami, Hanqin Wang, Toshiaki Baba, Kohtaro Tanaka, Tomohiro Hashimoto, Kai Wang, Rei Ito, Takagi Naoya, Ryo Umagami, Yingyi Wen, Tanachai Anakewat, Tatsuya Harada

The increasing demand for intelligent systems capable of interpreting and reasoning about visual content requires the development of Large Multi-Modal Models (LMMs) that are not only accurate but also have explicit reasoning capabilities.

Language Modelling Large Language Model +2

GPAvatar: Generalizable and Precise Head Avatar from Image(s)

1 code implementation 18 Jan 2024 Xuangeng Chu, Yu Li, Ailing Zeng, Tianyu Yang, Lijian Lin, Yunfei Liu, Tatsuya Harada

Head avatar reconstruction, crucial for applications in virtual reality, online meetings, gaming, and film industries, has garnered substantial attention within the computer vision community.

Neural Rendering Novel View Synthesis

Aleth-NeRF: Illumination Adaptive NeRF with Concealing Field Assumption

1 code implementation 14 Dec 2023 Ziteng Cui, Lin Gu, Xiao Sun, Xianzheng Ma, Yu Qiao, Tatsuya Harada

The standard Neural Radiance Fields (NeRF) paradigm employs a viewer-centered methodology, entangling the aspects of illumination and material reflectance into emission solely from 3D points.

Fully Spiking Denoising Diffusion Implicit Models

1 code implementation 4 Dec 2023 Ryo Watanabe, Yusuke Mukuta, Tatsuya Harada

Spiking neural networks (SNNs) have garnered considerable attention owing to their ability to run on neuromorphic devices with super-high speeds and remarkable energy efficiencies.

Denoising Image Generation

Gradual Source Domain Expansion for Unsupervised Domain Adaptation

1 code implementation 16 Nov 2023 Thomas Westfechtel, Hao-Wei Yeh, Dexuan Zhang, Tatsuya Harada

Unsupervised domain adaptation (UDA) tries to overcome the need for a large labeled dataset by transferring knowledge from a source dataset with many labeled samples to a target dataset that has no labeled data.

Pseudo Label Unsupervised Domain Adaptation

Expert Uncertainty and Severity Aware Chest X-Ray Classification by Multi-Relationship Graph Learning

no code implementations 6 Sep 2023 Mengliang Zhang, Xinyue Hu, Lin Gu, Liangchen Liu, Kazuma Kobayashi, Tatsuya Harada, Ronald M. Summers, Yingying Zhu

In this paper, we re-extract disease labels from CXR reports to make them more realistic by considering disease severity and uncertainty in classification.

Graph Learning

Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and Uncurated Unlabeled Data

no code implementations 17 Jul 2023 Kai Katsumata, Duc Minh Vo, Tatsuya Harada, Hideki Nakayama

Label-noise or curated unlabeled data is used to compensate for the assumption of clean labeled data in training the conditional generative adversarial network; however, satisfying such an extended assumption is occasionally laborious or impractical.

Conditional Image Generation Generative Adversarial Network

Domain Adaptive Multiple Instance Learning for Instance-level Prediction of Pathological Images

1 code implementation 7 Apr 2023 Shusuke Takahama, Yusuke Kurose, Yusuke Mukuta, Hiroyuki Abe, Akihiko Yoshizawa, Tetsuo Ushiku, Masashi Fukayama, Masanobu Kitagawa, Masaru Kitsuregawa, Tatsuya Harada

We conducted experiments on the pathological image dataset we created for this study and showed that the proposed method significantly improves the classification performance compared to existing methods.

Domain Adaptation Multiple Instance Learning

Aleth-NeRF: Low-light Condition View Synthesis with Concealing Fields

1 code implementation 10 Mar 2023 Ziteng Cui, Lin Gu, Xiao Sun, Xianzheng Ma, Yu Qiao, Tatsuya Harada

Commonly captured low-light scenes are challenging for most computer vision techniques, including Neural Radiance Fields (NeRF).

Self-Supervised Learning for Group Equivariant Neural Networks

no code implementations 8 Mar 2023 Yusuke Mukuta, Tatsuya Harada

To ensure that training is consistent with the equivariance, we propose two concepts for self-supervised tasks: equivariant pretext labels and invariant contrastive loss.

Self-Supervised Learning

Sketch-based Medical Image Retrieval

1 code implementation 7 Mar 2023 Kazuma Kobayashi, Lin Gu, Ryuichiro Hataya, Takaaki Mizuno, Mototaka Miyake, Hirokazu Watanabe, Masamichi Takahashi, Yasuyuki Takamizawa, Yukihiro Yoshida, Satoshi Nakamura, Nobuji Kouno, Amina Bolatkan, Yusuke Kurose, Tatsuya Harada, Ryuji Hamamoto

As a result, our SBMIR system enabled users to overcome previous challenges, including image retrieval based on fine-grained image characteristics, image retrieval without example images, and image retrieval for isolated samples.

Medical Image Retrieval Retrieval

Backprop Induced Feature Weighting for Adversarial Domain Adaptation with Iterative Label Distribution Alignment

1 code implementation WACV 2023 Thomas Westfechtel, Hao-Wei Yeh, Qier Meng, Yusuke Mukuta, Tatsuya Harada

Firstly, it lets the domain classifier focus on features that are important for the classification, and, secondly, it couples the classification and adversarial branch more closely.

Classification Unsupervised Domain Adaptation

Name Your Colour For the Task: Artificially Discover Colour Naming via Colour Quantisation Transformer

1 code implementation ICCV 2023 Shenghan Su, Lin Gu, Yue Yang, Zenghui Zhang, Tatsuya Harada

Besides, our colour quantisation method also offers an efficient quantisation method that effectively compresses the image storage while maintaining high performance in high-level recognition tasks such as classification and detection.

Learning by Asking Questions for Knowledge-based Novel Object Recognition

no code implementations 12 Oct 2022 Kohei Uehara, Tatsuya Harada

Our pipeline consists of two components: the Object Classifier, which performs knowledge-based object recognition, and the Question Generator, which generates knowledge-aware questions to acquire novel knowledge.

Object Object Recognition +2

Grouped self-attention mechanism for a memory-efficient Transformer

no code implementations 2 Oct 2022 Bumjun Jung, Yusuke Mukuta, Tatsuya Harada

Time-series data analysis is important because numerous real-world tasks such as forecasting weather, electricity consumption, and stock market involve predicting data that vary over time.

Time Series Time Series Analysis

Memory Efficient Temporal & Visual Graph Model for Unsupervised Video Domain Adaptation

no code implementations 13 Aug 2022 Xinyue Hu, Lin Gu, Liangchen Liu, Ruijiang Li, Chang Su, Tatsuya Harada, Yingying Zhu

Existing video domain adaptation (DA) methods need to store all temporal combinations of video frames or pair the source and target videos, which is memory-intensive and cannot scale to long videos.

Domain Adaptation Graph Attention

Exploring Resolution and Degradation Clues as Self-supervised Signal for Low Quality Object Detection

1 code implementation 5 Aug 2022 Ziteng Cui, Yingying Zhu, Lin Gu, Guo-Jun Qi, Xiaoxiao Li, Renrui Zhang, Zenghui Zhang, Tatsuya Harada

Image restoration algorithms such as super resolution (SR) are indispensable pre-processing modules for object detection in low quality images.

Image Restoration Object +4

Deforming Radiance Fields with Cages

no code implementations 25 Jul 2022 Tianhan Xu, Tatsuya Harada

Recent advances in radiance fields enable photorealistic rendering of static or dynamic 3D scenes, but still do not support explicit deformation that is used for scene manipulation or animation.

SATTS: Speaker Attractor Text to Speech, Learning to Speak by Learning to Separate

no code implementations 13 Jul 2022 Nabarun Goswami, Tatsuya Harada

The mapping of text to speech (TTS) is non-deterministic: letters may be pronounced differently based on context, and phonemes can vary depending on various physiological and stylistic factors such as gender, age, accent, and emotions.

Speech Separation

You Only Need 90K Parameters to Adapt Light: A Light Weight Transformer for Image Enhancement and Exposure Correction

1 code implementation 30 May 2022 Ziteng Cui, Kunchang Li, Lin Gu, Shenghan Su, Peng Gao, Zhengkai Jiang, Yu Qiao, Tatsuya Harada

Challenging illumination conditions (low-light, under-exposure and over-exposure) in the real world not only cast an unpleasant visual appearance but also taint the computer vision tasks.

Low-Light Image Enhancement object-detection +2

Non-rigid Point Cloud Registration with Neural Deformation Pyramid

1 code implementation 25 May 2022 Yang Li, Tatsuya Harada

Non-rigid point cloud registration is a key component in many computer vision and computer graphics applications.

Point Cloud Registration

Multitask AET with Orthogonal Tangent Regularity for Dark Object Detection

2 code implementations ICCV 2021 Ziteng Cui, Guo-Jun Qi, Lin Gu, ShaoDi You, Zenghui Zhang, Tatsuya Harada

To enhance object detection in a dark environment, we propose a novel multitask auto encoding transformation (MAET) model which is able to explore the intrinsic pattern behind illumination translation.

Object object-detection +1

Unsupervised Learning of Efficient Geometry-Aware Neural Articulated Representations

no code implementations 19 Apr 2022 Atsuhiro Noguchi, Xiao Sun, Stephen Lin, Tatsuya Harada

We propose an unsupervised method for 3D geometry-aware representation learning of articulated objects, in which no image-pose pairs or foreground masks are used for training.

Representation Learning

Learning from Label Proportions with Instance-wise Consistency

1 code implementation 24 Mar 2022 Ryoma Kobayashi, Yusuke Mukuta, Tatsuya Harada

Learning from Label Proportions (LLP) is a weakly supervised learning method that aims to perform instance classification from training data consisting of pairs of bags containing multiple instances and the class label proportions within the bags.

Learning Theory Stochastic Optimization +1
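
As a rough illustration of the LLP setting described in this entry, the sketch below computes a generic bag-level proportion loss: the per-instance class probabilities in a bag are averaged and compared with the given label proportions by cross-entropy. This is an assumed, commonly used baseline formulation, not necessarily the loss used in the paper, and the name `proportion_loss` is hypothetical.

```python
import math


def proportion_loss(bag_probs, target_props, eps=1e-12):
    """Cross-entropy between the given label proportions of a bag and the
    mean of the per-instance predicted class probabilities (assumed loss)."""
    n = len(bag_probs)
    k = len(target_props)
    # Average the predicted class probabilities over all instances in the bag.
    mean_probs = [sum(p[j] for p in bag_probs) / n for j in range(k)]
    # Compare the averaged prediction with the known proportions.
    return -sum(target_props[j] * math.log(mean_probs[j] + eps) for j in range(k))


# A bag of two instances whose known label proportions are 50% / 50%.
matching = proportion_loss([[0.9, 0.1], [0.1, 0.9]], [0.5, 0.5])
mismatched = proportion_loss([[0.9, 0.1], [0.9, 0.1]], [0.5, 0.5])
```

Only bag-level proportions supervise the instance classifier here; predictions whose average matches the proportions (`matching`) are penalized less than predictions that do not (`mismatched`).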

Revisiting Domain Generalized Stereo Matching Networks from a Feature Consistency Perspective

1 code implementation CVPR 2022 Jiawei Zhang, Xiang Wang, Xiao Bai, Chen Wang, Lei Huang, Yimin Chen, Lin Gu, Jun Zhou, Tatsuya Harada, Edwin R. Hancock

The stereo contrastive feature loss function explicitly constrains the consistency between learned features of matching pixel pairs which are observations of the same 3D points.

Contrastive Learning Stereo Matching

Enhancement of Novel View Synthesis Using Omnidirectional Image Completion

no code implementations 18 Mar 2022 Takayuki Hara, Tatsuya Harada

In this study, we present a method for synthesizing novel views from a single 360-degree RGB-D image based on the neural radiance field (NeRF).

Novel View Synthesis

ViNTER: Image Narrative Generation with Emotion-Arc-Aware Transformer

no code implementations 15 Feb 2022 Kohei Uehara, Yusuke Mori, Yusuke Mukuta, Tatsuya Harada

Image narrative generation is a task to create a story from an image with a subjective viewpoint.

Time Series Analysis

RestoreDet: Degradation Equivariant Representation for Object Detection in Low Resolution Images

no code implementations 7 Jan 2022 Ziteng Cui, Yingying Zhu, Lin Gu, Guo-Jun Qi, Xiaoxiao Li, Peng Gao, Zenghui Zhang, Tatsuya Harada

Image restoration algorithms such as super resolution (SR) are indispensable pre-processing modules for object detection in degraded images.

Image Restoration Object +4

Watch It Move: Unsupervised Discovery of 3D Joints for Re-Posing of Articulated Objects

1 code implementation CVPR 2022 Atsuhiro Noguchi, Umar Iqbal, Jonathan Tremblay, Tatsuya Harada, Orazio Gallo

Rendering articulated objects while controlling their poses is critical to applications such as virtual reality or animation for movies.

Object

Leveraging Human Selective Attention for Medical Image Analysis with Limited Training Data

no code implementations 2 Dec 2021 Yifei HUANG, Xiaoxiao Li, Lijin Yang, Lin Gu, Yingying Zhu, Hirofumi Seo, Qiuming Meng, Tatsuya Harada, Yoichi Sato

Then we design a novel Auxiliary Attention Block (AAB) to allow information from SAN to be utilized by the backbone encoder to focus on selective areas.

Tumor Segmentation

A Theoretical and Empirical Model of the Generalization Error under Time-Varying Learning Rate

no code implementations 29 Sep 2021 Toru Makuuchi, Yusuke Mukuta, Tatsuya Harada

In this study, we analyze the generalization bound for the time-varying case by applying PAC-Bayes and experimentally show that the theoretical functional form for the batch size and learning rate approximates the generalization error well for both cases.

Hyperparameter Optimization

Fully Spiking Variational Autoencoder

1 code implementation 26 Sep 2021 Hiromichi Kamata, Yusuke Mukuta, Tatsuya Harada

Spiking neural networks (SNNs) can be run on neuromorphic devices with ultra-high speed and ultra-low energy consumption because of their binary and event-driven nature.

Image Generation Time Series +1

Video Moment Retrieval with Text Query Considering Many-to-Many Correspondence Using Potentially Relevant Pair

no code implementations 25 Jun 2021 Sho Maeoki, Yusuke Mukuta, Tatsuya Harada

In this paper, we propose a novel training method that takes advantage of potentially relevant pairs, which are detected based on linguistic analysis of the text annotations.

Moment Retrieval Retrieval +1

Neural Articulated Radiance Field

1 code implementation ICCV 2021 Atsuhiro Noguchi, Xiao Sun, Stephen Lin, Tatsuya Harada

We present Neural Articulated Radiance Field (NARF), a novel deformable 3D representation for articulated objects learned from images.

Decomposing Normal and Abnormal Features of Medical Images into Discrete Latent Codes for Content-Based Image Retrieval

no code implementations 23 Mar 2021 Kazuma Kobayashi, Ryuichiro Hataya, Yusuke Kurose, Mototaka Miyake, Masamichi Takahashi, Akiko Nakagawa, Tatsuya Harada, Ryuji Hamamoto

To support comparative diagnostic reading, content-based image retrieval (CBIR), which can selectively utilize normal and abnormal features in medical images as two separable semantic components, will be useful.

Anatomy Content-Based Image Retrieval +2

Estimating and Improving Fairness with Adversarial Learning

1 code implementation 7 Mar 2021 Xiaoxiao Li, Ziteng Cui, Yifan Wu, Lin Gu, Tatsuya Harada

To tackle this issue, we propose an adversarial multi-task training strategy to simultaneously mitigate and detect bias in the deep learning-based medical image analysis system.

Fairness

Goal-Oriented Gaze Estimation for Zero-Shot Learning

1 code implementation CVPR 2021 Yang Liu, Lei Zhou, Xiao Bai, Yifei HUANG, Lin Gu, Jun Zhou, Tatsuya Harada

Therefore, we introduce a novel goal-oriented gaze estimation module (GEM) to improve the discriminative attribute localization based on the class-level attributes for ZSL.

Attribute Gaze Estimation +1

Unsupervised Domain Adaptation via Minimized Joint Error

no code implementations 1 Jan 2021 Dexuan Zhang, Tatsuya Harada

In this paper, we argue that the joint error is essential for the domain adaptation problem, in particular if the samples from different classes in source/target are closely aligned when matching the marginal distributions.

Image Classification Unsupervised Domain Adaptation

Neural Star Domain as Primitive Representation

no code implementations NeurIPS 2020 Yuki Kawana, Yusuke Mukuta, Tatsuya Harada

We show that NSD is a universal approximator of the star domain and is not only parsimonious and semantic but also an implicit and explicit shape representation.

Image Reconstruction

Information Bottleneck Constrained Latent Bidirectional Embedding for Zero-Shot Learning

no code implementations 16 Sep 2020 Yang Liu, Lei Zhou, Xiao Bai, Lin Gu, Tatsuya Harada, Jun Zhou

Though many ZSL methods rely on a direct mapping between the visual and the semantic space, the calibration deviation and hubness problem limit the generalization capability to unseen classes.

Attribute Zero-Shot Learning

Neural Granular Sound Synthesis

no code implementations 4 Aug 2020 Adrien Bitton, Philippe Esling, Tatsuya Harada

In this setting the learned grain space is invertible, meaning that we can continuously synthesize sound when traversing its dimensions.

Audio Generation

Learning Agile Locomotion via Adversarial Training

no code implementations 3 Aug 2020 Yujin Tang, Jie Tan, Tatsuya Harada

In contrast to prior works that used only one adversary, we find that training an ensemble of adversaries, each of which specializes in a different escaping strategy, is essential for the protagonist to master agility.

Reinforcement Learning (RL)

Point Cloud Based Reinforcement Learning for Sim-to-Real and Partial Observability in Visual Navigation

no code implementations 27 Jul 2020 Kenzo Lobos-Tsunekawa, Tatsuya Harada

Reinforcement Learning (RL), among other learning-based methods, is a powerful tool for solving complex robotic tasks (e.g., actuation, manipulation, navigation, etc.).

Reinforcement Learning (RL) Visual Navigation

Vector-Quantized Timbre Representation

1 code implementation 13 Jul 2020 Adrien Bitton, Philippe Esling, Tatsuya Harada

Although its definition is usually elusive, it can be seen from a signal processing viewpoint as all the spectral features that are perceived independently from pitch and loudness.

SplitFusion: Simultaneous Tracking and Mapping for Non-Rigid Scenes

no code implementations 4 Jul 2020 Yang Li, Tianwei Zhang, Yoshihiko Nakamura, Tatsuya Harada

We present SplitFusion, a novel dense RGB-D SLAM framework that simultaneously performs tracking and dense reconstruction for both rigid and non-rigid components of the scene.

Hyperbolic Neural Networks++

1 code implementation ICLR 2021 Ryohei Shimizu, Yusuke Mukuta, Tatsuya Harada

Hyperbolic spaces, which have the capacity to embed tree structures without distortion owing to their exponential volume growth, have recently been applied to machine learning to better capture the hierarchical nature of data.

BIG-bench Machine Learning regression

Learning Global and Local Features of Normal Brain Anatomy for Unsupervised Abnormality Detection

1 code implementation 26 May 2020 Kazuma Kobayashi, Ryuichiro Hataya, Yusuke Kurose, Amina Bolatkan, Mototaka Miyake, Hirokazu Watanabe, Masamichi Takahashi, Jun Itami, Tatsuya Harada, Ryuji Hamamoto

In addition, we devise a metric to evaluate the anatomical fidelity of the reconstructed images and confirm that the overall detection performance is improved when the image reconstruction network achieves a higher score.

Anatomy Anomaly Detection +1

Learning to Optimize Non-Rigid Tracking

no code implementations CVPR 2020 Yang Li, Aljaž Božič, Tianwei Zhang, Yanli Ji, Tatsuya Harada, Matthias Nießner

One of the widespread solutions for non-rigid tracking has a nested-loop structure: with Gauss-Newton to minimize a tracking objective in the outer loop, and Preconditioned Conjugate Gradient (PCG) to solve a sparse linear system in the inner loop.
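
The nested-loop structure mentioned in this entry can be sketched on a toy problem. The code below is only an illustration of the classical scheme (Gauss-Newton outer loop, preconditioned conjugate gradient inner loop), not the learned solver the paper proposes. The curve-fitting objective, the dense matrices, the Jacobi (diagonal) preconditioner, and all function names are assumptions made for the example.

```python
import math


def pcg(A, b, tol=1e-12, max_iter=50):
    """Inner loop: solve A x = b (A symmetric positive definite) by
    conjugate gradient with a Jacobi (diagonal) preconditioner."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                                  # residual b - A x for x = 0
    z = [r[i] / A[i][i] for i in range(n)]       # preconditioned residual
    p = list(z)
    rz = sum(r[i] * z[i] for i in range(n))
    for _ in range(max_iter):
        if math.sqrt(sum(v * v for v in r)) < tol:
            break
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [r[i] / A[i][i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x


def gauss_newton(residual, jacobian, params, iters=15):
    """Outer loop: linearize the residual, then solve the normal equations
    (J^T J) delta = -J^T r with PCG and update the parameters."""
    for _ in range(iters):
        r = residual(params)
        J = jacobian(params)
        m, n = len(r), len(params)
        JTJ = [[sum(J[k][i] * J[k][j] for k in range(m)) for j in range(n)]
               for i in range(n)]
        g = [-sum(J[k][i] * r[k] for k in range(m)) for i in range(n)]
        delta = pcg(JTJ, g)
        params = [params[i] + delta[i] for i in range(n)]
    return params


# Toy stand-in for a tracking objective: fit y = a * exp(b * x) to
# noise-free samples generated with a = 2.0 and b = 0.5.
XS = [0.0, 0.5, 1.0, 1.5, 2.0]
YS = [2.0 * math.exp(0.5 * x) for x in XS]


def residual(p):
    a, b = p
    return [a * math.exp(b * x) - y for x, y in zip(XS, YS)]


def jacobian(p):
    a, b = p
    return [[math.exp(b * x), a * x * math.exp(b * x)] for x in XS]


a_hat, b_hat = gauss_newton(residual, jacobian, [1.8, 0.45])
```

In a real non-rigid tracker the linear system is large and sparse, so the inner PCG solve dominates the cost; the dense two-parameter system here only shows how the two loops fit together.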

Blur, Noise, and Compression Robust Generative Adversarial Networks

no code implementations CVPR 2021 Takuhiro Kaneko, Tatsuya Harada

However, in contrast to NR-GAN, to address irreversible characteristics, we introduce masking architectures adjusting degradation strength values in a data-driven manner using bypasses before and after degradation.

Image Generation Image Restoration

Captioning Images with Novel Objects via Online Vocabulary Expansion

no code implementations 6 Mar 2020 Mikihiro Tanaka, Tatsuya Harada

In this study, we introduce a low-cost method for generating descriptions from images containing novel objects.

Image Captioning Word Embeddings

Spherical Image Generation from a Single Normal Field of View Image by Considering Scene Symmetry

no code implementations 9 Jan 2020 Takayuki Hara, Tatsuya Harada

We propose a method to generate a spherical image from a single NFOV image, and control the degree of freedom of the generated regions using scene symmetry.

Image Generation

Noise Robust Generative Adversarial Networks

2 code implementations CVPR 2020 Takuhiro Kaneko, Tatsuya Harada

Therefore, we propose distribution and transformation constraints that encourage the noise generator to capture only the noise-specific components.

Image Denoising Image Generation

Unsupervised Keyword Extraction for Full-sentence VQA

no code implementations EMNLP (nlpbt) 2020 Kohei Uehara, Tatsuya Harada

In the majority of the existing Visual Question Answering (VQA) research, the answers consist of short, often single words, as per instructions given to the annotators during dataset construction.

Keyword Extraction Question Answering +2

Self-supervised Learning of 3D Objects from Natural Images

no code implementations 20 Nov 2019 Hiroharu Kato, Tatsuya Harada

We present a method to learn single-view reconstruction of the 3D shape, pose, and texture of objects from categorized natural images in a self-supervised manner.

3D Object Reconstruction Object +1

A General Upper Bound for Unsupervised Domain Adaptation

no code implementations 3 Oct 2019 Dexuan Zhang, Tatsuya Harada

In this work, we present a novel upper bound of target error to address the problem for unsupervised domain adaptation.

Image Classification Unsupervised Domain Adaptation

Revisiting Fine-tuning for Few-shot Learning

no code implementations 1 Oct 2019 Akihiro Nakamura, Tatsuya Harada

With both tasks, we show that our method achieves higher accuracy than common few-shot learning algorithms.

Few-Shot Learning

Rethinking Task and Metrics of Instance Segmentation on 3D Point Clouds

no code implementations 27 Sep 2019 Kosuke Arase, Yusuke Mukuta, Tatsuya Harada

Certain existing studies have split input point clouds into small regions such as 1m x 1m; one reason for this is that models in the studies cannot consume a large number of points because of the large space complexity.

Instance Segmentation Semantic Segmentation

Measuring Numerical Common Sense: Is A Word Embedding Approach Effective?

no code implementations 25 Sep 2019 Hiroaki Yamane, Chin-Yew Lin, Tatsuya Harada

To this end, we first used a crowdsourcing service to obtain sufficient data for a subjective agreement on numerical common sense.

Common Sense Reasoning regression +1

Invariant Feature Coding using Tensor Product Representation

no code implementations 5 Jun 2019 Yusuke Mukuta, Tatsuya Harada

Based on this result, a novel feature model that explicitly considers group action is proposed for principal component analysis and k-means clustering, which are commonly used in most feature coding methods, and global feature functions.

Clustering

Compact Approximation for Polynomial of Covariance Feature

no code implementations 5 Jun 2019 Yusuke Mukuta, Tatsuaki Machida, Tatsuya Harada

Subsequently, we apply the proposed approximation to the polynomial corresponding to the matrix square root to obtain a compact approximation for the square root of the covariance feature.

Fine-Grained Image Recognition

Scalable Generative Models for Graphs with Graph Attention Mechanism

no code implementations ICLR 2020 Wataru Kawai, Yusuke Mukuta, Tatsuya Harada

Graphs are ubiquitous real-world data structures, and generative models that approximate distributions over graphs and derive new samples from them have significant importance.

Graph Attention Graph Generation

Interactive Video Retrieval with Dialog

no code implementations 7 May 2019 Sho Maeoki, Kohei Uehara, Tatsuya Harada

We propose a system to retrieve videos by asking questions about the content of the videos and leveraging the user's responses to the questions.

Retrieval Video Retrieval

Label-Noise Robust Multi-Domain Image-to-Image Translation

1 code implementation 6 May 2019 Takuhiro Kaneko, Tatsuya Harada

This problem is challenging in terms of scalability because it requires the learning of numerous mappings, the number of which increases in proportion to the number of domains.

Image-to-Image Translation Translation

Long-Term Human Video Generation of Multiple Futures Using Poses

no code implementations 16 Apr 2019 Naoya Fushishita, Antonio Tejero-de-Pablos, Yusuke Mukuta, Tatsuya Harada

First, from an input human video, we generate sequences of future human poses (i.e., the image coordinates of their body-joints) via adversarial learning.

Autonomous Driving Pose Prediction +2

Image Generation From Small Datasets via Batch Statistics Adaptation

4 code implementations ICCV 2019 Atsuhiro Noguchi, Tatsuya Harada

To reduce the amount of data required, we propose a new method for transferring prior knowledge of the pre-trained generator, which is trained with a large dataset, to a small dataset in a different domain.

Common Sense Reasoning Image Generation

End-to-End Learning Using Cycle Consistency for Image-to-Caption Transformations

no code implementations 25 Mar 2019 Keisuke Hagiwara, Yusuke Mukuta, Tatsuya Harada

So far, research to generate captions from images has been carried out from the viewpoint that a caption holds sufficient information for an image.

Pose Graph Optimization for Unsupervised Monocular Visual Odometry

no code implementations 15 Mar 2019 Yang Li, Yoshitaka Ushiku, Tatsuya Harada

In this paper, we propose to leverage graph optimization and loop closure detection to overcome limitations of unsupervised learning based monocular visual odometry.

Loop Closure Detection Monocular Visual Odometry

TWINs: Two Weighted Inconsistency-reduced Networks for Partial Domain Adaptation

no code implementations 18 Dec 2018 Toshihiko Matsuura, Kuniaki Saito, Tatsuya Harada

We utilize two classification networks to estimate the ratio of the target samples in each class with which a classification loss is weighted to adapt the classes present in the target domain.

General Classification Partial Domain Adaptation +2

Learning to Explain with Complemental Examples

no code implementations CVPR 2019 Atsushi Kanehira, Tatsuya Harada

This paper addresses the generation of explanations with visual examples.

Conditional Video Generation Using Action-Appearance Captions

no code implementations 4 Dec 2018 Shohei Yamamoto, Antonio Tejero-de-Pablos, Yoshitaka Ushiku, Tatsuya Harada

The results demonstrate that CFT-GAN is able to successfully generate videos containing the action and appearances indicated in the captions.

Optical Flow Estimation Video Generation

Class-Distinct and Class-Mutual Image Generation with GANs

2 code implementations 27 Nov 2018 Takuhiro Kaneko, Yoshitaka Ushiku, Tatsuya Harada

To overcome this limitation, we address a novel problem called class-distinct and class-mutual image generation, in which the goal is to construct a generator that can capture between-class relationships and generate an image selectively conditioned on the class specificity.

Conditional Image Generation Image-to-Image Translation +1

Label-Noise Robust Generative Adversarial Networks

3 code implementations CVPR 2019 Takuhiro Kaneko, Yoshitaka Ushiku, Tatsuya Harada

To remedy this, we propose a novel family of GANs called label-noise robust GANs (rGANs), which, by incorporating a noise transition model, can learn a clean label conditional generative distribution even when training labels are noisy.

Robust classification
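
The role of a noise transition model can be illustrated with a small sketch. Assuming a row-stochastic matrix `transition[i][j]` that gives the probability that clean class `i` is observed as noisy label `j`, a clean-label posterior is mapped to the distribution over noisy labels it would produce. This is the generic forward-correction idea; how rGANs incorporate it into the GAN objective is not stated in this entry, and the function name and the matrix values below are hypothetical.

```python
def noisy_label_distribution(clean_probs, transition):
    """Map a clean-label distribution to the noisy-label distribution implied
    by a row-stochastic transition matrix (rows: clean, columns: noisy)."""
    num_noisy = len(transition[0])
    return [
        sum(clean_probs[i] * transition[i][j] for i in range(len(clean_probs)))
        for j in range(num_noisy)
    ]


# Hypothetical two-class example: class 0 is flipped 20% of the time,
# class 1 is flipped 30% of the time.
T = [[0.8, 0.2],
     [0.3, 0.7]]
noisy = noisy_label_distribution([0.9, 0.1], T)
```

Training against noisy labels through such a mapping lets the model's internal prediction stay in the clean-label space while the loss is computed in the noisy-label space.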

Learning View Priors for Single-view 3D Reconstruction

no code implementations CVPR 2019 Hiroharu Kato, Tatsuya Harada

The discriminator is trained to distinguish the reconstructed views of the observed viewpoints from those of the unobserved viewpoints.

3D Reconstruction Object +1

Visual Question Generation for Class Acquisition of Unknown Objects

1 code implementation ECCV 2018 Kohei Uehara, Antonio Tejero-de-Pablos, Yoshitaka Ushiku, Tatsuya Harada

In this paper, we propose a method for generating questions about unknown objects in an image, as a means to get information about classes that have not been learned.

Question Generation Question-Generation

Open Set Domain Adaptation by Backpropagation

4 code implementations ECCV 2018 Kuniaki Saito, Shohei Yamamoto, Yoshitaka Ushiku, Tatsuya Harada

Almost all of them are proposed for a closed-set scenario, where the source and the target domain completely share the class of their samples.

Domain Adaptation

Customized Image Narrative Generation via Interactive Visual Question Generation and Answering

no code implementations CVPR 2018 Andrew Shin, Yoshitaka Ushiku, Tatsuya Harada

The image description task has been invariably examined in a static manner with qualitative presumptions held to be universally applicable, regardless of the scope or target of the description.

Question Generation Question-Generation

Viewpoint-aware Video Summarization

no code implementations CVPR 2018 Atsushi Kanehira, Luc van Gool, Yoshitaka Ushiku, Tatsuya Harada

To satisfy these requirements (A)-(C) simultaneously, we proposed a novel video summarization method from multiple groups of videos.

Semantic Similarity Semantic Textual Similarity +1

Maximum Classifier Discrepancy for Unsupervised Domain Adaptation

8 code implementations CVPR 2018 Kuniaki Saito, Kohei Watanabe, Yoshitaka Ushiku, Tatsuya Harada

To solve these problems, we introduce a new approach that attempts to align distributions of source and target by utilizing the task-specific decision boundaries.

Image Classification Multi-Source Unsupervised Domain Adaptation +2

Between-class Learning for Image Classification

3 code implementations CVPR 2018 Yuji Tokozume, Yoshitaka Ushiku, Tatsuya Harada

Second, we propose a mixing method that treats the images as waveforms, which leads to a further improvement in performance.

Classification General Classification +1

Neural 3D Mesh Renderer

3 code implementations CVPR 2018 Hiroharu Kato, Yoshitaka Ushiku, Tatsuya Harada

Using this renderer, we perform single-image 3D mesh reconstruction with silhouette image supervision and our system outperforms the existing voxel-based approach.

3D Object Reconstruction Style Transfer

Adversarial Dropout Regularization

no code implementations ICLR 2018 Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada, Kate Saenko

However, a drawback of this approach is that the critic simply labels the generated features as in-domain or not, without considering the boundaries between classes.

General Classification Image Classification +2

Melody Generation for Pop Music via Word Representation of Musical Properties

1 code implementation 31 Oct 2017 Andrew Shin, Leopold Crestel, Hiroharu Kato, Kuniaki Saito, Katsunori Ohnishi, Masataka Yamaguchi, Masahiro Nakawaki, Yoshitaka Ushiku, Tatsuya Harada

Automatic melody generation for pop music has been a long-time aspiration for both AI researchers and musicians.

Sound Multimedia Audio and Speech Processing

Spatio-temporal Person Retrieval via Natural Language Queries

no code implementations ICCV 2017 Masataka Yamaguchi, Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada

In this paper, we address the problem of spatio-temporal person retrieval from multiple videos using a natural language query, in which we output a tube (i.e., a sequence of bounding boxes) which encloses the person described by the query.

Human Detection Natural Language Queries +2

Development of JavaScript-based deep learning platform and application to distributed training

no code implementations 7 Feb 2017 Masatoshi Hidaka, Ken Miura, Tatsuya Harada

In the experiments, we demonstrate their practicality by training VGGNet in a distributed manner using web browsers as the client.

DeMIAN: Deep Modality Invariant Adversarial Network

no code implementations 23 Dec 2016 Kuniaki Saito, Yusuke Mukuta, Yoshitaka Ushiku, Tatsuya Harada

To obtain the common representations under such a situation, we propose to make the distributions over different modalities similar in the learned representations, namely modality-invariant representations.

Domain Adaptation General Classification +2

The Color of the Cat is Gray: 1 Million Full-Sentences Visual Question Answering (FSVQA)

no code implementations 21 Sep 2016 Andrew Shin, Yoshitaka Ushiku, Tatsuya Harada

The Visual Question Answering (VQA) task has showcased a new stage of interaction between language and vision, two of the most pivotal components of artificial intelligence.

Question Answering Sentence +1

DualNet: Domain-Invariant Network for Visual Question Answering

no code implementations 20 Jun 2016 Kuniaki Saito, Andrew Shin, Yoshitaka Ushiku, Tatsuya Harada

The visual question answering (VQA) task not only bridges the gap between images and language, but also requires that specific contents within the image be understood as indicated by the linguistic context of the question, in order to generate accurate answers.

Question Answering Visual Question Answering

Kernel Approximation via Empirical Orthogonal Decomposition for Unsupervised Feature Learning

no code implementations CVPR 2016 Yusuke Mukuta, Tatsuya Harada

Our experiments show that the proposed method is better than the random features method and comparable with the Nyström method in terms of the approximation error and classification accuracy.

Improved Dense Trajectory with Cross Streams

no code implementations 29 Apr 2016 Katsunori Ohnishi, Masatoshi Hidaka, Tatsuya Harada

This new descriptor is calculated by applying discriminative weights learned from one network to a convolutional layer of the other network.

Action Classification Action Recognition +1

Dense Image Representation with Spatial Pyramid VLAD Coding of CNN for Locally Robust Captioning

no code implementations 30 Mar 2016 Andrew Shin, Masataka Yamaguchi, Katsunori Ohnishi, Tatsuya Harada

The workflow of extracting features from images using convolutional neural networks (CNN) and generating captions with recurrent neural networks (RNN) has become a de facto standard for the image captioning task.

General Classification Image Captioning

Common Subspace for Model and Similarity: Phrase Learning for Caption Generation From Images

no code implementations ICCV 2015 Yoshitaka Ushiku, Masataka Yamaguchi, Yusuke Mukuta, Tatsuya Harada

In order to overcome the shortage of training samples, CoSMoS obtains a subspace in which (a) all feature vectors associated with the same phrase are mapped as mutually close, (b) classifiers for each phrase are learned, and (c) training samples are shared among co-occurring phrases.

Caption Generation Descriptive

Recognizing Activities of Daily Living with a Wrist-mounted Camera

no code implementations CVPR 2016 Katsunori Ohnishi, Atsushi Kanehira, Asako Kanezaki, Tatsuya Harada

We present a novel dataset and a novel algorithm for recognizing activities of daily living (ADL) from a first-person wearable camera.

object-detection Object Detection

Visual Language Modeling on CNN Image Representations

no code implementations 9 Nov 2015 Hiroharu Kato, Tatsuya Harada

Additionally, a method to measure naturalness can be complementary to Convolutional Neural Network (CNN) based features, which are known to be insensitive to the naturalness of images.

Language Modelling

Image Reconstruction from Bag-of-Visual-Words

no code implementations CVPR 2014 Hiroharu Kato, Tatsuya Harada

The objective of this work is to reconstruct an original image from Bag-of-Visual-Words (BoVW).

Image Reconstruction Retrieval

Implementation of a Practical Distributed Calculation System with Browsers and JavaScript, and Application to Distributed Deep Learning

no code implementations 19 Mar 2015 Ken Miura, Tatsuya Harada

We have also developed a new JavaScript neural network framework called "Sukiyaki" that uses general purpose GPUs with web browsers.

MILJS : Brand New JavaScript Libraries for Matrix Calculation and Machine Learning

no code implementations 21 Feb 2015 Ken Miura, Tetsuaki Mano, Atsushi Kanehira, Yuichiro Tsuchiya, Tatsuya Harada

Our core library for matrix calculation is called Sushi, which exhibits far better performance than other leading machine learning libraries written in JavaScript.

BIG-bench Machine Learning

Three Guidelines of Online Learning for Large-Scale Visual Recognition

no code implementations CVPR 2014 Yoshitaka Ushiku, Masatoshi Hidaka, Tatsuya Harada

In this paper, we would like to evaluate online learning algorithms for large-scale visual recognition using state-of-the-art features which are preselected and held fixed.

Document Classification General Classification +1

Graphical Gaussian Vector for Image Categorization

no code implementations NeurIPS 2012 Tatsuya Harada, Yasuo Kuniyoshi

This paper proposes a novel image representation called a Graphical Gaussian Vector, which is a counterpart of the codebook and local feature matching approaches.

Image Categorization Object Recognition
