Search Results for author: Yoshitaka Ushiku

Found 31 papers, 16 papers with code

Removing Word-Level Spurious Alignment between Images and Pseudo-Captions in Unsupervised Image Captioning

1 code implementation EACL 2021 Ukyo Honda, Yoshitaka Ushiku, Atsushi Hashimoto, Taro Watanabe, Yuji Matsumoto

Unsupervised image captioning is a challenging task that aims at generating captions without the supervision of image-sentence pairs, but only with images and sentences drawn from different sources and object labels detected from the images.

Image Captioning

Divergence Optimization for Noisy Universal Domain Adaptation

no code implementations CVPR 2021 Qing Yu, Atsushi Hashimoto, Yoshitaka Ushiku

Hence, we consider a new realistic setting called Noisy UniDA, in which classifiers are trained with noisy labeled data from the source domain and unlabeled data with an unknown class distribution from the target domain.

Universal Domain Adaptation

Visual Grounding Annotation of Recipe Flow Graph

no code implementations LREC 2020 Taichi Nishimura, Suzushi Tomori, Hayato Hashimoto, Atsushi Hashimoto, Yoko Yamakata, Jun Harashima, Yoshitaka Ushiku, Shinsuke Mori

Visual grounding is provided as bounding boxes to image sequences of recipes, and each bounding box is linked to an element of the workflow.

Visual Grounding

Crowd Density Forecasting by Modeling Patch-based Dynamics

no code implementations 22 Nov 2019 Hiroaki Minoura, Ryo Yonetani, Mai Nishimura, Yoshitaka Ushiku

To address this task, we have developed the patch-based density forecasting network (PDFN), which enables forecasting over a sequence of crowd density maps describing how crowded each location is in each video frame.

Autonomous Driving
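The abstract above mentions forecasting over crowd density maps. As background, a minimal sketch of how such a map is commonly built from annotated pedestrian positions (each person contributes a normalized Gaussian bump, so the map sums to the crowd count); whether PDFN uses exactly this preprocessing is an assumption, not stated in the snippet:

```python
import math

def density_map(points, h, w, sigma=1.0):
    """Build an h-by-w crowd density map from (row, col) pedestrian positions.

    Each person adds a Gaussian bump normalized to integrate to 1, so the
    whole map sums (up to floating point) to the number of people.
    """
    grid = [[0.0] * w for _ in range(h)]
    for (py, px) in points:
        # Evaluate an un-normalized Gaussian centered on this person.
        bump = [[math.exp(-((y - py) ** 2 + (x - px) ** 2) / (2 * sigma ** 2))
                 for x in range(w)] for y in range(h)]
        norm = sum(sum(row) for row in bump)
        # Accumulate the normalized bump into the map.
        for y in range(h):
            for x in range(w):
                grid[y][x] += bump[y][x] / norm
    return grid

grid = density_map([(2, 2), (5, 5)], 8, 8)
print(sum(sum(row) for row in grid))  # ≈ 2.0, one unit of mass per person
```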

Decentralized Learning of Generative Adversarial Networks from Non-iid Data

no code implementations 23 May 2019 Ryo Yonetani, Tomohiro Takahashi, Atsushi Hashimoto, Yoshitaka Ushiku

This work addresses a new problem: learning generative adversarial networks (GANs) from multiple data collections that are each (i) owned separately by different clients and (ii) drawn from non-identical distributions comprising different classes.

Image Generation

Pose Graph Optimization for Unsupervised Monocular Visual Odometry

no code implementations 15 Mar 2019 Yang Li, Yoshitaka Ushiku, Tatsuya Harada

In this paper, we propose to leverage graph optimization and loop closure detection to overcome limitations of unsupervised learning based monocular visual odometry.

Loop Closure Detection Monocular Visual Odometry

Conditional Video Generation Using Action-Appearance Captions

no code implementations 4 Dec 2018 Shohei Yamamoto, Antonio Tejero-de-Pablos, Yoshitaka Ushiku, Tatsuya Harada

The results demonstrate that CFT-GAN is able to successfully generate videos containing the action and appearances indicated in the captions.

Optical Flow Estimation Video Generation

Generating Easy-to-Understand Referring Expressions for Target Identifications

2 code implementations ICCV 2019 Mikihiro Tanaka, Takayuki Itamochi, Kenichi Narioka, Ikuro Sato, Yoshitaka Ushiku, Tatsuya Harada

Moreover, we regard easily understood sentences as those that humans comprehend correctly and quickly.

Label-Noise Robust Generative Adversarial Networks

3 code implementations CVPR 2019 Takuhiro Kaneko, Yoshitaka Ushiku, Tatsuya Harada

To remedy this, we propose a novel family of GANs called label-noise robust GANs (rGANs), which, by incorporating a noise transition model, can learn a clean label conditional generative distribution even when training labels are noisy.

Robust classification
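The rGAN abstract above hinges on a noise transition model. As an illustrative sketch (not the rGAN architecture itself), a transition matrix T with T[i][j] = p(noisy label j | clean label i) lets one both sample corrupted labels and predict the noisy class distribution implied by a clean one; rGANs incorporate such a model so the generator can still learn a clean conditional distribution from noisy training labels:

```python
import random

# Hypothetical 3-class symmetric noise: 20% of labels flip uniformly.
# The exact noise model is an assumption for illustration.
T = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]

def corrupt(clean_label, rng=random):
    """Sample a noisy label from the transition model."""
    return rng.choices(range(len(T)), weights=T[clean_label])[0]

def noisy_distribution(clean_probs):
    """Push a clean class distribution through T: p(noisy) = p(clean) @ T."""
    k = len(T)
    return [sum(clean_probs[i] * T[i][j] for i in range(k)) for j in range(k)]

print(noisy_distribution([1.0, 0.0, 0.0]))  # ≈ [0.8, 0.1, 0.1]
```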

Class-Distinct and Class-Mutual Image Generation with GANs

2 code implementations 27 Nov 2018 Takuhiro Kaneko, Yoshitaka Ushiku, Tatsuya Harada

To overcome this limitation, we address a novel problem called class-distinct and class-mutual image generation, in which the goal is to construct a generator that can capture between-class relationships and generate an image selectively conditioned on the class specificity.

Conditional Image Generation Image-to-Image Translation

Visual Question Generation for Class Acquisition of Unknown Objects

1 code implementation ECCV 2018 Kohei Uehara, Antonio Tejero-de-Pablos, Yoshitaka Ushiku, Tatsuya Harada

In this paper, we propose a method for generating questions about unknown objects in an image, as a means to obtain information about classes that have not been learned.

Question Generation

Open Set Domain Adaptation by Backpropagation

4 code implementations ECCV 2018 Kuniaki Saito, Shohei Yamamoto, Yoshitaka Ushiku, Tatsuya Harada

Almost all of them are proposed for a closed-set scenario, where the source and the target domain completely share the classes of their samples.

Domain Adaptation

Customized Image Narrative Generation via Interactive Visual Question Generation and Answering

no code implementations CVPR 2018 Andrew Shin, Yoshitaka Ushiku, Tatsuya Harada

The image description task has invariably been examined in a static manner, with qualitative presumptions held to be universally applicable regardless of the scope or target of the description.

Question Generation

Viewpoint-aware Video Summarization

no code implementations CVPR 2018 Atsushi Kanehira, Luc van Gool, Yoshitaka Ushiku, Tatsuya Harada

To satisfy these requirements (A)-(C) simultaneously, we propose a novel video summarization method from multiple groups of videos.

Semantic Similarity Semantic Textual Similarity +1

Maximum Classifier Discrepancy for Unsupervised Domain Adaptation

6 code implementations CVPR 2018 Kuniaki Saito, Kohei Watanabe, Yoshitaka Ushiku, Tatsuya Harada

To solve these problems, we introduce a new approach that attempts to align distributions of source and target by utilizing the task-specific decision boundaries.

Image Classification Multi-Source Unsupervised Domain Adaptation +2
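The MCD abstract above describes aligning source and target distributions via task-specific decision boundaries. The key quantity is the discrepancy between two classifiers' outputs on target samples: training alternates between maximizing it w.r.t. the classifiers (to expose target samples near the boundaries) and minimizing it w.r.t. the feature generator. A minimal sketch of that discrepancy term only, with the networks omitted (`p1`/`p2` stand for softmax outputs; the L1 form matches the paper's usual formulation as I recall it, so treat it as an assumption):

```python
def discrepancy(p1, p2):
    """Mean L1 distance between two class-probability vectors.

    Large values mean the two classifiers disagree on this sample,
    i.e. it lies near a task-specific decision boundary.
    """
    assert len(p1) == len(p2)
    return sum(abs(a - b) for a, b in zip(p1, p2)) / len(p1)

# Two classifiers disagreeing on a target sample -> large discrepancy.
print(discrepancy([0.9, 0.1], [0.2, 0.8]))  # ≈ 0.7
# Perfect agreement -> zero discrepancy, nothing for the generator to fix.
print(discrepancy([0.5, 0.5], [0.5, 0.5]))  # 0.0
```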

Between-class Learning for Image Classification

3 code implementations CVPR 2018 Yuji Tokozume, Yoshitaka Ushiku, Tatsuya Harada

Second, we propose a mixing method that treats the images as waveforms, which leads to a further improvement in performance.

Classification General Classification +1
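The abstract above mentions a mixing method that treats images as waveforms. A minimal sketch of that idea: zero-center each image, combine with a coefficient that compensates for their differing "sound pressure" (standard deviation), and mix the labels as r : (1 - r). The exact coefficients below follow the paper's BC+ variant as I recall it and should be treated as an assumption, not a verified transcription:

```python
import math

def bc_mix(x1, x2, r):
    """Mix two flattened images treated as waveforms (BC-learning-style sketch).

    Assumes 0 < r < 1 and that both inputs have nonzero standard deviation.
    Returns the mixed input and the soft label (r, 1 - r).
    """
    def mean(x):
        return sum(x) / len(x)

    def std(x):
        m = mean(x)
        return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))

    # Weight accounting for the two images' relative "energy".
    g = 1.0 / (1.0 + std(x1) / std(x2) * (1 - r) / r)
    denom = math.sqrt(g ** 2 + (1 - g) ** 2)  # keep mixture variance stable
    m1, m2 = mean(x1), mean(x2)
    mixed = [(g * (a - m1) + (1 - g) * (b - m2)) / denom
             for a, b in zip(x1, x2)]
    return mixed, (r, 1 - r)

mixed, label = bc_mix([0.0, 1.0, 2.0, 3.0], [3.0, 2.0, 1.0, 0.0], 0.5)
print(label)  # (0.5, 0.5)
```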

Neural 3D Mesh Renderer

2 code implementations CVPR 2018 Hiroharu Kato, Yoshitaka Ushiku, Tatsuya Harada

Using this renderer, we perform single-image 3D mesh reconstruction with silhouette image supervision and our system outperforms the existing voxel-based approach.

3D Object Reconstruction Style Transfer

Adversarial Dropout Regularization

no code implementations ICLR 2018 Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada, Kate Saenko

However, a drawback of this approach is that the critic simply labels the generated features as in-domain or not, without considering the boundaries between classes.

General Classification Image Classification +2

Melody Generation for Pop Music via Word Representation of Musical Properties

1 code implementation 31 Oct 2017 Andrew Shin, Leopold Crestel, Hiroharu Kato, Kuniaki Saito, Katsunori Ohnishi, Masataka Yamaguchi, Masahiro Nakawaki, Yoshitaka Ushiku, Tatsuya Harada

Automatic melody generation for pop music has been a long-time aspiration for both AI researchers and musicians.

Sound Multimedia Audio and Speech Processing

Spatio-temporal Person Retrieval via Natural Language Queries

no code implementations ICCV 2017 Masataka Yamaguchi, Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada

In this paper, we address the problem of spatio-temporal person retrieval from multiple videos using a natural language query, in which we output a tube (i.e., a sequence of bounding boxes) which encloses the person described by the query.

Human Detection Person Retrieval
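The abstract above defines a tube as a sequence of bounding boxes over frames. For concreteness, a small sketch of how two tubes are commonly compared by averaging per-frame box IoU; whether the paper scores tubes with exactly this metric is an assumption:

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    def area(r):
        return (r[2] - r[0]) * (r[3] - r[1])

    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def tube_iou(tube_a, tube_b):
    """Mean per-frame IoU of two tubes (frame-aligned box sequences)."""
    n = min(len(tube_a), len(tube_b))
    return sum(box_iou(a, b) for a, b in zip(tube_a, tube_b)) / n

tube = [(0, 0, 2, 2), (1, 1, 3, 3)]
print(tube_iou(tube, tube))  # 1.0 -- a tube overlaps itself perfectly
```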

DeMIAN: Deep Modality Invariant Adversarial Network

no code implementations 23 Dec 2016 Kuniaki Saito, Yusuke Mukuta, Yoshitaka Ushiku, Tatsuya Harada

To obtain the common representations under such a situation, we propose to make the distributions over different modalities similar in the learned representations, namely modality-invariant representations.

Domain Adaptation General Classification +2

The Color of the Cat is Gray: 1 Million Full-Sentences Visual Question Answering (FSVQA)

no code implementations 21 Sep 2016 Andrew Shin, Yoshitaka Ushiku, Tatsuya Harada

The Visual Question Answering (VQA) task has showcased a new stage of interaction between language and vision, two of the most pivotal components of artificial intelligence.

Question Answering Visual Question Answering

DualNet: Domain-Invariant Network for Visual Question Answering

no code implementations 20 Jun 2016 Kuniaki Saito, Andrew Shin, Yoshitaka Ushiku, Tatsuya Harada

The visual question answering (VQA) task not only bridges the gap between images and language, but also requires that specific contents within the image be understood as indicated by the linguistic context of the question, in order to generate accurate answers.

Question Answering Visual Question Answering

Common Subspace for Model and Similarity: Phrase Learning for Caption Generation From Images

no code implementations ICCV 2015 Yoshitaka Ushiku, Masataka Yamaguchi, Yusuke Mukuta, Tatsuya Harada

In order to overcome the shortage of training samples, CoSMoS obtains a subspace in which (a) all feature vectors associated with the same phrase are mapped as mutually close, (b) classifiers for each phrase are learned, and (c) training samples are shared among co-occurring phrases.

Three Guidelines of Online Learning for Large-Scale Visual Recognition

no code implementations CVPR 2014 Yoshitaka Ushiku, Masatoshi Hidaka, Tatsuya Harada

In this paper, we would like to evaluate online learning algorithms for large-scale visual recognition using state-of-the-art features which are preselected and held fixed.

Document Classification General Classification +1
