Search Results for author: Mayu Otani

Found 29 papers, 12 papers with code

Would Deep Generative Models Amplify Bias in Future Models?

no code implementations · 4 Apr 2024 · Tianwei Chen, Yusuke Hirota, Mayu Otani, Noa Garcia, Yuta Nakashima

We investigate the impact of deep generative models on potential social biases in upcoming computer vision models.

Image Captioning · Image Generation

LayoutFlow: Flow Matching for Layout Generation

no code implementations · 27 Mar 2024 · Julian Jorge Andrade Guerreiro, Naoto Inoue, Kento Masui, Mayu Otani, Hideki Nakayama

Finding a suitable layout represents a crucial task for diverse applications in graphic design.

Denoising

Multimodal Color Recommendation in Vector Graphic Documents

no code implementations · 8 Aug 2023 · Qianru Qiu, Xueting Wang, Mayu Otani

Additionally, it is applicable to another color recommendation task, full palette generation, which produces a complete color palette corresponding to the given text.

Toward Verifiable and Reproducible Human Evaluation for Text-to-Image Generation

no code implementations · CVPR 2023 · Mayu Otani, Riku Togashi, Yu Sawai, Ryosuke Ishigami, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Shin'ichi Satoh

Human evaluation is critical for validating the performance of text-to-image generative models, as this highly cognitive process requires deep comprehension of text and images.

Text-to-Image Generation

Towards Flexible Multi-modal Document Models

1 code implementation · CVPR 2023 · Naoto Inoue, Kotaro Kikuchi, Edgar Simo-Serra, Mayu Otani, Kota Yamaguchi

Creative workflows for generating graphical documents involve complex inter-related tasks, such as aligning elements, choosing appropriate fonts, or employing aesthetically harmonious colors.

Multi-Task Learning · Position

LayoutDM: Discrete Diffusion Model for Controllable Layout Generation

1 code implementation · CVPR 2023 · Naoto Inoue, Kotaro Kikuchi, Edgar Simo-Serra, Mayu Otani, Kota Yamaguchi

Controllable layout generation aims at synthesizing a plausible arrangement of element bounding boxes with optional constraints, such as the type or position of a specific element.

Position

Generative Colorization of Structured Mobile Web Pages

1 code implementation · 22 Dec 2022 · Kotaro Kikuchi, Naoto Inoue, Mayu Otani, Edgar Simo-Serra, Kota Yamaguchi

The web page colorization problem is then formalized as the task of estimating plausible color styles for given web page content with a given hierarchical structure of elements.

Colorization · Efficient Exploration · +1

Contrastive Losses Are Natural Criteria for Unsupervised Video Summarization

1 code implementation · 18 Nov 2022 · Zongshang Pang, Yuta Nakashima, Mayu Otani, Hajime Nagahara

Video summarization aims to select the most informative subset of frames in a video to facilitate efficient video browsing.

Image Classification · Representation Learning · +1
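The snippet above frames summarization as picking the most informative subset of frames without supervision. A toy sketch of that idea (the scoring rule here is a hypothetical representativeness criterion for illustration, not the paper's exact contrastive losses):

```python
# Toy illustration: score each frame feature by how well it represents the
# whole video -- here, mean cosine similarity to all frames -- and keep the
# top-scoring frames as a summary, with no training or human labels involved.
import numpy as np

rng = np.random.default_rng(0)
frames = rng.normal(size=(20, 64))                       # 20 frame features
frames /= np.linalg.norm(frames, axis=1, keepdims=True)  # unit-normalize

sim = frames @ frames.T                   # pairwise cosine similarities
scores = sim.mean(axis=1)                 # representativeness of each frame
summary = np.argsort(scores)[::-1][:5]    # indices of the 5 top-scoring frames

print(sorted(summary.tolist()))
```

Because the criterion is computed directly from frame features, the selection needs no annotated summaries, which is the appeal of unsupervised criteria of this kind.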

Video Summarization Overview

no code implementations · 21 Oct 2022 · Mayu Otani, Yale Song, Yang Wang

With the rapid growth of video capture devices and web applications, there is increasing demand to provide users with desired video content efficiently.

Video Summarization

Color Recommendation for Vector Graphic Documents based on Multi-Palette Representation

no code implementations · 22 Sep 2022 · Qianru Qiu, Xueting Wang, Mayu Otani, Yuki Iwazaki

We train the model and build a color recommendation system on a large-scale dataset of vector graphic documents.

Does Robustness on ImageNet Transfer to Downstream Tasks?

no code implementations · CVPR 2022 · Yutaro Yamada, Mayu Otani

For object detection and semantic segmentation, we find that a vanilla Swin Transformer, a variant of Vision Transformer tailored for dense prediction tasks, transfers robustness better than Convolutional Neural Networks that are trained to be robust to the corrupted version of ImageNet.

Classification · Image Classification · +5

AxIoU: An Axiomatically Justified Measure for Video Moment Retrieval

no code implementations · CVPR 2022 · Riku Togashi, Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkila, Tetsuya Sakai

First, it is rank-insensitive: it ignores the rank positions of successfully localised moments in the top-$K$ ranked list by treating the list as a set.

Moment Retrieval · Retrieval
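The rank-insensitivity criticized above can be seen in a few lines: a Recall@K-style measure with an IoU threshold gives the same score no matter where in the top-$K$ list the successful localisation sits. A minimal sketch (function names and moments are illustrative, not from the paper):

```python
# Recall@K with an IoU threshold treats the top-K list as a set, so
# reordering the list never changes the score -- the rank-insensitivity
# noted in the snippet above.

def iou(a, b):
    """Temporal IoU of two (start, end) moments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = max(a[1], b[1]) - min(a[0], b[0])
    return inter / union if union > 0 else 0.0

def recall_at_k(ranked, gt, k, theta=0.5):
    """1.0 if any of the top-k moments overlaps gt with IoU >= theta."""
    return float(any(iou(m, gt) >= theta for m in ranked[:k]))

gt = (10.0, 20.0)
good_first = [(11.0, 19.0), (50.0, 60.0), (70.0, 80.0)]
good_last  = [(50.0, 60.0), (70.0, 80.0), (11.0, 19.0)]

# Both orderings score R@3 = 1.0, even though the successful localisation
# sits at rank 1 in one list and rank 3 in the other.
print(recall_at_k(good_first, gt, 3), recall_at_k(good_last, gt, 3))
```

An axiomatically justified measure would instead reward placing the well-localised moment earlier in the ranking.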

Constrained Graphic Layout Generation via Latent Optimization

1 code implementation · 2 Aug 2021 · Kotaro Kikuchi, Edgar Simo-Serra, Mayu Otani, Kota Yamaguchi

We optimize using the latent space of an off-the-shelf layout generation model, allowing our approach to be complementary to and used with existing layout generation models.
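The idea of optimizing in the latent space of a frozen, off-the-shelf generator can be sketched in a few lines. This is an assumed toy setup, not the authors' code: the "generator" is a fixed random linear map, and the constraint asks one output coordinate (say, an element's x-position) to hit a target value.

```python
# Constrained generation by gradient descent on the latent code z of a
# *frozen* generator G(z) = W @ z. Only z changes; the generator's weights
# stay fixed, so the approach composes with any existing generation model.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))        # frozen generator weights
target = 0.5                       # desired value for output dimension 0

z = rng.normal(size=8)             # latent code being optimized
for _ in range(2000):
    layout = W @ z
    # Constraint loss: squared error on the constrained coordinate only.
    grad_out = np.zeros(4)
    grad_out[0] = 2.0 * (layout[0] - target)
    z -= 0.01 * (W.T @ grad_out)   # gradient step in latent space

print(round(float((W @ z)[0]), 3))   # converges to ~0.5
```

With a real layout model, the squared-error term would be replaced by constraint losses on element types and positions, but the mechanics are the same: backpropagate through the frozen generator into the latent code.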

Attending Self-Attention: A Case Study of Visually Grounded Supervision in Vision-and-Language Transformers

no code implementations · ACL 2021 · Jules Samaran, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima

The impressive performances of pre-trained visually grounded language models have motivated a growing body of research investigating what has been learned during the pre-training.

Language Modelling · Visual Grounding

A Picture May Be Worth a Hundred Words for Visual Question Answering

no code implementations · 25 Jun 2021 · Yusuke Hirota, Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima, Ittetsu Taniguchi, Takao Onoye

This paper delves into the effectiveness of textual representations for image understanding in the specific context of VQA.

Data Augmentation · Descriptive · +2

Scalable Personalised Item Ranking through Parametric Density Estimation

no code implementations · 11 May 2021 · Riku Togashi, Masahiro Kato, Mayu Otani, Tetsuya Sakai, Shin'ichi Satoh

However, such methods have two main drawbacks, particularly in large-scale applications: (1) the pairwise approach is severely inefficient due to the quadratic computational cost; and (2) even recent model-based samplers (e.g., IRGAN) cannot achieve practical efficiency due to the training of an extra model.

Density Estimation · Learning-To-Rank
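The quadratic cost mentioned in the snippet above is easy to see with a back-of-envelope count (the numbers here are hypothetical, not from the paper):

```python
# A pairwise ranking objective compares every observed positive with every
# candidate negative, while a pointwise (density-estimation style) objective
# touches each item once -- hence quadratic vs. linear loss terms.

n_pos, n_neg = 1_000, 1_000_000    # e.g. items a user interacted with vs. the rest

pairwise_terms = n_pos * n_neg     # one loss term per (positive, negative) pair
pointwise_terms = n_pos + n_neg    # one loss term per item

print(pairwise_terms, pointwise_terms)   # 1000000000 vs 1001000
```

At catalogue scale, the three-orders-of-magnitude gap is what makes parametric density estimation attractive for personalised ranking.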

Density-Ratio Based Personalised Ranking from Implicit Feedback

no code implementations · 19 Jan 2021 · Riku Togashi, Masahiro Kato, Mayu Otani, Shin'ichi Satoh

Learning from implicit user feedback is challenging as we can only observe positive samples but never access negative ones.

Density Ratio Estimation

Alleviating Cold-Start Problems in Recommendation through Pseudo-Labelling over Knowledge Graph

2 code implementations · 10 Nov 2020 · Riku Togashi, Mayu Otani, Shin'ichi Satoh

Solving cold-start problems is indispensable to provide meaningful recommendation results for new users and items.

Uncovering Hidden Challenges in Query-Based Video Moment Retrieval

1 code implementation · 1 Sep 2020 · Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä

In this paper, we present a series of experiments assessing how well the benchmark results reflect the true progress in solving the moment retrieval task.

Moment Retrieval · Retrieval · +2

Knowledge-Based Visual Question Answering in Videos

no code implementations · 17 Apr 2020 · Noa Garcia, Mayu Otani, Chenhui Chu, Yuta Nakashima

We propose a novel video understanding task by fusing knowledge-based and video question answering.

Question Answering · Video Question Answering · +2

Rethinking the Evaluation of Video Summaries

2 code implementations · CVPR 2019 · Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä

Video summarization is a technique to create a short skim of the original video while preserving the main stories/content.

Video Segmentation · Video Semantic Segmentation · +1

iParaphrasing: Extracting Visually Grounded Paraphrases via an Image

1 code implementation · COLING 2018 · Chenhui Chu, Mayu Otani, Yuta Nakashima

These extracted VGPs have the potential to improve language and image multimodal tasks such as visual question answering and image captioning.

Image Captioning · Question Answering · +1

Video Summarization using Deep Semantic Features

2 code implementations · 28 Sep 2016 · Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Naokazu Yokoya

For this, we design a deep neural network that maps videos as well as descriptions to a common semantic space and jointly train it with associated pairs of videos and descriptions.

Clustering · Video Summarization
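The common semantic space described above can be sketched with two projection heads and a cosine similarity. This is an assumed toy setup with random weights (the real model is a trained deep network; the feature sizes are illustrative):

```python
# Two branches project video features and description features into one
# shared space; after L2-normalization, a dot product is cosine similarity,
# which can then drive retrieval or summarization.
import numpy as np

rng = np.random.default_rng(1)
Wv = rng.normal(size=(32, 2048))   # video branch: 2048-d feature -> 32-d
Wt = rng.normal(size=(32, 300))    # text branch: 300-d feature -> 32-d

def embed(W, x):
    h = W @ x
    return h / np.linalg.norm(h)   # unit norm: dot product = cosine similarity

video = embed(Wv, rng.normal(size=2048))
texts = [embed(Wt, rng.normal(size=300)) for _ in range(5)]

# With trained weights, the top-ranked description would describe the video.
best = max(range(5), key=lambda i: float(video @ texts[i]))
print(best)
```

Joint training on associated video-description pairs is what makes nearness in this space semantically meaningful; with the random weights above, the ranking is arbitrary.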

Learning Joint Representations of Videos and Sentences with Web Image Search

no code implementations · 8 Aug 2016 · Mayu Otani, Yuta Nakashima, Esa Rahtu, Janne Heikkilä, Naokazu Yokoya

In description generation, the performance level is comparable to the current state-of-the-art, although our embeddings were trained for the retrieval tasks.

Image Retrieval · Natural Language Queries · +5
