Search Results for author: Hiromi Wakaki

Found 17 papers, 7 papers with code

Fundamental Exploration of Evaluation Metrics for Persona Characteristics of Text Utterances

no code implementations SIGDIAL (ACL) 2021 Chiaki Miyazaki, Saya Kanno, Makoto Yoda, Junya Ono, Hiromi Wakaki

When evaluating the appropriateness of a large number of arbitrary utterances to be registered in the utterance database of a retrieval-based dialog system, evaluation metrics that require a reference (or a “correct” utterance) for each evaluation target cannot be used.

Retrieval

CARE: Aligning Language Models for Regional Cultural Awareness

1 code implementation 7 Apr 2025 Geyang Guo, Tarek Naous, Hiromi Wakaki, Yukiko Nishimura, Yuki Mitsufuji, Alan Ritter, Wei Xu

Existing language models (LMs) often exhibit a Western-centric bias and struggle to represent diverse cultural knowledge.

Cross-Modal Learning for Music-to-Music-Video Description Generation

no code implementations 14 Mar 2025 Zhuoyuan Mao, Mengjie Zhao, Qiyu Wu, Zhi Zhong, Wei-Hsiang Liao, Hiromi Wakaki, Yuki Mitsufuji

In this study, we focus on the MV description generation task and propose a comprehensive pipeline encompassing training data construction and multimodal model fine-tuning.

Video Description, Video Generation

DeepResonance: Enhancing Multimodal Music Understanding via Music-centric Multi-way Instruction Tuning

no code implementations 18 Feb 2025 Zhuoyuan Mao, Mengjie Zhao, Qiyu Wu, Hiromi Wakaki, Yuki Mitsufuji

Recent advancements in music large language models (LLMs) have significantly improved music understanding tasks, which involve the model's ability to analyze and interpret various musical elements.

TED: Turn Emphasis with Dialogue Feature Attention for Emotion Recognition in Conversation

no code implementations 2 Jan 2025 Junya Ono, Hiromi Wakaki

This paper proposes a priority-based attention method to distinguish each turn explicitly by adding dialogue features into the attention mechanism, called Turn Emphasis with Dialogue (TED).

Emotion Recognition in Conversation

OpenMU: Your Swiss Army Knife for Music Understanding

2 code implementations 21 Oct 2024 Mengjie Zhao, Zhi Zhong, Zhuoyuan Mao, Shiqi Yang, Wei-Hsiang Liao, Shusuke Takahashi, Hiromi Wakaki, Yuki Mitsufuji

We present OpenMU-Bench, a large-scale benchmark suite for addressing the data scarcity issue in training multimodal language models to understand music.

Distillation of Discrete Diffusion through Dimensional Correlations

1 code implementation 11 Oct 2024 Satoshi Hayakawa, Yuhta Takida, Masaaki Imaizumi, Hiromi Wakaki, Yuki Mitsufuji

Diffusion models have demonstrated exceptional performances in various fields of generative modeling, but suffer from slow sampling speed due to their iterative nature.

GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models

1 code implementation 8 Oct 2024 M. Jehanzeb Mirza, Mengjie Zhao, Zhuoyuan Mao, Sivan Doveh, Wei Lin, Paul Gavrikov, Michael Dorkenwald, Shiqi Yang, Saurav Jha, Hiromi Wakaki, Yuki Mitsufuji, Horst Possegger, Rogerio Feris, Leonid Karlinsky, James Glass

In each respective optimization step, the ranked prompts are fed as in-context examples (with their accuracies) to equip the LLM with the knowledge of the type of text prompts preferred by the downstream VLM.

Zero-Shot Learning

ComperDial: Commonsense Persona-grounded Dialogue Dataset and Benchmark

no code implementations 17 Jun 2024 Hiromi Wakaki, Yuki Mitsufuji, Yoshinori Maeda, Yukiko Nishimura, Silin Gao, Mengjie Zhao, Keiichi Yamada, Antoine Bosselut

We propose a new benchmark, ComperDial, which facilitates the training and evaluation of evaluation metrics for open-domain dialogue systems.

Few-shot Dialogue Strategy Learning for Motivational Interviewing via Inductive Reasoning

no code implementations 23 Mar 2024 Zhouhang Xie, Bodhisattwa Prasad Majumder, Mengjie Zhao, Yoshinori Maeda, Keiichi Yamada, Hiromi Wakaki, Julian McAuley

We consider the task of building a dialogue system that can motivate users to adopt positive lifestyle changes: Motivational Interviewing.

Instruction Following

DiffuCOMET: Contextual Commonsense Knowledge Diffusion

1 code implementation 26 Feb 2024 Silin Gao, Mete Ismayilzada, Mengjie Zhao, Hiromi Wakaki, Yuki Mitsufuji, Antoine Bosselut

Inferring contextually-relevant and diverse commonsense to understand narratives remains challenging for knowledge models.

Diversity

Using Natural Language Inference to Improve Persona Extraction from Dialogue in a New Domain

no code implementations 12 Jan 2024 Alexandra DeLucia, Mengjie Zhao, Yoshinori Maeda, Makoto Yoda, Keiichi Yamada, Hiromi Wakaki

To address both these issues, we introduce a natural language inference method for post-hoc adapting a trained persona extraction model to a new setting.

Diversity, Natural Language Inference +1

Towards reporting bias in visual-language datasets: bimodal augmentation by decoupling object-attribute association

no code implementations 2 Oct 2023 Qiyu Wu, Mengjie Zhao, Yutong He, Lang Huang, Junya Ono, Hiromi Wakaki, Yuki Mitsufuji

In this paper, we focus on the wide existence of reporting bias in visual-language datasets, embodied as the object-attribute association, which can subsequentially degrade models trained on them.

Attribute, Object

PeaCoK: Persona Commonsense Knowledge for Consistent and Engaging Narratives

1 code implementation 3 May 2023 Silin Gao, Beatriz Borges, Soyoung Oh, Deniz Bayazit, Saya Kanno, Hiromi Wakaki, Yuki Mitsufuji, Antoine Bosselut

They must also learn to maintain consistent speaker personas for themselves throughout the narrative, so that their counterparts feel involved in a realistic conversation or story.

Knowledge Graphs, World Knowledge

ComFact: A Benchmark for Linking Contextual Commonsense Knowledge

1 code implementation 23 Oct 2022 Silin Gao, Jena D. Hwang, Saya Kanno, Hiromi Wakaki, Yuki Mitsufuji, Antoine Bosselut

Understanding rich narratives, such as dialogues and stories, often requires natural language processing systems to access relevant knowledge from commonsense knowledge graphs.

Knowledge Graphs, Response Generation +1
