Search Results for author: Daiki Shimada

Found 2 papers, 0 papers with code

AVicuna: Audio-Visual LLM with Interleaver and Context-Boundary Alignment for Temporal Referential Dialogue

no code implementations • 24 Mar 2024 • Yunlong Tang, Daiki Shimada, Jing Bi, Chenliang Xu

In everyday communication, humans frequently use speech and gestures to refer to specific areas or objects, a process known as Referential Dialogue (RD).

Video Understanding

Dual Normalization Multitasking for Audio-Visual Sounding Object Localization

no code implementations • 1 Jun 2021 • Tokuhiro Nishikawa, Daiki Shimada, Jerry Jun Yokono

Although several research works have been reported on audio-visual sound source localization in unconstrained videos, no datasets and metrics have been proposed in the literature to quantitatively evaluate its performance.

Object, Object Localization
