1 code implementation • ACL (mmsr, IWCS) 2021 • Riko Suzuki, Hitomi Yanaka, Koji Mineshima, Daisuke Bekki
This paper introduces a new video-and-language dataset with human actions for multimodal logical inference, focusing on intentional and aspectual expressions that describe dynamic human actions.
no code implementations • ACL 2019 • Riko Suzuki, Hitomi Yanaka, Masashi Yoshikawa, Koji Mineshima, Daisuke Bekki
A large body of research on multimodal inference across text and vision has recently emerged, aiming to obtain visually grounded word and sentence representations.