Search Results for author: Naoki Wake

Found 8 papers, 2 papers with code

Agent AI: Surveying the Horizons of Multimodal Interaction

1 code implementation • 7 Jan 2024 • Zane Durante, Qiuyuan Huang, Naoki Wake, Ran Gong, Jae Sung Park, Bidipta Sarkar, Rohan Taori, Yusuke Noda, Demetri Terzopoulos, Yejin Choi, Katsushi Ikeuchi, Hoi Vo, Li Fei-Fei, Jianfeng Gao

To accelerate research on agent-based multimodal intelligence, we define "Agent AI" as a class of interactive systems that can perceive visual stimuli, language inputs, and other environmentally-grounded data, and can produce meaningful embodied actions.
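A minimal sketch of what such an agent interface could look like in code; the class and method names below are illustrative assumptions for this listing, not part of the paper.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Observation:
    """One multimodal input bundle: a camera frame, a user utterance, and extra sensor state."""
    image: bytes
    text: str
    state: dict


class AgentAI(Protocol):
    """Illustrative interface for an interactive, embodied agent as defined above."""

    def perceive(self, obs: Observation) -> dict:
        """Fuse visual, language, and environment data into an internal representation."""
        ...

    def act(self, belief: dict) -> str:
        """Produce a meaningful embodied action (e.g. a robot command or an utterance)."""
        ...
```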

GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration

no code implementations • 20 Nov 2023 • Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

The computation starts by analyzing the videos with GPT-4V to convert environmental and action details into text, followed by a GPT-4-empowered task planner.

Language Modelling, Object, +1
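Below is a hedged sketch of that two-stage pipeline using the OpenAI Python client; the model names, prompts, and frame handling are assumptions for illustration, not the authors' implementation.

```python
# Stage 1: a vision-language model turns demonstration frames into a textual
# description of the environment and actions. Stage 2: a text-only model turns
# that description into a task plan. Model names and prompts are assumptions.
import base64

from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment


def describe_demo(frame_paths: list[str]) -> str:
    """Summarize the environment and human actions seen in sampled video frames."""
    content = [{"type": "text",
                "text": "Describe the objects, their layout, and the human's actions."}]
    for path in frame_paths:
        with open(path, "rb") as f:
            b64 = base64.b64encode(f.read()).decode()
        content.append({"type": "image_url",
                        "image_url": {"url": f"data:image/jpeg;base64,{b64}"}})
    resp = client.chat.completions.create(
        model="gpt-4o",  # stand-in for GPT-4V(ision)
        messages=[{"role": "user", "content": content}],
    )
    return resp.choices[0].message.content


def plan_tasks(description: str) -> str:
    """Ask a text-only model to emit a numbered, executable robot task plan."""
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "You are a robot task planner. Output a numbered list of executable steps."},
            {"role": "user", "content": description},
        ],
    )
    return resp.choices[0].message.content


# Example: plan = plan_tasks(describe_demo(["frame_000.jpg", "frame_030.jpg"]))
```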

Bias in Emotion Recognition with ChatGPT

no code implementations • 18 Oct 2023 • Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

This technical report explores the ability of ChatGPT to recognize emotions from text, which can serve as the basis for applications such as interactive chatbots, data annotation, and mental health analysis.

Emotion Recognition, Sentiment Analysis
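A minimal sketch of the kind of probe such a study implies: asking a ChatGPT model to assign one emotion label per sentence, then comparing labels across variants of the same text. The label set, prompt, and model name are assumptions, not the report's protocol.

```python
# Illustrative emotion-labeling probe; not the report's actual setup.
from openai import OpenAI

client = OpenAI()

LABELS = ["joy", "sadness", "anger", "fear", "surprise", "neutral"]  # assumed label set


def classify_emotion(text: str) -> str:
    """Return one emotion label for the given text."""
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        temperature=0,  # keep labeling as deterministic as possible for comparisons
        messages=[
            {"role": "system",
             "content": "Classify the emotion of the user's text. "
                        f"Answer with exactly one of: {', '.join(LABELS)}."},
            {"role": "user", "content": text},
        ],
    )
    return resp.choices[0].message.content.strip().lower()


# Comparing labels for paraphrases that differ only in surface attributes
# (e.g. a speaker's name) is one simple way to surface bias in the outputs.
print(classify_emotion("I finally passed the exam!"))
```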

Text-driven object affordance for guiding grasp-type recognition in multimodal robot teaching

1 code implementation • 27 Feb 2021 • Naoki Wake, Daichi Saito, Kazuhiro Sasabuchi, Hideki Koike, Katsushi Ikeuchi

These findings highlight the significance of object affordance in multimodal robot teaching, regardless of whether real objects are present in the images.

Mixed Reality, Object, +1

Understanding Action Sequences based on Video Captioning for Learning-from-Observation

no code implementations • 9 Dec 2020 • Iori Yanokura, Naoki Wake, Kazuhiro Sasabuchi, Katsushi Ikeuchi, Masayuki Inaba

We propose a Learning-from-Observation framework that splits and understands a video of a human demonstration with verbal instructions to extract accurate action sequences.

Video Captioning, Video Understanding

Learning-from-Observation Framework: One-Shot Robot Teaching for Grasp-Manipulation-Release Household Operations

no code implementations • 4 Aug 2020 • Naoki Wake, Riku Arakawa, Iori Yanokura, Takuya Kiyokawa, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

In the context of one-shot robot teaching, the contributions of the paper are: 1) to propose a framework that a) covers various tasks in the grasp-manipulation-release class of household operations and b) mimics human postures during the operations.

Robotics, Human-Computer Interaction
