no code implementations • 8 Oct 2024 • Takara Taniguchi, Ryosuke Furuta
To overcome these challenges, we propose a data augmentation method in feature space to increase the variation of the query.
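The snippet above gives no implementation details; as a rough, hypothetical sketch of the general idea, feature-space augmentation can be as simple as jittering query embeddings with Gaussian noise (function and parameter names are ours, not the paper's):

```python
# Hypothetical sketch of feature-space augmentation (not the authors' code):
# create extra training variants by perturbing each query embedding with
# small Gaussian noise, instead of augmenting the input images themselves.
import torch

def augment_query_features(queries: torch.Tensor, n_aug: int = 4, sigma: float = 0.1) -> torch.Tensor:
    """queries: (B, D) embeddings; returns (B * n_aug, D) noisy variants."""
    repeated = queries.repeat_interleave(n_aug, dim=0)  # (B * n_aug, D)
    noise = torch.randn_like(repeated) * sigma          # isotropic Gaussian jitter
    return repeated + noise
```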
no code implementations • 15 Sep 2024 • Nie Lin, Takehiko Ohkawa, Mingfang Zhang, Yifei HUANG, Ryosuke Furuta, Yoichi Sato
Our experiments demonstrate that our method outperforms conventional contrastive learning approaches that produce positive pairs solely from a single image with data augmentation.
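For context, the conventional baseline referred to here is the SimCLR-style scheme that forms a positive pair from two independent augmentations of the same image; a minimal sketch using standard torchvision transforms:

```python
# Minimal sketch of the conventional single-image positive-pair scheme
# (SimCLR-style) that the paper compares against.
import torchvision.transforms as T
from PIL import Image

augment = T.Compose([
    T.RandomResizedCrop(224),
    T.ColorJitter(0.4, 0.4, 0.4, 0.1),
    T.ToTensor(),
])

def positive_pair(image: Image.Image):
    # Two independent stochastic augmentations of one image form a positive pair.
    return augment(image), augment(image)
```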
1 code implementation • 10 Jul 2024 • Liangyang Ouyang, Ruicong Liu, Yifei HUANG, Ryosuke Furuta, Yoichi Sato
Experimental results on the VISOR dataset reveal that ActionVOS significantly reduces the mis-segmentation of inactive objects, confirming that actions help the ActionVOS model understand objects' involvement.
no code implementations • 2 May 2024 • Masatoshi Tateno, Takuma Yagi, Ryosuke Furuta, Yoichi Sato
We further accumulate the derived object states so that past state contexts are taken into account when inferring the current object-state pseudo-labels.
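As a hedged illustration (our sketch, not the authors' code) of what accumulating past state contexts can look like, one simple strategy carries forward the most recent confident state per object:

```python
# Sketch: fall back on the last confidently inferred state for each object
# when the current frame's inference is uncertain.
def accumulate_states(frame_preds, threshold=0.5):
    """frame_preds: iterable of (object_id, state, confidence) tuples per frame."""
    history = {}        # object_id -> last confident state
    pseudo_labels = []
    for obj_id, state, conf in frame_preds:
        if conf >= threshold:
            history[obj_id] = state              # update the accumulated context
        pseudo_labels.append((obj_id, history.get(obj_id, state)))
    return pseudo_labels
```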
no code implementations • 1 Feb 2024 • Takuma Yagi, Misaki Ohashi, Yifei HUANG, Ryosuke Furuta, Shungo Adachi, Toutai Mitsuyama, Yoichi Sato
The dataset consists of multi-view videos of 32 participants performing mock biological experiments, with a total duration of 14.5 hours.
2 code implementations • CVPR 2024 • Kristen Grauman, Andrew Westbury, Lorenzo Torresani, Kris Kitani, Jitendra Malik, Triantafyllos Afouras, Kumar Ashutosh, Vijay Baiyya, Siddhant Bansal, Bikram Boote, Eugene Byrne, Zach Chavis, Joya Chen, Feng Cheng, Fu-Jen Chu, Sean Crane, Avijit Dasgupta, Jing Dong, Maria Escobar, Cristhian Forigua, Abrham Gebreselasie, Sanjay Haresh, Jing Huang, Md Mohaiminul Islam, Suyog Jain, Rawal Khirodkar, Devansh Kukreja, Kevin J Liang, Jia-Wei Liu, Sagnik Majumder, Yongsen Mao, Miguel Martin, Effrosyni Mavroudi, Tushar Nagarajan, Francesco Ragusa, Santhosh Kumar Ramakrishnan, Luigi Seminara, Arjun Somayazulu, Yale Song, Shan Su, Zihui Xue, Edward Zhang, Jinxu Zhang, Angela Castillo, Changan Chen, Xinzhu Fu, Ryosuke Furuta, Cristina Gonzalez, Prince Gupta, Jiabo Hu, Yifei HUANG, Yiming Huang, Weslie Khoo, Anush Kumar, Robert Kuo, Sach Lakhavani, Miao Liu, Mi Luo, Zhengyi Luo, Brighid Meredith, Austin Miller, Oluwatumininu Oguntola, Xiaqing Pan, Penny Peng, Shraman Pramanick, Merey Ramazanova, Fiona Ryan, Wei Shan, Kiran Somasundaram, Chenan Song, Audrey Southerland, Masatoshi Tateno, Huiyu Wang, Yuchen Wang, Takuma Yagi, Mingfei Yan, Xitong Yang, Zecheng Yu, Shengxin Cindy Zha, Chen Zhao, Ziwei Zhao, Zhifan Zhu, Jeff Zhuo, Pablo Arbelaez, Gedas Bertasius, David Crandall, Dima Damen, Jakob Engel, Giovanni Maria Farinella, Antonino Furnari, Bernard Ghanem, Judy Hoffman, C. V. Jawahar, Richard Newcombe, Hyun Soo Park, James M. Rehg, Yoichi Sato, Manolis Savva, Jianbo Shi, Mike Zheng Shou, Michael Wray
We present Ego-Exo4D, a diverse, large-scale multimodal multiview video dataset and benchmark challenge.
no code implementations • 28 Nov 2023 • Takehiko Ohkawa, Takuma Yagi, Taichi Nishimura, Ryosuke Furuta, Atsushi Hashimoto, Yoshitaka Ushiku, Yoichi Sato
We propose a novel benchmark for cross-view knowledge transfer of dense video captioning, adapting models from web instructional videos with exocentric views to an egocentric view.
no code implementations • 30 Oct 2023 • Ryosuke Furuta, Yoichi Sato
In contrast to conventional domain generalization for object detection, which requires labeled data from multiple domains, SS-DGOD and WS-DGOD require labeled data from only one domain and unlabeled or weakly labeled data from multiple domains for training, as summarized below.
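To make the contrast concrete, the three training-data settings can be written out as follows (domain names are placeholders, not from the paper):

```python
# Training-data settings contrasted above; domain names are illustrative.
conventional_dg = {"labeled": ["domain_A", "domain_B", "domain_C"]}
ss_dgod = {"labeled": ["domain_A"], "unlabeled": ["domain_B", "domain_C"]}
ws_dgod = {"labeled": ["domain_A"], "weakly_labeled": ["domain_B", "domain_C"]}
```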
no code implementations • 9 Oct 2023 • Yuan Yin, Yifei HUANG, Ryosuke Furuta, Yoichi Sato
Point-level supervised temporal action localization (PTAL) aims to recognize and localize actions in untrimmed videos where only a single point (frame) within each action instance is annotated in the training data.
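A small illustration of this supervision format (field names are hypothetical): each action instance contributes only a single annotated frame, while full temporal boundaries must still be predicted at test time.

```python
# Point-level supervision: one annotated frame per action instance.
point_annotations = [
    {"video": "v001", "frame": 1532, "action": "pour water"},
    {"video": "v001", "frame": 2890, "action": "stir"},
]
# The full interval of each instance, e.g. (start=1490, end=1610),
# is never given during training and must be localized by the model.
```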
1 code implementation • 7 Feb 2023 • Zecheng Yu, Yifei HUANG, Ryosuke Furuta, Takuma Yagi, Yusuke Goutsu, Yoichi Sato
Object affordance is an important concept in hand-object interaction, providing information on action possibilities based on human motor capacity and objects' physical properties, thus benefiting tasks such as action anticipation and robot imitation learning.
no code implementations • 11 Jun 2022 • Zecheng Yu, Yifei HUANG, Ryosuke Furuta, Takuma Yagi, Yusuke Goutsu, Yoichi Sato
Object affordance is an important concept in human-object interaction, providing information on action possibilities based on human motor capacity and objects' physical properties, thus benefiting tasks such as action anticipation and robot imitation learning.
no code implementations • 5 Jun 2022 • Takehiko Ohkawa, Ryosuke Furuta, Yoichi Sato
In this survey, we present a systematic review of 3D hand pose estimation from the perspective of efficient annotation and learning.
no code implementations • 16 Mar 2022 • Takehiko Ohkawa, Yu-Jhe Li, Qichen Fu, Ryosuke Furuta, Kris M. Kitani, Yoichi Sato
We aim to improve the performance of regressing hand keypoints and segmenting pixel-level hand masks under new imaging conditions (e.g., outdoors) when we only have labeled images taken under very different conditions (e.g., indoors).
no code implementations • 28 Feb 2022 • Koya Tango, Takehiko Ohkawa, Ryosuke Furuta, Yoichi Sato
Detecting the positions of human hands and objects-in-contact (hand-object detection) in each video frame is vital for understanding human activities from videos.
no code implementations • 16 Jul 2021 • Yugo Shimizu, Ryosuke Furuta, Delong Ouyang, Yukinobu Taniguchi, Ryota Hinami, Shonosuke Ishiwatari
To realize consistent colorization, we propose a semi-automatic colorization method based on generative adversarial networks (GANs); the method learns the painting style of a specific comic from a small amount of training data.
1 code implementation • 16 Dec 2019 • Ryosuke Furuta, Naoto Inoue, Toshihiko Yamasaki
However, the applications of deep reinforcement learning (RL) to image processing are still limited.
1 code implementation • 10 Nov 2018 • Ryosuke Furuta, Naoto Inoue, Toshihiko Yamasaki
This paper tackles a new problem setting: reinforcement learning with pixel-wise rewards (pixelRL) for image processing.
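A minimal sketch of what a pixel-wise reward can look like in this setting (our assumption of a squared-error reward, consistent with the problem statement but not copied from the released code): each pixel receives its own reward measuring how much the agents' actions reduced that pixel's error.

```python
import numpy as np

def pixelwise_reward(prev_img: np.ndarray, next_img: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Returns an HxW reward map: per-pixel error before minus error after."""
    err_before = (target - prev_img) ** 2
    err_after = (target - next_img) ** 2
    return err_before - err_after  # positive wherever the pixel improved
```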
3 code implementations • CVPR 2018 • Naoto Inoue, Ryosuke Furuta, Toshihiko Yamasaki, Kiyoharu Aizawa
Can we detect common objects in a variety of image domains without instance-level annotations?
Ranked #5 on Weakly Supervised Object Detection on Watercolor2k (using extra training data)