no code implementations • 15 Aug 2023 • Yuya Yoshikawa, Yutaro Shigeto, Masashi Shimbo, Akikazu Takeuchi
The Meta Video Dataset (MetaVD) provides annotated relations between action classes in major datasets for human action recognition in videos.
1 code implementation • CVPR 2023 • Yutaro Shigeto, Masashi Shimbo, Yuya Yoshikawa, Akikazu Takeuchi
Barlow Twins and VICReg are self-supervised representation learning models that use regularizers to decorrelate features.
1 code implementation • Computer Vision and Image Understanding 2021 • Yuya Yoshikawa, Yutaro Shigeto, Akikazu Takeuchi
To realize this solution, we constructed a meta video dataset, referred to as MetaVD, from existing datasets for human action recognition.
no code implementations • LREC 2020 • Yutaro Shigeto, Yuya Yoshikawa, Jiaqing Lin, Akikazu Takeuchi
Each caption in our dataset describes a video in the form of "who does what and where."
1 code implementation • 12 Apr 2018 • Yuya Yoshikawa, Jiaqing Lin, Akikazu Takeuchi
We introduce STAIR Actions, a new large-scale video dataset for human action recognition.
1 code implementation • ACL 2017 • Yuya Yoshikawa, Yutaro Shigeto, Akikazu Takeuchi
In recent years, the automatic generation of image descriptions (captions), that is, image captioning, has attracted a great deal of attention.