no code implementations • 28 Aug 2024 • Wei-Jhe Huang, Min-Hung Chen, Shang-Hong Lai
In this paper, we aim to adapt pretrained image-language models to detect unseen actions.
1 code implementation • 10 Apr 2023 • Wei-Jhe Huang, Jheng-Hsien Yeh, Min-Hung Chen, Gueter Josmy Faure, Shang-Hong Lai
Finally, we calculate the similarity between the interaction feature and the text feature for each label to determine the action category.
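The final step described above can be sketched as a nearest-label lookup. This is a minimal illustration, not the authors' implementation: the excerpt only says "similarity", so cosine similarity is an assumption here, and the function name `classify_action`, its parameters, and the toy features are all hypothetical.

```python
import numpy as np


def classify_action(interaction_feature, text_features, labels):
    """Pick the label whose text feature is most similar to the interaction feature.

    Assumption: similarity is cosine similarity; the source does not specify it.
    interaction_feature: shape (d,)
    text_features: shape (num_labels, d), one row per label
    labels: list of num_labels label names
    """
    # L2-normalize both sides so the dot product equals cosine similarity.
    v = interaction_feature / np.linalg.norm(interaction_feature)
    t = text_features / np.linalg.norm(text_features, axis=1, keepdims=True)
    scores = t @ v  # one similarity score per label, shape (num_labels,)
    return labels[int(np.argmax(scores))], scores


# Toy usage with made-up 3-dimensional features (illustrative values only).
labels = ["run", "jump", "sit"]
text_features = np.eye(3)
interaction_feature = np.array([0.1, 0.9, 0.2])
predicted_label, scores = classify_action(interaction_feature, text_features, labels)
```

Because the label set enters only through the text features, a new action category can be scored by adding a row for its text feature, without changing the classifier itself.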