no code implementations • 21 Nov 2023 • Sumin Lee, Sangmin Woo, Muhammad Adi Nugroho, Changick Kim
CFEM incorporates separate learnable query embeddings for each modality, which guide CFEM to extract complementary information and global action content from the other modalities.
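The idea of per-modality learnable queries attending over the other modalities can be sketched as a small cross-attention pooling step. This is a hypothetical illustration in numpy, not the paper's actual CFEM code; the modality names, dimensions, and single-vector queries are assumptions for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
d = 8          # feature dimension (assumed)
n_tokens = 5   # tokens per modality (assumed)

# one learnable query per modality (modality names are illustrative)
queries = {m: rng.standard_normal(d) for m in ["rgb", "depth", "ir"]}
feats = {m: rng.standard_normal((n_tokens, d)) for m in queries}

def cross_attend(query, other_feats):
    # query: (d,), other_feats: (n, d) -> pooled complementary feature (d,)
    scores = softmax(other_feats @ query / np.sqrt(d))
    return scores @ other_feats

# each modality's query attends only over the *other* modalities' features,
# pooling complementary information into one vector per modality
out = {m: cross_attend(queries[m],
                       np.concatenate([f for k, f in feats.items() if k != m]))
       for m in queries}
print(out["rgb"].shape)  # (8,)
```

In a trained model the query vectors would be parameters updated by backpropagation; here they are random placeholders to show the data flow.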
no code implementations • ICCV 2023 • Muhammad Adi Nugroho, Sangmin Woo, Sumin Lee, Changick Kim
To address this issue, we propose Audio-Visual Glance Network (AVGN), which leverages the commonly available audio and visual modalities to efficiently process the spatio-temporally important parts of a video.
1 code implementation • 25 Nov 2022 • Sangmin Woo, Sumin Lee, Yeonju Park, Muhammad Adi Nugroho, Changick Kim
We ask: how can we train a model that is robust to missing modalities?
no code implementations • 24 Aug 2022 • Sumin Lee, Sangmin Woo, Yeonju Park, Muhammad Adi Nugroho, Changick Kim
In multi-modal action recognition, it is important to consider not only the complementary nature of different modalities but also global action content.
no code implementations • 5 Jul 2022 • Agus Gunawan, Muhammad Adi Nugroho, Se Jin Park
We explore a different direction: improving real image denoising through a better learning strategy that enables test-time adaptation on the multi-task network.