no code implementations • 14 Mar 2024 • Jongsuk Kim, Hyeongkeun Lee, Kyeongha Rho, Junmo Kim, Joon Son Chung
Recent advancements in self-supervised audio-visual representation learning have demonstrated its potential to capture rich and comprehensive representations.
no code implementations • 27 Sep 2022 • Janghyeon Lee, Jongsuk Kim, Hyounguk Shon, Bumsoo Kim, Seung Hwan Kim, Honglak Lee, Junmo Kim
Pre-training vision-language models with contrastive objectives has shown promising results that are both scalable to large uncurated datasets and transferable to many downstream applications.