1 code implementation • 4 Dec 2023 • Geonmo Gu, Sanghyuk Chun, Wonjae Kim, Yoohoon Kang, Sangdoo Yun
Our LinCIR (Language-only training for CIR) can be trained on text datasets alone via a novel self-supervision technique named self-masking projection (SMP).
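The self-masking idea can be caricatured in a few lines: a toy numpy sketch, assuming a frozen mean-pooling "encoder" and a plain linear projection in place of LinCIR's actual text encoder and projection module. The names (`encode`, `smp_loss`) and all hyperparameters here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, VOCAB = 16, 100
emb = rng.standard_normal((VOCAB, DIM))  # frozen token embeddings (toy)

def encode(vectors):
    """Stand-in for a frozen text encoder: mean-pool token vectors."""
    return np.mean(vectors, axis=0)

def smp_loss(W, tokens, keyword_pos):
    """Illustrative self-masking projection loss: encode the caption,
    replace the keyword slots with a projection of that latent,
    re-encode, and penalise the distance to the original latent."""
    z = encode(emb[tokens])              # original caption latent
    masked = emb[tokens].copy()
    masked[keyword_pos] = W @ z          # projected latent fills keyword slots
    z_hat = encode(masked)
    return np.sum((z_hat - z) ** 2), z, z_hat

tokens = rng.integers(0, VOCAB, size=8)
keywords = [2, 5]                        # pretend these positions hold the keywords

W = rng.standard_normal((DIM, DIM)) * 0.1
losses = []
for _ in range(200):                     # plain gradient descent on the projection
    loss, z, z_hat = smp_loss(W, tokens, keywords)
    losses.append(loss)
    grad = 2 * np.outer(z_hat - z, z) * (len(keywords) / len(tokens))
    W -= 0.5 * grad
```

Because everything needed is derivable from text alone (caption, masked caption, latents), no image supervision enters the loop, which is the point of language-only CIR training.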
no code implementations • 12 Oct 2023 • Jaewoo Lee, Jaehong Yoon, Wonjae Kim, Yunji Kim, Sung Ju Hwang
Continuously learning a variety of audio-video semantics over time is crucial for audio-related reasoning tasks in our ever-evolving world.
no code implementations • 19 Sep 2023 • SeokHyeon Park, Wonjae Kim, Young-Ho Kim, Jinwook Seo
Extracting semantic representations from mobile user interfaces (UIs) and using those representations in designers' decision-making processes have shown potential as effective computational design support tools.
1 code implementation • 1 May 2023 • Namuk Park, Wonjae Kim, Byeongho Heo, Taekyung Kim, Sangdoo Yun
We present a comparative study on how and why contrastive learning (CL) and masked image modeling (MIM) differ in their representations and in their performance of downstream tasks.
1 code implementation • 21 Mar 2023 • Geonmo Gu, Sanghyuk Chun, Wonjae Kim, HeeJae Jun, Yoohoon Kang, Sangdoo Yun
This paper proposes a novel diffusion-based model, CompoDiff, for solving zero-shot Composed Image Retrieval (ZS-CIR) with latent diffusion.
Ranked #3 on Zero-Shot Composed Image Retrieval (ZS-CIR) on CIRCO
1 code implementation • ICCV 2023 • Song Park, Sanghyuk Chun, Byeongho Heo, Wonjae Kim, Sangdoo Yun
We need billion-scale images to achieve more generalizable and ground-breaking vision models, as well as massive dataset storage to ship the images (e.g., the LAION-4B dataset needs 240 TB of storage space).
1 code implementation • 23 Feb 2023 • Hyungyung Lee, Da Young Lee, Wonjae Kim, Jin-Hwa Kim, Tackeun Kim, Jihang Kim, Leonard Sunwoo, Edward Choi
We also find that view-specific special tokens can distinguish between different views and properly generate specific views even when they are absent from the dataset, and that utilizing multi-view chest X-rays faithfully captures abnormal findings in the additional X-rays.
no code implementations • 8 Dec 2022 • Byungsoo Ko, Han-Gyu Kim, Byeongho Heo, Sangdoo Yun, Sanghyuk Chun, Geonmo Gu, Wonjae Kim
As ViT groups the channels via a multi-head attention mechanism, grouping the channels by GGeM leads to lower head-wise dependence while amplifying important channels on the activation maps.
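The head-aligned pooling can be sketched in a few lines of numpy: a minimal illustration of group GeM, where channels are split into groups (one per attention head) and each group is pooled with its own generalized-mean exponent. The function names and the default exponent `p = 3.0` are illustrative assumptions, not the paper's code.

```python
import numpy as np

def gem(x, p=3.0, eps=1e-6):
    """Generalized mean pooling over tokens: x has shape (num_tokens, channels)."""
    return np.power(np.clip(x, eps, None), p).mean(axis=0) ** (1.0 / p)

def ggem(x, num_groups, p=None, eps=1e-6):
    """Group GeM: split channels into head-aligned groups, pool each
    group with its own exponent, and concatenate the results."""
    n, c = x.shape
    assert c % num_groups == 0, "channels must divide evenly into groups"
    if p is None:
        p = np.full(num_groups, 3.0)   # one learnable exponent per group in practice
    groups = x.reshape(n, num_groups, c // num_groups)
    pooled = [gem(groups[:, g, :], p[g], eps) for g in range(num_groups)]
    return np.concatenate(pooled)       # shape (channels,)
```

With `num_groups = 1` this reduces to ordinary GeM, and with `p = 1` to plain average pooling, which makes the head-wise grouping the only new ingredient.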
no code implementations • 7 Dec 2022 • Kyuyong Shin, Hanock Kwak, Wonjae Kim, Jisu Jeong, Seungjae Jung, Kyung-Min Kim, Jung-Woo Ha, Sang-Woo Lee
Recent studies have proposed unified user modeling frameworks that leverage user behavior data from various applications.
1 code implementation • 17 Oct 2022 • Jong Hak Moon, Wonjae Kim, Edward Choi
Recently, dense contrastive learning has shown superior performance on dense prediction tasks compared to instance-level contrastive learning.
1 code implementation • 17 Apr 2022 • Hwanjun Song, Deqing Sun, Sanghyuk Chun, Varun Jampani, Dongyoon Han, Byeongho Heo, Wonjae Kim, Ming-Hsuan Yang
Transformers have been widely used in numerous vision problems, especially for visual recognition and detection.
2 code implementations • 7 Apr 2022 • Sanghyuk Chun, Wonjae Kim, Song Park, Minsuk Chang, Seong Joon Oh
Image-Text matching (ITM) is a common task for evaluating the quality of Vision and Language (VL) models.
no code implementations • 24 Oct 2021 • Jiyoung Lee, Wonjae Kim, Daehoon Gwak, Edward Choi
Periodic signals play an important role in daily life.
1 code implementation • ICLR 2022 • Hwanjun Song, Deqing Sun, Sanghyuk Chun, Varun Jampani, Dongyoon Han, Byeongho Heo, Wonjae Kim, Ming-Hsuan Yang
Transformers are transforming the landscape of computer vision, especially for recognition tasks.
Ranked #12 on Object Detection on COCO 2017 val
5 code implementations • 5 Feb 2021 • Wonjae Kim, Bokyung Son, Ildoo Kim
Vision-and-Language Pre-training (VLP) has improved performance on various joint vision-and-language downstream tasks.
Ranked #2 on Image Retrieval on PhotoChat
no code implementations • 9 Sep 2020 • Wonpyo Park, Wonjae Kim, Kihyun You, Minsu Cho
Mutual learning is an ensemble training strategy that improves generalization by having multiple simultaneously trained models transfer their individual knowledge to one another.
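The strategy can be sketched minimally with two linear classifiers trained together in numpy: each model minimizes cross-entropy to the labels plus a KL term pulling it toward its peer's current predictions (the standard deep-mutual-learning recipe, not this paper's specific variant). The function names and the mixing weight `alpha` are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mutual_step(W1, W2, X, y_onehot, lr=0.1, alpha=0.5):
    """One step of (simplified) mutual learning for two linear classifiers."""
    p1, p2 = softmax(X @ W1), softmax(X @ W2)
    # The gradient of CE + alpha * KL(peer || self) w.r.t. the logits is
    # (p - target), where the target mixes the labels with the peer's
    # (detached) predictions.
    t1 = (1 - alpha) * y_onehot + alpha * p2
    t2 = (1 - alpha) * y_onehot + alpha * p1
    W1 -= lr * X.T @ (p1 - t1) / len(X)
    W2 -= lr * X.T @ (p2 - t2) / len(X)
    return W1, W2
```

Each model acts as a soft teacher for the other, so the pair regularizes itself without a separate pretrained teacher.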
no code implementations • 25 Sep 2019 • Yoonho Lee, Wonjae Kim, Seungjin Choi
This paper analyzes how generalization works in meta-learning.
no code implementations • 28 May 2019 • Yoonho Lee, Wonjae Kim, Wonpyo Park, Seungjin Choi
In this paper, we present a model that produces Discrete InfoMax Codes (DIMCO); we learn a probabilistic encoder that yields k-way, d-dimensional codes associated with input data.
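The shape of such a code can be sketched concretely: a toy numpy encoder that maps an input to d independent k-way categorical distributions and reads off the discrete code as the per-dimension argmax. The linear map and the function name `encode_dimco` are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def encode_dimco(x, W, k, d):
    """Toy probabilistic encoder: map input x to d independent k-way
    categorical distributions, then take the argmax as the discrete code."""
    logits = (x @ W).reshape(d, k)       # (d, k) logits
    probs = softmax(logits, axis=-1)     # d categorical distributions over k values
    code = probs.argmax(axis=-1)         # the k-way, d-dimensional discrete code
    return probs, code
```

A code of this shape has k**d possible values while needing only d * k logits, which is what makes the compact discrete representation attractive.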
1 code implementation • NeurIPS 2019 • Wonjae Kim, Yoonho Lee
Without relevant human priors, neural networks may learn uninterpretable features.