Search Results for author: Wonjae Kim

Found 19 papers, 11 papers with code

Language-only Efficient Training of Zero-shot Composed Image Retrieval

1 code implementation · 4 Dec 2023 · Geonmo Gu, Sanghyuk Chun, Wonjae Kim, Yoohoon Kang, Sangdoo Yun

Our LinCIR (Language-only training for CIR) can be trained on text datasets alone via a novel self-supervision technique named self-masking projection (SMP); see the sketch below.

Image Retrieval · Retrieval · +1
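The snippet names SMP without describing it, so here is a minimal sketch of the idea as we read it, offered under our own assumptions rather than as LinCIR's actual code: a small projection head maps a frozen text encoder's latent back into token-embedding space, keyword tokens in a caption are replaced by that projected vector, and the original and masked captions are pulled toward the same latent. TextEncoderStub, proj, the keyword mask, and the cosine loss are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextEncoderStub(nn.Module):
    """Hypothetical stand-in for a frozen CLIP-style text tower."""
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.tok = nn.Embedding(vocab, dim)
        self.mix = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)

    def embed(self, ids):            # token ids -> token embeddings
        return self.tok(ids)

    def encode(self, emb):           # token embeddings -> one text latent
        return self.mix(emb).mean(dim=1)

dim = 64
encoder = TextEncoderStub(dim=dim).eval()
for p in encoder.parameters():       # frozen: training is language-only
    p.requires_grad_(False)

proj = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

ids = torch.randint(0, 1000, (8, 16))              # a batch of captions
keyword_mask = torch.zeros(8, 16, dtype=torch.bool)
keyword_mask[:, 3] = True                          # pretend token 3 is a keyword

emb = encoder.embed(ids)
z = encoder.encode(emb)                            # latent of the original caption

# Self-masking: swap keyword embeddings for the projected latent, then
# require the masked caption to encode back to the same latent.
z_tok = proj(z).unsqueeze(1).expand_as(emb)
masked = torch.where(keyword_mask.unsqueeze(-1), z_tok, emb)
z_masked = encoder.encode(masked)

loss = (1 - F.cosine_similarity(z, z_masked)).mean()
loss.backward()                                    # updates only `proj`
```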

STELLA: Continual Audio-Video Pre-training with Spatio-Temporal Localized Alignment

no code implementations · 12 Oct 2023 · Jaewoo Lee, Jaehong Yoon, Wonjae Kim, Yunji Kim, Sung Ju Hwang

Continuously learning a variety of audio-video semantics over time is crucial for audio-related reasoning tasks in our ever-evolving world.

Continual Learning · Representation Learning · +1

Computational Approaches for App-to-App Retrieval and Design Consistency Check

no code implementations · 19 Sep 2023 · SeokHyeon Park, Wonjae Kim, Young-Ho Kim, Jinwook Seo

Extracting semantic representations from mobile user interfaces (UIs) and using those representations in designers' decision-making processes has shown potential as an effective computational design support tool.

Decision Making · Retrieval

What Do Self-Supervised Vision Transformers Learn?

1 code implementation · 1 May 2023 · Namuk Park, Wonjae Kim, Byeongho Heo, Taekyung Kim, Sangdoo Yun

We present a comparative study on how and why contrastive learning (CL) and masked image modeling (MIM) differ in their representations and in their performance on downstream tasks.

Contrastive Learning

SeiT: Storage-Efficient Vision Training with Tokens Using 1% of Pixel Storage

1 code implementation · ICCV 2023 · Song Park, Sanghyuk Chun, Byeongho Heo, Wonjae Kim, Sangdoo Yun

We need billion-scale images to achieve more generalizable and ground-breaking vision models, as well as massive dataset storage to ship the images (e.g., the LAION-4B dataset needs 240TB of storage).

Continual Learning

UniXGen: A Unified Vision-Language Model for Multi-View Chest X-ray Generation and Report Generation

1 code implementation · 23 Feb 2023 · Hyungyung Lee, Da Young Lee, Wonjae Kim, Jin-Hwa Kim, Tackeun Kim, Jihang Kim, Leonard Sunwoo, Edward Choi

We also find that view-specific special tokens can distinguish between different views and properly generate specific views even when they do not exist in the dataset, and that utilizing multi-view chest X-rays faithfully captures abnormal findings in the additional X-rays; such view tokens are sketched below.

Language Modelling · Quantization
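The abstract credits view control to view-specific special tokens. As an illustration only (the token names and sequence layout below are our guesses, not UniXGen's vocabulary), prepending a learned per-view embedding to the generation prompt is one common way to implement such tokens:

```python
import torch
import torch.nn as nn

VIEWS = ["[AP]", "[PA]", "[LATERAL]"]   # hypothetical view-token names

class ViewPrompt(nn.Module):
    """Prepends a learned per-view embedding to a token sequence."""
    def __init__(self, dim=512):
        super().__init__()
        self.view_emb = nn.Embedding(len(VIEWS), dim)

    def forward(self, seq, view):       # seq: (B, N, dim) report/image tokens
        idx = torch.full((seq.size(0),), VIEWS.index(view), dtype=torch.long)
        v = self.view_emb(idx).unsqueeze(1)      # (B, 1, dim)
        # Every generated token can now attend to the requested view.
        return torch.cat([v, seq], dim=1)

prompt = ViewPrompt()
tokens = torch.randn(2, 32, 512)
conditioned = prompt(tokens, "[LATERAL]")        # (2, 33, 512)
```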

Group Generalized Mean Pooling for Vision Transformer

no code implementations · 8 Dec 2022 · Byungsoo Ko, Han-Gyu Kim, Byeongho Heo, Sangdoo Yun, Sanghyuk Chun, Geonmo Gu, Wonjae Kim

Since ViT already groups channels via its multi-head attention mechanism, grouping channels with GGeM lowers head-wise dependence while amplifying important channels in the activation maps (see the sketch below).

Image Retrieval · Representation Learning · +1
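GeM pooling raises activations to a learnable power p before averaging, and the snippet says GGeM applies this per channel group. A minimal sketch, assuming ViT tokens of shape (batch, tokens, channels) and one learnable exponent per group; the class name and shapes are our assumptions, not the paper's reference code:

```python
import torch
import torch.nn as nn

class GroupGeM(nn.Module):
    """Generalized Mean (GeM) pooling applied per channel group.

    GeM computes (mean_i x_i^p)^(1/p): p = 1 is average pooling and
    p -> infinity approaches max pooling. Channels are split into
    `groups` groups, each with its own learnable exponent.
    """
    def __init__(self, channels, groups, eps=1e-6):
        super().__init__()
        assert channels % groups == 0
        self.groups = groups
        self.p = nn.Parameter(torch.full((groups,), 3.0))
        self.eps = eps

    def forward(self, x):                    # x: (B, N, C) ViT tokens
        b, n, c = x.shape
        x = x.clamp(min=self.eps)            # GeM needs positive inputs
        x = x.view(b, n, self.groups, c // self.groups)
        p = self.p.view(1, 1, self.groups, 1)
        pooled = x.pow(p).mean(dim=1).pow(1.0 / p.squeeze(1))
        return pooled.reshape(b, c)          # one descriptor per image

tokens = torch.randn(2, 197, 768)            # e.g. ViT-B/16 output
pool = GroupGeM(channels=768, groups=12)     # 12 groups ~ 12 heads
desc = pool(tokens)                          # (2, 768)
```

Setting `groups` equal to the number of attention heads mirrors the head-wise grouping the snippet mentions.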

Correlation between Alignment-Uniformity and Performance of Dense Contrastive Representations

1 code implementation · 17 Oct 2022 · Jong Hak Moon, Wonjae Kim, Edward Choi

Recently, dense contrastive learning has shown superior performance on dense prediction tasks compared to instance-level contrastive learning.

Contrastive Learning
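The alignment and uniformity in the title are the standard representation-quality metrics of Wang and Isola (2020): alignment measures how close positive pairs are, and uniformity measures how evenly features spread over the unit hypersphere. A minimal sketch of both metrics (the dense, per-pixel pairing studied in the paper is elided; plain feature vectors are scored here):

```python
import torch
import torch.nn.functional as F

def alignment(x, y, alpha=2):
    """Mean powered distance between positive pairs; lower is better."""
    return (x - y).norm(dim=1).pow(alpha).mean()

def uniformity(x, t=2):
    """Log mean Gaussian potential over all pairs; lower is better."""
    return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log()

# Both metrics assume L2-normalized features on the unit hypersphere.
x = F.normalize(torch.randn(512, 128), dim=1)             # anchors
y = F.normalize(x + 0.1 * torch.randn(512, 128), dim=1)   # positives

print(alignment(x, y).item(), uniformity(x).item())
```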

ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision

5 code implementations · 5 Feb 2021 · Wonjae Kim, Bokyung Son, Ildoo Kim

Vision-and-Language Pre-training (VLP) has improved performance on various joint vision-and-language downstream tasks.

Cross-Modal Retrieval · Image Retrieval · +5

Diversified Mutual Learning for Deep Metric Learning

no code implementations · 9 Sep 2020 · Wonpyo Park, Wonjae Kim, Kihyun You, Minsu Cho

Mutual learning is an ensemble training strategy that improves generalization by training multiple models simultaneously and transferring their individual knowledge to one another; the canonical objective is sketched below.

Metric Learning · Transfer Learning
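The snippet describes mutual learning only in general terms. In the canonical formulation (deep mutual learning), each model minimizes its supervised loss plus a KL term that pulls its predictions toward each peer's. A minimal two-model sketch; the diversification this paper adds on top (per its title) is deliberately left out:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

net_a, net_b = nn.Linear(32, 10), nn.Linear(32, 10)   # toy peer models
opt = torch.optim.SGD(
    list(net_a.parameters()) + list(net_b.parameters()), lr=0.1)

x = torch.randn(64, 32)
y = torch.randint(0, 10, (64,))

def mutual_loss(own, peer, target):
    """Supervised CE plus KL toward the (detached) peer's predictions."""
    ce = F.cross_entropy(own, target)
    kl = F.kl_div(F.log_softmax(own, dim=1),
                  F.softmax(peer.detach(), dim=1),
                  reduction="batchmean")
    return ce + kl

logits_a, logits_b = net_a(x), net_b(x)
loss = mutual_loss(logits_a, logits_b, y) + mutual_loss(logits_b, logits_a, y)

opt.zero_grad()
loss.backward()
opt.step()
```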

Discrete Infomax Codes for Supervised Representation Learning

no code implementations · 28 May 2019 · Yoonho Lee, Wonjae Kim, Wonpyo Park, Seungjin Choi

In this paper we present a model that produces Discrete InfoMax Codes (DIMCO); we learn a probabilistic encoder that yields k-way d-dimensional codes associated with input data.

Meta-Learning · Metric Learning · +2
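The abstract itself pins down the output format: d-dimensional codes whose entries are each k-way categorical. A minimal encoder sketch under that reading; the paper's mutual-information objective is omitted, and all sizes here are arbitrary:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiscreteCodeEncoder(nn.Module):
    """Probabilistic encoder emitting d code variables, each k-way."""
    def __init__(self, in_dim=128, d=8, k=16):
        super().__init__()
        self.d, self.k = d, k
        self.net = nn.Linear(in_dim, d * k)

    def forward(self, x):
        logits = self.net(x).view(-1, self.d, self.k)
        probs = F.softmax(logits, dim=-1)    # a distribution per code slot
        codes = probs.argmax(dim=-1)         # hard d-dimensional code
        return probs, codes

enc = DiscreteCodeEncoder()
probs, codes = enc(torch.randn(4, 128))
print(codes.shape)    # torch.Size([4, 8]); each entry is in {0, ..., 15}
```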
