1 code implementation • 29 May 2023 • Sanghyuk Chun
In addition, two optimization techniques are proposed to further enhance PCME++: first, incorporating pseudo-positives to prevent loss saturation under massive false negatives; second, mixed sample data augmentation for probabilistic matching.
1 code implementation • 21 Apr 2023 • Seulki Park, Daeho Um, Hajung Yoon, Sanghyuk Chun, Sangdoo Yun, Jin Young Choi
In this paper, we propose a robustness benchmark for image-text matching models to assess their vulnerabilities.
1 code implementation • 10 Apr 2023 • Gyeongsik Moon, Hongsuk Choi, Sanghyuk Chun, Jiyoung Lee, Sangdoo Yun
Recovering a 3D human mesh in the wild is highly challenging because in-the-wild (ITW) datasets provide only 2D pose ground truths (GTs).
Ranked #5 on 3D Multi-Person Pose Estimation on MuPoTS-3D
1 code implementation • 21 Mar 2023 • Geonmo Gu, Sanghyuk Chun, Wonjae Kim, HeeJae Jun, Yoohoon Kang, Sangdoo Yun
This paper proposes a novel diffusion-based model, CompoDiff, for solving Composed Image Retrieval (CIR) with latent diffusion, and presents a newly created dataset of 18 million triplets of reference images, conditions, and corresponding target images to train the model.
1 code implementation • ICCV 2023 • Song Park, Sanghyuk Chun, Byeongho Heo, Wonjae Kim, Sangdoo Yun
We need billion-scale images to achieve more generalizable and ground-breaking vision models, as well as massive dataset storage to ship the images (e.g., the LAION-4B dataset needs 240 TB of storage space).
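A back-of-the-envelope figure (my arithmetic, not a number from the paper): 240 TB spread over roughly 4 billion images works out to about 240×10^12 B / 4×10^9 ≈ 60 KB per image on average, which is the per-sample storage budget in question.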
no code implementations • 1 Mar 2023 • Sangwon Jung, TaeEon Park, Sanghyuk Chun, Taesup Moon
Many existing group-fairness-aware training methods aim to achieve group fairness by either re-weighting underrepresented groups based on certain rules or using weakly approximated surrogates for the fairness metrics as regularization terms in the objective.
no code implementations • 8 Dec 2022 • Byungsoo Ko, Han-Gyu Kim, Byeongho Heo, Sangdoo Yun, Sanghyuk Chun, Geonmo Gu, Wonjae Kim
Since ViT groups channels via its multi-head attention mechanism, grouping the channels in GGeM leads to lower head-wise dependence while amplifying important channels in the activation maps.
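Purely as an illustration, a minimal sketch of generalized-mean (GeM) pooling applied per channel group (one group per head); the parameter values and the exact GGeM formulation are assumptions, not taken from the paper.

```python
import numpy as np

def gem_pool(x, p=3.0, eps=1e-6):
    """Generalized-mean pooling over the token axis.
    x: (tokens, channels); p=1 recovers average pooling, large p approaches max pooling."""
    return np.power(np.mean(np.power(np.clip(x, eps, None), p), axis=0), 1.0 / p)

def grouped_gem_pool(x, num_groups, p=3.0):
    """Apply GeM independently to each channel group (e.g., one group per attention head)."""
    tokens, channels = x.shape
    assert channels % num_groups == 0
    groups = x.reshape(tokens, num_groups, channels // num_groups)
    pooled = [gem_pool(groups[:, g, :], p) for g in range(num_groups)]
    return np.concatenate(pooled)  # (channels,) global descriptor

# toy usage: 196 patch tokens, 768 channels, 12 heads
feat = np.random.rand(196, 768).astype(np.float32)
print(grouped_gem_pool(feat, num_groups=12).shape)  # (768,)
```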
no code implementations • 20 Oct 2022 • Jaehui Hwang, Dongyoon Han, Byeongho Heo, Song Park, Sanghyuk Chun, Jong-Seok Lee
In recent years, a huge number of deep neural architectures have been developed for image classification.
1 code implementation • 21 Aug 2022 • Chanwoo Park, Sangdoo Yun, Sanghyuk Chun
Our theoretical results show that regardless of the choice of the mixing strategy, MSDA behaves as a pixel-level regularization of the underlying training loss and a regularization of the first layer parameters.
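For context, a minimal sketch of one common MSDA instance, mixup-style input and label interpolation; the paper analyzes mixing strategies of this kind in general, and the alpha value here is an arbitrary choice.

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.2, rng=None):
    """Mixup: convexly combine random pairs of inputs and their one-hot labels."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)          # mixing coefficient in (0, 1)
    perm = rng.permutation(len(x))        # random pairing within the batch
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mixed, y_mixed

# toy usage: 8 fake 32x32 RGB images, 10 classes
x = np.random.rand(8, 32, 32, 3)
y = np.eye(10)[np.random.randint(0, 10, size=8)]
xm, ym = mixup_batch(x, y)
```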
1 code implementation • 17 Apr 2022 • Hwanjun Song, Deqing Sun, Sanghyuk Chun, Varun Jampani, Dongyoon Han, Byeongho Heo, Wonjae Kim, Ming-Hsuan Yang
Transformers have been widely used in numerous vision problems especially for visual recognition and detection.
2 code implementations • 7 Apr 2022 • Sanghyuk Chun, Wonjae Kim, Song Park, Minsuk Chang, Seong Joon Oh
Image-Text matching (ITM) is a common task for evaluating the quality of Vision and Language (VL) models.
1 code implementation • 21 Mar 2022 • Junbum Cha, Kyungjae Lee, Sungrae Park, Sanghyuk Chun
Domain generalization (DG) aims to learn a model that generalizes to an unseen target domain using only limited source domains.
Ranked #1 on Domain Generalization on TerraIncognita
2 code implementations • 7 Feb 2022 • Saehyung Lee, Sanghyuk Chun, Sangwon Jung, Sangdoo Yun, Sungroh Yoon
However, in this study, we prove that the existing DC methods can perform worse than the random selection method when task-irrelevant information forms a significant part of the training dataset.
2 code implementations • 22 Dec 2021 • Song Park, Sanghyuk Chun, Junbum Cha, Bado Lee, Hyunjung Shim
Existing methods learn to disentangle style and content elements by developing a universal style representation for each font style.
1 code implementation • CVPR 2022 • Sangwon Jung, Sanghyuk Chun, Taesup Moon
To address this problem, we propose a simple Confidence-based Group Label assignment (CGL) strategy that is readily applicable to any fairness-aware learning method.
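A purely hypothetical sketch: one plausible reading of confidence-based group-label assignment is thresholded pseudo-labeling with an auxiliary group classifier; the threshold, the random fallback, and the function name are my assumptions, not necessarily the paper's exact CGL rule.

```python
import numpy as np

def assign_group_labels(group_probs, threshold=0.8, rng=None):
    """group_probs: (n_samples, n_groups) softmax outputs of an auxiliary group classifier.
    High-confidence predictions become pseudo group labels; the rest get random groups."""
    rng = rng or np.random.default_rng()
    conf = group_probs.max(axis=1)
    pred = group_probs.argmax(axis=1)
    random_labels = rng.integers(0, group_probs.shape[1], size=len(pred))
    return np.where(conf >= threshold, pred, random_labels)
```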
1 code implementation • ICLR 2022 • Hwanjun Song, Deqing Sun, Sanghyuk Chun, Varun Jampani, Dongyoon Han, Byeongho Heo, Wonjae Kim, Ming-Hsuan Yang
Transformers are transforming the landscape of computer vision, especially for recognition tasks.
Ranked #11 on Object Detection on COCO 2017 val
no code implementations • ICLR 2022 • Luca Scimeca, Seong Joon Oh, Sanghyuk Chun, Michael Poli, Sangdoo Yun
This phenomenon, also known as shortcut learning, is emerging as a key limitation of the current generation of machine learning models.
no code implementations • 29 Sep 2021 • Saehyung Lee, Hyungyu Lee, Sanghyuk Chun, Sungroh Yoon
Several recent studies have shown that the use of extra in-distribution data can lead to a high level of adversarial robustness.
no code implementations • 24 Aug 2021 • Sanghyuk Chun, Song Park
Hence, StyleAugment lets the model observe abundant confounding cues for each image through an on-the-fly augmentation strategy, while the augmented images are more realistic than artistic style-transferred images.
no code implementations • NeurIPS 2021 • Michael Poli, Stefano Massaroli, Luca Scimeca, Seong Joon Oh, Sanghyuk Chun, Atsushi Yamashita, Hajime Asama, Jinkyoo Park, Animesh Garg
Effective control and prediction of dynamical systems often require appropriate handling of continuous-time and discrete, event-triggered processes.
4 code implementations • ICCV 2021 • Song Park, Sanghyuk Chun, Junbum Cha, Bado Lee, Hyunjung Shim
MX-Font extracts multiple style features that are not explicitly conditioned on component labels but are instead learned automatically by multiple experts, each representing a different local concept, e.g., a left-side sub-glyph.
10 code implementations • ICCV 2021 • Byeongho Heo, Sangdoo Yun, Dongyoon Han, Sanghyuk Chun, Junsuk Choe, Seong Joon Oh
We empirically show that such a spatial dimension reduction is beneficial to a transformer architecture as well, and propose a novel Pooling-based Vision Transformer (PiT) upon the original ViT model (a rough token-pooling sketch follows below).
Ranked #310 on Image Classification on ImageNet
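Regarding the PiT entry above, a rough sketch of what pooling patch tokens between transformer stages can look like; this plain average pooling is an illustrative stand-in, not PiT's actual learned pooling layer.

```python
import numpy as np

def pool_patch_tokens(tokens, grid_size, stride=2):
    """Downsample ViT patch tokens spatially, as in pooling-based hierarchies.
    tokens: (grid_size*grid_size, channels) -> ((grid_size//stride)**2, channels)."""
    n, c = tokens.shape
    assert n == grid_size * grid_size
    fmap = tokens.reshape(grid_size, grid_size, c)
    new = grid_size // stride
    # stride x stride average pooling over the spatial grid
    fmap = fmap[:new * stride, :new * stride].reshape(new, stride, new, stride, c)
    pooled = fmap.mean(axis=(1, 3))
    return pooled.reshape(new * new, c)

# toy usage: 14x14 patch grid with 384 channels -> 7x7 grid
tokens = np.random.rand(196, 384)
print(pool_patch_tokens(tokens, grid_size=14).shape)  # (49, 384)
```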
4 code implementations • NeurIPS 2021 • Junbum Cha, Sanghyuk Chun, Kyungjae Lee, Han-Cheol Cho, Seunghyun Park, Yunsung Lee, Sungrae Park
Domain generalization (DG) methods aim to achieve generalizability to an unseen target domain by using only training data from the source domains.
Ranked #15 on Domain Generalization on TerraIncognita
2 code implementations • CVPR 2021 • Sangdoo Yun, Seong Joon Oh, Byeongho Heo, Dongyoon Han, Junsuk Choe, Sanghyuk Chun
However, they have not fixed the training set, presumably because of a formidable annotation cost.
Ranked #21 on Image Classification on OmniBenchmark
1 code implementation • CVPR 2021 • Sanghyuk Chun, Seong Joon Oh, Rafael Sampaio de Rezende, Yannis Kalantidis, Diane Larlus
Instead, we propose to use Probabilistic Cross-Modal Embedding (PCME), where samples from the different modalities are represented as probabilistic distributions in the common embedding space.
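A rough sketch of the probabilistic-embedding idea: each input is encoded as a Gaussian, and a match probability is estimated from distances between sampled embeddings. The constants a and b, the sample count, and the exact loss are assumptions here, not the paper's settings.

```python
import numpy as np

def sample_embeddings(mu, log_sigma_sq, k=8, rng=None):
    """Draw k samples from N(mu, diag(sigma^2)) for one input's embedding distribution."""
    rng = rng or np.random.default_rng()
    sigma = np.exp(0.5 * log_sigma_sq)
    return mu + sigma * rng.standard_normal((k, mu.shape[-1]))

def match_probability(mu_img, ls_img, mu_txt, ls_txt, a=1.0, b=0.0, k=8):
    """Monte-Carlo estimate of P(match) between an image and a caption:
    average sigmoid(-a * ||z_i - z_t|| + b) over sampled embedding pairs."""
    z_i = sample_embeddings(mu_img, ls_img, k)
    z_t = sample_embeddings(mu_txt, ls_txt, k)
    d = np.linalg.norm(z_i[:, None, :] - z_t[None, :, :], axis=-1)  # (k, k) distances
    return (1.0 / (1.0 + np.exp(a * d - b))).mean()
```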
3 code implementations • 23 Sep 2020 • Song Park, Sanghyuk Chun, Junbum Cha, Bado Lee, Hyunjung Shim
However, learning component-wise styles solely from reference glyphs is infeasible in the few-shot font generation scenario when a target script has a large number of components, e.g., over 200 for Chinese.
2 code implementations • 8 Jul 2020 • Junsuk Choe, Seong Joon Oh, Sanghyuk Chun, Seungho Lee, Zeynep Akata, Hyunjung Shim
In this paper, we argue that the WSOL task is ill-posed with only image-level labels, and propose a new evaluation protocol where full supervision is limited to a small held-out set that does not overlap with the test set.
4 code implementations • ICLR 2021 • Byeongho Heo, Sanghyuk Chun, Seong Joon Oh, Dongyoon Han, Sangdoo Yun, Gyuwan Kim, Youngjung Uh, Jung-Woo Ha
Because of the scale invariance, this modification only alters the effective step sizes without changing the effective update directions, thus enjoying the original convergence properties of GD optimizers.
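To unpack the scale-invariance point: for a weight followed by normalization, the loss depends only on the weight's direction, so scaling the weight by c scales its gradient by 1/c, and the gradient stays orthogonal to the weight. A small numerical check of these two facts on a toy scale-invariant loss (illustrative only; this is not the AdamP update rule).

```python
import numpy as np

def loss(w, target):
    """A toy scale-invariant loss: depends only on the direction of w."""
    u = w / np.linalg.norm(w)
    return 0.5 * np.sum((u - target) ** 2)

def num_grad(w, target, eps=1e-6):
    """Central-difference gradient, to avoid relying on a hand-derived formula."""
    g = np.zeros_like(w)
    for i in range(len(w)):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (loss(w + e, target) - loss(w - e, target)) / (2 * eps)
    return g

rng = np.random.default_rng(0)
w = rng.standard_normal(5)
t = rng.standard_normal(5)
t /= np.linalg.norm(t)

g1, g2 = num_grad(w, t), num_grad(2 * w, t)
print(np.allclose(g2, 0.5 * g1, atol=1e-5))   # scaling w by 2 halves the gradient
print(abs(np.dot(g1, w)) < 1e-6)              # gradient is orthogonal to w
# consequence: plain GD steps only grow ||w||, silently shrinking the effective step size
```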
3 code implementations • ECCV 2020 • Junbum Cha, Sanghyuk Chun, Gayoung Lee, Bado Lee, Seonghyeon Kim, Hwalsuk Lee
By utilizing the compositionality of compositional scripts, we propose a novel font generation framework, named Dual Memory-augmented Font Generation Network (DM-Font), which enables us to generate a high-quality font library with only a few samples.
no code implementations • 9 Mar 2020 • Sanghyuk Chun, Seong Joon Oh, Sangdoo Yun, Dongyoon Han, Junsuk Choe, Youngjoon Yoo
Despite apparent human-level performances of deep neural networks (DNN), they behave fundamentally differently from humans.
2 code implementations • CVPR 2020 • Junsuk Choe, Seong Joon Oh, Seungho Lee, Sanghyuk Chun, Zeynep Akata, Hyunjung Shim
In this paper, we argue that the WSOL task is ill-posed with only image-level labels, and propose a new evaluation protocol where full supervision is limited to a small held-out set that does not overlap with the test set.
no code implementations • 11 Nov 2019 • Minz Won, Sanghyuk Chun, Xavier Serra
Recently, we proposed a self-attention-based music tagging model.
Sound Audio and Speech Processing
no code implementations • 15 Oct 2019 • YoungJoon Yoo, Sanghyuk Chun, Sangdoo Yun, Jung-Woo Ha, Jaejun Yoo
We first assume that the priors of future samples can be generated in an independently and identically distributed (i.i.d.) manner.
3 code implementations • ICML 2020 • Hyojin Bahng, Sanghyuk Chun, Sangdoo Yun, Jaegul Choo, Seong Joon Oh
This tactic is feasible in many scenarios where it is much easier to define a set of biased representations than to define and quantify bias.
2 code implementations • 12 Jun 2019 • Minz Won, Sanghyuk Chun, Xavier Serra
In addition, we demonstrate the interpretability of the proposed architecture with a heat map visualization.
Sound Audio and Speech Processing
29 code implementations • ICCV 2019 • Sangdoo Yun, Dongyoon Han, Seong Joon Oh, Sanghyuk Chun, Junsuk Choe, Youngjoon Yoo
Regional dropout strategies have been proposed to enhance the performance of convolutional neural network classifiers (a minimal CutMix-style sketch follows below).
Ranked #1 on Out-of-Distribution Generalization on ImageNet-W
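For the CutMix entry above, a minimal sketch of the cut-and-mix idea: paste a random box from a shuffled batch partner and mix labels in proportion to the pasted area. The box sampling here is simplified relative to the official implementation.

```python
import numpy as np

def cutmix_batch(x, y_onehot, alpha=1.0, rng=None):
    """CutMix-style augmentation on a batch.
    x: (batch, H, W, C) images, y_onehot: (batch, classes) labels."""
    rng = rng or np.random.default_rng()
    b, h, w, _ = x.shape
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(b)

    # box whose area is roughly (1 - lam) of the image
    cut_h, cut_w = int(h * np.sqrt(1 - lam)), int(w * np.sqrt(1 - lam))
    cy, cx = rng.integers(0, h), rng.integers(0, w)
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)

    x_mixed = x.copy()
    x_mixed[:, y1:y2, x1:x2, :] = x[perm, y1:y2, x1:x2, :]
    lam_adj = 1.0 - (y2 - y1) * (x2 - x1) / (h * w)      # actual kept-area ratio
    y_mixed = lam_adj * y_onehot + (1.0 - lam_adj) * y_onehot[perm]
    return x_mixed, y_mixed
```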
no code implementations • ICLR 2019 • Jisung Hwang, Younghoon Kim, Sanghyuk Chun, Jaejun Yoo, Ji-Hoon Kim, Dongyoon Han, Jung-Woo Ha
The checkerboard phenomenon is one of the well-known visual artifacts in the computer vision field.
4 code implementations • ICCV 2019 • Jaejun Yoo, Youngjung Uh, Sanghyuk Chun, Byeongkyu Kang, Jung-Woo Ha
The key ingredient of our method is a wavelet transform that naturally fits into deep networks.
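For intuition, a sketch of a single-level 2D Haar decomposition, the kind of invertible transform that can stand in for pooling inside a network; how the paper actually wires wavelets into its architecture is not shown here.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar transform of a (H, W) array with even H, W.
    Returns four sub-bands, each (H/2, W/2); the transform is invertible, so no
    information is lost when the low-pass band replaces pooling."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    ll = (a + b + c + d) / 2.0
    lh = (a - b + c - d) / 2.0
    hl = (a + b - c - d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: exactly reconstructs the original array."""
    h, w = ll.shape
    x = np.zeros((2 * h, 2 * w))
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x
```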
1 code implementation • 21 Dec 2018 • Jang-Hyun Kim, Jaejun Yoo, Sanghyuk Chun, Adrian Kim, Jung-Woo Ha
We present a hybrid framework that leverages the trade-off between temporal and frequency precision in audio representations to improve the performance of the speech enhancement task (a multi-resolution STFT sketch follows below).
Sound Audio and Speech Processing
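A rough sketch of the underlying trade-off the entry above refers to: short analysis windows give fine temporal precision, long windows give fine frequency precision, and a hybrid model can consume several resolutions at once. The window sizes below are arbitrary, and only feature extraction is shown, not the enhancement network.

```python
import numpy as np
from scipy.signal import stft

def multi_resolution_spectrograms(wave, sr=16000, window_sizes=(256, 512, 1024)):
    """Compute magnitude spectrograms at several window lengths.
    Short windows -> better time precision; long windows -> better frequency precision."""
    specs = {}
    for n in window_sizes:
        _, _, z = stft(wave, fs=sr, nperseg=n, noverlap=n // 2)
        specs[n] = np.abs(z)  # (n // 2 + 1, num_frames)
    return specs

# toy usage: 1 second of noise at 16 kHz
wave = np.random.randn(16000)
for n, s in multi_resolution_spectrograms(wave).items():
    print(n, s.shape)
```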
no code implementations • 5 Mar 2015 • Sanghyuk Chun, Yung-Kyun Noh, Jinwoo Shin
Subspace clustering (SC) is a popular method for dimensionality reduction of high-dimensional data that generalizes Principal Component Analysis (PCA).
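To make the PCA connection concrete, a toy K-subspaces style alternation: assign each point to its nearest subspace, then refit each subspace with PCA (via SVD). This is a generic illustration of subspace clustering, not the specific algorithm analyzed in the paper.

```python
import numpy as np

def k_subspaces(X, k=2, dim=1, iters=20, rng=None):
    """Cluster points around k linear subspaces of dimension `dim` through the origin.
    X: (n_samples, n_features). Each subspace is refit by PCA (SVD) on its assigned points."""
    rng = rng or np.random.default_rng(0)
    n, d = X.shape
    labels = rng.integers(0, k, size=n)
    for _ in range(iters):
        bases = []
        for j in range(k):
            pts = X[labels == j]
            if len(pts) < dim:                 # re-seed an empty/degenerate cluster
                pts = X[rng.choice(n, size=dim + 1, replace=False)]
            _, _, vt = np.linalg.svd(pts, full_matrices=False)
            bases.append(vt[:dim])             # (dim, d) orthonormal basis rows
        # residual of each point w.r.t. each subspace: ||x - proj_j(x)||
        resid = np.stack([np.linalg.norm(X - (X @ b.T) @ b, axis=1) for b in bases], axis=1)
        labels = resid.argmin(axis=1)
    return labels, bases

# toy data: two 1-D subspaces (lines through the origin) in R^3
rng = np.random.default_rng(1)
d1, d2 = np.array([1.0, 0, 0]), np.array([0, 1.0, 1.0]) / np.sqrt(2)
X = np.vstack([np.outer(rng.standard_normal(50), d1), np.outer(rng.standard_normal(50), d2)])
labels, _ = k_subspaces(X + 0.01 * rng.standard_normal(X.shape), k=2, dim=1)
```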