1 code implementation • 11 Jun 2025 • Seonho Lee, Jiho Choi, Inha Kang, Jiwook Kim, Junsung Park, Hyunjung Shim
Vision-Language Models (VLMs) have shown remarkable performance on diverse visual and linguistic tasks, yet they remain fundamentally limited in their understanding of 3D spatial structures.
no code implementations • 24 May 2025 • Junyong Kang, Seohyun Lim, Kyungjune Baek, Hyunjung Shim
While recent advances in this area have extended preference optimization techniques from large language models (LLMs) to the diffusion setting, they often struggle with limited exploration.
no code implementations • CVPR 2025 • Junsung Park, Hwijeong Lee, Inha Kang, Hyunjung Shim
We observed that adverse weather induces both degradation of semantic-level features and corruption of local features, leading to a misprediction of "things" as "stuff".
1 code implementation • CVPR 2025 • Jiho Choi, Seonho Lee, Minhyun Lee, Seungho Lee, Hyunjung Shim
Open-Vocabulary Part Segmentation (OVPS) is an emerging field for recognizing fine-grained parts in unseen categories.
Open-Vocabulary Semantic Segmentation
1 code implementation • 8 Jan 2025 • Hyogon Ryu, Nahyeon Park, Hyunjung Shim
To address these challenges, we propose Distribution-aware Group Quantization (DGQ), a method that identifies and adaptively handles pixel-wise and channel-wise outliers to preserve image quality.
1 code implementation • CVPR 2025 • Dongseob Kim, Hyunjung Shim
Multi-label classification is crucial for comprehensive image understanding, yet acquiring accurate annotations is challenging and costly.
1 code implementation • 30 Dec 2024 • Seojeong Park, Jiho Choi, Kyungjune Baek, Hyunjung Shim
Video Moment Retrieval (MR) aims to localize moments within a video based on a given natural language query.
Ranked #8 on Moment Retrieval on QVHighlights
no code implementations • 24 Dec 2024 • Hojun Choi, Junsuk Choe, Hyunjung Shim
Our approach groups contextually related "concepts" into a bag and adjusts the scale of concepts within the bag for more effective embedding alignment.
1 code implementation • 18 Dec 2024 • Na Min An, Eunki Kim, James Thorne, Hyunjung Shim
Contrastive Language-Image Pretraining (CLIP) enables zero-shot inference in downstream tasks such as image-text retrieval and classification.
1 code implementation • 28 Nov 2024 • Minhyun Lee, Seungho Lee, Song Park, Dongyoon Han, Byeongho Heo, Hyunjung Shim
Referring Image Segmentation (RIS) is an advanced vision-language task that involves identifying and segmenting objects within an image as described by free-form text descriptions.
Ranked #6 on Referring Expression Segmentation on RefCOCO testB
1 code implementation • 9 Oct 2024 • Seungho Lee, Hwijeong Lee, Hyunjung Shim
Additionally, we employ a dual-branch structure to mitigate performance degradation caused by data imbalance.
Ranked #1 on Semi-Supervised Semantic Segmentation on nuScenes
LIDAR Semantic Segmentation
Semi-Supervised Semantic Segmentation
no code implementations • 24 Sep 2024 • Seoungyoon Kang, Youngsun Lim, Hyunjung Shim
Our label generation strategy can complement existing dataset distillation methods for significantly enhancing their training efficiency and performance.
1 code implementation • 19 Sep 2024 • Youngsun Lim, Hojun Choi, Hyunjung Shim
Our evaluation protocols measure image hallucination by testing if images from existing TTI models can correctly respond to these questions.
1 code implementation • 12 Sep 2024 • Nahyeon Park, Kunhee Kim, Hyunjung Shim
Recent breakthroughs in text-to-image models have opened up promising research avenues in personalized image generation, enabling users to create diverse images of a specific subject using natural language prompts.
1 code implementation • 12 Sep 2024 • Seonho Lee, Jiho Choi, Seohyun Lim, Jiwook Kim, Hyunjung Shim
Recent advancements in text-to-image diffusion models have demonstrated remarkable success, yet they often struggle to fully capture the user's intent.
2 code implementations • 16 Jul 2024 • Jiwook Kim, Seonho Lee, Jaeyo Shin, Jiho Choi, Hyunjung Shim
Score distillation sampling (SDS) has emerged as an effective framework in text-driven 3D editing tasks, leveraging diffusion models for 3D-consistent editing.
no code implementations • 15 Jul 2024 • Youngsun Lim, Hyunjung Shim
Depending on the nature of the hallucination, we employ off-the-shelf image editing tools, either InstructPix2Pix or IP-Adapter, to leverage factual information from the retrieved image.
2 code implementations • 2 Jul 2024 • Junsung Park, KyungMin Kim, Hyunjung Shim
Motivated by this issue, we identified key factors of adverse weather and conducted a toy experiment to pinpoint the main causes of performance degradation: (1) Geometric perturbation due to refraction caused by fog or droplets in the air and (2) Point drop due to energy absorption and occlusions.
Ranked #1 on LIDAR Semantic Segmentation on SemanticSTF
1 code implementation • 28 Jun 2024 • Junsung Park, Hyunjung Shim
We further extend our ensemble method to CAMs from AMN (ResNet-like) and MCTformer (ViT-like) models, achieving performance benefits in advanced WSSS models.
2 code implementations • 17 Jun 2024 • Jiho Choi, Seonho Lee, Seungho Lee, Minhyun Lee, Hyunjung Shim
Open-vocabulary part segmentation (OVPS) is an emerging research area focused on segmenting fine-grained entities using diverse and previously unseen vocabularies.
Open-Vocabulary Semantic Segmentation
no code implementations • 13 Feb 2024 • Yunji Jung, Seokju Lee, Tair Djanibekov, Hyunjung Shim, Jong Chul Ye
In this work, we propose a training-free approach for non-rigid editing with Stable Diffusion, aimed at improving the identity preservation quality without compromising editability.
no code implementations • 23 Jan 2024 • Seungho Lee, Seoungyoon Kang, Hyunjung Shim
This study demonstrates a cost-effective approach to semantic segmentation using self-supervised vision transformers (SSVT).
no code implementations • 9 Jan 2024 • Hyogon Ryu, Seohyun Lim, Hyunjung Shim
The emergence of billion-parameter diffusion models such as Stable Diffusion XL, Imagen, and DALL-E 3 has significantly propelled the domain of generative AI.
1 code implementation • 21 Dec 2023 • Dongseob Kim, Seungho Lee, Junsuk Choe, Hyunjung Shim
Notably, the proposed method achieves 51.8% mIoU on the Cityscapes test dataset, showcasing its potential as a strong WSSS baseline on driving scene datasets.
Weakly-Supervised Semantic Segmentation
1 code implementation • 15 Dec 2023 • Minhyun Lee, Song Park, Byeongho Heo, Dongyoon Han, Hyunjung Shim
A recent breakthrough by SeiT proposed the use of Vector-Quantized (VQ) feature vectors (i.e., tokens) as network inputs for vision classification.
1 code implementation • CVPR 2022 • Kyungjune Baek, Hyunjung Shim
Since our synthesizer only considers the generic properties of natural images, a single model pretrained on our dataset can be consistently transferred to various target datasets, and even outperforms previous methods pretrained on natural images in terms of Fréchet inception distance.
1 code implementation • CVPR 2022 • Minhyun Lee, Dongseob Kim, Hyunjung Shim
Existing WSSS methods commonly argue that the sparse coverage of CAM incurs the performance bottleneck of WSSS.
Ranked #18 on Weakly-Supervised Semantic Segmentation on COCO 2014 val
Weakly supervised segmentation
Weakly supervised Semantic Segmentation
1 code implementation • 4 Jan 2022 • Minjin Choi, Jinhong Kim, Joonseok Lee, Hyunjung Shim, Jongwuk Lee
Session-based recommendation (SR) predicts the next items from a sequence of previous items consumed by an anonymous user.
2 code implementations • 22 Dec 2021 • Song Park, Sanghyuk Chun, Junbum Cha, Bado Lee, Hyunjung Shim
Existing methods learn to disentangle style and content elements by developing a universal style representation for each font style.
1 code implementation • CVPR 2021 • Seungho Lee, Minhyun Lee, Jongwuk Lee, Hyunjung Shim
Existing studies in weakly-supervised semantic segmentation (WSSS) using image-level weak supervision have several limitations: sparse object coverage, inaccurate object boundaries, and co-occurring pixels from non-target objects.
Ranked #24 on Weakly-Supervised Semantic Segmentation on PASCAL VOC 2012 test (using extra training data)
4 code implementations • ICCV 2021 • Song Park, Sanghyuk Chun, Junbum Cha, Bado Lee, Hyunjung Shim
MX-Font extracts multiple style features not explicitly conditioned on component labels, but automatically by multiple experts to represent different local concepts, e.g., left-side sub-glyph.
3 code implementations • 30 Mar 2021 • Minjin Choi, Jinhong Kim, Joonseok Lee, Hyunjung Shim, Jongwuk Lee
Session-based recommendation aims at predicting the next item given a sequence of previous items consumed in the session, e.g., on e-commerce or multimedia streaming services.
no code implementations • 1 Jan 2021 • Duhyeon Bang, Yunho Jeon, Jin-Hwa Kim, Jiwon Kim, Hyunjung Shim
When a person identifies objects, he or she can think by associating objects to many classes and conclude by taking inter-class relations into account.
1 code implementation • Pattern Recognition 2021 • Kyungjune Baek, Duhyeon Bang, Hyunjung Shim
Recently developed regularization techniques improve a network's generalization by only considering the global context.
no code implementations • 1 Jan 2021 • Daejin Kim, Hyunjung Shim, Jongwuk Lee
We demonstrate that AAP equipped with existing pruning methods (i.e., iterative pruning, one-shot pruning, and dynamic pruning) consistently improves the accuracy of original methods at 128× to 4096× compression ratios on three benchmark datasets.
3 code implementations • 23 Sep 2020 • Song Park, Sanghyuk Chun, Junbum Cha, Bado Lee, Hyunjung Shim
However, learning component-wise styles solely from reference glyphs is infeasible in the few-shot font generation scenario, when a target script has a large number of components, e.g., over 200 for Chinese.
2 code implementations • 8 Jul 2020 • Junsuk Choe, Seong Joon Oh, Sanghyuk Chun, Seungho Lee, Zeynep Akata, Hyunjung Shim
In this paper, we argue that the WSOL task is ill-posed with only image-level labels, and propose a new evaluation protocol where full supervision is limited to only a small held-out set not overlapping with the test set.
1 code implementation • ICCV 2021 • Kyungjune Baek, Yunjey Choi, Youngjung Uh, Jaejun Yoo, Hyunjung Shim
To this end, we propose a truly unsupervised image-to-image translation model (TUNIT) that simultaneously learns to separate image domains and translates input images into the estimated domains.
2 code implementations • CVPR 2020 • Junsuk Choe, Seong Joon Oh, Seungho Lee, Sanghyuk Chun, Zeynep Akata, Hyunjung Shim
In this paper, we argue that the WSOL task is ill-posed with only image-level labels, and propose a new evaluation protocol where full supervision is limited to only a small held-out set not overlapping with the test set.
no code implementations • 13 Nov 2019 • Jae-woong Lee, Minjin Choi, Jongwuk Lee, Hyunjung Shim
Knowledge distillation (KD) is a well-known method to reduce inference latency by compressing a cumbersome teacher model to a small student model.
1 code implementation • CVPR 2019 • Junsuk Choe, Hyunjung Shim
Weakly Supervised Object Localization (WSOL) techniques learn the object location only using image-level labels, without location annotations.
no code implementations • 28 Sep 2018 • Seongjong Song, Hyunjung Shim
We propose a novel approach to recovering translucent objects from a single time-of-flight (ToF) depth camera using deep residual networks.
no code implementations • 27 Sep 2018 • Duhyeon Bang, Hyunjung Shim
In order to analyze the real data in the latent space of GANs, it is necessary to investigate the inverse generation mapping from the data to the latent vector.
no code implementations • 20 Jul 2018 • Kyungjune Baek, Duhyeon Bang, Hyunjung Shim
Also, we show that our model achieves performance competitive with the state-of-the-art attribute editing technique in terms of attribute editing quality.
no code implementations • 3 Jul 2018 • Duhyeon Bang, Hyunjung Shim
We propose a novel algorithm, namely Resembled Generative Adversarial Networks (GAN), that simultaneously generates data in two different domains such that they resemble each other.
no code implementations • 1 Jun 2018 • Junsuk Choe, Joo Hyun Park, Hyunjung Shim
Our important finding is that the high image diversity of a GAN, which is a main goal in GAN research, is ironically disadvantageous for object localization, because such discriminators focus not only on the target object but also on various other objects, such as background objects.
no code implementations • 28 May 2018 • Duhyeon Bang, Seoungyoon Kang, Hyunjung Shim
Various studies assert that the latent space of a GAN is semantically meaningful and can be utilized for advanced data analysis and manipulation.
1 code implementation • 12 Apr 2018 • Duhyeon Bang, Hyunjung Shim
Mode collapse is a critical problem in training generative adversarial networks.
no code implementations • 22 Feb 2018 • Junsuk Choe, Joo Hyun Park, Hyunjung Shim
To this end, we employ an effective data augmentation for improving the accuracy of the object localization.
no code implementations • ICML 2018 • Duhyeon Bang, Hyunjung Shim
Because the AE learns to minimize forward KL divergence, our GAN training with representative features is influenced by both reverse and forward KL divergence.