no code implementations • 5 Jun 2025 • Youngwan Lee, Kangsan Kim, KwanYong Park, Ilchae Jung, Soojin Jang, Seanie Lee, Yong-Ju Lee, Sung Ju Hwang
We further propose SafeLLaVA, a novel VLM augmented with a learnable safety meta token and a dedicated safety head.
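No code accompanies this entry, so here is a minimal PyTorch sketch of the stated idea: a learnable safety meta token appended to the multimodal token sequence, with a dedicated safety head reading its final hidden state. All names and dimensions (SafetyMetaTokenHead, num_unsafe_categories) are hypothetical illustrations, not taken from the paper.

```python
import torch
import torch.nn as nn

class SafetyMetaTokenHead(nn.Module):
    """Hypothetical sketch: a learnable safety meta token is appended to the
    multimodal token sequence; after the VLM encodes the sequence, the meta
    token's final hidden state feeds a small safety classifier."""

    def __init__(self, hidden_dim: int, num_unsafe_categories: int):
        super().__init__()
        # One learnable embedding acting as the "safety meta token".
        self.safety_token = nn.Parameter(torch.randn(1, 1, hidden_dim) * 0.02)
        # Dedicated safety head on top of the meta token's representation.
        self.safety_head = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, num_unsafe_categories),
        )

    def append_token(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len, dim) -> (batch, seq_len + 1, dim)
        batch = tokens.size(0)
        return torch.cat([tokens, self.safety_token.expand(batch, -1, -1)], dim=1)

    def classify(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Read the last position, where the safety meta token was appended.
        return self.safety_head(hidden_states[:, -1])

# Toy usage with random features standing in for VLM hidden states.
module = SafetyMetaTokenHead(hidden_dim=64, num_unsafe_categories=7)
seq = module.append_token(torch.randn(2, 10, 64))
logits = module.classify(seq)  # (2, 7) unsafe-category logits
```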
1 code implementation • CVPR 2025 • Kangsan Kim, Geon Park, Youngwan Lee, Woongyeong Yeo, Sung Ju Hwang
To address these issues, we propose VideoICL, a novel video in-context learning framework for OOD tasks that introduces a similarity-based relevant example selection strategy and a confidence-based iterative inference approach.
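As a rough sketch of the two stated components, the following Python illustrates similarity-based example selection and confidence-based iterative inference. The helper names and the stubbed run_model call are assumptions for illustration, not the released VideoICL code.

```python
import torch
import torch.nn.functional as F

def select_relevant_examples(query_emb, pool_embs, k):
    """Similarity-based selection: rank candidate examples by cosine
    similarity to the query video embedding and keep the top-k indices."""
    sims = F.cosine_similarity(query_emb.unsqueeze(0), pool_embs, dim=1)
    return sims.topk(k).indices.tolist()

def iterative_inference(run_model, ranked_ids, k, threshold, max_rounds):
    """Confidence-based iteration: run in-context inference with one batch
    of examples; if the answer confidence is below the threshold, retry
    with the next batch of ranked examples."""
    best = (None, -1.0)
    for r in range(max_rounds):
        batch = ranked_ids[r * k:(r + 1) * k]
        if not batch:
            break
        answer, confidence = run_model(batch)  # user-supplied model call
        if confidence >= threshold:
            return answer, confidence
        if confidence > best[1]:
            best = (answer, confidence)
    return best  # fall back to the most confident answer seen

# Toy usage with random embeddings and a stubbed model call.
pool = F.normalize(torch.randn(100, 256), dim=1)
query = F.normalize(torch.randn(256), dim=0)
ranked = select_relevant_examples(query, pool, k=100)  # full ranking
stub = lambda batch: ("answer", 0.9)
print(iterative_inference(stub, ranked, k=8, threshold=0.8, max_rounds=3))
```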
1 code implementation • 24 Jun 2024 • Jeffrey Willette, Heejun Lee, Youngwan Lee, Myeongjae Jeon, Sung Ju Hwang
The transformer's context window is vital for tasks such as few-shot learning and conditional generation as it preserves previous tokens for active memory.
no code implementations • 14 Jun 2024 • Heejun Lee, Geon Park, Youngwan Lee, Jaduk Suh, Jina Kim, Wonyoung Jeong, Bumsik Kim, Hyemin Lee, Myeongjae Jeon, Sung Ju Hwang
In addition to improving the time complexity of the attention mechanism, we further optimize GPU memory usage by implementing KV cache offloading, which stores only $O(\log T)$ tokens on the GPU while maintaining similar decoding throughput.
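To make the offloading idea concrete, here is a heavily simplified sketch of a KV cache that keeps only O(log T) entries in fast memory and parks the rest in slow memory. The eviction rule used here (keep the most recent ceil(log2 T) entries) is an illustrative stand-in, not the paper's selection policy.

```python
import math
import torch

class LogCacheOffloader:
    """Hypothetical sketch of KV cache offloading: keep only O(log T) of the
    T cached key/value entries in fast (e.g. GPU) memory and offload the
    rest to slow (CPU) memory, fetching entries back on demand."""

    def __init__(self, head_dim, fast_device="cpu", slow_device="cpu"):
        self.fast = {}   # token index -> (k, v) kept in fast memory
        self.slow = {}   # token index -> (k, v) offloaded to slow memory
        self.fast_device, self.slow_device = fast_device, slow_device
        self.head_dim = head_dim

    def append(self, idx, k, v):
        self.fast[idx] = (k.to(self.fast_device), v.to(self.fast_device))
        total = len(self.fast) + len(self.slow)
        budget = max(1, math.ceil(math.log2(total + 1)))
        # Evict the oldest fast entries beyond the O(log T) budget.
        while len(self.fast) > budget:
            oldest = min(self.fast)
            pair = self.fast.pop(oldest)
            self.slow[oldest] = tuple(t.to(self.slow_device) for t in pair)

    def fetch(self, idx):
        if idx in self.fast:
            return self.fast[idx]
        k, v = self.slow[idx]
        return k.to(self.fast_device), v.to(self.fast_device)

cache = LogCacheOffloader(head_dim=64)
for t in range(1000):
    cache.append(t, torch.randn(64), torch.randn(64))
print(len(cache.fast), "entries resident out of", 1000)  # ~log2(1000) ≈ 10
```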
no code implementations • 13 Jun 2024 • Injoon Hwang, Haewon Park, Youngwan Lee, Jooyoung Yang, SunJae Maeng
Low-rank adaptation (LoRA) is a prominent method that adds a small number of learnable parameters to the frozen pre-trained weights for parameter-efficient fine-tuning.
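Since this snippet summarizes LoRA itself, a minimal PyTorch sketch of the standard formulation may help: the pre-trained weight W is frozen, and a low-rank update scaled by alpha/r is learned on top of it. The rank and scaling values below are illustrative defaults, not this paper's settings.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA sketch: freeze the pre-trained linear layer and learn a
    low-rank residual update (alpha/r) * B @ A on top of it."""

    def __init__(self, linear: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = linear
        self.base.weight.requires_grad_(False)  # freeze pre-trained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # Low-rank factors: A initialized randomly, B at zero, so the
        # adapted model starts identical to the pre-trained one.
        self.A = nn.Parameter(torch.randn(r, linear.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(linear.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())

layer = LoRALinear(nn.Linear(128, 64), r=8, alpha=16)
out = layer(torch.randn(4, 128))  # (4, 64); only A and B receive gradients
```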
no code implementations • 28 May 2024 • Youngwan Lee, Jeffrey Ryan Willette, Jonghee Kim, Sung Ju Hwang
To investigate why the self-supervised ViT generalizes better when trained with MAE (MAE-ViT), and to study the effect of RC-MAE's gradient correction from an optimization perspective, we visualize the loss landscapes of ViTs trained with MAE and RC-MAE and compare them with a supervised ViT (Sup-ViT).
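A common way to produce such visualizations is the random-direction technique of Li et al. (2018): perturb the trained weights along two random directions and evaluate the loss on a 2-D grid. The sketch below uses a tensor-wise normalization as a simplification of that paper's filter-wise scheme; nothing here is taken from the RC-MAE code.

```python
import torch

def loss_landscape_grid(model, loss_fn, data, span=1.0, steps=21):
    """Sketch of 2-D loss-landscape visualization: evaluate the loss on a
    grid of weight perturbations along two random, norm-matched directions."""
    params = [p for p in model.parameters() if p.requires_grad]
    origin = [p.detach().clone() for p in params]

    def random_direction():
        dirs = []
        for p in origin:
            d = torch.randn_like(p)
            # Tensor-wise normalization: match the norm of the weights
            # (a simplification of filter-wise normalization).
            dirs.append(d * (p.norm() / (d.norm() + 1e-10)))
        return dirs

    d1, d2 = random_direction(), random_direction()
    coords = torch.linspace(-span, span, steps)
    grid = torch.zeros(steps, steps)
    with torch.no_grad():
        for i, a in enumerate(coords):
            for j, b in enumerate(coords):
                for p, o, u, v in zip(params, origin, d1, d2):
                    p.copy_(o + a * u + b * v)
                grid[i, j] = loss_fn(model, data)
        for p, o in zip(params, origin):  # restore the trained weights
            p.copy_(o)
    return coords, grid

# Toy usage on a tiny regression model.
model = torch.nn.Linear(10, 1)
data = (torch.randn(32, 10), torch.randn(32, 1))
mse = lambda m, d: torch.nn.functional.mse_loss(m(d[0]), d[1])
coords, grid = loss_landscape_grid(model, mse, data, steps=5)
```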
no code implementations • 19 Feb 2024 • Sungjun Ahn, Hyun-Jeong Yim, Youngwan Lee, Sung-Ik Park
This paper introduces a media service model that exploits artificial intelligence (AI) video generators at the receive end.
no code implementations • 7 Dec 2023 • Youngwan Lee, KwanYong Park, Yoorhim Cho, Yong-Ju Lee, Sung Ju Hwang
As text-to-image (T2I) synthesis models increase in size, they demand higher inference costs due to the need for more expensive GPUs with larger memory; together with restricted access to training datasets, this makes the models challenging to reproduce.
2 code implementations • 19 Nov 2022 • Sunil Hwang, Jaehong Yoon, Youngwan Lee, Sung Ju Hwang
Masked Video Autoencoder (MVA) approaches have demonstrated their potential by significantly outperforming previous video representation learning methods.
Ranked #1 on Object State Change Classification on Ego4D (+4 more)
1 code implementation • 5 Oct 2022 • Youngwan Lee, Jeffrey Willette, Jonghee Kim, Juho Lee, Sung Ju Hwang
Masked image modeling (MIM) has become a popular strategy for self-supervised learning~(SSL) of visual representations with Vision Transformers.
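For readers new to MIM, here is a minimal sketch of the masking step common to MAE-style approaches: a random subset of patch tokens is kept visible and the rest must be reconstructed. The 75% ratio and shapes are the usual MAE defaults, not necessarily this paper's configuration.

```python
import torch

def random_patch_masking(patches: torch.Tensor, mask_ratio: float = 0.75):
    """Sketch of the masking step in masked image modeling: keep a random
    subset of patch tokens and return indices for reconstructing the rest.
    patches: (batch, num_patches, dim)."""
    b, n, d = patches.shape
    num_keep = int(n * (1 - mask_ratio))
    noise = torch.rand(b, n)            # random score per patch
    shuffle = noise.argsort(dim=1)      # random permutation of patch indices
    keep_ids = shuffle[:, :num_keep]    # visible patches
    mask_ids = shuffle[:, num_keep:]    # patches the model must reconstruct
    visible = torch.gather(
        patches, 1, keep_ids.unsqueeze(-1).expand(-1, -1, d))
    return visible, keep_ids, mask_ids

# Toy usage: 196 patch tokens (14x14), 75% masked.
tokens = torch.randn(2, 196, 768)
visible, keep_ids, mask_ids = random_patch_masking(tokens)
print(visible.shape)  # torch.Size([2, 49, 768])
```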
3 code implementations • CVPR 2022 • Youngwan Lee, Jonghee Kim, Jeff Willette, Sung Ju Hwang
While Convolutional Neural Networks (CNNs) have been the dominant architectures for such tasks, recently introduced Vision Transformers (ViTs) aim to replace them as general-purpose backbones.
Ranked #40 on Instance Segmentation on COCO minival
1 code implementation • 1 Dec 2020 • Youngwan Lee, Hyung-Il Kim, Kimin Yun, Jinyoung Moon
By using the proposed temporal modeling method (T-OSA) and the efficient factorized component D(2+1)D (sketched below), we construct two types of VoV3D networks, VoV3D-M and VoV3D-L.
Ranked #30 on Action Recognition on Something-Something V1 (using extra training data)
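The following is a minimal sketch of a depthwise factorized (2+1)D convolution in the spirit of the D(2+1)D component named above: a 3x3x3 depthwise 3D convolution split into a depthwise spatial conv (1x3x3) followed by a depthwise temporal conv (3x1x1). The exact ordering and normalization are assumptions, not copied from the paper.

```python
import torch
import torch.nn as nn

class D2Plus1D(nn.Module):
    """Sketch of a depthwise factorized (2+1)D convolution: depthwise 2D
    spatial conv (1x3x3) followed by depthwise 1D temporal conv (3x1x1)."""

    def __init__(self, channels: int):
        super().__init__()
        self.spatial = nn.Conv3d(channels, channels, kernel_size=(1, 3, 3),
                                 padding=(0, 1, 1), groups=channels, bias=False)
        self.temporal = nn.Conv3d(channels, channels, kernel_size=(3, 1, 1),
                                  padding=(1, 0, 0), groups=channels, bias=False)
        self.bn = nn.BatchNorm3d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time, height, width)
        return self.act(self.bn(self.temporal(self.spatial(x))))

block = D2Plus1D(channels=32)
clip = torch.randn(1, 32, 8, 56, 56)   # 8-frame feature map
print(block(clip).shape)               # shape is preserved
```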
no code implementations • 21 Sep 2020 • Joong-won Hwang, Youngwan Lee, Sungchan Oh, Yuseok Bae
Moreover, we further improve SWA so that it is well suited to adversarial training.
no code implementations • 28 Jun 2020 • Youngwan Lee, Joong-won Hwang, Hyung-Il Kim, Kimin Yun, Yongjin Kwon, Yuseok Bae, Sung Ju Hwang
To tackle these limitations, we propose a new localization uncertainty estimation method called UAD for anchor-free object detection (a simplified sketch follows below).
Ranked #132 on Object Detection on COCO test-dev
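As the sketch promised above, here is one generic way to estimate localization uncertainty: model each of the four FCOS-style box offsets as a Gaussian whose mean and log-variance the network predicts, and train with the negative log-likelihood. This is the standard Gaussian-NLL formulation, not necessarily the paper's exact UAD loss.

```python
import torch

def box_uncertainty_nll(pred_offsets, pred_log_var, target_offsets):
    """Gaussian NLL over box offsets (left, top, right, bottom): the network
    predicts a mean and log-variance per offset; large predicted variance
    down-weights the squared error but is penalized by the log term."""
    var = pred_log_var.exp()
    nll = 0.5 * ((pred_offsets - target_offsets) ** 2 / var + pred_log_var)
    return nll.mean()

# Toy usage: 128 locations, 4 box offsets each.
mu = torch.randn(128, 4, requires_grad=True)
log_var = torch.zeros(128, 4, requires_grad=True)
target = torch.randn(128, 4)
loss = box_uncertainty_nll(mu, log_var, target)
loss.backward()
# At inference, sigma = exp(0.5 * log_var) can score each box's reliability.
```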
1 code implementation • CVPR 2020 • Youngwan Lee, Jongyoul Park
We propose a simple yet efficient anchor-free instance segmentation method, called CenterMask, that adds a novel spatial attention-guided mask (SAG-Mask) branch to the anchor-free one-stage object detector FCOS, in the same vein as Mask R-CNN.
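Below is a minimal sketch of a spatial attention gate of the kind SAG-Mask applies to RoI mask features, assuming CBAM-style average/max channel pooling followed by a conv and a sigmoid; the kernel size and details are assumptions rather than the paper's exact module.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Sketch of a spatial attention gate: pool the RoI feature map along
    the channel axis with average and max pooling, predict a per-pixel
    attention map, and reweight the features so the mask head focuses on
    informative pixels."""

    def __init__(self, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = x.mean(dim=1, keepdim=True)       # (B, 1, H, W)
        mx, _ = x.max(dim=1, keepdim=True)      # (B, 1, H, W)
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn                          # attention-gated features

gate = SpatialAttention()
roi_feat = torch.randn(8, 256, 14, 14)  # typical RoI-aligned mask features
print(gate(roi_feat).shape)             # torch.Size([8, 256, 14, 14])
```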
8 code implementations • arXiv 2019 • Youngwan Lee, Jongyoul Park
We hope that CenterMask and VoVNetV2 can serve as a solid baseline of real-time instance segmentation and backbone network for various vision tasks, respectively.
12 code implementations • 22 Apr 2019 • Youngwan Lee, Joong-won Hwang, Sangrok Lee, Yuseok Bae, Jongyoul Park
As DenseNet conserves intermediate features with diverse receptive fields by aggregating them with dense connections, it shows good performance on the object detection task (a sketch of the one-shot aggregation alternative follows below).
Ranked #66 on Instance Segmentation on COCO test-dev
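Since this entry contrasts DenseNet's dense connections with VoVNet's design, here is a minimal sketch of a one-shot aggregation (OSA) block in the spirit of VoVNet: a short chain of 3x3 convs whose outputs are concatenated only once at the end. Layer counts and widths below are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class OSABlock(nn.Module):
    """Sketch of one-shot aggregation (OSA): instead of DenseNet's dense
    connections at every layer, run a chain of 3x3 convs and concatenate
    the input plus all intermediate features exactly once at the end,
    followed by a 1x1 projection."""

    def __init__(self, in_ch: int, mid_ch: int, out_ch: int, num_convs: int = 5):
        super().__init__()
        self.convs = nn.ModuleList()
        ch = in_ch
        for _ in range(num_convs):
            self.convs.append(nn.Sequential(
                nn.Conv2d(ch, mid_ch, 3, padding=1, bias=False),
                nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True)))
            ch = mid_ch
        # Aggregate the input and every intermediate feature exactly once.
        self.project = nn.Conv2d(in_ch + num_convs * mid_ch, out_ch, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [x]
        for conv in self.convs:
            feats.append(conv(feats[-1]))
        return self.project(torch.cat(feats, dim=1))

block = OSABlock(in_ch=64, mid_ch=80, out_ch=256)
print(block(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 256, 32, 32])
```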
no code implementations • 1 Dec 2017 • Seung-Hwan Bae, Youngwan Lee, Youngjoo Jo, Yuseok Bae, Joong-won Hwang
Recent advances in convolutional detectors show impressive performance improvements for large-scale object detection.
no code implementations • 4 Feb 2017 • Youngwan Lee, Byeonghak Yim, Huien Kim, Eunsoo Park, Xuenan Cui, Taekang Woo, Hakil Kim
Since convolutional neural network(CNN)models emerged, several tasks in computer vision have actively deployed CNN models for feature extraction.