no code implementations • 24 Jun 2024 • Jeffrey Willette, Heejun Lee, Youngwan Lee, Myeongjae Jeon, Sung Ju Hwang
The context window within a transformer provides a form of active memory for the current task, which can be useful for few-shot learning and conditional generation, both of which depend heavily on previous context tokens.
no code implementations • 14 Jun 2024 • Heejun Lee, Geon Park, Youngwan Lee, Jina Kim, Wonyoung Jeong, Myeongjae Jeon, Sung Ju Hwang
In modern large language models (LLMs), increasing the sequence length is a crucial challenge for enhancing comprehension and coherence in complex tasks such as multi-modal question answering.
no code implementations • 13 Jun 2024 • Injoon Hwang, Haewon Park, Youngwan Lee, Jooyoung Yang, SunJae Maeng
Low-rank adaptation (LoRA) is a prominent method that adds a small number of learnable parameters to frozen pre-trained weights for parameter-efficient fine-tuning.
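The LoRA idea above can be sketched in a few lines: the frozen weight matrix W is left untouched, and only two small low-rank factors A and B are trained, so the adapted layer computes y = (W + B·A)x. The shapes, scale factor, and function names below are illustrative assumptions, not the paper's implementation; pure Python is used to keep the sketch dependency-free.

```python
def lora_linear(x, W, A, B, scale=1.0):
    """Apply a LoRA-adapted linear layer: y = W x + scale * B (A x).

    x: input vector of length d_in
    W: frozen pre-trained weight, d_out x d_in (never updated)
    A: trainable low-rank factor, r x d_in
    B: trainable low-rank factor, d_out x r (typically zero-initialized,
       so training starts from the frozen model's output)
    Only A and B are trained: r * (d_in + d_out) parameters
    instead of d_in * d_out.
    """
    base = [sum(w * xi for w, xi in zip(row, x)) for row in W]     # W x
    ax = [sum(a * xi for a, xi in zip(row, x)) for row in A]       # A x (length r)
    delta = [sum(b * ai for b, ai in zip(row, ax)) for row in B]   # B (A x)
    return [b + scale * d for b, d in zip(base, delta)]
```

Because B starts at zero, the adapted layer initially reproduces the frozen model exactly, and the low-rank update is learned on top of it.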
no code implementations • 28 May 2024 • Youngwan Lee, Jeffrey Ryan Willette, Jonghee Kim, Sung Ju Hwang
To further investigate why the self-supervised ViT trained by MAE (MAE-ViT) generalizes better, and to examine the effect of RC-MAE's gradient correction from an optimization perspective, we visualize the loss landscapes of vision transformers self-supervised by MAE and RC-MAE and compare them with a supervised ViT (Sup-ViT).
no code implementations • 19 Feb 2024 • Sungjun Ahn, Hyun-Jeong Yim, Youngwan Lee, Sung-Ik Park
This paper introduces a media service model that exploits artificial intelligence (AI) video generators at the receive end.
no code implementations • 7 Dec 2023 • Youngwan Lee, KwanYong Park, Yoorhim Cho, Yong-Ju Lee, Sung Ju Hwang
As text-to-image (T2I) synthesis models grow in size, inference demands more expensive GPUs with larger memory, which, together with restricted access to training datasets, makes these models difficult to reproduce.
2 code implementations • 19 Nov 2022 • Sunil Hwang, Jaehong Yoon, Youngwan Lee, Sung Ju Hwang
Masked Video Autoencoder (MVA) approaches have demonstrated their potential by significantly outperforming previous video representation learning methods.
Ranked #1 on Object State Change Classification on Ego4D
1 code implementation • 5 Oct 2022 • Youngwan Lee, Jeffrey Willette, Jonghee Kim, Juho Lee, Sung Ju Hwang
Masked image modeling (MIM) has become a popular strategy for self-supervised learning~(SSL) of visual representations with Vision Transformers.
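The core of the MIM strategy mentioned above is to hide a large random subset of image patches and train the model to reconstruct them from the visible remainder. A minimal sketch of the uniform random masking step is below; the 75% mask ratio is a common MAE-style choice and an assumption here, not a detail stated in this entry.

```python
import random

def random_mask(num_patches, mask_ratio=0.75, seed=0):
    """Split patch indices into (masked, visible) sets by uniform sampling.

    MAE-style MIM feeds only the visible patches to the encoder and asks
    the decoder to reconstruct the masked ones. The seed is fixed here
    only to make the sketch deterministic.
    """
    rng = random.Random(seed)
    idx = list(range(num_patches))
    rng.shuffle(idx)
    n_mask = int(num_patches * mask_ratio)
    return sorted(idx[:n_mask]), sorted(idx[n_mask:])
```

With a high mask ratio, the encoder sees only a small fraction of the patches, which both makes the reconstruction task non-trivial and reduces pre-training compute.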
3 code implementations • CVPR 2022 • Youngwan Lee, Jonghee Kim, Jeff Willette, Sung Ju Hwang
While Convolutional Neural Networks (CNNs) have been the dominant architectures for such tasks, recently introduced Vision Transformers (ViTs) aim to replace them as a backbone.
Ranked #38 on Instance Segmentation on COCO minival
1 code implementation • 1 Dec 2020 • Youngwan Lee, Hyung-Il Kim, Kimin Yun, Jinyoung Moon
By using the proposed temporal modeling method (T-OSA) and the efficient factorized component (D(2+1)D), we construct two types of VoV3D networks, VoV3D-M and VoV3D-L.
Ranked #30 on Action Recognition on Something-Something V1 (using extra training data)
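The factorized component mentioned above follows the well-known (2+1)D idea: a full t x k x k 3D convolution is replaced by a 2D spatial convolution (1 x k x k) followed by a 1D temporal convolution (t x 1 x 1). A rough parameter-count comparison is sketched below; assuming the intermediate channel width equals the output width is a simplification for illustration, not the paper's exact design.

```python
def conv3d_params(c_in, c_out, t, k):
    """Weights in a full t x k x k 3D convolution (bias ignored)."""
    return c_in * c_out * t * k * k

def factorized_params(c_in, c_out, t, k):
    """Weights in a (2+1)D factorization: spatial 1 x k x k, then
    temporal t x 1 x 1, with intermediate width c_out (an assumption)."""
    spatial = c_in * c_out * k * k
    temporal = c_out * c_out * t
    return spatial + temporal
```

For typical settings (e.g. 64 channels, t = 3, k = 3), the factorized form uses well under half the parameters of the full 3D kernel, which is where the efficiency of such components comes from.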
no code implementations • 21 Sep 2020 • Joong-won Hwang, Youngwan Lee, Sungchan Oh, Yuseok Bae
Moreover, we further improve SWA to make it suitable for adversarial training.
no code implementations • 28 Jun 2020 • Youngwan Lee, Joong-won Hwang, Hyung-Il Kim, Kimin Yun, Yongjin Kwon, Yuseok Bae, Sung Ju Hwang
To tackle these limitations, we propose a new localization uncertainty estimation method called UAD for anchor-free object detection.
Ranked #128 on Object Detection on COCO test-dev
1 code implementation • CVPR 2020 • Youngwan Lee, Jongyoul Park
We propose a simple yet efficient anchor-free instance segmentation method, called CenterMask, that adds a novel spatial attention-guided mask (SAG-Mask) branch to the anchor-free one-stage object detector FCOS, in the same vein as Mask R-CNN.
8 code implementations • arXiv 2019 • Youngwan Lee, Jongyoul Park
We hope that CenterMask and VoVNetV2 can serve as a solid baseline of real-time instance segmentation and backbone network for various vision tasks, respectively.
Ranked #1 on Object Detection on COCO test-dev (Hardware Burden metric)
14 code implementations • 22 Apr 2019 • Youngwan Lee, Joong-won Hwang, Sangrok Lee, Yuseok Bae, Jongyoul Park
As DenseNet preserves intermediate features with diverse receptive fields by aggregating them through dense connections, it shows good performance on the object detection task.
Ranked #64 on Instance Segmentation on COCO test-dev
no code implementations • 1 Dec 2017 • Seung-Hwan Bae, Youngwan Lee, Youngjoo Jo, Yuseok Bae, Joong-won Hwang
Recent advances in convolutional detectors show impressive performance improvements for large-scale object detection.
no code implementations • 4 Feb 2017 • Youngwan Lee, Byeonghak Yim, Huien Kim, Eunsoo Park, Xuenan Cui, Taekang Woo, Hakil Kim
Since convolutional neural network (CNN) models emerged, many computer vision tasks have actively employed CNN models for feature extraction.