Search Results for author: Dongyoon Han

Found 44 papers, 29 papers with code

Token-Supervised Value Models for Enhancing Mathematical Reasoning Capabilities of Large Language Models

no code implementations 12 Jul 2024 Jung Hyun Lee, June Yong Yang, Byeongho Heo, Dongyoon Han, Kang Min Yoo

Large Language Models (LLMs) have demonstrated impressive problem-solving capabilities in mathematics through step-by-step reasoning chains.

GSM8K Math +1

HYPE: Hyperbolic Entailment Filtering for Underspecified Images and Texts

no code implementations 26 Apr 2024 Wonjae Kim, Sanghyuk Chun, Taekyung Kim, Dongyoon Han, Sangdoo Yun

In an era where the volume of data drives the effectiveness of self-supervised learning, the specificity and clarity of data semantics play a crucial role in model training.

Self-Supervised Learning Specificity

Leveraging Temporal Contextualization for Video Action Recognition

1 code implementation 15 Apr 2024 Minji Kim, Dongyoon Han, Taekyung Kim, Bohyung Han

To be specific, we introduce Temporal Contextualization (TC), a layer-wise temporal information infusion mechanism for videos, which 1) extracts core information from each frame, 2) connects relevant information across frames for the summarization into context tokens, and 3) leverages the context tokens for feature encoding.

Action Recognition Temporal Action Localization +2

Model Stock: All we need is just a few fine-tuned models

2 code implementations 28 Mar 2024 Dong-Hwan Jang, Sangdoo Yun, Dongyoon Han

This paper introduces an efficient fine-tuning method for large pre-trained models, offering strong in-distribution (ID) and out-of-distribution (OOD) performance.

DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs

1 code implementation 28 Mar 2024 Donghyun Kim, Byeongho Heo, Dongyoon Han

This paper revives Densely Connected Convolutional Networks (DenseNets) and reveals the underrated effectiveness over predominant ResNet-style architectures.

Fine-Grained Image Classification Instance Segmentation +3

Rotary Position Embedding for Vision Transformer

1 code implementation 20 Mar 2024 Byeongho Heo, Song Park, Dongyoon Han, Sangdoo Yun

This study provides a comprehensive analysis of RoPE when applied to ViTs, utilizing practical implementations of RoPE for 2D vision data.

Position

Morphing Tokens Draw Strong Masked Image Models

no code implementations 30 Dec 2023 Taekyung Kim, Byeongho Heo, Dongyoon Han

DTM is compatible with various SSL frameworks; we showcase an improved MIM by employing DTM, barely introducing extra training costs.

Fine-Grained Image Classification Self-Supervised Learning

SeiT++: Masked Token Modeling Improves Storage-efficient Training

1 code implementation 15 Dec 2023 Minhyun Lee, Song Park, Byeongho Heo, Dongyoon Han, Hyunjung Shim

A recent breakthrough by SeiT proposed the use of Vector-Quantized (VQ) feature vectors (i.e., tokens) as network inputs for vision classification.

Classification Data Augmentation +2

Match me if you can: Semantic Correspondence Learning with Unpaired Images

no code implementations 30 Nov 2023 Jiwon Kim, Byeongho Heo, Sangdoo Yun, Seungryong Kim, Dongyoon Han

Recent approaches for semantic correspondence have focused on obtaining high-quality correspondences using a complicated network, refining the ambiguous or noisy matching points.

Semantic correspondence

Towards Calibrated Robust Fine-Tuning of Vision-Language Models

no code implementations 3 Nov 2023 Changdae Oh, Hyesu Lim, Mijoo Kim, Dongyoon Han, Sangdoo Yun, Jaegul Choo, Alexander Hauptmann, Zhi-Qi Cheng, Kyungwoo Song

Improving out-of-distribution (OOD) generalization through in-distribution (ID) adaptation is a primary goal of robust fine-tuning methods beyond the naive fine-tuning approach.

Autonomous Driving Medical Diagnosis

Gramian Attention Heads are Strong yet Efficient Vision Learners

1 code implementation ICCV 2023 Jongbin Ryu, Dongyoon Han, Jongwoo Lim

We introduce a novel architecture design that enhances expressiveness by incorporating multiple head classifiers (i.e., classification heads) instead of relying on channel expansion or additional building blocks.

Fine-Grained Image Classification Instance Segmentation +2

Learning with Unmasked Tokens Drives Stronger Vision Learners

no code implementations 20 Oct 2023 Taekyung Kim, Sanghyuk Chun, Byeongho Heo, Dongyoon Han

MIMs such as Masked Autoencoder (MAE) learn strong representations by randomly masking input tokens for the encoder to process, with the decoder reconstructing the masked tokens to the input.

Attribute Decoder +3

GeNAS: Neural Architecture Search with Better Generalization

1 code implementation 15 May 2023 JoonHyun Jeong, Joonsang Yu, Geondo Park, Dongyoon Han, Youngjoon Yoo

Recent neural architecture search (NAS) approaches rely on validation loss or accuracy to find the superior network for the target data.

Neural Architecture Search object-detection +2

Neglected Free Lunch -- Learning Image Classifiers Using Annotation Byproducts

3 code implementations 30 Mar 2023 Dongyoon Han, Junsuk Choe, Seonghyeok Chun, John Joon Young Chung, Minsuk Chang, Sangdoo Yun, Jean Y. Song, Seong Joon Oh

We refer to the new paradigm of training models with annotation byproducts as learning using annotation byproducts (LUAB).

Time Series

The Devil is in the Points: Weakly Semi-Supervised Instance Segmentation via Point-Guided Mask Representation

1 code implementation CVPR 2023 Beomyoung Kim, JoonHyun Jeong, Dongyoon Han, Sung Ju Hwang

In this paper, we introduce a novel learning scheme named weakly semi-supervised instance segmentation (WSSIS) with point labels for budget-efficient and high-performance instance segmentation.

Instance Segmentation Semantic Segmentation +1

Generating Instance-level Prompts for Rehearsal-free Continual Learning

no code implementations ICCV 2023 Dahuin Jung, Dongyoon Han, Jihwan Bang, Hwanjun Song

However, we observe that the use of a prompt pool creates a domain scalability problem between pre-training and continual learning.

Continual Learning

Can We Find Strong Lottery Tickets in Generative Models?

no code implementations 16 Dec 2022 Sangyeop Yeo, Yoojin Jang, Jy-yong Sohn, Dongyoon Han, Jaejun Yoo

To the best of our knowledge, we are the first to show the existence of strong lottery tickets in generative models and provide an algorithm to find them stably.

Model Compression Network Pruning

Loss-based Sequential Learning for Active Domain Adaptation

no code implementations 25 Apr 2022 Kyeongtak Han, Youngeun Kim, Dongyoon Han, Sungeun Hong

To solve these, we fully utilize pseudo labels of the unlabeled target domain by leveraging loss prediction.

Diversity Domain Adaptation

Frequency Selective Augmentation for Video Representation Learning

no code implementations 8 Apr 2022 Jinhyung Kim, Taeoh Kim, Minho Shim, Dongyoon Han, Dongyoon Wee, Junmo Kim

FreqAug stochastically removes specific frequency components from the video so that learned representation captures essential features more from the remaining information for various downstream tasks.

Action Recognition Data Augmentation +3

Demystifying the Neural Tangent Kernel from a Practical Perspective: Can it be trusted for Neural Architecture Search without training?

1 code implementation CVPR 2022 Jisoo Mok, Byunggook Na, Ji-Hoon Kim, Dongyoon Han, Sungroh Yoon

To take such non-linear characteristics into account, we introduce Label-Gradient Alignment (LGA), a novel NTK-based metric whose inherent formulation allows it to capture the large amount of non-linear advantage present in modern neural architectures.

Neural Architecture Search

Learning Features with Parameter-Free Layers

1 code implementation ICLR 2022 Dongyoon Han, Youngjoon Yoo, Beomyoung Kim, Byeongho Heo

We aim to break the stereotype of organizing the spatial operations of building blocks into trainable layers.

OCR-free Document Understanding Transformer

4 code implementations 30 Nov 2021 Geewook Kim, Teakgyu Hong, Moonbin Yim, Jeongyeon Nam, Jinyoung Park, Jinyeong Yim, Wonseok Hwang, Sangdoo Yun, Dongyoon Han, Seunghyun Park

Current Visual Document Understanding (VDU) methods outsource the task of reading text to off-the-shelf Optical Character Recognition (OCR) engines and focus on the understanding task with the OCR outputs.

Document Image Classification document understanding +3

Contrastive Vicinal Space for Unsupervised Domain Adaptation

1 code implementation 26 Nov 2021 Jaemin Na, Dongyoon Han, Hyung Jin Chang, Wonjun Hwang

In the contrastive space, inter-domain discrepancy is mitigated by constraining instances to have contrastive views and labels, and the consensus space reduces the confusion between intra-domain categories.

Unsupervised Domain Adaptation

Rethinking Spatial Dimensions of Vision Transformers

10 code implementations ICCV 2021 Byeongho Heo, Sangdoo Yun, Dongyoon Han, Sanghyuk Chun, Junsuk Choe, Seong Joon Oh

We empirically show that such a spatial dimension reduction is beneficial to a transformer architecture as well, and propose a novel Pooling-based Vision Transformer (PiT) upon the original ViT model.

Dimensionality Reduction Image Classification +2

VideoMix: Rethinking Data Augmentation for Video Classification

3 code implementations 7 Dec 2020 Sangdoo Yun, Seong Joon Oh, Byeongho Heo, Dongyoon Han, Jinhyung Kim

Recent data augmentation strategies have been reported to address the overfitting problems in static image classifiers.

Action Localization Action Recognition +5

Rethinking Channel Dimensions for Efficient Model Design

10 code implementations CVPR 2021 Dongyoon Han, Sangdoo Yun, Byeongho Heo, Youngjoon Yoo

We then investigate the channel configuration of a model by searching network architectures concerning the channel configuration under the computational cost restriction.

Image Classification Instance Segmentation +4

AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights

4 code implementations ICLR 2021 Byeongho Heo, Sanghyuk Chun, Seong Joon Oh, Dongyoon Han, Sangdoo Yun, Gyuwan Kim, Youngjung Uh, Jung-Woo Ha

Because of the scale invariance, this modification only alters the effective step sizes without changing the effective update directions, thus enjoying the original convergence properties of GD optimizers.

Audio Classification Image Classification +3

An Empirical Evaluation on Robustness and Uncertainty of Regularization Methods

no code implementations 9 Mar 2020 Sanghyuk Chun, Seong Joon Oh, Sangdoo Yun, Dongyoon Han, Junsuk Choe, Youngjoon Yoo

Despite apparent human-level performances of deep neural networks (DNN), they behave fundamentally differently from humans.

Bayesian Inference

EXTD: Extremely Tiny Face Detector via Iterative Filter Reuse

2 code implementations 15 Jun 2019 YoungJoon Yoo, Dongyoon Han, Sangdoo Yun

In this paper, we propose a new multi-scale face detector having an extremely tiny number of parameters (EXTD), less than 0.1 million, as well as achieving comparable performance to deep heavy detectors.

Face Detection

Character Region Awareness for Text Detection

18 code implementations CVPR 2019 Youngmin Baek, Bado Lee, Dongyoon Han, Sangdoo Yun, Hwalsuk Lee

Scene text detection methods based on neural networks have emerged recently and have shown promising results.

Ranked #1 on Scene Text Detection on ICDAR 2013 (Precision metric)

Scene Text Detection Text Detection

C3: Concentrated-Comprehensive Convolution and its application to semantic segmentation

2 code implementations 12 Dec 2018 Hyojin Park, Youngjoon Yoo, Geonseok Seo, Dongyoon Han, Sangdoo Yun, Nojun Kwak

To resolve this problem, we propose a new block called Concentrated-Comprehensive Convolution (C3) which applies the asymmetric convolutions before the depth-wise separable dilated convolution to compensate for the information loss due to dilated convolution.

Semantic Segmentation

Deep Pyramidal Residual Networks

9 code implementations CVPR 2017 Dongyoon Han, Jiwhan Kim, Junmo Kim

This design, which is discussed in depth together with our new insights, has proven to be an effective means of improving generalization ability.

General Classification Image Classification

Unsupervised Simultaneous Orthogonal Basis Clustering Feature Selection

no code implementations CVPR 2015 Dongyoon Han, Junmo Kim

Unlike the recent unsupervised feature selection methods, SOCFS does not explicitly use the pre-computed local structure information for data points represented as additional terms of their objective functions, but directly computes latent cluster information by the target matrix conducting orthogonal basis clustering in a single unified term of the proposed objective function.

Clustering feature selection

Salient Region Detection via High-Dimensional Color Transform

no code implementations CVPR 2014 Jiwhan Kim, Dongyoon Han, Yu-Wing Tai, Junmo Kim

By mapping a low dimensional RGB color to a feature vector in a high-dimensional color space, we show that we can linearly separate the salient regions from the background by finding an optimal linear combination of color coefficients in the high-dimensional color space.

Vocal Bursts Intensity Prediction
