Search Results for author: Kyungmin Lee

Found 22 papers, 9 papers with code

Calibrated Multi-Preference Optimization for Aligning Diffusion Models

no code implementations · 4 Feb 2025 · Kyungmin Lee, Xiaohang Li, Qifei Wang, Junfeng He, Junjie Ke, Ming-Hsuan Yang, Irfan Essa, Jinwoo Shin, Feng Yang, Yinxiao Li

To address this, we present Calibrated Preference Optimization (CaPO), a novel method to align T2I diffusion models by incorporating general preferences from multiple reward models without human-annotated data.
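A minimal sketch of the general idea of aggregating several reward models into one calibrated preference signal, assuming a Bradley-Terry style win probability per reward model; the function name and aggregation rule are illustrative, not CaPO's exact formulation:

import math

def calibrated_preference(scores_a, scores_b):
    """Estimate how often generation A beats generation B across reward models.

    scores_a, scores_b: lists of scalar rewards, one entry per reward model.
    Returns the mean Bradley-Terry win probability of A over B.
    """
    probs = [1.0 / (1.0 + math.exp(s_b - s_a))  # sigmoid(s_a - s_b)
             for s_a, s_b in zip(scores_a, scores_b)]
    return sum(probs) / len(probs)

# Example: three reward models score two candidate generations.
print(calibrated_preference([0.9, 0.4, 0.7], [0.2, 0.5, 0.1]))  # ~0.60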

DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing

1 code implementation · 8 Oct 2024 · June Suk Choi, Kyungmin Lee, Jongheon Jeong, Saining Xie, Jinwoo Shin, Kimin Lee

Through extensive experiments, we show that our method achieves stronger protection and improved mask robustness with lower computational costs compared to the strongest baseline.

Image Manipulation
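As background, a minimal PGD-style sketch of how protective perturbations against diffusion-based editing are typically optimized; DiffusionGuard's actual objective and mask handling differ, and loss_fn here stands in for whatever term disrupts the editing model:

import torch

def protect(image, loss_fn, eps=8/255, step=1/255, iters=40):
    """Return a protected copy of image: a perturbation bounded by eps in
    L_inf norm that maximizes loss_fn (gradient ascent with sign steps)."""
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(iters):
        loss_fn(image + delta).backward()
        with torch.no_grad():
            delta += step * delta.grad.sign()  # ascend the protection loss
            delta.clamp_(-eps, eps)            # project back to the L_inf ball
        delta.grad.zero_()
    return (image + delta).clamp(0, 1).detach()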

Investigating Pre-Training Objectives for Generalization in Vision-Based Reinforcement Learning

1 code implementation · 10 Jun 2024 · Donghu Kim, Hojoon Lee, Kyungmin Lee, Dongyoon Hwang, Jaegul Choo

In contrast, objectives focused on learning task-specific knowledge (e.g., identifying agents and fitting reward functions) improve performance in environments similar to the pre-training dataset but not in varied ones.

Atari Games · Reinforcement Learning (RL)
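To make the distinction concrete, a hedged sketch of one "task-specific" objective mentioned above, reward prediction, where a shared encoder is pre-trained to regress rewards from observations (the architecture and interface are assumptions):

import torch
import torch.nn as nn

class RewardPredictor(nn.Module):
    def __init__(self, encoder, feat_dim):
        super().__init__()
        self.encoder = encoder              # shared visual backbone
        self.head = nn.Linear(feat_dim, 1)  # scalar reward regressor

    def loss(self, obs, reward):
        pred = self.head(self.encoder(obs)).squeeze(-1)
        return nn.functional.mse_loss(pred, reward)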

DreamFlow: High-Quality Text-to-3D Generation by Approximating Probability Flow

no code implementations · 22 Mar 2024 · Kyungmin Lee, Kihyuk Sohn, Jinwoo Shin

Recent progress in text-to-3D generation has been driven by score distillation methods, which leverage pre-trained text-to-image (T2I) diffusion models by distilling through the diffusion model training objective.

3D Generation · Image-to-Image Translation +1
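For reference, the standard Score Distillation Sampling (SDS) gradient that such methods build on; DreamFlow itself replaces this with a probability-flow approximation, which is not reproduced here, and the unet call signature is an assumption:

import torch

def sds_grad(unet, latents, text_emb, alphas_cumprod, w=1.0):
    """Standard SDS: perturb latents with diffusion noise and use the
    denoiser's error as a gradient direction (stop-gradient on the unet)."""
    t = torch.randint(20, 980, (latents.shape[0],), device=latents.device)
    a = alphas_cumprod[t].view(-1, 1, 1, 1)
    noise = torch.randn_like(latents)
    noisy = a.sqrt() * latents + (1 - a).sqrt() * noise
    eps_pred = unet(noisy, t, text_emb)  # predicted noise
    return w * (eps_pred - noise)        # gradient w.r.t. the latents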

Improving Diffusion Models for Authentic Virtual Try-on in the Wild

1 code implementation · 8 Mar 2024 · Yisol Choi, Sangkyung Kwak, Kyungmin Lee, Hyungwon Choi, Jinwoo Shin

Finally, we present a customization method using a pair of person-garment images, which significantly improves fidelity and authenticity.

Virtual Try-on
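A minimal sketch of what pair-based customization could look like: briefly fine-tune the try-on diffusion model on a single person-garment pair with the usual denoising loss. The conditioning interface model(noisy, garment, t) and the hyperparameters are assumptions, not the paper's exact recipe:

import torch
import torch.nn.functional as F

def customize(model, person_lat, garment, alphas_cumprod, steps=200, lr=1e-5):
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    for _ in range(steps):
        t = torch.randint(0, 1000, (person_lat.shape[0],), device=person_lat.device)
        a = alphas_cumprod[t].view(-1, 1, 1, 1)
        noise = torch.randn_like(person_lat)
        noisy = a.sqrt() * person_lat + (1 - a).sqrt() * noise
        loss = F.mse_loss(model(noisy, garment, t), noise)  # denoising loss
        opt.zero_grad(); loss.backward(); opt.step()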

Direct Consistency Optimization for Robust Customization of Text-to-Image Diffusion Models

no code implementations · 19 Feb 2024 · Kyungmin Lee, Sangkyung Kwak, Kihyuk Sohn, Jinwoo Shin

Through extensive experiments on subject and style customization, we demonstrate that our method lies on a superior Pareto frontier between subject (or style) consistency and image-text alignment relative to all previous baselines; it not only outperforms the regular fine-tuning objective in image-text alignment, but also shows higher fidelity to the reference images than methods that fine-tune with an additional prior dataset.
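A hedged sketch of a consistency-regularized fine-tuning loss in this spirit: compare the fine-tuned model's denoising error against a frozen pretrained reference, so the model only deviates where it genuinely improves (beta and the exact functional form are assumptions):

import torch.nn.functional as F

def consistency_loss(eps_theta, eps_ref, noise, beta=1.0):
    """eps_theta / eps_ref: noise predictions of the fine-tuned and frozen
    reference models on the same noisy latents; noise: the true noise."""
    err_theta = F.mse_loss(eps_theta, noise, reduction='none').mean((1, 2, 3))
    err_ref = F.mse_loss(eps_ref, noise, reduction='none').mean((1, 2, 3))
    # reward fine-tuned predictions only where they beat the reference
    return -F.logsigmoid(-beta * (err_theta - err_ref)).mean()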

Collaborative Score Distillation for Consistent Visual Synthesis

2 code implementations · 4 Jul 2023 · Subin Kim, Kyungmin Lee, June Suk Choi, Jongheon Jeong, Kihyuk Sohn, Jinwoo Shin

Generative priors of large-scale text-to-image diffusion models enable a wide range of new generation and editing applications on diverse visual modalities.
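A heavily hedged sketch of the coupling idea: mix per-image score-distillation gradients across a batch with a kernel over the latents, SVGD-style, so that edits to related images move together (the kernel and mixing rule here are assumptions, not CSD's exact update):

import torch

def coupled_grads(grads, latents, h=1.0):
    """grads, latents: tensors of shape (B, C, H, W); returns kernel-weighted
    averages of the per-sample gradients."""
    flat = latents.flatten(1)
    k = torch.exp(-torch.cdist(flat, flat) ** 2 / (2 * h ** 2))  # RBF kernel
    k = k / k.sum(1, keepdim=True)                               # row-normalize
    return torch.einsum('ij,jchw->ichw', k, grads)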

S-CLIP: Semi-supervised Vision-Language Learning using Few Specialist Captions

1 code implementation · NeurIPS 2023 · Sangwoo Mo, Minkyu Kim, Kyungmin Lee, Jinwoo Shin

By combining these objectives, S-CLIP significantly enhances the training of CLIP using only a few image-text pairs, as demonstrated in various specialist domains, including remote sensing, fashion, scientific figures, and comics.

Contrastive Learning · Image-text Retrieval +4
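A minimal sketch of the pseudo-labeling idea for the unpaired images: give each unpaired image a soft assignment over the captions of the paired set, based on embedding similarity (the temperature and assignment rule are assumptions, not S-CLIP's exact scheme):

import torch
import torch.nn.functional as F

def soft_caption_targets(unpaired_emb, paired_emb, tau=0.1):
    unpaired_emb = F.normalize(unpaired_emb, dim=-1)
    paired_emb = F.normalize(paired_emb, dim=-1)
    sim = unpaired_emb @ paired_emb.t()   # cosine similarities
    return F.softmax(sim / tau, dim=-1)   # soft targets over paired captions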

STUNT: Few-shot Tabular Learning with Self-generated Tasks from Unlabeled Tables

1 code implementation · 2 Mar 2023 · Jaehyun Nam, Jihoon Tack, Kyungmin Lee, Hankook Lee, Jinwoo Shin

Learning with few labeled tabular samples is often an essential requirement for industrial machine learning applications, since many varieties of tabular data suffer from high annotation costs or difficulties in collecting new samples for novel tasks.

Few-Shot Learning
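A hedged sketch of the self-generated-task idea: cluster rows on a random subset of columns and treat the cluster ids as pseudo-labels that the remaining columns must predict, yielding a few-shot episode from unlabeled data (details such as the column fraction are assumptions):

import numpy as np
from sklearn.cluster import KMeans

def make_task(table, n_classes=5, frac=0.5, seed=0):
    """table: 2-D numeric array of unlabeled rows; returns (features, labels)."""
    rng = np.random.default_rng(seed)
    cols = rng.choice(table.shape[1], max(1, int(frac * table.shape[1])), replace=False)
    labels = KMeans(n_clusters=n_classes, n_init=10).fit_predict(table[:, cols])
    features = np.delete(table, cols, axis=1)  # predict pseudo-labels from the rest
    return features, labels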

Discovering and Mitigating Visual Biases through Keyword Explanation

1 code implementation · CVPR 2024 · Younghyun Kim, Sangwoo Mo, Minkyu Kim, Kyungmin Lee, Jaeho Lee, Jinwoo Shin

The keyword explanation form of visual bias offers several advantages, such as clear group naming for bias discovery and a natural extension to debiasing using these group names.

Image Classification · Image Generation
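A simplified sketch of keyword-based bias discovery: compare word frequencies in captions of misclassified versus correctly classified images and surface the most skewed keywords (the captioning model and the paper's CLIP-based validation step are omitted):

from collections import Counter

def bias_keywords(captions_wrong, captions_right, top_k=10):
    wrong = Counter(w for c in captions_wrong for w in c.lower().split())
    right = Counter(w for c in captions_right for w in c.lower().split())
    score = {w: wrong[w] - right.get(w, 0) for w in wrong}  # skew toward errors
    return sorted(score, key=score.get, reverse=True)[:top_k]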

GCISG: Guided Causal Invariant Learning for Improved Syn-to-real Generalization

no code implementations · 22 Aug 2022 · Gilhyun Nam, Gyeongjae Choi, Kyungmin Lee

In sum, we refer to our method as Guided Causal Invariant Syn-to-real Generalization (GCISG), which effectively improves the performance of syn-to-real generalization.

Domain Generalization · Image Classification +1

RenyiCL: Contrastive Representation Learning with Skew Renyi Divergence

1 code implementation · 12 Aug 2022 · Kyungmin Lee, Jinwoo Shin

Here, the quality of learned representations is sensitive to the choice of data augmentation: the harder the data augmentations applied, the more task-relevant information the views share, but also the more task-irrelevant information, which can hinder the generalization capability of the representations.

Contrastive Learning · Data Augmentation +1
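For background, the standard InfoNCE contrastive loss that RenyiCL generalizes; the paper's skew Renyi objective itself is not reproduced here:

import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.2):
    """z1, z2: embeddings of two augmented views, shape (B, D)."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau               # pairwise similarities
    targets = torch.arange(z1.shape[0], device=z1.device)
    return F.cross_entropy(logits, targets)  # match each view to its pair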

Prototypical Contrastive Predictive Coding

no code implementations · ICLR 2022 · Kyungmin Lee

Transferring the representational knowledge of one model to another is a wide-ranging topic in machine learning.

Contrastive Learning · Knowledge Distillation +2

Efficient randomized smoothing by denoising with learned score function

no code implementations · 1 Jan 2021 · Kyungmin Lee, Seyoon Oh

In this work, we present an efficient method for randomized smoothing that does not require any re-training of classifiers.

Image Denoising
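A minimal sketch of denoised smoothing, assuming a denoiser(noisy, sigma) interface and a 10-class classifier: add Gaussian noise, denoise, classify, and take a majority vote (the certification machinery of Cohen et al., 2019 is omitted):

import torch

def smoothed_predict(classifier, denoiser, x, sigma=0.25, n=100):
    votes = torch.zeros(10, dtype=torch.long)        # assumes 10 classes
    for _ in range(n):
        noisy = x + sigma * torch.randn_like(x)
        logits = classifier(denoiser(noisy, sigma))  # denoise, then classify
        votes[logits.argmax(-1)] += 1
    return votes.argmax().item()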

Applying GPGPU to Recurrent Neural Network Language Model based Fast Network Search in the Real-Time LVCSR

no code implementations · 23 Jul 2020 · Kyungmin Lee, Chiyoun Park, Ilhwan Kim, Namhoon Kim, Jaewon Lee

Recurrent Neural Network Language Models (RNNLMs) have started to be used in various fields of speech recognition due to their outstanding performance.

Language Modeling · Language Modelling +2

End-to-end Training of a Large Vocabulary End-to-end Speech Recognition System

no code implementations · 22 Dec 2019 · Chanwoo Kim, Sungsoo Kim, Kwangyoun Kim, Mehul Kumar, Jiyeon Kim, Kyungmin Lee, Changwoo Han, Abhinav Garg, Eunhyang Kim, Minkyoo Shin, Shatrughan Singh, Larry Heck, Dhananjaya Gowda

Our end-to-end speech recognition system built using this training infrastructure showed a 2.44% WER on the test-clean subset of LibriSpeech after applying shallow fusion with a Transformer language model (LM).

Data Augmentation · Language Modelling +2
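Shallow fusion itself reduces to interpolating ASR and external LM log-probabilities when ranking beam-search hypotheses; a minimal sketch (the weight 0.3 is illustrative):

def fused_score(asr_logp, lm_logp, lam=0.3):
    """Hypothesis score under shallow fusion with an external LM."""
    return asr_logp + lam * lm_logp

# e.g., best = max(hyps, key=lambda h: fused_score(h.asr_logp, h.lm_logp))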

Local Spectroscopies Reveal Percolative Metal in Disordered Mott Insulators

no code implementations · 29 Jul 2019 · Joseph C. Szabo, Kyungmin Lee, Vidya Madhavan, Nandini Trivedi

We elucidate the mechanism by which a Mott insulator transforms into a non-Fermi liquid metal upon increasing disorder at half filling.

Strongly Correlated Electrons · Disordered Systems and Neural Networks

Accelerating recurrent neural network language model based online speech recognition system

no code implementations · 30 Jan 2018 · Kyungmin Lee, Chiyoun Park, Namhoon Kim, Jaewon Lee

This paper presents methods to accelerate recurrent neural network based language models (RNNLMs) for online speech recognition systems.

Language Modeling · Language Modelling +2
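One common acceleration trick in this setting, sketched with an assumed rnnlm.step(word, state) interface: cache RNN hidden states keyed by word history so that shared beam-search prefixes are evaluated only once (the paper's actual optimizations may differ; treat this as illustrative):

def score_with_cache(rnnlm, history, word, cache):
    key = tuple(history)
    if key not in cache:                 # compute the state for an unseen prefix
        state = rnnlm.initial_state()
        for w in history:
            _, state = rnnlm.step(w, state)
        cache[key] = state
    logp, state = rnnlm.step(word, cache[key])
    cache[key + (word,)] = state         # memoize the extended prefix
    return logp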
