Search Results for author: Jaekyeom Kim

Found 9 papers, 6 papers with code

Model-Agnostic Boundary-Adversarial Sampling for Test-Time Generalization in Few-Shot Learning

1 code implementation ECCV 2020 Jaekyeom Kim, Hyoungseok Kim, Gunhee Kim

Few-shot learning is an important research problem that tackles one of the greatest challenges of machine learning: learning a new task from a limited amount of labeled data.

Few-Shot Learning

Small Language Models Need Strong Verifiers to Self-Correct Reasoning

no code implementations 26 Apr 2024 Yunxiang Zhang, Muhammad Khalifa, Lajanugen Logeswaran, Jaekyeom Kim, Moontae Lee, Honglak Lee, Lu Wang

Self-correction has emerged as a promising solution to boost the reasoning performance of large language models (LLMs), where LLMs refine their solutions using self-generated critiques that pinpoint the errors.

Math

Lipschitz-constrained Unsupervised Skill Discovery

1 code implementation ICLR 2022 Seohong Park, Jongwook Choi, Jaekyeom Kim, Honglak Lee, Gunhee Kim

To address this issue, we propose Lipschitz-constrained Skill Discovery (LSD), which encourages the agent to discover more diverse, dynamic, and far-reaching skills.

Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods

1 code implementation NeurIPS 2021 Seohong Park, Jaekyeom Kim, Gunhee Kim

SAR can handle the stochasticity of environments by adaptively reacting to changes in states during action repetition.

Policy Gradient Methods

Unsupervised Skill Discovery with Bottleneck Option Learning

1 code implementation 27 Jun 2021 Jaekyeom Kim, Seohong Park, Gunhee Kim

Having the ability to acquire inherent skills from environments without any external rewards or supervision like humans is an important problem.

Disentanglement

Drop-Bottleneck: Learning Discrete Compressed Representation for Noise-Robust Exploration

1 code implementation ICLR 2021 Jaekyeom Kim, Minjung Kim, Dongyeon Woo, Gunhee Kim

We propose a novel information bottleneck (IB) method named Drop-Bottleneck, which discretely drops features that are irrelevant to the target variable.

Adversarial Robustness, Dimensionality Reduction

EMI: Exploration with Mutual Information

1 code implementation 2 Oct 2018 Hyoungseok Kim, Jaekyeom Kim, Yeonwoo Jeong, Sergey Levine, Hyun Oh Song

Reinforcement learning algorithms struggle when the reward signal is very sparse.

Continuous Control, Reinforcement Learning (RL)

EMI: Exploration with Mutual Information Maximizing State and Action Embeddings

no code implementations 27 Sep 2018 HyoungSeok Kim, Jaekyeom Kim, Yeonwoo Jeong, Sergey Levine, Hyun Oh Song

Policy optimization struggles when the reward feedback signal is very sparse and essentially becomes a random search algorithm until the agent stumbles upon a rewarding or the goal state.

Continuous Control
