1 code implementation • 10 Jan 2025 • Taywon Min, Haeone Lee, Hanho Ryu, Yongchan Kwon, Kimin Lee
By quantifying the impact of human feedback on reward models, we believe that influence functions can enhance feedback interpretability and contribute to scalable oversight in RLHF, helping labelers provide more accurate and consistent feedback.
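As a rough illustration only (not the paper's estimator), the sketch below scores a single piece of feedback by the alignment between its loss gradient and the gradient on a validation example; full influence functions additionally involve an inverse-Hessian-vector product, omitted here. The `loss_fn` interface and example format are assumptions.

```python
# Hedged sketch: first-order (gradient-similarity) influence of one
# preference-labeled example on a reward model's validation loss.
import torch

def flat_grad(loss, model):
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])

def influence_score(reward_model, loss_fn, train_example, val_example):
    g_train = flat_grad(loss_fn(reward_model, train_example), reward_model)
    g_val = flat_grad(loss_fn(reward_model, val_example), reward_model)
    # Positive score: this feedback pushes the model in a direction that also
    # reduces validation loss; a strongly negative score flags feedback worth
    # re-checking for labeling errors.
    return torch.dot(g_train, g_val).item()
```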
no code implementations • 31 Oct 2024 • Eugene Jang, Kimin Lee, Jin-Woo Chung, Keuntae Park, Seungwon Shin
In this work, we investigate incomplete tokens, i.e., undecodable tokens with stray bytes resulting from byte-level byte-pair encoding (BPE) tokenization.
1 code implementation • 23 Oct 2024 • Juyong Lee, Dongyoon Hahm, June Suk Choi, W. Bradley Knox, Kimin Lee
In this work, we introduce MobileSafetyBench, a benchmark designed to evaluate the safety of device-control agents within a realistic mobile environment based on Android emulators.
no code implementations • 18 Oct 2024 • Hanna Kim, Minkyoo Song, Seung Ho Na, Seungwon Shin, Kimin Lee
Recent advancements in Large Language Models (LLMs) have established them as agentic systems capable of planning and interacting with various tools.
no code implementations • 15 Oct 2024 • Seonghyeon Ye, Joel Jang, Byeongguk Jeon, Sejune Joo, Jianwei Yang, Baolin Peng, Ajay Mandlekar, Reuben Tan, Yu-Wei Chao, Bill Yuchen Lin, Lars Liden, Kimin Lee, Jianfeng Gao, Luke Zettlemoyer, Dieter Fox, Minjoon Seo
We introduce Latent Action Pretraining for general Action models (LAPA), an unsupervised method for pretraining Vision-Language-Action (VLA) models without ground-truth robot action labels.
no code implementations • 14 Oct 2024 • Yongjin Yang, Sihyeon Kim, Hojung Jung, Sangmin Bae, Sangmook Kim, Se-Young Yun, Kimin Lee
Fine-tuning text-to-image diffusion models with human feedback is an effective method for aligning model behavior with human intentions.
1 code implementation • 8 Oct 2024 • June Suk Choi, Kyungmin Lee, Jongheon Jeong, Saining Xie, Jinwoo Shin, Kimin Lee
Through extensive experiments, we show that our method achieves stronger protection and improved mask robustness with lower computational costs compared to the strongest baseline.
1 code implementation • 4 Oct 2024 • KyuYoung Kim, Ah Jeong Seo, Hao Liu, Jinwoo Shin, Kimin Lee
Large language models (LLMs) fine-tuned with alignment techniques, such as reinforcement learning from human feedback, have been instrumental in developing some of the most capable AI systems to date.
1 code implementation • 15 Jul 2024 • Hyungjun Yoon, Biniyam Aschalew Tolera, Taesik Gong, Kimin Lee, Sung-Ju Lee
We design a visual prompt that directs MLLMs to utilize visualized sensor data alongside the target sensory task descriptions.
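A small illustrative sketch of this kind of visual prompting, with an assumed accelerometer input and a hypothetical `query_mllm` call standing in for whatever multimodal API is actually used:

```python
# Sketch only: render sensor readings as a plot image and pair it with a
# task description for a multimodal LLM.
import io
import matplotlib.pyplot as plt
import numpy as np

def sensor_to_image(readings: np.ndarray, sample_rate_hz: int) -> bytes:
    """Plot a (T, 3) accelerometer trace and return it as PNG bytes."""
    t = np.arange(len(readings)) / sample_rate_hz
    fig, ax = plt.subplots(figsize=(6, 3))
    for i, axis in enumerate("xyz"):
        ax.plot(t, readings[:, i], label=f"acc_{axis}")
    ax.set_xlabel("time (s)")
    ax.set_ylabel("acceleration (m/s^2)")
    ax.legend()
    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    plt.close(fig)
    return buf.getvalue()

readings = np.random.randn(500, 3)  # placeholder sensor data
image_png = sensor_to_image(readings, sample_rate_hz=50)
prompt = (
    "The image shows a 10-second, 3-axis accelerometer trace from a smartphone. "
    "Task: classify the user's activity as one of {walking, running, sitting}."
)
# answer = query_mllm(image=image_png, text=prompt)  # hypothetical MLLM call
```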
no code implementations • 24 Jun 2024 • Katherine M. Collins, Najoung Kim, Yonatan Bitton, Verena Rieser, Shayegan Omidshafiei, Yushi Hu, Sherol Chen, Senjuti Dutta, Minsuk Chang, Kimin Lee, Youwei Liang, Georgina Evans, Sahil Singla, Gang Li, Adrian Weller, Junfeng He, Deepak Ramachandran, Krishnamurthy Dj Dvijotham
Human feedback plays a critical role in learning and refining reward models for text-to-image generation, but the optimal form the feedback should take for learning an accurate reward function has not been conclusively established.
no code implementations • 6 Jun 2024 • Dongyoung Kim, Kimin Lee, Jinwoo Shin, Jaehyung Kim
To tackle this problem, we propose a new framework that boosts the alignment of LLMs through Self-generated Preference data (Selfie) using only a very small amount of human-annotated preference data.
no code implementations • 25 Apr 2024 • Juyong Lee, Taywon Min, Minyong An, Dongyoon Hahm, Haeone Lee, Changyeon Kim, Kimin Lee
In this work, we introduce B-MoCA: a novel benchmark with interactive environments for evaluating and developing mobile device control agents.
no code implementations • 5 Apr 2024 • Sangwon Jang, Jaehyeong Jo, Kimin Lee, Sung Ju Hwang
Experimental results show that our MuDI can produce high-quality personalized images without identity mixing, even for highly similar subjects as shown in Figure 1.
1 code implementation • 2 Apr 2024 • KyuYoung Kim, Jongheon Jeong, Minyong An, Mohammad Ghavamzadeh, Krishnamurthy Dvijotham, Jinwoo Shin, Kimin Lee
To investigate this issue in depth, we introduce the Text-Image Alignment Assessment (TIA2) benchmark, which comprises a diverse collection of text prompts, images, and human annotations.
no code implementations • CVPR 2024 • Minyoung Hwang, Luca Weihs, Chanwoo Park, Kimin Lee, Aniruddha Kembhavi, Kiana Ehsani
Customizing robotic behaviors to be aligned with diverse human preferences is an underexplored challenge in the field of embodied AI.
no code implementations • 4 Dec 2023 • Daewon Chae, Nokyung Park, Jinkyu Kim, Kimin Lee
In this work, we introduce InstructBooth, a novel method designed to enhance image-text alignment in personalized text-to-image models without sacrificing the personalization ability.
4 code implementations • 1 Jun 2023 • Kihyuk Sohn, Nataniel Ruiz, Kimin Lee, Daniel Castro Chin, Irina Blok, Huiwen Chang, Jarred Barber, Lu Jiang, Glenn Entis, Yuanzhen Li, Yuan Hao, Irfan Essa, Michael Rubinstein, Dilip Krishnan
Pre-trained large text-to-image models synthesize impressive images with an appropriate use of text prompts.
2 code implementations • 25 May 2023 • Ying Fan, Olivia Watkins, Yuqing Du, Hao Liu, MoonKyung Ryu, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Kangwook Lee, Kimin Lee
We focus on diffusion models, defining the fine-tuning task as an RL problem and updating the pre-trained text-to-image diffusion models using policy gradient to maximize the feedback-trained reward.
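A hedged sketch of the described update (placeholder interfaces `sample_with_logprob` and `reward_model`; not the authors' implementation): a REINFORCE-style step that increases the likelihood of samples the learned reward model scores highly.

```python
# Sketch: reward-weighted policy-gradient fine-tuning of a text-to-image
# diffusion model. All model interfaces are placeholders.
import torch

def policy_gradient_step(diffusion_model, reward_model, prompts, optimizer):
    # Sample images and the log-probability of each denoising trajectory.
    images, log_probs = diffusion_model.sample_with_logprob(prompts)

    with torch.no_grad():
        rewards = reward_model(images, prompts)   # (batch,)
        advantages = rewards - rewards.mean()     # simple mean baseline

    # REINFORCE objective: raise the likelihood of high-reward samples.
    loss = -(advantages * log_probs).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item(), rewards.mean().item()
```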
1 code implementation • 2 Mar 2023 • Changyeon Kim, Jongjin Park, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee
In this paper, we present Preference Transformer, a neural architecture that models human preferences using transformers.
no code implementations • 23 Feb 2023 • Kimin Lee, Hao Liu, MoonKyung Ryu, Olivia Watkins, Yuqing Du, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Shixiang Shane Gu
Our results demonstrate the potential for learning from human feedback to significantly improve text-to-image models.
3 code implementations • 10 Feb 2023 • Seohong Park, Kimin Lee, Youngwoon Lee, Pieter Abbeel
One of the key capabilities of intelligent agents is the ability to discover useful skills without external supervision.
1 code implementation • 5 Feb 2023 • Younggyo Seo, Junsu Kim, Stephen James, Kimin Lee, Jinwoo Shin, Pieter Abbeel
In this paper, we investigate how to learn good representations with multi-view data and utilize them for visual robotic manipulation.
1 code implementation • 24 Oct 2022 • Hao Liu, Lisa Lee, Kimin Lee, Pieter Abbeel
Our method consists of a multimodal transformer that encodes visual observations and language instructions, and a transformer-based policy that predicts actions based on encoded representations.
no code implementations • 15 Sep 2022 • Younggyo Seo, Kimin Lee, Fangchen Liu, Stephen James, Pieter Abbeel
Video prediction is an important yet challenging problem, burdened with the tasks of generating future frames and learning environment dynamics.
no code implementations • 28 Jun 2022 • Younggyo Seo, Danijar Hafner, Hao Liu, Fangchen Liu, Stephen James, Kimin Lee, Pieter Abbeel
Yet the current approaches typically train a single model end-to-end for learning both visual representations and dynamics, making it difficult to accurately model the interaction between robots and small objects.
2 code implementations • ICLR 2022 • Xinran Liang, Katherine Shu, Kimin Lee, Pieter Abbeel
Our intuition is that disagreement in the learned reward model reflects uncertainty in tailored human feedback and could be useful for exploration.
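One way to read this (a sketch with assumed network shapes and an assumed bonus weight `beta`, not the authors' code) is to add the standard deviation across an ensemble of learned reward models as an intrinsic exploration bonus:

```python
# Sketch: reward-ensemble disagreement as an exploration bonus.
import torch
import torch.nn as nn

class RewardEnsemble(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, num_members: int = 3):
        super().__init__()
        self.members = nn.ModuleList([
            nn.Sequential(nn.Linear(obs_dim + act_dim, 256), nn.ReLU(),
                          nn.Linear(256, 1))
            for _ in range(num_members)
        ])

    def forward(self, obs, act):
        x = torch.cat([obs, act], dim=-1)
        preds = torch.stack([m(x).squeeze(-1) for m in self.members])  # (M, B)
        return preds.mean(0), preds.std(0)  # predicted reward, disagreement

def shaped_reward(ensemble, obs, act, beta: float = 0.05):
    mean_r, disagreement = ensemble(obs, act)
    # High disagreement marks regions where the human feedback is uncertain,
    # so the agent is encouraged to visit and resolve them.
    return mean_r + beta * disagreement
```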
2 code implementations • 25 Mar 2022 • Younggyo Seo, Kimin Lee, Stephen James, Pieter Abbeel
Our framework consists of two phases: we pre-train an action-free latent video prediction model, and then utilize the pre-trained representations for efficiently learning action-conditional world models on unseen environments.
no code implementations • ICLR 2022 • Jongjin Park, Younggyo Seo, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee
In order to leverage unlabeled samples for reward learning, we infer pseudo-labels of the unlabeled samples based on the confidence of the preference predictor.
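A minimal sketch of confidence-based pseudo-labeling, assuming a Bradley-Terry-style preference predictor over segment pairs and a hypothetical confidence threshold; this is illustrative, not the authors' implementation.

```python
# Sketch: pseudo-label unlabeled segment pairs only when the preference
# predictor is confident, then mix them with human labels for reward learning.
import torch

def pseudo_label_pairs(preference_predictor, unlabeled_pairs, threshold: float = 0.95):
    """Return (pair, label) tuples for pairs the predictor is confident about."""
    labeled = []
    with torch.no_grad():
        for seg_a, seg_b in unlabeled_pairs:
            # Probability that segment A is preferred over segment B.
            p_a = preference_predictor(seg_a, seg_b)
            if p_a >= threshold:
                labeled.append(((seg_a, seg_b), 0))   # A preferred
            elif p_a <= 1.0 - threshold:
                labeled.append(((seg_a, seg_b), 1))   # B preferred
    return labeled
```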
2 code implementations • NeurIPS 2021 • Hankook Lee, Kibok Lee, Kimin Lee, Honglak Lee, Jinwoo Shin
Recent unsupervised representation learning methods have been shown to be effective in a range of vision tasks by learning representations invariant to data augmentations such as random cropping and color jittering.
1 code implementation • 4 Nov 2021 • Kimin Lee, Laura Smith, Anca Dragan, Pieter Abbeel
However, it is difficult to quantify the progress in preference-based RL due to the lack of a commonly adopted benchmark.
1 code implementation • 28 Oct 2021 • Michael Laskin, Denis Yarats, Hao Liu, Kimin Lee, Albert Zhan, Kevin Lu, Catherine Cang, Lerrel Pinto, Pieter Abbeel
Deep Reinforcement Learning (RL) has emerged as a powerful paradigm to solve a range of complex yet specific control tasks.
no code implementations • 26 Oct 2021 • Zhao Mandi, Fangchen Liu, Kimin Lee, Pieter Abbeel
We then study the multi-task setting, where multi-task training is followed by (i) one-shot imitation on variations within the training tasks, (ii) one-shot imitation on new tasks, and (iii) fine-tuning on new tasks.
no code implementations • 29 Sep 2021 • Younggyo Seo, Kimin Lee, Fangchen Liu, Stephen James, Pieter Abbeel
Video prediction is an important yet challenging problem, burdened with the tasks of generating future frames and learning environment dynamics.
no code implementations • 11 Aug 2021 • Xiaofei Wang, Kimin Lee, Kourosh Hakhamaneshi, Pieter Abbeel, Michael Laskin
A promising approach to solving challenging long-horizon tasks has been to extract behavior priors (skills) by fitting generative models to large offline datasets of demonstrations.
1 code implementation • 1 Jul 2021 • SeungHyun Lee, Younggyo Seo, Kimin Lee, Pieter Abbeel, Jinwoo Shin
Recent advances in deep offline reinforcement learning (RL) have made it possible to train strong robotic agents from offline datasets.
no code implementations • 18 Jun 2021 • Abdus Salam Azad, Edward Kim, Qiancheng Wu, Kimin Lee, Ion Stoica, Pieter Abbeel, Sanjit A. Seshia
To showcase the benefits, we interfaced SCENIC to an existing RTS environment, the Google Research Football (GRF) simulator, and introduced a benchmark consisting of 32 realistic scenarios, encoded in SCENIC, to train RL agents and test their generalization capabilities.
2 code implementations • 9 Jun 2021 • Kimin Lee, Laura Smith, Pieter Abbeel
We also show that our method is able to utilize real-time human feedback to effectively prevent reward exploitation and learn new behaviors that are difficult to specify with standard reward functions.
19 code implementations • NeurIPS 2021 • Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch
In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling.
Ranked #3 on Offline RL on D4RL
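As a rough illustration of casting RL as conditional sequence modeling (a sketch under assumed interfaces, not the released Decision Transformer code), the module below interleaves return-to-go, state, and action embeddings and predicts the next action with a causal transformer:

```python
# Sketch: return-conditioned sequence model over (return-to-go, state, action) tokens.
import torch
import torch.nn as nn

class ReturnConditionedPolicy(nn.Module):
    def __init__(self, state_dim, act_dim, embed_dim=128, n_layers=3, n_heads=4, max_len=60):
        super().__init__()
        self.embed_rtg = nn.Linear(1, embed_dim)
        self.embed_state = nn.Linear(state_dim, embed_dim)
        self.embed_action = nn.Linear(act_dim, embed_dim)
        self.pos = nn.Parameter(torch.zeros(1, 3 * max_len, embed_dim))
        layer = nn.TransformerEncoderLayer(embed_dim, n_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, n_layers)
        self.predict_action = nn.Linear(embed_dim, act_dim)

    def forward(self, rtg, states, actions):
        # rtg: (B, T, 1), states: (B, T, state_dim), actions: (B, T, act_dim)
        B, T, _ = states.shape
        # Interleave tokens per timestep as [return-to-go, state, action].
        tokens = torch.stack(
            [self.embed_rtg(rtg), self.embed_state(states), self.embed_action(actions)],
            dim=2,
        ).reshape(B, 3 * T, -1) + self.pos[:, : 3 * T]
        mask = nn.Transformer.generate_square_subsequent_mask(3 * T)
        h = self.transformer(tokens, mask=mask)
        # Predict actions from the hidden states at the state-token positions.
        return self.predict_action(h[:, 1::3])
```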
1 code implementation • NeurIPS 2021 • Lili Chen, Kimin Lee, Aravind Srinivas, Pieter Abbeel
Recent advances in off-policy deep reinforcement learning (RL) have led to impressive success in complex tasks from visual observations.
Ranked #33 on Atari Games on Atari 2600 Amidar
2 code implementations • ICLR Workshop SSL-RL 2021 • Younggyo Seo, Lili Chen, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee
Recent exploration methods have proven to be a recipe for improving sample-efficiency in deep reinforcement learning (RL).
no code implementations • 1 Jan 2021 • Mandi Zhao, Qiyang Li, Aravind Srinivas, Ignasi Clavera, Kimin Lee, Pieter Abbeel
Attention mechanisms are generic inductive biases that have played a critical role in improving the state-of-the-art in supervised learning, unsupervised pre-training and generative modeling for multiple domains including vision, language and speech.
no code implementations • 1 Jan 2021 • SeungHyun Lee, Younggyo Seo, Kimin Lee, Pieter Abbeel, Jinwoo Shin
As it turns out, fine-tuning offline RL agents is a non-trivial challenge, due to distribution shift – the agent encounters out-of-distribution samples during online interaction, which may cause bootstrapping error in Q-learning and instability during fine-tuning.
no code implementations • 1 Jan 2021 • Lili Chen, Kimin Lee, Aravind Srinivas, Pieter Abbeel
In this paper, we present Latent Vector Experience Replay (LeVER), a simple modification of existing off-policy RL methods, to address these computational and memory requirements without sacrificing the performance of RL agents.
no code implementations • 1 Jan 2021 • Kimin Lee, Michael Laskin, Aravind Srinivas, Pieter Abbeel
Furthermore, since our weighted Bellman backups rely on maintaining an ensemble, we investigate how they interact with other benefits previously derived from ensembles: (a) Bootstrap; (b) UCB Exploration.
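The exact weighting function is not given in this excerpt; the sketch below assumes that disagreement across an ensemble of target Q-networks is mapped through a sigmoid to a confidence weight on each Bellman target, which is one plausible instantiation rather than the authors' exact formulation.

```python
# Sketch: down-weight Bellman targets with high ensemble disagreement so that
# uncertain targets contribute less to the critic loss.
import torch

def weighted_bellman_loss(q_net, target_ensemble, batch, gamma=0.99, temperature=10.0):
    obs, act, rew, next_obs, next_act, done = batch

    with torch.no_grad():
        next_qs = torch.stack([q(next_obs, next_act) for q in target_ensemble])  # (M, B)
        target = rew + gamma * (1.0 - done) * next_qs.mean(0)
        std = next_qs.std(0)
        # Confidence weight in (0.5, 1.5): smaller for high-uncertainty targets.
        weight = torch.sigmoid(-std * temperature) + 0.5

    td_error = q_net(obs, act) - target
    return (weight * td_error.pow(2)).mean()
```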
1 code implementation • 17 Dec 2020 • Seung Jun Moon, Sangwoo Mo, Kimin Lee, Jaeho Lee, Jinwoo Shin
We claim that one central obstacle to reliability is the model's over-reliance on a limited number of keywords, instead of looking at the whole context.
no code implementations • 28 Oct 2020 • Wilka Carvalho, Anthony Liang, Kimin Lee, Sungryull Sohn, Honglak Lee, Richard L. Lewis, Satinder Singh
In this work, we show that one can learn object-interaction tasks from scratch without supervision by learning an attentive object-model as an auxiliary task during task learning with an object-centric relational RL agent.
1 code implementation • NeurIPS 2020 • Younggyo Seo, Kimin Lee, Ignasi Clavera, Thanard Kurutach, Jinwoo Shin, Pieter Abbeel
Model-based reinforcement learning (RL) has shown great potential in various control tasks in terms of both sample-efficiency and final performance.
3 code implementations • 14 Sep 2020 • Adam Stooke, Kimin Lee, Pieter Abbeel, Michael Laskin
In an effort to overcome limitations of reward-driven feature learning in deep reinforcement learning (RL) from images, we propose decoupling representation learning from policy learning.
no code implementations • 3 Aug 2020 • Xingyu Lu, Kimin Lee, Pieter Abbeel, Stas Tiomkin
Despite the significant progress of deep reinforcement learning (RL) in solving sequential decision making problems, RL agents often overfit to training environments and struggle to adapt to new, unseen environments.
no code implementations • ICLR 2021 • Youngmin Oh, Kimin Lee, Jinwoo Shin, Eunho Yang, Sung Ju Hwang
Experience replay, which enables the agents to remember and reuse experience from the past, has played a significant role in the success of off-policy reinforcement learning (RL).
1 code implementation • 9 Jul 2020 • Kimin Lee, Michael Laskin, Aravind Srinivas, Pieter Abbeel
Off-policy deep reinforcement learning (RL) has been successful in a range of challenging domains.
2 code implementations • ICML 2020 • Kimin Lee, Younggyo Seo, Seung-Hyun Lee, Honglak Lee, Jinwoo Shin
Model-based reinforcement learning (RL) enjoys several benefits, such as data-efficiency and planning, by learning a model of the environment's dynamics.
2 code implementations • NeurIPS 2020 • Michael Laskin, Kimin Lee, Adam Stooke, Lerrel Pinto, Pieter Abbeel, Aravind Srinivas
To this end, we present Reinforcement Learning with Augmented Data (RAD), a simple plug-and-play module that can enhance most RL algorithms.
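A minimal sketch of the plug-and-play idea (the crop size and batch layout are assumptions, not RAD's exact configuration): augment image observations before any off-policy update and leave the underlying algorithm unchanged.

```python
# Sketch: random-crop augmentation applied to a batch of image observations.
import torch

def random_crop(obs: torch.Tensor, out_size: int = 84) -> torch.Tensor:
    """obs: (B, C, H, W) image observations; returns randomly cropped views."""
    b, c, h, w = obs.shape
    cropped = torch.empty(b, c, out_size, out_size, dtype=obs.dtype)
    for i in range(b):
        top = torch.randint(0, h - out_size + 1, (1,)).item()
        left = torch.randint(0, w - out_size + 1, (1,)).item()
        cropped[i] = obs[i, :, top:top + out_size, left:left + out_size]
    return cropped

# Usage inside an otherwise unmodified off-policy update, e.g.:
# obs_aug = random_crop(replay_batch.obs)
# loss = agent.update(obs_aug, replay_batch.actions, ...)
```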
1 code implementation • CVPR 2020 • Sukmin Yun, Jongjin Park, Kimin Lee, Jinwoo Shin
Deep neural networks with millions of parameters may suffer from poor generalization due to overfitting.
5 code implementations • ICLR 2020 • Kimin Lee, Kibok Lee, Jinwoo Shin, Honglak Lee
Deep reinforcement learning (RL) agents often fail to generalize to unseen environments (yet semantically similar to the training environments), particularly when they are trained on high-dimensional state spaces, such as images.
no code implementations • ICLR 2019 • Kimin Lee, Sukmin Yun, Kibok Lee, Honglak Lee, Bo Li, Jinwoo Shin
For instance, on the CIFAR-10 dataset containing 45% noisy training labels, we improve the test accuracy of a deep model optimized by the state-of-the-art noise-handling training method from 33.34% to 43.02%.
1 code implementation • ICCV 2019 • Kibok Lee, Kimin Lee, Jinwoo Shin, Honglak Lee
Lifelong learning with deep neural networks is well-known to suffer from catastrophic forgetting: the performance on previous tasks drastically degrades when learning a new task.
1 code implementation • 31 Jan 2019 • Kimin Lee, Sukmin Yun, Kibok Lee, Honglak Lee, Bo Li, Jinwoo Shin
Large-scale datasets may contain significant proportions of noisy (incorrect) class labels, and it is well-known that modern deep neural networks (DNNs) poorly generalize from such noisy training datasets.
1 code implementation • 28 Jan 2019 • Dan Hendrycks, Kimin Lee, Mantas Mazeika
He et al. (2018) have called into question the utility of pre-training by showing that training from scratch can often yield similar performance to pre-training.
no code implementations • NeurIPS 2018 • Jonghwan Mun, Kimin Lee, Jinwoo Shin, Bohyung Han
The proposed framework is model-agnostic and applicable to any task other than VQA, e.g., image classification with a large number of labels but few per-class examples, which is known to be difficult under existing MCL schemes.
4 code implementations • NeurIPS 2018 • Kimin Lee, Kibok Lee, Honglak Lee, Jinwoo Shin
Detecting test samples drawn sufficiently far away from the training distribution statistically or adversarially is a fundamental requirement for deploying a good classifier in many real-world machine learning applications.
Ranked #2 on Out-of-Distribution Detection on MS-1M vs. IJB-C
no code implementations • CVPR 2018 • Kibok Lee, Kimin Lee, Kyle Min, Yuting Zhang, Jinwoo Shin, Honglak Lee
The essential ingredients of our methods are confidence-calibrated classifiers, data relabeling, and the leave-one-out strategy for modeling novel classes under the hierarchical taxonomy.
3 code implementations • ICLR 2018 • Kimin Lee, Honglak Lee, Kibok Lee, Jinwoo Shin
The problem of detecting whether a test sample is in-distribution (i.e., from the training distribution of a classifier) or out-of-distribution sufficiently different from it arises in many real-world machine learning applications.
2 code implementations • ICML 2017 • Kimin Lee, Changho Hwang, KyoungSoo Park, Jinwoo Shin
Ensemble methods are arguably the most trustworthy techniques for boosting the performance of machine learning models.
no code implementations • 11 Apr 2017 • Kimin Lee, Jaehyung Kim, Song Chong, Jinwoo Shin
In this paper, we aim at developing efficient training methods for SFNN, in particular using known architectures and pre-trained parameters of DNN.