no code implementations • 25 Jan 2025 • Nirav Diwan, Tolga Ergen, Dongsub Shim, Honglak Lee
Direct Preference Optimization (DPO) has emerged as a de facto approach for aligning language models with human preferences.
no code implementations • 24 Dec 2024 • Xingjian Zhang, Ziyang Xiong, Shixuan Liu, Yutong Xie, Tolga Ergen, Dongsub Shim, Hua Xu, Honglak Lee, Qiaozhu Mei
Low-dimensional visualizations, or "projection maps" of datasets, are widely used across scientific research and creative industries as effective tools for interpreting large-scale and complex information.
1 code implementation • 10 Jun 2024 • Xingjian Zhang, Yutong Xie, Jin Huang, Jinge Ma, Zhaoying Pan, Qijia Liu, Ziyang Xiong, Tolga Ergen, Dongsub Shim, Honglak Lee, Qiaozhu Mei
Scientific innovation relies on detailed workflows, which include critical steps such as analyzing literature, generating ideas, validating these ideas, interpreting results, and inspiring follow-up research.
1 code implementation • 7 Dec 2023 • Sungryull Sohn, Yiwei Lyu, Anthony Liu, Lajanugen Logeswaran, Dong-Ki Kim, Dongsub Shim, Honglak Lee
Our TOD-Flow graph learns what a model can, should, and should not predict, effectively reducing the search space and providing a rationale for the model's prediction.
no code implementations • 16 Nov 2023 • Lajanugen Logeswaran, Sungryull Sohn, Yiwei Lyu, Anthony Zhe Liu, Dong-Ki Kim, Dongsub Shim, Moontae Lee, Honglak Lee
One of the fundamental skills required for an agent acting in an environment to complete tasks is the ability to understand what actions are plausible at any given point.
no code implementations • 25 Oct 2023 • Dong-Ki Kim, Sungryull Sohn, Lajanugen Logeswaran, Dongsub Shim, Honglak Lee
Recently, there has been an increasing interest in automated prompt optimization based on reinforcement learning (RL).
Multi-agent Reinforcement Learning
reinforcement-learning
1 code implementation • CVPR 2023 • Qiao Gu, Dongsub Shim, Florian Shkurti
To achieve a better stability-plasticity trade-off, we propose Backward Feature Projection (BFP), a method for continual learning that allows the new features to change up to a learnable linear transformation of the old features.
no code implementations • 16 Jun 2022 • Sungmin Cha, Jihwan Kwak, Dongsub Shim, Hyunwoo Kim, Moontae Lee, Honglak Lee, Taesup Moon
Class incremental learning (CIL) algorithms aim to continually learn new object classes from incrementally arriving data while not forgetting past learned classes.
1 code implementation • 28 Nov 2021 • Zhibo Zhang, Jongseong Jang, Chiheb Trabelsi, Ruiwen Li, Scott Sanner, Yeonjeong Jeong, Dongsub Shim
Contrastive learning has led to substantial improvements in the quality of learned embedding representations for tasks such as image classification.
no code implementations • 29 May 2021 • Ruiwen Li, Zhibo Zhang, Jiani Li, Chiheb Trabelsi, Scott Sanner, Jongseong Jang, Yeonjeong Jeong, Dongsub Shim
Recent years have seen the introduction of a range of methods for post-hoc explainability of image classifier predictions.
3 code implementations • 31 Aug 2020 • Dongsub Shim, Zheda Mai, Jihwan Jeong, Scott Sanner, Hyunwoo Kim, Jongseong Jang
As image-based deep learning becomes pervasive on every device, from cell phones to smart watches, there is a growing need to develop methods that continually learn from data while minimizing memory footprint and power consumption.