no code implementations • 8 Oct 2024 • Yunhui Jang, Jaehyung Kim, Sungsoo Ahn
The adaptation of large language models (LLMs) to chemistry has shown promising performance in molecular understanding tasks, such as generating a text description from a molecule.
no code implementations • 4 Oct 2024 • Hyosoon Jang, Yunhui Jang, Jaehyung Kim, Sungsoo Ahn
In response, we propose a new method for fine-tuning molecular generative LLMs to autoregressively generate a set of structurally diverse molecules, where each molecule is generated by conditioning on the previously generated molecules.
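The generation loop described here (conditioning each new molecule on the ones produced so far) can be sketched roughly as below; `llm_generate`, the prompt format, and the SMILES interface are hypothetical placeholders, not the paper's actual fine-tuned model or prompting scheme.

```python
# Sketch: autoregressive generation of a structurally diverse molecule set.
# `llm_generate` stands in for any call to a molecular generative LLM (hypothetical).

def llm_generate(prompt: str) -> str:
    """Placeholder for a call to a fine-tuned molecular LLM."""
    raise NotImplementedError

def generate_diverse_set(task_prompt: str, n_molecules: int = 8) -> list[str]:
    generated: list[str] = []
    for _ in range(n_molecules):
        # Condition the next molecule on everything generated so far,
        # nudging the model toward structurally novel outputs.
        context = "\n".join(generated)
        prompt = f"{task_prompt}\nAlready generated:\n{context}\nNext molecule (SMILES):"
        smiles = llm_generate(prompt).strip()
        if smiles not in generated:          # drop exact duplicates
            generated.append(smiles)
    return generated
```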
no code implementations • 31 Jul 2024 • Dongwon Son, Sanghyeon Son, Jaehyung Kim, Beomjoon Kim
We present DEF-oriCORN, a framework for language-directed manipulation tasks.
1 code implementation • 26 Jun 2024 • Jaehyung Kim, Yiming Yang
As the diversity of users increases, the capability of providing personalized responses by large language models (LLMs) has become increasingly important.
1 code implementation • 26 Jun 2024 • Jaehyung Kim, Dongyoung Kim, Yiming Yang
It uses a trained adaptation model to perform a seq2seq mapping from the often-imperfect reasonings of the original black-box LLM to the correct or improved reasonings.
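As a rough illustration of such a seq2seq refinement step, the snippet below passes a draft rationale through an encoder-decoder model; the `t5-small` checkpoint and the prompt prefix are placeholders, not the trained adaptation model from the paper.

```python
# Sketch: map a black-box LLM's imperfect reasoning to a refined one with a
# seq2seq adapter (an off-the-shelf T5 checkpoint used here as a stand-in).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tok = AutoTokenizer.from_pretrained("t5-small")             # placeholder checkpoint
adapter = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

draft = "The answer is 12 because 3 * 5 = 12."              # imperfect reasoning
inputs = tok("refine reasoning: " + draft, return_tensors="pt")
out = adapter.generate(**inputs, max_new_tokens=64)
refined = tok.decode(out[0], skip_special_tokens=True)
print(refined)
```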
1 code implementation • 12 Jun 2024 • Jaehyun Nam, KyuYoung Kim, Seunghyuk Oh, Jihoon Tack, Jaehyung Kim, Jinwoo Shin
In tabular prediction tasks, tree-based models combined with automated feature engineering methods often outperform deep learning approaches that rely on learned representations.
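For context, a generic baseline of this kind (gradient-boosted trees plus a couple of manually engineered interaction features) looks roughly like the following; the dataset and the added features are illustrative only and unrelated to the paper's benchmark.

```python
# Sketch: tree-based model on a tabular task with simple engineered features.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
# Toy "feature engineering": append a ratio and a product of raw columns.
extra = np.column_stack([X[:, 0] / (X[:, 1] + 1e-8), X[:, 2] * X[:, 3]])
X_aug = np.hstack([X, extra])

model = GradientBoostingClassifier(random_state=0)
print(cross_val_score(model, X_aug, y, cv=5).mean())
```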
no code implementations • 6 Jun 2024 • Dongyoung Kim, Kimin Lee, Jinwoo Shin, Jaehyung Kim
To tackle this problem, we propose a new framework that boosts the alignment of LLMs through Self-generated Preference data (Selfie) using only a very small amount of human-annotated preference data.
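The general recipe of bootstrapping preference pairs from the model's own outputs plus a small human-labeled seed can be sketched as below; `sample_response` and `seed_judge` are hypothetical stand-ins, and the actual framework may construct and filter pairs quite differently.

```python
# Sketch: build self-generated preference pairs for alignment training.
# `sample_response` and `seed_judge` are hypothetical placeholders: a policy
# model to sample from, and a judge fit on a small human-annotated seed set.

def sample_response(prompt: str) -> str:
    raise NotImplementedError

def seed_judge(prompt: str, response: str) -> float:
    raise NotImplementedError

def build_preference_pairs(prompts: list[str]) -> list[dict]:
    pairs = []
    for prompt in prompts:
        a, b = sample_response(prompt), sample_response(prompt)
        chosen, rejected = (a, b) if seed_judge(prompt, a) >= seed_judge(prompt, b) else (b, a)
        pairs.append({"prompt": prompt, "chosen": chosen, "rejected": rejected})
    return pairs  # feed to a DPO-style trainer afterwards
```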
2 code implementations • 17 Apr 2024 • Jaehyung Kim, Jaehyun Nam, Sangwoo Mo, Jongjin Park, Sang-Woo Lee, Minjoon Seo, Jung-Woo Ha, Jinwoo Shin
While incorporating new information by retrieving relevant passages is a promising way to improve QA with LLMs, existing methods often require additional fine-tuning, which becomes infeasible with recent LLMs.
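In its simplest form, the fine-tuning-free retrieve-then-prompt setup mentioned here reduces to the sketch below; `retrieve` and `llm` are placeholders, and the paper's method involves more than plain prompt concatenation.

```python
# Sketch: retrieval-augmented QA with a frozen LLM (no fine-tuning).
def retrieve(question: str, k: int = 5) -> list[str]:
    """Placeholder for a passage retriever (e.g., BM25 or a dense index)."""
    raise NotImplementedError

def llm(prompt: str) -> str:
    """Placeholder for a frozen LLM call."""
    raise NotImplementedError

def answer(question: str) -> str:
    passages = retrieve(question)
    context = "\n\n".join(f"[{i+1}] {p}" for i, p in enumerate(passages))
    prompt = f"Answer using the passages below.\n{context}\n\nQuestion: {question}\nAnswer:"
    return llm(prompt)
```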
1 code implementation • 16 Apr 2024 • Woomin Song, Seunghyuk Oh, Sangwoo Mo, Jaehyung Kim, Sukmin Yun, Jung-Woo Ha, Jinwoo Shin
Large language models (LLMs) have shown remarkable performance in various natural language processing tasks.
1 code implementation • 7 Mar 2024 • Jihoon Tack, Jaehyung Kim, Eric Mitchell, Jinwoo Shin, Yee Whye Teh, Jonathan Richard Schwarz
To address these challenges, we propose Memory of Amortized Contexts (MAC), an efficient and effective online adaptation framework for LLMs with strong knowledge retention.
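At a very high level, the memory-based adaptation pattern can be pictured as compressing each incoming document into a compact vector, banking it, and aggregating the most relevant entries at answer time; the toy sketch below shows only that skeleton with normalized embeddings, not MAC's amortized modulation networks.

```python
# Sketch: a toy "memory of compressed contexts" for online knowledge retention.
import numpy as np

class ContextMemory:
    def __init__(self):
        self.bank: list[np.ndarray] = []           # compressed document vectors

    def add(self, doc_vec: np.ndarray) -> None:
        self.bank.append(doc_vec / np.linalg.norm(doc_vec))

    def aggregate(self, query_vec: np.ndarray, k: int = 4) -> np.ndarray:
        # Pick the k most similar stored contexts and average them; the real
        # method aggregates amortized modulations rather than raw embeddings.
        q = query_vec / np.linalg.norm(query_vec)
        sims = np.array([v @ q for v in self.bank])
        top = np.argsort(sims)[-k:]
        return np.mean([self.bank[i] for i in top], axis=0)
```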
1 code implementation • 29 Jan 2024 • Ritik Sachin Parkar, Jaehyung Kim, Jong Inn Park, Dongyeop Kang
However, how to select unlabelled instructions is not well-explored, especially in the context of LLMs.
no code implementations • 26 Jan 2024 • Debarati Das, Karin de Langis, Anna Martin-Boyle, Jaehyung Kim, Minhwa Lee, Zae Myung Kim, Shirley Anugrah Hayati, Risako Owan, Bin Hu, Ritik Parkar, Ryan Koo, Jonginn Park, Aahan Tyagi, Libby Ferland, Sanjali Roy, Vincent Liu, Dongyeop Kang
This work delves into the expanding role of large language models (LLMs) in generating artificial data.
1 code implementation • 7 Dec 2023 • Jaehyung Kim, Yuning Mao, Rui Hou, Hanchao Yu, Davis Liang, Pascale Fung, Qifan Wang, Fuli Feng, Lifu Huang, Madian Khabsa
Under a unified evaluation of fine-tuned LMs that incorporates four representative perspectives of model robustness, we demonstrate the effectiveness of RoAST compared to state-of-the-art fine-tuning methods on six different types of LMs, indicating its usefulness in practice.
1 code implementation • 8 Jun 2023 • Jaehyung Kim, Jinwoo Shin, Dongyeop Kang
In this paper, we investigate task-specific preferences between pairs of input texts as a new alternative way for such auxiliary data annotation.
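One common way to exploit such pairwise preferences is a ranking loss over a scoring model, as sketched below with PyTorch's margin ranking loss; this is a generic illustration with dummy embeddings, not the paper's training objective.

```python
# Sketch: learn from pairwise input-text preferences with a margin ranking loss.
import torch
import torch.nn as nn

scorer = nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, 1))
loss_fn = nn.MarginRankingLoss(margin=0.1)

# Dummy sentence embeddings: `preferred` should score higher than `other`.
preferred = torch.randn(32, 768)
other = torch.randn(32, 768)
target = torch.ones(32)            # +1 means the first input should rank higher

loss = loss_fn(scorer(preferred).squeeze(-1), scorer(other).squeeze(-1), target)
loss.backward()
```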
2 code implementations • 30 May 2023 • Jaehyung Kim, Yekyung Kim, Karin de Langis, Jinwoo Shin, Dongyeop Kang
However, not all samples in these datasets are equally valuable for learning, as some may be redundant or noisy.
1 code implementation • 24 May 2023 • London Lowmanstone, Ruyuan Wan, Risako Owan, Jaehyung Kim, Dongyeop Kang
In our analysis of the results, we found that the choice of imputation method significantly impacts how the soft labels change and how they are distributed.
no code implementations • 12 Jan 2023 • Ruyuan Wan, Jaehyung Kim, Dongyeop Kang
Particularly, we extract disagreement labels from the annotators' voting histories in the five subjective datasets, and then fine-tune language models to predict annotators' disagreement.
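Extracting a disagreement target from raw votes can be as simple as the helper below (no vote reaching a majority-share threshold counts as disagreement); the threshold and labeling rule are assumptions, and the paper's exact extraction may differ.

```python
# Sketch: turn annotators' votes on one item into a binary disagreement label.
from collections import Counter

def disagreement_label(votes: list[str], threshold: float = 0.8) -> int:
    """Return 1 if annotators disagree (no vote reaches `threshold` share)."""
    counts = Counter(votes)
    majority_share = max(counts.values()) / len(votes)
    return int(majority_share < threshold)

print(disagreement_label(["toxic", "toxic", "not_toxic"]))   # -> 1 (disagreement)
print(disagreement_label(["toxic"] * 5))                     # -> 0 (agreement)
```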
1 code implementation • 19 Jul 2022 • Sukmin Yun, Jaehyung Kim, Dongyoon Han, Hwanjun Song, Jung-Woo Ha, Jinwoo Shin
Understanding temporal dynamics of video is an essential aspect of learning better video representations.
1 code implementation • CVPR 2022 • Sukmin Yun, Hankook Lee, Jaehyung Kim, Jinwoo Shin
Despite its simplicity, we demonstrate that it can significantly improve the performance of existing SSL methods for various visual tasks, including object detection and semantic segmentation.
no code implementations • ICLR 2022 • Junhyun Nam, Jaehyung Kim, Jaeho Lee, Jinwoo Shin
The paradigm of worst-group loss minimization has shown promise in avoiding the learning of spurious correlations, but it requires costly additional supervision on spurious attributes.
no code implementations • 29 Sep 2021 • Sukmin Yun, Hankook Lee, Jaehyung Kim, Jinwoo Shin
This paper aims to further improve their performance by utilizing the architectural advantages of the underlying neural network, since current state-of-the-art visual pretext tasks for self-supervised learning are architecture-agnostic and do not enjoy this benefit.
no code implementations • ICLR 2022 • Jaehyung Kim, Dongyeop Kang, Sungsoo Ahn, Jinwoo Shin
Remarkably, our method is more effective in the challenging low-data and class-imbalanced regimes, and the learned augmentation policy transfers well to different tasks and models.
1 code implementation • NeurIPS 2020 • Jaehyung Kim, Youngbum Hur, Sejun Park, Eunho Yang, Sung Ju Hwang, Jinwoo Shin
While semi-supervised learning (SSL) has proven to be a promising way for leveraging unlabeled data when labeled data is scarce, the existing SSL algorithms typically assume that training class distributions are balanced.
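One widely used remedy in this setting is to realign the pseudo-label class probabilities toward a target class distribution before thresholding, roughly as below; this is a generic distribution-alignment step, not necessarily the paper's exact refinement procedure.

```python
# Sketch: align pseudo-label class probabilities to a target class distribution.
import numpy as np

def align_pseudo_labels(probs: np.ndarray, target_dist: np.ndarray) -> np.ndarray:
    """probs: (N, C) predictions on unlabeled data; target_dist: (C,) desired marginal."""
    model_dist = probs.mean(axis=0)                      # empirical class marginal
    adjusted = probs * (target_dist / (model_dist + 1e-8))
    return adjusted / adjusted.sum(axis=1, keepdims=True)

probs = np.random.dirichlet(np.ones(3), size=10)
print(align_pseudo_labels(probs, np.array([1/3, 1/3, 1/3])).sum(axis=1))  # all ~1.0
```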
1 code implementation • CVPR 2020 • Jaehyung Kim, Jongheon Jeong, Jinwoo Shin
In most real-world scenarios, labeled training datasets are highly class-imbalanced, and deep neural networks trained on them struggle to generalize to a balanced testing criterion.
Ranked #43 on Long-tail Learning on CIFAR-10-LT (ρ=10)
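A standard baseline remedy for this train/test mismatch is inverse-frequency class reweighting of the loss, as in the short sketch below; this is only context for the problem setting, not the method proposed in the paper, and the toy labels are illustrative.

```python
# Sketch: inverse-frequency class weights for training on imbalanced labels.
import numpy as np
import torch
import torch.nn as nn

labels = np.array([0] * 900 + [1] * 90 + [2] * 10)           # long-tailed toy labels
counts = np.bincount(labels)
weights = counts.sum() / (len(counts) * counts)               # inverse-frequency weights

criterion = nn.CrossEntropyLoss(weight=torch.tensor(weights, dtype=torch.float32))
logits = torch.randn(8, 3, requires_grad=True)
targets = torch.tensor([0, 0, 0, 0, 1, 1, 2, 2])
criterion(logits, targets).backward()
```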
no code implementations • 11 Apr 2017 • Kimin Lee, Jaehyung Kim, Song Chong, Jinwoo Shin
In this paper, we aim to develop efficient training methods for SFNNs, in particular by using known architectures and pre-trained parameters of DNNs.