Search Results for author: Jaehyung Kim

Found 25 papers, 15 papers with code

Chain-of-Thoughts for Molecular Understanding

no code implementations • 8 Oct 2024 • Yunhui Jang, Jaehyung Kim, Sungsoo Ahn

The adaptation of large language models (LLMs) to chemistry has shown promising performance in molecular understanding tasks, such as generating a text description from a molecule.

Can LLMs Generate Diverse Molecules? Towards Alignment with Structural Diversity

no code implementations • 4 Oct 2024 • Hyosoon Jang, Yunhui Jang, Jaehyung Kim, Sungsoo Ahn

In response, we propose a new method for fine-tuning molecular generative LLMs to autoregressively generate a set of structurally diverse molecules, where each molecule is generated by conditioning on the previously generated molecules.
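
To illustrate the autoregressive set-generation idea, here is a minimal sketch in which each molecule is sampled with all previously generated molecules in the prompt; `sample_molecule` is a hypothetical stand-in for a call to the fine-tuned model, not the paper's actual interface.

```python
# Sketch of autoregressive set generation: each molecule is sampled
# conditioned on the molecules generated so far, so the model can steer
# the next sample away from earlier structures.

def sample_molecule(prompt: str) -> str:
    """Placeholder for a fine-tuned molecular LLM call returning one SMILES string."""
    raise NotImplementedError("plug in the actual model here")

def generate_diverse_set(instruction: str, n: int) -> list[str]:
    molecules: list[str] = []
    for _ in range(n):
        # Condition on everything generated so far.
        prompt = instruction + "\nAlready generated: " + "; ".join(molecules)
        molecules.append(sample_molecule(prompt))
    return molecules
```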

Diversity • Drug Discovery

Few-shot Personalization of LLMs with Mis-aligned Responses

1 code implementation • 26 Jun 2024 • Jaehyung Kim, Yiming Yang

As the diversity of users increases, the capability of large language models (LLMs) to provide personalized responses has become increasingly important.

Diversity

Learning to Correct for QA Reasoning with Black-box LLMs

1 code implementation • 26 Jun 2024 • Jaehyung Kim, Dongyoung Kim, Yiming Yang

It uses a trained adaptation model to perform a seq2seq mapping from the often-imperfect reasoning of the original black-box LLM to correct or improved reasoning.
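
A minimal sketch of that correction step, assuming a Hugging Face seq2seq model as the adaptation model; the checkpoint name and prompt format below are illustrative, not the paper's exact setup.

```python
# Sketch: a small trained seq2seq model rewrites the black-box LLM's
# (possibly imperfect) draft reasoning into a corrected version.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
adapter = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

def correct_reasoning(question: str, blackbox_reasoning: str) -> str:
    prompt = (
        f"Question: {question}\n"
        f"Draft reasoning: {blackbox_reasoning}\n"
        f"Corrected reasoning:"
    )
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    output_ids = adapter.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```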

Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning

1 code implementation • 12 Jun 2024 • Jaehyun Nam, KyuYoung Kim, Seunghyuk Oh, Jihoon Tack, Jaehyung Kim, Jinwoo Shin

In tabular prediction tasks, tree-based models combined with automated feature engineering methods often outperform deep learning approaches that rely on learned representations.

Automated Feature Engineering • Feature Engineering +1

Aligning Large Language Models with Self-generated Preference Data

no code implementations • 6 Jun 2024 • Dongyoung Kim, Kimin Lee, Jinwoo Shin, Jaehyung Kim

To tackle this problem, we propose a new framework that boosts the alignment of LLMs through Self-generated Preference data (Selfie) using only a very small amount of human-annotated preference data.
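
A rough sketch of how such preference pairs might be self-generated, assuming a hypothetical `response_logprob` scoring call; the paper's actual labeling scheme may differ.

```python
# Sketch: the current model itself scores two candidate responses and the
# higher-scoring one is treated as "chosen". `response_logprob` is a
# hypothetical placeholder for a scoring call to the LLM.

def response_logprob(prompt: str, response: str) -> float:
    """Placeholder: sum of token log-probs of `response` under the LLM."""
    raise NotImplementedError

def make_preference_pair(prompt: str, resp_a: str, resp_b: str) -> dict:
    score_a = response_logprob(prompt, resp_a)
    score_b = response_logprob(prompt, resp_b)
    chosen, rejected = (resp_a, resp_b) if score_a >= score_b else (resp_b, resp_a)
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

# The resulting pairs could then be fed to a DPO-style trainer alongside
# the small human-annotated seed set.
```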

In-Context Learning

SuRe: Summarizing Retrievals using Answer Candidates for Open-domain QA of LLMs

2 code implementations • 17 Apr 2024 • Jaehyung Kim, Jaehyun Nam, Sangwoo Mo, Jongjin Park, Sang-Woo Lee, Minjoon Seo, Jung-Woo Ha, Jinwoo Shin

While incorporating new information with the retrieval of relevant passages is a promising way to improve QA with LLMs, the existing methods often require additional fine-tuning which becomes infeasible with recent LLMs.
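
The pipeline might look roughly like the sketch below, where `llm` is a hypothetical text-completion call and all prompts are illustrative: propose answer candidates, summarize the retrieved passages conditionally on each candidate, then select the best-supported candidate.

```python
# Sketch of a SuRe-style prompting pipeline over retrieved passages.

def llm(prompt: str) -> str:
    """Placeholder for a call to an instruction-following LLM."""
    raise NotImplementedError

def sure_answer(question: str, passages: list[str]) -> str:
    context = "\n".join(passages)
    # Step 1: propose a few answer candidates from the retrieved passages.
    candidates = llm(
        f"{context}\nQuestion: {question}\nList two plausible answers:"
    ).splitlines()
    # Step 2: write one summary of the evidence per candidate.
    summaries = {
        c: llm(f"{context}\nSummarize the evidence that the answer to "
               f"'{question}' is '{c}':")
        for c in candidates
    }
    # Step 3: pick the candidate whose summary is best supported.
    best = llm(
        "Question: " + question + "\n" +
        "\n".join(f"Answer: {c}\nSummary: {s}" for c, s in summaries.items()) +
        "\nWhich answer is best supported? Reply with the answer only:"
    )
    return best.strip()
```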

Question Answering • Retrieval

Online Adaptation of Language Models with a Memory of Amortized Contexts

1 code implementation • 7 Mar 2024 • Jihoon Tack, Jaehyung Kim, Eric Mitchell, Jinwoo Shin, Yee Whye Teh, Jonathan Richard Schwarz

To address these challenges, we propose Memory of Amortized Contexts (MAC), an efficient and effective online adaptation framework for LLMs with strong knowledge retention.
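
A loose sketch of the memory idea, assuming a generic `encode` function: each document is compressed into a fixed-size vector and stored, and at question time the stored vectors are attention-aggregated into a single conditioning vector. The real MAC amortizes this compression with a learned network; everything below is illustrative.

```python
import numpy as np

class AmortizedMemory:
    """Toy memory of compressed contexts (illustrative, not the MAC model)."""

    def __init__(self, encode):
        self.encode = encode          # document/question -> np.ndarray
        self.memory: list[np.ndarray] = []

    def add(self, document: str) -> None:
        # Compress and store the new document.
        self.memory.append(self.encode(document))

    def aggregate(self, question: str) -> np.ndarray:
        # Softmax attention over stored vectors, keyed by the question.
        q = self.encode(question)
        mem = np.stack(self.memory)
        weights = np.exp(mem @ q)
        weights /= weights.sum()
        return weights @ mem          # one vector to condition the LM on
```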

Language Modelling • Meta-Learning

SelectLLM: Can LLMs Select Important Instructions to Annotate?

1 code implementation • 29 Jan 2024 • Ritik Sachin Parkar, Jaehyung Kim, Jong Inn Park, Dongyeop Kang

However, how to select unlabelled instructions is not well-explored, especially in the context of LLMs.

Active Learning • Instruction Following

RoAST: Robustifying Language Models via Adversarial Perturbation with Selective Training

1 code implementation • 7 Dec 2023 • Jaehyung Kim, Yuning Mao, Rui Hou, Hanchao Yu, Davis Liang, Pascale Fung, Qifan Wang, Fuli Feng, Lifu Huang, Madian Khabsa

Under a unified evaluation of fine-tuned LMs that incorporates four representative perspectives of model robustness, we demonstrate the effectiveness of RoAST compared to state-of-the-art fine-tuning methods on six different types of LMs, indicating its usefulness in practice.

Adversarial Robustness

Prefer to Classify: Improving Text Classifiers via Auxiliary Preference Learning

1 code implementation • 8 Jun 2023 • Jaehyung Kim, Jinwoo Shin, Dongyeop Kang

In this paper, we investigate task-specific preferences between pairs of input texts as a new alternative way for such auxiliary data annotation.
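
One way such pairwise preferences could be used is sketched below with an illustrative margin loss (not necessarily the paper's exact objective): alongside the usual classification loss, the preferred text of each pair is pushed toward higher confidence on the gold label.

```python
import torch
import torch.nn.functional as F

def auxiliary_preference_loss(logits_pref, logits_other, label, margin=0.5):
    """Illustrative auxiliary loss: the preferred text should receive a
    higher probability on the gold label than the non-preferred one."""
    p_pref = F.softmax(logits_pref, dim=-1)[..., label]
    p_other = F.softmax(logits_other, dim=-1)[..., label]
    # Penalize pairs where the preference margin is not respected.
    return F.relu(margin - (p_pref - p_other)).mean()
```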

Multi-Task Learning

Annotation Imputation to Individualize Predictions: Initial Studies on Distribution Dynamics and Model Predictions

1 code implementation • 24 May 2023 • London Lowmanstone, Ruyuan Wan, Risako Owan, Jaehyung Kim, Dongyeop Kang

In our analysis of the results, we found that the choice of imputation method significantly impacts soft label changes and distribution.

Imputation

Everyone's Voice Matters: Quantifying Annotation Disagreement Using Demographic Information

no code implementations • 12 Jan 2023 • Ruyuan Wan, Jaehyung Kim, Dongyeop Kang

Particularly, we extract disagreement labels from the annotators' voting histories in the five subjective datasets, and then fine-tune language models to predict annotators' disagreement.
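
A minimal sketch of deriving a disagreement label from a voting history; the binary, unanimity-based framing here is an illustrative choice, not necessarily the paper's exact formulation.

```python
from collections import Counter

def disagreement_label(votes: list[str]) -> int:
    """Return 1 if the votes are not unanimous, else 0."""
    counts = Counter(votes)
    top = counts.most_common(1)[0][1]   # size of the largest voting bloc
    return int(top < len(votes))        # any dissenting vote counts as disagreement

# Example: three annotators label an item.
print(disagreement_label(["offensive", "offensive", "not"]))  # -> 1
print(disagreement_label(["not", "not", "not"]))              # -> 0
# Pairs of (text, disagreement_label) can then be used to fine-tune a
# standard sequence classifier to predict disagreement.
```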

Patch-level Representation Learning for Self-supervised Vision Transformers

1 code implementation • CVPR 2022 • Sukmin Yun, Hankook Lee, Jaehyung Kim, Jinwoo Shin

Despite its simplicity, we demonstrate that it can significantly improve the performance of existing SSL methods for various visual tasks, including object detection and semantic segmentation.

Instance Segmentation • object-detection +5

Spread Spurious Attribute: Improving Worst-group Accuracy with Spurious Attribute Estimation

no code implementations • ICLR 2022 • Junhyun Nam, Jaehyung Kim, Jaeho Lee, Jinwoo Shin

The paradigm of worst-group loss minimization has shown promise in avoiding the learning of spurious correlations, but it requires costly additional supervision on spurious attributes.

Attribute

PASS: Patch-Aware Self-Supervision for Vision Transformer

no code implementations • 29 Sep 2021 • Sukmin Yun, Hankook Lee, Jaehyung Kim, Jinwoo Shin

This paper aims to improve their performance further by exploiting the architectural advantages of the underlying neural network, a benefit that current state-of-the-art visual pretext tasks for self-supervised learning do not enjoy because they are architecture-agnostic.

object-detection • Object Detection +3

What Makes Better Augmentation Strategies? Augment Difficult but Not too Different

no code implementations • ICLR 2022 • Jaehyung Kim, Dongyeop Kang, Sungsoo Ahn, Jinwoo Shin

Remarkably, our method is more effective in the challenging low-data and class-imbalanced regimes, and the learned augmentation policy transfers well to different tasks and models.

Data Augmentation • Semantic Similarity +3

Distribution Aligning Refinery of Pseudo-label for Imbalanced Semi-supervised Learning

1 code implementation • NeurIPS 2020 • Jaehyung Kim, Youngbum Hur, Sejun Park, Eunho Yang, Sung Ju Hwang, Jinwoo Shin

While semi-supervised learning (SSL) has proven to be a promising way for leveraging unlabeled data when labeled data is scarce, the existing SSL algorithms typically assume that training class distributions are balanced.
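
To make the alignment idea concrete, here is a heavily simplified sketch that reweights pseudo-label probabilities toward an assumed target (e.g., balanced) class distribution; the paper itself formulates the refinement as a convex optimization, so the one-step reweighting below is only an illustration.

```python
import numpy as np

def refine_pseudo_labels(probs: np.ndarray, target_dist: np.ndarray) -> np.ndarray:
    """probs: (n_unlabeled, n_classes) model predictions on unlabeled data."""
    current_dist = probs.mean(axis=0)             # estimated class distribution
    refined = probs * (target_dist / current_dist)  # pull toward the target
    return refined / refined.sum(axis=1, keepdims=True)

# Toy example: predictions skewed toward class 0, target is balanced.
probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]])
balanced = np.array([0.5, 0.5])
print(refine_pseudo_labels(probs, balanced))
```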

Pseudo Label

M2m: Imbalanced Classification via Major-to-minor Translation

1 code implementation • CVPR 2020 • Jaehyung Kim, Jongheon Jeong, Jinwoo Shin

In most real-world scenarios, labeled training datasets are highly class-imbalanced, and deep neural networks trained on them struggle to generalize to a balanced testing criterion.
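
A sketch of the major-to-minor translation idea under common adversarial-perturbation assumptions: a majority-class input is perturbed until an auxiliary classifier assigns it to the target minority class, and the result is reused as a synthetic minority sample. Step size and iteration count are illustrative.

```python
import torch
import torch.nn.functional as F

def translate_to_minority(classifier, x_major, minority_class, steps=10, lr=0.1):
    """Perturb a majority-class batch item toward `minority_class` using
    sign-gradient updates against an auxiliary pretrained classifier."""
    x = x_major.clone().requires_grad_(True)
    target = torch.tensor([minority_class])
    for _ in range(steps):
        loss = F.cross_entropy(classifier(x), target)
        loss.backward()
        with torch.no_grad():
            x -= lr * x.grad.sign()   # adversarial-style step toward the target class
            x.grad.zero_()
    return x.detach()
```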

Classification • Diversity +4

Simplified Stochastic Feedforward Neural Networks

no code implementations • 11 Apr 2017 • Kimin Lee, Jaehyung Kim, Song Chong, Jinwoo Shin

In this paper, we aim to develop efficient training methods for SFNNs, in particular by using known architectures and the pre-trained parameters of DNNs.
