1 code implementation • 6 Feb 2025 • Minsang Kim, Seungjun Baek
We propose Syntriever, a training framework for retrievers using synthetic data from black-box LLMs.
no code implementations • 12 Dec 2024 • Minsang Kim, Seungjun Baek
Large language models (LLMs) closely interact with humans, and thus need an intimate understanding of the cultural values of human society.
1 code implementation • 20 Jun 2024 • Minsang Kim, Cheoneum Park, Seungjun Baek
In addition, to compensate for cases where the retrieved passages contain distracting information or divided opinions, we augment the retrieved passages with passages self-generated by LLMs to guide the answer extraction.
no code implementations • 20 Jun 2024 • Minsang Kim, Seungjun Baek
In this work, we consider a data pruning method based on information entropy.
1 code implementation • 13 Feb 2024 • Minsang Kim, Seungjun Baek
HPLC leverages the positional information of nodes based on landmarks at various levels of hierarchy, such as nodes' distances to landmarks, inter-landmark distances, and hierarchical grouping of clusters.
no code implementations • 14 Aug 2023 • Dongik Shin, Beomsuk Kim, Seungjun Baek
We propose to train a single segmentation model so that the model can adapt to each sub-group.
1 code implementation • 29 Jun 2022 • Minsang Kim, Seungjun Baek
In the common feature extraction, we apply the common encoding function to all input embeddings.