Search Results for author: Sehyun Choi

Found 4 papers, 2 papers with code

Parameter-Efficient Checkpoint Merging via Metrics-Weighted Averaging

no code implementations · 23 Apr 2025 · Shi Jie Yu, Sehyun Choi

Checkpoint merging is a technique for combining multiple model snapshots into a single superior model, potentially reducing training time for large language models; a sketch of the weighted-averaging idea follows below.

Mathematical Reasoning · parameter-efficient fine-tuning
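
The abstract above amounts to a weighted average over checkpoint parameters. Here is a minimal sketch of that idea, assuming PyTorch checkpoints stored as plain state dicts with identical keys; the function name, the normalization, and the choice of metric are illustrative assumptions, not the paper's exact scheme.

import torch

def metrics_weighted_average(state_dicts, metrics):
    # Merge checkpoints by weighting each one with its normalized metric
    # (e.g. validation accuracy); higher-scoring snapshots get larger weights.
    total = sum(metrics)
    weights = [m / total for m in metrics]
    merged = {}
    for key in state_dicts[0]:
        # Convex combination of the same tensor across all snapshots.
        merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return merged

# Usage: merge three snapshots taken along one training run.
# ckpts = [torch.load(p)["model"] for p in ("step1k.pt", "step2k.pt", "step3k.pt")]
# model.load_state_dict(metrics_weighted_average(ckpts, metrics=[0.71, 0.74, 0.73]))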

Cross-Architecture Transfer Learning for Linear-Cost Inference Transformers

no code implementations · 3 Apr 2024 · Sehyun Choi

Motivated by this approach, we propose Cross-Architecture Transfer Learning (XATL), in which the weights of the components shared between LCI and self-attention-based transformers, such as layernorms, MLPs, and input/output embeddings, are transferred directly from already pre-trained model parameters to the new architecture, as sketched below.

Language Modeling · Language Modelling +1
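
The transfer described above can be pictured as copying every parameter tensor whose name marks it as a shared component, while the new linear-cost-inference (LCI) token-mixing layers keep their random initialization. The sketch below assumes PyTorch models with HuggingFace-style parameter names; the keyword list and the function are hypothetical stand-ins for whatever matching rule the paper actually uses.

import torch

# Substrings assumed to identify components shared between the two
# architectures: embeddings, the output head, layernorms, and MLP blocks.
SHARED_KEYWORDS = ("embed", "lm_head", "norm", "mlp")

def transfer_shared_weights(pretrained_state_dict, lci_model):
    lci_sd = lci_model.state_dict()
    copied = []
    for name, tensor in pretrained_state_dict.items():
        shared = any(kw in name for kw in SHARED_KEYWORDS)
        # Copy only when the target has a parameter of the same name and
        # shape; the LCI attention replacement keeps its fresh initialization.
        if shared and name in lci_sd and lci_sd[name].shape == tensor.shape:
            lci_sd[name] = tensor.clone()
            copied.append(name)
    lci_model.load_state_dict(lci_sd)
    return copied  # inspect which parameters were actually transferred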

AbsPyramid: Benchmarking the Abstraction Ability of Language Models with a Unified Entailment Graph

1 code implementation · 15 Nov 2023 · Zhaowei Wang, Haochen Shi, Weiqi Wang, Tianqing Fang, Hongming Zhang, Sehyun Choi, Xin Liu, Yangqiu Song

Cognitive research indicates that abstraction ability is essential to human intelligence, yet it remains under-explored in language models.

Benchmarking

CKBP v2: Better Annotation and Reasoning for Commonsense Knowledge Base Population

1 code implementation · 20 Apr 2023 · Tianqing Fang, Quyet V. Do, Zihao Zheng, Weiqi Wang, Sehyun Choi, Zhaowei Wang, Yangqiu Song

We show that CKBP v2 serves as a challenging and representative evaluation dataset for the CSKB Population task, while its development set aids in selecting a population model that leads to improved knowledge acquisition for downstream commonsense reasoning.

Knowledge Base Population · Question Answering
