Search Results for author: Jinlin Xiao

Found 5 papers, 4 papers with code

OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning

1 code implementation • 22 Dec 2024 • Yuxiang Zhang, YuQi Yang, Jiangming Shu, Yuhang Wang, Jinlin Xiao, Jitao Sang

OpenAI's recent introduction of Reinforcement Fine-Tuning (RFT) showcases the potential of reasoning foundation models and offers a new paradigm for fine-tuning beyond simple pattern imitation.

o1-Coder: an o1 Replication for Coding

1 code implementation • 29 Nov 2024 • Yuxiang Zhang, Shangxi Wu, YuQi Yang, Jiangming Shu, Jinlin Xiao, Chao Kong, Jitao Sang

The technical report introduces O1-CODER, an attempt to replicate OpenAI's o1 model with a focus on coding tasks.

Reinforcement Learning (RL)

Debiasing Vison-Language Models with Text-Only Training

no code implementations • 12 Oct 2024 • Yunfan Yang, Chaoquan Jiang, Zhiyu Lin, Jinlin Xiao, Jiaming Zhang, Jitao Sang

Existing debiasing methods struggle to obtain sufficient image samples for minority groups and incur high costs for group labeling.

Large Language Model

KG-FPQ: Evaluating Factuality Hallucination in LLMs with Knowledge Graph-based False Premise Questions

1 code implementation • 8 Jul 2024 • Yanxu Zhu, Jinlin Xiao, Yuhang Wang, Jitao Sang

Recent studies have demonstrated that large language models (LLMs) are susceptible to being misled by false premise questions (FPQs), leading to errors in factual knowledge, known as factuality hallucination.

Hallucination, Knowledge Graphs

Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning

1 code implementation • 1 Feb 2024 • Jitao Sang, Yuhang Wang, Jing Zhang, Yanxu Zhu, Chao Kong, Junhong Ye, Shuyu Wei, Jinlin Xiao

In the first phase, based on human supervision, the quality of weak supervision is enhanced through a combination of scalable oversight and ensemble learning, reducing the capability gap between weak teachers and strong students.

Ensemble Learning, In-Context Learning
