Search Results for author: Yige Yuan

Found 13 papers, 8 papers with code

On a Connection Between Imitation Learning and RLHF

no code implementations 7 Mar 2025 Teng Xiao, Yige Yuan, Mingxiao Li, Zhengyu Chen, Vasant G Honavar

We establish a close theoretical connection between reinforcement learning from human feedback (RLHF) and imitation learning (IL), revealing that RLHF implicitly performs imitation learning on the preference data distribution.

Imitation Learning
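
As standard background (not this paper's own derivation), the claimed connection is easiest to see from the usual KL-regularized RLHF objective and its well-known closed-form optimum, which already has the shape of matching a target distribution:

```latex
% Standard KL-regularized RLHF objective and its closed-form optimum; shown only as
% background for the stated RLHF-imitation connection, not as this paper's derivation.
\max_{\pi}\;\; \mathbb{E}_{x,\, y \sim \pi(\cdot \mid x)}\big[ r(x, y) \big]
  \;-\; \beta\, \mathrm{KL}\big( \pi(\cdot \mid x) \,\big\|\, \pi_{\mathrm{ref}}(\cdot \mid x) \big),
\qquad
\pi^{*}(y \mid x) \;\propto\; \pi_{\mathrm{ref}}(y \mid x)\, \exp\!\big( r(x, y) / \beta \big).
```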

MIGE: A Unified Framework for Multimodal Instruction-Based Image Generation and Editing

1 code implementation 28 Feb 2025 Xueyun Tian, Wei Li, Bingbing Xu, Yige Yuan, Yuanzhuo Wang, HuaWei Shen

Experiments show that MIGE excels in both subject-driven generation and instruction-based editing while setting a state-of-the-art in the new task of instruction-based subject-driven editing.

Image Generation Transfer Learning

SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters

1 code implementation 2 Feb 2025 Teng Xiao, Yige Yuan, Zhengyu Chen, Mingxiao Li, Shangsong Liang, Zhaochun Ren, Vasant G Honavar

Existing preference optimization objectives for language model alignment require additional hyperparameters that must be extensively tuned to achieve optimal performance, increasing both the complexity and time required for fine-tuning large language models.

Cal-DPO: Calibrated Direct Preference Optimization for Language Model Alignment

1 code implementation 19 Dec 2024 Teng Xiao, Yige Yuan, Huaisheng Zhu, Mingxiao Li, Vasant G Honavar

Contrastive preference optimization has shown promising results in aligning LLMs with available preference data by optimizing the implicit reward associated with the policy.

Language Modeling
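
For context, the "implicit reward" mentioned in the abstract above is, in DPO-style contrastive preference optimization, the scaled log-ratio between the policy and a frozen reference model (standard form shown below); Cal-DPO's calibration of this quantity is not reproduced here.

```latex
% Implicit reward in DPO-style contrastive preference optimization (standard form);
% beta is a temperature, pi_ref a frozen reference policy.
r_{\theta}(x, y) \;=\; \beta \log \frac{\pi_{\theta}(y \mid x)}{\pi_{\mathrm{ref}}(y \mid x)}
```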

Fact-Level Confidence Calibration and Self-Correction

1 code implementation 20 Nov 2024 Yige Yuan, Bingbing Xu, Hexiang Tan, Fei Sun, Teng Xiao, Wei Li, HuaWei Shen, Xueqi Cheng

Confidence calibration in LLMs, i.e., aligning their self-assessed confidence with the actual accuracy of their responses, enables them to self-evaluate the correctness of their outputs.
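
To make "aligning self-assessed confidence with actual accuracy" concrete, the sketch below computes the standard expected calibration error (ECE) over equal-width confidence bins. This is generic background rather than the paper's fact-level method; the function name and binning choice are illustrative.

```python
import numpy as np

def expected_calibration_error(confidences, correctness, n_bins=10):
    """Expected calibration error: weighted gap between mean confidence
    and empirical accuracy within each confidence bin."""
    confidences = np.asarray(confidences, dtype=float)
    correctness = np.asarray(correctness, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for low, high in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > low) & (confidences <= high)
        if in_bin.any():
            gap = abs(confidences[in_bin].mean() - correctness[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by the fraction of samples in this bin
    return ece

# Toy example: overconfident wrong answers inflate the calibration error.
print(expected_calibration_error([0.9, 0.8, 0.95, 0.6], [1, 0, 0, 1]))
```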

How to Leverage Demonstration Data in Alignment for Large Language Model? A Self-Imitation Learning Perspective

1 code implementation 14 Oct 2024 Teng Xiao, Mingxiao Li, Yige Yuan, Huaisheng Zhu, Chao Cui, Vasant G Honavar

This paper introduces a novel generalized self-imitation learning ($\textbf{GSIL}$) framework, which effectively and efficiently aligns large language models with offline demonstration data.

Density Ratio Estimation GSM8K +6

MITA: Bridging the Gap between Model and Data for Test-time Adaptation

no code implementations 12 Oct 2024 Yige Yuan, Bingbing Xu, Teng Xiao, Liang Hou, Fei Sun, HuaWei Shen, Xueqi Cheng

Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models.

Test-time Adaptation

Negative as Positive: Enhancing Out-of-distribution Generalization for Graph Contrastive Learning

no code implementations 25 May 2024 Zixu Wang, Bingbing Xu, Yige Yuan, HuaWei Shen, Xueqi Cheng

Graph contrastive learning (GCL), standing as the dominant paradigm in the realm of graph pre-training, has yielded considerable progress.

Contrastive Learning Out-of-Distribution Generalization

TEA: Test-time Energy Adaptation

1 code implementation CVPR 2024 Yige Yuan, Bingbing Xu, Liang Hou, Fei Sun, HuaWei Shen, Xueqi Cheng

To address this, we propose a novel energy-based perspective, enhancing the model's perception of target data distributions without requiring access to training data or processes.

Test-time Adaptation

PDE+: Enhancing Generalization via PDE with Adaptive Distributional Diffusion

1 code implementation 25 May 2023 Yige Yuan, Bingbing Xu, Bo Lin, Liang Hou, Fei Sun, HuaWei Shen, Xueqi Cheng

The generalization of neural networks is a central challenge in machine learning, especially concerning the performance under distributions that differ from training ones.

Data Augmentation

Towards Generalizable Graph Contrastive Learning: An Information Theory Perspective

no code implementations 20 Nov 2022 Yige Yuan, Bingbing Xu, HuaWei Shen, Qi Cao, Keting Cen, Wen Zheng, Xueqi Cheng

Guided by the bound, we design a GCL framework named InfoAdv with enhanced generalization ability, which jointly optimizes the generalization metric and InfoMax to strike the right balance between pretext task fitting and the generalization ability on downstream tasks.

Contrastive Learning Data Augmentation +1

Hierarchical Estimation for Effective and Efficient Sampling Graph Neural Network

no code implementations 16 Nov 2022 Yang Li, Bingbing Xu, Qi Cao, Yige Yuan, HuaWei Shen

Because previous studies either lack variance analysis or focus only on a particular sampling paradigm, we first propose a unified node sampling variance analysis framework and analyze the core challenge, "circular dependency", in deriving the minimum-variance sampler: the sampling probability depends on node embeddings, while node embeddings cannot be calculated until sampling is finished.

Graph Neural Network Time Series +1
