Search Results for author: Yong Lin

Found 29 papers, 16 papers with code

AdaptMI: Adaptive Skill-based In-context Math Instruction for Small Language Models

no code implementations • 30 Apr 2025 • Yinghui He, Abhishek Panigrahi, Yong Lin, Sanjeev Arora

In-context learning (ICL) allows a language model to improve its problem-solving capability when provided with suitable information in context.

In-Context Learning Math

Rethinking Mixture-of-Agents: Is Mixing Different Large Language Models Beneficial?

no code implementations • 2 Feb 2025 • Wenzhe Li, Yong Lin, Mengzhou Xia, Chi Jin

We confirm that MoA performance is rather sensitive to the quality of the constituent models, and that mixing different LLMs often lowers their average quality.

Math MMLU

Entropy-Regularized Process Reward Model

1 code implementation • 15 Dec 2024 • Hanning Zhang, Pengcheng Wang, Shizhe Diao, Yong Lin, Rui Pan, Hanze Dong, Dylan Zhang, Pavlo Molchanov, Tong Zhang

Our theoretical analysis shows that the optimal reward model can be derived from sampling the initial policy.

GSM8K Math +3

On the Limited Generalization Capability of the Implicit Reward Model Induced by Direct Preference Optimization

no code implementations • 5 Sep 2024 • Yong Lin, Skyler Seto, Maartje ter Hoeve, Katherine Metcalf, Barry-John Theobald, Xuan Wang, Yizhe Zhang, Chen Huang, Tong Zhang

These findings highlight that DPORM has limited generalization ability and substantiates the integration of an explicit reward model in iterative DPO approaches.

Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic

2 code implementations • 24 Aug 2024 • Yifei He, Yuzheng Hu, Yong Lin, Tong Zhang, Han Zhao

Our algorithm works in two steps: i) Localization: identify tiny ($1\%$ of the total parameters) localized regions in the finetuned models containing essential skills for the downstream tasks, and ii) Stitching: reintegrate only these essential regions back into the pretrained model for task synergy.

Model Compression Task Arithmetic
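The two-step procedure described above can be sketched in a few lines. This is an illustrative sketch only, not the authors' released code: the magnitude-based rule for picking the top 1% of weight deltas is an assumed proxy for the paper's localization step, and all names are hypothetical.

```python
import numpy as np

def localize_and_stitch(pretrained, finetuned, sparsity=0.01):
    """Sketch: localize a sparse region of the finetuned weights, stitch it back."""
    delta = finetuned - pretrained
    k = max(1, int(sparsity * delta.size))
    # Localization: keep the k entries with the largest |delta| (a simple
    # magnitude proxy for "regions containing essential skills").
    threshold = np.sort(np.abs(delta).ravel())[-k]
    mask = np.abs(delta) >= threshold
    # Stitching: reintegrate only the localized region into the pretrained model.
    return pretrained + mask * delta

rng = np.random.default_rng(0)
base = rng.normal(size=(100, 100))
tuned = base + rng.normal(scale=0.1, size=(100, 100))
merged = localize_and_stitch(base, tuned)
# Only ~1% of entries should differ from the pretrained weights.
print(np.mean(merged != base))
```

Because only the masked 1% of parameters is changed, the merged model stays close to the pretrained weights everywhere else, which is what makes the approach cheap to apply when merging several finetuned models.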

Leveraging Invariant Principle for Heterophilic Graph Structure Distribution Shifts

no code implementations • 18 Aug 2024 • Jinluan Yang, Zhengyu Chen, Teng Xiao, Wenqiao Zhang, Yong Lin, Kun Kuang

However, existing works are devoted only to designing better HGNN backbones or architectures for node classification on heterophilic and homophilic graph benchmarks, and their analyses of HGNN performance assume a fixed data distribution, without exploring the effect of structural differences between training and testing nodes.

Data Augmentation Node Classification

Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs

2 code implementations • 14 Jun 2024 • Rui Yang, Ruomeng Ding, Yong Lin, Huan Zhang, Tong Zhang

Reward models trained on human preference data have been proven to effectively align Large Language Models (LLMs) with human intent within the framework of reinforcement learning from human feedback (RLHF).

Language Modeling Language Modelling +1

On the Benefits of Over-parameterization for Out-of-Distribution Generalization

no code implementations • 26 Mar 2024 • Yifan Hao, Yong Lin, Difan Zou, Tong Zhang

We demonstrate that in this scenario, further increasing the model's parameterization can significantly reduce the OOD loss.

Out-of-Distribution Generalization

A Sober Look at the Robustness of CLIPs to Spurious Features

no code implementations • 18 Mar 2024 • Qizhou Wang, Yong Lin, Yongqiang Chen, Ludwig Schmidt, Bo Han, Tong Zhang

Large vision language models, such as CLIP, demonstrate more impressive robustness to spurious features than single-modal models trained on ImageNet.

Benchmarking

Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards

1 code implementation • 28 Feb 2024 • Haoxiang Wang, Yong Lin, Wei Xiong, Rui Yang, Shizhe Diao, Shuang Qiu, Han Zhao, Tong Zhang

Additionally, DPA models user preferences as directions (i.e., unit vectors) in the reward space to achieve user-dependent preference control.
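The unit-vector idea can be illustrated in a few lines: a multi-objective reward vector is collapsed to a scalar by projecting it onto a user-chosen direction in reward space. This is a minimal sketch of that projection under assumed objective labels, not the paper's actual reward pipeline.

```python
import numpy as np

def scalar_reward(multi_rewards, preference_direction):
    """Project multi-objective rewards onto a unit preference direction."""
    v = np.asarray(preference_direction, dtype=float)
    v = v / np.linalg.norm(v)  # normalize to a unit vector, per the setup
    return float(np.dot(multi_rewards, v))

# Example with two objectives, e.g. helpfulness and conciseness (assumed labels).
rewards = np.array([0.8, 0.3])
print(scalar_reward(rewards, [1.0, 0.0]))  # care only about the first objective
print(scalar_reward(rewards, [1.0, 1.0]))  # weigh both objectives equally
```

Changing the direction vector changes how the same multi-objective rewards rank candidate responses, which is what makes the control user-dependent.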

The Instinctive Bias: Spurious Images lead to Illusion in MLLMs

1 code implementation • 6 Feb 2024 • Tianyang Han, Qing Lian, Rui Pan, Renjie Pi, Jipeng Zhang, Shizhe Diao, Yong Lin, Tong Zhang

In this paper, we identify a typical class of inputs that baffles MLLMs, which consist of images that are highly relevant but inconsistent with answers, causing MLLMs to suffer from visual illusion.

Hallucination

R-Tuning: Instructing Large Language Models to Say 'I Don't Know'

1 code implementation • 16 Nov 2023 • Hanning Zhang, Shizhe Diao, Yong Lin, Yi R. Fung, Qing Lian, Xingyao Wang, Yangyi Chen, Heng Ji, Tong Zhang

This approach is formalized by first identifying the disparity in knowledge encompassed by pre-trained parameters compared to that of instruction tuning data.

Hallucination Sentence

Continuous Invariance Learning

no code implementations • 9 Oct 2023 • Yong Lin, Fan Zhou, Lu Tan, Lintao Ma, Jiameng Liu, Yansu He, Yuan Yuan, Yu Liu, James Zhang, Yujiu Yang, Hao Wang

To address this challenge, we then propose Continuous Invariance Learning (CIL), which extracts invariant features across continuously indexed domains.

Cloud Computing

Spurious Feature Diversification Improves Out-of-distribution Generalization

no code implementations • 29 Sep 2023 • Yong Lin, Lu Tan, Yifan Hao, Honam Wong, Hanze Dong, Weizhong Zhang, Yujiu Yang, Tong Zhang

Contrary to the conventional wisdom that focuses on learning invariant features for better OOD performance, our findings suggest that incorporating a large number of diverse spurious features weakens their individual contributions, leading to improved overall OOD generalization performance.

Out-of-Distribution Generalization

Mitigating the Alignment Tax of RLHF

1 code implementation • 12 Sep 2023 • Yong Lin, Hangyu Lin, Wei Xiong, Shizhe Diao, Jianmeng Liu, Jipeng Zhang, Rui Pan, Haoxiang Wang, Wenbin Hu, Hanning Zhang, Hanze Dong, Renjie Pi, Han Zhao, Nan Jiang, Heng Ji, Yuan Yao, Tong Zhang

Building on this analysis and the observation that averaging different layers of the transformer leads to significantly different alignment-forgetting trade-offs, we propose Heterogeneous Model Averaging (HMA), which searches for separate combination ratios for different model layers.

Common Sense Reasoning Continual Learning
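The per-layer averaging idea can be sketched directly: instead of one global interpolation ratio between the pre-RLHF and post-RLHF models, each layer gets its own ratio. This is an illustrative sketch with assumed names and ratios, not the paper's search procedure for finding those ratios.

```python
import numpy as np

def heterogeneous_average(pre_rlhf_layers, post_rlhf_layers, ratios):
    """Interpolate each layer's weights with its own combination ratio."""
    return [
        (1.0 - r) * a + r * b
        for a, b, r in zip(pre_rlhf_layers, post_rlhf_layers, ratios)
    ]

# Toy two-layer model: pre-RLHF weights are all ones, post-RLHF all zeros.
pre = [np.ones((2, 2)), np.ones((2, 2))]
post = [np.zeros((2, 2)), np.zeros((2, 2))]
merged = heterogeneous_average(pre, post, ratios=[0.2, 0.8])
print(merged[0][0, 0], merged[1][0, 0])  # 0.8 0.2
```

With a single global ratio both layers would be interpolated identically; letting the ratios differ per layer is what opens up the richer alignment-forgetting trade-offs described above.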

Optimal Sample Selection Through Uncertainty Estimation and Its Application in Deep Learning

no code implementations • 5 Sep 2023 • Yong Lin, Chen Liu, Chenlu Ye, Qing Lian, Yuan Yao, Tong Zhang

Our proposed method, COPS (unCertainty based OPtimal Sub-sampling), is designed to minimize the expected loss of a model trained on subsampled data.

Active Learning Deep Learning

What is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL?

1 code implementation • 30 May 2023 • Rui Yang, Yong Lin, Xiaoteng Ma, Hao Hu, Chongjie Zhang, Tong Zhang

In this paper, we study out-of-distribution (OOD) generalization of offline GCRL both theoretically and empirically to identify the factors that matter.

Imitation Learning Offline RL

Active Prompting with Chain-of-Thought for Large Language Models

2 code implementations • 23 Feb 2023 • Shizhe Diao, Pengcheng Wang, Yong Lin, Rui Pan, Xiang Liu, Tong Zhang

For this purpose, we propose a solution to the key problem of determining which questions are the most important and helpful ones to annotate from a pool of task-specific queries.

Active Learning Zero-Shot Learning

Model Agnostic Sample Reweighting for Out-of-Distribution Learning

1 code implementation • 24 Jan 2023 • Xiao Zhou, Yong Lin, Renjie Pi, Weizhong Zhang, Renzhe Xu, Peng Cui, Tong Zhang

The overfitting issue is addressed by considering a bilevel formulation to search for the sample reweighting, in which the generalization complexity depends on the search space of sample weights instead of the model size.

Probabilistic Bilevel Coreset Selection

no code implementations • 24 Jan 2023 • Xiao Zhou, Renjie Pi, Weizhong Zhang, Yong Lin, Tong Zhang

The goal of coreset selection in supervised learning is to produce a weighted subset of data, so that training only on the subset achieves similar performance as training on the entire dataset.

Bilevel Optimization Continual Learning

Stable Learning via Sparse Variable Independence

no code implementations • 2 Dec 2022 • Han Yu, Peng Cui, Yue He, Zheyan Shen, Yong Lin, Renzhe Xu, Xingxuan Zhang

The problem of covariate-shift generalization has attracted intensive research attention.

Variable Selection

Particle-based Variational Inference with Preconditioned Functional Gradient Flow

no code implementations • 25 Nov 2022 • Hanze Dong, Xi Wang, Yong Lin, Tong Zhang

With the popularity of Stein variational gradient descent (SVGD), the focus of particle-based VI algorithms has been on the properties of functions in Reproducing Kernel Hilbert Space (RKHS) to approximate the gradient flow.

Variational Inference

Self-Guided Noise-Free Data Generation for Efficient Zero-Shot Learning

2 code implementations • 25 May 2022 • Jiahui Gao, Renjie Pi, Yong Lin, Hang Xu, Jiacheng Ye, Zhiyong Wu, Weizhong Zhang, Xiaodan Liang, Zhenguo Li, Lingpeng Kong

In this paradigm, the synthesized data from the PLM acts as the carrier of knowledge, which is used to train a task-specific model with orders of magnitude fewer parameters than the PLM, achieving both higher performance and efficiency than prompt-based zero-shot learning methods on PLMs.

text-classification Text Classification +1

ZIN: When and How to Learn Invariance Without Environment Partition?

1 code implementation • 11 Mar 2022 • Yong Lin, Shengyu Zhu, Lu Tan, Peng Cui

When data are divided into distinct environments according to the heterogeneity, recent invariant learning methods have proposed to learn robust and invariant models based on this environment partition.

Black-box Prompt Learning for Pre-trained Language Models

1 code implementation • 21 Jan 2022 • Shizhe Diao, Zhichao Huang, Ruijia Xu, Xuechun Li, Yong Lin, Xiao Zhou, Tong Zhang

Particularly, instead of fine-tuning the model in the cloud, we adapt PLMs by prompt learning, which efficiently optimizes only a few parameters of the discrete prompts.

Prompt Learning text-classification +1

Bayesian Invariant Risk Minimization

no code implementations • CVPR 2022 • Yong Lin, Hanze Dong, Hao Wang, Tong Zhang

Generalization under distributional shift is an open challenge for machine learning.

Bayesian Inference

Homologies of path complexes and digraphs

1 code implementation • 12 Jul 2012 • Alexander Grigor'yan, Yong Lin, Yuri Muranov, Shing-Tung Yau

In this paper we introduce a path complex that can be regarded as a generalization of the notion of a simplicial complex.

Combinatorics Algebraic Topology
