Search Results for author: Yingxiang Yang

Found 13 papers, 2 papers with code

Reward-Augmented Data Enhances Direct Preference Alignment of LLMs

1 code implementation • 10 Oct 2024 • Shenao Zhang, Zhihan Liu, Boyi Liu, Yufeng Zhang, Yingxiang Yang, Yongfei Liu, Liyu Chen, Tao Sun, Zhaoran Wang

This dataset is easily integrated with existing direct alignment algorithms and is applicable to any preference dataset.

Instruction Following

Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer

no code implementations • 26 May 2024 • Zhihan Liu, Miao Lu, Shenao Zhang, Boyi Liu, Hongyi Guo, Yingxiang Yang, Jose Blanchet, Zhaoran Wang

To mitigate overoptimization, we first propose a theoretical algorithm that chooses the best policy for an adversarially chosen reward model; one that simultaneously minimizes the maximum likelihood estimation of the loss and a reward penalty term.

How Can LLM Guide RL? A Value-Based Approach

1 code implementation • 25 Feb 2024 • Shenao Zhang, Sirui Zheng, Shuqi Ke, Zhihan Liu, Wanxin Jin, Jianbo Yuan, Yingxiang Yang, Hongxia Yang, Zhaoran Wang

Specifically, we develop an algorithm named LINVIT that incorporates LLM guidance as a regularization factor in value-based RL, leading to significant reductions in the amount of data needed for learning, particularly when the difference between the ideal policy and the LLM-informed policy is small, which suggests that the initial policy is close to optimal, reducing the need for further exploration.

Decision Making, Reinforcement Learning (RL)
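
To illustrate what "LLM guidance as a regularization factor" can mean, here is a minimal sketch, not the LINVIT algorithm itself. It assumes a finite action set, a KL penalty toward the LLM-suggested policy, and a hypothetical regularization weight `lam`. Under those assumptions, the policy maximizing expected value minus `lam` times KL(pi || llm_probs) has the closed form pi(a) proportional to llm_probs[a] * exp(Q(a) / lam). All names and numbers below are illustrative.

```python
import math

def kl_regularized_policy(q_values, llm_probs, lam):
    """Maximizer of E_pi[Q] - lam * KL(pi || llm_probs) over a finite action set.

    Closed form: pi(a) is proportional to llm_probs[a] * exp(q_values[a] / lam).
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    # Work in log space and subtract the max for numerical stability.
    logits = [
        math.log(p) + q / lam if p > 0 else float("-inf")
        for q, p in zip(q_values, llm_probs)
    ]
    top = max(logits)
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

q_values = [1.0, 0.5, 0.0]    # hypothetical value estimates for three actions
llm_probs = [0.2, 0.7, 0.1]   # hypothetical LLM-suggested action distribution

strong = kl_regularized_policy(q_values, llm_probs, lam=100.0)  # stays near llm_probs
weak = kl_regularized_policy(q_values, llm_probs, lam=0.01)     # nearly greedy on Q
```

With a large `lam` the policy barely moves from the LLM's suggestion; with a small `lam` it follows the value estimates. This matches the intuition in the abstract that a good LLM-informed policy reduces the need for exploration.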

Reason out Your Layout: Evoking the Layout Master from Large Language Models for Text-to-Image Synthesis

no code implementations • 28 Nov 2023 • Xiaohui Chen, Yongfei Liu, Yingxiang Yang, Jianbo Yuan, Quanzeng You, Li-Ping Liu, Hongxia Yang

Recent advancements in text-to-image (T2I) generative models have shown remarkable capabilities in producing diverse and imaginative visuals based on text prompts.

Image Generation

Let Models Speak Ciphers: Multiagent Debate through Embeddings

no code implementations • 10 Oct 2023 • Chau Pham, Boyi Liu, Yingxiang Yang, Zhengyu Chen, Tianyi Liu, Jianbo Yuan, Bryan A. Plummer, Zhaoran Wang, Hongxia Yang

Although natural language is an obvious choice for communication due to LLMs' language understanding capability, the token sampling step needed when generating natural language poses a potential risk of information loss, as it uses only one token to represent the model's belief across the entire vocabulary.
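
The information loss described above can be made concrete with a small sketch. It contrasts sampling a single token with taking a probability-weighted average of token embeddings, which is one plausible reading of "debate through embeddings" and is not taken from the paper's code. The three-token vocabulary, the embeddings, and the function names are all hypothetical.

```python
import random

def sample_token_embedding(probs, embeddings, rng):
    """Sample one token and return its embedding; the rest of the distribution is discarded."""
    idx = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return embeddings[idx]

def expected_embedding(probs, embeddings):
    """Probability-weighted average of token embeddings; reflects the whole distribution."""
    dim = len(embeddings[0])
    return [sum(p * e[d] for p, e in zip(probs, embeddings)) for d in range(dim)]

embeddings = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]  # hypothetical 3-token vocabulary
probs = [0.5, 0.4, 0.1]                             # hypothetical next-token distribution

sampled = sample_token_embedding(probs, embeddings, random.Random(0))
averaged = expected_embedding(probs, embeddings)
```

The sampled vector is always exactly one row of `embeddings`, so a 0.5/0.4 split and a 0.9/0.05 split can produce the same message. The averaged vector changes whenever the distribution does.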

The Devil is in the Detail: A Framework for Macroscopic Prediction via Microscopic Models

no code implementations • NeurIPS 2020 • Yingxiang Yang, Negar Kiyavash, Le Song, Niao He

Macroscopic data aggregated from microscopic events are pervasive in machine learning, such as country-level COVID-19 infection statistics based on city-level data.

Stochastic Optimization

Learning Positive Functions with Pseudo Mirror Descent

no code implementations • NeurIPS 2019 • Yingxiang Yang, Haoxiang Wang, Negar Kiyavash, Niao He

The nonparametric learning of positive-valued functions appears widely in machine learning, especially in the context of estimating intensity functions of point processes.

Computational Efficiency, Point Processes

Predictive Approximate Bayesian Computation via Saddle Points

no code implementations • NeurIPS 2018 • Yingxiang Yang, Bo Dai, Negar Kiyavash, Niao He

Approximate Bayesian computation (ABC) is an important methodology for Bayesian inference when the likelihood function is intractable.

Bayesian Inference, regression
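
For readers unfamiliar with ABC, the sketch below shows the classic rejection variant, not the saddle-point method this paper proposes. It assumes a simulator that can be sampled even though its likelihood is never evaluated, a scalar summary statistic, and a hypothetical acceptance `tolerance`. The Gaussian toy model and all names are illustrative.

```python
import random
import statistics

def rejection_abc(observed, simulate, sample_prior, summary, tolerance, n_draws, rng):
    """Keep prior draws whose simulated summary lands within `tolerance` of the observed one."""
    accepted = []
    s_obs = summary(observed)
    for _ in range(n_draws):
        theta = sample_prior(rng)                        # draw a candidate parameter
        synthetic = simulate(theta, len(observed), rng)  # simulate data; no likelihood needed
        if abs(summary(synthetic) - s_obs) <= tolerance:
            accepted.append(theta)
    return accepted

rng = random.Random(0)
true_mean = 2.0  # hypothetical ground truth used only to generate toy data
observed = [rng.gauss(true_mean, 1.0) for _ in range(50)]

posterior_draws = rejection_abc(
    observed,
    simulate=lambda theta, n, r: [r.gauss(theta, 1.0) for _ in range(n)],
    sample_prior=lambda r: r.uniform(-5.0, 5.0),
    summary=statistics.fmean,
    tolerance=0.2,
    n_draws=5000,
    rng=rng,
)
```

The accepted draws approximate the posterior over the mean. Shrinking `tolerance` tightens the approximation at the cost of fewer accepted samples.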

Detecting Nonlinear Causality in Multivariate Time Series with Sparse Additive Models

no code implementations • 11 Mar 2018 • Yingxiang Yang, Adams Wei Yu, Zhaoran Wang, Tuo Zhao

We propose a nonparametric method for detecting nonlinear causal relationships within a set of multidimensional discrete time series by using sparse additive models (SpAMs).

Additive models, Model Selection

Nonparametric Hawkes Processes: Online Estimation and Generalization Bounds

no code implementations • 25 Jan 2018 • Yingxiang Yang, Jalal Etesami, Niao He, Negar Kiyavash

In this paper, we design a nonparametric online algorithm for estimating the triggering functions of multivariate Hawkes processes.

Generalization Bounds
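
As background on what a triggering function is, the sketch below evaluates the standard conditional intensity of a multivariate Hawkes process, lambda_i(t) = mu_i + sum over past events (t_k, j) of f_ij(t - t_k). It is not the paper's estimator. The exponential kernel, its coefficients, and the event list are assumptions chosen for illustration; the paper estimates the f_ij nonparametrically.

```python
import math

def hawkes_intensity(t, i, events, base_rates, trigger):
    """Intensity of dimension i at time t given past events [(time, dimension), ...]."""
    rate = base_rates[i]
    for t_k, j in events:
        if t_k < t:  # only strictly earlier events excite the process
            rate += trigger(i, j, t - t_k)
    return rate

def exp_trigger(i, j, dt, alpha=((0.5, 0.1), (0.2, 0.3)), beta=1.0):
    """Hypothetical exponential triggering function f_ij(dt) = alpha[i][j] * exp(-beta * dt)."""
    return alpha[i][j] * math.exp(-beta * dt)

events = [(0.5, 0), (1.0, 1), (2.5, 0)]  # hypothetical (time, dimension) pairs
base_rates = [0.2, 0.1]                  # hypothetical background rates mu_i

rate_now = hawkes_intensity(2.0, 0, events, base_rates, exp_trigger)
```

Here the event at time 2.5 is ignored because it lies in the future of t = 2.0, so only the first two events contribute to `rate_now`.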

Online Learning for Multivariate Hawkes Processes

no code implementations • NeurIPS 2017 • Yingxiang Yang, Jalal Etesami, Niao He, Negar Kiyavash

We develop a nonparametric and online learning algorithm that estimates the triggering functions of a multivariate Hawkes process (MHP).

Efficient Neighborhood Selection for Gaussian Graphical Models

no code implementations • 22 Sep 2015 • Yingxiang Yang, Jalal Etesami, Negar Kiyavash

This paper addresses the problem of neighborhood selection for Gaussian graphical models.
