Search Results for author: Xiaoying Zhang

Found 15 papers, 4 papers with code

GI-Free Pilot-Aided Channel Estimation for Affine Frequency Division Multiplexing Systems

no code implementations • 1 Apr 2024 • Yu Zhou, Haoran Yin, Nanhao Zhou, Yanqun Tang, Xiaoying Zhang, Weijie Yuan

The recently developed affine frequency division multiplexing (AFDM) can achieve full diversity in doubly selective channels, providing a comprehensive sparse representation of the delay-Doppler domain channel.

Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards

no code implementations • 12 Mar 2024 • Wei Shen, Xiaoying Zhang, Yuanshun Yao, Rui Zheng, Hongyi Guo, Yang Liu

Reinforcement learning from human feedback (RLHF) is the mainstream paradigm used to align large language models (LLMs) with human preferences.

reinforcement-learning

Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation

no code implementations • 8 Mar 2024 • Xiaoying Zhang, Jean-Francois Ton, Wei Shen, Hongning Wang, Yang Liu

We introduce Adversarial Policy Optimization (AdvPO), a novel solution to the pervasive issue of reward over-optimization in Reinforcement Learning from Human Feedback (RLHF) for Large Language Models (LLMs).

Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation

no code implementations • 14 Feb 2024 • Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Lifeng Jin, Linfeng Song, Haitao Mi, Helen Meng

Despite showing increasingly human-like abilities, large language models (LLMs) often struggle with factual inaccuracies, i.e., "hallucinations", even when they hold relevant knowledge.

Human-Instruction-Free LLM Self-Alignment with Limited Samples

no code implementations • 6 Jan 2024 • Hongyi Guo, Yuanshun Yao, Wei Shen, Jiaheng Wei, Xiaoying Zhang, Zhaoran Wang, Yang Liu

The key idea is to first retrieve high-quality samples related to the target domain and use them as In-context Learning examples to generate more samples.

In-Context Learning, Instruction Following

Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories?

no code implementations • 29 Aug 2023 • Jingyan Zhou, Minda Hu, Junan Li, Xiaoying Zhang, Xixin Wu, Irwin King, Helen Meng

Our analysis exhibits the potentials and flaws in existing resources (models and datasets) in developing explainable moral judgment-making systems.

Ethics

Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment

1 code implementation • 10 Aug 2023 • Yang Liu, Yuanshun Yao, Jean-Francois Ton, Xiaoying Zhang, Ruocheng Guo, Hao Cheng, Yegor Klochkov, Muhammad Faaiz Taufiq, Hang Li

However, a major challenge faced by practitioners is the lack of clear guidance on evaluating whether LLM outputs align with social norms, values, and regulations.

Fairness Models Alignment

SGP-TOD: Building Task Bots Effortlessly via Schema-Guided LLM Prompting

no code implementations • 15 May 2023 • Xiaoying Zhang, Baolin Peng, Kun Li, Jingyan Zhou, Helen Meng

Building end-to-end task bots and maintaining their integration with new functionalities using minimal human efforts is a long-standing challenge in dialog research.

dialog state tracking

Debiasing Recommendation by Learning Identifiable Latent Confounders

1 code implementation • 10 Feb 2023 • Qing Zhang, Xiaoying Zhang, Yang Liu, Hongning Wang, Min Gao, Jiheng Zhang, Ruocheng Guo

Confounding bias arises due to the presence of unmeasured variables (e.g., the socio-economic status of a user) that can affect both a user's exposure and feedback.

Causal Inference, counterfactual

Disentangled Representation for Diversified Recommendations

1 code implementation • 13 Jan 2023 • Xiaoying Zhang, Hongning Wang, Hang Li

This calls for a fine-grained understanding of a user's preferences over items, where one needs to recognize whether the user's choice is driven by the quality of the item itself or by the pre-selected attributes of the item.

Low-Interception Waveform: To Prevent the Recognition of Spectrum Waveform Modulation via Adversarial Examples

no code implementations • 20 Jan 2022 • Haidong Xie, Jia Tan, Xiaoying Zhang, Nan Ji, Haihua Liao, Zuguo Yu, Xueshuang Xiang, Naijin Liu

This leads to the problem of a malicious third party using a deep learning model to easily recognize the modulation format of the transmitted waveform.

Toward Self-learning End-to-End Task-Oriented Dialog Systems

no code implementations • SIGDIAL (ACL) 2022 • Xiaoying Zhang, Baolin Peng, Jianfeng Gao, Helen Meng

In this paper, we study the problem of automatically adapting task bots to changing environments by learning from human-bot interactions with minimum or zero human annotations.

Reinforcement Learning (RL)

A Low Complexity Learning-based Channel Estimation for OFDM Systems with Online Training

no code implementations • 14 Jul 2021 • Kai Mei, Jun Liu, Xiaoying Zhang, Kuo Cao, Nandana Rajatheva, Jibo Wei

Besides, a training data construction approach utilizing least square (LS) estimation results is proposed so that the training data can be collected during the data transmission.

BIG-bench Machine Learning

Unstructured Knowledge Access in Task-oriented Dialog Modeling using Language Inference, Knowledge Retrieval and Knowledge-Integrative Response Generation

1 code implementation • 15 Jan 2021 • Mudit Chaudhary, Borislav Dzodzo, Sida Huang, Chun Hei Lo, Mingzhi Lyu, Lun Yiu Nie, Jinbo Xing, Tianhua Zhang, Xiaoying Zhang, Jingyan Zhou, Hong Cheng, Wai Lam, Helen Meng

Dialog systems enriched with external knowledge can handle user queries that are outside the scope of the supporting databases/APIs.

Natural Language Inference, Response Generation

Conversational Contextual Bandit: Algorithm and Application

no code implementations • 4 Jun 2019 • Xiaoying Zhang, Hong Xie, Hang Li, John C. S. Lui

Here, a key-term can relate to a subset of arms, for example, a category of articles in news recommendation.

News Recommendation, Recommendation Systems
