Search Results for author: Shusheng Xu

Found 7 papers, 1 paper with code

Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study

no code implementations • 16 Apr 2024 • Shusheng Xu, Wei Fu, Jiaxuan Gao, Wenjie Ye, Weilin Liu, Zhiyu Mei, Guangju Wang, Chao Yu, Yi Wu

However, in academic benchmarks, state-of-the-art results are often achieved via reward-free methods, such as Direct Preference Optimization (DPO).

Code Generation

A Benchmark for Low-Switching-Cost Reinforcement Learning

no code implementations • 13 Dec 2021 • Shusheng Xu, Yancheng Liang, Yunfei Li, Simon Shaolei Du, Yi Wu

A ubiquitous requirement in many practical reinforcement learning (RL) applications, including medical treatment, recommendation system, education and robotics, is that the deployed policy that actually interacts with the environment cannot change frequently.

Atari Games, reinforcement-learning +1

Native Chinese Reader: A Dataset Towards Native-Level Chinese Machine Reading Comprehension

no code implementations • 13 Dec 2021 • Shusheng Xu, Yichen Liu, Xiaoyu Yi, Siyuan Zhou, Huizi Li, Yi Wu

We present Native Chinese Reader (NCR), a new machine reading comprehension (MRC) dataset with particularly long articles in both modern and classical Chinese.

Common Sense Reasoning, Machine Reading Comprehension

PhyloTransformer: A Discriminative Model for Mutation Prediction Based on a Multi-head Self-attention Mechanism

no code implementations • 3 Nov 2021 • Yingying Wu, Shusheng Xu, Shing-Tung Yau, Yi Wu

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused an ongoing pandemic infecting 219 million people as of 10/19/21, with a 3.6% mortality rate.

Language Modelling

Sequence Level Contrastive Learning for Text Summarization

no code implementations • 8 Sep 2021 • Shusheng Xu, Xingxing Zhang, Yi Wu, Furu Wei

In this paper, we propose a contrastive learning model for supervised abstractive text summarization, where we view a document, its gold summary and its model generated summaries as different views of the same mean representation and maximize the similarities between them during training.

Abstractive Text Summarization, Contrastive Learning +2

Deep Q-Learning with Low Switching Cost

no code implementations • 1 Jan 2021 • Shusheng Xu, Simon Shaolei Du, Yi Wu

We initiate the study on deep reinforcement learning problems that require low switching cost, i.e., small number of policy switches during training.

Atari Games, Q-Learning +2
