Search Results for author: Tianchi Cai

Found 11 papers, 2 papers with code

Face4RAG: Factual Consistency Evaluation for Retrieval Augmented Generation in Chinese

no code implementations · 1 Jul 2024 · Yunqi Xu, Tianchi Cai, Jiyan Jiang, Xierui Song

To fix this issue, we further propose a new method, L-Face4RAG, with two novel designs: logic-preserving answer decomposition and fact-logic FCE.

RAG · Retrieval

FoRAG: Factuality-optimized Retrieval Augmented Generation for Web-enhanced Long-form Question Answering

no code implementations · 19 Jun 2024 · Tianchi Cai, Zhiwen Tan, Xierui Song, Tao Sun, Jiyan Jiang, Yunqi Xu, Yinger Zhang, Jinjie Gu

Retrieval Augmented Generation (RAG) has become prevalent in question-answering (QA) tasks due to its ability to leverage search engines to improve the quality of long-form question answering (LFQA).

Answer Generation · Long Form Question Answering · +2

Mitigate Position Bias with Coupled Ranking Bias on CTR Prediction

no code implementations · 29 May 2024 · Yao Zhao, Zhining Liu, Tianchi Cai, Haipeng Zhang, Chenyi Zhuang, Jinjie Gu

Using both synthetic and industrial datasets, we first show how this widely co-occurring ranking bias degrades the performance of existing position bias estimation methods.

Click-Through Rate Prediction · Position · +1
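The abstract above contrasts position bias with a coupled ranking bias. As background, a minimal sketch of the standard position-as-feature debiasing baseline for CTR prediction is shown below; this is a common baseline with hypothetical feature names, not the paper's coupled-ranking-bias method:

```python
# Sketch of the position-as-feature CTR debiasing baseline: train with the
# display position as an input, then fix it to a reference slot at serving
# time so candidates are compared on relevance alone. Illustrative only.
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_ctr(samples, epochs=20, lr=0.1):
    """SGD logistic regression on (relevance_feature, position) -> click.

    `samples` is a list of (x, pos, click) tuples; `pos` is 1-indexed.
    """
    w_x, w_pos, b = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for x, pos, click in samples:
            p = sigmoid(w_x * x + w_pos * pos + b)
            g = p - click  # gradient of log loss w.r.t. the logit
            w_x -= lr * g * x
            w_pos -= lr * g * pos
            b -= lr * g
    return w_x, w_pos, b

def predict_unbiased(params, x, reference_pos=1):
    """Score a candidate with the position input fixed to a reference slot."""
    w_x, w_pos, b = params
    return sigmoid(w_x * x + w_pos * reference_pos + b)
```

The paper's point, as the excerpt suggests, is that such estimators degrade when a ranking bias is coupled with the position bias; the sketch above ignores that coupling.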

ULMA: Unified Language Model Alignment with Human Demonstration and Point-wise Preference

1 code implementation · 5 Dec 2023 · Tianchi Cai, Xierui Song, Jiyan Jiang, Fei Teng, Jinjie Gu, Guannan Zhang

Aligning language models with human expectations, e.g., being helpful and harmless, has become a pressing challenge for large language models.

Language Modelling · Large Language Model
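The point-wise preference setting in the title (a binary quality label per response, rather than a pairwise comparison between two responses) can be illustrated with a toy per-sample loss. This is an illustrative binary cross-entropy over hypothetical scalar scores, not ULMA's actual objective:

```python
# Toy point-wise preference loss: each (prompt, response) gets its own
# binary label (1 = preferred/demonstration, 0 = dispreferred), so no
# pairwise comparison is needed. Illustrative sketch only.
import math

def pointwise_preference_loss(scores, labels):
    """Mean binary cross-entropy on per-response scalar scores."""
    total = 0.0
    for s, y in zip(scores, labels):
        p = 1.0 / (1.0 + math.exp(-s))  # probability the response is preferred
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(scores)
```

A point-wise loss like this can consume human demonstrations (label 1) and flagged bad outputs (label 0) in one unified objective, which is the flavor of unification the title refers to.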

Marketing Budget Allocation with Offline Constrained Deep Reinforcement Learning

no code implementations · 6 Sep 2023 · Tianchi Cai, Jiyan Jiang, Wenpeng Zhang, Shiji Zhou, Xierui Song, Li Yu, Lihong Gu, Xiaodong Zeng, Jinjie Gu, Guannan Zhang

We further show that this method is guaranteed to converge to the optimal policy, which cannot be achieved by previous value-based reinforcement learning methods for marketing budget allocation.

Marketing · reinforcement-learning

Model-free Reinforcement Learning with Stochastic Reward Stabilization for Recommender Systems

no code implementations · 25 Aug 2023 · Tianchi Cai, Shenliao Bao, Jiyan Jiang, Shiji Zhou, Wenpeng Zhang, Lihong Gu, Jinjie Gu, Guannan Zhang

Model-free RL-based recommender systems have recently received increasing research attention due to their capability to handle partial feedback and long-term rewards.

Recommendation Systems · reinforcement-learning

Generalized Consistent Multi-Class Classification with Rejection to be Compatible with Arbitrary Losses

2 code implementations · Conference 2022 · Yuzhou Cao, Tianchi Cai, Lei Feng, Lihong Gu, Jinjie Gu, Bo An, Gang Niu, Masashi Sugiyama

Classification with rejection (CwR) refrains from making a prediction to avoid critical misclassification when encountering test samples that are difficult to classify.

Classification · Multi-class Classification
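As a point of reference for CwR, the classical confidence-threshold baseline (Chow's rule) rejects whenever the top class probability is too low. The sketch below shows that baseline, not the paper's generalized method that is compatible with arbitrary losses:

```python
# Chow's rule: predict the argmax class, but abstain when the maximum
# posterior probability falls below a threshold. With rejection cost c,
# the classical threshold is 1 - c. Illustrative baseline only.
def classify_with_rejection(probs, threshold=0.7):
    """Return the argmax class index, or None to reject."""
    best = max(range(len(probs)), key=lambda k: probs[k])
    return best if probs[best] >= threshold else None
```

This rule requires well-calibrated probabilities; loss-compatible CwR methods such as the one in this paper aim to sidestep that requirement.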

A Policy Efficient Reduction Approach to Convex Constrained Deep Reinforcement Learning

no code implementations · 29 Aug 2021 · Tianchi Cai, Wenpeng Zhang, Lihong Gu, Xiaodong Zeng, Jinjie Gu

To apply value-based methods to CRL, a recent groundbreaking line of game-theoretic approaches uses a mixed policy that randomizes among a set of carefully generated policies to converge to the desired constraint-satisfying policy.

General Reinforcement Learning · reinforcement-learning · +1
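The mixed-policy idea in the excerpt can be illustrated with a two-policy toy: choose the mixing probability so the mixture's expected cost exactly meets the constraint budget. The numbers and helper names are hypothetical, and this is not the paper's game-theoretic policy-generation procedure:

```python
# Toy mixed policy for constrained RL: randomize between a cheap policy A
# and an expensive policy B so that the expected cost hits the budget.
# Assumes cost_a <= budget <= cost_b. Illustrative sketch only.
def mix_weight(cost_a, cost_b, budget):
    """Probability of playing policy A so that
    p * cost_a + (1 - p) * cost_b == budget."""
    return (cost_b - budget) / (cost_b - cost_a)

def expected_cost(p, cost_a, cost_b):
    """Expected cost of the mixture that plays A with probability p."""
    return p * cost_a + (1 - p) * cost_b
```

The paper's concern is policy efficiency: such game-theoretic constructions may mix over many policies, and the reduction approach seeks mixtures supported on only a few.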

LinkLouvain: Link-Aware A/B Testing and Its Application on Online Marketing Campaign

no code implementations · 3 Feb 2021 · Tianchi Cai, Daxi Cheng, Chen Liang, Ziqi Liu, Lihong Gu, Huizhi Xie, Zhiqiang Zhang, Xiaodong Zeng, Jinjie Gu

In this paper, we analyze the network A/B testing problem under a real-world online marketing campaign, describe our proposed LinkLouvain method, and evaluate it on real-world data.

Link Prediction · Marketing

Robust Offline Reinforcement Learning from Low-Quality Data

no code implementations · 1 Jan 2021 · Wenjie Shi, Tianchi Cai, Shiji Song, Lihong Gu, Jinjie Gu, Gao Huang

We theoretically show that AdaPT produces a tight upper bound on the distributional deviation between the learned policy and the behavior policy, and this upper bound is the minimum requirement to guarantee policy improvement at each iteration.

Continuous Control · Offline RL · +2

A Reduction Approach to Constrained Reinforcement Learning

no code implementations · 1 Jan 2021 · Tianchi Cai, Wenjie Shi, Lihong Gu, Xiaodong Zeng, Jinjie Gu

In this paper, we present a reduction approach to find sparse policies that randomize among a constant number of policies for the constrained RL problem.

Diversity · reinforcement-learning · +1
