Search Results for author: Qin Ding

Found 5 papers, 0 papers with code

A Unified Knowledge-Distillation and Semi-Supervised Learning Framework to Improve Industrial Ads Delivery Systems

no code implementations · 5 Feb 2025 · Hamid Eghbalzadeh, Yang Wang, Rui Li, Yuji Mo, Qin Ding, Jiaxiang Fu, Liang Dai, Shuo Gu, Nima Noorshams, Sem Park, Bo Long, Xue Feng

Industrial ads ranking systems conventionally rely on labeled impression data, which leads to challenges such as overfitting, slower incremental gain from model scaling, and biases due to discrepancies between training and serving data.

Knowledge Distillation
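
As background for the entry above, here is a minimal sketch of a combined distillation-plus-pseudo-labeling objective for a binary CTR-style ranker. The loss weighting `alpha`, temperature `T`, and the teacher/student logit split are illustrative assumptions, not the paper's actual framework.

```python
# Hedged sketch: supervised BCE on labeled impressions + a distillation term
# against teacher soft predictions, which doubles as a pseudo-label signal on
# unlabeled data. All hyper-parameters here are illustrative assumptions.
import torch
import torch.nn.functional as F

def kd_ssl_loss(student_logits, teacher_logits, labels, labeled_mask,
                alpha=0.5, T=2.0):
    """labels: 0/1 clicks where available; labeled_mask: 1.0 where labels exist."""
    # Supervised term, restricted to labeled impressions.
    sup = F.binary_cross_entropy_with_logits(student_logits, labels,
                                             reduction="none")
    sup = (sup * labeled_mask).sum() / labeled_mask.sum().clamp(min=1.0)
    # Distillation term: match tempered teacher probabilities on all examples.
    teacher_probs = torch.sigmoid(teacher_logits / T)
    kd = F.binary_cross_entropy_with_logits(student_logits / T, teacher_probs)
    return alpha * sup + (1.0 - alpha) * kd
```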

Syndicated Bandits: A Framework for Auto Tuning Hyper-parameters in Contextual Bandit Algorithms

no code implementations · 5 Jun 2021 · Qin Ding, Yue Kang, Yi-Wei Liu, Thomas C. M. Lee, Cho-Jui Hsieh, James Sharpnack

To tackle this problem, we first propose a two-layer bandit structure for auto-tuning the exploration parameter, and further generalize it to the Syndicated Bandits framework, which can learn multiple hyper-parameters dynamically in a contextual bandit environment.

Recommendation Systems
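
To make the two-layer idea concrete, here is a hedged sketch in the spirit of the abstract: an outer adversarial bandit (EXP3) picks the exploration rate `alpha` for an inner LinUCB each round. The candidate grid, EXP3 weighting, and the assumption of rewards in [0, 1] are simplifications, not the paper's exact construction.

```python
# Sketch of a two-layer hyper-parameter-tuning bandit: outer EXP3 over a grid
# of exploration rates, inner LinUCB. Illustrative assumptions throughout.
import numpy as np

class Exp3:
    def __init__(self, k, gamma=0.1):
        self.w = np.ones(k); self.gamma = gamma; self.k = k
    def probs(self):
        return (1 - self.gamma) * self.w / self.w.sum() + self.gamma / self.k
    def pick(self, rng):
        p = self.probs(); return rng.choice(self.k, p=p), p
    def update(self, i, p, reward):           # rewards assumed in [0, 1]
        self.w[i] *= np.exp(self.gamma * reward / (p[i] * self.k))

def run_round(A, b, contexts, alphas, outer, rng, pull):
    """A, b: LinUCB statistics; contexts: (n_arms, d) features;
    pull: environment callback returning a reward in [0, 1]."""
    i, p = outer.pick(rng)                     # outer layer: choose alpha
    Ainv = np.linalg.inv(A)
    theta = Ainv @ b
    ucb = contexts @ theta + alphas[i] * np.sqrt(
        np.einsum("ad,dk,ak->a", contexts, Ainv, contexts))
    arm = int(np.argmax(ucb))                  # inner layer: LinUCB with alpha_i
    r = pull(arm)
    x = contexts[arm]
    A += np.outer(x, x); b += r * x            # update inner statistics
    outer.update(i, p, r)                      # outer learns from the same reward
    return arm, r
```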

Robust Stochastic Linear Contextual Bandits Under Adversarial Attacks

no code implementations · 5 Jun 2021 · Qin Ding, Cho-Jui Hsieh, James Sharpnack

We provide theoretical guarantees for the proposed algorithm and show experimentally that it improves robustness against several kinds of popular attacks.

Multi-Armed Bandits · Recommendation Systems
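
For context on the threat model, here is an illustrative sketch of one "popular" style of reward-poisoning attack from the bandit-attack literature (push non-target arms' observed rewards toward zero under a corruption budget). This is background only, not the paper's defense algorithm, and the specific attack rule is an assumption.

```python
# Hedged sketch of a budgeted reward-poisoning attack on a bandit environment.
import numpy as np

def attack_reward(r, arm, target_arm, budget_left, eps=0.1):
    """Return a (possibly corrupted) reward and the remaining corruption budget."""
    if arm != target_arm and budget_left > 0:
        corrupted = min(r, eps)                # make non-target arms look bad
        return corrupted, budget_left - abs(r - corrupted)
    return r, budget_left                      # target arm is left untouched
```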

An Efficient Algorithm For Generalized Linear Bandit: Online Stochastic Gradient Descent and Thompson Sampling

no code implementations · 7 Jun 2020 · Qin Ding, Cho-Jui Hsieh, James Sharpnack

A natural way to resolve this problem is to apply online stochastic gradient descent (SGD) so that the per-step time and memory complexity can be reduced to constant with respect to $t$, but a contextual bandit policy based on online SGD updates that balances exploration and exploitation has remained elusive.

Thompson Sampling
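
The abstract's core idea admits a short sketch: one online SGD step per round keeps per-step cost O(d), constant with respect to $t$, while Thompson-style exploration comes from acting on a randomly perturbed parameter. The isotropic Gaussian perturbation and fixed step size below are simplifying assumptions, not the paper's exact SGD-TS construction.

```python
# Minimal sketch, assuming a logistic-reward generalized linear bandit.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sgd_ts_round(theta, contexts, pull, rng, eta=0.05, sigma=0.1):
    """contexts: (n_arms, d); pull(arm) returns a {0, 1} reward."""
    theta_tilde = theta + sigma * rng.standard_normal(theta.shape)  # TS sample
    arm = int(np.argmax(contexts @ theta_tilde))   # act greedily on the sample
    r = pull(arm)
    x = contexts[arm]
    grad = (sigmoid(x @ theta) - r) * x            # logistic-loss gradient
    theta -= eta * grad                            # single online SGD update
    return theta, arm, r
```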

Multiscale Non-stationary Stochastic Bandits

no code implementations · 13 Feb 2020 · Qin Ding, Cho-Jui Hsieh, James Sharpnack

Classic contextual bandit algorithms for linear models, such as LinUCB, assume that the reward distribution for an arm is modeled by a stationary linear regression.

Regression
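
For reference, here is a bare-bones version of the stationary LinUCB baseline the abstract refers to. The regularizer `lam` and confidence width `alpha` are conventional defaults; this is the classic algorithm, not the paper's multiscale non-stationary method.

```python
# Standard LinUCB: ridge-regression estimate plus an upper-confidence bonus.
import numpy as np

class LinUCB:
    def __init__(self, d, alpha=1.0, lam=1.0):
        self.A = lam * np.eye(d)                   # regularized Gram matrix
        self.b = np.zeros(d)
        self.alpha = alpha
    def select(self, contexts):
        """contexts: (n_arms, d). Return the arm with the highest UCB."""
        Ainv = np.linalg.inv(self.A)
        theta = Ainv @ self.b
        ucb = contexts @ theta + self.alpha * np.sqrt(
            np.einsum("ad,dk,ak->a", contexts, Ainv, contexts))
        return int(np.argmax(ucb))
    def update(self, x, r):
        self.A += np.outer(x, x)
        self.b += r * x
```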
