Search Results for author: Shihong Song

Found 3 papers, 0 papers with code

Regularized-OFU: an efficient algorithm for general contextual bandit with optimization oracles

no code implementations • 29 Sep 2021 • Yichi Zhou, Shihong Song, Huishuai Zhang, Jun Zhu, Wei Chen, Tie-Yan Liu

In contextual bandits, one major challenge is to develop theoretically solid and empirically efficient algorithms for general function classes.

Multi-Armed Bandits, Thompson Sampling

Regularized OFU: an Efficient UCB Estimator for Non-linear Contextual Bandit

no code implementations • 29 Jun 2021 • Yichi Zhou, Shihong Song, Huishuai Zhang, Jun Zhu, Wei Chen, Tie-Yan Liu

However, it is in general unknown how to derive efficient and effective EE trade-off methods for non-linear complex tasks, such as contextual bandit with deep neural network as the reward function.

Multi-Armed Bandits

Understanding Human Behaviors in Crowds by Imitating the Decision-Making Process

no code implementations • 25 Jan 2018 • Haosheng Zou, Hang Su, Shihong Song, Jun Zhu

Crowd behavior understanding is crucial yet challenging across a wide range of applications, since crowd behavior is inherently determined by a sequential decision-making process based on various factors, such as the pedestrians' own destinations, interaction with nearby pedestrians and anticipation of upcoming events.

Collision Avoidance, Imitation Learning
