no code implementations • 12 Feb 2024 • Mengxiao Zhang, Haipeng Luo

Contextual multinomial logit (MNL) bandits capture many real-world assortment recommendation problems such as online retailing/advertising.

no code implementations • 12 Feb 2024 • Mengxiao Zhang, Yuheng Zhang, Haipeng Luo, Paul Mineiro

Bandits with feedback graphs are powerful online learning models that interpolate between the full information and classic bandit problems, capturing many real-life applications.

no code implementations • 8 Oct 2023 • Mengxiao Zhang, Haipeng Luo

We study online learning in contextual pay-per-click auctions where at each of the $T$ rounds, the learner receives some context along with a set of ads and needs to make an estimate on their click-through rate (CTR) in order to run a second-price pay-per-click auction.

no code implementations • 7 Mar 2023 • Mengxiao Zhang, Fernando Beltran, Jiamou Liu

Data pricing, as a key function of a data marketplace, demands quantifying the monetary value of data.

no code implementations • 30 Jan 2023 • Brendan Lucier, Sarath Pattathil, Aleksandrs Slivkins, Mengxiao Zhang

We study a game between autobidding algorithms that compete in an online advertising platform.

no code implementations • 23 Oct 2022 • Mengxiao Zhang, Shi Chen, Haipeng Luo, Yingfei Wang

Supply chain management (SCM) has been recognized as an important discipline with applications to many industries, where the two-echelon stochastic inventory model, involving one downstream retailer and one upstream supplier, plays a fundamental role for developing firms' SCM strategies.

no code implementations • 4 Oct 2022 • Haipeng Luo, Hanghang Tong, Mengxiao Zhang, Yuheng Zhang

For general strongly observable graphs, we develop an algorithm that achieves the optimal regret $\widetilde{\mathcal{O}}((\sum_{t=1}^T\alpha_t)^{1/2}+\max_{t\in[T]}\alpha_t)$ with high probability, where $\alpha_t$ is the independence number of the feedback graph at round $t$.

1 code implementation • 26 Jul 2022 • Chaofei Hong, Mengwen Yuan, Mengxiao Zhang, Xiao Wang, Chegnjun Zhang, Jiaxin Wang, Gang Pan, Zhaohui Wu, Huajin Tang

In this work, we present a Python based spiking neural network (SNN) simulation and training framework, aka SPAIC that aims to support brain-inspired model and algorithm researches integrated with features from both deep learning and neuroscience.

no code implementations • 12 Feb 2022 • Haipeng Luo, Mengxiao Zhang, Peng Zhao

We consider the problem of adversarial bandit convex optimization, that is, online learning over a sequence of arbitrary convex loss functions with only one function evaluation for each of them.

no code implementations • 12 Feb 2022 • Haipeng Luo, Mengxiao Zhang, Peng Zhao, Zhi-Hua Zhou

The CORRAL algorithm of Agarwal et al. (2017) and its variants (Foster et al., 2020a) achieve this goal with a regret overhead of order $\widetilde{O}(\sqrt{MT})$ where $M$ is the number of base algorithms and $T$ is the time horizon.

no code implementations • 30 Jan 2022 • Mengxiao Zhang, Peng Zhao, Haipeng Luo, Zhi-Hua Zhou

Learning from repeated play in a fixed two-player zero-sum game is a classic problem in game theory and online learning.

no code implementations • 11 Feb 2021 • Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang, Xiaojin Zhang

In this work, we develop linear bandit algorithms that automatically adapt to different environments.

no code implementations • 8 Feb 2021 • Chen-Yu Wei, Chung-Wei Lee, Mengxiao Zhang, Haipeng Luo

We study infinite-horizon discounted two-player zero-sum Markov games, and develop a decentralized algorithm that provably converges to the set of Nash equilibria under self-play.

1 code implementation • ICLR 2019 • Tiange Luo, Tianle Cai, Mengxiao Zhang, Siyu Chen, Li-Wei Wang

We next investigate the adversarial examples which 'fool' a CNN with Random Mask.

1 code implementation • ICLR 2021 • Chen-Yu Wei, Chung-Wei Lee, Mengxiao Zhang, Haipeng Luo

Specifically, for OMWU in bilinear games over the simplex, we show that when the equilibrium is unique, linear last-iterate convergence is achieved with a learning rate whose value is set to a universal constant, improving the result of (Daskalakis & Panageas, 2019b) under the same assumption.

no code implementations • NeurIPS 2020 • Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang

We develop a new approach to obtaining high probability regret bounds for online learning with bandit feedback against an adaptive adversary.

no code implementations • 2 Feb 2020 • Chung-Wei Lee, Haipeng Luo, Mengxiao Zhang

We study small-loss bounds for adversarial multi-armed bandits with graph feedback, that is, adaptive regret bounds that depend on the loss of the best arm or related quantities, instead of the total number of rounds.

1 code implementation • 19 Nov 2019 • Tiange Luo, Tianle Cai, Mengxiao Zhang, Siyu Chen, Di He, Li-Wei Wang

Robustness of convolutional neural networks (CNNs) has gained in importance on account of adversarial examples, i. e., inputs added as well-designed perturbations that are imperceptible to humans but can cause the model to predict incorrectly.

no code implementations • 5 Nov 2017 • Mengxiao Zhang, Wangquan Wu, Yanren Zhang, Kun He, Tao Yu, Huan Long, John E. Hopcroft

Our results show that the dimensions of different categories are close to each other and decline quickly along the convolutional layers and fully connected layers.

no code implementations • 2 Apr 2017 • Kun He, Jingbo Wang, Haochuan Li, Yao Shu, Mengxiao Zhang, Man Zhu, Li-Wei Wang, John E. Hopcroft

Toward a deeper understanding on the inner work of deep neural networks, we investigate CNN (convolutional neural network) using DCN (deconvolutional network) and randomization technique, and gain new insights for the intrinsic property of this network architecture.

Cannot find the paper you are looking for? You can
Submit a new open access paper.

Contact us on:
hello@paperswithcode.com
.
Papers With Code is a free resource with all data licensed under CC-BY-SA.