Search Results for author: Mengxiao Zhang

Found 20 papers, 4 papers with code

Efficient Contextual Bandits with Uninformed Feedback Graphs

no code implementations • 12 Feb 2024 • Mengxiao Zhang, Yuheng Zhang, Haipeng Luo, Paul Mineiro

Bandits with feedback graphs are powerful online learning models that interpolate between the full information and classic bandit problems, capturing many real-life applications.

Multi-Armed Bandits • regression

Contextual Multinomial Logit Bandits with General Value Functions

no code implementations • 12 Feb 2024 • Mengxiao Zhang, Haipeng Luo

Contextual multinomial logit (MNL) bandits capture many real-world assortment recommendation problems such as online retailing/advertising.

Computational Efficiency • Multi-Armed Bandits

Online Learning in Contextual Second-Price Pay-Per-Click Auctions

no code implementations • 8 Oct 2023 • Mengxiao Zhang, Haipeng Luo

We study online learning in contextual pay-per-click auctions where at each of the $T$ rounds, the learner receives some context along with a set of ads and needs to make an estimate on their click-through rate (CTR) in order to run a second-price pay-per-click auction.
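
To make the auction step above concrete, here is a minimal sketch of one common second-price pay-per-click rule: ads are ranked by bid times estimated CTR, and the winner pays, per click, the smallest bid that would still have won. The function name `second_price_ppc` and this particular ranking and payment rule are assumptions for illustration; the paper's exact mechanism and its learning algorithm are not reproduced here.

```python
from typing import Sequence, Tuple


def second_price_ppc(bids: Sequence[float], est_ctrs: Sequence[float]) -> Tuple[int, float]:
    """Run one round of a second-price pay-per-click auction (illustrative rule).

    Returns the index of the winning ad and the price it is charged per click.
    """
    if len(bids) != len(est_ctrs) or len(bids) < 2:
        raise ValueError("need at least two ads with matching bids and CTR estimates")

    # Score each ad by its expected value per impression: bid * estimated CTR.
    scores = [b * c for b, c in zip(bids, est_ctrs)]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    winner, runner_up = order[0], order[1]

    if est_ctrs[winner] <= 0:
        raise ValueError("winner must have a positive estimated CTR")

    # Charge the lowest per-click bid that would still beat the runner-up's score.
    price_per_click = scores[runner_up] / est_ctrs[winner]
    return winner, price_per_click


if __name__ == "__main__":
    # Scores are 1.0, 0.75 and 0.1, so ad 0 wins and pays 0.75 / 0.5 per click.
    print(second_price_ppc([2.0, 3.0, 1.0], [0.5, 0.25, 0.1]))  # (0, 1.5)
```

Under this rule the winner never pays more than its own bid, and the payment depends on the CTR estimates, which is why estimation error matters for the auction outcome.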

A Survey of Data Pricing for Data Marketplaces

no code implementations • 7 Mar 2023 • Mengxiao Zhang, Fernando Beltran, Jiamou Liu

Data pricing, as a key function of a data marketplace, demands quantifying the monetary value of data.

Autobidders with Budget and ROI Constraints: Efficiency, Regret, and Pacing Dynamics

no code implementations • 30 Jan 2023 • Brendan Lucier, Sarath Pattathil, Aleksandrs Slivkins, Mengxiao Zhang

We study a game between autobidding algorithms that compete in an online advertising platform.

No-Regret Learning in Two-Echelon Supply Chain with Unknown Demand Distribution

no code implementations • 23 Oct 2022 • Mengxiao Zhang, Shi Chen, Haipeng Luo, Yingfei Wang

Supply chain management (SCM) has been recognized as an important discipline with applications to many industries, where the two-echelon stochastic inventory model, involving one downstream retailer and one upstream supplier, plays a fundamental role in developing firms' SCM strategies.

Management

Improved High-Probability Regret for Adversarial Bandits with Time-Varying Feedback Graphs

no code implementations • 4 Oct 2022 • Haipeng Luo, Hanghang Tong, Mengxiao Zhang, Yuheng Zhang

For general strongly observable graphs, we develop an algorithm that achieves the optimal regret $\widetilde{\mathcal{O}}((\sum_{t=1}^T\alpha_t)^{1/2}+\max_{t\in[T]}\alpha_t)$ with high probability, where $\alpha_t$ is the independence number of the feedback graph at round $t$.
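
As a rough illustration of the quantities in this bound, the sketch below computes the independence number $\alpha_t$ of small feedback graphs by brute force and evaluates $(\sum_t \alpha_t)^{1/2} + \max_t \alpha_t$, ignoring constants and logarithmic factors. The helper names `independence_number` and `regret_scale` are hypothetical, and treating edges as undirected with self-loops ignored is an assumption about the definition, not something taken from the paper.

```python
import math
from itertools import combinations
from typing import Iterable, Sequence, Tuple


def independence_number(num_nodes: int, edges: Iterable[Tuple[int, int]]) -> int:
    """Size of the largest node set with no edge between any two distinct members.

    Brute force over subsets, so only suitable for very small graphs.
    """
    # Ignore self-loops and edge direction (assumed convention).
    adjacent = {frozenset((u, v)) for u, v in edges if u != v}
    for size in range(num_nodes, 0, -1):
        for subset in combinations(range(num_nodes), size):
            if all(frozenset(pair) not in adjacent for pair in combinations(subset, 2)):
                return size
    return 0


def regret_scale(alphas: Sequence[int]) -> float:
    """Evaluate sqrt(sum_t alpha_t) + max_t alpha_t, without constants or log factors."""
    return math.sqrt(sum(alphas)) + max(alphas)


if __name__ == "__main__":
    k = 4
    bandit = [(i, i) for i in range(k)]                        # only self-loops
    full_info = [(i, j) for i in range(k) for j in range(k)]   # every arm reveals all arms
    graphs = [bandit, bandit, full_info]
    alphas = [independence_number(k, g) for g in graphs]
    print(alphas, regret_scale(alphas))  # [4, 4, 1] 7.0
```

The two extreme graphs show the interpolation: with only self-loops the independence number equals the number of arms, while a complete graph has independence number 1.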

Multi-Armed Bandits

SPAIC: A Spike-based Artificial Intelligence Computing Framework

1 code implementation • 26 Jul 2022 • Chaofei Hong, Mengwen Yuan, Mengxiao Zhang, Xiao Wang, Chengjun Zhang, Jiaxin Wang, Gang Pan, Zhaohui Wu, Huajin Tang

In this work, we present SPAIC, a Python-based spiking neural network (SNN) simulation and training framework that aims to support brain-inspired model and algorithm research, integrating features from both deep learning and neuroscience.

Corralling a Larger Band of Bandits: A Case Study on Switching Regret for Linear Bandits

no code implementations • 12 Feb 2022 • Haipeng Luo, Mengxiao Zhang, Peng Zhao, Zhi-Hua Zhou

The CORRAL algorithm of Agarwal et al. (2017) and its variants (Foster et al., 2020a) achieve this goal with a regret overhead of order $\widetilde{O}(\sqrt{MT})$ where $M$ is the number of base algorithms and $T$ is the time horizon.

Adaptive Bandit Convex Optimization with Heterogeneous Curvature

no code implementations • 12 Feb 2022 • Haipeng Luo, Mengxiao Zhang, Peng Zhao

We consider the problem of adversarial bandit convex optimization, that is, online learning over a sequence of arbitrary convex loss functions with only one function evaluation for each of them.

No-Regret Learning in Time-Varying Zero-Sum Games

no code implementations • 30 Jan 2022 • Mengxiao Zhang, Peng Zhao, Haipeng Luo, Zhi-Hua Zhou

Learning from repeated play in a fixed two-player zero-sum game is a classic problem in game theory and online learning.

Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously

no code implementations • 11 Feb 2021 • Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang, Xiaojin Zhang

In this work, we develop linear bandit algorithms that automatically adapt to different environments.

Last-iterate Convergence of Decentralized Optimistic Gradient Descent/Ascent in Infinite-horizon Competitive Markov Games

no code implementations • 8 Feb 2021 • Chen-Yu Wei, Chung-Wei Lee, Mengxiao Zhang, Haipeng Luo

We study infinite-horizon discounted two-player zero-sum Markov games, and develop a decentralized algorithm that provably converges to the set of Nash equilibria under self-play.

RANDOM MASK: Towards Robust Convolutional Neural Networks

1 code implementation • ICLR 2019 • Tiange Luo, Tianle Cai, Mengxiao Zhang, Siyu Chen, Li-Wei Wang

We next investigate the adversarial examples which 'fool' a CNN with Random Mask.

Linear Last-iterate Convergence in Constrained Saddle-point Optimization

1 code implementation • ICLR 2021 • Chen-Yu Wei, Chung-Wei Lee, Mengxiao Zhang, Haipeng Luo

Specifically, for OMWU in bilinear games over the simplex, we show that when the equilibrium is unique, linear last-iterate convergence is achieved with a learning rate whose value is set to a universal constant, improving the result of (Daskalakis & Panageas, 2019b) under the same assumption.
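
For readers unfamiliar with OMWU (optimistic multiplicative weights update), the following is a minimal sketch on a bilinear game $\min_x \max_y x^\top A y$ over probability simplices. It uses the common single-sequence form of the update, in which each weight is multiplied by $\exp(\mp\eta(2g_t - g_{t-1}))$ and renormalized. The function name `omwu`, the step size 0.1, the starting points, and the matching-pennies matrix are all illustrative assumptions, not the paper's experimental setup.

```python
import math
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def _normalize(weights: List[float]) -> List[float]:
    total = sum(weights)
    return [w / total for w in weights]


def omwu(A: Matrix, x0: Sequence[float], y0: Sequence[float],
         eta: float = 0.1, steps: int = 5000) -> Tuple[List[float], List[float]]:
    """Run OMWU with simultaneous updates; x minimizes and y maximizes x^T A y."""
    m, n = len(A), len(A[0])
    x, y = list(x0), list(y0)

    def loss_x(y_cur: List[float]) -> List[float]:
        return [sum(A[i][j] * y_cur[j] for j in range(n)) for i in range(m)]

    def gain_y(x_cur: List[float]) -> List[float]:
        return [sum(A[i][j] * x_cur[i] for i in range(m)) for j in range(n)]

    # Use the initial gradients as the first "previous" gradients,
    # so the first step reduces to a plain multiplicative weights step.
    gx_prev, gy_prev = loss_x(y), gain_y(x)
    for _ in range(steps):
        gx, gy = loss_x(y), gain_y(x)
        # Optimistic step: current gradient plus a correction by the last change.
        x = _normalize([x[i] * math.exp(-eta * (2 * gx[i] - gx_prev[i])) for i in range(m)])
        y = _normalize([y[j] * math.exp(eta * (2 * gy[j] - gy_prev[j])) for j in range(n)])
        gx_prev, gy_prev = gx, gy
    return x, y


if __name__ == "__main__":
    # Matching pennies: the unique equilibrium is uniform play for both players.
    A = [[1.0, -1.0], [-1.0, 1.0]]
    x, y = omwu(A, [0.7, 0.3], [0.4, 0.6])
    print(x, y)
```

In this example the last iterate itself, rather than the time average, approaches the uniform equilibrium, which is the behavior that "last-iterate convergence" refers to.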

Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs

no code implementations • NeurIPS 2020 • Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang

We develop a new approach to obtaining high probability regret bounds for online learning with bandit feedback against an adaptive adversary.

A Closer Look at Small-loss Bounds for Bandits with Graph Feedback

no code implementations • 2 Feb 2020 • Chung-Wei Lee, Haipeng Luo, Mengxiao Zhang

We study small-loss bounds for adversarial multi-armed bandits with graph feedback, that is, adaptive regret bounds that depend on the loss of the best arm or related quantities, instead of the total number of rounds.

Multi-Armed Bandits

Defective Convolutional Networks

1 code implementation • 19 Nov 2019 • Tiange Luo, Tianle Cai, Mengxiao Zhang, Siyu Chen, Di He, Li-Wei Wang

Robustness of convolutional neural networks (CNNs) has gained in importance on account of adversarial examples, i.e., inputs with well-designed added perturbations that are imperceptible to humans but can cause the model to predict incorrectly.

The Local Dimension of Deep Manifold

no code implementations • 5 Nov 2017 • Mengxiao Zhang, Wangquan Wu, Yanren Zhang, Kun He, Tao Yu, Huan Long, John E. Hopcroft

Our results show that the dimensions of different categories are close to each other and decline quickly along the convolutional layers and fully connected layers.

Randomness in Deconvolutional Networks for Visual Representation

no code implementations • 2 Apr 2017 • Kun He, Jingbo Wang, Haochuan Li, Yao Shu, Mengxiao Zhang, Man Zhu, Li-Wei Wang, John E. Hopcroft

Toward a deeper understanding of the inner workings of deep neural networks, we investigate CNNs (convolutional neural networks) using DCNs (deconvolutional networks) and a randomization technique, and gain new insights into the intrinsic properties of this network architecture.

General Classification • Image Reconstruction
