Search Results for author: Kwang-Sung Jun

Found 32 papers, 7 papers with code

Efficient Low-Rank Matrix Estimation, Experimental Design, and Arm-Set-Dependent Low-Rank Bandits

no code implementations • 17 Feb 2024 Kyoungseok Jang, Chicheng Zhang, Kwang-Sung Jun

Assuming access to the distribution of the covariates, we propose a novel low-rank matrix estimation method called LowPopArt and provide its recovery guarantee that depends on a novel quantity denoted by B(Q) that characterizes the hardness of the problem, where Q is the covariance matrix of the measurement distribution.

Computational Efficiency Efficient Exploration +2

Better-than-KL PAC-Bayes Bounds

no code implementations • 14 Feb 2024 Ilja Kuzborskij, Kwang-Sung Jun, Yulian Wu, Kyoungseok Jang, Francesco Orabona

In this paper, we consider the problem of proving concentration inequalities to estimate the mean of the sequence.

Inductive Bias

Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization

no code implementations • 12 Feb 2024 Kwang-Sung Jun, Jungtaek Kim

First, we propose a novel confidence set that is 'semi-adaptive' to the unknown sub-Gaussian parameter $\sigma_*^2$ in the sense that the (normalized) confidence width scales with $\sqrt{d\sigma_*^2 + \sigma_0^2}$, where $d$ is the dimension and $\sigma_0^2$ is the specified (known) sub-Gaussian parameter that can be much larger than $\sigma_*^2$.
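The benefit of the semi-adaptive width is easy to see numerically. The sketch below (function names are illustrative, not from the paper) compares the scaling $\sqrt{d\sigma_*^2 + \sigma_0^2}$ against the non-adaptive scaling $\sqrt{d\sigma_0^2}$ when the specified parameter $\sigma_0^2$ is much larger than the true $\sigma_*^2$:

```python
import math

def semi_adaptive_width(d, sigma_star_sq, sigma0_sq):
    """Scaling of the semi-adaptive confidence width: sqrt(d*sigma_*^2 + sigma_0^2)."""
    return math.sqrt(d * sigma_star_sq + sigma0_sq)

def standard_width(d, sigma0_sq):
    """Non-adaptive width scaling, sqrt(d*sigma_0^2), for comparison."""
    return math.sqrt(d * sigma0_sq)

# Illustrative numbers: sigma_0^2 is 100x larger than the true sigma_*^2.
d, sigma_star_sq, sigma0_sq = 20, 0.01, 1.0
print(semi_adaptive_width(d, sigma_star_sq, sigma0_sq))  # ~1.10
print(standard_width(d, sigma0_sq))                      # ~4.47
```

The semi-adaptive width pays for the misspecified $\sigma_0^2$ only additively, not multiplied by $d$.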

Bayesian Optimization Decision Making +1

Graph Sparsifications using Neural Network Assisted Monte Carlo Tree Search

1 code implementation • 17 Nov 2023 Alvin Chiu, Mithun Ghosh, Reyan Ahmed, Kwang-Sung Jun, Stephen Kobourov, Michael T. Goodrich

Graph neural networks have been successful for machine learning, as well as for combinatorial and graph problems such as the Subgraph Isomorphism Problem and the Traveling Salesman Problem.

Traveling Salesman Problem

Improved Regret Bounds of (Multinomial) Logistic Bandits via Regret-to-Confidence-Set Conversion

2 code implementations • 28 Oct 2023 Junghyun Lee, Se-Young Yun, Kwang-Sung Jun

Logistic bandit is a ubiquitous framework for modeling users' choices, e.g., click vs. no click in advertisement recommender systems.

Recommendation Systems

Nearly Optimal Steiner Trees using Graph Neural Network Assisted Monte Carlo Tree Search

1 code implementation • 30 Apr 2023 Reyan Ahmed, Mithun Ghosh, Kwang-Sung Jun, Stephen Kobourov

Graph neural networks are useful for learning problems, as well as for combinatorial and graph problems such as the Subgraph Isomorphism Problem and the Traveling Salesman Problem.

Traveling Salesman Problem

Kullback-Leibler Maillard Sampling for Multi-armed Bandits with Bounded Rewards

1 code implementation NeurIPS 2023 Hao Qin, Kwang-Sung Jun, Chicheng Zhang

Maillard sampling (Maillard, 2013), an attractive alternative to Thompson sampling, has recently been shown to achieve competitive regret guarantees in the sub-Gaussian reward setting (Bian and Jun, 2022) while maintaining closed-form action probabilities, which is useful for offline policy evaluation.

Multi-Armed Bandits Thompson Sampling

Tighter PAC-Bayes Bounds Through Coin-Betting

no code implementations • 12 Feb 2023 Kyoungseok Jang, Kwang-Sung Jun, Ilja Kuzborskij, Francesco Orabona

We consider the problem of estimating the mean of a sequence of random elements $f(X_1, \theta), \ldots, f(X_n, \theta)$ where $f$ is a fixed scalar function, $S=(X_1, \ldots, X_n)$ are independent random variables, and $\theta$ is a possibly $S$-dependent parameter.

Revisiting Simple Regret: Fast Rates for Returning a Good Arm

no code implementations • 30 Oct 2022 Yao Zhao, Connor James Stephens, Csaba Szepesvári, Kwang-Sung Jun

Simple regret is a natural and parameter-free performance criterion for pure exploration in multi-armed bandits, yet it is less popular than the probability of missing the best arm or an $\epsilon$-good arm, perhaps due to a lack of easy ways to characterize it.

Multi-Armed Bandits

PopArt: Efficient Sparse Regression and Experimental Design for Optimal Sparse Linear Bandits

1 code implementation • 25 Oct 2022 Kyoungseok Jang, Chicheng Zhang, Kwang-Sung Jun

In this paper, we propose a simple and computationally efficient sparse linear estimation method called PopArt that enjoys a tighter $\ell_1$ recovery guarantee compared to Lasso (Tibshirani, 1996) in many problems.

Decision Making Experimental Design +1

Norm-Agnostic Linear Bandits

no code implementations • 3 May 2022 Spencer Gales, Sunder Sethuraman, Kwang-Sung Jun

For the latter, we do not pay any price in the regret for not knowing $S$.

Recommendation Systems

An Experimental Design Approach for Regret Minimization in Logistic Bandits

no code implementations • 4 Feb 2022 Blake Mason, Kwang-Sung Jun, Lalit Jain

Finally, we discuss the impact of the bias of the MLE on the logistic bandit problem, providing an example where the $d^2$ lower-order regret term (cf. $d$ for linear bandits) may not be improved as long as the MLE is used, and showing how bias-corrected estimators may be used to bring it closer to $d$.

Experimental Design

Jointly Efficient and Optimal Algorithms for Logistic Bandits

2 code implementations • 6 Jan 2022 Louis Faury, Marc Abeille, Kwang-Sung Jun, Clément Calauzènes

Logistic Bandits have recently undergone careful scrutiny by virtue of their combined theoretical and practical relevance.

Computational Efficiency

Improved Regret Analysis for Variance-Adaptive Linear Bandits and Horizon-Free Linear Mixture MDPs

no code implementations • 5 Nov 2021 Yeoneung Kim, Insoon Yang, Kwang-Sung Jun

For linear bandits, we achieve $\tilde O(\min\{d\sqrt{K}, d^{1.5}\sqrt{\sum_{k=1}^K \sigma_k^2}\} + d^2)$ where $d$ is the dimension of the features, $K$ is the time horizon, and $\sigma_k^2$ is the noise variance at time step $k$, and $\tilde O$ ignores polylogarithmic dependence, which is a factor of $d^3$ improvement.

LEMMA

Maillard Sampling: Boltzmann Exploration Done Optimally

no code implementations • 5 Nov 2021 Jie Bian, Kwang-Sung Jun

This less-known algorithm, which we call Maillard sampling (MS), computes the probability of choosing each arm in a closed form, which is not true for Thompson sampling, a widely-adopted bandit algorithm in the industry.
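The closed-form probabilities make MS simple to sketch. The following is a minimal illustration (not the paper's full algorithm), assuming 1-sub-Gaussian rewards and the form $p_a \propto \exp(-N_a \hat\Delta_a^2 / 2)$, where $N_a$ is arm $a$'s pull count and $\hat\Delta_a$ its empirical gap to the best arm:

```python
import numpy as np

def maillard_probs(counts, means):
    """Closed-form arm probabilities in the style of Maillard sampling:
    p_a proportional to exp(-N_a * gap_a^2 / 2), where gap_a is the empirical
    gap to the best arm. A sketch assuming 1-sub-Gaussian rewards."""
    counts = np.asarray(counts, dtype=float)
    means = np.asarray(means, dtype=float)
    gaps = means.max() - means                 # best arm has gap 0
    logits = -counts * gaps**2 / 2.0
    w = np.exp(logits - logits.max())          # numerically stable normalization
    return w / w.sum()

# Example: the empirically best arm gets most of the probability mass,
# and worse/better-explored arms decay exponentially.
p = maillard_probs(counts=[50, 50, 50], means=[0.9, 0.6, 0.5])
print(p)
```

Because the probabilities are explicit (unlike Thompson sampling's posterior-sampling randomization), the propensity scores needed for offline policy evaluation come for free.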

counterfactual Thompson Sampling

Tight Concentrations and Confidence Sequences from the Regret of Universal Portfolio

1 code implementation • 27 Oct 2021 Francesco Orabona, Kwang-Sung Jun

A classic problem in statistics is the estimation of the expectation of random variables from samples.

Transfer Learning in Bandits with Latent Continuity

no code implementations • 4 Feb 2021 Hyejin Park, Seiyun Shin, Kwang-Sung Jun, Jungseul Ok

To cope with the latent structural parameter, we consider a transfer learning setting in which an agent must learn to transfer the structural information from prior tasks to the next task, inspired by practical problems such as rate adaptation in wireless links.

Multi-Armed Bandits Transfer Learning

DISE: Dynamic Integrator Selection to Minimize Forward Pass Time in Neural ODEs

no code implementations • 1 Jan 2021 Soyoung Kang, Ganghyeon Park, Kwang-Sung Jun, Noseong Park

Because not every input requires the advanced integrator, we design an auxiliary neural network that chooses an appropriate integrator for each input, decreasing the overall inference time without significantly sacrificing accuracy.

Improved Confidence Bounds for the Linear Logistic Model and Applications to Linear Bandits

no code implementations • 23 Nov 2020 Kwang-Sung Jun, Lalit Jain, Blake Mason, Houssam Nassif

Specifically, our confidence bound avoids a direct dependence on $1/\kappa$, where $\kappa$ is the minimal variance over all arms' reward distributions.

Crush Optimism with Pessimism: Structured Bandits Beyond Asymptotic Optimality

no code implementations NeurIPS 2020 Kwang-Sung Jun, Chicheng Zhang

In this paper, we focus on the finite hypothesis case and ask if one can achieve the asymptotic optimality while enjoying bounded regret whenever possible.

Parameter-Free Locally Differentially Private Stochastic Subgradient Descent

no code implementations • 21 Nov 2019 Kwang-Sung Jun, Francesco Orabona

We consider the problem of minimizing a convex risk with stochastic subgradients guaranteeing $\epsilon$-locally differentially private ($\epsilon$-LDP).

Stochastic Optimization

Bilinear Bandits with Low-rank Structure

no code implementations • 8 Jan 2019 Kwang-Sung Jun, Rebecca Willett, Stephen Wright, Robert Nowak

We introduce the bilinear bandit problem with low-rank structure in which an action takes the form of a pair of arms from two different entity types, and the reward is a bilinear function of the known feature vectors of the arms.

Adversarial Attacks on Stochastic Bandits

no code implementations NeurIPS 2018 Kwang-Sung Jun, Lihong Li, Yuzhe Ma, Xiaojin Zhu

We study adversarial attacks that manipulate the reward signals to control the actions chosen by a stochastic multi-armed bandit algorithm.

Data Poisoning Attacks in Contextual Bandits

no code implementations • 17 Aug 2018 Yuzhe Ma, Kwang-Sung Jun, Lihong Li, Xiaojin Zhu

We provide a general attack framework based on convex optimization and show that by slightly manipulating rewards in the data, an attacker can force the bandit algorithm to pull a target arm for a target contextual vector.
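The paper's attack is a general convex program; the toy sketch below (a hypothetical simplification, not the authors' formulation) illustrates only the core idea for a greedy, empirical-mean-based learner: shift the non-target arms' rewards just enough that the target arm looks best by a chosen margin.

```python
import numpy as np

def poison_rewards(rewards_by_arm, target, margin=0.05):
    """Toy reward-poisoning sketch: lower the empirical means of the
    non-target arms just enough that `target` appears best by `margin`.
    A uniform downward shift achieves the desired mean exactly."""
    poisoned = {a: np.asarray(r, dtype=float).copy()
                for a, r in rewards_by_arm.items()}
    target_mean = poisoned[target].mean()
    for a, r in poisoned.items():
        if a == target:
            continue
        excess = r.mean() - (target_mean - margin)
        if excess > 0:
            r -= excess  # shift every reward of this arm down uniformly
    return poisoned

data = {0: [0.2, 0.3], 1: [0.8, 0.9]}   # arm 1 is truly better
out = poison_rewards(data, target=0, margin=0.1)
```

A mean-based learner fed `out` now prefers arm 0; the real attack instead minimizes the total perturbation subject to such constraints via convex optimization.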

Data Poisoning Multi-Armed Bandits +2

Online Learning for Changing Environments using Coin Betting

no code implementations • 6 Nov 2017 Kwang-Sung Jun, Francesco Orabona, Stephen Wright, Rebecca Willett

A key challenge in online learning is that classical algorithms can be slow to adapt to changing environments.

Metric Learning

Scalable Generalized Linear Bandits: Online Computation and Hashing

no code implementations NeurIPS 2017 Kwang-Sung Jun, Aniruddha Bhargava, Robert Nowak, Rebecca Willett

Second, for the case where the number $N$ of arms is very large, we propose new algorithms in which each next arm is selected via an inner product search.
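The inner-product structure can be sketched as follows; `select_arm` and the variable names are illustrative, and a real large-$N$ deployment would replace the exact argmax with approximate maximum inner product search (e.g., hashing), as the paper proposes.

```python
import numpy as np

def select_arm(arm_features, theta_hat):
    """Pick the next arm by an inner product search over arm features.
    Shown exactly here; for very large N, approximate maximum inner
    product search makes this step sublinear in N."""
    scores = arm_features @ theta_hat   # one inner product per arm
    return int(np.argmax(scores))

# Example with three 2-d arms and a current parameter estimate.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.7, 0.7]])
theta = np.array([0.2, 0.9])
best = select_arm(X, theta)
```

Reducing arm selection to an inner product search is what makes hashing-based acceleration applicable.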

Thompson Sampling

Improved Strongly Adaptive Online Learning using Coin Betting

no code implementations • 14 Oct 2016 Kwang-Sung Jun, Francesco Orabona, Rebecca Willett, Stephen Wright

This paper describes a new parameter-free online learning algorithm for changing environments.

Metric Learning

Graph-Based Active Learning: A New Look at Expected Error Minimization

no code implementations • 3 Sep 2016 Kwang-Sung Jun, Robert Nowak

In graph-based active learning, algorithms based on expected error minimization (EEM) have been popular and yield good empirical performance.

Active Learning

Human Memory Search as Initial-Visit Emitting Random Walk

no code implementations NeurIPS 2015 Kwang-Sung Jun, Jerry Zhu, Timothy T. Rogers, Zhuoran Yang, Ming Yuan

In this paper, we propose the first efficient maximum likelihood estimate (MLE) for INVITE by decomposing the censored output into a series of absorbing random walks.
