Search Results for author: Cem Tekin

Found 36 papers, 3 papers with code

Robust Pareto Set Identification with Contaminated Bandit Feedback

no code implementations • 6 Jun 2022 • Kerem Bozgan, Cem Tekin

We consider the Pareto set identification (PSI) problem in multi-objective multi-armed bandits (MO-MAB) with contaminated reward observations.

Multi-Armed Bandits

Federated Multi-Armed Bandits Under Byzantine Attacks

no code implementations • 9 May 2022 • Ilker Demirel, Yigit Yildirim, Cem Tekin

We demonstrate Fed-MoM-UCB's effectiveness against the baselines in the presence of Byzantine attacks via experiments.

Data Poisoning, Federated Learning, +1

Safe Linear Leveling Bandits

no code implementations • 13 Dec 2021 • Ilker Demirel, Mehmet Ufuk Ozdemir, Cem Tekin

In this work, we tackle a different critical task through the lens of linear stochastic bandits, where the aim is to keep the actions' outcomes close to a target level while respecting a two-sided safety constraint, which we call leveling.

Multi-Armed Bandits, Thompson Sampling

Contextual Combinatorial Multi-output GP Bandits with Group Constraints

no code implementations • 29 Nov 2021 • Sepehr Elahi, Baran Atalar, Sevda Öğüt, Cem Tekin

In federated multi-armed bandit problems, maximizing global reward while satisfying minimum privacy requirements to protect clients is the main goal.

Federated Learning, Gaussian Processes

ESCADA: Efficient Safety and Context Aware Dose Allocation for Precision Medicine

1 code implementation • 26 Nov 2021 • Ilker Demirel, Ahmet Alparslan Celik, Cem Tekin

We propose ESCADA, a novel and generic multi-armed bandit (MAB) algorithm tailored for the leveling task, to make safe, personalized, and context-aware dose recommendations.

Thompson Sampling

Vector Optimization with Stochastic Bandit Feedback

no code implementations • 23 Oct 2021 • Çağın Ararat, Cem Tekin

We introduce vector optimization problems with stochastic bandit feedback, in which preferences among designs are encoded by a polyhedral ordering cone $C$.

Contextual Combinatorial Bandits with Changing Action Sets via Gaussian Processes

no code implementations • 5 Oct 2021 • Andi Nika, Sepehr Elahi, Cem Tekin

We consider a contextual bandit problem with a combinatorial action set and time-varying base arm availability.

Gaussian Processes

Conservative Policy Construction Using Variational Autoencoders for Logged Data with Missing Values

no code implementations • 8 Sep 2021 • Mahed Abroshan, Kai Hou Yip, Cem Tekin, Mihaela van der Schaar

Secondly, such datasets are usually imperfect, additionally cursed with missing values in the attributes of features.

Decision Making

Contextual Combinatorial Volatile Multi-armed Bandit with Adaptive Discretization

1 code implementation • 28 Aug 2020 • Andi Nika, Sepehr Elahi, Cem Tekin

We consider contextual combinatorial volatile multi-armed bandit (CCV-MAB), in which at each round, the learner observes a set of available base arms and their contexts, and then, selects a super arm that contains $K$ base arms in order to maximize its cumulative reward.

Pareto Active Learning with Gaussian Processes and Adaptive Discretization

1 code implementation • 24 Jun 2020 • Andi Nika, Kerem Bozgan, Sepehr Elahi, Çağın Ararat, Cem Tekin

We consider the problem of optimizing a vector-valued objective function $\boldsymbol{f}$ sampled from a Gaussian Process (GP) whose index set is a well-behaved, compact metric space $({\cal X}, d)$ of designs.

Active Learning, Gaussian Processes

Lexicographic Multiarmed Bandit

no code implementations • 26 Jul 2019 • Alihan Hüyük, Cem Tekin

The algorithm we propose for the second setting also attains bounded regret for the multiarmed bandit with satisficing objectives.

Thompson Sampling for Combinatorial Network Optimization in Unknown Environments

no code implementations • 7 Jul 2019 • Alihan Hüyük, Cem Tekin

Influence maximization, adaptive routing, and dynamic spectrum allocation all require choosing the right action from a large set of alternatives.

Combinatorial Optimization, Thompson Sampling

Exploiting Relevance for Online Decision-Making in High-Dimensions

no code implementations • 1 Jul 2019 • Eralp Turgay, Cem Bulucu, Cem Tekin

As our learning model, we consider a structured contextual multi-armed bandit (CMAB) with high-dimensional arm (action) and context (data) sets, where the rewards depend only on a few relevant dimensions of the joint context-arm set, possibly in a non-linear way.

Decision Making, Vocal Bursts Intensity Prediction

Group Retention when Using Machine Learning in Sequential Decision Making: the Interplay between User Dynamics and Fairness

no code implementations • NeurIPS 2019 • Xueru Zhang, Mohammad Mahdi Khalili, Cem Tekin, Mingyan Liu

Machine Learning (ML) models trained on data from multiple demographic groups can inherit representation disparity (Hashimoto et al., 2018) that may exist in the data: the model may be less favorable to groups contributing less to the training process; this in turn can degrade population retention in these groups over time, and exacerbate representation disparity in the long run.

Decision Making, Fairness

Analysis of Thompson Sampling for Combinatorial Multi-armed Bandit with Probabilistically Triggered Arms

no code implementations • 7 Sep 2018 • Alihan Hüyük, Cem Tekin

We analyze the regret of combinatorial Thompson sampling (CTS) for the combinatorial multi-armed bandit with probabilistically triggered arms under the semi-bandit feedback setting.

Thompson Sampling

Combinatorial Multi-Objective Multi-Armed Bandit Problem

no code implementations • 11 Mar 2018 • Doruk Öner, Altuğ Karakurt, Atilla Eryilmaz, Cem Tekin

In this paper, we introduce the COmbinatorial Multi-Objective Multi-Armed Bandit (COMO-MAB) problem that captures the challenges of combinatorial and multi-objective online learning simultaneously.

Multi-objective Contextual Bandit Problem with Similarity Information

no code implementations • 11 Mar 2018 • Eralp Turğay, Doruk Öner, Cem Tekin

Essentially, the contextual Pareto regret is the sum of the distances of the arms chosen by the learner to the context dependent Pareto front.
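
The regret notion in this snippet can be illustrated with a minimal sketch. It assumes the "distance" to the Pareto front is the Pareto suboptimality gap (the smallest amount that must be added to every objective of an arm before no other arm dominates it); the snippet does not state this, so treat it as an assumption. All function names below are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dominates(u: Vector, v: Vector) -> bool:
    """True if u is at least as good as v in every objective and strictly better in one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_front(means: Sequence[Vector]) -> List[int]:
    """Indices of arms whose expected reward vector is not dominated by any other arm."""
    return [i for i, u in enumerate(means) if not any(dominates(v, u) for v in means)]


def suboptimality_gap(means: Sequence[Vector], arm: int) -> float:
    """Assumed distance to the front: max over arms of the smallest coordinate-wise lead."""
    leads = [min(b - a for a, b in zip(means[arm], other)) for other in means]
    return max(0.0, max(leads))


def pareto_regret(means_per_round: Sequence[Sequence[Vector]], chosen: Sequence[int]) -> float:
    """Sum of gaps of the chosen arms; means_per_round[t] holds the means for round t's context."""
    return sum(suboptimality_gap(m, a) for m, a in zip(means_per_round, chosen))


# Hypothetical example: three arms, two objectives, same context in both rounds.
means = [(1.0, 0.5), (0.5, 1.0), (0.25, 0.25)]
front = pareto_front(means)
regret = pareto_regret([means, means], chosen=[2, 0])
```

Under this assumed definition, arms on the front contribute zero regret and a dominated arm contributes its gap, so in the example only the first round (arm 2) adds to the sum.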

Multi-objective Contextual Multi-armed Bandit with a Dominant Objective

no code implementations • 18 Aug 2017 • Cem Tekin, Eralp Turgay

In this case, the optimal arm given a context is the one that maximizes the expected reward in the non-dominant objective among all arms that maximize the expected reward in the dominant objective.

Medical Diagnosis, Recommendation Systems
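
The optimality criterion quoted in this entry can be sketched for a single fixed context as follows. This is only an illustration of the stated selection rule, not the paper's learning algorithm; the function name, the `(dominant, non_dominant)` tuple layout, and the tie tolerance `tol` are assumptions.

```python
from typing import Sequence, Tuple


def optimal_arm(expected_rewards: Sequence[Tuple[float, float]], tol: float = 1e-12) -> int:
    """Pick the arm that maximizes the non-dominant objective among the arms
    that maximize the dominant objective.

    expected_rewards[i] = (dominant, non_dominant) expected reward of arm i
    for the current context (hypothetical layout).
    """
    best_dominant = max(dom for dom, _ in expected_rewards)
    # Arms that are optimal (up to tol) in the dominant objective.
    candidates = [i for i, (dom, _) in enumerate(expected_rewards) if best_dominant - dom <= tol]
    # Among those, maximize the non-dominant objective.
    return max(candidates, key=lambda i: expected_rewards[i][1])


# Hypothetical example: arms 0 and 1 tie in the dominant objective,
# so the non-dominant objective decides; arm 2 is never considered.
rewards = [(0.9, 0.2), (0.9, 0.7), (0.5, 1.0)]
choice = optimal_arm(rewards)
```

Arm 2 has the highest non-dominant reward in the example but is excluded because it is not optimal in the dominant objective.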

Combinatorial Multi-armed Bandit with Probabilistically Triggered Arms: A Case with Bounded Regret

no code implementations • 24 Jul 2017 • A. Ömer Sarıtaç, Cem Tekin

Under the assumption that the arm triggering probabilities (ATPs) are positive for all arms, we prove that a class of upper confidence bound (UCB) policies, named Combinatorial UCB with exploration rate $\kappa$ (CUCB-$\kappa$), and Combinatorial Thompson Sampling (CTS), which estimates the expected states of the arms via Thompson sampling, achieve bounded regret.

Movie Recommendation, Thompson Sampling

Context-Aware Hierarchical Online Learning for Performance Maximization in Mobile Crowdsourcing

no code implementations • 10 May 2017 • Sabrina Klos, Cem Tekin, Mihaela van der Schaar, Anja Klein

In our algorithm, a local controller (LC) in the mobile device of a worker regularly observes the worker's context, her/his decisions to accept or decline tasks and the quality in completing tasks.

Gambler's Ruin Bandit Problem

no code implementations • 21 May 2016 • Nima Akbarzadeh, Cem Tekin

In the GRBP, the learner proceeds in a sequence of rounds, where each round is a Markov Decision Process (MDP) with two actions (arms): a continuation action that moves the learner randomly over the state space around the current state; and a terminal action that moves the learner directly into one of the two terminal states (goal and dead-end state).

Adaptive Ensemble Learning with Confidence Bounds

no code implementations • 23 Dec 2015 • Cem Tekin, Jinsung Yoon, Mihaela van der Schaar

Extracting actionable intelligence from distributed, heterogeneous, correlated and high-dimensional data sources requires run-time processing and learning both locally and globally.

Ensemble Learning, Meta-Learning

Episodic Multi-armed Bandits

no code implementations • 4 Aug 2015 • Cem Tekin, Mihaela van der Schaar

After the stop action is taken, the learner collects a terminal reward, and observes the costs and terminal rewards associated with each step of the episode.

Multi-Armed Bandits

Global Bandits

no code implementations • 29 Mar 2015 • Onur Atan, Cem Tekin, Mihaela van der Schaar

In the case in which rewards of all arms are deterministic functions of a single unknown parameter, we construct a greedy policy that achieves bounded regret, with a bound that depends on the single true parameter of the problem.

Decision Making, Informativeness, +1

Contextual Online Learning for Multimedia Content Aggregation

no code implementations • 7 Feb 2015 • Cem Tekin, Mihaela van der Schaar

A key challenge for such systems is to accurately predict what type of content each of its consumers prefers in a certain context, and adapt these predictions to the evolving consumers' preferences, contexts and content characteristics.

RELEAF: An Algorithm for Learning and Exploiting Relevance

no code implementations • 5 Feb 2015 • Cem Tekin, Mihaela van der Schaar

We prove a general regret bound for our algorithm whose time order depends only on the maximum number of relevant dimensions among all the actions, which in the special case where the relevance relation is single-valued (a function), reduces to $\tilde{O}(T^{2(\sqrt{2}-1)})$; in the absence of a relevance relation, the best known contextual bandit algorithms achieve regret $\tilde{O}(T^{(D+1)/(D+2)})$, where $D$ is the full dimension of the context vector.

Decision Making, Medical Diagnosis, +2
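
As a quick arithmetic check on the two exponents quoted in this entry (a worked calculation, not taken from the paper): the exponent $2(\sqrt{2}-1)$ is the rationalized form of $2/(1+\sqrt{2})$, and it is smaller than $(D+1)/(D+2)$ exactly when $D > 1 + 2\sqrt{2}$, that is, for integer $D \ge 4$.

```latex
\[
\frac{2}{1+\sqrt{2}}
  = \frac{2(\sqrt{2}-1)}{(\sqrt{2}+1)(\sqrt{2}-1)}
  = 2(\sqrt{2}-1) \approx 0.828 .
\]
\[
\frac{D+1}{D+2} > 2(\sqrt{2}-1)
\iff D\,(3-2\sqrt{2}) > 4\sqrt{2}-5
\iff D > \frac{4\sqrt{2}-5}{3-2\sqrt{2}} = 1+2\sqrt{2} \approx 3.83 .
\]
```

For example, $D = 4$ gives $(D+1)/(D+2) = 5/6 \approx 0.833$, slightly above $0.828$, while $D = 3$ gives $4/5 = 0.8$, slightly below.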

Discovering, Learning and Exploiting Relevance

no code implementations • NeurIPS 2014 • Cem Tekin, Mihaela van der Schaar

When the relation is a function, i.e., the reward of an action only depends on the context of a single type, and the expected reward of an action is Lipschitz continuous in the context of its relevant type, we propose an algorithm that achieves $\tilde{O}(T^{\gamma})$ regret with a high probability, where $\gamma=2/(1+\sqrt{2})$.

Medical Diagnosis, Recommendation Systems, +1

Jamming Bandits

no code implementations • 13 Nov 2014 • SaiDhiraj Amuru, Cem Tekin, Mihaela van der Schaar, R. Michael Buehrer

We first present novel online learning algorithms to maximize the jamming efficacy against static transmitter-receiver pairs and prove that our learning algorithm converges to the optimal (in terms of the error rate inflicted at the victim and the energy used) jamming strategy.

Global Bandits with Holder Continuity

no code implementations • 29 Oct 2014 • Onur Atan, Cem Tekin, Mihaela van der Schaar

Specifically, we prove that the parameter-free (worst-case) regret is sublinear in time, and decreases with the informativeness of the arms.

Informativeness

Distributed Online Learning in Social Recommender Systems

no code implementations • 26 Sep 2013 • Cem Tekin, Simpson Zhang, Mihaela van der Schaar

In contrast to centralized recommender systems, in which there is a single centralized seller who has access to the complete inventory of items as well as the complete record of sales and user information, in decentralized recommender systems each seller/learner only has access to the inventory of items and user information for its own products and not the products and user information of other sellers, but can get commission if it sells an item of another seller.

Decision Making, Recommendation Systems

Distributed Online Learning via Cooperative Contextual Bandits

no code implementations • 21 Aug 2013 • Cem Tekin, Mihaela van der Schaar

At each moment of time, an instance characterized by a certain context may arrive at each learner; based on the context, the learner can select one of its own actions (which gives a reward and provides information) or request assistance from another learner.

Event Detection, Multi-Armed Bandits, +1

Decentralized Online Big Data Classification - a Bandit Framework

no code implementations • 21 Aug 2013 • Cem Tekin, Mihaela van der Schaar

We model the problem of joint classification by the distributed and heterogeneous learners from multiple data sources as a distributed contextual bandit problem where each data is characterized by a specific context.

Classification, General Classification

Distributed Online Big Data Classification Using Context Information

no code implementations • 2 Jul 2013 • Cem Tekin, Mihaela van der Schaar

Distributed, online data mining systems have emerged as a result of applications requiring analysis of large amounts of correlated and high-dimensional data produced by multiple distributed data sources.

Classification, General Classification

Online Learning in a Contract Selection Problem

no code implementations • 15 May 2013 • Cem Tekin, Mingyan Liu

In an online contract selection problem there is a seller which offers a set of contracts to sequentially arriving buyers whose types are drawn from an unknown distribution.

Recommendation Systems

Optimal Adaptive Learning in Uncontrolled Restless Bandit Problems

no code implementations • 20 Jul 2011 • Cem Tekin, Mingyan Liu

In an uncontrolled restless bandit problem, there is a finite set of arms, each of which when pulled yields a positive reward.

Online Algorithms for the Multi-Armed Bandit Problem with Markovian Rewards

no code implementations • 14 Jul 2010 • Cem Tekin, Mingyan Liu

The player receives a state-dependent reward each time it plays an arm.
