no code implementations • 31 May 2024 • Yilin Zheng, Atilla Eryilmaz
With the development of edge networks and mobile computing, the need to serve heterogeneous data sources at the network edge requires the design of new distributed machine learning mechanisms.
no code implementations • 28 May 2024 • Semih Cayci, Atilla Eryilmaz
In this paper, we study a natural policy gradient method based on recurrent neural networks (RNNs) for partially-observable Markov decision processes, whereby RNNs are used for policy parameterization and policy evaluation to address the curse of dimensionality in non-Markovian reinforcement learning.
no code implementations • 19 Feb 2024 • Semih Cayci, Atilla Eryilmaz
We analyze recurrent neural networks trained with gradient descent in the supervised learning setting for dynamical systems, and prove that gradient descent can achieve optimality without massive overparameterization.
no code implementations • 9 Jun 2021 • Semih Cayci, Yilin Zheng, Atilla Eryilmaz
In a wide variety of applications including online advertising, contractual hiring, and wireless scheduling, the controller faces a stringent budget constraint on the available resources, which each action consumes in a random amount, and a stochastic feasibility constraint that may impose important operational limitations on decision-making.
no code implementations • 30 Jun 2020 • Semih Cayci, Atilla Eryilmaz, R. Srikant
Time-constrained decision processes have been ubiquitous in many fundamental applications in physics, biology and computer science.
no code implementations • NeurIPS 2020 • Semih Cayci, Swati Gupta, Atilla Eryilmaz
Furthermore, as a consequence of certain ethical and economic concerns, the controller may impose deadlines on the completion of each task, and require fairness across different groups in the allocation of the total time budget $B$.
no code implementations • 29 Feb 2020 • Semih Cayci, Atilla Eryilmaz, R. Srikant
We prove a regret lower bound for this problem, and show that the proposed algorithms achieve tight problem-dependent regret bounds, which are optimal up to a universal constant factor in the case of jointly Gaussian cost and reward pairs.
no code implementations • 27 Nov 2018 • Semih Cayci, Atilla Eryilmaz
We study the problem of serving randomly arriving and delay-sensitive traffic over a multi-channel communication system with time-varying channel states and unknown statistics.
no code implementations • 11 Mar 2018 • Doruk Öner, Altuğ Karakurt, Atilla Eryilmaz, Cem Tekin
In this paper, we introduce the COmbinatorial Multi-Objective Multi-Armed Bandit (COMO-MAB) problem that captures the challenges of combinatorial and multi-objective online learning simultaneously.
no code implementations • 26 Apr 2017 • Swapna Buccapatnam, Fang Liu, Atilla Eryilmaz, Ness B. Shroff
We study the stochastic multi-armed bandit (MAB) problem in the presence of side-observations across actions that occur as a result of an underlying network structure.