Search Results for author: Zakaria Mhammedi

Found 26 papers, 3 papers with code

Beating Adversarial Low-Rank MDPs with Unknown Transition and Bandit Feedback

no code implementations • 11 Nov 2024 • Haolin Liu, Zakaria Mhammedi, Chen-Yu Wei, Julian Zimmert

First, we improve the $\mathrm{poly}(d, A, H)\,T^{5/6}$ regret bound of Zhao et al. (2024) to $\mathrm{poly}(d, A, H)\,T^{2/3}$ for the full-information unknown-transition setting, where $d$ is the rank of the transitions, $A$ is the number of actions, $H$ is the horizon length, and $T$ is the number of episodes.

Reinforcement Learning under Latent Dynamics: Toward Statistical and Algorithmic Modularity

no code implementations • 23 Oct 2024 • Philip Amortila, Dylan J. Foster, Nan Jiang, Akshay Krishnamurthy, Zakaria Mhammedi

Real-world applications of reinforcement learning often involve environments where agents operate on complex, high-dimensional observations, but the underlying ("latent") dynamics are comparatively simple.

Reinforcement Learning

Online Convex Optimization with a Separation Oracle

no code implementations • 3 Oct 2024 • Zakaria Mhammedi

Existing projection-free methods based on the classical Frank-Wolfe algorithm achieve a suboptimal regret bound of $O(T^{3/4})$, while more recent separation-based approaches guarantee a regret bound of $O(\kappa \sqrt{T})$, where $\kappa$ denotes the asphericity of the feasible set, defined as the ratio of the radii of the containing and contained balls.
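
A concrete illustration of asphericity (ours, not from the abstract): for the hypercube $[-1,1]^d$, the largest inscribed Euclidean ball has radius $r = 1$ while the smallest enclosing ball has radius $R = \sqrt{d}$, so

```latex
\[
  \kappa\bigl([-1,1]^d\bigr) \;=\; \frac{R}{r} \;=\; \sqrt{d},
\]
```

whereas a Euclidean ball has $\kappa = 1$; the $O(\kappa \sqrt{T})$ guarantee is thus dimension-dependent for boxes but not for balls.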

Improved Sample Complexity of Imitation Learning for Barrier Model Predictive Control

no code implementations • 1 Oct 2024 • Daniel Pfrommer, Swati Padmanabhan, Kwangjun Ahn, Jack Umenberger, Tobia Marcucci, Zakaria Mhammedi, Ali Jadbabaie

Recent work in imitation learning has shown that having an expert controller that is both suitably smooth and stable enables stronger guarantees on the performance of the learned controller.

Imitation Learning • Model Predictive Control

Sample and Oracle Efficient Reinforcement Learning for MDPs with Linearly-Realizable Value Functions

no code implementations • 7 Sep 2024 • Zakaria Mhammedi

Designing sample-efficient and computationally feasible reinforcement learning (RL) algorithms is particularly challenging in environments with large or infinite state and action spaces.

Reinforcement Learning (RL)

Fully Unconstrained Online Learning

no code implementations • 30 May 2024 • Ashok Cutkosky, Zakaria Mhammedi

We provide an online learning algorithm that obtains regret $G\|w_\star\|\sqrt{T\log(\|w_\star\|G\sqrt{T})} + \|w_\star\|^2 + G^2$ on $G$-Lipschitz convex losses for any comparison point $w_\star$ without knowing either $G$ or $\|w_\star\|$.

The Power of Resets in Online Reinforcement Learning

no code implementations • 23 Apr 2024 • Zakaria Mhammedi, Dylan J. Foster, Alexander Rakhlin

We use local simulator access to unlock new statistical guarantees that were previously out of reach: we show that MDPs with low coverability (Xie et al., 2023) -- a general structural condition that subsumes Block MDPs and Low-Rank MDPs -- can be learned in a sample-efficient fashion with only $Q^{\star}$-realizability (realizability of the optimal state-action value function), whereas existing online RL algorithms require significantly stronger representation conditions.

Reinforcement Learning

Efficient Model-Free Exploration in Low-Rank MDPs

no code implementations • NeurIPS 2023 • Zakaria Mhammedi, Adam Block, Dylan J. Foster, Alexander Rakhlin

A major challenge in reinforcement learning is to develop practical, sample-efficient algorithms for exploration in high-dimensional domains where generalization and function approximation are required.

Representation Learning

On the Sample Complexity of Imitation Learning for Smoothed Model Predictive Control

no code implementations • 2 Jun 2023 • Daniel Pfrommer, Swati Padmanabhan, Kwangjun Ahn, Jack Umenberger, Tobia Marcucci, Zakaria Mhammedi, Ali Jadbabaie

Recent work in imitation learning has shown that having an expert controller that is both suitably smooth and stable enables stronger guarantees on the performance of the learned controller.

Imitation Learning • Learning Theory +1

Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL

1 code implementation • 12 Apr 2023 • Zakaria Mhammedi, Dylan J. Foster, Alexander Rakhlin

We address these issues by providing the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level, with minimal statistical assumptions.

Representation Learning

Quasi-Newton Steps for Efficient Online Exp-Concave Optimization

no code implementations • 2 Nov 2022 • Zakaria Mhammedi, Khashayar Gatmiry

Typical algorithms for these settings, such as the Online Newton Step (ONS), can guarantee an $O(d\ln T)$ bound on their regret after $T$ rounds, where $d$ is the dimension of the feasible set.
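
For context, a minimal sketch of the classical ONS update described above (parameter values illustrative; the exact algorithm projects in the norm induced by the matrix $A$, which is replaced here by a plain Euclidean projection for brevity):

```python
import numpy as np

def ons(grad_fn, d, T, gamma=0.5, eps=1.0, radius=1.0):
    """Run Online Newton Step for T rounds over a Euclidean ball."""
    w = np.zeros(d)
    A = eps * np.eye(d)                        # curvature matrix, accumulates g g^T
    for t in range(T):
        g = grad_fn(t, w)
        A += np.outer(g, g)
        w = w - np.linalg.solve(A, g) / gamma  # Newton-style step
        n = np.linalg.norm(w)
        if n > radius:                         # pull back into the feasible ball
            w *= radius / n
    return w

# Toy usage on exp-concave losses f_t(w) = (w . x_t - y_t)^2.
rng = np.random.default_rng(0)
xs, ys = rng.standard_normal((100, 3)), rng.standard_normal(100)
w_final = ons(lambda t, w: 2 * (w @ xs[t] - ys[t]) * xs[t], d=3, T=100)
```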

Model Predictive Control via On-Policy Imitation Learning

no code implementations • 17 Oct 2022 • Kwangjun Ahn, Zakaria Mhammedi, Horia Mania, Zhang-Wei Hong, Ali Jadbabaie

Recent approaches to data-driven MPC have used the simplest form of imitation learning, known as behavior cloning, to learn controllers that mimic the performance of MPC by online sampling of the trajectories of the closed-loop MPC system.
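
A minimal sketch of this closed-loop sampling recipe (our toy example; the linear dynamics and feedback gain below stand in for the paper's MPC expert):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # toy double-integrator-like dynamics
B = np.array([[0.0], [0.1]])
K = np.array([[0.5, 1.0]])               # stabilising "expert" feedback gain

states, actions = [], []
x = rng.standard_normal(2)
for _ in range(1000):                    # sample trajectories of the closed loop
    u = -K @ x                           # expert action at the visited state
    states.append(x)
    actions.append(u)
    x = A @ x + B @ u + 0.01 * rng.standard_normal(2)

policy = Ridge(alpha=1e-3).fit(states, actions)  # cloned controller mimicking the expert
```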

Imitation Learning • model +2

Exploiting the Curvature of Feasible Sets for Faster Projection-Free Online Learning

no code implementations • 23 May 2022 • Zakaria Mhammedi

In this paper, we leverage recent results in parameter-free online learning, and develop an OCO algorithm that makes two calls to a linear optimization (LO) oracle per round and achieves the near-optimal $\widetilde{O}(\sqrt{T})$ regret whenever the feasible set is strongly convex.
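
For intuition (our example, not the paper's): an LO oracle returns a minimizer of a linear objective over the feasible set, which for the Euclidean ball of radius $R$ (a canonical strongly convex set) has a closed form:

```latex
\[
  \operatorname*{arg\,min}_{x :\, \|x\|_2 \le R} \langle g, x \rangle
  \;=\; -R\,\frac{g}{\|g\|_2}
  \qquad (g \neq 0).
\]
```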

Damped Online Newton Step for Portfolio Selection

no code implementations • 15 Feb 2022 • Zakaria Mhammedi, Alexander Rakhlin

In this paper, we build on the recent work of Luo et al. (2018) and present the first practical online portfolio selection algorithm with logarithmic regret whose per-round time and space complexities depend only logarithmically on the horizon.
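
To fix notation (standard online portfolio selection, not quoted from the paper): at each round $t$ the learner picks a portfolio $w_t$ in the simplex $\Delta_d$ and the market reveals price relatives $x_t \in \mathbb{R}^d_{+}$; the regret being made logarithmic in the horizon $T$ is

```latex
\[
  \mathrm{Regret}_T
  \;=\; \max_{w \in \Delta_d} \sum_{t=1}^{T} \log\langle w, x_t \rangle
  \;-\; \sum_{t=1}^{T} \log\langle w_t, x_t \rangle .
\]
```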

Risk Monotonicity in Statistical Learning

no code implementations • NeurIPS 2021 • Zakaria Mhammedi

Acquisition of data is a difficult task in many applications of machine learning, and it is only natural that one hopes and expects the population risk to decrease (better performance) monotonically with increasing data points.

Efficient Projection-Free Online Convex Optimization with Membership Oracle

no code implementations • 10 Nov 2021 • Zakaria Mhammedi

However, the Frank-Wolfe algorithm and its variants do not achieve the optimal performance, in terms of regret or rate, for general convex sets.

Stochastic Optimization

Learning the Linear Quadratic Regulator from Nonlinear Observations

no code implementations • NeurIPS 2020 • Zakaria Mhammedi, Dylan J. Foster, Max Simchowitz, Dipendra Misra, Wen Sun, Akshay Krishnamurthy, Alexander Rakhlin, John Langford

We introduce a new algorithm, RichID, which learns a near-optimal policy for the RichLQR with sample complexity scaling only with the dimension of the latent state space and the capacity of the decoder function class.

Continuous Control +1

PAC-Bayesian Bound for the Conditional Value at Risk

no code implementations • NeurIPS 2020 • Zakaria Mhammedi, Benjamin Guedj, Robert C. Williamson

Conditional Value at Risk (CVaR) is a family of "coherent risk measures" which generalize the traditional mathematical expectation.
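
Concretely, CVaR at level $\alpha$ is the expected loss over the worst $\alpha$-fraction of outcomes, recovering the plain expectation at $\alpha = 1$. A standard empirical estimator (our sketch, ignoring the fractional weight on the boundary order statistic):

```python
import numpy as np

def cvar(losses, alpha=0.05):
    """Average of the worst alpha-fraction of losses."""
    losses = np.sort(np.asarray(losses, dtype=float))[::-1]  # worst first
    k = max(1, int(np.ceil(alpha * len(losses))))
    return losses[:k].mean()

losses = np.random.default_rng(0).exponential(size=10_000)
print(cvar(losses, alpha=1.0))   # ~ sample mean: CVaR at alpha = 1 is the expectation
print(cvar(losses, alpha=0.05))  # tail average, much larger
```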

Fairness

Lipschitz and Comparator-Norm Adaptivity in Online Learning

no code implementations • 27 Feb 2020 • Zakaria Mhammedi, Wouter M. Koolen

We study Online Convex Optimization in the unbounded setting where neither predictions nor gradients are constrained.

Lipschitz Adaptivity with Multiple Learning Rates in Online Learning

no code implementations • 27 Feb 2019 • Zakaria Mhammedi, Wouter M. Koolen, Tim van Erven

For MetaGrad, we further improve the computational efficiency of handling constraints on the domain of prediction, and we remove the need to specify the number of rounds in advance.

Active Learning • Computational Efficiency

Constant Regret, Generalized Mixability, and Mirror Descent

no code implementations • NeurIPS 2018 • Zakaria Mhammedi, Robert C. Williamson

For a given entropy $\Phi$, losses for which a constant regret is possible using the generalized aggregating algorithm (GAA) are called $\Phi$-mixable.

Adversarial Generation of Real-time Feedback with Neural Networks for Simulation-based Training

no code implementations • 4 Mar 2017 • Xingjun Ma, Sudanthi Wijewickrema, Shuo Zhou, Yun Zhou, Zakaria Mhammedi, Stephen O'Leary, James Bailey

This paper aims to develop an efficient and effective method for generating real-time feedback in simulation-based training (SBT).

Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections

1 code implementation • ICML 2017 • Zakaria Mhammedi, Andrew Hellicar, Ashfaqur Rahman, James Bailey

Our contributions are as follows: we first show that constraining the transition matrix to be unitary is a special case of an orthogonal constraint.
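
The construction underlying the parametrisation is easy to sketch (our toy illustration): each Householder reflection $H(v) = I - 2vv^{\top}/\|v\|^{2}$ is orthogonal, so any product of them is orthogonal, and by the Cartan-Dieudonné theorem every $d \times d$ orthogonal matrix is a product of at most $d$ reflections.

```python
import numpy as np

def householder_product(vs):
    """Product of Householder reflections H(v) = I - 2 v v^T / ||v||^2."""
    d = vs[0].shape[0]
    W = np.eye(d)
    for v in vs:
        W -= 2.0 * np.outer(W @ v, v) / (v @ v)   # right-multiply W by H(v)
    return W

rng = np.random.default_rng(0)
W = householder_product([rng.standard_normal(8) for _ in range(8)])
print(np.allclose(W @ W.T, np.eye(8)))            # True: W is orthogonal
```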
