no code implementations • 13 Jan 2025 • Yahya Sattar, Yassir Jedra, Maryam Fazel, Sarah Dean
We consider the problem of learning a realization of a partially observed bilinear dynamical system (BLDS) from noisy input-output data.
no code implementations • 13 Dec 2024 • Avinandan Bose, Zhihan Xiong, Aadirupa Saha, Simon Shaolei Du, Maryam Fazel
Our results yield improved sample efficiency for hybrid RLHF over purely offline learning and purely online exploration.
no code implementations • 2 Oct 2024 • Zhihan Xiong, Maryam Fazel, Lin Xiao
We propose Dual Approximation Policy Optimization (DAPO), a framework that incorporates general function approximation into policy mirror descent methods.
no code implementations • 29 Jun 2024 • Weihang Xu, Maryam Fazel, Simon S. Du
We study the gradient Expectation-Maximization (EM) algorithm for Gaussian Mixture Models (GMM) in the over-parameterized setting, where a general GMM with $n>1$ components learns from data generated by a single ground-truth Gaussian distribution.
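To make the setting concrete, here is a minimal numpy sketch of gradient EM in this regime (an illustration of ours, not the paper's code): an $n=2$ component GMM with uniform weights and identity covariances is fit to samples from a single standard Gaussian, with the full M-step replaced by one gradient step on the EM surrogate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Data from a single ground-truth Gaussian N(0, I) in R^d.
N, d, n = 2000, 2, 2            # samples, dimension, mixture components (n > 1)
X = rng.standard_normal((N, d))

# Over-parameterized model: n-component GMM with uniform weights and identity
# covariances; only the means are learned.
mu = rng.standard_normal((n, d))
eta = 1.0                        # step size for the gradient step

for _ in range(500):
    # E-step: responsibilities w[i, j] proportional to exp(-||x_i - mu_j||^2 / 2).
    sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)          # (N, n)
    w = np.exp(-0.5 * (sq - sq.min(axis=1, keepdims=True)))
    w /= w.sum(axis=1, keepdims=True)
    # Gradient step on the EM surrogate, in place of the full M-step.
    grad = (w[:, :, None] * (X[:, None, :] - mu[None, :, :])).mean(axis=0)
    mu += eta * grad

print(mu)   # both rows approach the true mean, the origin
```

Both component means drift toward the common ground-truth mean; convergence in this over-parameterized regime is noticeably slower than when the model is exactly specified.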
no code implementations • 19 Feb 2024 • Avinandan Bose, Simon Shaolei Du, Maryam Fazel
We study the problem of representation transfer in offline Reinforcement Learning (RL), where a learner has access to episodic data from a number of source tasks collected a priori, and aims to learn a shared representation to be used in finding a good policy for a target task.
no code implementations • 12 Feb 2024 • Qiwen Cui, Maryam Fazel, Simon S. Du
In multiplayer games, self-interested behavior among the players can harm the social welfare.
no code implementations • 19 Dec 2023 • Avinandan Bose, Mihaela Curmei, Daniel L. Jiang, Jamie Morgenstern, Sarah Dean, Lillian J. Ratliff, Maryam Fazel
(ii) Suboptimal Local Solutions: the landscape of the total loss (the sum of the loss functions across all users and all services) is not convex even if the individual losses on a single service are convex, so the learning dynamics are likely to get stuck in local minima.
1 code implementation • 27 Jul 2023 • Zhihan Xiong, Romain Camilleri, Maryam Fazel, Lalit Jain, Kevin Jamieson
For robust identification, it is well-known that if arms are chosen randomly and non-adaptively from a G-optimal design over $\mathcal{X}$ at each time then the error probability decreases as $\exp(-T\Delta^2_{(1)}/d)$, where $\Delta_{(1)} = \min_{x \neq x^*} (x^* - x)^\top \frac{1}{T}\sum_{t=1}^T \theta_t$.
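To illustrate the quantities in this bound, the following numpy sketch (a toy instance of ours, not the authors' code) draws a random arm set and a drifting parameter sequence, then evaluates the gap $\Delta_{(1)}$ and the quoted rate $\exp(-T\Delta^2_{(1)}/d)$.

```python
import numpy as np

rng = np.random.default_rng(1)

d, K, T = 5, 20, 1000
X = rng.standard_normal((K, d))                           # arm set: K arms in R^d
thetas = np.ones(d) + 0.1 * rng.standard_normal((T, d))   # drifting theta_t

theta_bar = thetas.mean(axis=0)                  # (1/T) sum_t theta_t
vals = X @ theta_bar
x_star = np.argmax(vals)
# Delta_(1) = min over x != x* of (x* - x)^T theta_bar
gap = np.min(np.delete(vals[x_star] - vals, x_star))
print(gap, np.exp(-T * gap**2 / d))              # the exp(-T Delta^2 / d) rate
```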
no code implementations • 12 Jun 2023 • Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du
Specifically, we focus on games with bandit feedback, where testing an equilibrium can result in substantial regret even when the gap to be tested is small, and the existence of multiple optimal solutions (equilibria) in stationary games poses extra challenges.
no code implementations • 2 Feb 2023 • Yuzhen Qin, Yingcong Li, Fabio Pasqualetti, Maryam Fazel, Samet Oymak
The growing interest in complex decision-making and language modeling problems highlights the importance of sample-efficient learning over very long horizons.
no code implementations • 24 Oct 2022 • Haozhe Jiang, Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du
On the other hand, we convert the game to multi-agent linear bandits and show that with a generalized data coverage assumption in offline linear bandits, we can efficiently recover the approximate NE.
no code implementations • 10 Oct 2022 • Bin Hu, Kaiqing Zhang, Na Li, Mehran Mesbahi, Maryam Fazel, Tamer Başar
Gradient-based methods have been widely used for system design and optimization in diverse application domains.
1 code implementation • 13 Jul 2022 • Vincent Roulet, Siddhartha Srinivasa, Maryam Fazel, Zaid Harchaoui
We present the implementation of nonlinear control algorithms based on linear and quadratic approximations of the objective from a functional viewpoint.
1 code implementation • 7 Jul 2022 • Adhyyan Narang, Omid Sadeghi, Lillian J Ratliff, Maryam Fazel, Jeff Bilmes
In the context of online interactive machine learning with combinatorial objectives, we extend purely submodular prior work to more general non-submodular objectives.
2 code implementations • 6 Jun 2022 • Sarah Dean, Mihaela Curmei, Lillian J. Ratliff, Jamie Morgenstern, Maryam Fazel
Numerous online services are data-driven: the behavior of users affects the system's parameters, and the system's parameters affect the users' experience of the service, which in turn affects the way users may interact with the system.
no code implementations • 4 Jun 2022 • Qiwen Cui, Zhihan Xiong, Maryam Fazel, Simon S. Du
We propose a centralized algorithm for Markov congestion games, whose sample complexity again has only polynomial dependence on all relevant problem parameters, but not the size of the action set.
no code implementations • 8 Apr 2022 • Mitas Ray, Dmitriy Drusvyatskiy, Maryam Fazel, Lillian J. Ratliff
This paper studies the problem of expected loss minimization given a data distribution that is dependent on the decision-maker's action and evolves dynamically in time according to a geometric decay process.
1 code implementation • 30 Mar 2022 • Yue Sun, Samet Oymak, Maryam Fazel
Hankel regularization encourages the low-rankness of the Hankel matrix, which maps to the low-orderness of the system.
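The connection is easy to check numerically: for a state-space model of order $r$, the Hankel matrix built from its impulse response has rank $r$, so penalizing that matrix's nuclear norm (the convex surrogate for rank) steers regression toward low-order systems. Below is an illustrative numpy/scipy check on a hypothetical order-2 system (not the paper's code).

```python
import numpy as np
from scipy.linalg import hankel

# Hypothetical order-2 system x_{k+1} = A x_k + B u_k, y_k = C x_k.
A = np.array([[0.8, 0.5], [0.0, 0.5]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.0]])

T = 40
h = np.array([(C @ np.linalg.matrix_power(A, k) @ B).item() for k in range(T)])

# Hankel matrix of the impulse response: its rank equals the system order.
H = hankel(h[: T // 2], h[T // 2 - 1 :])
print(np.linalg.matrix_rank(H, tol=1e-8))          # -> 2
print(np.linalg.svd(H, compute_uv=False)[:4])      # two dominant singular values
# The nuclear norm (sum of singular values) of H is the convex surrogate for
# its rank, so penalizing it in regression favors low-order models.
```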
no code implementations • 7 Mar 2022 • Lijun Ding, Dmitriy Drusvyatskiy, Maryam Fazel, Zaid Harchaoui
Empirical evidence suggests that for a variety of overparameterized nonlinear models, most notably in neural network training, the growth of the loss around a minimizer strongly impacts its performance.
1 code implementation • NeurIPS 2021 • Yue Sun, Adhyyan Narang, Halil Ibrahim Gulluk, Samet Oymak, Maryam Fazel
Specifically, for (1), we first show that learning the optimal representation coincides with the problem of designing a task-aware regularization to promote inductive bias.
no code implementations • 10 Jan 2022 • Adhyyan Narang, Evan Faulkner, Dmitriy Drusvyatskiy, Maryam Fazel, Lillian J. Ratliff
We show that under mild assumptions, the performatively stable equilibria can be found efficiently by a variety of algorithms, including repeated retraining and the repeated (stochastic) gradient method.
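As a single-player toy instance of this mechanism (our illustration; the paper treats the multi-player game setting), the repeated stochastic gradient method samples from the distribution induced by the current decision and differentiates as if that distribution were fixed; under the usual Lipschitz and strong-convexity conditions it settles at the performatively stable point.

```python
import numpy as np

rng = np.random.default_rng(2)

mu, eps = 1.0, 0.3     # base mean; strength of the decision-dependent shift
eta, theta = 0.05, 0.0

# Repeated stochastic gradient: sample from the distribution induced by the
# current decision, then step as if that distribution were fixed.
for _ in range(5000):
    z = mu + eps * theta + rng.standard_normal()   # z ~ D(theta)
    theta -= eta * (theta - z)                     # gradient of (1/2)(theta - z)^2

# The performatively stable point solves theta = E_{z ~ D(theta)}[z].
print(theta, mu / (1 - eps))
```

With a decaying step size the iterates converge exactly to the stable point; with the constant step used here they hover around it.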
no code implementations • 15 Nov 2021 • Omid Sadeghi, Maryam Fazel
Then, we study $L$-smooth monotone strongly DR-submodular functions that have bounded curvature, and we show how to exploit such additional structure to obtain algorithms with improved approximation guarantees and faster convergence rates for the maximization problem.
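For intuition on the baseline such results improve upon, here is a plain Frank-Wolfe (continuous-greedy-style) sketch for a monotone DR-submodular quadratic over a down-closed box-with-budget constraint; the instance and constants are made up, and this vanilla method does not exploit the strong DR-submodularity or curvature structure the paper leverages.

```python
import numpy as np

rng = np.random.default_rng(3)
d, b, K = 10, 3, 100

# H >= 0 entrywise, so the Hessian of f (which is -H) is <= 0 entrywise:
# f is DR-submodular; the choice of a keeps grad f >= 0 (monotone) on [0,1]^d.
H = rng.uniform(0, 0.1, (d, d)); H = (H + H.T) / 2
a = H.sum(axis=1) + rng.uniform(0.5, 1.0, d)

f = lambda x: a @ x - 0.5 * x @ H @ x
grad_f = lambda x: a - H @ x

# Frank-Wolfe / continuous-greedy variant: K steps of size 1/K, each toward
# the linear maximizer over {v in [0,1]^d : sum(v) <= b}.
x = np.zeros(d)
for _ in range(K):
    g = grad_f(x)
    v = np.zeros(d)
    top = np.argsort(g)[::-1][:b]
    v[top[g[top] > 0]] = 1.0          # mass on the b best positive coordinates
    x += v / K

print(f(x))   # (1 - 1/e)-approximate maximizer for this toy instance
```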
no code implementations • NeurIPS 2021 • Romain Camilleri, Zhihan Xiong, Maryam Fazel, Lalit Jain, Kevin Jamieson
The main results of this work precisely characterize this trade-off between labeled samples and stopping time and provide an algorithm that nearly optimally achieves the minimal label complexity given a desired stopping time.
no code implementations • 15 Jun 2021 • Omid Sadeghi, Prasanna Raut, Maryam Fazel
For $(1)$, we obtain the first logarithmic regret bounds.
no code implementations • 19 Feb 2021 • Zhihan Xiong, Ruoqi Shen, Qiwen Cui, Maryam Fazel, Simon S. Du
To achieve the desired result, we develop 1) a new clipping operation to ensure both the probability of being optimistic and the probability of being pessimistic are lower bounded by a constant, and 2) a new recursive formula for the absolute value of estimation errors to analyze the regret.
no code implementations • 14 Feb 2021 • Halil Ibrahim Gulluk, Yue Sun, Samet Oymak, Maryam Fazel
We prove that subspace-based representations can be learned in a sample-efficient manner and provably benefit future tasks in terms of sample complexity.
no code implementations • 23 Dec 2020 • Mitas Ray, Omid Sadeghi, Lillian J. Ratliff, Maryam Fazel
We study the problem of online resource allocation, where multiple customers arrive sequentially and the seller must irrevocably allocate resources to each incoming customer while also facing a procurement cost for the total allocation.
no code implementations • NeurIPS 2020 • Omid Sadeghi, Prasanna Raut, Maryam Fazel
In this paper, we consider an online optimization problem in which the reward functions are DR-submodular, and in addition to maximizing the total reward, the sequence of decisions must satisfy some convex constraints on average.
no code implementations • L4DC 2020 • Yue Sun, Samet Oymak, Maryam Fazel
This paper studies low-order linear system identification via regularized regression.
no code implementations • 29 May 2020 • Prasanna Sanjay Raut, Omid Sadeghi, Maryam Fazel
Stochastic long-term constraints arise naturally in applications where there is a limited budget or resource available and resource consumption at each step is governed by stochastically time-varying environments.
no code implementations • 30 Jun 2019 • Omid Sadeghi, Reza Eghbali, Maryam Fazel
In this paper, we study a certain class of online optimization problems, where the goal is to maximize a function that is not necessarily concave and satisfies the Diminishing Returns (DR) property under budget constraints.
no code implementations • 30 Jun 2019 • Omid Sadeghi, Maryam Fazel
In this paper, we study a class of online optimization problems with long-term budget constraints where the objective functions are not necessarily concave (nor convex) but they instead satisfy the Diminishing Returns (DR) property.
no code implementations • NeurIPS 2019 • Yue Sun, Nicolas Flammarion, Maryam Fazel
We consider minimizing a nonconvex, smooth function $f$ on a Riemannian manifold $\mathcal{M}$.
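A minimal instance of Riemannian gradient descent (the basic iteration that perturbed, saddle-escaping variants build on; the perturbations are omitted in this sketch): minimize the Rayleigh quotient $x^\top M x$ on the unit sphere by projecting the Euclidean gradient onto the tangent space and retracting back to the manifold.

```python
import numpy as np

rng = np.random.default_rng(4)
d = 10
M = rng.standard_normal((d, d)); M = (M + M.T) / 2

# Minimize the (nonconvex) Rayleigh quotient f(x) = x^T M x on the unit sphere;
# the minimizer is an eigenvector for the smallest eigenvalue of M.
x = rng.standard_normal(d)
x /= np.linalg.norm(x)
eta = 0.05

for _ in range(2000):
    egrad = 2 * M @ x                     # Euclidean gradient
    rgrad = egrad - (x @ egrad) * x       # projection onto the tangent space at x
    x -= eta * rgrad
    x /= np.linalg.norm(x)                # retraction back onto the sphere

print(x @ M @ x, np.linalg.eigvalsh(M)[0])   # these should nearly match
```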
no code implementations • 10 Apr 2019 • Amin Jalali, Adel Javanmard, Maryam Fazel
Prior knowledge about the properties of a target model often comes as discrete or combinatorial descriptions.
no code implementations • ICML 2018 • Maryam Fazel, Rong Ge, Sham M. Kakade, Mehran Mesbahi
Direct policy gradient methods for reinforcement learning and continuous control problems are a popular approach for a variety of reasons: 1) they are easy to implement without explicit knowledge of the underlying model; 2) they are an "end-to-end" approach, directly optimizing the performance metric of interest; 3) they inherently allow for richly parameterized policies.
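For the LQR instance analyzed in this line of work, the exact policy gradient has a closed form, $\nabla C(K) = 2\,[(R + B^\top P_K B)K - B^\top P_K A]\,\Sigma_K$, which makes a model-based sketch straightforward. The system matrices below are illustrative; the paper's contribution is the global-convergence analysis, not this iteration itself.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Illustrative discrete-time LQR instance; policy u = -K x.
A = np.array([[0.9, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)
Sigma0 = np.eye(2)               # covariance of the random initial state

def cost_and_grad(K):
    Acl = A - B @ K
    # P_K solves  P = Acl^T P Acl + Q + K^T R K   (value matrix).
    P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)
    # Sigma_K solves  S = Acl S Acl^T + Sigma0    (state covariance).
    S = solve_discrete_lyapunov(Acl, Sigma0)
    E = (R + B.T @ P @ B) @ K - B.T @ P @ A
    return np.trace(P @ Sigma0), 2 * E @ S        # C(K) and its exact gradient

K = np.zeros((1, 2))    # initial stabilizing policy
eta = 1e-3              # small constant step keeps iterates stabilizing here
for _ in range(2000):
    c, g = cost_and_grad(K)
    K -= eta * g

print(c)   # approaches the optimal LQR cost despite nonconvexity in K
```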
no code implementations • NeurIPS 2016 • Amin Jalali, Qiyang Han, Ioana Dumitriu, Maryam Fazel
The Stochastic Block Model (SBM) is a widely used random graph model for networks with communities.
no code implementations • NeurIPS 2016 • Reza Eghbali, Maryam Fazel
Online optimization covers problems such as online resource allocation, online bipartite matching, AdWords (a central problem in e-commerce and advertising), and AdWords with separable concave returns.
no code implementations • 15 Dec 2015 • Amin Jalali, Qiyang Han, Ioana Dumitriu, Maryam Fazel
For instance, $\log n$ is considered the standard lower bound on the cluster size for exact recovery via convex methods for the homogeneous SBM.
no code implementations • 16 Jul 2015 • Amin Jalali, Maryam Fazel, Lin Xiao
We propose a new class of convex penalty functions, called \emph{variational Gram functions} (VGFs), that can promote pairwise relations, such as orthogonality, among a set of vectors in a vector space.
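One concrete member of this family (our reading of the construction; the paper develops a much more general parameterized class with efficient algorithms) is the pairwise-orthogonality penalty $\Omega(X)=\sum_{i\neq j}|x_i^\top x_j|$, which depends on $X$ only through its Gram matrix $X^\top X$. A subgradient-descent sketch on a toy multi-task regression, with made-up dimensions and constants:

```python
import numpy as np

rng = np.random.default_rng(5)

# Pairwise-orthogonality penalty, one VGF instance:
#   Omega(X) = sum_{i != j} |x_i^T x_j|,  a function of the Gram matrix X^T X.
def vgf_orth(X):
    G = X.T @ X
    return np.abs(G - np.diag(np.diag(G))).sum()

# Toy multi-task least squares:  minimize ||A X - Y||_F^2 + lam * Omega(X),
# encouraging near-orthogonal columns (task weight vectors) of X.
n, d, k, lam, eta = 100, 20, 3, 0.5, 1e-3
A = rng.standard_normal((n, d))
Y = rng.standard_normal((n, k))
X = 0.1 * rng.standard_normal((d, k))

for _ in range(3000):
    G = X.T @ X
    S = np.sign(G - np.diag(np.diag(G)))   # subgradient w.r.t. the Gram matrix
    grad = 2 * A.T @ (A @ X - Y) + 2 * lam * X @ S
    X -= eta * grad

print(vgf_orth(X))   # off-diagonal Gram entries shrink toward zero
```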
no code implementations • 27 Oct 2014 • Reza Eghbali, Jon Swenson, Maryam Fazel
Online optimization problems arise in many resource allocation tasks, where the future demands for each resource and the associated utility functions change over time and are not known a priori, yet resources must be allocated at every point in time despite this uncertainty.
no code implementations • 3 Jun 2014 • Krishnamurthy Dvijotham, Maryam Fazel, Emanuel Todorov
We develop a framework for convexifying a fairly general class of optimization problems.
no code implementations • 28 Feb 2014 • Kean Ming Tan, Palma London, Karthik Mohan, Su-In Lee, Maryam Fazel, Daniela Witten
We consider the problem of learning a high-dimensional graphical model in which certain hub nodes are highly connected to many other nodes.
no code implementations • 21 Mar 2013 • Karthik Mohan, Palma London, Maryam Fazel, Daniela Witten, Su-In Lee
We consider estimation under two distinct assumptions: (1) differences between the K networks are due to individual nodes that are perturbed across conditions, or (2) similarities among the K networks are due to the presence of common hub nodes that are shared across all K networks.
no code implementations • NeurIPS 2012 • Karthik Mohan, Mike Chung, Seungyeop Han, Daniela Witten, Su-In Lee, Maryam Fazel
We consider estimation of multiple high-dimensional Gaussian graphical models corresponding to a single set of nodes under several distinct conditions.