Search Results for author: Davoud Ataee Tarzanagh

Found 15 papers, 8 papers with code

Transformers as Support Vector Machines

1 code implementation 31 Aug 2023 Davoud Ataee Tarzanagh, Yingcong Li, Christos Thrampoulidis, Samet Oymak

In this work, we establish a formal equivalence between the optimization geometry of self-attention and a hard-margin SVM problem that separates optimal input tokens from non-optimal tokens using linear constraints on the outer-products of token pairs.

Max-Margin Token Selection in Attention Mechanism

1 code implementation NeurIPS 2023 Davoud Ataee Tarzanagh, Yingcong Li, Xuechen Zhang, Samet Oymak

Interestingly, the SVM formulation of $\boldsymbol{p}$ is influenced by the support vector geometry of $\boldsymbol{v}$.

Federated Multi-Sequence Stochastic Approximation with Local Hypergradient Estimation

1 code implementation 2 Jun 2023 Davoud Ataee Tarzanagh, Mingchen Li, Pranay Sharma, Samet Oymak

Stochastic approximation with multiple coupled sequences (MSA) has found broad applications in machine learning as it encompasses a rich class of problems including bilevel optimization (BLO), multi-level compositional optimization (MCO), and reinforcement learning (specifically, actor-critic methods).

Bilevel Optimization

A Penalty-Based Method for Communication-Efficient Decentralized Bilevel Programming

no code implementations 8 Nov 2022 Parvin Nazari, Ahmad Mousavi, Davoud Ataee Tarzanagh, George Michailidis

A key feature of the proposed algorithm is to estimate the hyper-gradient of the penalty function via decentralized computation of matrix-vector products and few vector communications, which is then integrated within an alternating algorithm to obtain finite-time convergence analysis under different convexity assumptions.

Bilevel Optimization, Federated Learning

Online Bilevel Optimization: Regret Analysis of Online Alternating Gradient Methods

1 code implementation 6 Jul 2022 Davoud Ataee Tarzanagh, Parvin Nazari, BoJian Hou, Li Shen, Laura Balzano

This paper introduces online bilevel optimization, in which a sequence of time-varying bilevel problems is revealed one after the other.

Bilevel Optimization

Fair Community Detection and Structure Learning in Heterogeneous Graphical Models

no code implementations 9 Dec 2021 Davoud Ataee Tarzanagh, Laura Balzano, Alfred O. Hero

In particular, we assume there is some community or clustering structure in the true underlying graph, and we seek to learn a sparse undirected graph and its communities from the data such that demographic groups are fairly represented within the communities.

Community Detection, Fairness, +1

Identification and Adaptive Control of Markov Jump Systems: Sample Complexity and Regret Bounds

no code implementations 13 Nov 2021 Yahya Sattar, Zhe Du, Davoud Ataee Tarzanagh, Laura Balzano, Necmiye Ozay, Samet Oymak

Combining our sample complexity results with recent perturbation results for certainty equivalent control, we prove that when the episode lengths are appropriately chosen, the proposed adaptive control scheme achieves $\mathcal{O}(\sqrt{T})$ regret, which can be improved to $\mathcal{O}(\mathrm{polylog}(T))$ with partial knowledge of the system.

Certainty Equivalent Quadratic Control for Markov Jump Systems

no code implementations 26 May 2021 Zhe Du, Yahya Sattar, Davoud Ataee Tarzanagh, Laura Balzano, Samet Oymak, Necmiye Ozay

Real-world control applications often involve complex dynamics subject to abrupt changes or variations.

Solving a class of non-convex min-max games using adaptive momentum methods

no code implementations 26 Apr 2021 Babak Barazandeh, Davoud Ataee Tarzanagh, George Michailidis

Adaptive momentum methods have recently attracted a lot of attention for training of deep neural networks.

Adaptive First-and Zeroth-order Methods for Weakly Convex Stochastic Optimization Problems

no code implementations 19 May 2020 Parvin Nazari, Davoud Ataee Tarzanagh, George Michailidis

In this paper, we design and analyze a new family of adaptive subgradient methods for solving an important class of weakly convex (possibly nonsmooth) stochastic optimization problems.

Stochastic Optimization

Grassmannian Optimization for Online Tensor Completion and Tracking with the t-SVD

1 code implementation 30 Jan 2020 Kyle Gilman, Davoud Ataee Tarzanagh, Laura Balzano

We propose a new fast streaming algorithm for the tensor completion problem of imputing missing entries of a low-tubal-rank tensor using the tensor singular value decomposition (t-SVD) algebraic framework.

Regularized and Smooth Double Core Tensor Factorization for Heterogeneous Data

1 code implementation 24 Nov 2019 Davoud Ataee Tarzanagh, George Michailidis

We introduce a general tensor model suitable for data analytic tasks for heterogeneous datasets, wherein there are joint low-rank structures within groups of observations, but also discriminative structures across different groups.

Clustering, Recommendation Systems

Online Distributed Estimation of Principal Eigenspaces

no code implementations 17 May 2019 Davoud Ataee Tarzanagh, Mohamad Kazem Shirani Faradonbeh, George Michailidis

Principal components analysis (PCA) is a widely used dimension reduction technique with an extensive range of applications.

Clustering, Dimensionality Reduction

DADAM: A Consensus-based Distributed Adaptive Gradient Method for Online Optimization

1 code implementation ICLR 2019 Parvin Nazari, Davoud Ataee Tarzanagh, George Michailidis

Adaptive gradient-based optimization methods such as Adagrad, RMSprop, and Adam are widely used in solving large-scale machine learning problems including deep learning.

Stochastic Optimization
