1 code implementation • 31 Aug 2023 • Davoud Ataee Tarzanagh, Yingcong Li, Christos Thrampoulidis, Samet Oymak
In this work, we establish a formal equivalence between the optimization geometry of self-attention and a hard-margin SVM problem that separates optimal input tokens from non-optimal tokens using linear constraints on the outer products of token pairs.
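Schematically, such a separating program has the flavor of the following hard-margin SVM; the notation below ($\alpha$ for the optimal token index, $\boldsymbol{z}$ for the query-side vector, $\boldsymbol{W}$ for the combined key-query weights) is an illustrative paraphrase, not the paper's exact statement:

```latex
\min_{\boldsymbol{W}} \; \|\boldsymbol{W}\|_F
\quad \text{s.t.} \quad
(\boldsymbol{x}_{\alpha} - \boldsymbol{x}_{t})^\top \boldsymbol{W}\, \boldsymbol{z} \;\ge\; 1
\qquad \text{for all non-optimal tokens } t \neq \alpha .
```

Each constraint is linear in $\boldsymbol{W}$, with coefficients given by the outer products $(\boldsymbol{x}_{\alpha} - \boldsymbol{x}_{t})\,\boldsymbol{z}^\top$, matching the description above.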
1 code implementation • NeurIPS 2023 • Davoud Ataee Tarzanagh, Yingcong Li, Xuechen Zhang, Samet Oymak
Interestingly, the SVM formulation of $\boldsymbol{p}$ is influenced by the support vector geometry of $\boldsymbol{v}$.
1 code implementation • 2 Jun 2023 • Davoud Ataee Tarzanagh, Mingchen Li, Pranay Sharma, Samet Oymak
Stochastic approximation with multiple coupled sequences (MSA) has found broad applications in machine learning as it encompasses a rich class of problems including bilevel optimization (BLO), multi-level compositional optimization (MCO), and reinforcement learning (specifically, actor-critic methods).
no code implementations • 8 Nov 2022 • Parvin Nazari, Ahmad Mousavi, Davoud Ataee Tarzanagh, George Michailidis
A key feature of the proposed algorithm is that it estimates the hypergradient of the penalty function via decentralized computation of matrix-vector products and a few vector communications; this estimate is then integrated into an alternating algorithm, for which we establish finite-time convergence guarantees under different convexity assumptions.
1 code implementation • 6 Jul 2022 • Davoud Ataee Tarzanagh, Parvin Nazari, BoJian Hou, Li Shen, Laura Balzano
This paper introduces \textit{online bilevel optimization} in which a sequence of time-varying bilevel problems is revealed one after the other.
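A single-loop scheme of this flavor can be sketched in a few lines: at each round, take one gradient step on the revealed inner problem, then one approximate hypergradient step on the outer problem. The quadratic losses and step sizes below are illustrative assumptions, not the paper's algorithm:

```python
def online_single_loop_bilevel(targets, eta_y=0.5, eta_x=0.1):
    """One inner step on g_t(x, y) = (y - x)^2, then one outer
    hypergradient step on f_t(x, y) = (y - c_t)^2 per round.
    Illustrative losses and step sizes, not the paper's method."""
    x, y = 0.0, 0.0          # outer and inner variables
    losses = []
    for c in targets:         # a new bilevel problem arrives each round
        y -= eta_y * 2.0 * (y - x)        # inner gradient step
        x -= eta_x * 2.0 * (y - c) * 1.0  # hypergradient: dy*/dx = 1 here
        losses.append((y - c) ** 2)
    return x, losses
```

With a fixed target the outer loss shrinks geometrically, illustrating how the outer variable tracks the time-varying inner solutions.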
3 code implementations • 4 May 2022 • Davoud Ataee Tarzanagh, Mingchen Li, Christos Thrampoulidis, Samet Oymak
Standard federated optimization methods successfully apply to stochastic problems with single-level structure.
no code implementations • 9 Dec 2021 • Davoud Ataee Tarzanagh, Laura Balzano, Alfred O. Hero
In particular, we assume there is some community or clustering structure in the true underlying graph, and we seek to learn a sparse undirected graph and its communities from the data such that demographic groups are fairly represented within the communities.
no code implementations • 13 Nov 2021 • Yahya Sattar, Zhe Du, Davoud Ataee Tarzanagh, Laura Balzano, Necmiye Ozay, Samet Oymak
Combining our sample complexity results with recent perturbation results for certainty-equivalent control, we prove that when the episode lengths are appropriately chosen, the proposed adaptive control scheme achieves $\mathcal{O}(\sqrt{T})$ regret, which can be improved to $\mathcal{O}(\mathrm{polylog}(T))$ with partial knowledge of the system.
no code implementations • 26 May 2021 • Zhe Du, Yahya Sattar, Davoud Ataee Tarzanagh, Laura Balzano, Samet Oymak, Necmiye Ozay
Real-world control applications often involve complex dynamics subject to abrupt changes or variations.
no code implementations • 26 Apr 2021 • Babak Barazandeh, Davoud Ataee Tarzanagh, George Michailidis
Adaptive momentum methods have recently attracted significant attention for training deep neural networks.
no code implementations • 19 May 2020 • Parvin Nazari, Davoud Ataee Tarzanagh, George Michailidis
In this paper, we design and analyze a new family of adaptive subgradient methods for solving an important class of weakly convex (possibly nonsmooth) stochastic optimization problems.
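The basic adaptive subgradient template can be sketched on a scalar; the nonsmooth objective $|x - 2|$ and step sizes below are illustrative, and the paper's methods are more general:

```python
import math

def adagrad_subgradient(subgrad, x0, eta=1.0, steps=200, eps=1e-8):
    """AdaGrad-style update x_{t+1} = x_t - eta * g_t / sqrt(sum_s g_s^2).
    Minimal scalar sketch, not the paper's (more general) methods."""
    x, acc = x0, 0.0
    for _ in range(steps):
        g = subgrad(x)       # any subgradient at x
        acc += g * g         # accumulate squared subgradients
        x -= eta * g / (math.sqrt(acc) + eps)
    return x

# minimize the nonsmooth f(x) = |x - 2|
x_star = adagrad_subgradient(lambda x: 1.0 if x > 2 else -1.0, x0=0.0)
```

The accumulated squared subgradients give the $1/\sqrt{t}$ step-size decay that such analyses typically rely on for nonsmooth problems.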
1 code implementation • 30 Jan 2020 • Kyle Gilman, Davoud Ataee Tarzanagh, Laura Balzano
We propose a new fast streaming algorithm for the tensor completion problem of imputing missing entries of a low-tubal-rank tensor using the tensor singular value decomposition (t-SVD) algebraic framework.
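The t-SVD factorization underlying this framework is standard: a third-order tensor $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ factors under the t-product $*$ as

```latex
\mathcal{A} \;=\; \mathcal{U} * \mathcal{S} * \mathcal{V}^\top ,
\qquad
\hat{\mathcal{A}}^{(k)} \;=\; \hat{\mathcal{U}}^{(k)} \, \hat{\mathcal{S}}^{(k)} \, \bigl(\hat{\mathcal{V}}^{(k)}\bigr)^{\top},
\quad k = 1, \dots, n_3,
```

where $\hat{\,\cdot\,}^{(k)}$ denotes the $k$-th frontal slice after a DFT along the third mode, so the factorization reduces to ordinary matrix SVDs in the Fourier domain; the tubal rank counts the nonzero singular tubes of $\mathcal{S}$.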
1 code implementation • 24 Nov 2019 • Davoud Ataee Tarzanagh, George Michailidis
We introduce a general tensor model suitable for data analytic tasks for {\em heterogeneous} datasets, wherein there are joint low-rank structures within groups of observations, but also discriminative structures across different groups.
no code implementations • 17 May 2019 • Davoud Ataee Tarzanagh, Mohamad Kazem Shirani Faradonbeh, George Michailidis
Principal components analysis (PCA) is a widely used dimension reduction technique with an extensive range of applications.
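The core PCA computation can be sketched via power iteration on a covariance matrix; the helper below is a hypothetical illustration, not the paper's estimator:

```python
def power_iteration(A, iters=100):
    """Leading eigenvector of a symmetric matrix A (list of lists),
    i.e. the first principal component when A is a covariance matrix.
    Illustrative helper, not the paper's method."""
    n = len(A)
    v = [1.0] * n
    for _ in range(iters):
        # multiply A @ v, then renormalize
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# covariance with dominant variance along the first coordinate
v1 = power_iteration([[4.0, 0.0], [0.0, 1.0]])
```

Repeated multiplication amplifies the dominant eigendirection, which here aligns with the first coordinate axis.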
1 code implementation • ICLR 2019 • Parvin Nazari, Davoud Ataee Tarzanagh, George Michailidis
Adaptive gradient-based optimization methods such as \textsc{Adagrad}, \textsc{Rmsprop}, and \textsc{Adam} are widely used in solving large-scale machine learning problems including deep learning.
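For reference, the textbook Adam update (with bias-corrected moment estimates) can be written in a few lines; this is a generic scalar sketch, not tied to the paper's analysis:

```python
import math

def adam_step(x, g, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update on a scalar parameter (textbook form)."""
    m = b1 * m + (1 - b1) * g          # first-moment estimate
    v = b2 * v + (1 - b2) * g * g      # second-moment estimate
    m_hat = m / (1 - b1 ** t)          # bias correction
    v_hat = v / (1 - b2 ** t)
    return x - lr * m_hat / (math.sqrt(v_hat) + eps), m, v

# minimize f(x) = x^2 starting from x = 5
x, m, v = 5.0, 0.0, 0.0
for t in range(1, 501):
    x, m, v = adam_step(x, 2.0 * x, m, v, t)
```

Dividing the momentum term by the running second-moment estimate is what makes the step sizes per-coordinate adaptive, the shared trait of the methods named above.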