no code implementations • 18 Feb 2024 • Liam Collins, Advait Parulekar, Aryan Mokhtari, Sujay Sanghavi, Sanjay Shakkottai
We show that an attention unit learns a window that it uses to implement a nearest-neighbors predictor adapted to the landscape of the pretraining tasks.
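As a rough illustration of the intuition (all names here are hypothetical, not taken from the paper), softmax attention over context points behaves like a soft nearest-neighbors predictor whose window is set by a scale parameter:

```python
import numpy as np

def soft_nearest_neighbor(x_query, X_ctx, y_ctx, width):
    """Softmax attention over context examples: a soft nearest-neighbors
    predictor whose window is controlled by `width`."""
    # Negative squared distances act as attention scores.
    scores = -np.sum((X_ctx - x_query) ** 2, axis=1) / width
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ y_ctx  # weighted average of context labels

# As width -> 0 this approaches a 1-nearest-neighbor predictor.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(20, 3)), rng.normal(size=20)
print(soft_nearest_neighbor(X[0], X, y, width=0.1))
```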
no code implementations • 12 Feb 2024 • Jincheng Cao, Ruichen Jiang, Erfan Yazdandoost Hamedani, Aryan Mokhtari
In this paper, we focus on simple bilevel optimization problems, where we minimize a convex smooth objective function over the optimal solution set of another convex smooth constrained optimization problem.
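In symbols, the problem class described here is

$$\min_{x \in \mathbb{R}^d} \; f(x) \quad \text{s.t.} \quad x \in \operatorname*{arg\,min}_{z \in \mathcal{Z}} \; g(z),$$

where the upper-level objective $f$ and the lower-level objective $g$ are both convex and smooth, and $\mathcal{Z}$ is the lower-level constraint set.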
no code implementations • 5 Jan 2024 • Ruichen Jiang, Parameswaran Raman, Shoham Sabach, Aryan Mokhtari, Mingyi Hong, Volkan Cevher
In this paper, we introduce a novel subspace cubic regularized Newton method that achieves a dimension-independent global convergence rate of ${O}\left(\frac{1}{mk}+\frac{1}{k^2}\right)$ for solving convex optimization problems.
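A generic subspace cubic-Newton step, sketched under the assumption that $V_k \in \mathbb{R}^{d \times m}$ is an orthonormal basis of the chosen subspace (notation ours, not necessarily the paper's), restricts the cubic-regularized model to that subspace:

$$x_{k+1} = x_k + V_k z_k, \qquad z_k = \operatorname*{arg\,min}_{z \in \mathbb{R}^m} \left\{ \langle V_k^\top \nabla f(x_k), z \rangle + \tfrac{1}{2} z^\top V_k^\top \nabla^2 f(x_k) V_k z + \tfrac{M}{6} \|z\|^3 \right\},$$

so each iteration only requires an $m \times m$ projected Hessian rather than the full $d \times d$ one.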
no code implementations • 13 Jul 2023 • Liam Collins, Hamed Hassani, Mahdi Soltanolkotabi, Aryan Mokhtari, Sanjay Shakkottai
An increasingly popular machine learning paradigm is to pretrain a neural network (NN) on many tasks offline, then adapt it to downstream tasks, often by re-training only the last linear layer of the network.
no code implementations • 27 Jun 2023 • Zhan Gao, Aryan Mokhtari, Alec Koppel
Interestingly, our established non-asymptotic superlinear convergence rate demonstrates an explicit trade-off between convergence speed and memory requirement, which, to our knowledge, is the first result of its kind.
no code implementations • 16 Feb 2023 • Ruichen Jiang, Qiujiang Jin, Aryan Mokhtari
Quasi-Newton algorithms are among the most popular iterative methods for solving unconstrained minimization problems, largely due to their favorable superlinear convergence property.
no code implementations • 15 Feb 2023 • Advait Parulekar, Liam Collins, Karthikeyan Shanmugam, Aryan Mokhtari, Sanjay Shakkottai
The goal of contrastive learning is to learn a representation that preserves underlying clusters by keeping samples with similar content, e.g., the "dogness" of a dog, close to each other in the space generated by the representation.
no code implementations • 11 Jan 2023 • Parikshit Hegde, Gustavo de Veciana, Aryan Mokhtari
In order to achieve the dual goals of privacy and learning across distributed data, Federated Learning (FL) systems rely on frequent exchanges of large files (model updates) between a set of clients and the server.
no code implementations • 2 Sep 2022 • Mao Ye, Ruichen Jiang, Haoxiang Wang, Dhruv Choudhary, Xiaocong Du, Bhargav Bhushanam, Aryan Mokhtari, Arun Kejariwal, Qiang Liu
One of the key challenges of learning an online recommendation model is the temporal domain shift, which causes a mismatch between the training and testing data distributions and hence leads to domain generalization error.
1 code implementation • 17 Jun 2022 • Ruichen Jiang, Nazanin Abolfazli, Aryan Mokhtari, Erfan Yazdandoost Hamedani
To the best of our knowledge, our method achieves the best-known iteration complexity for the considered class of bilevel problems.
1 code implementation • 5 Jun 2022 • Isidoros Tziotis, Zebang Shen, Ramtin Pedarsani, Hamed Hassani, Aryan Mokhtari
Federated Learning is an emerging learning paradigm that allows training models from samples distributed across a large network of clients while respecting privacy and communication restrictions.
no code implementations • 27 May 2022 • Liam Collins, Hamed Hassani, Aryan Mokhtari, Sanjay Shakkottai
We show that the reason behind the generalizability of FedAvg's output is its ability to learn the common data representation among the clients' tasks by leveraging the diversity among client data distributions via local updates.
no code implementations • 19 Feb 2022 • Ruichen Jiang, Aryan Mokhtari
In this paper, we follow this approach and distill the underlying idea of optimism to propose a generalized optimistic method, which includes the optimistic gradient method as a special case.
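For reference, writing $F$ for the gradient operator of the underlying minimax problem, the optimistic gradient special case takes the familiar form

$$z_{k+1} = z_k - \eta \left( 2F(z_k) - F(z_{k-1}) \right),$$

where the extrapolation term $F(z_k) - F(z_{k-1})$ serves as the "optimistic" prediction of the next gradient.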
no code implementations • 11 Feb 2022 • Matthew Faw, Isidoros Tziotis, Constantine Caramanis, Aryan Mokhtari, Sanjay Shakkottai, Rachel Ward
We study convergence rates of AdaGrad-Norm as an exemplar of adaptive stochastic gradient methods (SGD), where the step sizes change based on observed stochastic gradients, for minimizing non-convex, smooth objectives.
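A minimal sketch of AdaGrad-Norm, assuming a generic stochastic gradient oracle `grad` (the constants are illustrative, not those from the analysis):

```python
import numpy as np

def adagrad_norm(grad, x0, eta=1.0, b0=1e-2, steps=1000):
    """AdaGrad-Norm: a single scalar step size adapted from the
    accumulated squared norms of observed (stochastic) gradients."""
    x, b2 = np.asarray(x0, dtype=float), b0 ** 2
    for _ in range(steps):
        g = grad(x)
        b2 += np.dot(g, g)            # accumulate squared gradient norm
        x -= (eta / np.sqrt(b2)) * g  # step size eta / b_k shrinks adaptively
    return x

# Example: minimize the smooth quadratic f(x) = ||x||^2.
print(adagrad_norm(lambda x: 2 * x, x0=np.ones(5)))
```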
no code implementations • 7 Feb 2022 • Liam Collins, Aryan Mokhtari, Sewoong Oh, Sanjay Shakkottai
Recent empirical evidence has driven conventional wisdom to believe that gradient-based meta-learning (GBML) methods perform well at few-shot learning because they learn an expressive data representation that is shared across tasks.
no code implementations • 1 Nov 2021 • Arman Adibi, Aryan Mokhtari, Hamed Hassani
Prior literature has thus far mainly focused on studying such problems in the continuous domain, e.g., convex-concave minimax optimization is now understood to a significant extent.
no code implementations • NeurIPS 2021 • Qiujiang Jin, Aryan Mokhtari
In this paper, we use an adaptive sample size scheme that exploits the superlinear convergence of quasi-Newton methods globally and throughout the entire learning process.
3 code implementations • 14 Feb 2021 • Liam Collins, Hamed Hassani, Aryan Mokhtari, Sanjay Shakkottai
Based on this intuition, we propose a novel federated learning framework and algorithm for learning a shared data representation across clients and unique local heads for each client.
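A simplified linear-regression sketch of one such round (helper names hypothetical; this is an illustration of the alternating scheme, not the released implementation): each client fits its local head with the shared representation frozen, takes a gradient step on the representation, and the server averages the representation updates.

```python
import numpy as np

def fedrep_round(B, heads, client_data, lr=0.1, head_steps=5):
    """One round of a FedRep-style scheme with linear models:
    B is the shared representation, heads[i] the local head of client i."""
    B_updates = []
    for i, (X, y) in enumerate(client_data):
        Z = X @ B                                   # shared features
        for _ in range(head_steps):                 # local head update
            r = Z @ heads[i] - y
            heads[i] -= lr * Z.T @ r / len(y)
        r = Z @ heads[i] - y                        # representation gradient step
        B_updates.append(B - lr * X.T @ np.outer(r, heads[i]) / len(y))
    return np.mean(B_updates, axis=0), heads
```

Only the representation `B` is communicated and averaged; each head stays local, which is what personalizes the model to each client.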
no code implementations • NeurIPS 2021 • Alireza Fallah, Aryan Mokhtari, Asuman Ozdaglar
In this paper, we study the generalization properties of Model-Agnostic Meta-Learning (MAML) algorithms for supervised learning problems.
no code implementations • 28 Dec 2020 • Amirhossein Reisizadeh, Isidoros Tziotis, Hamed Hassani, Aryan Mokhtari, Ramtin Pedarsani
Federated Learning is a novel paradigm that involves learning from data samples distributed across a large network of clients while the data remains local.
2 code implementations • NeurIPS 2020 • Alireza Fallah, Aryan Mokhtari, Asuman Ozdaglar
In this paper, we study a personalized variant of federated learning, in which our goal is to find an initial shared model that current or new users can easily adapt to their local dataset by performing one or a few steps of gradient descent with respect to their own data.
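The resulting objective is the MAML-style formulation

$$\min_{w \in \mathbb{R}^d} \; \frac{1}{n} \sum_{i=1}^{n} f_i\big(w - \alpha \nabla f_i(w)\big),$$

where $f_i$ is user $i$'s loss and $\alpha$ is the local adaptation step size, so the shared model is judged by its performance after each user's one-step personalization.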
no code implementations • NeurIPS 2020 • Isidoros Tziotis, Constantine Caramanis, Aryan Mokhtari
In this paper we study the problem of escaping from saddle points and achieving second-order optimality in a decentralized setting where a group of agents collaborate to minimize their aggregate objective function.
no code implementations • 27 Oct 2020 • Liam Collins, Aryan Mokhtari, Sanjay Shakkottai
Model-Agnostic Meta-Learning (MAML) has become increasingly popular for training models that can quickly adapt to new tasks via one or few stochastic gradient descent steps.
1 code implementation • NeurIPS 2020 • Arman Adibi, Aryan Mokhtari, Hamed Hassani
Motivated by this terminology, we propose a novel meta-learning framework in the discrete domain where each task is equivalent to maximizing a set function under a cardinality constraint.
1 code implementation • 2 Jul 2020 • Farzin Haddadpour, Mohammad Mahdi Kamani, Aryan Mokhtari, Mehrdad Mahdavi
In federated learning, communication cost is often a critical bottleneck to scale up distributed optimization algorithms to collaboratively learn a model from millions of devices with potentially unreliable or limited communication and heterogeneous data distributions.
no code implementations • 23 Jun 2020 • Mohammad Fereydounian, Zebang Shen, Aryan Mokhtari, Amin Karbasi, Hamed Hassani
More precisely, by assuming that Reliable-FW has access to a (stochastic) gradient oracle of the objective function and a noisy feasibility oracle of the safety polytope, it finds an $\epsilon$-approximate first-order stationary point with the optimal $\mathcal{O}(1/\epsilon^2)$ gradient oracle complexity.
no code implementations • 7 Jun 2020 • Aryan Mokhtari, Leyla Sadighi, Behnam Bahrak, Mojtaba Eshghie
In this paper, a new hybrid method is proposed that combines several anomaly detection techniques, namely GARCH, K-means, and neural networks, to detect anomalous data.
no code implementations • 30 Mar 2020 • Qiujiang Jin, Aryan Mokhtari
In this paper, we provide a finite-time (non-asymptotic) convergence analysis for Broyden quasi-Newton algorithms under the assumptions that the objective function is strongly convex, its gradient is Lipschitz continuous, and its Hessian is Lipschitz continuous at the optimal solution.
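For concreteness, a minimal numpy sketch of the BFGS member of the Broyden class (variable names ours):

```python
import numpy as np

def bfgs_update(B, s, y):
    """One BFGS update of the Hessian approximation B, given the iterate
    displacement s = x_new - x_old and the gradient displacement
    y = grad(x_new) - grad(x_old). It preserves symmetric positive
    definiteness whenever the curvature condition s @ y > 0 holds,
    which is guaranteed for strongly convex objectives."""
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)

# The quasi-Newton step is then x_new = x - np.linalg.solve(B, grad(x)).
```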
no code implementations • ICML 2020 • Hossein Taheri, Aryan Mokhtari, Hamed Hassani, Ramtin Pedarsani
We consider a decentralized stochastic learning problem where data points are distributed among computing nodes communicating over a directed graph.
1 code implementation • NeurIPS 2021 • Alireza Fallah, Kristian Georgiev, Aryan Mokhtari, Asuman Ozdaglar
We consider Model-Agnostic Meta-Learning (MAML) methods for Reinforcement Learning (RL) problems, where the goal is to find a policy using data from several tasks represented by Markov Decision Processes (MDPs) that can be updated by one step of stochastic policy gradient for the realized MDP.
no code implementations • NeurIPS 2020 • Liam Collins, Aryan Mokhtari, Sanjay Shakkottai
Meta-learning methods have shown an impressive ability to train models that rapidly learn new tasks.
no code implementations • NeurIPS 2019 • Amin Karbasi, Hamed Hassani, Aryan Mokhtari, Zebang Shen
Concretely, for a monotone and continuous DR-submodular function, SCG++ achieves a tight $[(1-1/e)\text{OPT}-\epsilon]$ solution while using $O(1/\epsilon^2)$ stochastic gradients and $O(1/\epsilon)$ calls to the linear optimization oracle.
no code implementations • 31 Oct 2019 • Weijie Liu, Aryan Mokhtari, Asuman Ozdaglar, Sarath Pattathil, Zebang Shen, Nenggan Zheng
In this paper, we focus on solving a class of constrained non-convex non-concave saddle point problems in a decentralized manner by a group of nodes in a network.
no code implementations • 10 Oct 2019 • Mingrui Zhang, Zebang Shen, Aryan Mokhtari, Hamed Hassani, Amin Karbasi
One of the beauties of the projected gradient descent method lies in its rather simple mechanism and yet stable behavior with inexact, stochastic gradients, which has led to its widespread use in many machine learning applications.
no code implementations • 28 Sep 2019 • Amirhossein Reisizadeh, Aryan Mokhtari, Hamed Hassani, Ali Jadbabaie, Ramtin Pedarsani
Federated learning is a distributed framework according to which a model is trained over a set of devices, while keeping data localized.
no code implementations • 27 Aug 2019 • Alireza Fallah, Aryan Mokhtari, Asuman Ozdaglar
We study the convergence of a class of gradient-based Model-Agnostic Meta-Learning (MAML) methods and characterize their overall complexity as well as their best achievable accuracy in terms of gradient norm for nonconvex loss functions.
1 code implementation • NeurIPS 2019 • Amirhossein Reisizadeh, Hossein Taheri, Aryan Mokhtari, Hamed Hassani, Ramtin Pedarsani
We consider a decentralized learning problem, where a set of computing nodes aim at solving a non-convex optimization problem collaboratively.
no code implementations • 3 Jun 2019 • Aryan Mokhtari, Asuman Ozdaglar, Sarath Pattathil
To do so, we first show that both OGDA and EG can be interpreted as approximate variants of the proximal point method.
no code implementations • 19 Feb 2019 • Hamed Hassani, Amin Karbasi, Aryan Mokhtari, Zebang Shen
It is known that this rate is optimal in terms of stochastic gradient evaluations.
no code implementations • 17 Feb 2019 • Mingrui Zhang, Lin Chen, Aryan Mokhtari, Hamed Hassani, Amin Karbasi
How can we efficiently mitigate the overhead of gradient communications in distributed optimization?
no code implementations • 24 Jan 2019 • Aryan Mokhtari, Asuman Ozdaglar, Sarath Pattathil
In this paper we consider solving saddle point problems using two variants of Gradient Descent-Ascent algorithms, Extra-gradient (EG) and Optimistic Gradient Descent Ascent (OGDA) methods.
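A small numpy sketch of EG on a bilinear toy problem (step size and iteration count illustrative): EG evaluates gradients at an extrapolated midpoint, which is what stabilizes it where plain gradient descent-ascent cycles.

```python
import numpy as np

def extragradient(gx, gy, x, y, eta=0.3, steps=300):
    """Extra-gradient (EG) for min_x max_y f(x, y): step to a midpoint,
    then update using the gradients evaluated at that midpoint."""
    for _ in range(steps):
        x_mid, y_mid = x - eta * gx(x, y), y + eta * gy(x, y)
        x, y = x - eta * gx(x_mid, y_mid), y + eta * gy(x_mid, y_mid)
    return x, y

# Bilinear saddle f(x, y) = x * y: plain descent-ascent spirals outward,
# while EG converges to the saddle point (0, 0).
gx = lambda x, y: y  # partial derivative of f in x
gy = lambda x, y: x  # partial derivative of f in y
print(extragradient(gx, gy, 1.0, 1.0))
```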
no code implementations • 26 Oct 2018 • Majid Jahani, Xi He, Chenxin Ma, Aryan Mokhtari, Dheevatsa Mudigere, Alejandro Ribeiro, Martin Takáč
In this paper, we propose a Distributed Accumulated Newton Conjugate gradiEnt (DANCE) method in which the sample size gradually increases to quickly obtain a solution whose empirical loss is under satisfactory statistical accuracy.
no code implementations • NeurIPS 2018 • Aryan Mokhtari, Asuman Ozdaglar, Ali Jadbabaie
We propose a generic framework that yields convergence to a second-order stationary point of the problem, if the convex set $\mathcal{C}$ is simple for a quadratic objective function.
no code implementations • 29 Jun 2018 • Amirhossein Reisizadeh, Aryan Mokhtari, Hamed Hassani, Ramtin Pedarsani
We consider the problem of decentralized consensus optimization, where the sum of $n$ smooth and strongly convex functions is minimized over $n$ distributed agents that form a connected network.
no code implementations • ICML 2018 • Zebang Shen, Aryan Mokhtari, Tengfei Zhou, Peilin Zhao, Hui Qian
Recently, the decentralized optimization problem has been attracting growing attention.
no code implementations • NeurIPS 2018 • Jingzhao Zhang, Aryan Mokhtari, Suvrit Sra, Ali Jadbabaie
We study gradient-based optimization methods obtained by directly discretizing a second-order ordinary differential equation (ODE) related to the continuous limit of Nesterov's accelerated gradient method.
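The continuous limit in question is commonly written (following Su, Boyd, and Candès) as

$$\ddot{X}(t) + \frac{3}{t}\,\dot{X}(t) + \nabla f\big(X(t)\big) = 0,$$

and the question is when a direct discretization of such second-order dynamics retains acceleration.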
no code implementations • 24 Apr 2018 • Aryan Mokhtari, Hamed Hassani, Amin Karbasi
Further, for a monotone and continuous DR-submodular function and subject to a general convex body constraint, we prove that our proposed method achieves a $((1-1/e)\text{OPT}-\epsilon)$ guarantee with $O(1/\epsilon^3)$ stochastic gradient computations.
no code implementations • 5 Nov 2017 • Aryan Mokhtari, Hamed Hassani, Amin Karbasi
More precisely, for a monotone and continuous DR-submodular function and subject to a general convex body constraint, we prove that the proposed stochastic continuous greedy (SCG) method achieves a $[(1-1/e)\text{OPT}-\epsilon]$ guarantee (in expectation) with $\mathcal{O}(1/\epsilon^3)$ stochastic gradient computations.
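A compact sketch of a stochastic continuous greedy loop of this type (the averaging weight is one common choice from this line of work; `stoch_grad` and `lmo` are assumed oracles):

```python
import numpy as np

def stochastic_continuous_greedy(stoch_grad, lmo, d, T=100):
    """Average stochastic gradients to damp noise, then move along
    linear-optimization (conditional-gradient) directions."""
    x, g = np.zeros(d), np.zeros(d)
    for t in range(1, T + 1):
        rho = 4.0 / (t + 8) ** (2.0 / 3)   # decaying averaging weight
        g = (1 - rho) * g + rho * stoch_grad(x)
        v = lmo(g)                          # argmax_{v in C} <v, g>
        x += v / T                          # after T steps x is feasible
    return x

# Example LMO for the box constraint C = [0, 1]^d:
# lmo = lambda g: (g > 0).astype(float)
```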
no code implementations • NeurIPS 2017 • Aryan Mokhtari, Alejandro Ribeiro
Theoretical analyses show that the use of adaptive sample size methods reduces the overall computational cost of achieving the statistical accuracy of the whole dataset for a broad range of deterministic and stochastic first-order methods.
no code implementations • 22 May 2017 • Mark Eisen, Aryan Mokhtari, Alejandro Ribeiro
In this paper, we propose a novel adaptive sample size second-order method, which reduces the cost of computing the Hessian by solving a sequence of ERM problems corresponding to a subset of samples and lowers the cost of computing the Hessian inverse using a truncated eigenvalue decomposition.
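Schematically, with `solve_erm` a hypothetical solver callback, an adaptive sample size scheme grows the training set geometrically and warm-starts each stage:

```python
def adaptive_sample_size(solve_erm, n, m0=128, growth=2.0):
    """Solve ERM on a small subsample to within its statistical accuracy,
    then warm-start on a geometrically larger subsample, up to n samples."""
    w, m = None, float(m0)
    while True:
        m_cur = min(int(m), n)
        w = solve_erm(m_cur, warm_start=w)  # e.g., truncated Newton steps
        if m_cur == n:
            return w
        m *= growth
```

The idea driving this line of work is that each stage's solution already lies within the statistical accuracy of the next, larger stage, so only a few (second-order) iterations are needed per stage.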
no code implementations • 2 Feb 2017 • Aryan Mokhtari, Mark Eisen, Alejandro Ribeiro
This makes their computational cost per iteration independent of the number of objective functions $n$.
no code implementations • 1 Nov 2016 • Aryan Mokhtari, Mert Gürbüzbalaban, Alejandro Ribeiro
We prove not only that the proposed DIAG method converges linearly to the optimal solution, but also that its linear convergence factor justifies the advantage of incremental methods over GD.
no code implementations • 7 Oct 2016 • Tianyi Chen, Aryan Mokhtari, Xin Wang, Alejandro Ribeiro, Georgios B. Giannakis
Existing approaches to resource allocation in today's stochastic networks are challenged to meet fast convergence and tolerable delay requirements.
no code implementations • 15 Jun 2016 • Aryan Mokhtari, Alec Koppel, Alejandro Ribeiro
Algorithms that are parallel in either of these dimensions exist, but RAPSA is the first attempt at a methodology that is parallel in both the selection of blocks and the selection of elements of the training set.
no code implementations • NeurIPS 2016 • Aryan Mokhtari, Alejandro Ribeiro
We consider empirical risk minimization for large-scale datasets.
no code implementations • 23 Mar 2016 • Mark Eisen, Aryan Mokhtari, Alejandro Ribeiro
The resulting dual D-BFGS method is a fully decentralized algorithm in which nodes approximate curvature information of themselves and their neighbors through the satisfaction of a secant condition.
no code implementations • 16 Mar 2016 • Aryan Mokhtari, Shahin Shahrampour, Ali Jadbabaie, Alejandro Ribeiro
In this paper, we address tracking of a time-varying parameter with unknown dynamics.
no code implementations • 13 Jun 2015 • Aryan Mokhtari, Alejandro Ribeiro
The decentralized double stochastic averaging gradient (DSA) algorithm is proposed as an alternative solution that relies on the use of local stochastic averaging gradients.
no code implementations • 6 Sep 2014 • Aryan Mokhtari, Alejandro Ribeiro
Global convergence of an online (stochastic) limited-memory version of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) quasi-Newton method is established for solving optimization problems with stochastic objectives that arise in large-scale machine learning.
no code implementations • 20 Feb 2014 • Aryan Mokhtari, Alejandro Ribeiro
This paper adapts a recently developed regularized stochastic version of the Broyden, Fletcher, Goldfarb, and Shanno (BFGS) quasi-Newton method for the solution of support vector machine classification problems.
no code implementations • 29 Jan 2014 • Aryan Mokhtari, Alejandro Ribeiro
Numerical experiments showcase reductions in convergence time relative to stochastic gradient descent algorithms and non-regularized stochastic versions of BFGS.