Search Results for author: Kaiyi Ji

Found 34 papers, 8 papers with code

Fair Resource Allocation in Multi-Task Learning

1 code implementation 23 Feb 2024 Hao Ban, Kaiyi Ji

Inspired by fair resource allocation in communication networks, we formulate the optimization of MTL as a utility maximization problem, where the loss decreases across tasks are maximized under different fairness measurements.
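
For context, one canonical family of fairness measures from network resource allocation is the $\alpha$-fair utility; a hedged sketch of how it would aggregate per-task loss decreases $d_i$ (the paper's exact fairness measurements may differ):

$$\max_{d}\ \sum_{i=1}^{K} u_\alpha(d_i), \qquad u_\alpha(z) = \frac{z^{1-\alpha}}{1-\alpha}\ \ (\alpha \ge 0,\ \alpha \ne 1), \qquad u_1(z) = \log z,$$

where $\alpha = 0$ recovers the total loss decrease and larger $\alpha$ enforces more uniform progress across tasks.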

Fairness · Multi-Task Learning

Discriminative Adversarial Unlearning

no code implementations 10 Feb 2024 Rohan Sharma, Shijie Zhou, Kaiyi Ji, Changyou Chen

We consider the scenario of two networks, the attacker $\mathbf{A}$ and the trained defender $\mathbf{D}$ pitted against each other in an adversarial objective, wherein the attacker aims at teasing out the information of the data to be unlearned in order to infer membership, and the defender unlearns to defend the network against the attack, whilst preserving its general performance.

Machine Unlearning · Network Pruning

Achieving $\mathcal{O}(\epsilon^{-1.5})$ Complexity in Hessian/Jacobian-free Stochastic Bilevel Optimization

no code implementations 6 Dec 2023 Yifan Yang, Peiyao Xiao, Kaiyi Ji

In this paper, we revisit the bilevel optimization problem, in which the upper-level objective function is generally nonconvex and the lower-level objective function is strongly convex.
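
For reference, this nested problem is usually written in the standard form

$$\min_{x}\ \Phi(x) := f\big(x, y^*(x)\big) \quad \text{s.t.} \quad y^*(x) = \arg\min_{y}\ g(x, y),$$

where the upper-level objective $f$ is generally nonconvex in $x$ and $g(x, \cdot)$ is strongly convex, so the lower-level solution $y^*(x)$ is unique.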

Bilevel Optimization

SimFBO: Towards Simple, Flexible and Communication-efficient Federated Bilevel Learning

no code implementations NeurIPS 2023 Yifan Yang, Peiyao Xiao, Kaiyi Ji

Federated bilevel optimization (FBO) has shown great potential recently in machine learning and edge computing due to the emerging nested optimization structure in meta-learning, fine-tuning, hyperparameter tuning, etc.

Bilevel Optimization · Edge-computing +1

Direction-oriented Multi-objective Learning: Simple and Provable Stochastic Algorithms

1 code implementation NeurIPS 2023 Peiyao Xiao, Hao Ban, Kaiyi Ji

In this paper, we propose a new direction-oriented multi-objective problem by regularizing the common descent direction within a neighborhood of a direction that optimizes a linear combination of objectives such as the average loss in MTL.
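
One way to read this formulation (an illustrative sketch, not necessarily the paper's exact constraint set): pick a common update direction $d$ that maximizes the worst-case descent over the $K$ objectives while staying in a neighborhood of a target direction $d_0$, such as the average-gradient direction:

$$\max_{d:\ \|d - d_0\| \le r}\ \ \min_{1 \le i \le K}\ \langle g_i, d \rangle,$$

where $g_i$ denotes the gradient of objective $i$ and $r$ controls how far the common direction may deviate from $d_0$.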

Multi-Task Learning

Achieving Linear Speedup in Non-IID Federated Bilevel Learning

no code implementations 10 Feb 2023 Minhui Huang, Dewei Zhang, Kaiyi Ji

However, several important properties in federated learning, such as partial client participation and linear speedup for convergence (i.e., the convergence rate and complexity improve linearly with respect to the number of sampled clients) in the presence of non-i.i.d. datasets, still remain open.

Bilevel Optimization · Federated Learning

Communication-Efficient Federated Hypergradient Computation via Aggregated Iterative Differentiation

no code implementations 9 Feb 2023 Peiyao Xiao, Kaiyi Ji

Federated bilevel optimization has attracted increasing attention due to emerging machine learning and communication applications.

Bilevel Optimization

Network Utility Maximization with Unknown Utility Functions: A Distributed, Data-Driven Bilevel Optimization Approach

no code implementations 4 Jan 2023 Kaiyi Ji, Lei Ying

In this paper, we provide a new solution using a distributed and data-driven bilevel optimization approach, where the lower level is a distributed network utility maximization (NUM) algorithm with concave surrogate utility functions, and the upper level is a data-driven learning algorithm to find the best surrogate utility functions that maximize the sum of true network utility.
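
Schematically (the notation here is assumed, not taken from the paper), the lower level runs distributed NUM with concave surrogate utilities $\hat U_i(\,\cdot\,;\theta)$ while the upper level learns the surrogate parameters $\theta$ from data:

$$\max_{\theta}\ \sum_i U_i\big(x_i^*(\theta)\big) \quad \text{s.t.} \quad x^*(\theta) = \arg\max_{x \in \mathcal{C}}\ \sum_i \hat U_i(x_i; \theta),$$

where $U_i$ are the true (unknown) utilities and $\mathcal{C}$ is the network's capacity region.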

Bilevel Optimization

Will Bilevel Optimizers Benefit from Loops

no code implementations 27 May 2022 Kaiyi Ji, Mingrui Liu, Yingbin Liang, Lei Ying

Existing studies in the literature cover only some of those implementation choices, and the complexity bounds available are not refined enough to enable rigorous comparison among different implementations.

Bilevel Optimization · Computational Efficiency

Data Sampling Affects the Complexity of Online SGD over Dependent Data

no code implementations 31 Mar 2022 Shaocong Ma, Ziyi Chen, Yi Zhou, Kaiyi Ji, Yingbin Liang

Moreover, we show that online SGD with mini-batch sampling can further substantially improve the sample complexity over online SGD with periodic data-subsampling over highly dependent data.

Stochastic Optimization

A Primal-Dual Approach to Bilevel Optimization with Multiple Inner Minima

no code implementations 1 Mar 2022 Daouda Sow, Kaiyi Ji, Ziwei Guan, Yingbin Liang

Existing algorithms designed for such a problem were applicable to restricted situations and do not come with a full guarantee of convergence.

Bilevel Optimization · Hyperparameter Optimization +2

Efficiently Escaping Saddle Points in Bilevel Optimization

no code implementations 8 Feb 2022 Minhui Huang, Xuxing Chen, Kaiyi Ji, Shiqian Ma, Lifeng Lai

Moreover, we propose an inexact NEgative-curvature-Originated-from-Noise Algorithm (iNEON), a pure first-order algorithm that can escape saddle points and find a local minimum of stochastic bilevel optimization.

Bilevel Optimization

How to Improve Sample Complexity of SGD over Highly Dependent Data?

no code implementations 29 Sep 2021 Shaocong Ma, Ziyi Chen, Yi Zhou, Kaiyi Ji, Yingbin Liang

Specifically, with a $\phi$-mixing model that captures both exponential and polynomial decay of the data dependence over time, we show that SGD with periodic data-subsampling achieves an improved sample complexity over the standard SGD in the full spectrum of the $\phi$-mixing data dependence.
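
As a rough illustration of the periodic data-subsampling idea (the AR(1) stream, quadratic loss, and step sizes below are assumptions for the sketch, not the paper's setting), SGD can be run only on every k-th sample of a dependent stream so that consecutive updates use weakly dependent data:

```python
import numpy as np

def dependent_stream(n, rho=0.95, seed=0):
    """Yield an AR(1) sequence: samples are correlated over time."""
    rng = np.random.default_rng(seed)
    z = 0.0
    for _ in range(n):
        z = rho * z + np.sqrt(1.0 - rho**2) * rng.standard_normal()
        yield z

def sgd_periodic_subsampling(stream, k=20, lr=0.1):
    """Estimate the stream mean with SGD, using only every k-th sample."""
    theta = 0.0
    for t, x in enumerate(stream):
        if t % k != 0:
            continue                    # skip samples to weaken temporal dependence
        grad = theta - x                # gradient of 0.5 * (theta - x)**2
        theta -= lr * grad
    return theta

# Example: theta_hat = sgd_periodic_subsampling(dependent_stream(100_000), k=20)
```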

Stochastic Optimization

ES-Based Jacobian Enables Faster Bilevel Optimization

no code implementations 29 Sep 2021 Daouda Sow, Kaiyi Ji, Yingbin Liang

Bilevel optimization (BO) has arisen as a powerful tool for solving many modern machine learning problems.

Bilevel Optimization · Meta-Learning

Bilevel Optimization for Machine Learning: Algorithm Design and Convergence Analysis

no code implementations 31 Jul 2021 Kaiyi Ji

For the problem-based formulation, we provide a convergence rate analysis for AID- and ITD-based bilevel algorithms.
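
For reference, AID-based methods estimate the hypergradient via the implicit function theorem (the standard form when the lower-level problem $y^*(x) = \arg\min_y g(x,y)$ is strongly convex):

$$\nabla \Phi(x) = \nabla_x f\big(x, y^*(x)\big) - \nabla^2_{xy} g\big(x, y^*(x)\big)\,\big[\nabla^2_{yy} g\big(x, y^*(x)\big)\big]^{-1}\,\nabla_y f\big(x, y^*(x)\big),$$

whereas ITD-based methods instead differentiate through the unrolled inner-loop iterates that approximate $y^*(x)$.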

BIG-bench Machine Learning · Bilevel Optimization +2

Provably Faster Algorithms for Bilevel Optimization

1 code implementation NeurIPS 2021 Junjie Yang, Kaiyi Ji, Yingbin Liang

Bilevel optimization has been widely applied in many important machine learning applications such as hyperparameter optimization and meta-learning.

Bilevel Optimization · Hyperparameter Optimization +1

Lower Bounds and Accelerated Algorithms for Bilevel Optimization

no code implementations 7 Feb 2021 Kaiyi Ji, Yingbin Liang

Bilevel optimization has recently attracted growing interests due to its wide applications in modern machine learning problems.

Bilevel Optimization

Bilevel Optimization: Convergence Analysis and Enhanced Design

2 code implementations 15 Oct 2020 Kaiyi Ji, Junjie Yang, Yingbin Liang

For the AID-based method, we orderwisely improve the previous convergence rate analysis due to a more practical parameter selection as well as a warm start strategy, and for the ITD-based method we establish the first theoretical convergence rate.

Bilevel Optimization · Hyperparameter Optimization +1

Boosting One-Point Derivative-Free Online Optimization via Residual Feedback

no code implementations 14 Oct 2020 Yan Zhang, Yi Zhou, Kaiyi Ji, Michael M. Zavlanos

As a result, our regret bounds are much tighter compared to existing regret bounds for ZO with conventional one-point feedback, which suggests that ZO with residual feedback can better track the optimizer of online optimization problems.

Provably Faster Algorithms for Bilevel Optimization and Applications to Meta-Learning

no code implementations 28 Sep 2020 Kaiyi Ji, Junjie Yang, Yingbin Liang

For the AID-based method, we orderwisely improve the previous finite-time convergence analysis due to a more practical parameter selection as well as a warm start strategy, and for the ITD-based method we establish the first theoretical convergence rate.

Bilevel Optimization · Hyperparameter Optimization +1

A New One-Point Residual-Feedback Oracle For Black-Box Learning and Control

no code implementations 18 Jun 2020 Yan Zhang, Yi Zhou, Kaiyi Ji, Michael M. Zavlanos

When optimizing a deterministic Lipschitz function, we show that the query complexity of ZO with the proposed one-point residual feedback matches that of ZO with the existing two-point schemes.
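
As a hedged sketch of the one-point residual-feedback idea (the objective, step sizes, and smoothing radius below are illustrative assumptions): each iteration queries the black-box function once at a randomly perturbed point and reuses the previous query as the baseline in the gradient estimate.

```python
import numpy as np

def zo_residual_feedback(f, x0, steps=1000, lr=1e-2, delta=1e-2, seed=0):
    """Zeroth-order optimization with a one-point residual-feedback estimator."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size
    f_prev = f(x + delta * rng.standard_normal(d))   # first (warm-up) query
    for _ in range(steps):
        u = rng.standard_normal(d)
        f_curr = f(x + delta * u)                    # one new query per iteration
        grad_est = (f_curr - f_prev) / delta * u     # residual-feedback estimate
        x = x - lr * grad_est
        f_prev = f_curr                              # reuse this query next round
    return x

# Example on a simple quadratic:
# x_star = zo_residual_feedback(lambda z: float(np.sum(z**2)), x0=np.ones(5))
```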

Convergence of Meta-Learning with Task-Specific Adaptation over Partial Parameters

no code implementations NeurIPS 2020 Kaiyi Ji, Jason D. Lee, Yingbin Liang, H. Vincent Poor

Although model-agnostic meta-learning (MAML) is a very successful algorithm in meta-learning practice, it can have high computational cost because it updates all model parameters over both the inner loop of task-specific adaptation and the outer loop of meta-initialization training.
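
A minimal sketch of the partial-parameter adaptation idea studied here (ANIL-style), under illustrative assumptions: a linear model whose task head w is adapted in the inner loop while the shared representation Phi is meta-updated in the outer loop, using a first-order approximation that does not differentiate through the inner update.

```python
import numpy as np

def make_task(rng, Phi_star, n=32, noise=0.1):
    """Sample a regression task that shares the representation Phi_star."""
    d, h = Phi_star.shape
    w_true = rng.standard_normal(h)
    def sample():
        X = rng.standard_normal((n, d))
        return X, X @ Phi_star @ w_true + noise * rng.standard_normal(n)
    return sample

def partial_maml(d=10, h=5, meta_steps=500, alpha=0.1, beta=0.01, seed=0):
    rng = np.random.default_rng(seed)
    Phi_star = rng.standard_normal((d, h)) / np.sqrt(d)   # ground-truth representation
    Phi = rng.standard_normal((d, h)) / np.sqrt(d)        # shared parameters (meta-trained)
    w0 = np.zeros(h)                                      # head initialization
    for _ in range(meta_steps):
        sample = make_task(rng, Phi_star)
        # inner loop: adapt only the head w on the support split
        Xs, ys = sample()
        w_task = w0 - alpha * Phi.T @ Xs.T @ (Xs @ Phi @ w0 - ys) / len(ys)
        # outer loop: update the shared representation Phi on the query split
        Xq, yq = sample()
        resid = Xq @ Phi @ w_task - yq
        Phi -= beta * np.outer(Xq.T @ resid, w_task) / len(yq)
    return Phi
```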

Meta-Learning

Proximal Gradient Algorithm with Momentum and Flexible Parameter Restart for Nonconvex Optimization

no code implementations 26 Feb 2020 Yi Zhou, Zhe Wang, Kaiyi Ji, Yingbin Liang, Vahid Tarokh

Our APG-restart is designed to 1) allow for adopting flexible parameter restart schemes that cover many existing ones; 2) have a global sub-linear convergence rate in nonconvex and nonsmooth optimization; and 3) have guaranteed convergence to a critical point and have various types of asymptotic convergence rates depending on the parameterization of local geometry in nonconvex and nonsmooth optimization.

Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning

2 code implementations 18 Feb 2020 Kaiyi Ji, Junjie Yang, Yingbin Liang

As a popular meta-learning approach, the model-agnostic meta-learning (MAML) algorithm has been widely used due to its simplicity and effectiveness.

Meta-Learning

Robust Stochastic Bandit Algorithms under Probabilistic Unbounded Adversarial Attack

no code implementations 17 Feb 2020 Ziwei Guan, Kaiyi Ji, Donald J Bucci Jr, Timothy Y Hu, Joseph Palombo, Michael Liston, Yingbin Liang

This paper investigates the attack model where an adversary attacks with a certain probability at each round, and its attack value can be arbitrary and unbounded if it attacks.

Adversarial Attack

SpiderBoost and Momentum: Faster Variance Reduction Algorithms

no code implementations NeurIPS 2019 Zhe Wang, Kaiyi Ji, Yi Zhou, Yingbin Liang, Vahid Tarokh

SARAH and SPIDER are two recently developed stochastic variance-reduced algorithms, and SPIDER has been shown to achieve a near-optimal first-order oracle complexity in smooth nonconvex optimization.
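
For context, the SPIDER-style recursive variance-reduced estimator (a standard form; SpiderBoost's exact step size and epoch schedule may differ) maintains

$$v_t = \nabla f_{\mathcal{S}_t}(x_t) - \nabla f_{\mathcal{S}_t}(x_{t-1}) + v_{t-1}, \qquad x_{t+1} = x_t - \eta\, v_t,$$

with $v_t$ reset to a large-batch (or full) gradient estimate every $q$ iterations.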

Improved Zeroth-Order Variance Reduced Algorithms and Analysis for Nonconvex Optimization

no code implementations 27 Oct 2019 Kaiyi Ji, Zhe Wang, Yi Zhou, Yingbin Liang

Two types of zeroth-order stochastic algorithms have recently been designed for nonconvex optimization respectively based on the first-order techniques SVRG and SARAH/SPIDER.

History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms

no code implementations ICML 2020 Kaiyi Ji, Zhe Wang, Bowen Weng, Yi Zhou, Wei Zhang, Yingbin Liang

In this paper, we propose a novel scheme, which eliminates backtracking line search but still exploits the information along optimization path by adapting the batch size via history stochastic gradients.

Momentum Schemes with Stochastic Variance Reduction for Nonconvex Composite Optimization

no code implementations 7 Feb 2019 Yi Zhou, Zhe Wang, Kaiyi Ji, Yingbin Liang, Vahid Tarokh

In this paper, we develop novel momentum schemes with flexible coefficient settings to accelerate SPIDER for nonconvex and nonsmooth composite optimization, and show that the resulting algorithms achieve the near-optimal gradient oracle complexity for achieving a generalized first-order stationary condition.

Minimax Estimation of Neural Net Distance

no code implementations NeurIPS 2018 Kaiyi Ji, Yingbin Liang

An important class of distance metrics proposed for training generative adversarial networks (GANs) is the integral probability metric (IPM), in which the neural net distance captures the practical GAN training via two neural networks.
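
For reference, an integral probability metric between distributions $\mu$ and $\nu$ over a function class $\mathcal{F}$ is

$$d_{\mathcal{F}}(\mu, \nu) = \sup_{f \in \mathcal{F}}\ \big|\ \mathbb{E}_{X \sim \mu}[f(X)] - \mathbb{E}_{Y \sim \nu}[f(Y)]\ \big|,$$

and the neural net distance takes $\mathcal{F}$ to be a class of neural networks, e.g., the discriminator class in GAN training.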

SpiderBoost and Momentum: Faster Stochastic Variance Reduction Algorithms

1 code implementation 25 Oct 2018 Zhe Wang, Kaiyi Ji, Yi Zhou, Yingbin Liang, Vahid Tarokh

SARAH and SPIDER are two recently developed stochastic variance-reduced algorithms, and SPIDER has been shown to achieve a near-optimal first-order oracle complexity in smooth nonconvex optimization.

When Will Gradient Methods Converge to Max-margin Classifier under ReLU Models?

1 code implementation ICLR 2019 Tengyu Xu, Yi Zhou, Kaiyi Ji, Yingbin Liang

We study the implicit bias of gradient descent methods in solving a binary classification problem over a linearly separable dataset.
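
For context, the max-margin (hard-margin SVM) classifier referenced here is defined, for a linearly separable dataset $\{(x_i, y_i)\}$ with $y_i \in \{\pm 1\}$ (stated here for a linear classifier; the paper studies ReLU models), by

$$w^\star = \arg\min_{w}\ \|w\|_2^2 \quad \text{s.t.} \quad y_i\, w^\top x_i \ge 1 \ \ \text{for all } i,$$

and the implicit-bias question is whether gradient descent on an unregularized surrogate loss converges in direction to $w^\star / \|w^\star\|$.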

Binary Classification
