1 code implementation • 12 Feb 2025 • Peiyao Xiao, Chaosheng Dong, Shaofeng Zou, Kaiyi Ji
Multi-task learning (MTL) has been widely adopted for its ability to simultaneously learn multiple tasks.
no code implementations • 4 Feb 2025 • Jianan Nie, Peiyao Xiao, Kaiyi Ji, Peng Gao
We introduce Reciprocal Geometry Network (ReGNet), a novel architecture that integrates geometric GNNs and reciprocal blocks to model short-range and long-range interactions, respectively.
no code implementations • 15 Nov 2024 • Shijie Zhou, Huaisheng Zhu, Rohan Sharma, Ruiyi Zhang, Kaiyi Ji, Changyou Chen
Diffusion models have emerged as a powerful foundation model for visual generation.
no code implementations • 24 Oct 2024 • Zhaofeng Si, Shu Hu, Kaiyi Ji, Siwei Lyu
Meta-learning is a general approach to equip machine learning models with the ability to handle few-shot scenarios when dealing with many tasks.
1 code implementation • 7 Oct 2024 • Yifan Yang, Hao Ban, Minhui Huang, Shiqian Ma, Kaiyi Ji
To the best of our knowledge, our methods are the first to completely eliminate the need for stepsize tuning, while achieving theoretical guarantees.
no code implementations • 4 Oct 2024 • Meng Ding, Jinhui Xu, Kaiyi Ji
This analysis reveals that naive FT methods struggle with forgetting because the pretrained model retains information about the forgetting data, and the fine-tuning process has no impact on this retained information.
no code implementations • 23 Jun 2024 • Chen Wang, Kaiyi Ji, Junyi Geng, Zhongqiang Ren, Taimeng Fu, Fan Yang, Yifan Guo, Haonan He, Xiangyu Chen, Zitong Zhan, Qiwei Du, Shaoshu Su, Bowen Li, Yuheng Qiu, Yi Du, Qihang Li, Yifan Yang, Xiao Lin, Zhipeng Zhao
The framework of IL consists of three primary components: a neural module, a reasoning engine, and a memory system.
no code implementations • 29 May 2024 • Qi Zhang, Peiyao Xiao, Shaofeng Zou, Kaiyi Ji
We provide a comprehensive convergence analysis of these algorithms and show that they converge to an $\epsilon$-accurate Pareto stationary point with a guaranteed $\epsilon$-level average CA distance (i.e., the gap between the updating direction and the CA direction) over all iterations, requiring a total of $\mathcal{O}(\epsilon^{-2})$ and $\mathcal{O}(\epsilon^{-4})$ samples in the deterministic and stochastic settings, respectively.
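As a rough illustration (the notation here is assumed, not taken from the paper), the average CA distance over $T$ iterations can be written as

$$\frac{1}{T}\sum_{t=1}^{T}\big\|d_t - d_t^{\mathrm{CA}}\big\|,$$

where $d_t$ is the update direction actually used at iteration $t$ and $d_t^{\mathrm{CA}}$ is the conflict-avoidant direction at that iterate.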
no code implementations • 27 May 2024 • Meng Ding, Kaiyi Ji, Di Wang, Jinhui Xu
In this paper, we provide a general theoretical analysis of forgetting in the linear regression model via Stochastic Gradient Descent (SGD) applicable to both underparameterized and overparameterized regimes.
no code implementations • 25 May 2024 • Yudan Wang, Peiyao Xiao, Hao Ban, Kaiyi Ji, Shaofeng Zou
However, these methods often suffer from gradient conflict, in which tasks with larger gradients dominate the update direction and degrade the performance of the other tasks.
1 code implementation • 23 Feb 2024 • Hao Ban, Kaiyi Ji
Inspired by fair resource allocation in communication networks, we formulate the optimization of MTL as a utility maximization problem, in which the loss decreases across tasks are maximized under different fairness measures.
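As a hedged sketch of such a formulation (the exact objective and constraints in the paper may differ), one can maximize an $\alpha$-fair utility of the per-task loss decreases:

$$\max_{d}\ \sum_{i=1}^{K} u_\alpha\big(g_i^\top d\big), \qquad u_\alpha(x) = \begin{cases} \dfrac{x^{1-\alpha}}{1-\alpha}, & \alpha \ge 0,\ \alpha \neq 1,\\ \log x, & \alpha = 1, \end{cases}$$

where $g_i$ is the gradient of task $i$ and $g_i^\top d$ approximates its loss decrease along the update direction $d$; different choices of $\alpha$ correspond to different fairness measures.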
no code implementations • 10 Feb 2024 • Rohan Sharma, Shijie Zhou, Kaiyi Ji, Changyou Chen
We consider two networks, the attacker $\mathbf{A}$ and the trained defender $\mathbf{D}$, pitted against each other in an adversarial objective: the attacker aims to extract information about the data to be unlearned in order to infer membership, while the defender unlearns so as to defend the network against the attack while preserving its general performance.
no code implementations • 6 Dec 2023 • Yifan Yang, Peiyao Xiao, Kaiyi Ji
In this paper, we revisit the bilevel optimization problem, in which the upper-level objective function is generally nonconvex and the lower-level objective function is strongly convex.
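For reference, the standard bilevel problem of this type can be written as

$$\min_{x}\ \Phi(x) := f\big(x, y^*(x)\big) \quad \text{s.t.} \quad y^*(x) = \arg\min_{y}\ g(x, y),$$

where the upper-level objective $f$ is generally nonconvex in $x$ and the lower-level objective $g(x, \cdot)$ is strongly convex in $y$.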
no code implementations • NeurIPS 2023 • Yifan Yang, Peiyao Xiao, Kaiyi Ji
Federated bilevel optimization (FBO) has shown great potential recently in machine learning and edge computing due to the emerging nested optimization structure in meta-learning, fine-tuning, hyperparameter tuning, etc.
1 code implementation • NeurIPS 2023 • Peiyao Xiao, Hao Ban, Kaiyi Ji
In this paper, we propose a new direction-oriented multi-objective problem by regularizing the common descent direction within a neighborhood of a direction that optimizes a linear combination of objectives such as the average loss in MTL.
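A hedged sketch of this idea (the precise formulation in the paper may differ) is to seek a common descent direction $d$ constrained to stay close to a target direction $d_0$, e.g., the gradient of the average loss:

$$\max_{d:\ \|d - d_0\| \le r}\ \min_{1 \le i \le K}\ g_i^\top d,$$

where $g_i$ is the gradient of objective $i$ and $r$ controls the size of the neighborhood around $d_0$.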
no code implementations • 10 Feb 2023 • Minhui Huang, Dewei Zhang, Kaiyi Ji
However, several important properties of federated learning, such as partial client participation and the linear speedup for convergence (i.e., the convergence rate and complexity improve linearly with the number of sampled clients) in the presence of non-i.i.d. datasets, still remain open.
no code implementations • 9 Feb 2023 • Peiyao Xiao, Kaiyi Ji
Federated bilevel optimization has attracted increasing attention due to emerging machine learning and communication applications.
no code implementations • 4 Jan 2023 • Kaiyi Ji, Lei Ying
In this paper, we provide a new solution using a distributed and data-driven bilevel optimization approach, where the lower level is a distributed network utility maximization (NUM) algorithm with concave surrogate utility functions, and the upper level is a data-driven learning algorithm to find the best surrogate utility functions that maximize the sum of true network utility.
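Schematically (with notation assumed here for illustration), this bilevel structure can be written as

$$\max_{\theta}\ \sum_{i} U_i\big(x_i^*(\theta)\big) \quad \text{s.t.} \quad x^*(\theta) = \arg\max_{x \in \mathcal{X}}\ \sum_{i} \tilde{U}_i(x_i; \theta),$$

where $\tilde{U}_i(\cdot\,;\theta)$ are the concave surrogate utilities optimized by the distributed NUM algorithm at the lower level and $U_i$ are the true network utilities maximized at the upper level.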
no code implementations • 27 May 2022 • Kaiyi Ji, Mingrui Liu, Yingbin Liang, Lei Ying
Existing studies in the literature cover only some of those implementation choices, and the complexity bounds available are not refined enough to enable rigorous comparison among different implementations.
no code implementations • 31 Mar 2022 • Shaocong Ma, Ziyi Chen, Yi Zhou, Kaiyi Ji, Yingbin Liang
Moreover, we show that online SGD with mini-batch sampling can further substantially improve the sample complexity over online SGD with periodic data-subsampling when the data are highly dependent.
no code implementations • 1 Mar 2022 • Daouda Sow, Kaiyi Ji, Ziwei Guan, Yingbin Liang
Existing algorithms designed for such a problem were applicable to restricted situations and do not come with a full guarantee of convergence.
no code implementations • 8 Feb 2022 • Minhui Huang, Xuxing Chen, Kaiyi Ji, Shiqian Ma, Lifeng Lai
Moreover, we propose an inexact NEgative-curvature-Originated-from-Noise Algorithm (iNEON), a pure first-order algorithm that can escape saddle points and find a local minimum of stochastic bilevel optimization.
1 code implementation • 13 Oct 2021 • Daouda Sow, Kaiyi Ji, Yingbin Liang
Bilevel optimization has arisen as a powerful tool in modern machine learning.
no code implementations • 29 Sep 2021 • Daouda Sow, Kaiyi Ji, Yingbin Liang
Bilevel optimization (BO) has arisen as a powerful tool for solving many modern machine learning problems.
no code implementations • 29 Sep 2021 • Shaocong Ma, Ziyi Chen, Yi Zhou, Kaiyi Ji, Yingbin Liang
Specifically, with a $\phi$-mixing model that captures both exponential and polynomial decay of the data dependence over time, we show that SGD with periodic data-subsampling achieves an improved sample complexity over the standard SGD in the full spectrum of the $\phi$-mixing data dependence.
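For context, the two decay regimes of the $\phi$-mixing coefficient mentioned here typically take the form

$$\phi(k) \le C\,\rho^{k}\ \ (\text{exponential decay}), \qquad \phi(k) \le C\,k^{-\theta}\ \ (\text{polynomial decay}),$$

where $k$ is the time lag between samples and $C > 0$, $\rho \in (0,1)$, $\theta > 0$ are constants; the notation is assumed here rather than taken from the paper.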
no code implementations • 31 Jul 2021 • Kaiyi Ji
For the problem-based formulation, we provide a convergence rate analysis for AID- and ITD-based bilevel algorithms.
1 code implementation • NeurIPS 2021 • Junjie Yang, Kaiyi Ji, Yingbin Liang
Bilevel optimization has been widely applied in many important machine learning applications such as hyperparameter optimization and meta-learning.
no code implementations • 7 Feb 2021 • Kaiyi Ji, Yingbin Liang
Bilevel optimization has recently attracted growing interests due to its wide applications in modern machine learning problems.
2 code implementations • 15 Oct 2020 • Kaiyi Ji, Junjie Yang, Yingbin Liang
For the AID-based method, we provide an order-wise improvement over the previous convergence rate analysis, owing to a more practical parameter selection and a warm-start strategy; for the ITD-based method, we establish the first theoretical convergence rate.
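For reference, AID-based methods approximate the hypergradient given (under a strongly convex lower level) by the implicit-differentiation formula

$$\nabla \Phi(x) = \nabla_x f\big(x, y^*(x)\big) - \nabla^2_{xy} g\big(x, y^*(x)\big)\,\big[\nabla^2_{yy} g\big(x, y^*(x)\big)\big]^{-1} \nabla_y f\big(x, y^*(x)\big),$$

whereas ITD-based methods differentiate through the iterates of the lower-level optimizer.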
no code implementations • 14 Oct 2020 • Yan Zhang, Yi Zhou, Kaiyi Ji, Michael M. Zavlanos
As a result, our regret bounds are much tighter compared to existing regret bounds for ZO with conventional one-point feedback, which suggests that ZO with residual feedback can better track the optimizer of online optimization problems.
no code implementations • 28 Sep 2020 • Kaiyi Ji, Junjie Yang, Yingbin Liang
For the AID-based method, we provide an order-wise improvement over the previous finite-time convergence analysis, owing to a more practical parameter selection and a warm-start strategy; for the ITD-based method, we establish the first theoretical convergence rate.
no code implementations • 18 Jun 2020 • Yan Zhang, Yi Zhou, Kaiyi Ji, Michael M. Zavlanos
When optimizing a deterministic Lipschitz function, we show that the query complexity of ZO with the proposed one-point residual feedback matches that of ZO with the existing two-point schemes.
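A hedged sketch of a one-point residual-feedback estimator of this flavor (constants and the sampling distribution may differ from the paper) is

$$\tilde{\nabla} f(x_t) = \frac{f(x_t + \delta u_t) - f(x_{t-1} + \delta u_{t-1})}{\delta}\, u_t,$$

which queries the function only once per iteration and reuses the previous query as a residual baseline; here $u_t$ is a random perturbation direction and $\delta > 0$ is a smoothing parameter.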
no code implementations • NeurIPS 2020 • Kaiyi Ji, Jason D. Lee, Yingbin Liang, H. Vincent Poor
Although model-agnostic meta-learning (MAML) is a very successful algorithm in meta-learning practice, it can incur a high computational cost because it updates all model parameters over both the inner loop of task-specific adaptation and the outer loop of meta-initialization training.
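To make the inner-loop/outer-loop structure concrete, here is a minimal first-order MAML-style sketch on toy quadratic tasks (purely illustrative; not the setting or algorithm analyzed in the paper):

```python
# Toy MAML-style sketch (illustrative only).
# Each "task" is a 1-D quadratic loss f_i(w) = 0.5 * (w - c_i)^2 with its own optimum c_i.

def loss_grad(w, c):
    return w - c  # gradient of 0.5 * (w - c)^2

def maml_step(w, tasks, inner_lr=0.1, outer_lr=0.01, inner_steps=1):
    meta_grad = 0.0
    for c in tasks:
        # Inner loop: task-specific adaptation starting from the meta-initialization w.
        w_adapted = w
        for _ in range(inner_steps):
            w_adapted -= inner_lr * loss_grad(w_adapted, c)
        # Outer loop: first-order approximation of the meta-gradient
        # (full MAML would differentiate through the inner updates).
        meta_grad += loss_grad(w_adapted, c)
    return w - outer_lr * meta_grad / len(tasks)

w = 0.0
tasks = [1.0, -0.5, 2.0]
for _ in range(100):
    w = maml_step(w, tasks)
print(w)
```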
no code implementations • 26 Feb 2020 • Yi Zhou, Zhe Wang, Kaiyi Ji, Yingbin Liang, Vahid Tarokh
Our APG-restart is designed to 1) accommodate flexible parameter restart schemes that cover many existing ones; 2) achieve a global sub-linear convergence rate in nonconvex and nonsmooth optimization; and 3) guarantee convergence to a critical point with various types of asymptotic convergence rates depending on the parameterization of the local geometry.
2 code implementations • 18 Feb 2020 • Kaiyi Ji, Junjie Yang, Yingbin Liang
As a popular meta-learning approach, the model-agnostic meta-learning (MAML) algorithm has been widely used due to its simplicity and effectiveness.
no code implementations • 17 Feb 2020 • Ziwei Guan, Kaiyi Ji, Donald J Bucci Jr, Timothy Y Hu, Joseph Palombo, Michael Liston, Yingbin Liang
This paper investigates the attack model where an adversary attacks with a certain probability at each round and, when it attacks, its attack value can be arbitrary and unbounded.
no code implementations • NeurIPS 2019 • Zhe Wang, Kaiyi Ji, Yi Zhou, Yingbin Liang, Vahid Tarokh
SARAH and SPIDER are two recently developed stochastic variance-reduced algorithms, and SPIDER has been shown to achieve a near-optimal first-order oracle complexity in smooth nonconvex optimization.
no code implementations • 27 Oct 2019 • Kaiyi Ji, Zhe Wang, Yi Zhou, Yingbin Liang
Two types of zeroth-order stochastic algorithms have recently been designed for nonconvex optimization respectively based on the first-order techniques SVRG and SARAH/SPIDER.
no code implementations • ICML 2020 • Kaiyi Ji, Zhe Wang, Bowen Weng, Yi Zhou, Wei Zhang, Yingbin Liang
In this paper, we propose a novel scheme that eliminates backtracking line search while still exploiting the information along the optimization path by adapting the batch size via historical stochastic gradients.
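The following sketch is a purely illustrative adaptive-batch-size heuristic driven by historical gradient norms (the function names and the adaptation rule are assumptions for illustration, not the paper's scheme):

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_grad(w, batch_size):
    # Noisy gradient of f(w) = 0.5 * ||w||^2; noise shrinks as the batch size grows.
    return w + rng.normal(scale=1.0 / np.sqrt(batch_size), size=w.shape)

w = np.ones(5)
batch_size, lr = 8, 0.1
grad_history = []

for t in range(200):
    g = stochastic_grad(w, batch_size)
    grad_history.append(np.linalg.norm(g))
    # Heuristic: when recent gradient norms are small relative to earlier ones,
    # increase the batch size to reduce gradient noise near a stationary point.
    if len(grad_history) >= 10 and np.mean(grad_history[-5:]) < 0.5 * np.mean(grad_history[:5]):
        batch_size = min(2 * batch_size, 1024)
        grad_history = []
    w = w - lr * g

print(batch_size, np.linalg.norm(w))
```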
no code implementations • 7 Feb 2019 • Yi Zhou, Zhe Wang, Kaiyi Ji, Yingbin Liang, Vahid Tarokh
In this paper, we develop novel momentum schemes with flexible coefficient settings to accelerate SPIDER for nonconvex and nonsmooth composite optimization, and show that the resulting algorithms achieve the near-optimal gradient oracle complexity for achieving a generalized first-order stationary condition.
no code implementations • NeurIPS 2018 • Kaiyi Ji, Yingbin Liang
An important class of distance metrics proposed for training generative adversarial networks (GANs) is the integral probability metric (IPM), in which the neural net distance captures the practical GAN training via two neural networks.
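For reference, an integral probability metric over a function class $\mathcal{F}$ is defined as

$$d_{\mathcal{F}}(\mu, \nu) = \sup_{f \in \mathcal{F}}\ \Big|\mathbb{E}_{x \sim \mu}[f(x)] - \mathbb{E}_{x \sim \nu}[f(x)]\Big|,$$

and the neural net distance is the special case in which $\mathcal{F}$ is the set of functions realizable by the discriminator network.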
1 code implementation • 25 Oct 2018 • Zhe Wang, Kaiyi Ji, Yi Zhou, Yingbin Liang, Vahid Tarokh
SARAH and SPIDER are two recently developed stochastic variance-reduced algorithms, and SPIDER has been shown to achieve a near-optimal first-order oracle complexity in smooth nonconvex optimization.
1 code implementation • ICLR 2019 • Tengyu Xu, Yi Zhou, Kaiyi Ji, Yingbin Liang
We study the implicit bias of gradient descent methods in solving a binary classification problem over a linearly separable dataset.
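For context, a well-known result of this type (for vanilla gradient descent on the logistic loss; not necessarily the exact statement established in this paper) is that the iterates converge in direction to the max-margin separator:

$$\lim_{t \to \infty} \frac{w_t}{\|w_t\|} = \frac{w^*}{\|w^*\|}, \qquad w^* = \arg\min_{w}\ \|w\|^2 \ \ \text{s.t.}\ \ y_i\, w^\top x_i \ge 1\ \ \forall i.$$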
no code implementations • 16 Feb 2018 • Kaiyi Ji, Jian Tan, Jinfeng Xu, Yuejie Chi
Low-rank matrix completion has achieved great success in many real-world data applications.