Search Results for author: Jiaming Yang

Found 6 papers, 2 papers with code

HERTA: A High-Efficiency and Rigorous Training Algorithm for Unfolded Graph Neural Networks

no code implementations • 26 Mar 2024 • Yongyi Yang, Jiaming Yang, Wei Hu, Michał Dereziński

In this paper, we propose HERTA: a High-Efficiency and Rigorous Training Algorithm for Unfolded GNNs that accelerates the whole training process, achieving a nearly-linear time worst-case training guarantee.

Solving Dense Linear Systems Faster than via Preconditioning

no code implementations • 14 Dec 2023 • Michał Dereziński, Jiaming Yang

We give a stochastic optimization algorithm that solves a dense $n\times n$ real-valued linear system $Ax=b$, returning $\tilde x$ such that $\|A\tilde x-b\|\leq \epsilon\|b\|$ in time: $$\tilde O\big((n^2+nk^{\omega-1})\log(1/\epsilon)\big),$$ where $k$ is the number of singular values of $A$ larger than $O(1)$ times its smallest positive singular value, $\omega < 2.372$ is the matrix multiplication exponent, and $\tilde O$ hides a factor poly-logarithmic in $n$.

Stochastic Optimization

CharacterGLM: Customizing Chinese Conversational AI Characters with Large Language Models

1 code implementation • 28 Nov 2023 • Jinfeng Zhou, Zhuang Chen, Dazhen Wan, Bosi Wen, Yi Song, Jifan Yu, Yongkang Huang, Libiao Peng, Jiaming Yang, Xiyao Xiao, Sahand Sabour, Xiaohan Zhang, Wenjing Hou, Yijia Zhang, Yuxiao Dong, Jie Tang, Minlie Huang

In this paper, we present CharacterGLM, a series of models built upon ChatGLM, with model sizes ranging from 6B to 66B parameters.

Dialogue Generation

Federated Adversarial Learning: A Framework with Convergence Analysis

no code implementations • 7 Aug 2022 • Xiaoxiao Li, Zhao Song, Jiaming Yang

Unlike the convergence analysis in classical centralized training, which relies on the gradient direction, it is significantly harder to analyze convergence in FAL for three reasons: 1) the complexity of min-max optimization, 2) the model not updating in the gradient direction because of multiple local updates on the client side before aggregation, and 3) inter-client heterogeneity.

Federated Learning

Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models

1 code implementation • ICLR 2022 • Tri Dao, Beidi Chen, Kaizhao Liang, Jiaming Yang, Zhao Song, Atri Rudra, Christopher Ré

To address this, our main insight is to optimize over a continuous superset of sparse matrices with a fixed structure known as products of butterfly matrices.

Language Modelling

Provable Federated Adversarial Learning via Min-max Optimization

no code implementations • 29 Sep 2021 • Xiaoxiao Li, Zhao Song, Jiaming Yang

Unlike the convergence analysis in centralized training, which relies on the gradient direction, it is significantly harder to analyze convergence in FAL for two reasons: 1) the complexity of min-max optimization, and 2) the model not updating in the gradient direction because of multiple local updates on the client side before aggregation.

Federated Learning
