Search Results for author: Jiayi Huang

Found 11 papers, 3 papers with code

MoC-System: Efficient Fault Tolerance for Sparse Mixture-of-Experts Model Training

no code implementations 8 Aug 2024 Weilin Cai, Le Qin, Jiayi Huang

As large language models continue to scale up, distributed training systems have expanded beyond 10k nodes, intensifying the importance of fault tolerance.

A Survey on Mixture of Experts

1 code implementation 26 Jun 2024 Weilin Cai, Juyong Jiang, Fan Wang, Jing Tang, Sunghun Kim, Jiayi Huang

Large language models (LLMs) have garnered unprecedented advancements across diverse fields, ranging from natural language processing to computer vision and beyond.

In-Context Learning Survey

Calibrating Bayesian Learning via Regularization, Confidence Minimization, and Selective Inference

1 code implementation 17 Apr 2024 Jiayi Huang, Sangwoo Park, Osvaldo Simeone

This paper proposes an extension of variational inference (VI)-based Bayesian learning that integrates calibration regularization for improved in-distribution (ID) performance, confidence minimization for out-of-distribution (OOD) detection, and selective calibration to ensure that calibration regularization and confidence minimization are applied synergistically.
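A minimal sketch of how these three ingredients could be combined in a single training objective, assuming a plain deterministic classifier as a stand-in for the variational posterior, a crude confidence-vs-accuracy gap as the calibration penalty, and entropy maximization on an auxiliary OOD batch for confidence minimization; none of this is the authors' implementation:

```python
# Illustrative sketch only: cross-entropy + a crude calibration penalty on the
# ID batch + confidence minimization (entropy maximization) on an OOD batch.
# Model, data, weights, and the penalty form are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

def calibration_penalty(logits, labels):
    """Crude miscalibration proxy: gap between mean confidence and mean accuracy."""
    probs = logits.softmax(dim=-1)
    conf, pred = probs.max(dim=-1)
    acc = (pred == labels).float()
    return (conf.mean() - acc.mean()).abs()

def confidence_minimization(logits_ood):
    """Encourage high predictive entropy (low confidence) on OOD inputs."""
    log_probs = logits_ood.log_softmax(dim=-1)
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1).mean()
    return -entropy  # minimizing this term maximizes entropy

model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 4))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x_id, y_id = torch.randn(32, 16), torch.randint(0, 4, (32,))  # in-distribution batch
x_ood = torch.randn(32, 16) * 3.0                             # stand-in OOD batch

logits_id, logits_ood = model(x_id), model(x_ood)
loss = (F.cross_entropy(logits_id, y_id)
        + 0.1 * calibration_penalty(logits_id, y_id)   # calibration regularization
        + 0.1 * confidence_minimization(logits_ood))   # confidence minimization
loss.backward()
opt.step()
```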

Variational Inference

Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts

no code implementations 7 Apr 2024 Weilin Cai, Juyong Jiang, Le Qin, Junwei Cui, Sunghun Kim, Jiayi Huang

Expert parallelism has been introduced as a strategy to distribute the computational workload of sparsely-gated mixture-of-experts (MoE) models across multiple computing devices, facilitating the execution of these increasingly large-scale models.
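A minimal single-device sketch of the sparsely-gated MoE computation that expert parallelism distributes, assuming top-1 gating and a plain for-loop dispatch in place of the all-to-all communication between devices; the paper's shortcut-connected scheduling is not reproduced here:

```python
# Illustrative top-1 gated MoE layer. In expert parallelism, each group of
# tokens selected below would be sent (all-to-all) to the device hosting its
# expert; here the dispatch is simulated with a local loop.
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, d_model=32, num_experts=4):
        super().__init__()
        self.gate = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x):                        # x: [tokens, d_model]
        scores = self.gate(x).softmax(dim=-1)    # routing probabilities
        weight, expert_idx = scores.max(dim=-1)  # top-1 expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = expert_idx == e               # tokens routed to expert e
            if mask.any():
                out[mask] = weight[mask, None] * expert(x[mask])
        return out

tokens = torch.randn(128, 32)
print(TinyMoE()(tokens).shape)  # torch.Size([128, 32])
```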

Horizon-Free and Instance-Dependent Regret Bounds for Reinforcement Learning with General Function Approximation

no code implementations 7 Dec 2023 Jiayi Huang, Han Zhong, Liwei Wang, Lin F. Yang

To tackle long planning horizon problems in reinforcement learning with general function approximation, we propose the first algorithm, termed UCRL-WVTR, that achieves regret bounds that are both \emph{horizon-free} and \emph{instance-dependent}, as it eliminates the polynomial dependency on the planning horizon.

regression

Tackling Heavy-Tailed Rewards in Reinforcement Learning with Function Approximation: Minimax Optimal and Instance-Dependent Regret Bounds

no code implementations NeurIPS 2023 Jiayi Huang, Han Zhong, Liwei Wang, Lin F. Yang

Our algorithm, termed \textsc{Heavy-LSVI-UCB}, achieves the \emph{first} computationally efficient \emph{instance-dependent} $K$-episode regret of $\tilde{O}(d \sqrt{H \mathcal{U}^*} K^{\frac{1}{1+\epsilon}} + d \sqrt{H \mathcal{V}^* K})$.
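A hedged reading of this bound, assuming the standard heavy-tailed convention that $\epsilon \in (0, 1]$ indexes finiteness of the $(1+\epsilon)$-th reward moments and that $\mathcal{U}^*$ and $\mathcal{V}^*$ are instance-dependent moment quantities: since $K^{\frac{1}{1+\epsilon}} = \sqrt{K}$ when $\epsilon = 1$, the heavy-tail term $d \sqrt{H \mathcal{U}^*} K^{\frac{1}{1+\epsilon}}$ and the variance term $d \sqrt{H \mathcal{V}^* K}$ both scale as $\sqrt{K}$ in the finite-variance case, while smaller $\epsilon$ (heavier tails) degrades only the first term.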

Reinforcement Learning (RL)

Calibration-Aware Bayesian Learning

1 code implementation 12 May 2023 Jiayi Huang, Sangwoo Park, Osvaldo Simeone

Deep learning models, including modern systems like large language models, are well known to offer unreliable estimates of the uncertainty of their decisions.
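For context on what calibration measures here, a minimal sketch (not taken from the paper) of the commonly used expected calibration error, which averages the per-bin gap between confidence and accuracy; the bin count and toy predictions are illustrative assumptions:

```python
# Expected calibration error (ECE): bucket predictions by confidence and average
# the per-bin |accuracy - confidence| gap, weighted by bin size.
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap  # weight by the fraction of samples in the bin
    return ece

# Overconfident toy predictions: 90% reported confidence but only ~60% accuracy.
rng = np.random.default_rng(0)
conf = np.full(1000, 0.9)
correct = rng.random(1000) < 0.6
print(f"ECE ~ {expected_calibration_error(conf, correct):.2f}")  # roughly 0.30
```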

Breaking the Moments Condition Barrier: No-Regret Algorithm for Bandits with Super Heavy-Tailed Payoffs

no code implementations NeurIPS 2021 Han Zhong, Jiayi Huang, Lin F. Yang, Liwei Wang

Despite a large amount of effort in dealing with heavy-tailed error in machine learning, little is known when moments of the error can become non-existent: the random noise $\eta$ satisfies $\Pr\left[|\eta| > |y|\right] \le 1/|y|^{\alpha}$ for some $\alpha > 0$.
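To illustrate this regime (a sketch of the setting only, not the paper's algorithm): standard Cauchy noise satisfies the stated tail condition with $\alpha = 1$ yet has no finite mean, so the empirical mean never settles while a robust statistic such as the median does:

```python
# Super heavy-tailed noise: standard Cauchy satisfies Pr[|eta| > y] <= 1/y
# for all y > 0, yet its mean does not exist. The empirical mean fails to
# stabilize as samples accumulate, while the empirical median converges to 0.
import numpy as np

rng = np.random.default_rng(0)
for n in (10**3, 10**5, 10**7):
    eta = rng.standard_cauchy(n)
    print(f"n={n:>9}: mean={eta.mean():>10.2f}   median={np.median(eta):>7.4f}")
```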

Continual Learning Approach for Improving the Data and Computation Mapping in Near-Memory Processing System

no code implementations 28 Apr 2021 Pritam Majumder, Jiayi Huang, Sungkeun Kim, Abdullah Muzahid, Dylan Siegers, Chia-Che Tsai, Eun Jung Kim

As near-memory processing (NMP) and memory systems continue to develop, the mapping that places data and guides computation in the memory-cube network has become crucial to driving performance improvements in NMP.

Continual Learning

Toward Taming the Overhead Monster for Data-Flow Integrity

no code implementations 19 Feb 2021 Lang Feng, Jiayi Huang, Jeff Huang, Jiang Hu

Data-Flow Integrity (DFI) is a well-known approach to effectively detecting a wide range of software attacks.

Hardware Architecture

Coloring Big Graphs with AlphaGoZero

no code implementations 26 Feb 2019 Jiayi Huang, Mostofa Patwary, Gregory Diamos

We show that recent innovations in deep reinforcement learning can effectively color very large graphs -- a well-known NP-hard problem with clear commercial applications.
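For context on the underlying problem, a sketch of the classical greedy coloring baseline (not the paper's reinforcement-learning method): each vertex, visited in some order, receives the smallest color not used by its already-colored neighbors, and the result depends heavily on that order; this is the kind of hand-crafted heuristic a learned policy would be compared against. The example graph is an illustrative assumption.

```python
# Greedy graph coloring baseline: visit vertices in a given order and assign
# each the smallest color not used by its already-colored neighbors.
def greedy_coloring(adj, order):
    colors = {}
    for v in order:
        used = {colors[u] for u in adj[v] if u in colors}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors

# 5-cycle: an odd cycle has chromatic number 3, and greedy uses exactly 3 here.
adj = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
coloring = greedy_coloring(adj, order=sorted(adj))
print(coloring, "->", 1 + max(coloring.values()), "colors")
```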

Reinforcement Learning
