no code implementations • 8 Aug 2024 • Weilin Cai, Le Qin, Jiayi Huang
As large language models continue to scale up, distributed training systems have expanded beyond 10k nodes, making fault tolerance increasingly critical.
1 code implementation • 26 Jun 2024 • Weilin Cai, Juyong Jiang, Fan Wang, Jing Tang, Sunghun Kim, Jiayi Huang
Large language models (LLMs) have achieved unprecedented advances across diverse fields, ranging from natural language processing to computer vision and beyond.
1 code implementation • 17 Apr 2024 • Jiayi Huang, Sangwoo Park, Osvaldo Simeone
This paper proposes an extension of variational inference (VI)-based Bayesian learning that integrates calibration regularization for improved in-distribution (ID) performance, confidence minimization for out-of-distribution (OOD) detection, and selective calibration to ensure a synergistic use of calibration regularization and confidence minimization.
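A minimal PyTorch-style sketch of how such a combined objective could be composed; this is an illustrative assumption about the structure, not the paper's exact variational objective, and the function name `regularized_loss`, the weights `lam`/`mu`, and the squared confidence-accuracy gap are all hypothetical:

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch (not the paper's exact objective): combine
# (i) cross-entropy on in-distribution (ID) data,
# (ii) a crude calibration penalty pushing confidence toward accuracy, and
# (iii) confidence minimization on out-of-distribution (OOD) inputs.
def regularized_loss(logits_id, labels_id, logits_ood, lam=0.1, mu=0.1):
    nll = F.cross_entropy(logits_id, labels_id)

    probs = logits_id.softmax(dim=1)
    conf, pred = probs.max(dim=1)
    acc = (pred == labels_id).float()
    calib = ((conf - acc) ** 2).mean()   # calibration-gap penalty

    # Push OOD predictions toward the uniform (maximum-entropy) distribution.
    conf_min = -logits_ood.log_softmax(dim=1).mean()

    return nll + lam * calib + mu * conf_min

# Usage with random stand-in tensors:
logits_id = torch.randn(32, 10)
labels_id = torch.randint(0, 10, (32,))
logits_ood = torch.randn(32, 10)
print(regularized_loss(logits_id, labels_id, logits_ood))
```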
no code implementations • 7 Apr 2024 • Weilin Cai, Juyong Jiang, Le Qin, Junwei Cui, Sunghun Kim, Jiayi Huang
Expert parallelism has been introduced as a strategy to distribute the computational workload of sparsely-gated mixture-of-experts (MoE) models across multiple computing devices, facilitating the execution of these increasingly large-scale models.
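A toy NumPy sketch of the expert-parallel pattern the abstract describes, with each "device" simulated as a separate expert weight matrix; the top-1 router, the dimensions, and the loop-based dispatch are illustrative assumptions, not the paper's system:

```python
import numpy as np

rng = np.random.default_rng(0)
num_experts, d_model, num_tokens = 4, 8, 16

tokens = rng.normal(size=(num_tokens, d_model))
gate_w = rng.normal(size=(d_model, num_experts))             # router weights
expert_w = rng.normal(size=(num_experts, d_model, d_model))  # one expert per "device"

# Sparse gating: each token is routed to its top-1 expert.
assignment = (tokens @ gate_w).argmax(axis=1)

# "All-to-all" dispatch: each expert processes only its tokens,
# and results are gathered back into the original token order.
output = np.empty_like(tokens)
for e in range(num_experts):
    idx = np.where(assignment == e)[0]
    if idx.size:
        output[idx] = tokens[idx] @ expert_w[e]
```

In a real system the per-expert loop runs concurrently across devices, with all-to-all communication replacing the index-based gather shown here.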
no code implementations • 7 Dec 2023 • Jiayi Huang, Han Zhong, LiWei Wang, Lin F. Yang
To tackle long planning horizon problems in reinforcement learning with general function approximation, we propose the first algorithm, termed UCRL-WVTR, that achieves a regret bound that is both \emph{horizon-free} and \emph{instance-dependent}, as it eliminates the polynomial dependency on the planning horizon.
no code implementations • NeurIPS 2023 • Jiayi Huang, Han Zhong, LiWei Wang, Lin F. Yang
Our algorithm, termed \textsc{Heavy-LSVI-UCB}, achieves the \emph{first} computationally efficient, \emph{instance-dependent} $K$-episode regret of $\tilde{O}(d \sqrt{H \mathcal{U}^*} K^{\frac{1}{1+\epsilon}} + d \sqrt{H \mathcal{V}^* K})$.
1 code implementation • 12 May 2023 • Jiayi Huang, Sangwoo Park, Osvaldo Simeone
Deep learning models, including modern systems like large language models, are well known to offer unreliable estimates of the uncertainty of their decisions.
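One standard way to quantify that unreliability is the expected calibration error (ECE), which bins predictions by confidence and compares average confidence with accuracy in each bin; the sketch below is a generic implementation of that metric, not code from the paper:

```python
import numpy as np

def ece(confidences, correct, n_bins=10):
    """Expected calibration error: confidence-weighted |confidence - accuracy| gap."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            total += mask.mean() * gap   # weight by fraction of samples in bin
    return total

# Synthetic demo: a systematically overconfident predictor has high ECE.
rng = np.random.default_rng(0)
conf = rng.uniform(0.5, 1.0, size=1000)
correct = (rng.uniform(size=1000) < conf * 0.8).astype(float)
print(ece(conf, correct))
```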
no code implementations • NeurIPS 2021 • Han Zhong, Jiayi Huang, Lin F. Yang, LiWei Wang
Despite a large amount of effort in dealing with heavy-tailed errors in machine learning, little is known when moments of the error can be non-existent: the random noise $\eta$ satisfies $\Pr\left[|\eta| > |y|\right] \le 1/|y|^{\alpha}$ for some $\alpha > 0$.
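To see why such moments can fail to exist, assume for illustration that the tail bound holds with equality for $y \ge 1$ (a simplifying assumption, not part of the paper's setup). Then, using the tail formula for moments of a nonnegative random variable, $\mathbb{E}\left[|\eta|^p\right] = \int_0^\infty p\, y^{p-1} \Pr\left[|\eta| > y\right] dy \ge \int_1^\infty p\, y^{p-1-\alpha}\, dy = \infty$ whenever $p \ge \alpha$, so even the mean need not exist when $\alpha \le 1$.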
no code implementations • 28 Apr 2021 • Pritam Majumder, Jiayi Huang, Sungkeun Kim, Abdullah Muzahid, Dylan Siegers, Chia-Che Tsai, Eun Jung Kim
Along with the development of near-memory processing (NMP) and memory systems, the mapping that places data and guides computation in the memory-cube network has become crucial to driving NMP performance improvements.
no code implementations • 19 Feb 2021 • Lang Feng, Jiayi Huang, Jeff Huang, Jiang Hu
Data-Flow Integrity (DFI) is a well-known approach to effectively detecting a wide range of software attacks.
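A toy Python simulation of the DFI principle: every memory location remembers its last writer, and every read checks that writer against a statically derived set of legal writers. The function names and the `ALLOWED_WRITERS` table are hypothetical; real DFI derives the allowed sets from reaching-definitions analysis and enforces them via compiler instrumentation or hardware, not a dictionary:

```python
# Reaching-definitions table: which instructions may legally write each location.
ALLOWED_WRITERS = {"balance": {"deposit", "withdraw"}}

last_writer = {}

def dfi_write(addr, value, writer, memory):
    memory[addr] = value
    last_writer[addr] = writer          # record provenance of the data

def dfi_read(addr, memory):
    if last_writer.get(addr) not in ALLOWED_WRITERS[addr]:
        raise RuntimeError(f"DFI violation at {addr!r}: "
                           f"written by {last_writer.get(addr)!r}")
    return memory[addr]

mem = {}
dfi_write("balance", 100, "deposit", mem)
print(dfi_read("balance", mem))         # legal data flow: prints 100
dfi_write("balance", 0, "evil_gadget", mem)
try:
    dfi_read("balance", mem)
except RuntimeError as e:
    print(e)                            # illegal data flow detected
```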
no code implementations • 26 Feb 2019 • Jiayi Huang, Mostofa Patwary, Gregory Diamos
We show that recent innovations in deep reinforcement learning can effectively color very large graphs -- a well-known NP-hard problem with clear commercial applications.
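For context, graph coloring asks for an assignment of colors to vertices so that no edge joins two same-colored vertices, using as few colors as possible. The greedy baseline below illustrates the problem itself; it is not the paper's RL method, whose advantage comes from learning better decisions than a fixed heuristic:

```python
# Greedy coloring: give each vertex the smallest color not used by its
# already-colored neighbors. Visit order is a key degree of freedom that
# a learned policy can exploit.
def greedy_color(adj):
    colors = {}
    for v in adj:
        used = {colors[u] for u in adj[v] if u in colors}
        colors[v] = next(c for c in range(len(adj)) if c not in used)
    return colors

# Small example graph: a triangle {0,1,2} plus a pendant vertex 3.
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
print(greedy_color(adj))  # {0: 0, 1: 1, 2: 2, 3: 1}
```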