no code implementations • 16 Nov 2024 • Jinbo Wang, Ruijin Wang, Fengli Zhang
In this work, based on the key insight that the convergence process of the model is a highly predictable process, we break away from the traditional horizontal solution of defense and innovatively transform the problem of solving the optimal aggregation gradients into a vertical solution problem.
no code implementations • 1 Jul 2024 • Hongkang Yang, Zehao Lin, Wenjin Wang, Hao Wu, Zhiyu Li, Bo Tang, Wenqiang Wei, Jinbo Wang, Zeyun Tang, Shichao Song, Chenyang Xi, Yu Yu, Kai Chen, Feiyu Xiong, Linpeng Tang, Weinan E
The model is named $\text{Memory}^3$, since explicit memory is the third form of memory in LLMs after implicit memory (model parameters) and working memory (context key-values).
1 code implementation • 31 May 2024 • Mingze Wang, Jinbo Wang, Haotian He, Zilin Wang, Guanhua Huang, Feiyu Xiong, Zhiyu Li, Weinan E, Lei Wu
In this work, we propose an Implicit Regularization Enhancement (IRE) framework to accelerate the discovery of flat solutions in deep learning, thereby improving generalization and convergence.
no code implementations • 10 Apr 2024 • Seraj Al Mahmud Mostafa, Jinbo Wang, Benjamin Holt, Jianwu Wang
Ocean eddies play a significant role both on the sea surface and beneath it, contributing to the sustainability of marine life dependent on oceanic behaviors.
no code implementations • 13 Jul 2023 • Zeping Min, Jinbo Wang
This paper explores the integration of Large Language Models (LLMs) into Automatic Speech Recognition (ASR) systems to improve transcription accuracy.
Automatic Speech Recognition Automatic Speech Recognition (ASR) +3