Search Results for author: Binhang Yuan

Found 19 papers, 10 papers with code

Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model

no code implementations • 5 Apr 2024 • Xinrun Du, Zhouliang Yu, Songyang Gao, Ding Pan, Yuyang Cheng, Ziyang Ma, Ruibin Yuan, Xingwei Qu, Jiaheng Liu, Tianyu Zheng, Xinchen Luo, Guorui Zhou, Binhang Yuan, Wenhu Chen, Jie Fu, Ge Zhang

In this study, we introduce CT-LLM, a 2B large language model (LLM) that illustrates a pivotal shift towards prioritizing the Chinese language in developing LLMs.

Language Modelling Large Language Model

DeFT: Flash Tree-attention with IO-Awareness for Efficient Tree-search-based LLM Inference

no code implementations • 30 Mar 2024 • Jinwei Yao, Kaiqi Chen, Kexun Zhang, Jiaxuan You, Binhang Yuan, Zeke Wang, Tao Lin

Decoding using tree search can greatly enhance the inference quality for transformer-based Large Language Models (LLMs).

Exploring the Robustness of Decentralized Training for Large Language Models

no code implementations • 1 Dec 2023 • Lin Lu, Chenxi Dai, Wangcheng Tao, Binhang Yuan, Yanan Sun, Pan Zhou

Decentralized training of large language models has emerged as an effective way to democratize this technology.

Federated Learning

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

1 code implementation • 26 Oct 2023 • Zichang Liu, Jue Wang, Tri Dao, Tianyi Zhou, Binhang Yuan, Zhao Song, Anshumali Shrivastava, Ce Zhang, Yuandong Tian, Christopher Ré, Beidi Chen

We show that contextual sparsity exists, that it can be accurately predicted, and that we can exploit it to speed up LLM inference in wall-clock time without compromising the LLM's quality or in-context learning ability.

In-Context Learning
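
The contextual-sparsity result above lends itself to a compact illustration. The following is a minimal, hypothetical sketch (module shapes, the cheap predictor, and k are my assumptions, not the paper's released code) of an MLP block that uses a small predictor to select the top-k hidden units per input and computes only those:

    import torch
    import torch.nn as nn

    class ContextuallySparseMLP(nn.Module):
        """Toy MLP that activates only a predicted subset of hidden units."""
        def __init__(self, d_model=512, d_ff=2048, k=256):
            super().__init__()
            self.up = nn.Linear(d_model, d_ff)
            self.down = nn.Linear(d_ff, d_model)
            # Cheap predictor that guesses which hidden units will fire.
            self.predictor = nn.Sequential(
                nn.Linear(d_model, 64), nn.ReLU(), nn.Linear(64, d_ff))
            self.k = k

        def forward(self, x):                        # x: (batch, d_model)
            idx = self.predictor(x).topk(self.k, dim=-1).indices  # (batch, k)
            w_up, b_up = self.up.weight[idx], self.up.bias[idx]   # gather rows
            h = torch.relu(torch.einsum("bkd,bd->bk", w_up, x) + b_up)
            w_down = self.down.weight[:, idx]        # (d_model, batch, k)
            return torch.einsum("dbk,bk->bd", w_down, h) + self.down.bias

    out = ContextuallySparseMLP()(torch.randn(4, 512))
    print(out.shape)  # torch.Size([4, 512]) at ~12.5% of the dense FLOPs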

Serving Deep Learning Model in Relational Databases

no code implementations • 7 Oct 2023 • Alexandre Eichenberger, Qi Lin, Saif Masood, Hong Min, Alexander Sim, Jie Wang, Yida Wang, Kesheng Wu, Binhang Yuan, Lixi Zhou, Jia Zou

Serving deep learning (DL) models on relational data has become a critical requirement across diverse commercial and scientific domains, sparking growing interest recently.

FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU

1 code implementation • 13 Mar 2023 • Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, Daniel Y. Fu, Zhiqiang Xie, Beidi Chen, Clark Barrett, Joseph E. Gonzalez, Percy Liang, Christopher Ré, Ion Stoica, Ce Zhang

As a result, when running OPT-175B on a single 16GB GPU, FlexGen achieves significantly higher throughput compared to state-of-the-art offloading systems, reaching a generation throughput of 1 token/s for the first time with an effective batch size of 144.

Language Modelling Large Language Model
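
The 1 token/s figure follows from a simple I/O argument: when weights are streamed from CPU or disk at every decoding step, the step cost is roughly constant in batch size, so a large effective batch amortizes the weight traffic. A back-of-the-envelope sketch (the bandwidth number is an assumption, not FlexGen's measurement):

    # Toy throughput model for weight offloading; all numbers are assumptions.
    weights_gb = 350            # ~OPT-175B in fp16 (175B params x 2 bytes)
    io_gbps = 2.0               # assumed effective CPU/disk-to-GPU bandwidth
    step_seconds = weights_gb / io_gbps  # stream all weights once per step

    for batch in (1, 16, 144):
        # Every sequence in the batch advances by one token per step.
        print(f"batch={batch:3d}: {batch / step_seconds:.3f} tokens/s")
    # batch=144 gives ~0.8 tokens/s: batching amortizes the weight traffic.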

Stochastic Gradient Descent without Full Data Shuffle

1 code implementation • 12 Jun 2022 • Lijie Xu, Shuang Qiu, Binhang Yuan, Jiawei Jiang, Cedric Renggli, Shaoduo Gan, Kaan Kara, Guoliang Li, Ji Liu, Wentao Wu, Jieping Ye, Ce Zhang

In this paper, we first conduct a systematic empirical study of existing data shuffling strategies, which reveals that all existing strategies have room for improvement: each suffers in either I/O performance or convergence rate.

Computational Efficiency
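
One way to picture the I/O-versus-convergence tradeoff studied above is a two-level block shuffle: read blocks sequentially (good I/O) while randomizing both the block order and the tuples inside a small in-memory buffer (better convergence). The sketch below is a hypothetical member of that family, not the paper's exact algorithm:

    import random

    def block_shuffle_stream(blocks, buffer_blocks=2, seed=0):
        """blocks: list of lists of examples stored contiguously on disk."""
        rng = random.Random(seed)
        order = list(range(len(blocks)))
        rng.shuffle(order)            # level 1: random block order
        buf = []
        for b in order:
            buf.extend(blocks[b])     # sequential read within each block
            if len(buf) >= buffer_blocks * len(blocks[0]):
                rng.shuffle(buf)      # level 2: shuffle inside the buffer
                yield from buf
                buf = []
        rng.shuffle(buf)
        yield from buf

    # Example: 4 blocks of 4 examples each.
    data = [[i * 4 + j for j in range(4)] for i in range(4)]
    print(list(block_shuffle_stream(data)))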

Fine-tuning Language Models over Slow Networks using Activation Compression with Guarantees

1 code implementation • 2 Jun 2022 • Jue Wang, Binhang Yuan, Luka Rimanic, Yongjun He, Tri Dao, Beidi Chen, Christopher Ré, Ce Zhang

Communication compression is a crucial technique for modern distributed learning systems to alleviate their communication bottlenecks over slower networks.
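
As a rough illustration of what activation compression looks like, the toy below quantizes activations to int8 before they cross the slow link. This is a generic sketch; the paper's method compresses activation *changes* and comes with convergence guarantees, which this toy does not implement:

    import torch

    def quantize_int8(x: torch.Tensor):
        scale = x.abs().max().clamp(min=1e-8) / 127.0
        q = torch.clamp((x / scale).round(), -127, 127).to(torch.int8)
        return q, scale               # ~4x fewer bytes on the wire than fp32

    def dequantize(q, scale):
        return q.to(torch.float32) * scale

    acts = torch.randn(8, 1024)       # activations at a pipeline boundary
    q, s = quantize_int8(acts)
    print((dequantize(q, s) - acts).abs().max())  # small reconstruction error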

Decentralized Training of Foundation Models in Heterogeneous Environments

1 code implementation • 2 Jun 2022 • Binhang Yuan, Yongjun He, Jared Quincy Davis, Tianyi Zhang, Tri Dao, Beidi Chen, Percy Liang, Christopher Ré, Ce Zhang

Our key technical contribution is a scheduling algorithm that allocates different computational "tasklets" in the training of foundation models to a group of decentralized GPU devices connected by a slow heterogeneous network.

Scheduling
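
A toy version of that placement problem: put adjacent pipeline "tasklets" on well-connected device pairs so the slowest inter-stage link is as fast as possible. The bandwidths and the brute-force search below are illustrative assumptions; the paper's scheduler uses a real cost model and does not enumerate placements:

    import itertools

    bandwidth = {("a", "b"): 10.0, ("b", "c"): 5.0, ("a", "c"): 1.0}  # GB/s
    def bw(u, v):
        return bandwidth.get((u, v)) or bandwidth.get((v, u))

    devices = ["a", "b", "c"]         # one pipeline stage per device
    best = max(itertools.permutations(devices),
               key=lambda order: min(bw(order[i], order[i + 1])
                                     for i in range(len(order) - 1)))
    print(best)  # ('a', 'b', 'c'): maximizes the slowest inter-stage link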

Persia: An Open, Hybrid System Scaling Deep Learning-based Recommenders up to 100 Trillion Parameters

1 code implementation • 10 Nov 2021 • Xiangru Lian, Binhang Yuan, XueFeng Zhu, Yulong Wang, Yongjun He, Honghuan Wu, Lei Sun, Haodong Lyu, Chengjun Liu, Xing Dong, Yiqiao Liao, Mingnan Luo, Congfei Zhang, Jingru Xie, Haonan Li, Lei Chen, Renjie Huang, Jianying Lin, Chengchun Shu, Xuezhong Qiu, Zhishan Liu, Dongying Kong, Lei Yuan, Hai Yu, Sen Yang, Ce Zhang, Ji Liu

Specifically, to ensure both training efficiency and training accuracy, we design a novel hybrid training algorithm in which the embedding layer and the dense neural network are handled by different synchronization mechanisms; we then build a system called Persia (short for parallel recommendation training system with hybrid acceleration) to support this hybrid training algorithm.

Recommendation Systems
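
The hybrid split can be shown structurally in a few lines: the huge embedding table takes cheap, barrier-free updates, while the small dense network stays synchronous. The single-process toy below only marks where Persia's async/sync boundary would sit; sizes and optimizers are assumptions:

    import torch
    import torch.nn as nn

    embedding = nn.EmbeddingBag(10_000, 16, sparse=True)  # stand-in for the huge table
    dense = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
    emb_opt = torch.optim.SGD(embedding.parameters(), lr=0.1)  # async side
    dense_opt = torch.optim.SGD(dense.parameters(), lr=0.01)   # sync side

    def train_step(ids, offsets, y):
        pred = dense(embedding(ids, offsets)).squeeze(-1)
        loss = nn.functional.mse_loss(pred, y)
        emb_opt.zero_grad(); dense_opt.zero_grad()
        loss.backward()
        emb_opt.step()    # in Persia: applied immediately, no global barrier
        # in Persia: dense gradients would be all-reduced across workers here
        dense_opt.step()
        return loss.item()

    ids = torch.randint(0, 10_000, (64,))
    offsets = torch.arange(0, 64, 4)  # 16 examples of 4 ids each
    print(train_step(ids, offsets, torch.randn(16)))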

Tensor Relational Algebra for Machine Learning System Design

no code implementations • 1 Sep 2020 • Binhang Yuan, Dimitrije Jankov, Jia Zou, Yuxin Tang, Daniel Bourgeois, Chris Jermaine

This implementation abstraction provides little built-in support for ML systems to scale past a single machine, or for handling large models with matrices or tensors that do not easily fit into the RAM of an ASIC.

BIG-bench Machine Learning
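
The relational framing is easy to demonstrate in miniature: store each matrix as a relation of indexed chunks, and matrix multiply becomes a join on the shared index followed by an aggregation. A hypothetical toy (the block size and dict representation are my assumptions):

    import numpy as np
    from collections import defaultdict

    def to_relation(M, bs):
        """Matrix -> relation mapping (block_row, block_col) to a chunk."""
        return {(i, j): M[i*bs:(i+1)*bs, j*bs:(j+1)*bs]
                for i in range(M.shape[0] // bs)
                for j in range(M.shape[1] // bs)}

    def rel_matmul(A, B):
        out = defaultdict(float)
        for (i, k), a in A.items():        # join on the shared index k ...
            for (k2, j), b in B.items():
                if k == k2:
                    out[(i, j)] = out[(i, j)] + a @ b  # ... then aggregate
        return out

    X, Y = np.random.rand(4, 4), np.random.rand(4, 4)
    R = rel_matmul(to_relation(X, 2), to_relation(Y, 2))
    print(np.allclose(np.block([[R[0, 0], R[0, 1]],
                                [R[1, 0], R[1, 1]]]), X @ Y))  # True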

A Federated Learning Framework for Healthcare IoT devices

no code implementations • 7 May 2020 • Binhang Yuan, Song Ge, Wenhui Xing

The Internet of Things (IoT) revolution has shown potential to give rise to many medical applications with access to large volumes of healthcare data collected by IoT devices.

Federated Learning

Distributed Learning of Deep Neural Networks using Independent Subnet Training

2 code implementations • 4 Oct 2019 • Binhang Yuan, Cameron R. Wolfe, Chen Dun, Yuxin Tang, Anastasios Kyrillidis, Christopher M. Jermaine

These properties allow IST to cope with distributed data, slow interconnects, and limited device memory, making it well suited to settings where distribution is mandatory.

BIG-bench Machine Learning Image Classification +2
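
In outline, one IST round partitions the hidden units across workers, lets each worker train its slice locally with no gradient exchange, and then reassembles the network before repartitioning. The numpy toy below is a hypothetical single-layer illustration (the partitioning scheme and output-layer-only update are simplifications):

    import numpy as np

    rng = np.random.default_rng(0)
    d_in, d_hidden, workers = 8, 12, 3
    W1 = rng.standard_normal((d_in, d_hidden)) * 0.1
    W2 = rng.standard_normal((d_hidden, 1)) * 0.1

    # One repartition round: disjoint neuron sets, one per worker.
    for p in np.array_split(rng.permutation(d_hidden), workers):
        w1_local, w2_local = W1[:, p], W2[p]   # worker stores only its slice
        x, y = rng.standard_normal((4, d_in)), rng.standard_normal((4, 1))
        h = np.maximum(x @ w1_local, 0.0)      # local forward pass (ReLU)
        err = h @ w2_local - y
        w2_local -= 0.01 * h.T @ err / 4       # local SGD step, no comms
        W1[:, p], W2[p] = w1_local, w2_local   # reassemble the full network
    print(W2.ravel())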

Diagnosing Cardiac Abnormalities from 12-Lead Electrocardiograms Using Enhanced Deep Convolutional Neural Networks

no code implementations • 15 Aug 2019 • Binhang Yuan, Wenhui Xing

We train an enhanced deep convolutional neural network to identify eight cardiac abnormalities from standard 12-lead electrocardiograms (ECGs), using a dataset of 14,000 ECGs.
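
A 12-lead ECG is naturally a 12-channel 1-D signal, so a model of this kind typically pairs 1-D convolutions with one output per abnormality. The sketch below is purely illustrative; layer sizes, the sampling rate, and the multi-label head are assumptions, not the paper's architecture:

    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Conv1d(12, 32, kernel_size=15, stride=2), nn.ReLU(),
        nn.Conv1d(32, 64, kernel_size=15, stride=2), nn.ReLU(),
        nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        nn.Linear(64, 8),                  # one logit per abnormality
    )
    ecg = torch.randn(2, 12, 5000)         # two 10 s recordings at 500 Hz
    print(torch.sigmoid(model(ecg)).shape) # (2, 8) per-class probabilities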
