Search Results for author: Lingxiao Ma

Found 10 papers, 4 papers with code

LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration

no code implementations • 12 Aug 2024 • Zhiwen Mo, Lei Wang, Jianyu Wei, Zhichen Zeng, Shijie Cao, Lingxiao Ma, Naifeng Jing, Ting Cao, Jilong Xue, Fan Yang, Mao Yang

LUT Tensor Core then proposes a hardware design featuring an elongated tiling shape to enhance table reuse and bit-serial execution to support various precision combinations in mpGEMM (mixed-precision general matrix multiplication).

Language Modelling • Large Language Model
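
The lookup-table idea: for a small group of low-bit weights, every partial sum that group can produce against an activation tile is precomputed once, so the mixed-precision multiply-accumulate collapses into table lookups. Below is a minimal NumPy sketch of LUT-based mpGEMM for 1-bit weights; the function name, weight encoding, and group size g are illustrative assumptions, not the paper's hardware design.

    import numpy as np

    def lut_mpgemm_1bit(A, W_bits, g=4):
        """Toy lookup-table mpGEMM: float activations x 1-bit {-1,+1} weights.

        A: (M, K) activations; W_bits: (K, N) in {0, 1}, where 0 -> -1, 1 -> +1.
        One table entry covers a group of g weights.
        """
        M, K = A.shape
        _, N = W_bits.shape
        assert K % g == 0
        # All 2**g sign patterns a group of g one-bit weights can take.
        patterns = np.array([[1.0 if (p >> i) & 1 else -1.0 for i in range(g)]
                             for p in range(1 << g)])
        C = np.zeros((M, N))
        for k0 in range(0, K, g):
            table = A[:, k0:k0 + g] @ patterns.T   # (M, 2**g) partial sums, built once
            idx = np.zeros(N, dtype=np.int64)      # pack each column's g weight bits
            for i in range(g):
                idx |= W_bits[k0 + i, :].astype(np.int64) << i
            C += table[:, idx]                     # lookups replace multiplies
        return C

The paper's elongated tiling shape pursues the same goal as reusing one table across all N weight columns here: amortizing the table-build cost over as many lookups as possible.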

Scaling Deep Learning Computation over the Inter-Core Connected Intelligence Processor with T10

no code implementations • 9 Aug 2024 • Yiqi Liu, Yuqi Xue, Yu Cheng, Lingxiao Ma, Ziming Miao, Jilong Xue, Jian Huang

As AI chips incorporate numerous parallel cores to scale deep learning (DL) computation, inter-core communication has recently been enabled by high-bandwidth, low-latency interconnect links on the chip (e.g., the Graphcore IPU).
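
For intuition, here is a toy NumPy simulation of a compute-shift style matrix multiply on a ring of such cores: shards stay core-local, and at each step only neighboring cores exchange data over the interconnect. The ring schedule and all names are assumptions for illustration, not T10's actual abstractions.

    import numpy as np

    def ring_compute_shift_matmul(A_shards, B_shards):
        """Toy compute-shift matmul on a ring of P simulated cores.

        Core i holds activation rows A_shards[i] and weight columns B_shards[i].
        Each step every core multiplies its resident shards, then passes its
        weight shard one hop around the ring, so no core needs global memory.
        """
        P = len(A_shards)
        cols = [b.shape[1] for b in B_shards]
        off = np.cumsum([0] + cols)            # column offset of each weight shard
        outs = [np.zeros((a.shape[0], off[-1])) for a in A_shards]
        owner = list(range(P))                 # which weight shard each core holds
        for _ in range(P):
            for i in range(P):
                j = owner[i]
                outs[i][:, off[j]:off[j + 1]] = A_shards[i] @ B_shards[j]
            owner = [owner[(i - 1) % P] for i in range(P)]  # shift one hop
        return outs                            # outs[i] == A_shards[i] @ full B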

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

2 code implementations • 27 Feb 2024 • Shuming Ma, Hongyu Wang, Lingxiao Ma, Lei Wang, Wenhui Wang, Shaohan Huang, Li Dong, Ruiping Wang, Jilong Xue, Furu Wei

Recent research, such as BitNet, is paving the way for a new era of 1-bit Large Language Models (LLMs).
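
The model studied here, BitNet b1.58, constrains every weight to the ternary set {-1, 0, 1}. A minimal NumPy sketch of the absmean quantizer the paper describes, i.e., round-and-clip after scaling by the mean absolute weight (eps and dtype choices here are assumptions):

    import numpy as np

    def absmean_ternary_quantize(W, eps=1e-5):
        """Ternary {-1, 0, +1} weight quantization in the style of BitNet b1.58."""
        scale = np.mean(np.abs(W)) + eps           # absmean scaling factor
        W_t = np.clip(np.round(W / scale), -1, 1)  # each weight -> -1, 0, or +1
        return W_t.astype(np.int8), scale          # dequantize as W_t * scale

    # Usage: W is approximated by W_t * s.
    W = np.random.randn(4, 4).astype(np.float32)
    W_t, s = absmean_ternary_quantize(W)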

BitNet: Scaling 1-bit Transformers for Large Language Models

2 code implementations • 17 Oct 2023 • Hongyu Wang, Shuming Ma, Li Dong, Shaohan Huang, Huaijie Wang, Lingxiao Ma, Fan Yang, Ruiping Wang, Yi Wu, Furu Wei

The increasing size of large language models has posed challenges for deployment and raised concerns about environmental impact due to high energy consumption.

Language Modelling • Quantization
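
The original BitNet instead binarizes weights to ±1. A sketch assuming the sign-plus-scale formulation described in the paper, with centering by the mean and a per-tensor dequantization scale:

    import numpy as np

    def sign_binarize(W):
        """1-bit weight binarization in the style of the original BitNet."""
        alpha = W.mean()                                   # centering term
        W_b = np.where(W - alpha >= 0, 1, -1).astype(np.int8)
        beta = np.abs(W).mean()                            # per-tensor scale
        return W_b, beta                                   # W ~ W_b * beta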

FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement

no code implementations • 8 Apr 2023 • Xiaonan Nie, Xupeng Miao, Zilong Wang, Zichao Yang, Jilong Xue, Lingxiao Ma, Gang Cao, Bin Cui

We first present an empirical analysis of the problems and opportunities in training MoE models, which motivates us to address the routing imbalance and fluctuation problems with a dynamic expert management and device placement mechanism.

Scheduling
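
As a toy illustration of load-driven device placement, the sketch below greedily assigns the heaviest-routed experts to the least-loaded devices. This is a longest-processing-time heuristic, not FlexMoE's actual mechanism, which also adjusts expert replication dynamically.

    import numpy as np

    def rebalance_experts(loads, n_devices):
        """Greedy expert-to-device placement driven by recent routing load."""
        order = np.argsort(loads)[::-1]        # heaviest experts first
        device_load = np.zeros(n_devices)
        placement = {}
        for e in order:
            d = int(np.argmin(device_load))    # least-loaded device so far
            placement[int(e)] = d
            device_load[d] += loads[e]
        return placement

    # Usage: trigger rebalancing when device loads drift out of balance.
    loads = np.array([900, 120, 400, 80, 610, 300, 75, 260])
    print(rebalance_experts(loads, n_devices=4))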

Architectural Implications of Graph Neural Networks

no code implementations • 2 Sep 2020 • Zhihui Zhang, Jingwen Leng, Lingxiao Ma, Youshan Miao, Chao Li, Minyi Guo

Graph neural networks (GNNs) represent an emerging line of deep learning models that operate on graph structures.
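
A typical GNN layer gathers features from each node's neighbors and then applies a dense transform; the graph-dependent, irregular memory accesses of the gather phase are what make the architectural behavior distinctive. A minimal mean-aggregation layer in NumPy (generic message passing, not tied to the paper's specific workloads):

    import numpy as np

    def gnn_layer(H, edges, W):
        """One mean-aggregation message-passing layer.

        H: (N, d_in) node features; edges: (E, 2) (src, dst); W: (d_in, d_out).
        """
        agg = np.zeros_like(H)
        deg = np.zeros(H.shape[0])
        for s, d in edges:                      # gather: irregular neighbor reads
            agg[d] += H[s]
            deg[d] += 1
        agg /= np.maximum(deg, 1)[:, None]      # mean over in-neighbors
        return np.maximum(agg @ W, 0)           # dense transform + ReLU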

Towards Efficient Large-Scale Graph Neural Network Computing

no code implementations • 19 Oct 2018 • Lingxiao Ma, Zhi Yang, Youshan Miao, Jilong Xue, Ming Wu, Lidong Zhou, Yafei Dai

This evolution has led to large, irregular, and sparse graph-based models that go beyond what existing deep learning frameworks are designed for.

Graph Neural Network • Graph Partitioning +1
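
NGra programs such models as a vertex-centric dataflow; the toy NumPy layer below follows the spirit of its Scatter / ApplyEdge / Gather / ApplyVertex stages, with placeholder per-edge and per-vertex functions:

    import numpy as np

    def saga_nn_layer(H, edges, W):
        """Toy layer in Scatter / ApplyEdge / Gather / ApplyVertex stages.

        H: (N, d) node states; edges: (E, 2) (src, dst); W: (d, d_out).
        """
        src, dst = edges[:, 0], edges[:, 1]
        msgs = H[src]                           # Scatter: each edge reads its source
        msgs = 0.5 * msgs                       # ApplyEdge: placeholder edge function
        acc = np.zeros((H.shape[0], H.shape[1]))
        np.add.at(acc, dst, msgs)               # Gather: accumulate at destinations
        return np.tanh(acc @ W)                 # ApplyVertex: placeholder update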
