no code implementations • 9 Mar 2025 • Qidong Su, Wei Zhao, Xin Li, Muralidhar Andoorveedu, Chenhao Jiang, Zhanda Zhu, Kevin Song, Christina Giannoula, Gennady Pekhimenko
To improve the efficiency of distributed large language model (LLM) inference, various parallelization strategies, such as tensor and pipeline parallelism, have been proposed.
1 code implementation • 19 Oct 2022 • Muralidhar Andoorveedu, Zhanda Zhu, Bojian Zheng, Gennady Pekhimenko
We implement Tempo and evaluate the throughput, memory usage, and accuracy/loss on the BERT Large pre-training task.
no code implementations • 25 Aug 2022 • Zhengyi Li, Cong Guo, Zhanda Zhu, Yangjie Zhou, Yuxian Qiu, Xiaotian Gao, Jingwen Leng, Minyi Guo
To deal with the runtime overhead, we use a coarse-grained version of the border function.
1 code implementation • 3 Mar 2021 • Minghao Gou, Hao-Shu Fang, Zhanda Zhu, Sheng Xu, Chenxi Wang, Cewu Lu
In the first stage, an encoder-decoder like convolutional neural network Angle-View Net(AVN) is proposed to predict the SO(3) orientation of the gripper at every location of the image.