Search Results for author: Shengen Yan

Found 7 papers, 5 papers with code

Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better

1 code implementation • 2 Apr 2024 • Enshu Liu, Junyi Zhu, Zinan Lin, Xuefei Ning, Matthew B. Blaschko, Sergey Yekhanin, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang

For example, LCSC achieves better performance with 1 function evaluation (NFE) than the base model with 2 NFE on consistency distillation, and reduces the NFE of a diffusion model (DM) from 15 to 9 while maintaining generation quality on CIFAR-10.
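The core operation the title describes is a weighted linear combination of checkpoints saved along the training trajectory (the paper searches for good combination coefficients; that search is omitted here). A minimal PyTorch sketch, with hypothetical checkpoint paths and weights:

```python
import torch

def combine_checkpoints(paths, weights):
    """Linearly combine model state dicts: theta = sum_i w_i * theta_i."""
    assert len(paths) == len(weights)
    combined = None
    for path, w in zip(paths, weights):
        state = torch.load(path, map_location="cpu")
        if combined is None:
            combined = {k: w * v.float() for k, v in state.items()}
        else:
            for k, v in state.items():
                combined[k] += w * v.float()
    return combined

# Hypothetical usage; the paths and weights are placeholders, and the
# weights need not be uniform or sum to one a priori.
# merged = combine_checkpoints(["ckpt_100.pt", "ckpt_200.pt"], [0.3, 0.7])
# model.load_state_dict(merged)
```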

Evaluating Quantized Large Language Models

1 code implementation • 28 Feb 2024 • Shiyao Li, Xuefei Ning, Luning Wang, Tengxuan Liu, Xiangsheng Shi, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang

Post-training quantization (PTQ) has emerged as a promising technique to reduce the cost of large language models (LLMs).

Quantization
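As background for the evaluation above, the simplest form of weight-only PTQ — symmetric round-to-nearest quantization to int8 — can be sketched in a few lines. This is a generic illustration, not the specific quantization methods the paper benchmarks:

```python
import torch

def quantize_int8(w: torch.Tensor):
    """Symmetric per-tensor round-to-nearest quantization to int8."""
    scale = w.abs().max() / 127.0
    q = torch.clamp(torch.round(w / scale), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

# Toy check of the round-trip error on a random weight matrix.
w = torch.randn(4096, 4096)
q, s = quantize_int8(w)
err = (dequantize(q, s) - w).abs().mean()
```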

LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K

1 code implementation • 6 Feb 2024 • Tao Yuan, Xuefei Ning, Dong Zhou, Zhijie Yang, Shiyao Li, Minghui Zhuang, Zheyue Tan, Zhuyu Yao, Dahua Lin, Boxun Li, Guohao Dai, Shengen Yan, Yu Wang

In contrast, the average context lengths of mainstream benchmarks are insufficient (5k-21k), and they suffer from potential knowledge leakage and inaccurate metrics, resulting in biased evaluation.

16k

A Simulation Platform for Multi-tenant Machine Learning Services on Thousands of GPUs

no code implementations • 10 Jan 2022 • Ruofan Liang, Bingsheng He, Shengen Yan, Peng Sun

Multi-tenant machine learning services have emerged as data-intensive workloads in data centers, making heavy use of GPU resources.

BIG-bench Machine Learning, Scheduling

Characterization and Prediction of Deep Learning Workloads in Large-Scale GPU Datacenters

1 code implementation • 3 Sep 2021 • Qinghao Hu, Peng Sun, Shengen Yan, Yonggang Wen, Tianwei Zhang

Modern GPU datacenters are critical for delivering Deep Learning (DL) models and services in both the research community and industry.

Management, Scheduling

Optimizing Network Performance for Distributed DNN Training on GPU Clusters: ImageNet/AlexNet Training in 1.5 Minutes

1 code implementation • 19 Feb 2019 • Peng Sun, Wansen Feng, Ruobing Han, Shengen Yan, Yonggang Wen

To address this problem, we propose a communication backend named GradientFlow for distributed DNN training, and employ a set of network optimization techniques.

Distributed, Parallel, and Cluster Computing
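A representative technique of this kind — fusing many small gradient tensors into one flat buffer so a single allreduce replaces many — can be sketched with torch.distributed as follows. The function name and bucket size are illustrative, not GradientFlow's actual API:

```python
import torch
import torch.distributed as dist

def fuse_and_allreduce(grads, bucket_bytes=64 * 1024 * 1024):
    """Illustrative gradient fusion: pack gradients into a flat buffer,
    run one allreduce per bucket, then scatter the averages back."""
    bucket = []

    def flush():
        if not bucket:
            return
        flat = torch.cat([g.reshape(-1) for g in bucket])
        dist.all_reduce(flat, op=dist.ReduceOp.SUM)
        flat /= dist.get_world_size()  # average across workers
        offset = 0
        for g in bucket:
            n = g.numel()
            g.copy_(flat[offset:offset + n].view_as(g))
            offset += n
        bucket.clear()

    size = 0
    for g in grads:
        bucket.append(g)
        size += g.numel() * g.element_size()
        if size >= bucket_bytes:
            flush()
            size = 0
    flush()
```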

Deep Image: Scaling up Image Recognition

no code implementations • 13 Jan 2015 • Ren Wu, Shengen Yan, Yi Shan, Qingqing Dang, Gang Sun

We present a state-of-the-art image recognition system, Deep Image, developed using end-to-end deep learning.

Data Augmentation
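The "Data Augmentation" tag reflects the system's heavy use of augmented training data. A generic sketch of such a pipeline using torchvision transforms; these ops are illustrative stand-ins, not the paper's exact set (which includes effects like color casting):

```python
from torchvision import transforms

# A generic training-time augmentation pipeline of the kind large-scale
# image recognition systems use.
augment = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
    transforms.ToTensor(),
])
```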
