no code implementations • 10 Oct 2024 • Ethan He, Abhinav Khattar, Ryan Prenger, Vijay Korthikanti, Zijie Yan, Tong Liu, Shiqing Fan, Ashwath Aithal, Mohammad Shoeybi, Bryan Catanzaro
Upcycling pre-trained dense language models into sparse mixture-of-experts (MoE) models is an efficient approach to increasing the capacity of already-trained models.
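As a minimal sketch of the upcycling idea (not the paper's Megatron-Core implementation), the PyTorch module below copies a pretrained dense FFN into each expert of a top-1 MoE layer and adds a freshly initialized router; the class name, `num_experts`, and the routing scheme are illustrative assumptions.

```python
import copy
import torch
import torch.nn as nn

class UpcycledMoELayer(nn.Module):
    """Hypothetical sketch: turn one pretrained dense FFN into a top-1 MoE layer."""

    def __init__(self, dense_ffn: nn.Module, hidden_size: int, num_experts: int = 8):
        super().__init__()
        # Each expert starts as a copy of the pretrained dense FFN weights.
        self.experts = nn.ModuleList(
            [copy.deepcopy(dense_ffn) for _ in range(num_experts)]
        )
        # The router is new and randomly initialized; it is trained jointly afterwards.
        self.router = nn.Linear(hidden_size, num_experts, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, hidden). Route each token to its top-1 expert.
        scores = torch.softmax(self.router(x), dim=-1)   # (tokens, num_experts)
        top_score, top_idx = scores.max(dim=-1)          # (tokens,)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e
            if mask.any():
                out[mask] = top_score[mask].unsqueeze(-1) * expert(x[mask])
        return out
```

After this initialization the model starts from the dense checkpoint's behavior, and continued training lets the experts and router diverge.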
no code implementations • 3 Mar 2021 • Shiqing Fan, Ye Luo
We then conducted a motion-blur image generation experiment on a general facial dataset, used the resulting pairs of blurred and sharp face images to train and test the processor GAN, and provided visual results.
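As a hedged illustration of how blurred/sharp training pairs can be synthesized from sharp face images (the function name, kernel length, and angle parameters below are assumptions, not the paper's generator):

```python
import numpy as np
from scipy.ndimage import convolve

def motion_blur_pair(sharp: np.ndarray, length: int = 9, angle_deg: float = 0.0):
    """Make a (blurred, sharp) training pair from one sharp grayscale image (H, W)."""
    # Build a simple linear motion-blur kernel of the given length and angle.
    kernel = np.zeros((length, length), dtype=np.float64)
    center = length // 2
    theta = np.deg2rad(angle_deg)
    for t in np.linspace(-center, center, length):
        r = int(round(center + t * np.sin(theta)))
        c = int(round(center + t * np.cos(theta)))
        kernel[r, c] = 1.0
    kernel /= kernel.sum()
    # Convolve the sharp image with the kernel to synthesize the blurred input.
    blurred = convolve(sharp.astype(np.float64), kernel, mode="reflect")
    return blurred, sharp
```

Pairs produced this way can then serve as (input, target) examples for supervised GAN training and evaluation.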
no code implementations • 3 Mar 2021 • Shiqing Fan, Liu Liying, Ye Luo
Convolutional neural networks (CNNs) have been used in many machine learning fields.
no code implementations • 12 Feb 2021 • Ye Luo, Shiqing Fan
We present a new model of neural networks called Min-Max-Plus Neural Networks (MMP-NNs) based on operations in tropical arithmetic.
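The tropical operations behind such layers replace addition with min (or max) and multiplication with ordinary addition. The NumPy sketch below shows the standard min-plus and max-plus "matrix-vector products" and a toy two-layer composition; it is illustrative rather than the paper's exact architecture.

```python
import numpy as np

def min_plus(W: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Tropical (min-plus) matrix-vector product: y_i = min_j (W_ij + x_j)."""
    return np.min(W + x[np.newaxis, :], axis=1)

def max_plus(W: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Tropical (max-plus) matrix-vector product: y_i = max_j (W_ij + x_j)."""
    return np.max(W + x[np.newaxis, :], axis=1)

# Illustrative two-layer min-max-plus composition on a toy input.
rng = np.random.default_rng(0)
x = rng.normal(size=4)
W1 = rng.normal(size=(8, 4))   # min-plus layer weights
W2 = rng.normal(size=(3, 8))   # max-plus layer weights
y = max_plus(W2, min_plus(W1, x))
print(y.shape)  # (3,)
```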
no code implementations • 8 Jul 2020 • Siyu Wang, Yi Rong, Shiqing Fan, Zhen Zheng, Lansong Diao, Guoping Long, Jun Yang, Xiaoyong Liu, Wei Lin
The last decade has witnessed growth in the computational requirements for training deep neural networks.