Search Results for author: Shiqing Fan

Found 5 papers, 0 papers with code

Upcycling Large Language Models into Mixture of Experts

no code implementations • 10 Oct 2024 • Ethan He, Abhinav Khattar, Ryan Prenger, Vijay Korthikanti, Zijie Yan, Tong Liu, Shiqing Fan, Ashwath Aithal, Mohammad Shoeybi, Bryan Catanzaro

Upcycling pre-trained dense language models into sparse mixture-of-experts (MoE) models is an efficient approach to increase the model capacity of already trained models.

MMLU

Deblurring Processor for Motion-Blurred Faces Based on Generative Adversarial Networks

no code implementations • 3 Mar 2021 • Shiqing Fan, Ye Luo

We then generated motion-blurred images from a general facial data set, used the resulting pairs of blurred and sharp face images to train and test the processor GAN, and present visual examples of the results.

Deblurring, Face Detection (+3 more)

Min-Max-Plus Neural Networks

no code implementations • 12 Feb 2021 • Ye Luo, Shiqing Fan

We present a new model of neural networks called Min-Max-Plus Neural Networks (MMP-NNs) based on operations in tropical arithmetic.
