no code implementations • 28 Nov 2022 • Tianjin Huang, Tianlong Chen, Meng Fang, Vlado Menkovski, Jiaxu Zhao, Lu Yin, Yulong Pei, Decebal Constantin Mocanu, Zhangyang Wang, Mykola Pechenizkiy, Shiwei Liu
Recent works have impressively demonstrated that randomly initialized convolutional neural networks (CNNs) contain subnetworks that can match the performance of fully trained dense networks at initialization, without any optimization of the network's weights (i.e., untrained networks).
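As a toy illustration of what "untrained" means here, the sketch below (PyTorch, with an arbitrary 50% sparsity) applies a binary mask to a randomly initialized conv layer whose weights are never trained. Strong-lottery-ticket methods *search* for a good mask; drawing it at random, as done here, only shows the mechanics.

```python
import torch
import torch.nn as nn

# Toy sketch only: a random binary mask over a randomly initialized,
# never-trained conv layer. Real methods search for the mask instead.

def apply_random_mask(layer: nn.Conv2d, sparsity: float = 0.5) -> None:
    """Zero out a random fraction of the layer's untrained weights in place."""
    with torch.no_grad():
        mask = (torch.rand_like(layer.weight) > sparsity).float()
        layer.weight.mul_(mask)

conv = nn.Conv2d(3, 16, kernel_size=3)  # random init, no training anywhere
apply_random_mask(conv, sparsity=0.5)
print(conv.weight.count_nonzero().item(), "/", conv.weight.numel())
```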
Transformers have quickly shone in the computer vision world since the emergence of Vision Transformers (ViTs).
Recent works on sparse neural network training (sparse training) have shown that a compelling trade-off between performance and efficiency can be achieved by training intrinsically sparse neural networks from scratch.
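A minimal sketch of what training an intrinsically sparse network from scratch can look like, assuming the simplest static variant: a random mask is fixed at initialization and re-applied after every optimizer step so pruned weights stay at zero. The `make_masks` helper, the 90% sparsity, and the dummy data are illustrative, not taken from any specific paper above.

```python
import torch
import torch.nn as nn

# Static sparse training sketch: fix a random mask at init, then enforce it
# after each update so the network stays sparse throughout training.

def make_masks(model: nn.Module, sparsity: float = 0.9) -> dict:
    return {
        name: (torch.rand_like(p) > sparsity).float()
        for name, p in model.named_parameters() if p.dim() > 1  # skip biases
    }

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
masks = make_masks(model)
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))  # dummy batch
for step in range(10):
    loss = nn.functional.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():  # re-apply the mask so pruned weights stay at zero
        for name, p in model.named_parameters():
            if name in masks:
                p.mul_(masks[name])
```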
Even more unconventionally, our proposed method enables directly training sparse unbalanced GANs with an extremely sparse generator from scratch.
In this paper, we focus on sparse training and highlight a perhaps counter-intuitive finding: random pruning at initialization can be quite powerful for the sparse training of modern neural networks.
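One concrete, hedged reading of "random pruning at initialization": draw a random mask per layer, with layer-wise densities set by an Erdős–Rényi-style proportionality rule (density proportional to the sum of a weight tensor's dimensions divided by its parameter count). The rescaling to hit a global density below is a simplified assumption, not the exact published recipe.

```python
import torch
import torch.nn as nn

# Random pruning at initialization with Erdos-Renyi-style layer-wise
# densities, rescaled (approximately) to a target global density.

def erk_densities(model: nn.Module, global_density: float = 0.2) -> dict:
    raw, sizes = {}, {}
    for name, m in model.named_modules():
        if isinstance(m, (nn.Linear, nn.Conv2d)):
            w = m.weight
            raw[name] = sum(w.shape) / w.numel()  # ER(K) proportionality term
            sizes[name] = w.numel()
    total = sum(sizes.values())
    scale = global_density * total / sum(raw[n] * sizes[n] for n in raw)
    return {n: min(1.0, scale * raw[n]) for n in raw}

def random_prune(model: nn.Module, densities: dict) -> None:
    for name, m in model.named_modules():
        if name in densities:
            with torch.no_grad():
                mask = (torch.rand_like(m.weight) < densities[name]).float()
                m.weight.mul_(mask)

model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Flatten(),
                      nn.Linear(16 * 30 * 30, 10))
random_prune(model, erk_densities(model, global_density=0.2))
```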
Federated learning (FL) is particularly vulnerable to heterogeneously distributed data, since a common global model in FL may not adapt to the heterogeneous data distribution of each user.
To counter this issue, personalized FL (PFL) was proposed to produce dedicated local models for each individual user.
Perhaps most importantly, we find that, instead of inheriting parameters from expensive pre-trained GANs, directly training sparse GANs from scratch can be a much more efficient solution.
One of the major challenges in supervised learning approaches is expressing and collecting the rich knowledge that experts have about the meaning present in image data.
Our framework, FreeTickets, is defined as the ensemble of these relatively cheap sparse subnetworks.
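As a rough sketch of the ensembling step, the code below averages the softmax outputs of a few sparse subnetworks. The `sparse_copy` stand-in produces them by random masking, whereas FreeTickets obtains its subnetworks cheaply during sparse training; that part is deliberately abstracted away here.

```python
import copy
import torch
import torch.nn as nn

# Ensemble-of-sparse-subnetworks sketch: average softmax predictions of
# several masked copies of a base model at inference time.

def sparse_copy(model: nn.Module, sparsity: float = 0.8) -> nn.Module:
    m = copy.deepcopy(model)
    with torch.no_grad():
        for p in m.parameters():
            if p.dim() > 1:
                p.mul_((torch.rand_like(p) > sparsity).float())
    return m

base = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
ensemble = [sparse_copy(base) for _ in range(3)]  # three cheap subnetworks

x = torch.randn(4, 784)
probs = torch.stack([m(x).softmax(dim=-1) for m in ensemble]).mean(dim=0)
pred = probs.argmax(dim=-1)
```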
Works on the lottery ticket hypothesis (LTH) and single-shot network pruning (SNIP) have recently drawn considerable attention to post-training pruning (iterative magnitude pruning) and before-training pruning (pruning at initialization).
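For concreteness, here is a minimal sketch of the SNIP-style before-training criterion: weights are scored by |gradient × weight| on a single mini-batch, and only the global top-k survive. The dummy batch, the two-layer model, and the 10% keep-ratio are assumptions for illustration.

```python
import torch
import torch.nn as nn

# SNIP-style pruning at initialization: one backward pass on one batch
# gives a sensitivity score per weight; keep the global top-k.

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
x, y = torch.randn(64, 784), torch.randint(0, 10, (64,))

loss = nn.functional.cross_entropy(model(x), y)
loss.backward()  # single backward pass provides the sensitivity signal

weights = [p for p in model.parameters() if p.dim() > 1]
scores = torch.cat([(p.grad * p).abs().flatten() for p in weights])
k = int(0.1 * scores.numel())              # keep 10% of weights
threshold = scores.topk(k).values.min()

with torch.no_grad():
    for p in weights:
        p.mul_(((p.grad * p).abs() >= threshold).float())
```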
By starting from a random sparse network and continuously exploring sparse connectivities during training, we can perform Over-Parameterization in the space-time manifold, closing the gap in expressibility between sparse training and dense training.
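A rough sketch of the prune-and-regrow cycle that drives this kind of connectivity exploration, assuming magnitude-based pruning and random regrowth (gradient-based regrowth is a common alternative); the 90% sparsity and 30% drop rate are illustrative, not from the paper.

```python
import torch

# Prune-and-regrow step: drop the smallest-magnitude active weights, grow
# the same number of inactive connections, keeping total sparsity fixed.

def prune_and_regrow(weight: torch.Tensor, mask: torch.Tensor,
                     drop_rate: float = 0.3) -> torch.Tensor:
    with torch.no_grad():
        active = mask.bool()
        n_drop = int(drop_rate * active.sum().item())
        # prune: smallest-magnitude active weights
        mag = weight.abs().masked_fill(~active, float("inf"))
        drop_idx = mag.flatten().topk(n_drop, largest=False).indices
        mask.view(-1)[drop_idx] = 0.0
        # regrow: random currently-inactive positions
        inactive = (mask.view(-1) == 0).nonzero().squeeze(1)
        grow_idx = inactive[torch.randperm(len(inactive))[:n_drop]]
        mask.view(-1)[grow_idx] = 1.0
        weight.view(-1)[grow_idx] = 0.0  # new connections start at zero
        weight.mul_(mask)
    return mask

w = torch.randn(256, 784)
mask = (torch.rand_like(w) > 0.9).float()  # 90% sparse at initialization
mask = prune_and_regrow(w, mask)
```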
Quantum key distribution (QKD) employing orbital angular momentum (OAM) for high-dimensional encoding enhances the system security and information capacity between two communicating parties.
However, comparing different sparse topologies and determining how sparse topologies evolve during training, especially when sparse structure optimization is involved, remain challenging open questions.
Concretely, by exploiting the cosine similarity metric to measure the importance of connections, our proposed method, Cosine similarity-based and Random Topology Exploration (CTRE), evolves the topology of sparse neural networks by adding the most important connections to the network without computing dense gradients in the backward pass.
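The sketch below shows one way such a cosine-similarity score could be computed between candidate neuron pairs from a mini-batch of activations, without any dense backward pass. It is an interpretation of the idea rather than the paper's exact procedure; the shapes and the `connection_scores` helper are hypothetical.

```python
import torch
import torch.nn.functional as F

# Score every candidate connection between two layers by the |cosine
# similarity| of the activations of the neurons it would join.

def connection_scores(acts_in: torch.Tensor, acts_out: torch.Tensor) -> torch.Tensor:
    """acts_in: (batch, n_in), acts_out: (batch, n_out) -> scores (n_out, n_in)."""
    a = F.normalize(acts_in, dim=0)   # unit-norm over the batch dimension
    b = F.normalize(acts_out, dim=0)
    return (b.t() @ a).abs()          # |cosine| between every neuron pair

acts_in, acts_out = torch.randn(64, 784), torch.randn(64, 256)
scores = connection_scores(acts_in, acts_out)

# grow the k highest-scoring connections that are not yet in the mask
mask = torch.zeros(256, 784)
k = 100
top = scores.masked_fill(mask.bool(), -1.0).flatten().topk(k).indices
mask.view(-1)[top] = 1.0
```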
Despite the success of ANNs, training and deploying modern ANNs on commodity hardware remains challenging due to ever-increasing model sizes and unprecedented growth in data volumes.
However, LSTMs tend to be memory-bandwidth limited in realistic applications and require prohibitively long training and inference times as model sizes keep increasing.