Using Deep Neural Networks (DNNs) in machine learning tasks is promising for delivering high-quality results, but meeting stringent latency requirements and energy constraints is challenging because of the memory-bound and compute-bound execution patterns of DNNs.
With event-driven algorithms, especially spiking neural networks (SNNs), achieving continuous improvement in neuromorphic vision processing, a more challenging event-stream dataset is urgently needed.
We evaluated our method on three independent datasets, and the experimental results showed that it outperformed six existing state-of-the-art methods.
Since obtaining an optimal or near-optimal scheme from the POMDP framework is infeasible, we investigate the behavior of the optimal scheme in two extreme cases within the MDP framework, and leverage the intuition gained from these behaviors to propose a heuristic scheme for the realistic environment that achieves a TDR close to the maximum achievable TDR in the idealized environment.
Although spiking neural networks (SNNs) benefit from bio-plausible neural modeling, their low accuracy under common local synaptic plasticity learning rules limits their application in many practical tasks.
Biological spiking neurons with intrinsic dynamics underlie the powerful representation and learning capabilities of the brain for processing multimodal information in complex environments.
In summary, the EOQ framework is specifically designed to reduce the high cost of convolution and BN in network training, demonstrating broad application prospects for online training on resource-limited devices.
Graph Convolutional Networks (GCNs) have received significant attention from various research fields due to their excellent performance in learning graph representations.
Moreover, analysis of the activations' mean in the forward pass reveals that the self-normalization property weakens as the fan-in of each layer grows, which explains the performance degradation on large benchmarks such as ImageNet.
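A minimal numerical probe of this claim (not the paper's actual analysis): measure the empirical mean and variance of SELU activations as fan-in grows, assuming unit-variance inputs and LeCun-style initialization.

```python
# Probe how the mean/variance of SELU activations behaves as fan-in grows.
# Illustrative only; sample sizes and shapes are arbitrary assumptions.
import numpy as np

ALPHA, SCALE = 1.6732632423543772, 1.0507009873554805  # standard SELU constants

def selu(x):
    return SCALE * np.where(x > 0, x, ALPHA * (np.exp(x) - 1))

rng = np.random.default_rng(0)
for fan_in in (64, 256, 1024, 4096):
    x = rng.standard_normal((2000, fan_in))                    # unit-variance inputs
    w = rng.standard_normal((fan_in, 256)) / np.sqrt(fan_in)   # LeCun-style init
    a = selu(x @ w)
    print(f"fan_in={fan_in:5d}  mean={a.mean():+.4f}  var={a.var():.4f}")
```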
However, neural network quantization can be used to reduce the computation load while maintaining comparable accuracy and the original network structure.
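For concreteness, a minimal sketch of uniform symmetric quantization, one common recipe (the paper's exact scheme may differ): float weights are mapped to k-bit integers with a per-tensor scale.

```python
# Uniform symmetric quantization sketch: floats -> k-bit ints -> floats.
import numpy as np

def quantize(w, bits=8):
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax                              # per-tensor scale
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize(w)
print(np.abs(w - dequantize(q, s)).max())                       # quantization error
```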
To this end, we propose a threshold-dependent batch normalization (tdBN) method based on the emerging spatio-temporal backpropagation, termed "STBP-tdBN", enabling the direct training of very deep SNNs and the efficient implementation of their inference on neuromorphic hardware.
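A rough sketch of the tdBN idea as described above: normalize pre-synaptic inputs jointly over batch and time steps, then scale by the firing threshold V_th so pre-activations match the spiking dynamic range. The shapes and the alpha factor here are assumptions, not the authors' code.

```python
# Threshold-dependent batch norm sketch: statistics over time AND batch,
# rescaled by the firing threshold.
import torch

def td_batch_norm(x, gamma, beta, v_th=1.0, alpha=1.0, eps=1e-5):
    # x: (T, N, C) -- time steps, batch, channels
    mean = x.mean(dim=(0, 1), keepdim=True)
    var = x.var(dim=(0, 1), unbiased=False, keepdim=True)
    x_hat = (x - mean) / torch.sqrt(var + eps)
    return alpha * v_th * gamma * x_hat + beta

T, N, C = 4, 8, 16
x = torch.randn(T, N, C)
y = td_batch_norm(x, gamma=torch.ones(C), beta=torch.zeros(C))
```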
Graph convolutional networks (GCNs) emerge as a promising direction for learning inductive representations of graph data, which is commonly used in widespread applications such as E-commerce, social networks, and knowledge graphs.
Recurrent neural networks (RNNs) are powerful for tasks oriented to sequential data, such as natural language processing and video recognition.
We further discover, both theoretically and experimentally, that the HT format performs better at compressing weight matrices, while the TT format is better suited to compressing convolutional kernels.
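To make the TT side of this comparison concrete, a toy sketch of the TT-matrix format: an (m1*m2) x (n1*n2) weight matrix stored as two small cores with TT-rank r. The HT format (a binary-tree variant) is not shown, and all shapes here are illustrative assumptions.

```python
# Tensor-train (TT) matrix format sketch: reconstruct a dense matrix from
# two TT cores and compare parameter counts.
import numpy as np

m1, n1, m2, n2, r = 4, 4, 8, 8, 3
G1 = np.random.randn(1, m1, n1, r)            # first TT core
G2 = np.random.randn(r, m2, n2, 1)            # second TT core
W = np.einsum('aijr,rklb->ikjl', G1, G2).reshape(m1 * m2, n1 * n2)
full, tt = W.size, G1.size + G2.size
print(f"dense params: {full}, TT params: {tt} ({full / tt:.1f}x smaller)")
```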
As an emerging trend in graph-based deep learning, Graph Neural Networks (GNNs) excel for their capability to generate high-quality node feature vectors (embeddings).
We demonstrate the advantages of this model in multiple different tasks, including few-shot learning, continual learning, and fault-tolerant learning in neuromorphic vision sensors.
Neuromorphic data, which record frameless spike events, have attracted considerable attention for their spatiotemporal information components and event-driven processing fashion.
In this work, we first characterize the hybrid execution patterns of GCNs on an Intel Xeon CPU.
Recently, learning algorithms inspired by backpropagation through time have been widely introduced into SNNs to improve performance, which makes it possible to attack these models accurately given spatio-temporal gradient maps.
Powered by our metric and framework, we analyze extensive initialization schemes, normalization methods, and network structures.
In summary, this work provides a new solution for lensless imaging through scattering media using transfer learning in DNNs.
However, due to the higher dimensionality of their convolutional kernels, the space complexity of 3DCNNs is generally larger than that of traditional two-dimensional convolutional neural networks (2DCNNs).
As is well known, the huge memory and compute costs of both artificial neural networks (ANNs) and spiking neural networks (SNNs) greatly hinder their efficient deployment on edge devices.
Using Recurrent Neural Networks (RNNs) in sequence modeling tasks is promising for delivering high-quality results, but meeting stringent latency requirements is challenging because of the memory-bound execution pattern of RNNs.
To the best of our knowledge, DashNet is the first framework that can integrate and process ANNs and SNNs in a hybrid paradigm, providing a novel solution that achieves both effectiveness and efficiency for high-speed object tracking.
In this way, all operations in training and inference can be bit-wise, pushing towards faster processing speed, lower memory cost, and higher energy efficiency.
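A toy sketch of why bit-wise operations suffice here: with weights and activations in {-1, +1} packed as bits, a dot product reduces to XNOR plus popcount. The packing scheme below is an illustrative assumption.

```python
# Binarized dot product via XNOR + popcount, checked against the float result.
import numpy as np

def binary_dot(a_bits, w_bits, n):
    # XNOR of the packed words; positions where signs match contribute +1,
    # mismatches contribute -1, so the sum is 2*matches - n.
    xnor = ~(a_bits ^ w_bits)
    matches = bin(xnor & ((1 << n) - 1)).count("1")
    return 2 * matches - n

rng = np.random.default_rng(0)
n = 16
a = rng.choice([-1, 1], n)
w = rng.choice([-1, 1], n)
pack = lambda v: int("".join("1" if x > 0 else "0" for x in v), 2)
assert binary_dot(pack(a), pack(w), n) == int(a @ w)
```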
As a promising solution for boosting the performance of distance-related algorithms (e.g., K-means and KNN), FPGA-based acceleration attracts much attention but also comes with numerous challenges.
We identify that effectiveness calls for low data correlation, while efficiency calls for a regular execution pattern.
As neural networks continue their reach into nearly every aspect of software operations, the details of those networks become an increasingly sensitive subject.
Increasing the sparsity granularity can lead to better hardware utilization, but it compromises the sparsity that can be achieved while maintaining accuracy.
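A small sketch of this trade-off: at the same target sparsity, element-wise pruning keeps the largest individual weights, while block-wise pruning keeps a hardware-friendly structure but is forced to drop some large weights. Block size and pruning criterion here are illustrative assumptions.

```python
# Fine-grained vs coarse-grained (block) pruning at equal sparsity.
import numpy as np

def prune_elementwise(w, sparsity):
    t = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) > t, w, 0.0)

def prune_blockwise(w, sparsity, block=4):
    h, wd = w.shape
    blocks = np.abs(w).reshape(h // block, block, wd // block, block).mean(axis=(1, 3))
    t = np.quantile(blocks, sparsity)
    mask = np.kron(blocks > t, np.ones((block, block)))
    return w * mask

rng = np.random.default_rng(0)
w = rng.standard_normal((16, 16))
# Element-wise pruning retains more total weight magnitude than block pruning.
print(np.abs(prune_elementwise(w, 0.75)).sum(), np.abs(prune_blockwise(w, 0.75)).sum())
```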
For example, we improve the perplexity per word (PPW) of a ternary LSTM on the Penn Tree Bank (PTB) corpus from 126 (the state-of-the-art result, to the best of our knowledge) to 110.3, against a full-precision model at 97.2, and of a ternary GRU from 142 to 113.5, against a full-precision model at 102.7.
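For readers unfamiliar with ternarization, a minimal sketch of one common threshold-based recipe (TWN-style; not necessarily the exact scheme behind the numbers above): weights are mapped to {-a, 0, +a} with a per-tensor scaling factor.

```python
# Threshold-based ternarization sketch: weights -> {-alpha, 0, +alpha}.
import numpy as np

def ternarize(w, delta_ratio=0.7):
    delta = delta_ratio * np.abs(w).mean()                      # pruning threshold
    mask = np.abs(w) > delta
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0       # scale of kept weights
    return alpha * np.sign(w) * mask

w = np.random.randn(256, 256)
wt = ternarize(w)
print(np.unique(np.round(wt, 4)))                               # three levels only
```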
We propose to execute deep neural networks (DNNs) with a dynamic and sparse graph (DSG) structure for compressed memory and accelerated execution during both training and inference.
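One plausible reading of "dynamic and sparse graph", sketched below as an illustrative guess rather than the authors' construction: at each layer, keep only the top-k activations per sample, so both the forward and backward passes touch a sparse subgraph that changes with the input.

```python
# Dynamic top-k activation masking: gradients flow only through kept units.
import torch

def topk_mask(x, k):
    # zero all but the k largest-magnitude activations in each row
    idx = x.abs().topk(k, dim=1).indices
    mask = torch.zeros_like(x).scatter_(1, idx, 1.0)
    return x * mask

h = torch.randn(8, 512, requires_grad=True)
topk_mask(h, k=64).sum().backward()
print((h.grad != 0).float().mean())   # ~ 64/512 of entries receive gradient
```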
Spiking neural networks (SNNs), which enable energy-efficient implementation on emerging neuromorphic hardware, are gaining increasing attention.
Crossbar-architecture-based devices have been widely adopted in neural network accelerators, taking advantage of their high efficiency on vector-matrix multiplication (VMM) operations.
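An illustrative sketch of why crossbars suit VMM: with cell conductances G acting as the weight matrix and input voltages V applied on the word lines, each bit line sums currents so that I = G^T V is computed in a single analog step. All values and shapes below are arbitrary.

```python
# Crossbar VMM via Ohm's law and Kirchhoff's current law (conceptual model).
import numpy as np

rng = np.random.default_rng(0)
G = rng.uniform(0.0, 1e-4, size=(128, 64))   # conductance of each crossbar cell (S)
V = rng.uniform(0.0, 0.5, size=128)          # voltages on the 128 word lines (V)
I = G.T @ V                                  # currents summed on the 64 bit lines (A)
print(I.shape)                               # (64,) -- a full VMM in "one shot"
```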
Batch Normalization (BN) has been proven to be quite effective at accelerating and improving the training of deep neural networks (DNNs).
One part of this was the transmission of images in a one-time-pad configuration from China to Austria as well as from Austria to China.
The idea is to shift traffic from a congested cell to its adjacent under-utilized cells by leveraging inter-cell D2D communication, so that the traffic can be served without extra spectrum, effectively improving the temporal efficiency of spectrum usage.
By simultaneously considering the layer-by-layer spatial domain (SD) and the timing-dependent temporal domain (TD) in the training phase, together with an approximated derivative for the spike activity, we propose a spatio-temporal backpropagation (STBP) training framework that does not require any complicated techniques.
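A minimal sketch of the "approximated derivative for the spike activity" idea: a Heaviside spike in the forward pass, with a rectangular surrogate gradient in the backward pass. The window width `a` is an assumed hyperparameter, and this is one common surrogate, not necessarily the paper's exact choice.

```python
# Surrogate-gradient spike activation: hard threshold forward,
# rectangular-window derivative backward.
import torch

class SpikeAct(torch.autograd.Function):
    @staticmethod
    def forward(ctx, v, threshold=1.0, a=1.0):
        ctx.save_for_backward(v)
        ctx.threshold, ctx.a = threshold, a
        return (v >= threshold).float()          # non-differentiable spike

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        # rectangular window around the threshold approximates dS/dV
        surrogate = (torch.abs(v - ctx.threshold) < ctx.a / 2).float() / ctx.a
        return grad_out * surrogate, None, None

v = torch.randn(8, requires_grad=True)
SpikeAct.apply(v).sum().backward()
print(v.grad)
```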
In this way, we build a unified framework that subsumes binary and ternary networks as special cases, and under which a heuristic algorithm is provided at https://github.com/AcrossV/Gated-XNOR.
Conventional single-image-based localization methods usually fail to localize a query image when there are large variations between the query image and the pre-built scene.