Afterward, we discuss how to further improve the performance (latency) and the energy efficiency of Edge AI systems through HW/SW-level optimizations, such as pruning, quantization, and approximation.
Since recent works still focus on the fault-modeling and random fault injection in SNNs, the impact of memory faults in SNN hardware architectures on accuracy and the respective fault-mitigation techniques are not thoroughly explored.
Continual learning is essential for all real-world applications, as frozen pre-trained models cannot effectively deal with non-stationary data distributions.
From tiny pacemaker chips to aircraft collision avoidance systems, the state-of-the-art Cyber-Physical Systems (CPS) have increasingly started to rely on Deep Neural Networks (DNNs).
The key mechanisms of SparkXD are: (1) improving the SNN error tolerance through fault-aware training that considers bit errors from approximate DRAM, (2) analyzing the error tolerance of the improved SNN model to find the maximum tolerable bit error rate (BER) that meets the targeted accuracy constraint, and (3) energy-efficient DRAM data mapping for the resilient SNN model that maps the weights in the appropriate DRAM location to minimize the DRAM access energy.
We propose DNN-Life, a specialized aging analysis and mitigation framework for DNNs, which jointly exploits hardware- and software-level knowledge to improve the lifetime of a DNN weight memory with reduced energy overhead.
Quantization Hardware Architecture
We analyze the corresponding on-chip memory requirements and leverage it to propose a novel methodology to explore different scratchpad memory designs and their energy/area trade-offs.
Many convolutional neural network (CNN) accelerators face performance- and energy-efficiency challenges which are crucial for embedded implementations, due to high DRAM access latency and energy.
With a constant improvement in the network architectures and training methodologies, Neural Networks (NNs) are increasingly being deployed in real-world Machine Learning systems.
In this paper, we perform a comprehensive error resilience analysis of DNNs subjected to hardware faults (e. g., permanent faults) in the weight memory.
A suitable approximate multiplier is then selected for each computing element from a library of approximate multipliers in such a way that (i) one approximate multiplier serves several layers, and (ii) the overall classification error and energy consumption are minimized.
The goal is to reduce the hardware requirements of CapsNets by removing unused/redundant connections and capsules, while keeping high accuracy through tests of different learning rate policies and batch sizes.
Because these libraries contain from tens to thousands of approximate implementations for a single arithmetic operation it is intractable to find an optimal combination of approximate circuits in the library even for an application consisting of a few operations.
We perform an in-depth evaluation for a Spiking Deep Belief Network (SDBN) and a DNN having the same number of layers and neurons (to obtain a fair comparison), in order to study the efficiency of our methodology and to understand the differences between SNNs and DNNs w. r. t.
Our experimental results show that the ROMANet saves DRAM access energy by 12% for the AlexNet, by 36% for the VGG-16, and by 46% for the MobileNet, while also improving the DRAM throughput by 10%, as compared to the state-of-the-art.
By leveraging this analysis, we propose a methodology to explore different on-chip memory designs and a power-gating technique to further reduce the energy consumption, depending upon the utilization across different operations of a CapsuleNet.
To address this limitation, decision-based attacks have been proposed which can estimate the model but they require several thousand queries to generate a single untargeted attack image.
Capsule Networks preserve the hierarchical spatial relationships between objects, and thereby bears a potential to surpass the performance of traditional Convolutional Neural Networks (CNNs) in performing tasks like image classification.
Therefore, computing paradigms are evolving towards machine learning (ML)-based systems because of their ability to efficiently and accurately process the enormous amount of data.
Adversarial examples have emerged as a significant threat to machine learning algorithms, especially to the convolutional neural networks (CNNs).
In this paper, we introduce a novel technique based on the Secure Selective Convolutional (SSC) techniques in the training loop that increases the robustness of a given DNN by allowing it to learn the data distribution based on the important edges in the input image.
In this paper, we propose CapsAcc, the first specialized CMOS-based hardware architecture to perform CapsuleNets inference with high performance and energy efficiency.
Distributed, Parallel, and Cluster Computing Hardware Architecture
Most of the data manipulation attacks on deep neural networks (DNNs) during the training stage introduce a perceptible noise that can be catered by preprocessing during inference or can be identified during the validation phase.
The state-of-the-art accelerators for Convolutional Neural Networks (CNNs) typically focus on accelerating only the convolutional layers, but do not prioritize the fully-connected layers much.