Search Results for author: Christos-Savvas Bouganis

Found 29 papers, 10 papers with code

Multi-Precision Policy Enforced Training (MuPPET): A Precision-Switching Strategy for Quantised Fixed-Point Training of CNNs

no code implementations • ICML 2020 • Aditya Rajagopal, Diederik Vink, Stylianos Venieris, Christos-Savvas Bouganis

Large-scale convolutional neural networks (CNNs) suffer from very long training times, spanning from hours to weeks, limiting the productivity and experimentation of deep learning practitioners.

SMOF: Streaming Modern CNNs on FPGAs with Smart Off-Chip Eviction

no code implementations • 27 Mar 2024 • Petros Toupas, Zhewen Yu, Christos-Savvas Bouganis, Dimitrios Tzovaras

Convolutional Neural Networks (CNNs) have demonstrated their effectiveness in numerous vision tasks.

Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It

no code implementations • 19 Mar 2024 • Guoxuan Xia, Olivier Laurent, Gianni Franchi, Christos-Savvas Bouganis

We first demonstrate empirically across a range of tasks and architectures that label smoothing (LS) leads to a consistent degradation in selective classification (SC).

SATAY: A Streaming Architecture Toolflow for Accelerating YOLO Models on FPGA Devices

no code implementations • 4 Sep 2023 • Alexander Montgomerie-Corcoran, Petros Toupas, Zhewen Yu, Christos-Savvas Bouganis

The YOLO family of models is considered the most efficient for object detection, having only a single model pass.

Autonomous Vehicles, Object +2

Mixed-TD: Efficient Neural Network Accelerator with Layer-Specific Tensor Decomposition

1 code implementation • 8 Jun 2023 • Zhewen Yu, Christos-Savvas Bouganis

The deployment of neural networks to such dataflow architecture accelerators is usually hindered by the available on-chip memory as it is desirable to preload the weights of neural networks on-chip to maximise the system performance.

Efficient Neural Network, Quantization +1

fpgaHART: A toolflow for throughput-oriented acceleration of 3D CNNs for HAR onto FPGAs

no code implementations • 31 May 2023 • Petros Toupas, Christos-Savvas Bouganis, Dimitrios Tzovaras

A variety of 3D CNN models were evaluated using the proposed toolflow on multiple FPGA devices, demonstrating its potential to deliver competitive performance compared to earlier hand-tuned and model-specific designs.

Action Recognition, Autonomous Vehicles +3

FMM-X3D: FPGA-based modeling and mapping of X3D for Human Action Recognition

no code implementations • 29 May 2023 • Petros Toupas, Christos-Savvas Bouganis, Dimitrios Tzovaras

3D Convolutional Neural Networks are gaining increasing attention from researchers and practitioners and have found applications in many domains, such as surveillance systems, autonomous vehicles, human monitoring systems, and video retrieval.

Action Recognition, Autonomous Vehicles +3

ATHEENA: A Toolflow for Hardware Early-Exit Network Automation

no code implementations • 17 Apr 2023 • Benjamin Biggs, Christos-Savvas Bouganis, George A. Constantinides

Additionally, the toolflow can achieve a throughput matching the same baseline with as low as $46\%$ of the resources the baseline requires.

Quantization

HARFLOW3D: A Latency-Oriented 3D-CNN Accelerator Toolflow for HAR on FPGA Devices

2 code implementations • 30 Mar 2023 • Petros Toupas, Alexander Montgomerie-Corcoran, Christos-Savvas Bouganis, Dimitrios Tzovaras

For Human Action Recognition (HAR) tasks, 3D Convolutional Neural Networks have proven to be highly effective, achieving state-of-the-art results.

Action Recognition, Scheduling +1

Window-Based Early-Exit Cascades for Uncertainty Estimation: When Deep Ensembles are More Efficient than Single Models

1 code implementation • ICCV 2023 • Guoxuan Xia, Christos-Savvas Bouganis

Experiments on ImageNet-scale data across a number of network architectures and uncertainty tasks show that the proposed window-based early-exit approach is able to achieve a superior uncertainty-computation trade-off compared to scaling single models.

Binary Classification

SVD-NAS: Coupling Low-Rank Approximation and Neural Architecture Search

1 code implementation • 22 Aug 2022 • Zhewen Yu, Christos-Savvas Bouganis

The task of compressing pre-trained Deep Neural Networks has attracted wide interest of the research community due to its great benefits in freeing practitioners from data access requirements.

Neural Architecture Search

Augmenting Softmax Information for Selective Classification with Out-of-Distribution Data

1 code implementation • 15 Jul 2022 • Guoxuan Xia, Christos-Savvas Bouganis

However, the performance of detection methods is generally evaluated on the task in isolation, rather than also considering potential downstream tasks in tandem.

Out of Distribution (OOD) Detection

On the Usefulness of Deep Ensemble Diversity for Out-of-Distribution Detection

1 code implementation • 15 Jul 2022 • Guoxuan Xia, Christos-Savvas Bouganis

As such we show that practically, even better OOD detection performance can be achieved for Deep Ensembles by averaging task-specific detection scores such as Energy over the ensemble.

Binary Classification, Out-of-Distribution Detection +2

Multi-DNN Accelerators for Next-Generation AI Systems

no code implementations • 19 May 2022 • Stylianos I. Venieris, Christos-Savvas Bouganis, Nicholas D. Lane

As the use of AI-powered applications widens across multiple domains, so do the computational demands.

Low-Cost On-device Partial Domain Adaptation (LoCO-PDA): Enabling efficient CNN retraining on edge devices

no code implementations • 1 Mar 2022 • Aditya Rajagopal, Christos-Savvas Bouganis

Consequently, it is likely that the observed data distribution upon deployment is a subset of the training data distribution.

Partial Domain Adaptation

perf4sight: A toolflow to model CNN training performance on Edge GPUs

1 code implementation • 12 Aug 2021 • Aditya Rajagopal, Christos-Savvas Bouganis

The increased memory and processing capabilities of today's edge devices create opportunities for greater edge intelligence.

Caffe Barista: Brewing Caffe with FPGAs in the Training Loop

1 code implementation • 18 Jun 2020 • Diederik Adriaan Vink, Aditya Rajagopal, Stylianos I. Venieris, Christos-Savvas Bouganis

CNN training on FPGAs is a nascent field of research.

Multi-Precision Policy Enforced Training (MuPPET): A precision-switching strategy for quantised fixed-point training of CNNs

no code implementations • 16 Jun 2020 • Aditya Rajagopal, Diederik Adriaan Vink, Stylianos I. Venieris, Christos-Savvas Bouganis

Large-scale convolutional neural networks (CNNs) suffer from very long training times, spanning from hours to weeks, limiting the productivity and experimentation of deep learning practitioners.

Now that I can see, I can improve: Enabling data-driven finetuning of CNNs on the edge

1 code implementation • 15 Jun 2020 • Aditya Rajagopal, Christos-Savvas Bouganis

In today's world, a vast amount of data is being generated by edge devices that can be used as valuable training data to improve the performance of machine learning algorithms in terms of the achieved accuracy or to reduce the compute requirements of the model.

Approximate LSTMs for Time-Constrained Inference: Enabling Fast Reaction in Self-Driving Cars

no code implementations • 2 May 2019 • Alexandros Kouris, Stylianos I. Venieris, Michail Rizakis, Christos-Savvas Bouganis

The need to recognise long-term dependencies in sequential data such as video streams has made Long Short-Term Memory (LSTM) networks a prominent Artificial Intelligence model for many emerging applications.

Autonomous Navigation, Self-Driving Cars

DroNet: Efficient convolutional neural network detector for real-time UAV applications

2 code implementations • 18 Jul 2018 • Christos Kyrkou, George Plastiras, Stylianos Venieris, Theocharis Theocharides, Christos-Savvas Bouganis

Through the analysis we propose a CNN architecture that is capable of detecting vehicles from aerial UAV images and can operate at 5-18 frames per second on a variety of platforms with an overall accuracy of ~95%.

Object Detection In Aerial Images, One-Shot Object Detection +1

CascadeCNN: Pushing the Performance Limits of Quantisation in Convolutional Neural Networks

no code implementations • 13 Jul 2018 • Alexandros Kouris, Stylianos I. Venieris, Christos-Savvas Bouganis

This work presents CascadeCNN, an automated toolflow that pushes the quantisation limits of any given CNN model, aiming to perform high-throughput inference.

Deploying Deep Neural Networks in the Embedded Space

no code implementations • 22 Jun 2018 • Stylianos I. Venieris, Alexandros Kouris, Christos-Savvas Bouganis

Recently, Deep Neural Networks (DNNs) have emerged as the dominant model across various AI applications.

f-CNN$^{\text{x}}$: A Toolflow for Mapping Multi-CNN Applications on FPGAs

no code implementations • 25 May 2018 • Stylianos I. Venieris, Christos-Savvas Bouganis

The predictive power of Convolutional Neural Networks (CNNs) has been an integral factor for emerging latency-sensitive applications, such as autonomous drones and vehicles.

Scheduling

CascadeCNN: Pushing the performance limits of quantisation

no code implementations • 22 May 2018 • Alexandros Kouris, Stylianos I. Venieris, Christos-Savvas Bouganis

This work presents CascadeCNN, an automated toolflow that pushes the quantisation limits of any given CNN model, to perform high-throughput inference by exploiting the computation time-accuracy trade-off.

Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions

no code implementations • 15 Mar 2018 • Stylianos I. Venieris, Alexandros Kouris, Christos-Savvas Bouganis

In the past decade, Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance in various Artificial Intelligence tasks.

Approximate FPGA-based LSTMs under Computation Time Constraints

no code implementations • 7 Jan 2018 • Michalis Rizakis, Stylianos I. Venieris, Alexandros Kouris, Christos-Savvas Bouganis

Recurrent Neural Networks and in particular Long Short-Term Memory (LSTM) networks have demonstrated state-of-the-art accuracy in several emerging Artificial Intelligence tasks.

Autonomous Vehicles, Image Captioning +1

fpgaConvNet: A Toolflow for Mapping Diverse Convolutional Neural Networks on Embedded FPGAs

no code implementations • 23 Nov 2017 • Stylianos I. Venieris, Christos-Savvas Bouganis

By selectively optimising for throughput, latency or multiobjective criteria, the presented tool is able to efficiently explore the design space and generate hardware designs from high-level ConvNet specifications, explicitly optimised for the performance metric of interest.

Robust Multi-Image Based Blind Face Hallucination

no code implementations • CVPR 2015 • Yonggang Jin, Christos-Savvas Bouganis

This paper proposes a robust multi-image based blind face hallucination framework to super-resolve LR faces.

Deblurring, Face Hallucination +2
