Search Results for author: Rafael Sousa

Found 2 papers, 0 papers with code

Tensor Slicing and Optimization for Multicore NPUs

no code implementations • 6 Apr 2023 • Rafael Sousa, Marcio Pereira, Yongin Kwon, TaeHo Kim, Namsoon Jung, Chang Soo Kim, Michael Frank, Guido Araujo

Although code generation for Convolution Neural Network (CNN) models has been extensively studied, performing efficient data slicing and parallelization for highly-constrained Multicore Neural Processor Units (NPUs) is still a challenging problem.

Code Generation, Compiler Optimization

Advancing Direct Convolution using Convolution Slicing Optimization and ISA Extensions

no code implementations • 8 Mar 2023 • Victor Ferrari, Rafael Sousa, Marcio Pereira, João P. L. de Carvalho, José Nelson Amaral, José Moreira, Guido Araujo

The speed-up over an Im2Col + BLAS method based on current BLAS implementations for end-to-end machine-learning model inference is in the range of 9% - 25% for Intel x86 and 10% - 42% for IBM POWER10 architectures.

Blocking, Code Generation
