Search Results for author: Victor Ferrari

Found 2 papers, 0 papers with code

Advancing Direct Convolution using Convolution Slicing Optimization and ISA Extensions

no code implementations • 8 Mar 2023 • Victor Ferrari, Rafael Sousa, Marcio Pereira, João P. L. de Carvalho, José Nelson Amaral, José Moreira, Guido Araujo

For end-to-end machine-learning model inference, the speed-up over an Im2Col + BLAS method based on current BLAS implementations ranges from 9% to 25% on Intel x86 and from 10% to 42% on IBM POWER10 architectures.

Blocking, Code Generation
