no code implementations • 15 Feb 2023 • Ben Zandonati, Glenn Bucagu, Adrian Alan Pol, Maurizio Pierini, Olya Sirkin, Tal Kopetz
Model compression is instrumental in optimizing deep neural network inference on resource-constrained hardware.
no code implementations • 16 Oct 2022 • Ben Zandonati, Adrian Alan Pol, Maurizio Pierini, Olya Sirkin, Tal Kopetz
This response is non-linear and heterogeneous throughout the network.
no code implementations • 30 May 2022 • Moshe Kimhi, Tal Rozen, Tal Kopetz, Olya Sirkin, Avi Mendelson, Chaim Baskin
Quantized neural networks are well known for reducing latency, power consumption, and model size without significant degradation in accuracy, making them highly applicable for systems with limited resources and low power requirements.
no code implementations • 9 Feb 2022 • Adrian Alan Pol, Thea Aarrestad, Ekaterina Govorkova, Roi Halily, Anat Klempner, Tal Kopetz, Vladimir Loncar, Jennifer Ngadiuba, Maurizio Pierini, Olya Sirkin, Sioni Summers
We experiment with 8-bit and ternary quantization, benchmarking their accuracy and inference latency against a single-precision floating-point baseline.
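The two schemes mentioned above can be illustrated with a minimal sketch (this is not the paper's implementation; the threshold `t` and the symmetric int8 mapping are illustrative assumptions): 8-bit quantization rounds each weight to one of 255 levels, while ternary quantization keeps only the sign of large weights, trading accuracy for a far smaller model.

```python
def quantize_8bit(w):
    # Symmetric uniform 8-bit quantization: map each weight to an
    # integer in [-127, 127]; dequantize as q * scale.
    scale = max(abs(x) for x in w) / 127.0
    q = [max(-127, min(127, round(x / scale))) for x in w]
    return q, scale

def quantize_ternary(w, t=0.05):
    # Ternary quantization: each weight becomes -1, 0, or +1 times a
    # shared scale. The threshold t (a hypothetical choice here)
    # zeroes out small weights.
    nz = [abs(x) for x in w if abs(x) > t]
    scale = sum(nz) / len(nz) if nz else 0.0
    q = [0 if abs(x) <= t else (1 if x > 0 else -1) for x in w]
    return q, scale

w = [0.12, -0.03, 0.27, -0.18, 0.01, 0.09]
q8, s8 = quantize_8bit(w)
qt, st = quantize_ternary(w)
# Mean absolute reconstruction error of each scheme.
err8 = sum(abs(a - b * s8) for a, b in zip(w, q8)) / len(w)
errt = sum(abs(a - b * st) for a, b in zip(w, qt)) / len(w)
```

As expected, the 8-bit reconstruction error is much smaller than the ternary one, which is why the paper benchmarks both accuracy and latency: ternary weights cost accuracy but enable far cheaper arithmetic and storage.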