1 code implementation • 7 Oct 2022 • Sungmin Bae, Piotr Zielinski, Satrajit Chatterjee
We study the two classes of methods on MobileNetV1 and MobileNetV2, using multiple empirical metrics to identify the sources of the performance differences between them, namely sensitivity to outliers and convergence instability of the quantizer scaling factor.
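As a rough illustration of the outlier-sensitivity issue (a toy NumPy sketch, not the quantizers or experimental setup used in the paper), a single extreme weight can dictate a min/max-derived quantizer scaling factor and coarsen the resolution available to every other weight:

```python
import numpy as np

def quantize(x, scale, num_bits=8):
    # Symmetric uniform quantizer: scale, round, clamp to the signed integer range.
    qmax = 2 ** (num_bits - 1) - 1
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale

rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=1024)   # toy weight tensor
w[0] = 2.0                              # a single large outlier

qmax = 2 ** 7 - 1
scale_minmax = np.abs(w).max() / qmax                 # scaling factor set entirely by the outlier
scale_robust = np.quantile(np.abs(w), 0.999) / qmax   # scaling factor that ignores the extreme tail

for name, s in [("min/max", scale_minmax), ("99.9th-pct", scale_robust)]:
    # Quantization error on the non-outlier bulk of the weights.
    bulk_mse = np.mean((w[1:] - quantize(w[1:], s)) ** 2)
    print(f"{name:>10}  scale={s:.5f}  bulk MSE={bulk_mse:.2e}")
```

The outlier-driven scale is roughly an order of magnitude larger, so the bulk of the weights are quantized far more coarsely; this is only meant to convey why the scaling factor is a sensitive quantity, not to reproduce the paper's analysis.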
no code implementations • 18 Mar 2022 • Satrajit Chatterjee, Piotr Zielinski
The generalization mystery in deep learning is the following: Why do over-parameterized neural networks trained with gradient descent (GD) generalize well on real datasets even though they are capable of fitting random datasets of comparable size?
2 code implementations • 8 Feb 2021 • Erwei Wang, James J. Davis, Daniele Moro, Piotr Zielinski, Jia Jie Lim, Claudionor Coelho, Satrajit Chatterjee, Peter Y. K. Cheung, George A. Constantinides
The ever-growing computational demands of increasingly complex machine learning models frequently necessitate the use of powerful cloud-based infrastructure for their training.
no code implementations • 1 Jan 2021 • Satrajit Chatterjee, Piotr Zielinski
Using $m$-coherence, we study the evolution of alignment of per-example gradients in ResNet and EfficientNet models on ImageNet and several label-noise variants of it, particularly from the perspective of the recently proposed Coherent Gradients (CG) theory, which provides a simple, unified explanation for memorization and generalization [Chatterjee, ICLR 20].
no code implementations • 3 Aug 2020 • Satrajit Chatterjee, Piotr Zielinski
Using $m$-coherence, we study the evolution of alignment of per-example gradients in ResNet and Inception models on ImageNet and several label-noise variants of it, particularly from the perspective of the recently proposed Coherent Gradients (CG) theory, which provides a simple, unified explanation for memorization and generalization [Chatterjee, ICLR 20].
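The flavor of such an alignment measurement can be conveyed with a toy sketch (a simple alignment statistic on a linear model, not the exact $m$-coherence estimator from these papers): per-example gradients agree far more when labels carry signal than when they are randomized:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 256, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y_clean = X @ w_true + 0.1 * rng.normal(size=n)
y_noisy = rng.permutation(y_clean)             # labels decoupled from the inputs

def per_example_grads(w, X, y):
    # Gradient of 0.5 * (x.w - y)^2 for each example of a linear model.
    return (X @ w - y)[:, None] * X            # shape (n, d)

def alignment(G):
    # Squared norm of the average gradient relative to the average squared norm
    # of the per-example gradients: near 1 when gradients agree, near 1/n when
    # they point in unrelated directions and mostly cancel.
    g_bar = G.mean(axis=0)
    return float(g_bar @ g_bar / np.mean(np.sum(G * G, axis=1)))

w = np.zeros(d)
print("alignment, clean labels :", alignment(per_example_grads(w, X, y_clean)))
print("alignment, random labels:", alignment(per_example_grads(w, X, y_noisy)))
```

The clean-label run shows markedly higher alignment than the permuted-label run, which is the qualitative behavior these papers track over the course of training on ImageNet.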
no code implementations • 16 Mar 2020 • Piotr Zielinski, Shankar Krishnan, Satrajit Chatterjee
The key insight of the Coherent Gradients Hypothesis (CGH) is that, since the overall gradient for a single step of SGD is the sum of the per-example gradients, it is strongest in directions that reduce the loss on multiple examples, if such directions exist.
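A minimal numerical sketch of this insight (illustrative only, with made-up per-example gradients): a component shared by every example accumulates linearly in the summed gradient, while the idiosyncratic components largely cancel:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 1000, 50
shared = rng.normal(size=d)
shared /= np.linalg.norm(shared)              # a direction that helps every example
noise = rng.normal(size=(n, d))               # per-example, mutually unrelated parts

G = shared[None, :] + noise                   # per-example gradients
g_total = G.sum(axis=0)                       # the gradient a single SGD step follows

along = g_total @ shared                                # grows like n
residual = np.linalg.norm(g_total - along * shared)     # grows only like sqrt(n * d)
print("component along the shared direction:", along)
print("norm of everything else             :", residual)
```

With these numbers the shared component dominates the summed gradient, which is the sense in which directions that reduce the loss on many examples are the strongest ones.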