Search Results for author: Jarosław Błasiok

Found 10 papers, 4 papers with code

Smooth ECE: Principled Reliability Diagrams via Kernel Smoothing

1 code implementation · 21 Sep 2023 · Jarosław Błasiok, Preetum Nakkiran

We show that a simple modification fixes both constructions: first smooth the observations using an RBF kernel, then compute the Expected Calibration Error (ECE) of this smoothed function.
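The two-step procedure in the abstract (smooth the observations with an RBF kernel, then compute the ECE of the smoothed function) can be sketched as follows. This is an illustrative reading of the snippet, not the paper's reference implementation, and the bandwidth `sigma` is an arbitrary choice (the paper selects it in a principled way):

```python
import numpy as np

def smooth_ece_sketch(preds, labels, sigma=0.05):
    """Kernel-smoothed calibration error (illustrative sketch).

    Smooths the residuals (label - prediction) with an RBF kernel over
    the prediction values, then averages the absolute smoothed residual.
    """
    preds = np.asarray(preds, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Pairwise RBF kernel weights between all prediction values.
    w = np.exp(-((preds[:, None] - preds[None, :]) ** 2) / (2 * sigma**2))
    w /= w.sum(axis=1, keepdims=True)          # normalize rows to weights
    smoothed_residual = w @ (labels - preds)   # kernel-smoothed (y - f)
    return float(np.mean(np.abs(smoothed_residual)))
```

Because smoothing happens before taking absolute values, residuals of opposite sign at nearby predictions cancel, which is what makes the measure continuous in the predictions (unlike binned ECE).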

When Does Optimizing a Proper Loss Yield Calibration?

no code implementations · NeurIPS 2023 · Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Preetum Nakkiran

Optimizing proper loss functions is popularly believed to yield predictors with good calibration properties; the intuition being that for such losses, the global optimum is to predict the ground-truth probabilities, which is indeed calibrated.

Loss Minimization Yields Multicalibration for Large Neural Networks

no code implementations · 19 Apr 2023 · Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Adam Tauman Kalai, Preetum Nakkiran

We show that minimizing the squared loss over all neural nets of size $n$ implies multicalibration for all but a bounded number of unlucky values of $n$.

Fairness

A Unifying Theory of Distance from Calibration

no code implementations · 30 Nov 2022 · Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Preetum Nakkiran

We study the fundamental question of how to define and measure the distance from calibration for probabilistic predictors.

What You See is What You Get: Principled Deep Learning via Distributional Generalization

1 code implementation · 7 Apr 2022 · Bogdan Kulynych, Yao-Yuan Yang, Yaodong Yu, Jarosław Błasiok, Preetum Nakkiran

In contrast, we show that Differentially-Private (DP) training provably ensures the high-level WYSIWYG property, which we quantify using a notion of distributional generalization.

The Generic Holdout: Preventing False-Discoveries in Adaptive Data Science

no code implementations · 14 Sep 2018 · Preetum Nakkiran, Jarosław Błasiok

In this work, we propose a new framework for adaptive science which exponentially improves on this number of queries under a restricted yet scientifically relevant setting, where the goal of the scientist is to find a single (or a few) true hypotheses about the universe based on the samples.

Holdout Set

Optimal streaming and tracking distinct elements with high probability

no code implementations · 5 Apr 2018 · Jarosław Błasiok

The distinct elements problem is one of the fundamental problems in streaming algorithms: given a stream of integers in the range $\{1,\ldots, n\}$, we wish to provide a $(1+\varepsilon)$ approximation to the number of distinct elements in the input.

Data Structures and Algorithms
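The $(1+\varepsilon)$-approximation flavor of the distinct-elements problem can be illustrated with the classical bottom-$k$ (KMV) estimator below; this is a textbook sketch, not the optimal-space, high-probability algorithm of the paper:

```python
import hashlib

def kmv_distinct(stream, k=64):
    """Bottom-k (KMV) distinct-elements estimator (illustrative sketch).

    Hash each element to a pseudo-uniform value in [0, 1) and keep the
    k smallest distinct hashes; if the k-th smallest is h_k, the number
    of distinct elements is estimated as (k - 1) / h_k.
    """
    mins = set()
    for x in stream:
        # SHA-256 stands in for a uniform hash to [0, 1).
        h = int(hashlib.sha256(str(x).encode()).hexdigest(), 16) / 2**256
        mins.add(h)
        if len(mins) > k:
            mins.discard(max(mins))   # keep only the k smallest hashes
    if len(mins) < k:
        return len(mins)              # fewer than k distinct: exact count
    return (k - 1) / max(mins)
```

The intuition: $k$ uniform points in $[0,1)$ drawn from $d$ distinct hashes have their $k$-th smallest near $k/d$, so inverting it recovers $d$ up to a $(1\pm O(1/\sqrt{k}))$ factor.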

Predicting Positive and Negative Links with Noisy Queries: Theory & Practice

1 code implementation · 19 Sep 2017 · Charalampos E. Tsourakakis, Michael Mitzenmacher, Kasper Green Larsen, Jarosław Błasiok, Ben Lawson, Preetum Nakkiran, Vasileios Nakos

The {\em edge sign prediction problem} aims to predict whether an interaction between a pair of nodes will be positive or negative.

Clustering
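The abstract frames edge signs as answers to noisy queries. A standard way to use such an oracle, shown purely for illustration (the function and its parameters are hypothetical, not the paper's algorithm), is to repeat each query and take a majority vote:

```python
def majority_sign(query, pairs, reps=31):
    """Amplify a noisy ±1 sign oracle by repetition (illustrative sketch).

    `query(u, v)` is assumed to return the true edge sign in {+1, -1}
    with probability 1 - q for some q < 1/2; repeating the query and
    taking a majority drives the per-pair error down exponentially in
    `reps` (a Chernoff-bound argument).
    """
    preds = {}
    for (u, v) in pairs:
        votes = sum(query(u, v) for _ in range(reps))
        preds[(u, v)] = 1 if votes > 0 else -1
    return preds
```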

ADAGIO: Fast Data-aware Near-Isometric Linear Embeddings

1 code implementation · 17 Sep 2016 · Jarosław Błasiok, Charalampos E. Tsourakakis

We verify experimentally the efficiency of our method on numerous real-world datasets, where we find that our method (<10 seconds) is more than 3,000× faster than the state-of-the-art method [hedge2015] (>9 hours) on medium-scale datasets with 60,000 data points in 784 dimensions.

Computational Efficiency · Dimensionality Reduction

An improved analysis of the ER-SpUD dictionary learning algorithm

no code implementations · 18 Feb 2016 · Jarosław Błasiok, Jelani Nelson

Then, given some small number $p$ of samples, i.e., columns of $Y$, the goal is to learn the dictionary $A$ up to small error, as well as $X$.

Dictionary Learning
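The generative model the abstract refers to, assuming the standard ER-SpUD setup of a square dictionary $A$ and a sparse coefficient matrix $X$ with observed samples $Y = AX$, can be instantiated synthetically; all sizes below are illustrative, not the paper's parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, s = 20, 100, 3   # illustrative: n×n dictionary, p samples, sparsity s

A = rng.standard_normal((n, n))   # unknown square dictionary
X = np.zeros((n, p))
for j in range(p):
    # Each column of X has at most s nonzero Gaussian entries.
    support = rng.choice(n, size=s, replace=False)
    X[support, j] = rng.standard_normal(s)
Y = A @ X                         # the observed samples (columns of Y)
```

A dictionary-learning algorithm sees only `Y` and must recover `A` (up to column permutation and scaling) together with the sparse `X`.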
