Search Results for author: Kasper Green Larsen

Found 24 papers, 3 papers with code

Heavy hitters via cluster-preserving clustering

no code implementations5 Apr 2016 Kasper Green Larsen, Jelani Nelson, Huy L. Nguyen, Mikkel Thorup

Our main innovation is an efficient reduction from the heavy hitters problem to a clustering problem in which each heavy hitter is encoded as some form of noisy spectral cluster in a much bigger graph, and the goal is to identify every cluster.

Clustering

Predicting Positive and Negative Links with Noisy Queries: Theory & Practice

1 code implementation19 Sep 2017 Charalampos E. Tsourakakis, Michael Mitzenmacher, Kasper Green Larsen, Jarosław Błasiok, Ben Lawson, Preetum Nakkiran, Vasileios Nakos

The edge sign prediction problem aims to predict whether an interaction between a pair of nodes will be positive or negative.

Clustering
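
For intuition about the noisy-query setting above, here is a toy simulation in which each queried edge sign is flipped independently with probability q, and repeating the query and taking a majority vote drives the error down. The flip probability and repetition count are made-up illustrative values, not algorithms or parameters from the paper.

```python
import random

def noisy_sign(true_sign, q):
    """Return the true edge sign, flipped with probability q."""
    return true_sign if random.random() > q else -true_sign

def majority_query(true_sign, q, repeats):
    """Query the same edge several times and take a majority vote."""
    votes = sum(noisy_sign(true_sign, q) for _ in range(repeats))
    return 1 if votes > 0 else -1

random.seed(0)
q, repeats, trials = 0.3, 15, 10_000
errors = sum(majority_query(+1, q, repeats) != +1 for _ in range(trials))
print(f"error rate with {repeats} repeated queries: {errors / trials:.4f}")
```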

Fully Understanding the Hashing Trick

no code implementations NeurIPS 2018 Casper Benjamin Freksen, Lior Kamma, Kasper Green Larsen

We settle this question by giving tight asymptotic bounds on the exact tradeoff between the central parameters, thus providing a complete understanding of the performance of feature hashing.

Open-Ended Question Answering
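
The tradeoff studied in this paper concerns the feature hashing (hashing trick) map: send each coordinate to a uniformly random bucket with a random sign and sum. The snippet below is a minimal NumPy illustration of that map only, not the paper's analysis; the dimensions and seeds are arbitrary placeholder choices.

```python
import numpy as np

def feature_hash(x, m, seed=0):
    """Project x into m buckets: each coordinate goes to a random bucket,
    multiplied by a random sign, and colliding coordinates are summed."""
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    buckets = rng.integers(0, m, size=d)       # h: [d] -> [m]
    signs = rng.choice([-1.0, 1.0], size=d)    # sigma: [d] -> {-1, +1}
    y = np.zeros(m)
    np.add.at(y, buckets, signs * x)
    return y

# The embedding preserves the l2 norm in expectation.
x = np.random.default_rng(1).normal(size=10_000)
y = feature_hash(x, m=512)
print(np.linalg.norm(x), np.linalg.norm(y))
```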

Optimal Minimal Margin Maximization with Boosting

no code implementations30 Jan 2019 Allan Grønlund, Kasper Green Larsen, Alexander Mathiasen

A common goal in a long line of research is to maximize the smallest margin using as few base hypotheses as possible, culminating in the AdaBoostV algorithm of Rätsch and Warmuth [JMLR'04].
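
As a reminder of the quantity being optimized, the snippet below computes the minimal normalized margin of a voting classifier over a labeled sample. The base-hypothesis predictions and weights are made-up placeholders; this is only the margin definition, not AdaBoostV or the paper's algorithm.

```python
import numpy as np

def minimal_margin(H, alpha, y):
    """Minimal normalized margin of a voting classifier.

    H:     (n_samples, n_hypotheses) matrix of base predictions in {-1, +1}
    alpha: non-negative weights of the base hypotheses
    y:     true labels in {-1, +1}
    The margin of an example is y * sum_j alpha_j h_j(x) / sum_j alpha_j.
    """
    alpha = np.asarray(alpha, dtype=float)
    margins = y * (H @ alpha) / alpha.sum()
    return margins.min()

# Toy example: three base hypotheses on four points (placeholder values).
H = np.array([[+1, +1, -1],
              [+1, -1, +1],
              [-1, +1, +1],
              [+1, +1, +1]])
y = np.array([+1, +1, +1, +1])
print(minimal_margin(H, alpha=[0.4, 0.3, 0.3], y=y))
```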

Optimal Learning of Joint Alignments with a Faulty Oracle

no code implementations21 Sep 2019 Kasper Green Larsen, Michael Mitzenmacher, Charalampos E. Tsourakakis

The goal is to recover $n$ discrete variables $g_i \in \{0, \ldots, k-1\}$ (up to some global offset) given noisy observations of a set of their pairwise differences $\{(g_i - g_j) \bmod k\}$; specifically, with probability $\frac{1}{k}+\delta$ for some $\delta > 0$ one obtains the correct answer, and with the remaining probability one obtains a uniformly random incorrect answer.
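
To make the observation model concrete, here is a small simulation of the faulty oracle: with probability 1/k + delta it returns the true difference (g_i - g_j) mod k, otherwise a uniformly random incorrect answer. The recovery rule shown (anchor every variable to g_0 by a plurality vote over repeated queries) is only a naive illustration, not the paper's optimal algorithm, and all parameters are placeholders.

```python
import random
from collections import Counter

def query(g, i, j, k, delta):
    """Faulty oracle for (g_i - g_j) mod k: correct w.p. 1/k + delta,
    otherwise a uniformly random *incorrect* answer."""
    truth = (g[i] - g[j]) % k
    if random.random() < 1.0 / k + delta:
        return truth
    return random.choice([v for v in range(k) if v != truth])

random.seed(0)
n, k, delta, repeats = 50, 5, 0.2, 200
g = [random.randrange(k) for _ in range(n)]

# Naive recovery: estimate each g_i - g_0 by a plurality vote of repeated queries.
recovered = [0] * n
for i in range(1, n):
    votes = Counter(query(g, i, 0, k, delta) for _ in range(repeats))
    recovered[i] = votes.most_common(1)[0][0]

# Compare up to the global offset g_0.
truth_offsets = [(g[i] - g[0]) % k for i in range(n)]
print("fraction recovered:", sum(a == b for a, b in zip(recovered, truth_offsets)) / n)
```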

Margins are Insufficient for Explaining Gradient Boosting

no code implementations NeurIPS 2020 Allan Grønlund, Lior Kamma, Kasper Green Larsen

We then explain the shortcomings of the $k$'th margin bound and prove a stronger and more refined margin-based generalization bound for boosted classifiers that indeed succeeds in explaining the performance of modern gradient boosters.

CountSketches, Feature Hashing and the Median of Three

no code implementations3 Feb 2021 Kasper Green Larsen, Rasmus Pagh, Jakub Tětek

For $t > 1$, the estimator takes the median of $2t-1$ independent estimates, and the probability that the estimate is off by more than $2 \|v\|_2/\sqrt{s}$ is exponentially small in $t$.
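
The estimator discussed above is the classic CountSketch point query: hash coordinates into $s$ buckets with random signs, estimate $v_i$ as $\sigma(i) \cdot C[h(i)]$, and take the median of $2t-1$ independent repetitions. The sketch below is a bare-bones illustration with made-up parameters (s, t, vector size), not the paper's refined analysis of the median-of-three case.

```python
import numpy as np

def countsketch_estimates(v, s, reps, seed=0):
    """Return a (reps, len(v)) array: one CountSketch estimate of every
    coordinate of v per repetition, using s buckets per repetition."""
    rng = np.random.default_rng(seed)
    d = len(v)
    est = np.empty((reps, d))
    for r in range(reps):
        h = rng.integers(0, s, size=d)            # bucket of each coordinate
        sigma = rng.choice([-1.0, 1.0], size=d)   # random sign of each coordinate
        C = np.zeros(s)
        np.add.at(C, h, sigma * v)                # build the sketch
        est[r] = sigma * C[h]                     # point-query estimates
    return est

rng = np.random.default_rng(1)
v = rng.normal(size=5_000)
t, s = 2, 256
est = countsketch_estimates(v, s=s, reps=2 * t - 1)   # median of three for t = 2
median_est = np.median(est, axis=0)
threshold = 2 * np.linalg.norm(v) / np.sqrt(s)
print("fraction of coordinates off by more than 2||v||_2/sqrt(s):",
      np.mean(np.abs(median_est - v) > threshold))
```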

Compression Implies Generalization

no code implementations15 Jun 2021 Allan Grønlund, Mikael Høgsgaard, Lior Kamma, Kasper Green Larsen

The framework is simple and powerful enough to extend the generalization bounds by Arora et al. to also hold for the original network.

BIG-bench Machine Learning, Generalization Bounds

Optimality of the Johnson-Lindenstrauss Dimensionality Reduction for Practical Measures

no code implementations14 Jul 2021 Yair Bartal, Ora Nova Fandina, Kasper Green Larsen

They provided upper bounds on its quality for a wide range of practical measures and showed that indeed these are best possible in many cases.

Dimensionality Reduction

Towards Optimal Lower Bounds for k-median and k-means Coresets

no code implementations25 Feb 2022 Vincent Cohen-Addad, Kasper Green Larsen, David Saulpic, Chris Schwiegelshohn

Given a set of points in a metric space, the $(k, z)$-clustering problem consists of finding a set of $k$ points called centers, such that the sum of distances raised to the power of $z$ of every data point to its closest center is minimized.

Clustering
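
The objective defined in the abstract is easy to write down directly; the snippet below just evaluates the $(k, z)$-clustering cost of a candidate set of centers (z = 2 gives k-means, z = 1 gives k-median). It is a plain cost evaluation on synthetic data, not a coreset construction from the paper.

```python
import numpy as np

def kz_clustering_cost(points, centers, z):
    """Sum over all points of (distance to the closest center) ** z."""
    # Pairwise Euclidean distances: shape (n_points, n_centers).
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return np.sum(dists.min(axis=1) ** z)

rng = np.random.default_rng(0)
X = rng.normal(size=(1_000, 5))
C = X[rng.choice(len(X), size=10, replace=False)]   # 10 arbitrary candidate centers
print("k-median cost (z=1):", kz_clustering_cost(X, C, z=1))
print("k-means  cost (z=2):", kz_clustering_cost(X, C, z=2))
```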

The Fast Johnson-Lindenstrauss Transform is Even Faster

1 code implementation4 Apr 2022 Ora Nova Fandina, Mikael Møller Høgsgaard, Kasper Green Larsen

In this work, we give a surprising new analysis of the Fast JL transform, showing that the $k \ln^2 n$ term in the embedding time can be improved to $(k \ln^2 n)/\alpha$ for an $\alpha = \Omega(\min\{\varepsilon^{-1}\ln(1/\varepsilon), \ln n\})$.

Dimensionality Reduction
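
For reference, the Fast JL transform analyzed here follows the usual recipe: randomly flip signs, apply a Walsh-Hadamard transform, then sample and rescale k coordinates. The sketch below implements that pipeline with a dense scipy Hadamard matrix purely for illustration (a real implementation would use an O(d log d) fast transform), and the dimensions are arbitrary placeholder choices rather than anything from the paper's analysis.

```python
import numpy as np
from scipy.linalg import hadamard

def fast_jl(x, k, seed=0):
    """Embed x (dimension d, a power of two) into k dimensions via
    random sign flips, a Walsh-Hadamard transform, and coordinate sampling."""
    rng = np.random.default_rng(seed)
    d = len(x)
    D = rng.choice([-1.0, 1.0], size=d)         # random diagonal sign matrix
    H = hadamard(d) / np.sqrt(d)                # orthonormal Hadamard transform
    mixed = H @ (D * x)                         # spreads the mass of x evenly
    idx = rng.choice(d, size=k, replace=False)  # sample k coordinates
    return np.sqrt(d / k) * mixed[idx]          # rescale to preserve the norm

x = np.random.default_rng(1).normal(size=1024)
y = fast_jl(x, k=64)
print(np.linalg.norm(x), np.linalg.norm(y))     # norms should be comparable
```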

Optimal Weak to Strong Learning

no code implementations3 Jun 2022 Kasper Green Larsen, Martin Ritzert

The classic AdaBoost algorithm converts a weak learner, that is, an algorithm producing a hypothesis which is slightly better than chance, into a strong learner that achieves arbitrarily high accuracy when given enough training data.

Generalization Bounds
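
Since several of the results listed here concern AdaBoost as a weak-to-strong learner, a compact textbook AdaBoost loop is included for reference. The decision-stump weak learner and the toy data are placeholder assumptions; this is standard AdaBoost, not the new optimal algorithm from the paper.

```python
import numpy as np

def best_stump(X, y, w):
    """Weak learner: the single-feature threshold stump with the smallest
    weighted error (a deliberately simple placeholder weak learner)."""
    best = (None, None, None, np.inf)            # (feature, threshold, sign, error)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (+1, -1):
                pred = np.where(X[:, j] <= thr, sign, -sign)
                err = np.sum(w[pred != y])
                if err < best[3]:
                    best = (j, thr, sign, err)
    return best

def adaboost(X, y, rounds=20):
    """Standard AdaBoost: reweight the sample, call the weak learner,
    and combine the stumps into a weighted majority vote."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(rounds):
        j, thr, sign, err = best_stump(X, y, w)
        err = max(err, 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(X[:, j] <= thr, sign, -sign)
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        ensemble.append((alpha, j, thr, sign))
    return ensemble

def predict(ensemble, X):
    score = np.zeros(len(X))
    for alpha, j, thr, sign in ensemble:
        score += alpha * np.where(X[:, j] <= thr, sign, -sign)
    return np.sign(score)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)       # toy linearly separable labels
model = adaboost(X, y, rounds=20)
print("training accuracy:", np.mean(predict(model, X) == y))
```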

Improved Coresets for Euclidean $k$-Means

no code implementations15 Nov 2022 Vincent Cohen-Addad, Kasper Green Larsen, David Saulpic, Chris Schwiegelshohn, Omar Ali Sheikh-Omar

The Euclidean $k$-means problem (resp. the Euclidean $k$-median problem) consists of finding $k$ centers such that the sum of squared distances (resp. sum of distances) of every data point to its closest center is minimized.

Bagging is an Optimal PAC Learner

no code implementations5 Dec 2022 Kasper Green Larsen

Finally, the seminal work by Hanneke (2016) gave an algorithm with a provably optimal sample complexity.

Learning Theory, PAC learning
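
The procedure in question is Breiman's classical bagging heuristic: train base learners on bootstrap samples and aggregate them by a majority vote. The snippet below is just that heuristic, using scikit-learn decision trees as an illustrative base learner on toy data; it is the procedure the paper is about, not its optimality proof.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging(X, y, n_learners=25, seed=0):
    """Train one base learner per bootstrap sample of the data."""
    rng = np.random.default_rng(seed)
    n = len(y)
    learners = []
    for _ in range(n_learners):
        idx = rng.integers(0, n, size=n)          # sample n points with replacement
        learners.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return learners

def vote(learners, X):
    """Aggregate by a plain (unweighted) majority vote over +/-1 predictions."""
    votes = np.sum([clf.predict(X) for clf in learners], axis=0)
    return np.where(votes >= 0, 1, -1)

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
y = np.where(X[:, 0] ** 2 + X[:, 1] ** 2 < 1, 1, -1)   # toy non-linear labels
learners = bagging(X, y)
print("training accuracy:", np.mean(vote(learners, X) == y))
```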

The Impossibility of Parallelizing Boosting

no code implementations23 Jan 2023 Amin Karbasi, Kasper Green Larsen

The aim of boosting is to convert a sequence of weak learners into a strong learner.

AdaBoost is not an Optimal Weak to Strong Learner

no code implementations27 Jan 2023 Mikael Møller Høgsgaard, Kasper Green Larsen, Martin Ritzert

AdaBoost is a classic boosting algorithm that combines multiple inaccurate classifiers produced by a weak learner into a strong learner with arbitrarily high accuracy when given enough training data.

Boosting, Voting Classifiers and Randomized Sample Compression Schemes

no code implementations5 Feb 2024 Arthur da Cunha, Kasper Green Larsen, Martin Ritzert

At the center of this paradigm lies the concept of building the strong learner as a voting classifier, which outputs a weighted majority vote of the weak learners.
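
The voting classifier mentioned above is simply a weighted majority vote over the weak learners' predictions; a minimal illustration follows, with the predictions and weights as placeholder values (this is only the combination rule, not the paper's randomized sample compression construction).

```python
import numpy as np

def weighted_majority_vote(predictions, weights):
    """Combine weak-learner predictions (each row: one learner's +/-1
    predictions on all points) into a single +/-1 prediction per point."""
    score = np.asarray(weights) @ np.asarray(predictions, dtype=float)
    return np.where(score >= 0, 1, -1)

# Three weak learners voting on four points (placeholder values).
predictions = [[+1, -1, +1, +1],
               [+1, +1, -1, +1],
               [-1, +1, +1, +1]]
weights = [0.5, 0.3, 0.2]
print(weighted_majority_vote(predictions, weights))
```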

Replicable Learning of Large-Margin Halfspaces

no code implementations21 Feb 2024 Alkis Kalavasis, Amin Karbasi, Kasper Green Larsen, Grigoris Velegkas, Felix Zhou

Departing from the requirement of polynomial time algorithms, using the DP-to-Replicability reduction of Bun, Gaboardi, Hopkins, Impagliazzo, Lei, Pitassi, Sorrell, and Sivakumar [STOC, 2023], we show how to obtain a replicable algorithm for large-margin halfspaces with improved sample complexity with respect to the margin parameter $\tau$, but running time doubly exponential in $1/\tau^2$ and worse sample complexity dependence on $\epsilon$ than one of our previous algorithms.
