no code implementations • 15 Jan 2025 • Steve Hanneke, Hongao Wang
Finally, we extend these algorithms and results to the agnostic case, showing an equivalence between the minimal assumptions on the data process for learnability in the agnostic and realizable cases, for every concept class, as well as the equivalence of optimistically universal learnability in the two settings.
no code implementations • 8 Dec 2024 • Pramith Devulapalli, Steve Hanneke
In this paper, we adopt a learning-theoretic perspective in understanding the fundamental nature of learning different classes of functions from both discrete data streams and continuous data streams.
no code implementations • 3 Dec 2024 • Steve Hanneke, Mingyue Xu
In this paper, we consider the problem of universal learning by ERM in the realizable case and study the possible universal rates.
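As a point of reference, here is a minimal sketch of ERM in the realizable case over a hypothetical finite class of thresholds (the paper analyzes the universal rates such a rule can achieve, not any particular implementation):

```python
# Minimal ERM sketch for the realizable case (illustrative only).
def erm(hypotheses, sample):
    """Return a hypothesis minimizing empirical error (zero in the realizable case)."""
    return min(hypotheses, key=lambda h: sum(h(x) != y for x, y in sample))

# Hypothetical class: threshold classifiers h_t(x) = [x >= t] for t in {0, ..., 9}.
thresholds = [(lambda x, t=t: x >= t) for t in range(10)]
sample = [(1, False), (4, True), (7, True)]
h_hat = erm(thresholds, sample)
assert all(h_hat(x) == y for x, y in sample)
```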
no code implementations • 26 Nov 2024 • Yannay Alon, Steve Hanneke, Shay Moran, Uri Shalit
We further refine this for positive values of $\varepsilon$ and identify for each $\varepsilon$ how many examples per task are needed to achieve an error of $\varepsilon$ in the limit as the number of tasks $n$ goes to infinity.
no code implementations • 3 Nov 2024 • Steve Hanneke, Vinod Raman, Amirreza Shaeiri, Unique Subedi
Along the way, we show that the trichotomy of possible minimax rates of the expected number of mistakes established by Hanneke et al. [2023b] for finite label spaces in the realizable setting continues to hold even when the label space is unbounded.
no code implementations • 16 Oct 2024 • Idan Attias, Steve Hanneke, Arvind Ramaswami
Assuming we have a compression scheme for binary classes of size $f(d_\mathrm{VC})$, where $d_\mathrm{VC}$ is the VC dimension, then we have the following results: (1) If the binary compression scheme is a majority-vote or a stable compression scheme, then there exists a multiclass compression scheme of size $O(f(d_\mathrm{G}))$, where $d_\mathrm{G}$ is the graph dimension.
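For intuition, here is a toy sample compression scheme of size 1 for binary thresholds (a hypothetical illustration of the compress/reconstruct formalism; the paper's multiclass constructions are more involved):

```python
# Toy sample compression scheme of size 1 for thresholds h_t(x) = [x >= t].
def compress(sample):
    """Keep only the smallest positive example (empty if none exists)."""
    positives = [x for x, y in sample if y]
    return [min(positives)] if positives else []

def reconstruct(compressed):
    """Rebuild a classifier from the compression set alone."""
    return (lambda x: x >= compressed[0]) if compressed else (lambda x: False)

sample = [(2.0, False), (5.0, True), (9.0, True)]
h = reconstruct(compress(sample))
assert all(h(x) == y for x, y in sample)  # holds for every realizable sample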
no code implementations • 12 Oct 2024 • Steve Hanneke, Kun Wang
With knowledge of $\mathcal{M}$, supposing the true model satisfies $M\in \mathcal{M}$, the objective is to identify an arm $\hat{\pi}$ of near-maximal mean reward $f^M(\hat{\pi})$ with high probability in a bounded number of rounds.
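For orientation only, a naive uniform-exploration baseline for this identification task; the reward oracle and arm means below are hypothetical, and the paper's point is that knowledge of $\mathcal{M}$ can do far better:

```python
import random

# Naive uniform-exploration baseline (illustrative; not the paper's method).
TRUE_MEANS = {"arm1": 0.3, "arm2": 0.7}  # hypothetical means, unknown to the learner

def pull(arm):
    """Hypothetical Bernoulli reward oracle."""
    return 1 if random.random() < TRUE_MEANS[arm] else 0

def identify_arm(arms, pulls_per_arm):
    """Pull every arm equally often and return the empirically best one."""
    means = {a: sum(pull(a) for _ in range(pulls_per_arm)) / pulls_per_arm
             for a in arms}
    return max(means, key=means.get)

print(identify_arm(list(TRUE_MEANS), pulls_per_arm=500))  # "arm2" with high probability
```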
no code implementations • 29 Aug 2024 • Steve Hanneke, Samory Kpotufe
We show that some basic moduli of continuity $\delta$ -- which measure how fast target risk decreases as source risk decreases -- appear to be at the root of many of the classical relatedness measures in transfer learning and related literature.
no code implementations • 29 Jul 2024 • Steve Hanneke, Kasper Green Larsen, Nikita Zhivotovskiy
This simple algorithm is known to have an optimal error in terms of the VC-dimension of $\mathcal{H}$ and the number of samples $n$.
no code implementations • 10 Jul 2024 • Simone Fioravanti, Steve Hanneke, Shay Moran, Hilla Schefler, Iska Tsubari
This naturally raises the question of whether DP learnability continues to imply online learnability in more general scenarios: indeed, Alon, Hanneke, Holzman, and Moran (2021) explicitly leave it as an open question in the context of partial concept classes, and the same question is open in the general multiclass setting.
no code implementations • 27 May 2024 • Zachary Chase, Bogdan Chornomaz, Steve Hanneke, Shay Moran, Amir Yehudayoff
In particular, we prove that for every $d$ there is a class with VC dimension $d$ that cannot be embedded in any extremal class of VC dimension smaller than exponential in $d$.
no code implementations • 16 Mar 2024 • Steve Hanneke, Shay Moran, Tom Waknine
In classical PAC learning, both uniform convergence and sample compression satisfy a form of 'completeness': whenever a class is learnable, it can also be learned by a learning rule that adheres to these principles.
no code implementations • 20 Feb 2024 • Pramith Devulapalli, Steve Hanneke
Understanding the self-directed learning complexity has been an important problem that has captured the attention of the online learning theory community since the early 1990s.
no code implementations • 12 Feb 2024 • Yuval Filmus, Steve Hanneke, Idan Mehalel, Shay Moran
We demonstrate that the optimal mistake bound under bandit feedback is at most $O(k)$ times higher than the optimal mistake bound in the full information case, where $k$ represents the number of labels.
no code implementations • NeurIPS 2023 • Steve Hanneke, Shay Moran, Jonathan Shafer
We present new upper and lower bounds on the number of learner mistakes in the 'transductive' online learning setting of Ben-David, Kushilevitz and Mansour (1997).
no code implementations • 29 Sep 2023 • Steve Hanneke, Aryeh Kontorovich, Guy Kornowski
While the recent work of Hanneke et al. (2023) established tight uniform convergence bounds for average-smooth functions in the realizable case and provided a computationally efficient realizable learning algorithm, both of these results currently lack analogs in the general agnostic (i.e., noisy) case.
no code implementations • NeurIPS 2023 • Idan Attias, Steve Hanneke, Alkis Kalavasis, Amin Karbasi, Grigoris Velegkas
Additionally, in the context of online learning we provide a dimension that characterizes the minimax instance optimal cumulative loss up to a constant factor and design an optimal online learner for realizable regression, thus resolving an open question raised by Daskalakis and Golowich in STOC '22.
no code implementations • 5 Jul 2023 • Steve Hanneke, Shay Moran, Qian Zhang
Pseudo-cubes are a structure rooted in the work of Daniely and Shalev-Shwartz (2014), recently shown by Brukhim, Carmon, Dinur, Moran, and Yehudayoff (2022) to characterize PAC learnability (i.e., uniform rates) for multiclass classification.
no code implementations • NeurIPS 2023 • Surbhi Goel, Steve Hanneke, Shay Moran, Abhishek Shetty
We study the problem of sequential prediction in the stochastic setting with an adversary that is allowed to inject clean-label adversarial (or out-of-distribution) examples.
no code implementations • 29 Apr 2023 • Steve Hanneke, Samory Kpotufe, Yasaman Mahdaviyeh
Theoretical studies on transfer learning or domain adaptation have so far focused on situations with a known hypothesis class or model; however, in practice, some amount of model selection is usually involved, often appearing under the umbrella term of hyperparameter-tuning: for example, one may think of the problem of tuning for the right neural network architecture for a target task, while leveraging data from a related source task.
no code implementations • 30 Mar 2023 • Steve Hanneke, Shay Moran, Vinod Raman, Unique Subedi, Ambuj Tewari
We argue that the best expert has regret at most the Littlestone dimension relative to the best concept in the class.
no code implementations • 27 Feb 2023 • Yuval Filmus, Steve Hanneke, Idan Mehalel, Shay Moran
We prove an analogous result for randomized learners: we show that the optimal expected mistake bound in learning a class $\mathcal{H}$ equals its randomized Littlestone dimension, which is the largest $d$ for which there exists a tree shattered by $\mathcal{H}$ whose average depth is $2d$.
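For reference, the deterministic analogue is Littlestone's classical characterization; in symbols (notation ours, following the statement above):

$$\mathrm{opt}_{\mathrm{det}}(\mathcal{H}) = \mathrm{Ldim}(\mathcal{H}), \qquad \mathrm{opt}_{\mathrm{rand}}(\mathcal{H}) = \mathrm{RLdim}(\mathcal{H}) := \max\{\, d : \text{some tree shattered by } \mathcal{H} \text{ has average depth } 2d \,\}.$$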
no code implementations • 14 Feb 2023 • Moise Blanchard, Steve Hanneke, Patrick Jaillet
We show that optimistic universal learning for contextual bandits with adversarial rewards is impossible in general, contrary to all previously studied settings in online learning -- including standard supervised learning.
no code implementations • 31 Dec 2022 • Moise Blanchard, Steve Hanneke, Patrick Jaillet
Lastly, we consider the case of added continuity assumptions on rewards and show that these lead to universal consistency for significantly larger classes of data-generating processes.
no code implementations • 6 Oct 2022 • Steve Hanneke, Amin Karbasi, Mohammad Mahmoody, Idan Mehalel, Shay Moran
In this work we aim to characterize the smallest achievable error $\epsilon=\epsilon(\eta)$ by the learner in the presence of such an adversary in both realizable and agnostic settings.
no code implementations • 15 Sep 2022 • Omar Montasser, Steve Hanneke, Nathan Srebro
We present a minimax optimal learner for the problem of learning predictors robust to adversarial examples at test-time.
no code implementations • 31 Aug 2022 • Olivier Bousquet, Steve Hanneke, Shay Moran, Jonathan Shafer, Ilya Tolstikhin
We solve this problem in a principled manner, by introducing a combinatorial dimension called VCL that characterizes the best $d'$ for which $d'/n$ is a strong minimax lower bound.
no code implementations • 26 Jun 2022 • Idan Attias, Steve Hanneke
We study robustness to test-time adversarial attacks in the regression setting with $\ell_p$ losses and arbitrary perturbation sets.
no code implementations • 11 Mar 2022 • Steve Hanneke
This work provides an online learning rule that is universally consistent for processes on (X, Y) pairs, under conditions only on the X process.
no code implementations • 8 Mar 2022 • Maria-Florina Balcan, Avrim Blum, Steve Hanneke, Dravyansh Sharma
Remarkably, we provide a complete characterization of learnability in this setting, in particular, nearly-tight matching upper and lower bounds on the region that can be certified, as well as efficient algorithms for computing this region given an ERM oracle.
no code implementations • 11 Feb 2022 • Idan Attias, Steve Hanneke, Yishay Mansour
This shows that there is a significant benefit in semi-supervised robust learning even in the worst-case distribution-free model, and establishes a gap between the supervised and semi-supervised label complexities which is known not to hold in standard non-robust PAC learning.
no code implementations • 21 Jan 2022 • Moise Blanchard, Romain Cosson, Steve Hanneke
We resolve an open problem of Hanneke on the subject of universally consistent online learning with non-i.i.d. processes.
no code implementations • 20 Oct 2021 • Omar Montasser, Steve Hanneke, Nathan Srebro
We study the problem of adversarially robust learning in the transductive setting.
no code implementations • 20 Jul 2021 • Steve Hanneke
This open problem asks whether there exists an online learning algorithm for binary classification that guarantees, for all target concepts, to make a sublinear number of mistakes, under only the assumption that the (possibly random) sequence of points X admits the existence of such a learning algorithm for that sequence.
no code implementations • 18 Jul 2021 • Noga Alon, Steve Hanneke, Ron Holzman, Shay Moran
In fact we exhibit easy-to-learn partial concept classes which provably cannot be captured by the traditional PAC theory.
no code implementations • 1 Mar 2021 • Avrim Blum, Steve Hanneke, Jian Qian, Han Shao
We study the problem of robust learning under clean-label data-poisoning attacks, where the attacker injects (an arbitrary set of) correctly-labeled examples to the training set to fool the algorithm into making mistakes on specific test instances at test time.
no code implementations • 3 Feb 2021 • Omar Montasser, Steve Hanneke, Nathan Srebro
We study the problem of learning predictors that are robust to adversarial examples with respect to an unknown perturbation set, relying instead on interaction with an adversarial attacker or access to attack oracles, examining different models for such interactions.
no code implementations • 2 Feb 2021 • Steve Hanneke, Roi Livni, Shay Moran
More precisely, given any concept class C and any hypothesis class H, we provide nearly tight bounds (up to a log factor) on the optimal mistake bound for online learning C using predictors from H. Our bound yields an exponential improvement over the previously best known bound by Chase and Freitag (2020).
no code implementations • 9 Nov 2020 • Olivier Bousquet, Steve Hanneke, Shay Moran, Ramon van Handel, Amir Yehudayoff
How quickly can a given class of concepts be learned from examples?
3 code implementations • 9 Nov 2020 • Steve Hanneke, Aryeh Kontorovich
We analyze a family of supervised learning algorithms based on sample compression schemes that are stable, in the sense that removing points from the training set which were not selected for the compression set does not alter the resulting classifier.
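A minimal illustration of this stability property, using a greedy 1-NN compression on the real line as a hypothetical stand-in for the algorithms analyzed in the paper:

```python
# Greedy 1-NN compression on the line (illustrative stand-in only).
def predict(kept, x):
    """1-nearest-neighbor prediction using only the compression set."""
    return min(kept, key=lambda p: abs(p[0] - x))[1]

def compress(sample):
    """Greedily keep each point misclassified by 1-NN over the points kept so far."""
    kept = []
    for x, y in sample:
        if not kept or predict(kept, x) != y:
            kept.append((x, y))
    return kept

sample = [(0.0, 0), (1.0, 0), (4.0, 1), (5.0, 1)]
comp = compress(sample)   # [(0.0, 0), (4.0, 1)]

# Stability: removing training points that were not selected leaves the
# selected set, and hence the resulting classifier, unchanged.
assert compress(comp) == comp
```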
no code implementations • NeurIPS 2020 • Omar Montasser, Steve Hanneke, Nathan Srebro
We study the problem of reducing adversarially robust learning to standard PAC learning, i.e., the complexity of learning adversarially robust predictors using access to only a black-box non-robust learner.
no code implementations • 29 Jun 2020 • Steve Hanneke, Samory Kpotufe
A perplexing fact remains in the evolving theory on the subject: while we would hope for performance bounds that account for the contribution from multiple tasks, the vast majority of analyses result in bounds that improve at best in the number $n$ of samples per task, but most often do not improve in the number $N$ of tasks.
no code implementations • 24 May 2020 • Olivier Bousquet, Steve Hanneke, Shay Moran, Nikita Zhivotovskiy
It has been recently shown by Hanneke (2016) that the optimal sample complexity of PAC learning for any VC class C is achieved by a particular improper learning algorithm, which outputs a specific majority-vote of hypotheses in C. This leaves the question of when this bound can be achieved by proper learning algorithms, which are restricted to always output a hypothesis from C. In this paper we aim to characterize the classes for which the optimal sample complexity can be achieved by a proper learning algorithm.
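For intuition, the improper learner's output is a majority vote over hypotheses; here is a minimal sketch of such a predictor (the actual sub-sample construction in Hanneke (2016) is more involved):

```python
# Minimal majority-vote predictor over a list of hypotheses (illustrative).
def majority_vote(hypotheses, x):
    """Predict the label favored by more than half of the hypotheses."""
    return sum(1 if h(x) else -1 for h in hypotheses) > 0

# Hypothetical usage with three threshold classifiers.
hs = [lambda x: x >= 2, lambda x: x >= 3, lambda x: x >= 5]
assert majority_vote(hs, 4) is True and majority_vote(hs, 1) is False
```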
no code implementations • NeurIPS 2019 • Steve Hanneke, Samory Kpotufe
We aim to understand the value of additional labeled or unlabeled target data in transfer learning, for any given amount of source data; this is motivated by practical questions around minimizing sampling costs, whereby target data is usually harder or costlier to acquire than source data, but can yield better accuracy.
no code implementations • 24 Jun 2019 • Steve Hanneke, Aryeh Kontorovich, Sivan Sabato, Roi Weiss
This is the first learning algorithm known to enjoy this property; by comparison, the $k$-NN classifier and its variants are not generally universally Bayes-consistent, except under additional structural assumptions, such as an inner product, a norm, finite dimension, or a Besicovitch-type property.
no code implementations • 12 Feb 2019 • Omar Montasser, Steve Hanneke, Nathan Srebro
We study the question of learning an adversarially robust predictor.
no code implementations • 3 Oct 2018 • Idan Attias, Steve Hanneke, Aryeh Kontorovich, Menachem Sadigurschi
For the $\ell_2$ loss, does every function class admit an approximate compression scheme of polynomial size in the fat-shattering dimension?
no code implementations • 21 May 2018 • Steve Hanneke, Aryeh Kontorovich, Menachem Sadigurschi
We give an algorithmically efficient version of the learner-to-compression scheme conversion in Moran and Yehudayoff (2016).
no code implementations • 21 May 2018 • Steve Hanneke, Aryeh Kontorovich
We establish a tight characterization of the worst-case rates for the excess risk of agnostic learning with sample compression schemes and for uniform convergence for agnostic sample compression schemes.
no code implementations • 20 Feb 2018 • Steve Hanneke, Adam Kalai, Gautam Kamath, Christos Tzamos
A generative model may generate utter nonsense when it is fit to maximize the likelihood of observed data.
no code implementations • 23 Jun 2017 • Steve Hanneke, Liu Yang
We also identify the optimal dependence on the number of pieces in the query complexity of passive testing in the special case of piecewise constant functions.
no code implementations • 5 Jun 2017 • Steve Hanneke
We are then interested in the question of whether there exist learning rules guaranteed to be universally consistent given only the assumption that universally consistent learning is possible for the given data process.
no code implementations • 29 Apr 2017 • Amit Dhurandhar, Steve Hanneke, Liu Yang
In particular, we propose an approach to provably determine the time instant from which the new/changed features start becoming relevant with respect to an output variable in an agnostic (supervised) learning setting.
no code implementations • 26 Dec 2015 • Steve Hanneke, Liu Yang
Under these conditions, we propose a learning method, and establish that for bounded VC subgraph classes, the cumulative excess risk grows sublinearly in the number of predictions, at a quantified rate.
no code implementations • 22 Dec 2015 • Steve Hanneke
This article studies the achievable guarantees on the error rates of certain learning algorithms, with particular focus on refining logarithmic factors.
no code implementations • 2 Jul 2015 • Steve Hanneke
This work establishes a new upper bound on the number of samples sufficient for PAC learning in the realizable case.
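The bound established here takes the widely cited form, with $d$ the VC dimension, $\varepsilon$ the error, and $1-\delta$ the confidence, matching the known lower bound up to constant factors:

$$M(\varepsilon,\delta) = O\!\left( \frac{d + \log(1/\delta)}{\varepsilon} \right).$$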
no code implementations • 20 May 2015 • Steve Hanneke, Varun Kanade, Liu Yang
Some of the results also describe an active learning variant of this setting, and provide bounds on the number of queries for the labels of points in the sequence sufficient to obtain the stated bounds on the error rates.
no code implementations • 20 May 2015 • Liu Yang, Steve Hanneke, Jaime Carbonell
We study the optimal rates of convergence for estimating a prior distribution over a VC class from a sequence of independent data sets respectively labeled by independent target functions sampled from the prior.
no code implementations • 3 Oct 2014 • Steve Hanneke, Liu Yang
This work establishes distribution-free upper and lower bounds on the minimax label complexity of active learning with general hypothesis classes, under various noise models.
no code implementations • 5 Apr 2014 • Yair Wiener, Steve Hanneke, Ran El-Yaniv
We introduce a new and improved characterization of the label complexity of disagreement-based active learning, in which the leading quantity is the version space compression set size.
no code implementations • 16 Jul 2012 • Steve Hanneke, Liu Yang
Specifically, it presents an active learning algorithm based on an arbitrary classification-calibrated surrogate loss function, along with an analysis of the number of label requests sufficient for the classifier returned by the algorithm to achieve a given risk under the 0-1 loss.