no code implementations • 5 Apr 2023 • Jane Lange, Arsen Vasilyan
Given $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$ uniformly random examples of an unknown function $f:\{\pm 1\}^n \rightarrow \{\pm 1\}$, our algorithm outputs a hypothesis $g:\{\pm 1\}^n \rightarrow \{\pm 1\}$ that is monotone and $(\mathrm{opt} + \varepsilon)$-close to $f$, where $\mathrm{opt}$ is the distance from $f$ to the closest monotone function.
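The definitions involved (monotonicity, and $\mathrm{opt}$ as the distance to the closest monotone function) can be made concrete with a small brute-force sketch. This is purely illustrative and is not the paper's algorithm; it uses $\{0,1\}$ bits instead of $\{\pm 1\}$ inputs for indexing convenience, is only feasible for tiny $n$, and all function names are hypothetical.

```python
from itertools import product

def is_monotone(f, n):
    """f: {0,1}^n -> {-1,+1} is monotone if flipping a 0 to a 1 never decreases f."""
    for x in product([0, 1], repeat=n):
        for i in range(n):
            if x[i] == 0:
                y = x[:i] + (1,) + x[i+1:]
                if f(y) < f(x):
                    return False
    return True

def dist(f, g, n):
    """Fraction of the cube on which f and g disagree."""
    pts = list(product([0, 1], repeat=n))
    return sum(f(x) != g(x) for x in pts) / len(pts)

def opt_monotone(f, n):
    """Brute-force distance from f to the closest monotone function (tiny n only)."""
    pts = list(product([0, 1], repeat=n))
    best = 1.0
    for bits in product([-1, 1], repeat=len(pts)):
        table = dict(zip(pts, bits))
        g = lambda x, t=table: t[x]
        if is_monotone(g, n):
            best = min(best, dist(f, g, n))
    return best
```

For example, the anti-monotone function on two bits that is $+1$ only at $(0,0)$ has $\mathrm{opt} = 1/4$: the constant $-1$ function is monotone and disagrees on exactly one of the four points.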
no code implementations • 27 Mar 2023 • Guy Blanc, Jane Lange, Ali Malik, Li-Yang Tan
We show how any PAC learning algorithm that works under the uniform distribution can be transformed, in a blackbox fashion, into one that works under an arbitrary and unknown distribution $\mathcal{D}$.
no code implementations • 14 Jul 2022 • Guy Blanc, Caleb Koch, Jane Lange, Li-Yang Tan
Here $S(f)$ is the sensitivity of $f$, a discrete analogue of the Lipschitz constant, and $\Delta_f(x^\star)$ is the distance from $x^\star$ to its nearest counterfactuals.
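The two quantities in this bound, the sensitivity $S(f)$ and the counterfactual distance $\Delta_f(x^\star)$, are easy to state in code. A minimal sketch by exhaustive enumeration (feasible only for small $n$; $\{0,1\}$ bits are used in place of $\{\pm 1\}$, and the function names are my own):

```python
from itertools import product

def sensitivity_at(f, x):
    """Number of coordinates whose single-bit flip changes f(x)."""
    return sum(f(x[:i] + (1 - x[i],) + x[i+1:]) != f(x) for i in range(len(x)))

def max_sensitivity(f, n):
    """S(f): maximum sensitivity over all inputs, a discrete Lipschitz constant."""
    return max(sensitivity_at(f, x) for x in product([0, 1], repeat=n))

def counterfactual_distance(f, x_star, n):
    """Delta_f(x*): Hamming distance from x* to the nearest differently-labeled input."""
    target = f(x_star)
    return min(
        sum(a != b for a, b in zip(x_star, y))
        for y in product([0, 1], repeat=n)
        if f(y) != target
    )
```

On three bits, parity has $S(f) = 3$ (every flip changes the label), while majority has $S(f) = 2$, and the all-ones point is at Hamming distance 2 from its nearest counterfactual under majority.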
no code implementations • 29 Jun 2022 • Guy Blanc, Jane Lange, Mingda Qiao, Li-Yang Tan
The previous fastest algorithm for this problem ran in $n^{O(\log n)}$ time, a consequence of the classic algorithm of Ehrenfeucht and Haussler (1989) for the distribution-free setting.
no code implementations • 17 Jun 2022 • Guy Blanc, Jane Lange, Ali Malik, Li-Yang Tan
Using the framework of boosting, we prove that all impurity-based decision tree learning algorithms, including the classic ID3, C4.5, and CART, are highly noise tolerant.
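The core step shared by these impurity-based heuristics is picking the split that most reduces an impurity measure. A minimal sketch of that step using Gini impurity (CART's choice; ID3 and C4.5 use entropy-based variants), with hypothetical function names and binary features:

```python
def gini(labels):
    """Gini impurity of a list of +/-1 labels: 2p(1-p) for p = fraction of +1s."""
    if not labels:
        return 0.0
    p = sum(1 for y in labels if y == 1) / len(labels)
    return 2 * p * (1 - p)

def best_split(X, y):
    """Return the feature index whose split most reduces weighted Gini impurity,
    or None if no split improves on the unsplit impurity."""
    best_i, best_score = None, gini(y)
    for i in range(len(X[0])):
        left = [y[j] for j in range(len(X)) if X[j][i] == 0]
        right = [y[j] for j in range(len(X)) if X[j][i] == 1]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_i, best_score = i, score
    return best_i
```

Recursing on the two sides of the chosen split, until leaves are pure or a size budget is hit, yields the top-down tree builders the result applies to.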
no code implementations • 19 Nov 2021 • Guy Blanc, Jane Lange, Ali Malik, Li-Yang Tan
Specifically, can the behavior of an algorithm $\mathcal{A}$ in the presence of oblivious adversaries always be well-approximated by that of an algorithm $\mathcal{A}'$ in the presence of adaptive adversaries?
no code implementations • NeurIPS 2021 • Guy Blanc, Jane Lange, Li-Yang Tan
We consider the problem of explaining the predictions of an arbitrary blackbox model $f$: given query access to $f$ and an instance $x$, output a small set of $x$'s features that in conjunction essentially determines $f(x)$.
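One naive way to approach this task is to greedily free features of $x$ one at a time, keeping a feature fixed only if freeing it can change $f$'s output on random completions. This Monte Carlo sketch is an assumption-laden stand-in, not the paper's algorithm, and every name in it is hypothetical:

```python
import random

def sufficient_features(f, x, n, trials=200, seed=0):
    """Greedily shrink the set of fixed coordinates of x while f's prediction
    stays (empirically) determined: for each candidate coordinate to free,
    re-randomize all freed coordinates and check f still outputs f(x)."""
    rng = random.Random(seed)
    target = f(x)
    kept = set(range(n))
    for i in range(n):
        trial_kept = kept - {i}
        ok = True
        for _ in range(trials):
            z = tuple(x[j] if j in trial_kept else rng.randint(0, 1) for j in range(n))
            if f(z) != target:
                ok = False
                break
        if ok:
            kept = trial_kept
    return sorted(kept)
```

For a blackbox that depends only on one coordinate, the greedy pass frees everything else, returning just that coordinate as the explanation.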
no code implementations • 1 Sep 2021 • Guy Blanc, Jane Lange, Mingda Qiao, Li-Yang Tan
We give an $n^{O(\log\log n)}$-time membership query algorithm for properly and agnostically learning decision trees under the uniform distribution over $\{\pm 1\}^n$.
no code implementations • 2 Jul 2021 • Guy Blanc, Jane Lange, Mingda Qiao, Li-Yang Tan
Greedy decision tree learning heuristics are mainstays of machine learning practice, but theoretical justification for their empirical success remains elusive.
no code implementations • 8 May 2021 • Guy Blanc, Jane Lange, Li-Yang Tan
Given an $\eta$-corrupted set of uniform random samples labeled by a size-$s$ stochastic decision tree, our algorithm runs in time $n^{O(\log(s/\varepsilon)/\varepsilon^2)}$ and returns a hypothesis with error within an additive $2\eta + \varepsilon$ of the Bayes optimal.
no code implementations • 16 Dec 2020 • Guy Blanc, Jane Lange, Li-Yang Tan
We give the first {\sl reconstruction algorithm} for decision trees: given queries to a function $f$ that is $\mathrm{opt}$-close to a size-$s$ decision tree, our algorithm provides query access to a decision tree $T$ where:
$\circ$ $T$ has size $S = s^{O((\log s)^2/\varepsilon^3)}$;
$\circ$ $\mathrm{dist}(f, T)\le O(\mathrm{opt})+\varepsilon$;
$\circ$ Every query to $T$ is answered with $\mathrm{poly}((\log s)/\varepsilon)\cdot \log n$ queries to $f$ and in $\mathrm{poly}((\log s)/\varepsilon)\cdot n\log n$ time.
no code implementations • NeurIPS 2020 • Guy Blanc, Neha Gupta, Jane Lange, Li-Yang Tan
We show that top-down decision tree learning heuristics are amenable to highly efficient learnability estimation: for monotone target functions, the error of the decision tree hypothesis constructed by these heuristics can be estimated with polylogarithmically many labeled examples, exponentially fewer than the number necessary to run these heuristics and, indeed, exponentially fewer than the information-theoretic minimum required to learn a good decision tree.
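For contrast with the polylogarithmic sample complexity claimed here, the generic baseline is a plain Hoeffding-style Monte Carlo estimate, which needs $O(\log(1/\delta)/\varepsilon^2)$ labeled examples to estimate a hypothesis's error to within $\pm\varepsilon$. A sketch of that baseline (the paper's estimator for top-down heuristics is far more sample-efficient; the names here are my own):

```python
import math
import random

def estimate_error(h, f, n, eps, delta, seed=0):
    """Estimate dist(h, f) under the uniform distribution to within +/- eps
    with probability 1 - delta, via a standard Hoeffding sample bound."""
    m = math.ceil(math.log(2 / delta) / (2 * eps ** 2))
    rng = random.Random(seed)
    disagreements = 0
    for _ in range(m):
        x = tuple(rng.randint(0, 1) for _ in range(n))
        disagreements += (h(x) != f(x))
    return disagreements / m
```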
no code implementations • NeurIPS 2020 • Guy Blanc, Neha Gupta, Jane Lange, Li-Yang Tan
We propose a simple extension of top-down decision tree learning heuristics such as ID3, C4.5, and CART.
no code implementations • ICML 2020 • Guy Blanc, Jane Lange, Li-Yang Tan
We give strengthened provable guarantees on the performance of widely employed and empirically successful {\sl top-down decision tree learning heuristics}.
no code implementations • 18 Nov 2019 • Guy Blanc, Jane Lange, Li-Yang Tan
We analyze the quality of this heuristic, obtaining near-matching upper and lower bounds:
$\circ$ Upper bound: For every $f$ with decision tree size $s$ and every $\varepsilon \in (0,\frac{1}{2})$, this heuristic builds a decision tree of size at most $s^{O(\log(s/\varepsilon)\log(1/\varepsilon))}$.