Complexity Theoretic Limitations on Learning Halfspaces

21 May 2015  ·  Amit Daniely ·

We study the problem of agnostically learning halfspaces which is defined by a fixed but unknown distribution $\mathcal{D}$ on $\mathbb{Q}^n\times \{\pm 1\}$. We define $\mathrm{Err}_{\mathrm{HALF}}(\mathcal{D})$ as the least error of a halfspace classifier for $\mathcal{D}$. A learner who can access $\mathcal{D}$ has to return a hypothesis whose error is small compared to $\mathrm{Err}_{\mathrm{HALF}}(\mathcal{D})$. Using the recently developed method of the author, Linial and Shalev-Shwartz we prove hardness of learning results under a natural assumption on the complexity of refuting random $K$-$\mathrm{XOR}$ formulas. We show that no efficient learning algorithm has non-trivial worst-case performance even under the guarantees that $\mathrm{Err}_{\mathrm{HALF}}(\mathcal{D}) \le \eta$ for arbitrarily small constant $\eta>0$, and that $\mathcal{D}$ is supported in $\{\pm 1\}^n\times \{\pm 1\}$. Namely, even under these favorable conditions its error must be $\ge \frac{1}{2}-\frac{1}{n^c}$ for every $c>0$. In particular, no efficient algorithm can achieve a constant approximation ratio. Under a stronger version of the assumption (where $K$ can be poly-logarithmic in $n$), we can take $\eta = 2^{-\log^{1-\nu}(n)}$ for arbitrarily small $\nu>0$. Interestingly, this is even stronger than the best known lower bounds (Arora et. al. 1993, Feldamn et. al. 2006, Guruswami and Raghavendra 2006) for the case that the learner is restricted to return a halfspace classifier (i.e. proper learning).

PDF Abstract
No code implementations yet. Submit your code now

Tasks


Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here