Distribution-Calibrated Hierarchical Classification

NeurIPS 2009  ·  Ofer Dekel

While many advances have already been made on the topic of hierarchical classification learning, we take a step back and examine how a hierarchical classification problem should be formally defined. We pay particular attention to the fact that many arbitrary decisions go into the design of the label taxonomy that is provided with the training data, and that this taxonomy is often unbalanced. We correct this problem by using the data distribution to calibrate the hierarchical classification loss function. This distribution-based correction must be done with care, to avoid introducing unmanageable statistical dependencies into the learning problem. This leads us off the beaten path of binomial-type estimation and into the uncharted waters of geometric-type estimation. We present a new calibrated definition of statistical risk for hierarchical classification, an unbiased geometric estimator for this risk, and a new algorithmic reduction from hierarchical classification to cost-sensitive classification.
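The abstract's two central ideas can be illustrated with a short sketch: (1) calibrating a tree-induced loss by the data distribution so that arbitrary taxonomy design choices stop distorting it, and (2) geometric-type estimation of the reciprocal probabilities that such a calibration introduces. The sketch below is a hypothetical construction, not the paper's definition: the helper names (`subtree_masses`, `calibrated_tree_loss`, `trials_until_success`) and the log-ratio edge weighting are illustrative assumptions chosen because the weights telescope, making the loss invariant to inserting or collapsing intermediate taxonomy nodes.

```python
import math
from collections import defaultdict

def subtree_masses(parent, labels):
    """Empirical mass of each taxonomy node: the fraction of training
    labels that fall at or below it. `parent` maps node -> parent
    (root maps to None). In practice, nodes with zero observed mass
    would need smoothing; omitted here for brevity."""
    mass = defaultdict(float)
    for y in labels:
        node = y
        while node is not None:
            mass[node] += 1.0 / len(labels)
            node = parent[node]
    return mass

def calibrated_tree_loss(parent, mass, y_true, y_pred):
    """Sum of edge weights on the path between y_true and y_pred, with
    each edge (child, parent) weighted log(mass[parent] / mass[child])
    instead of the usual constant 1. These weights telescope along any
    root-to-node path, so the loss is unchanged by inserting or removing
    intermediate taxonomy nodes -- the kind of arbitrary design decision
    the calibration is meant to neutralize. (Illustrative weighting;
    the paper's exact calibration may differ.)"""
    def path_to_root(node):
        path = []
        while node is not None:
            path.append(node)
            node = parent[node]
        return path
    shared = set(path_to_root(y_true)) & set(path_to_root(y_pred))
    loss = 0.0
    for leaf in (y_true, y_pred):
        node = leaf
        while node not in shared:  # climb to the lowest common ancestor
            loss += math.log(mass[parent[node]] / mass[node])
            node = parent[node]
    return loss

def trials_until_success(sample_success):
    """Geometric-type estimation: count Bernoulli(p) trials up to and
    including the first success. The count N satisfies E[N] = 1/p, so N
    is an unbiased estimator of 1/p -- the reciprocal probabilities that
    appear once a loss is distribution-calibrated. Plugging an empirical
    frequency into 1/p would give a biased (binomial-type) estimate."""
    n = 1
    while not sample_success():
        n += 1
    return n

# Toy taxonomy: root -> {animal, vehicle}, animal -> {dog, cat}.
parent = {"root": None, "animal": "root", "vehicle": "root",
          "dog": "animal", "cat": "animal"}
labels = ["dog", "dog", "cat", "vehicle"]
mass = subtree_masses(parent, labels)
print(calibrated_tree_loss(parent, mass, "dog", "cat"))      # ~1.50
print(calibrated_tree_loss(parent, mass, "dog", "vehicle"))  # ~2.08
```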
