The Huber loss function describes the penalty incurred by an estimation procedure f. Huber (1964) defines the loss function piecewise by[1]
L_{\delta}(a) = \begin{cases} \frac{1}{2}a^{2} & \text{for } |a| \le \delta, \\ \delta \cdot \left(|a| - \frac{1}{2}\delta\right) & \text{otherwise.} \end{cases}
This function is quadratic for small values of a and linear for large values, with equal values and slopes of the two sections at the two points where |a| = \delta. The variable a often refers to the residual, that is, the difference between the observed and predicted values, a = y - f(x), so the former can be expanded to[2]
L_{\delta}(y, f(x)) = \begin{cases} \frac{1}{2}\left(y - f(x)\right)^{2} & \text{for } |y - f(x)| \le \delta, \\ \delta \cdot \left(|y - f(x)| - \frac{1}{2}\delta\right) & \text{otherwise.} \end{cases}
The Huber loss is the convolution of the absolute value function with the rectangular function, scaled and translated. Thus it "smooths out" the former's corner at the origin.
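As a concrete illustration of the piecewise definition, here is a minimal NumPy sketch (the function name and signature are illustrative, not from any particular library):

```python
import numpy as np

def huber_loss(residual, delta=1.0):
    # Elementwise Huber loss for residuals a = y - f(x):
    # quadratic for |a| <= delta, linear beyond, with matching
    # value and slope at |a| = delta.
    a = np.asarray(residual, dtype=float)
    quadratic = 0.5 * a ** 2
    linear = delta * (np.abs(a) - 0.5 * delta)
    return np.where(np.abs(a) <= delta, quadratic, linear)

# Small residuals are penalized quadratically, large ones only linearly:
print(huber_loss([0.5, 2.0], delta=1.0))  # [0.125 1.5]
```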
In elementwise form, for a prediction vector x and target vector y, the unreduced loss is

\ell(x, y) = L = \{l_1, \dots, l_N\}^{\top}

with

l_n = \begin{cases} \frac{1}{2}(x_n - y_n)^{2} & \text{if } |x_n - y_n| \le \delta, \\ \delta \cdot \left(|x_n - y_n| - \frac{1}{2}\delta\right) & \text{otherwise.} \end{cases}
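This elementwise form matches, for example, `torch.nn.HuberLoss` in recent versions of PyTorch; a brief usage sketch (the tensor values below are arbitrary):

```python
import torch
from torch import nn

# delta sets the residual magnitude at which the loss switches from
# quadratic to linear; reduction="none" keeps the per-element losses.
huber = nn.HuberLoss(reduction="none", delta=1.0)

pred = torch.tensor([2.5, 0.0, 8.0])
target = torch.tensor([3.0, -0.5, 7.0])

print(huber(pred, target))  # tensor([0.1250, 0.1250, 0.5000])
```

With the default `reduction="mean"`, the per-element losses are averaged into a single scalar, the usual choice when training with backpropagation.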
Tasks in which the Huber loss is most commonly used:

| Task | Papers | Share |
|---|---|---|
| Matrix Completion | 5 | 8.33% |
| Reinforcement Learning (RL) | 4 | 6.67% |
| Quantile Regression | 4 | 6.67% |
| Reinforcement Learning | 3 | 5.00% |
| Clustering | 2 | 3.33% |
| Distributional Reinforcement Learning | 2 | 3.33% |
| Multi-Task Learning | 2 | 3.33% |
| Dimensionality Reduction | 2 | 3.33% |
| Object Detection | 2 | 3.33% |