Gradient boosting machines
Published in Brandon M. Greenwell, Tree-Based Methods for Statistical Learning in R, 2022
For regression, LS and LAD loss are common choices. LAD loss has the benefit of being robust in the presence of long-tailed error distributions and response outliers. Regardless of loss, gradient tree boosting is already robust to long-tailed distributions or outliers in the feature space due to the robustness of the individual base learners (in this case, regression trees). Recall that trees are invariant to strictly monotone transformations of the predictors (e.g., using $x_j$, $e^{x_j}$, or $\log x_j$ for the $j$-th predictor all produce the same results). Consequently, there is little need to be concerned with transformations of the features in most tree-based ensembles. For outcomes with normally distributed errors (or at least approximately so), LAD loss will be less efficient than LS loss and generalization performance will suffer. A happy compromise is provided by the Huber loss function for Huber M-regression described in Friedman [2001] and Hastie et al. [2009, p. 360]. The Huber loss function provides resistance to outliers and long-tailed error distributions while maintaining high efficiency in cases where errors are more normally distributed.
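As a quick illustration of the trade-off described above, here is a sketch comparing the three loss functions on a vector of residuals; the use of NumPy, the function names, and the transition point `delta` are assumptions for illustration, not from the book:

```python
# Minimal sketch (not from the book): LS, LAD, and Huber loss evaluated
# on a residual r = y - f(x); `delta` is the Huber transition point.
import numpy as np

def ls_loss(r):
    """Least-squares loss: efficient for Gaussian errors, outlier-sensitive."""
    return 0.5 * r**2

def lad_loss(r):
    """Least-absolute-deviation loss: robust to outliers, less efficient."""
    return np.abs(r)

def huber_loss(r, delta=1.0):
    """Quadratic for |r| <= delta, linear beyond: the compromise in the text."""
    return np.where(np.abs(r) <= delta,
                    0.5 * r**2,
                    delta * (np.abs(r) - 0.5 * delta))

r = np.array([-10.0, -1.0, -0.1, 0.1, 1.0, 10.0])  # small and outlying residuals
print(ls_loss(r))     # grows quadratically: the outliers dominate
print(lad_loss(r))    # grows linearly everywhere
print(huber_loss(r))  # quadratic near zero, linear in the tails
```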
Multiobjective optimization of support vector regression parameters by teaching-learning-based optimization for modeling of electric discharge machining responses
Published in Angelos P. Markopoulos, J. Paulo Davim, Advanced Machining Processes, 2017
A number of loss functions, namely the quadratic loss function, the Huber loss function, the ε-insensitive loss function, and so on, have already been developed for handling different types of problems [16]. In general, these loss functions are modified measures of the distance between the observed points and their corresponding estimated values. The quadratic loss function assigns loss using the squared distances between the actual points and the corresponding estimated values, and corresponds to the conventional least-squares error criterion. The Huber loss function is a combination of the linear and quadratic loss functions. This robust loss function exhibits optimal properties when the underlying distribution of the data is unknown. Still, neither of these two loss functions, quadratic or Huber, produces sparseness in the support vectors. To address this issue, Vapnik [12] proposed the ε-insensitive loss function as a trade-off between the robust loss function of Huber and one that enables sparsity within the support vectors. The ε-insensitive loss function (see Figure 7.3) may be defined as [14]:
$$L\big(y_i, f(x_i)\big) = \begin{cases} \left|y_{i,\text{experimental}} - f(x_i)\right| - \varepsilon, & \text{if } \left|y_{i,\text{experimental}} - f(x_i)\right| \geq \varepsilon \\ 0, & \text{if } \left|y_{i,\text{experimental}} - f(x_i)\right| < \varepsilon \end{cases}$$
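A small sketch of the ε-insensitive loss as defined above, assuming the experimental values and model predictions sit in NumPy arrays (the names `y_exp`, `y_pred`, and `eps` are illustrative, not from the chapter):

```python
# Illustrative sketch of the epsilon-insensitive loss: zero inside the
# epsilon tube, linear outside it.
import numpy as np

def eps_insensitive_loss(y_exp, y_pred, eps=0.1):
    abs_err = np.abs(y_exp - y_pred)
    # Errors smaller than eps incur no loss; this flat region is what
    # yields sparseness in the support vectors.
    return np.maximum(abs_err - eps, 0.0)

y_exp = np.array([1.00, 2.00, 3.00])
y_pred = np.array([1.05, 2.50, 3.00])
print(eps_insensitive_loss(y_exp, y_pred))  # [0.  0.4 0. ]
```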
Residual reinforcement learning for logistics cart transportation
Published in Advanced Robotics, 2022
Ryosuke Matsuo, Shinya Yasuda, Taichi Kumagai, Natsuhiko Sato, Hiroshi Yoshida, Takehisa Yairi
Therefore, the objective for minimizing the TD error is written as $L(\theta) = \mathbb{E}\big[\ell_{\text{Huber}}(\delta)\big]$, where $\delta$ is the TD error and the Huber loss is used for robustness to outliers. Also, $\theta'$, the parameter of the target network, is updated to track $\theta$. Concretely, $\theta'$ is updated as $\theta' \leftarrow \tau \theta + (1 - \tau)\theta'$, where $\tau$ is a hyper-parameter; the smaller $\tau$ is, the more slowly the target network is updated. Additionally, TD3 utilizes two methods for Q-learning: (i) clipped double Q-learning, where two Q-functions are used and the smaller of the two Q-values is adopted as the target value to avoid overestimation, and (ii) target policy smoothing, where noise is added to the action when computing the target network's Q-value in the target value calculation, to stabilize learning.
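A minimal sketch of the critic-side computations described here, written in PyTorch; the network sizes, hyper-parameter values, and all names below are illustrative assumptions rather than details from the paper:

```python
# Illustrative TD3-style critic update: clipped double Q-learning,
# target policy smoothing, Huber loss on the TD error, and soft updates.
import torch
import torch.nn as nn
import torch.nn.functional as F

state_dim, action_dim, max_action = 8, 2, 1.0

def make_critic():
    # Q(s, a): scores a concatenated (state, action) pair.
    return nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
                         nn.Linear(64, 1))

critic1, critic2 = make_critic(), make_critic()
target1, target2 = make_critic(), make_critic()
target_actor = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                             nn.Linear(64, action_dim), nn.Tanh())

def critic_loss(s, a, r, s_next, done, gamma=0.99,
                policy_noise=0.2, noise_clip=0.5):
    with torch.no_grad():
        # (ii) Target policy smoothing: perturb the target action with
        # clipped Gaussian noise.
        noise = (torch.randn_like(a) * policy_noise).clamp(-noise_clip, noise_clip)
        a_next = (target_actor(s_next) * max_action + noise).clamp(-max_action,
                                                                   max_action)
        sa_next = torch.cat([s_next, a_next], dim=1)
        # (i) Clipped double Q-learning: take the smaller of the two target
        # Q-values to avoid overestimation.
        q_next = torch.min(target1(sa_next), target2(sa_next))
        y = r + gamma * (1.0 - done) * q_next  # target value
    sa = torch.cat([s, a], dim=1)
    # Smooth-L1 is the Huber loss (delta = 1) applied to the TD error.
    return F.smooth_l1_loss(critic1(sa), y) + F.smooth_l1_loss(critic2(sa), y)

def soft_update(target, source, tau=0.005):
    # theta' <- tau * theta + (1 - tau) * theta': small tau tracks slowly.
    for tp, p in zip(target.parameters(), source.parameters()):
        tp.data.mul_(1.0 - tau).add_(tau * p.data)
```

In a full agent, `critic_loss` would be minimized over replayed transitions, followed by `soft_update` on both target critics and the target actor.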
A Ranking-based Weakly Supervised Learning model for telemonitoring of Parkinson’s disease
Published in IISE Transactions on Healthcare Systems Engineering, 2022
Dhari F. Alenezi, Hang Shi, Jing Li
The Huber loss combines the robustness of the $\ell_1$-norm with the stability of the $\ell_2$-norm: for large errors it is linear, and for small errors it is quadratic. The Huber loss also admits fast methods for computing the gradient and performing Hessian-times-vector operations. The resulting problem is convex, unconstrained, and differentiable.
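To illustrate the fast gradient and Hessian-times-vector computations mentioned above, here is a sketch for a Huber loss applied to the residual $r = y - Xw$ of a linear model; the model, the threshold `delta`, and all names are illustrative assumptions, not the paper's formulation:

```python
# Illustrative sketch: gradient and Hessian-times-vector products for the
# Huber loss of the residual r = y - X @ w.
import numpy as np

def huber_grad(X, y, w, delta=1.0):
    r = y - X @ w
    # Piecewise derivative: r in the quadratic region, delta*sign(r) in the
    # linear region; both are cheap elementwise operations.
    g = np.where(np.abs(r) <= delta, r, delta * np.sign(r))
    return -X.T @ g  # chain rule through r = y - X @ w

def huber_hess_vec(X, y, w, v, delta=1.0):
    r = y - X @ w
    # The Hessian is X.T @ D @ X with D diagonal (1 in the quadratic region,
    # 0 in the linear region), so Hv needs only matrix-vector products and
    # never forms the Hessian explicitly.
    d = (np.abs(r) <= delta).astype(float)
    return X.T @ (d * (X @ v))

rng = np.random.default_rng(0)
X, y = rng.normal(size=(100, 5)), rng.normal(size=100)
w, v = np.zeros(5), rng.normal(size=5)
print(huber_grad(X, y, w))
print(huber_hess_vec(X, y, w, v))
```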
Deep multistage multi-task learning for quality prediction of multistage manufacturing systems
Published in Journal of Quality Technology, 2021
Hao Yan, Nurettin Dorukhan Sergin, William A. Brenneman, Stephen Joseph Lange, Shan Ba
Huber loss can be used instead of the mean-squared error. The Huber loss function uses a linear function when the difference is large, which enables more robust estimation. Furthermore, we find that it can also help the model identify and focus more directly on the related output variables by being more robust to the unrelated output variables. We will discuss how to optimize the model parameters efficiently in Section 3.3.
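A toy numeric sketch of this robustness point (reusing the piecewise Huber definition from the earlier sketches; the residual values are made up): under squared error, an unrelated output with large residuals dominates the averaged multi-output loss, while under Huber loss its contribution grows only linearly:

```python
# Illustrative sketch: per-output loss for a multi-output model where the
# third output is "unrelated" and carries large residuals.
import numpy as np

def huber(r, delta=1.0):
    return np.where(np.abs(r) <= delta,
                    0.5 * r**2,
                    delta * (np.abs(r) - 0.5 * delta))

residuals = np.array([[0.1, 0.2, 5.0],   # rows: samples; columns: outputs
                      [0.2, 0.1, 6.0]])
print((0.5 * residuals**2).mean(axis=0))  # squared error: third output dominates
print(huber(residuals).mean(axis=0))      # Huber: its influence is capped
```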