Gradient boosting machines
Published in Brandon M. Greenwell, Tree-Based Methods for Statistical Learning in R, 2022
For regression, LS and LAD loss are common choices. LAD loss has the benefit of being robust in the presence of long-tailed error distributions and response outliers. Regardless of loss, gradient tree boosting is already robust to long-tailed distributions or outliers in the feature space due to the robustness of the individual base learners (in this case, regression trees). Recall that trees are invariant to strictly monotone transformations of the predictors (e.g., using $x_j$, $e^{x_j}$, or $\log x_j$ for the $j$-th predictor all produce the same results). Consequently, there is little need to be concerned with transformations of the features in most tree-based ensembles. For outcomes with normally distributed errors (or at least approximately so), LAD loss will be less efficient than LS loss and generalization performance will suffer. A happy compromise is provided by the Huber loss function for Huber M-regression described in Friedman [2001] and Hastie et al. [2009, p. 360]. The Huber loss function provides resistance to outliers and long-tailed error distributions while maintaining high efficiency in cases where errors are more normally distributed.
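As a quick illustration of the trade-off described above, here is a sketch comparing the three loss functions on a vector of residuals; the use of NumPy, the function names, and the transition point `delta` are assumptions for illustration, not from the book:

```python
# Minimal sketch (not from the book): LS, LAD, and Huber loss evaluated
# on a residual r = y - f(x); `delta` is the Huber transition point.
import numpy as np

def ls_loss(r):
    """Least-squares loss: efficient for Gaussian errors, outlier-sensitive."""
    return 0.5 * r**2

def lad_loss(r):
    """Least-absolute-deviation loss: robust to outliers, less efficient."""
    return np.abs(r)

def huber_loss(r, delta=1.0):
    """Quadratic for |r| <= delta, linear beyond: the compromise in the text."""
    return np.where(np.abs(r) <= delta,
                    0.5 * r**2,
                    delta * (np.abs(r) - 0.5 * delta))

r = np.array([-10.0, -1.0, -0.1, 0.1, 1.0, 10.0])  # small and outlying residuals
print(ls_loss(r))     # grows quadratically: the outliers dominate
print(lad_loss(r))    # grows linearly everywhere
print(huber_loss(r))  # quadratic near zero, linear in the tails
```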
Multiobjective optimization of support vector regression parameters by teaching-learning-based optimization for modeling of electric discharge machining responses
Published in Angelos P. Markopoulos, J. Paulo Davim, Advanced Machining Processes, 2017
A number of loss functions, namely the quadratic loss function, the Huber loss function, the ε-insensitive loss function, and so on, have already been developed for handling different types of problems [16]. In general, these loss functions are modified measures of the distance between the observed points and their corresponding estimated values. The quadratic loss function assigns loss using the squared distances between the actual points and the corresponding estimated values, and corresponds to the conventional least-squares error criterion. The Huber loss function is a combination of the linear and quadratic loss functions. This robust loss function exhibits optimal properties when the underlying distribution of the data is unknown. Still, neither of these two loss functions, quadratic or Huber, produces sparseness in the support vectors. To address this issue, Vapnik [12] proposed the ε-insensitive loss function as a trade-off between the robust loss function of Huber and one that enables sparsity within the support vectors. The ε-insensitive loss function (see Figure 7.3) may be defined as [14]:
$$L\big(y_i, f(x_i)\big) = \begin{cases} \left|y_{i,\text{experimental}} - f(x_i)\right| - \varepsilon, & \text{if } \left|y_{i,\text{experimental}} - f(x_i)\right| \geq \varepsilon \\ 0, & \text{if } \left|y_{i,\text{experimental}} - f(x_i)\right| < \varepsilon \end{cases}$$
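A small sketch of the ε-insensitive loss as defined above, assuming the experimental values and model predictions sit in NumPy arrays (the names `y_exp`, `y_pred`, and `eps` are illustrative, not from the chapter):

```python
# Illustrative sketch of the epsilon-insensitive loss: zero inside the
# epsilon tube, linear outside it.
import numpy as np

def eps_insensitive_loss(y_exp, y_pred, eps=0.1):
    abs_err = np.abs(y_exp - y_pred)
    # Errors smaller than eps incur no loss; this flat region is what
    # yields sparseness in the support vectors.
    return np.maximum(abs_err - eps, 0.0)

y_exp = np.array([1.00, 2.00, 3.00])
y_pred = np.array([1.05, 2.50, 3.00])
print(eps_insensitive_loss(y_exp, y_pred))  # [0.  0.4 0. ]
```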
Residual reinforcement learning for logistics cart transportation
Published in Advanced Robotics, 2022
Ryosuke Matsuo, Shinya Yasuda, Taichi Kumagai, Natsuhiko Sato, Hiroshi Yoshida, Takehisa Yairi
Therefore, the objective for minimizing the TD error is written as $L(\theta) = \mathbb{E}\big[\ell_{\text{Huber}}(\delta)\big]$, where $\delta$ is the TD error and the Huber loss is used for robustness to outliers. Also, $\theta'$, the parameter of the target network, is updated to track $\theta$. Concretely, $\theta'$ is updated as $\theta' \leftarrow \tau \theta + (1 - \tau)\theta'$, where $\tau$ is a hyper-parameter; the smaller $\tau$ is, the more slowly the target network is updated. Additionally, TD3 utilizes two methods for Q-learning: (i) clipped double Q-learning, where two Q-functions are used and the smaller of the two Q-values is adopted as the target value to avoid overestimation, and (ii) target policy smoothing, where noise is added to the action when computing the target network's Q-value in the target value calculation, to stabilize learning.
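A minimal sketch of the critic-side computations described here, written in PyTorch; the network sizes, hyper-parameter values, and all names below are illustrative assumptions rather than details from the paper:

```python
# Illustrative TD3-style critic update: clipped double Q-learning,
# target policy smoothing, Huber loss on the TD error, and soft updates.
import torch
import torch.nn as nn
import torch.nn.functional as F

state_dim, action_dim, max_action = 8, 2, 1.0

def make_critic():
    # Q(s, a): scores a concatenated (state, action) pair.
    return nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
                         nn.Linear(64, 1))

critic1, critic2 = make_critic(), make_critic()
target1, target2 = make_critic(), make_critic()
target_actor = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                             nn.Linear(64, action_dim), nn.Tanh())

def critic_loss(s, a, r, s_next, done, gamma=0.99,
                policy_noise=0.2, noise_clip=0.5):
    with torch.no_grad():
        # (ii) Target policy smoothing: perturb the target action with
        # clipped Gaussian noise.
        noise = (torch.randn_like(a) * policy_noise).clamp(-noise_clip, noise_clip)
        a_next = (target_actor(s_next) * max_action + noise).clamp(-max_action,
                                                                   max_action)
        sa_next = torch.cat([s_next, a_next], dim=1)
        # (i) Clipped double Q-learning: take the smaller of the two target
        # Q-values to avoid overestimation.
        q_next = torch.min(target1(sa_next), target2(sa_next))
        y = r + gamma * (1.0 - done) * q_next  # target value
    sa = torch.cat([s, a], dim=1)
    # Smooth-L1 is the Huber loss (delta = 1) applied to the TD error.
    return F.smooth_l1_loss(critic1(sa), y) + F.smooth_l1_loss(critic2(sa), y)

def soft_update(target, source, tau=0.005):
    # theta' <- tau * theta + (1 - tau) * theta': small tau tracks slowly.
    for tp, p in zip(target.parameters(), source.parameters()):
        tp.data.mul_(1.0 - tau).add_(tau * p.data)
```

In a full agent, `critic_loss` would be minimized over replayed transitions, followed by `soft_update` on both target critics and the target actor.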
A Ranking-based Weakly Supervised Learning model for telemonitoring of Parkinson’s disease
Published in IISE Transactions on Healthcare Systems Engineering, 2022
Dhari F. Alenezi, Hang Shi, Jing Li
The Huber loss combines the robustness of the $\ell_1$-norm with the stability of the $\ell_2$-norm: for large errors it is linear, and for small errors it is quadratic. The Huber loss also admits fast methods for computing the gradient and performing Hessian-times-vector operations. The resulting problem is convex, unconstrained, and differentiable.
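To illustrate the fast gradient and Hessian-times-vector computations mentioned above, here is a sketch for a Huber loss applied to the residual $r = y - Xw$ of a linear model; the model, the threshold `delta`, and all names are illustrative assumptions, not the paper's formulation:

```python
# Illustrative sketch: gradient and Hessian-times-vector products for the
# Huber loss of the residual r = y - X @ w.
import numpy as np

def huber_grad(X, y, w, delta=1.0):
    r = y - X @ w
    # Piecewise derivative: r in the quadratic region, delta*sign(r) in the
    # linear region; both are cheap elementwise operations.
    g = np.where(np.abs(r) <= delta, r, delta * np.sign(r))
    return -X.T @ g  # chain rule through r = y - X @ w

def huber_hess_vec(X, y, w, v, delta=1.0):
    r = y - X @ w
    # The Hessian is X.T @ D @ X with D diagonal (1 in the quadratic region,
    # 0 in the linear region), so Hv needs only matrix-vector products and
    # never forms the Hessian explicitly.
    d = (np.abs(r) <= delta).astype(float)
    return X.T @ (d * (X @ v))

rng = np.random.default_rng(0)
X, y = rng.normal(size=(100, 5)), rng.normal(size=100)
w, v = np.zeros(5), rng.normal(size=5)
print(huber_grad(X, y, w))
print(huber_hess_vec(X, y, w, v))
```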
Deep multistage multi-task learning for quality prediction of multistage manufacturing systems
Published in Journal of Quality Technology, 2021
Hao Yan, Nurettin Dorukhan Sergin, William A. Brenneman, Stephen Joseph Lange, Shan Ba
Huber loss can be used instead of the mean-squared error. The Huber loss function uses a linear function when the difference is large, which enables more robust estimation. Furthermore, we find that it can also help the model identify and focus more directly on the related output variables by being more robust to the unrelated output variables. We will discuss how to optimize the model parameters efficiently in Section 3.3.
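A toy numeric sketch of this robustness point (reusing the piecewise Huber definition from the earlier sketches; the residual values are made up): under squared error, an unrelated output with large residuals dominates the averaged multi-output loss, while under Huber loss its contribution grows only linearly:

```python
# Illustrative sketch: per-output loss for a multi-output model where the
# third output is "unrelated" and carries large residuals.
import numpy as np

def huber(r, delta=1.0):
    return np.where(np.abs(r) <= delta,
                    0.5 * r**2,
                    delta * (np.abs(r) - 0.5 * delta))

residuals = np.array([[0.1, 0.2, 5.0],   # rows: samples; columns: outputs
                      [0.2, 0.1, 6.0]])
print((0.5 * residuals**2).mean(axis=0))  # squared error: third output dominates
print(huber(residuals).mean(axis=0))      # Huber: its influence is capped
```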