Gradient boosting machines
Published in Brandon M. Greenwell, Tree-Based Methods for Statistical Learning in R, 2022
LightGBM [Ke et al., 2017] offers many of the same advantages as XGBoost, including sparse optimization, parallel tree building, a plethora of loss functions, enhanced regularization, bagging, histogram binning, and early stopping. A major difference between the two is that LightGBM defaults to building trees leaf-wise (or best-first). Unlike XGBoost, LightGBM can more naturally handle categorical features in a way similar to what's described in Section 2.4. In addition, the LightGBM algorithm utilizes two novel techniques, gradient-based one-side sampling (GOSS) and exclusive feature bundling (EFB).
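To make these defaults concrete, here is a minimal sketch using the `lightgbm` Python package (the book itself works in R). It touches the features listed above: leaf-wise growth capped by `num_leaves`, native categorical handling, built-in regularization, and early stopping. The synthetic data and the parameter values are illustrative assumptions, not recommendations.

```python
import lightgbm as lgb
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "x1": rng.normal(size=1000),
    "cat": pd.Categorical(rng.choice(["a", "b", "c"], size=1000)),
})
y = (X["x1"] + (X["cat"] == "b")).to_numpy() + rng.normal(scale=0.1, size=1000)

# LightGBM consumes categorical columns directly (no one-hot encoding needed).
train = lgb.Dataset(X[:800], label=y[:800], categorical_feature=["cat"])
valid = lgb.Dataset(X[800:], label=y[800:], reference=train)

params = {
    "objective": "regression",  # one of many supported losses
    "num_leaves": 31,           # caps the default leaf-wise (best-first) growth
    "lambda_l2": 1.0,           # built-in regularization
}
booster = lgb.train(params, train, num_boost_round=500,
                    valid_sets=[valid],
                    callbacks=[lgb.early_stopping(stopping_rounds=20)])
print(booster.best_iteration)
```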
Predicting pedestrian crash occurrence and injury severity in Texas using tree-based machine learning models
Published in Transportation Planning and Technology, 2023
Bo Zhao, Natalia Zuniga-Garcia, Lu Xing, Kara M. Kockelman
LightGBM is a popular gradient-boosting decision tree model. Compared with XGBoost, LightGBM incorporates gradient-based one-side sampling (GOSS) to improve computational efficiency (Ke et al. 2017). The basic assumption behind GOSS is that instances with larger gradients, i.e., under-trained instances, contribute more to the information gain. To retain the accuracy of the information-gain estimate, GOSS therefore keeps all instances with large gradients (e.g., above a pre-defined threshold, or in the top percentiles) and randomly samples only a fraction of the instances with small gradients, up-weighting the retained small-gradient instances to compensate for those dropped. Ke et al. (2017) showed that, at the same sampling rate, GOSS leads to a more accurate gain estimate than uniform random sampling, especially when the information gain varies over a large range.
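A hedged sketch of the GOSS sampling step described above, following Ke et al. (2017): keep the top a×100% of instances by gradient magnitude, randomly sample b×100% of the remainder, and up-weight the sampled small-gradient instances by (1 − a)/b so the gain estimate stays approximately unbiased. The function name `goss_sample` and the values of `a` and `b` are illustrative assumptions.

```python
import numpy as np

def goss_sample(gradients, a=0.2, b=0.1, rng=None):
    """Return indices and weights of the GOSS-sampled subset."""
    if rng is None:
        rng = np.random.default_rng()
    n = len(gradients)
    order = np.argsort(-np.abs(gradients))  # descending by |gradient|
    top_k = int(a * n)
    large = order[:top_k]                   # large-gradient instances: always kept
    rest = order[top_k:]
    sampled = rng.choice(rest, size=int(b * n), replace=False)
    idx = np.concatenate([large, sampled])
    weights = np.ones(len(idx))
    weights[top_k:] = (1.0 - a) / b         # compensate for the dropped mass
    return idx, weights

grads = np.random.default_rng(1).normal(size=10_000)
idx, w = goss_sample(grads, a=0.2, b=0.1)
print(len(idx), w[:3], w[-3:])  # 3000 kept; small-gradient weights are 8.0
```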
Ensemble Classifier for Stock Trading Recommendation
Published in Applied Artificial Intelligence, 2022
LightGBM is also a gradient-boosting, tree-based framework that integrates results from weak learners to enhance performance (Ke et al. 2017). Its main differences from the XGBoost model are Gradient-based One-Side Sampling (GOSS) and automated feature grouping with Exclusive Feature Bundling (EFB). LightGBM also uses a histogram algorithm to discretize continuous floating-point feature values into a fixed number of bins. The histogram algorithm needs no extra storage for pre-sorted results and thus greatly reduces memory consumption without sacrificing model accuracy. EFB reduces the optimal bundling of exclusive features to a graph coloring problem and solves it with a greedy algorithm that has a constant approximation ratio. Together, GOSS and EFB make LightGBM lighter and more efficient. In addition, LightGBM uses a leaf-wise tree-growth strategy: at each step it finds, among all current leaves, the leaf with the highest splitting gain, splits it, and repeats the cycle. It can therefore reduce the loss further and achieve better accuracy with the same number of splits. A maximum depth limit is set to prevent overfitting while maintaining high efficiency.
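To illustrate the leaf-wise (best-first) strategy and the depth cap described above, here is a toy sketch, not LightGBM's actual implementation: a priority queue always pops the leaf whose best split yields the largest gain. Variance reduction stands in for the gain here for simplicity; LightGBM itself scores splits from histogram bins.

```python
import heapq
import itertools
import numpy as np

def best_split(x, y):
    """Best threshold on a single feature by variance reduction."""
    best_gain, best_thr = 0.0, None
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    total = np.var(ys) * len(ys)
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue
        gain = total - np.var(ys[:i]) * i - np.var(ys[i:]) * (len(ys) - i)
        if gain > best_gain:
            best_gain, best_thr = gain, (xs[i - 1] + xs[i]) / 2
    return best_gain, best_thr

def leaf_wise_splits(x, y, num_leaves=4, max_depth=3):
    """Best-first growth: always split the current leaf with the highest gain."""
    tie = itertools.count()                 # tiebreaker so the heap never compares leaves
    heap, splits, leaves = [], [], 1
    gain, thr = best_split(x, y)
    heapq.heappush(heap, (-gain, next(tie), 0, thr, x, y))
    while heap and leaves < num_leaves:
        neg_gain, _, depth, thr, lx, ly = heapq.heappop(heap)
        if thr is None or depth >= max_depth:
            continue                        # leaf cannot split, or depth cap reached
        splits.append((round(thr, 3), round(-neg_gain, 3)))
        leaves += 1
        for mask in (lx <= thr, lx > thr):
            g, t = best_split(lx[mask], ly[mask])
            heapq.heappush(heap, (-g, next(tie), depth + 1, t, lx[mask], ly[mask]))
    return splits

rng = np.random.default_rng(0)
x = rng.uniform(size=200)
y = np.sin(6 * x) + rng.normal(scale=0.1, size=200)
print(leaf_wise_splits(x, y, num_leaves=4, max_depth=3))
```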
A hybrid model for high spatial and temporal resolution population distribution prediction
Published in International Journal of Digital Earth, 2022
Yuhang Zhang, Yi Zhang, Bo Huang, Xin Liu
LightGBM is a novel gradient-boosting decision tree algorithm suitable for classification problems. It uses an improved histogram algorithm that divides continuous values into k bins, and split points are selected from among these bins. In addition, LightGBM uses a leaf-wise growth strategy to improve accuracy and limits the growth depth. Gradient-based one-side sampling, introduced in LightGBM, reduces computation cost by concentrating on the data with large gradients, keeping all of them and sampling only a fraction of the small-gradient instances. Finally, the method uses exclusive feature bundling to bind many mutually exclusive features into a single feature, achieving dimensionality reduction. Thus, unlike other gradient-boosting decision tree methods, LightGBM accelerates the training process without losing accuracy and has been widely used in classification and regression problems. Several studies have demonstrated the superiority of LightGBM over other tree-based models such as Random Forest and XGBoost (Huang 2021; Ke et al. 2017; Wang, Zhang, and Zhao 2017).
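The bundling step of EFB can be sketched as a simple greedy pass in the spirit of Ke et al. (2017): assign each feature to the first bundle whose existing members it (almost) never co-occurs with on a nonzero value. The `max_conflicts` parameter and the density-first ordering below are simplifying assumptions, not LightGBM's exact routine.

```python
import numpy as np

def greedy_feature_bundles(X, max_conflicts=0):
    """Greedily group (nearly) mutually exclusive columns of X into bundles."""
    nonzero = X != 0                               # boolean nonzero mask
    bundles, bundle_masks = [], []
    # Visit denser features first (a common greedy ordering heuristic).
    for j in np.argsort(-nonzero.sum(axis=0)):
        placed = False
        for bundle, mask in zip(bundles, bundle_masks):
            # Count rows where feature j and the bundle are both nonzero.
            if np.count_nonzero(mask & nonzero[:, j]) <= max_conflicts:
                bundle.append(int(j))
                mask |= nonzero[:, j]              # bundle now covers j's rows
                placed = True
                break
        if not placed:
            bundles.append([int(j)])
            bundle_masks.append(nonzero[:, j].copy())
    return bundles

# One-hot columns of the same categorical never co-occur, so they bundle.
X = np.array([[1, 0, 0, 5],
              [0, 1, 0, 0],
              [0, 0, 1, 7]])
print(greedy_feature_bundles(X))   # [[3, 1], [0, 2]]
```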