Implementation
Published in Seyedeh Leili Mirtaheri, Reza Shahbazian, Machine Learning Theory to Applications, 2022
Seyedeh Leili Mirtaheri, Reza Shahbazian
XGBoost is an optimized distributed gradient boosting library designed for high efficiency, flexibility, and portability [116]. It is an open-source implementation of the gradient boosting decision tree algorithm, and it has grown popular recently because many winning teams in machine learning competitions have chosen it. XGBoost implements machine learning algorithms under the gradient boosting framework and offers parallel tree boosting (also known as GBDT or GBM), which solves many data science problems expeditiously. The same code runs in major distributed environments (MPI, SGE, Hadoop) and scales to very large problems. The term gradient boosting comes from the idea of improving, or boosting, a weak model by combining it with many other weak models in the hope of producing an overall strong model; XGBoost boosts weak learning models through iterative learning. It offers interfaces for Python, Java, C++, Julia, and R, runs on Linux, Windows, and macOS, supports GPUs, and also supports distributed processing frameworks such as Apache Hadoop, Spark, Flink, and Dataflow.
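As a quick illustration of the Python interface mentioned above, the following is a minimal sketch of training a booster on a synthetic dataset; the dataset and the hyperparameter values are illustrative assumptions, not taken from the chapter.

```python
# Minimal sketch of the XGBoost Python interface on a synthetic dataset.
# Dataset and hyperparameters are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
import xgboost as xgb

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# DMatrix is XGBoost's optimized internal data structure.
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

params = {"objective": "binary:logistic", "max_depth": 3, "eta": 0.1}
# Each boosting round adds one tree that corrects the current ensemble.
model = xgb.train(params, dtrain, num_boost_round=100)
preds = model.predict(dtest)  # probabilities for the positive class
```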
Employee Turnover Prediction Using Single Voting Model
Published in Pethuru Raj Chelliah, Usha Sakthivel, Nagarajan Susila, Applied Learning Algorithms for Intelligent IoT, 2021
R. Valarmathi, M. Umadevi, T. Sheela
The proposed work chooses the top 10 features using CART analysis. The imbalanced dataset is split 60%/40%, thereby reducing bias and improving the sensitivity and specificity of the ensemble model built [1]. The authors addressed the employee turnover problem with seven machine learning (ML) algorithms: SVM (RBF kernel), XGBoost, logistic regression, naive Bayes, depth-controlled random forest, LDA, and KNN. The algorithms were compared on measures such as AUC, running time, and memory utilization. The experimental results show that XGBoost outperforms all other algorithms in terms of accuracy and memory utilization; logistic regression took a minimum of 52 seconds to run the prediction algorithm [2]. The authors of [3] demonstrated the prediction of employee turnover on human resource datasets of varying size (small, medium, and large) and complexity with 10 different supervised classifiers; they provided guidelines for choosing among statistical methods and recommended gradient boosting for analyzing large datasets, as it takes less time for data preprocessing and ranks the features automatically and reliably [3]. The employee attrition dataset created by IBM Watson is used to predict employee turnover [4].
Predictive Analysis of Type 2 Diabetes Using Hybrid ML Model and IoT
Published in Sudhir Kumar Sharma, Bharat Bhushan, Narayan C. Debnath, IoT Security Paradigms and Applications, 2020
Abhishek Sharma, Nikhil Sharma, Ila Kaushik, Santosh Kumar, Naghma Khatoon
XGBoost is an ensemble machine learning algorithm that merges several machine learning techniques into one predictive model to achieve higher performance. It is a decision-tree-based machine learning algorithm that uses a gradient boosting framework, built from base learners with high bias and low variance [23]. The algorithm is highly scalable, learns quickly through parallel and distributed computing, and offers efficient memory usage. Each base learner learns from its predecessors and aims to reduce the errors of the previous tree. These base learners are weak learners: their bias is high, and their predictive power is only slightly better than random guessing. This high bias is reduced by adding decision trees sequentially. Each weak learner contributes some crucial information for making predictions, enabling the boosting method to produce a strong learner by productively combining all the weak learners [24]. The process in XGBoost is shown in Figure 14.1.
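To make the "each base learner learns from its predecessors" idea concrete, here is a bare-bones gradient-boosting loop for squared error using depth-1 trees. This is a didactic sketch of the general boosting principle under assumed toy data, not the actual XGBoost implementation.

```python
# Didactic gradient-boosting loop: shallow (high-bias) trees are fit
# to the residuals of the current ensemble, one after another.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
prediction = np.full_like(y, y.mean())  # start from a constant model
trees = []
for _ in range(100):
    residuals = y - prediction          # negative gradient of squared error
    stump = DecisionTreeRegressor(max_depth=1).fit(X, residuals)
    prediction += learning_rate * stump.predict(X)
    trees.append(stump)

# Each stump alone barely beats predicting the mean, but the weighted
# sum of all of them forms a strong learner.
print("final training MSE:", np.mean((y - prediction) ** 2))
```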
A novel explainable modeling method for cleaned coal quality evaluation in jigged fluidized bed
Published in Energy Sources, Part A: Recovery, Utilization, and Environmental Effects, 2023
Zhiping Wen, Yali Kuang, Hongyue Zi, Changchun Zhou, Guanghui Wang
XGBoost is an efficient, flexible, and portable algorithm that can solve classification and regression problems (Chen, He, and Benesty 2015). The XGBoost model uses a gradient boosting framework based on the ensemble learning method of decision trees (Zhang et al. 2018). Specifically, for a target dataset with multiple features, the predicted output of an XGBoost model can be written as Eq. (1): the final predicted value is the sum of the outputs of all prediction trees, where N denotes the number of trees in the model. Training the model of Eq. (1) requires minimizing the loss (L) and the regularization objective (Ω) given in Eq. (2) and Eq. (3); the Ω term effectively prevents the model from overfitting during training.
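The bodies of the referenced equations did not survive extraction. In the standard XGBoost formulation, which this description matches, they take the following form, where f_n is the n-th regression tree, l the training loss, T the number of leaves, and w the leaf weights; this notation is a standard-form reconstruction, not the authors' exact text:

```latex
% Standard-form reconstruction of Eqs. (1)-(3); an assumed notation,
% not the authors' original typesetting.
\begin{align}
\hat{y}_i &= \sum_{n=1}^{N} f_n(x_i)                              \tag{1}\\
\mathcal{L} &= \sum_{i} l\bigl(y_i, \hat{y}_i\bigr)
             + \sum_{n=1}^{N} \Omega(f_n)                          \tag{2}\\
\Omega(f) &= \gamma T + \tfrac{1}{2}\,\lambda \lVert w \rVert^{2}  \tag{3}
\end{align}
```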
Square Static – Deep Hyper Optimization and Genetic Meta-Learning Approach for Disease Classification
Published in IETE Journal of Research, 2023
P. Dhivya, P. Rajesh Kanna, K. Deepa, S. Santhiya
XGBoost is a machine-learning library that utilizes a gradient-boosting framework to construct a predictive model. The algorithm sequentially adds new weak learners to the ensemble, and each learner is trained to minimize the error of the previous ensemble. As a result, the model’s performance improves with each iteration. Some of the key features of XGBoost are its ability to handle missing values in input data, L1 and L2 regularization to prevent overfitting, support for parallel processing on multiple CPUs or GPUs, built-in support for k-fold cross-validation to help tune model hyperparameters, and the ability to provide insights into the importance of each input feature for predicting the target variable. The LightGBM algorithm uses a gradient-based approach, which involves computing the gradients of the loss function with respect to each parameter and updating the model in the direction of the negative gradient. One of the unique features of LightGBM is its use of a histogram-based approach for splitting data, which improves training efficiency while maintaining accuracy [18].
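The features listed above can be exercised directly through the library's Python API. The following hedged sketch, in which the dataset and parameter values are illustrative assumptions, shows native missing-value handling, L1/L2 regularization, built-in cross-validation, and feature-importance reporting.

```python
# Sketch of the XGBoost features listed above: native handling of
# missing values (np.nan), L1/L2 regularization (reg_alpha/reg_lambda),
# built-in k-fold cross-validation (xgb.cv), and feature importances.
# Dataset and parameter values are illustrative assumptions.
import numpy as np
import xgboost as xgb
from sklearn.datasets import load_breast_cancer

X, y = load_breast_cancer(return_X_y=True)
X[::50, 0] = np.nan  # inject missing values; XGBoost routes them natively

dtrain = xgb.DMatrix(X, label=y)
params = {
    "objective": "binary:logistic",
    "reg_alpha": 0.1,   # L1 penalty on leaf weights
    "reg_lambda": 1.0,  # L2 penalty on leaf weights
}

# Built-in k-fold cross-validation, e.g. for tuning num_boost_round.
cv_results = xgb.cv(params, dtrain, num_boost_round=50, nfold=5,
                    metrics="auc", seed=42)
print(cv_results["test-auc-mean"].iloc[-1])

booster = xgb.train(params, dtrain, num_boost_round=50)
print(booster.get_score(importance_type="gain"))  # feature importances
```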
Exploring the use of association rules in random forest for predicting heart disease
Published in Computer Methods in Biomechanics and Biomedical Engineering, 2023
Khalidou Abdoulaye Barry, Youness Manzali, Rachid Flouchi, Youssef Balouki, Khadija Chelhi, Mohamed Elfar
The library of gradient boosting algorithms known as eXtreme Gradient Boosting, or XGBoost, is tailored to contemporary data science tools and problems. Among its main advantages, XGBoost is extremely scalable and parallelizable, fast to execute, and often outperforms other algorithms. It also uses a more regularized model formalization to control overfitting, which improves performance (Sinha et al. 2020). The parameters used in this algorithm are as follows: Max_depth varies in the interval (1, 9); Min_child_weight takes the values [1, 2, 3]; Gamma takes the values [0, 0.1, 0.2, 0.3, 0.4, 0.5]. A search over these ranges is sketched below.
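A minimal sketch of how such a search over exactly these ranges could be run with scikit-learn's GridSearchCV; the article's heart disease data is not available here, so a bundled scikit-learn dataset stands in as an assumption, as does the scoring metric.

```python
# Hedged sketch of the hyperparameter search described above, over the
# stated ranges; dataset and scoring choice are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)

param_grid = {
    "max_depth": list(range(1, 10)),          # interval (1, 9)
    "min_child_weight": [1, 2, 3],
    "gamma": [0, 0.1, 0.2, 0.3, 0.4, 0.5],
}

search = GridSearchCV(XGBClassifier(eval_metric="logloss"),
                      param_grid, cv=5, scoring="roc_auc")
search.fit(X, y)
print(search.best_params_)
```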