Expert Systems for Microgrids
Published in KTM Udayanga Hemapala, MK Perera, Smart Microgrid Systems, 2023
KTM Udayanga Hemapala, MK Perera
The parameters of a model that are not manually set by the programmer are estimated or learned from a given dataset. In contrast, hyperparameters are externally defined values that cannot be determined from the dataset; they are set in advance so that the model parameters can be estimated. In RL, there are several important hyperparameters, such as the learning rate, the decay rate, and the discount rate. Selecting the best combination of hyperparameters is called hyperparameter optimization or tuning, and it can be treated as a search problem in its own right, addressed with different tuning algorithms. The most common approach is the trial and error method, in which candidate values are chosen from the programmer's experience and tested one at a time. There are also more structured methods, such as grid search and random search.
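As an illustration, a minimal random-search sketch over these RL hyperparameters might look like the following; train_and_evaluate is a hypothetical stand-in for a full RL training run, and the value ranges are assumptions chosen for the example.

import random

# Hypothetical evaluation: would train an RL agent with the given
# hyperparameters and return its average episode reward.
def train_and_evaluate(learning_rate, decay_rate, discount_rate):
    # placeholder for an actual training loop; returns a stand-in score
    return random.random()

best_score, best_params = float("-inf"), None
for _ in range(20):  # 20 random trials
    params = {
        "learning_rate": 10 ** random.uniform(-4, -1),  # 1e-4 .. 1e-1
        "decay_rate": random.uniform(0.9, 0.999),
        "discount_rate": random.uniform(0.9, 0.999),
    }
    score = train_and_evaluate(**params)
    if score > best_score:
        best_score, best_params = score, params

print(best_params, best_score)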
Evolutionary Computing and Swarm Intelligence for Hyper Parameters Optimization Problem in Convolutional Neural Networks
Published in Ali Ahmadian, Soheil Salahshour, Soft Computing Approach for Mathematical Modeling of Engineering Problems, 2021
Senthil kumar Mohan, A John, Ananth kumar Tamilarasan
Optimizing or tuning hyperparameters is a standard problem in machine learning: choosing a suitable set of hyperparameters for a learning algorithm. A hyperparameter is a parameter whose value is used to control the learning process; by contrast, the values of other parameters (usually node weights) are learned from the data. The same kind of model can generalize to diverse data patterns under different constraints, weights, or learning rates. These settings are called hyperparameters, and they have to be chosen so that the model can solve the machine learning problem optimally (Goldberg 1989). Hyperparameter optimization seeks a tuple of hyperparameters that produces an optimal model, one that minimizes a predefined loss function on independent data. Machine learning models thus involve two distinct parameter types:
Hyperparameters: all the parameters that the user sets before training begins (e.g. the number of Random Forest estimators).
Model parameters: the parameters learned during model training.
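The distinction can be made concrete with the Random Forest example named above; the following is a minimal scikit-learn sketch, with a toy dataset assumed purely for illustration.

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, random_state=0)  # toy data

# Hyperparameters: set by the user before training begins.
clf = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=0)

# Model parameters: the fitted trees are learned during training.
clf.fit(X, y)
print(len(clf.estimators_))  # 100 fitted trees, one per estimator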
Pseudo Damage Training for Seismic Fracture Detection Machine
Published in Jian Zhang, Zhishen Wu, Mohammad Noori, Yong Li, Experimental Vibration Analysis for Civil Structures, 2020
Searching for hyperparameters is an iterative process constrained by computer configuration and time cost. The engineer wants the best model for the task given the available resources, so the training procedure is essentially a trade-off between different hyperparameters and time cost. In this research, the grid search method, which is simply an exhaustive search through a manually specified subset of the hyperparameter space, is used for hyperparameter optimization. Three hyperparameters need to be tuned for good performance on unseen data: the learning rate, batch size, and epoch count. Grid search selects a finite set of "reasonable" values for each hyperparameter:

$$\begin{cases} \text{Learning rate } (lr) \in \{10^{-3},\, 10^{-4},\, 10^{-5}\} \\ \text{Batch size } (bs) \in \{8,\, 16,\, 32\} \\ \text{Epoch } (ep) \in \{10,\, 20,\, 30\} \end{cases}$$
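A literal rendering of this grid in Python could look like the sketch below; train_model is a hypothetical stand-in for the training routine used in the study.

from itertools import product

# Hypothetical training routine: returns validation accuracy for
# one hyperparameter combination.
def train_model(lr, bs, ep):
    return 0.0  # placeholder for the actual training loop

grid = {
    "lr": [1e-3, 1e-4, 1e-5],
    "bs": [8, 16, 32],
    "ep": [10, 20, 30],
}

# Exhaustive search: 3 x 3 x 3 = 27 combinations in total.
results = {}
for lr, bs, ep in product(grid["lr"], grid["bs"], grid["ep"]):
    results[(lr, bs, ep)] = train_model(lr, bs, ep)

best = max(results, key=results.get)
print("best (lr, bs, ep):", best)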
Predicting pedestrian crash occurrence and injury severity in Texas using tree-based machine learning models
Published in Transportation Planning and Technology, 2023
Bo Zhao, Natalia Zuniga-Garcia, Lu Xing, Kara M. Kockelman
Hyperparameter optimization returns the hyperparameters with the best performance based on specific evaluation metrics. The optimization of hyperparameters can be represented in equation form as:

$$x^{*} = \arg\min_{x \in \mathcal{X}} f(x)$$

where $f(x)$ represents an objective function to minimize, such as RMSE for regression models or F1 score for classification models, obtained by training the ML model with hyperparameters $x$ and evaluating it on the validation set; $x^{*}$ is the set of hyperparameters that yields the lowest value of the score; and $x$ can take on any value in the domain $\mathcal{X}$. Bayesian hyperparameter optimization methods build a probability model of the objective function, i.e. a surrogate $p(y \mid x)$ for the score given the hyperparameters, by tracking past evaluation results and using them to select the most promising hyperparameters to evaluate on the true objective function (Klein et al. 2017). Specifically, the process of Bayesian hyperparameter tuning can be described as follows: (1) build a surrogate probability model of the objective function; (2) find the hyperparameters that perform best on the surrogate; (3) apply these hyperparameters to the true objective function; (4) update the surrogate model to incorporate the new results; and (5) repeat steps 2–4 until the maximum number of iterations or a specified time limit is reached.
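Steps 1–5 are what libraries such as scikit-optimize automate; the sketch below, using its gp_minimize with a Gaussian-process surrogate, is one possible rendering, with the search space, model, and scoring setup chosen here purely for illustration.

from skopt import gp_minimize
from skopt.space import Integer
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, random_state=0)  # toy data

space = [Integer(10, 200),  # n_estimators
         Integer(2, 10)]    # max_depth

def objective(params):
    n_estimators, max_depth = params
    model = RandomForestClassifier(n_estimators=n_estimators,
                                   max_depth=max_depth, random_state=0)
    # gp_minimize minimizes, so return the negative CV accuracy.
    return -cross_val_score(model, X, y, cv=3).mean()

# Builds a GP surrogate, picks promising points, evaluates the true
# objective, updates the surrogate, and repeats (steps 2-4 above).
res = gp_minimize(objective, space, n_calls=20, random_state=0)
print("best hyperparameters:", res.x, "best score:", -res.fun)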
Comparative Study of AutoML Approach, Conventional Ensemble Learning Method, and KNearest Oracle-AutoML Model for Predicting Student Dropouts in Sub-Saharan African Countries
Published in Applied Artificial Intelligence, 2022
Yuda N Mnyawami, Hellen H Maziku, Joseph C Mushi
Hyperparameter optimization can be achieved by grid search, random search, and Bayesian optimization (Yang and Shami 2020). Grid search suffers in high-dimensional spaces and is computationally expensive (Bergstra and Bengio 2012). Random search solves large-scale problems efficiently in a way that is impossible for grid search (Zabinsky 2011). On the other hand, random search does not employ a search strategy to forecast the next trial and does not use information from past trials to choose the next set of values (Tsiakmaki et al. 2020). Therefore, this study selected the Bayesian optimization method due to its superiority over random and grid search (Turner et al. 2021). When performing Bayesian optimization, a prior over the optimization function is established from previously observed data, and the posterior of the optimization function is updated using Bayes' theorem (Wu et al. 2019).
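The contrast between exhaustive and randomized search can be seen in scikit-learn's two built-in searchers; this sketch assumes a small Random Forest space and toy data purely for illustration.

from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

X, y = make_classification(n_samples=300, random_state=0)  # toy data

# Grid search: evaluates every combination (4 x 3 = 12 fits per fold).
grid = GridSearchCV(RandomForestClassifier(random_state=0),
                    {"n_estimators": [50, 100, 150, 200],
                     "max_depth": [3, 5, 10]}, cv=3).fit(X, y)

# Random search: samples a fixed budget of 8 combinations, with no
# information carried over from one trial to the next.
rand = RandomizedSearchCV(RandomForestClassifier(random_state=0),
                          {"n_estimators": randint(50, 200),
                           "max_depth": randint(3, 10)},
                          n_iter=8, cv=3, random_state=0).fit(X, y)

print(grid.best_params_, rand.best_params_)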
Effects of flooding on pavement performance: a machine learning-based network-level assessment
Published in Sustainable and Resilient Infrastructure, 2022
Moeid Shariatfar, Yong-Cheol Lee, Kunhee Choi, Minkyum Kim
The XGB model comprises several hyperparameters such as booster, gamma, max_depth, seed, etc. The approaches used to attain the optimal parameters that best fit the model with a minimized loss were the trial and error method and the Bayesian hyperparameter optimization method. The advantage of Bayesian optimization over methods such as grid search and random search, which search the hyperparameter space blindly, is that it uses the outcomes of previous iterations to decide the next hyperparameter values, resulting in higher efficiency in hyperparameter optimization (Brochu et al., 2010; Lavanya Gupta, 2020). For the Bayesian optimization, the study adopted the HYPEROPT Python library, which slightly improved the models applied to datasets 1 and 2 and increased their accuracy (Bergstra et al., 2013). The hyperparameter values suggested by the Bayesian optimization method did not improve the model's performance on datasets 3 and 4, so those parameters were optimized by the trial and error method instead. This may be due to the increase in features that came from adding one more year of historical data. Additional historical data also increases the possibility of inconsistency in the dataset, which can be another reason for the decrease in the Bayesian method's performance for hyperparameter optimization.
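A minimal HYPEROPT sketch in the spirit of this setup might look as follows; the search space, toy data, and evaluation routine here are assumptions for illustration, not the study's actual configuration.

from hyperopt import fmin, tpe, hp, Trials
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

X, y = make_classification(n_samples=300, random_state=0)  # toy data

# Assumed search space over a few XGB hyperparameters.
space = {
    "max_depth": hp.quniform("max_depth", 3, 10, 1),
    "gamma": hp.uniform("gamma", 0.0, 5.0),
    "learning_rate": hp.loguniform("learning_rate", -5, 0),
}

def objective(params):
    model = XGBClassifier(max_depth=int(params["max_depth"]),
                          gamma=params["gamma"],
                          learning_rate=params["learning_rate"],
                          n_estimators=100)
    # hyperopt minimizes, so return the negative CV accuracy.
    return -cross_val_score(model, X, y, cv=3).mean()

# TPE uses past trial outcomes to propose the next candidate values.
trials = Trials()
best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=25, trials=trials)
print("suggested hyperparameters:", best)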