Expert Systems for Microgrids
Published in KTM Udayanga Hemapala, MK Perera, Smart Microgrid Systems, 2023
KTM Udayanga Hemapala, MK Perera
The parameters of a model that are not manually set by the programmer are estimated, or learned, from a given dataset. In contrast, hyperparameters are externally defined values that cannot be determined from the dataset; they are set in advance to guide the estimation of the model parameters. In RL there are several important hyperparameters, such as the learning rate, the decay rate, and the discount rate. Selecting the best combination of hyperparameters is called hyperparameter optimization, or tuning, and it can itself be treated as a separate optimization problem addressed with different tuning algorithms. The most common approach is the trial-and-error method, in which these parameters are tested using values assigned from the programmer's experience. There are also more structured methods, such as grid search and random search.
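To make the structured methods concrete, grid search and random search can be sketched in a few lines. The objective `validation_error`, its optimum, and the value ranges below are hypothetical stand-ins for actually training and evaluating an RL agent:

```python
import itertools
import random

def validation_error(lr, gamma):
    # hypothetical stand-in for training an RL agent with learning rate lr
    # and discount rate gamma, then measuring error; best near (0.05, 0.9)
    return (lr - 0.05) ** 2 + (gamma - 0.9) ** 2

# grid search: exhaustively evaluate every combination on a fixed grid
lrs = [0.001, 0.01, 0.05, 0.1]
gammas = [0.8, 0.9, 0.99]
best_grid = min(itertools.product(lrs, gammas),
                key=lambda p: validation_error(*p))

# random search: sample combinations uniformly from the same ranges
random.seed(0)
samples = [(random.uniform(0.001, 0.1), random.uniform(0.8, 0.99))
           for _ in range(12)]
best_random = min(samples, key=lambda p: validation_error(*p))
```

Random search often matches grid search with fewer evaluations when only a subset of the hyperparameters strongly affects the score, since it does not waste trials repeating values along unimportant axes.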
Evolutionary Computing and Swarm Intelligence for Hyper Parameters Optimization Problem in Convolutional Neural Networks
Published in Ali Ahmadian, Soheil Salahshour, Soft Computing Approach for Mathematical Modeling of Engineering Problems, 2021
Senthil kumar Mohan, A John, Ananth kumar Tamilarasan
Hyperparameter optimization, or tuning, is the problem in machine learning of choosing a suitable set of hyperparameters for a learning algorithm. A hyperparameter is a parameter whose value controls the learning process; by comparison, the values of the other parameters (usually node weights) are learned from data. The same kind of model can generalize to diverse data patterns under different constraints, weights, or learning speeds. These settings are the hyperparameters, and they must be chosen so that the model solves the machine learning problem optimally (Goldberg 1989). Hyperparameter optimization seeks a tuple of hyperparameters that produces an optimized model minimizing a predefined loss function on independent data. Machine learning models thus consist of two distinct parameter types: hyperparameters, which the user sets before training begins (e.g., the number of estimators in a random forest), and model parameters, which are learned during model training.
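The two parameter types can be illustrated with a minimal sketch; the toy data, the learning rate, and the epoch count here are illustrative assumptions, not values from the chapter:

```python
# toy data generated from y = 2x + 1
data = [(x, 2 * x + 1) for x in range(10)]

# hyperparameters: set by the user before training begins
learning_rate = 0.01
epochs = 2000

# model parameters: learned from the data during training
w, b = 0.0, 0.0
n = len(data)
for _ in range(epochs):
    # gradients of the mean squared error with respect to w and b
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
# w and b approach 2 and 1, recovered from the data alone
```

Changing `learning_rate` or `epochs` changes how (and how well) `w` and `b` are found, but they are never themselves estimated from the dataset.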
Deep Learning in Brain Segmentation
Published in Saravanan Krishnan, Ramesh Kesavan, B. Surendiran, G. S. Mahalakshmi, Handbook of Artificial Intelligence in Biomedical Engineering, 2021
The hyperparameters in a neural network are those parameters that are not updated in the backward pass and parameter-update steps. Since hyperparameters are not learned during training, they must be adjusted manually by the developers. Take the learning rate, for example: it controls how much the parameters are adjusted at each parameter-update step. Balancing the learning rate is important, because too low a learning rate leads to slow convergence, while too high a learning rate leads to unstable training. Hence, experimenting with sensible choices of hyperparameters is crucial to the performance of the model and requires a solid understanding of CNNs. Other hyperparameters include the number of layers, the layer arrangement, the stride size, the kernel size, the padding size, etc.
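The learning-rate trade-off described here can be seen on a one-parameter toy problem. This is only a sketch: the quadratic objective and the three rates below are illustrative choices, not values from the chapter:

```python
def gradient_descent(lr, steps=50, w0=0.0):
    # minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)
    w = w0
    for _ in range(steps):
        w -= lr * 2 * (w - 3)
    return w

low = gradient_descent(0.01)       # still far from the minimum: slow convergence
balanced = gradient_descent(0.1)   # converges close to the minimum w = 3
high = gradient_descent(1.1)       # overshoots each step and diverges: unstable
```

Each update scales the current error by (1 − 2·lr), so a rate above 1.0 flips the sign and grows the error every step, which is the instability the text warns about.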
Multi-task Gaussian process upper confidence bound for hyperparameter tuning and its application for simulation studies of additive manufacturing
Published in IISE Transactions, 2023
Bo Shen, Raghav Gnanasambandam, Rongxuan Wang, Zhenyu James Kong
In contrast to the single-task BO introduced above, Multi-Task Bayesian Optimization (MTBO) (Swersky et al., 2013) is a general method for efficiently optimizing multiple different but correlated “black-box” functions. Settings suited to multi-task Bayesian optimization arise in many real-world applications. For example, K-fold cross-validation (Bengio and Grandvalet, 2004) is a widely used technique for estimating the generalization error of a machine learning model under a given set of hyperparameters, but it requires retraining the model K times, once for each training-validation split. The validation errors of a model trained on the K different training-validation splits can be treated as K “black-box” functions, which need to be minimized as K different tasks. These K tasks are highly correlated, since the data are randomly partitioned among the K training-validation splits. The performance of our proposed method in the application of fast cross-validation (see Swersky et al. (2013); Moss et al. (2020)) is presented in Section 5.1, which aims at minimizing the average validation error in K-fold cross-validation.
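The K per-fold validation errors described above can be sketched as follows. This is a minimal illustration with a toy least-squares model; the data, the model, and K = 5 are assumptions for the sketch, not details from the paper:

```python
import random
random.seed(0)

# toy regression data: y = 3x plus Gaussian noise
data = [(i / 20, 3 * (i / 20) + random.gauss(0, 0.1)) for i in range(40)]

def fit_slope(train):
    # least-squares slope through the origin
    return sum(x * y for x, y in train) / sum(x * x for x, y in train)

def kfold_errors(data, K):
    # returns the K per-fold validation errors: the K correlated
    # "black-box" values that MTBO treats as K different tasks
    idx = list(range(len(data)))
    random.shuffle(idx)
    folds = [[data[i] for i in idx[k::K]] for k in range(K)]
    errors = []
    for k in range(K):
        train = [d for j, fold in enumerate(folds) if j != k for d in fold]
        w = fit_slope(train)
        val = folds[k]
        errors.append(sum((w * x - y) ** 2 for x, y in val) / len(val))
    return errors

errors = kfold_errors(data, K=5)
average_error = sum(errors) / len(errors)  # the quantity minimized overall
```

Because every fold's training set shares most of its data with the others, the K errors move together as the hyperparameters change, which is exactly the correlation MTBO exploits.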
Improving multi-scale attention networks: Bayesian optimization for segmenting medical images
Published in The Imaging Science Journal, 2023
The overall loss function of the proposed model encompasses unknown hyperparameters associated with the losses at the different scales. Simple methods for hyperparameter tuning include grid- and random-search-based algorithms. We employ here a more sophisticated BO approach to determine the optimal configuration of the loss-function hyperparameters. Such a methodology is preferred in settings involving an expensive objective function, as in the present instance. BO [8] uses a Gaussian process as a surrogate model to approximate the costly objective function (see, e.g. [42]), and optimizes an acquisition function, defined from the posterior mean and variance of the surrogate, to identify the next input location for evaluation. The process typically involves making a trade-off between exploration and exploitation [8].
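The loop described here can be sketched end to end on a one-dimensional toy problem. Everything below is an illustrative assumption, not the paper's setup: an RBF kernel with a fixed lengthscale, a confidence-bound acquisition function, and a cheap quadratic standing in for the expensive loss:

```python
import math

def solve(A, b):
    # Gaussian elimination with partial pivoting for the small GP systems
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def rbf(a, b, ls=0.15):
    return math.exp(-((a - b) ** 2) / (2 * ls ** 2))

def gp_posterior(X, y, grid, noise=1e-4):
    # posterior mean and standard deviation of the GP surrogate on the grid
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    alpha = solve(K, y)
    mu, sd = [], []
    for g in grid:
        ks = [rbf(g, a) for a in X]
        v = solve(K, ks)
        mu.append(sum(k * a for k, a in zip(ks, alpha)))
        sd.append(math.sqrt(max(rbf(g, g) - sum(k * vi for k, vi in zip(ks, v)), 0.0)))
    return mu, sd

def expensive_loss(h):
    # hypothetical stand-in for training the network at hyperparameter h
    return (h - 0.6) ** 2

grid = [i / 50 for i in range(51)]
X, y = [0.0, 1.0], [expensive_loss(0.0), expensive_loss(1.0)]
for _ in range(8):
    mu, sd = gp_posterior(X, y, grid)
    # acquisition: lower confidence bound, trading off exploitation
    # (low posterior mean) against exploration (high posterior variance)
    cand = [i for i in range(len(grid)) if grid[i] not in X]
    nxt = grid[min(cand, key=lambda i: mu[i] - 2.0 * sd[i])]
    X.append(nxt)
    y.append(expensive_loss(nxt))

best = X[min(range(len(X)), key=lambda i: y[i])]
```

After ten evaluations of the "expensive" function the surrogate has concentrated its low-confidence-bound region near the true minimum at 0.6, which is the economy BO offers over grid or random search when each evaluation is costly.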
Effects of flooding on pavement performance: a machine learning-based network-level assessment
Published in Sustainable and Resilient Infrastructure, 2022
Moeid Shariatfar, Yong-Cheol Lee, Kunhee Choi, Minkyum Kim
Hyperparameters are variables that determine the structure of the machine learning algorithm and can have a significant effect on the model’s performance (Radhakrishnan, 2017). In general, the hyperparameter search is conducted manually, through rules of thumb or by testing different sets of hyperparameters (Claesen & De Moor, 2015). This study used different optimization methods for each model. For example, for the KNN model, K = 7 (the number of neighbors) was obtained as the optimal value through manual trial. The optimization for the other models was conducted either through Bayesian optimization or by trial and error. Table 4 shows some of the parameters of each algorithm, their ranges, and the selected values for dataset 3 (the other datasets were optimized with approximately similar values). The hyperparameter optimization for the XGB model is described later in more detail.
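The manual trial over K can be sketched as follows. This is a toy one-dimensional illustration; the training points, validation points, and candidate K values are hypothetical, not the study's dataset:

```python
def knn_predict(train, x, k):
    # majority vote among the k nearest neighbours (1-D feature for brevity)
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

train = [(0.1, "a"), (0.2, "a"), (0.3, "a"),
         (0.7, "b"), (0.8, "b"), (0.9, "b")]
validation = [(0.25, "a"), (0.4, "a"), (0.6, "b"), (0.75, "b")]

# manual trial: evaluate each candidate K on a held-out validation set
accuracy = {}
for k in (1, 3, 5):
    hits = sum(knn_predict(train, x, k) == y for x, y in validation)
    accuracy[k] = hits / len(validation)
best_k = max(accuracy, key=accuracy.get)
```

On a real dataset the accuracies would differ across K, and the loop above (or a Bayesian-optimization wrapper around it) would pick the best-performing value, such as the K = 7 reported in the study.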