BIM-based machine learning engine for smart real estate appraisal
Published in Paulo Jorge da Silva Bartolo, Fernando Moreira da Silva, Shaden Jaradat, Helena Bartolo, Industry 4.0 – Shaping The Future of The Digital World, 2020
Gradient descent is an iterative optimization algorithm that can be used to minimize the cost function and find the best attribute weights for building the training model. The cost function measures how far the predicted outcome deviates from the sale price, and is calculated as follows: $\text{CostFunction} = \frac{1}{m}\sum_{i=1}^{m} \left(\text{ModelGuess}_i - \text{SalePrice}_i\right)^2$
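A minimal sketch of this idea, not the chapter's code: batch gradient descent on the squared-error cost above, with hypothetical NumPy arrays features (one row per property, one column per attribute) and sale_prices standing in for the appraisal data.

import numpy as np

def cost(weights, features, sale_prices):
    guesses = features @ weights                      # ModelGuess_i for each sample
    return np.sum((guesses - sale_prices) ** 2) / len(sale_prices)

def gradient_descent(features, sale_prices, learning_rate=0.01, steps=1000):
    weights = np.zeros(features.shape[1])             # one weight per attribute
    m = len(sale_prices)
    for _ in range(steps):
        errors = features @ weights - sale_prices
        grad = (2.0 / m) * (features.T @ errors)      # d(CostFunction)/d(weights)
        weights -= learning_rate * grad               # step against the gradient
    return weights

The learning rate and step count here are placeholders; in practice they would be tuned to the appraisal dataset.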
Visual Perception
Published in Laxmidhar Behera, Swagat Kumar, Prem Kumar Patchaikani, Ranjith Ravindranathan Nair, Samrat Dutta, Intelligent Control of Robotic Systems, 2020
Laxmidhar Behera, Swagat Kumar, Prem Kumar Patchaikani, Ranjith Ravindranathan Nair, Samrat Dutta
Backpropagation (BP) is an abbreviation for backward propagation of errors. The BP method calculates the gradient of a cost function with respect to all the weights associated with the network. These gradients with respect to the parameters (weights) are passed to the optimization method (usually gradient descent) to update the weights in an attempt to minimize the cost function. To map a feature vector $x \in \mathbb{R}^n$ to an output vector $y \in \mathbb{R}^m$, we need training examples $(x_i, y_i)$. Given m training examples, i.e., a set $\{(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})\}$, the ANN is trained using the batch gradient descent method, which computes the gradient using the whole dataset.
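The sketch below illustrates the combination described here, under assumptions not stated in the excerpt: a single hidden layer, sigmoid activations, a squared-error cost, and batch gradient descent that averages the backpropagated gradients over all m examples per update.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, Y, hidden=8, lr=0.1, epochs=500):
    m, n = X.shape                                    # m examples, n input features
    k = Y.shape[1]                                    # output dimension
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(0, 0.1, (n, hidden)), np.zeros(hidden)
    W2, b2 = rng.normal(0, 0.1, (hidden, k)), np.zeros(k)
    for _ in range(epochs):
        # forward pass over the whole batch
        H = sigmoid(X @ W1 + b1)
        out = sigmoid(H @ W2 + b2)
        # backward pass: gradients of the squared-error cost w.r.t. all weights
        d_out = (out - Y) * out * (1 - out)
        d_H = (d_out @ W2.T) * H * (1 - H)
        W2 -= lr * (H.T @ d_out) / m
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * (X.T @ d_H) / m
        b1 -= lr * d_H.mean(axis=0)
    return W1, b1, W2, b2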
Overview of Deep Learning Algorithms Applied to Medical Images
Published in Ayman El-Baz, Jasjit S. Suri, Big Data in Multimodal Medical Imaging, 2019
Behnaz Abdollahi, Ayman El-Baz, Hermann B. Frieboes
Gradient descent is an iterative optimization technique widely used to optimize machine learning cost functions: the parameters are updated repeatedly until they converge to their optimal values. Assuming a model with cost function J and two parameters w and b, the first two equations below compute the partial derivatives with respect to w and b, while the last two update the parameters: $dw = \partial J / \partial w$, $db = \partial J / \partial b$, $w = w - \text{learning rate} \cdot dw$, $b = b - \text{learning rate} \cdot db$.
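A direct transcription of these four equations into code, where dJ_dw and dJ_db are hypothetical callables that return the partial derivatives of the model's cost J at the current parameters:

def gradient_descent_step(w, b, dJ_dw, dJ_db, learning_rate):
    dw = dJ_dw(w, b)             # dw = dJ/dw
    db = dJ_db(w, b)             # db = dJ/db
    w = w - learning_rate * dw   # update w
    b = b - learning_rate * db   # update b
    return w, b

Repeating this step until w and b stop changing appreciably corresponds to the convergence described above.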
Handwritten MODI Character Recognition Using Transfer Learning with Discriminant Feature Analysis
Published in IETE Journal of Research, 2023
Savitri Chandure, Vandana Inamdar
Result optimization is achieved by hyperparameter tuning during training. The stochastic gradient descent algorithm uses a mini-batch-sized subset of the training set to update the parameters. The batch size selected here is 64 and the activation function is ReLU. The learning rate (LR) is a crucial factor for generalization of the network; it takes a positive value in the range 0–1. Experiments are carried out with increasing learning rates using the retrained network. Figure 3(a) shows the effect of the learning rate on network performance: a small learning rate leads to very slow convergence, while a larger rate results in an unstable network. Based on the performance curve, the learning rate selected is 0.001. Experimentation is continued by varying the number of epochs for the chosen learning rate and batch size. As shown in Figure 3(b), eight epochs give the best result, so the number of epochs selected is eight. Partitioning of the data samples into training and testing sets is also found to play a vital role; the best split found for the given dataset is 80% training and 20% testing.
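A sketch of the reported training setup (batch size 64, learning rate 0.001, eight epochs, 80/20 train/test split); model_update and the data arrays are hypothetical stand-ins for the retrained transfer-learning network and the MODI character samples described above.

import numpy as np

def train_sgd(X, y, model_update, batch_size=64, learning_rate=0.001, epochs=8):
    split = int(0.8 * len(X))                         # 80% training, 20% testing
    X_train, y_train = X[:split], y[:split]
    X_test, y_test = X[split:], y[split:]
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        order = rng.permutation(len(X_train))         # reshuffle every epoch
        for start in range(0, len(X_train), batch_size):
            idx = order[start:start + batch_size]     # one mini-batch of 64 samples
            model_update(X_train[idx], y_train[idx], learning_rate)
    return X_test, y_test                             # held out for evaluation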
Face mask recognition system using MobileNetV2 with optimization function
Published in Applied Artificial Intelligence, 2022
Extensive experiments were conducted on a dataset of images to evaluate the performance and effectiveness of the suggested models. Figure 6 shows the MobileNetV2 model's training and validation curves on the dataset. Figure 5 shows that, over 10 epochs, the training and validation accuracy achieved by MobileNetV2 are both 99%; hence, the MobileNetV2 model achieved equal training and validation accuracy on the dataset. We made use of the existing MobileNetV2 architecture from Keras, removing the ADD layer and replacing it with our own softmax layer. Our model is trained with gradient descent, an optimization algorithm used when training a machine learning model: it is based on a convex function and repeatedly adjusts the parameters to reduce the given function to a local minimum.
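A minimal sketch of this kind of setup, not the authors' exact configuration: the stock Keras MobileNetV2 is loaded without its top layers and a two-class softmax head is attached, with gradient descent (plain SGD) as the optimizer. The 224x224 input size, two classes (mask/no mask), and learning rate are assumptions.

import tensorflow as tf

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False                                # keep pretrained features fixed

x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
outputs = tf.keras.layers.Dense(2, activation="softmax")(x)  # replacement softmax head
model = tf.keras.Model(inputs=base.input, outputs=outputs)

model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
              loss="categorical_crossentropy", metrics=["accuracy"])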
Client profile prediction using convolutional neural networks for efficient recommendation systems in the context of smart factories
Published in Enterprise Information Systems, 2022
Nadia Nedjah, Victor Ribeiro Azevedo, Luiza De Macedo Mourelle
The gradient descent algorithm has three variants, whose usage depends on the amount of data used to compute the gradient of the cost function: batch gradient descent, mini-batch gradient descent, and stochastic gradient descent. During gradient descent, the batch indicates the number of samples from the dataset used to compute the gradient at each iteration. For batch gradient descent, the entire dataset is used as the batch before updating the weights; however, with a large number of training samples, this training algorithm can take a long time to converge. Mini-batch training performs the training with a reduced subset of the training set, updating the model parameters in each iteration. Stochastic gradient descent refers to a process wherein randomness is used: individual samples are selected at random at each iteration. This training algorithm is thus more efficient because the parameters start improving from the very first sample processed.
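The three variants can be contrasted in a few lines of code. In this sketch, gradient(w, X, y) is a hypothetical function that returns the cost gradient averaged over whatever samples it is given; only the amount of data fed to it per update differs between the variants.

import numpy as np

def batch_gd(w, X, y, gradient, lr):
    return w - lr * gradient(w, X, y)                 # whole dataset per update

def mini_batch_gd(w, X, y, gradient, lr, batch_size=32):
    idx = np.random.permutation(len(X))
    for start in range(0, len(X), batch_size):
        b = idx[start:start + batch_size]             # reduced subset per update
        w = w - lr * gradient(w, X[b], y[b])
    return w

def stochastic_gd(w, X, y, gradient, lr):
    for i in np.random.permutation(len(X)):           # one random sample per update
        w = w - lr * gradient(w, X[i:i + 1], y[i:i + 1])
    return w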