High-Performance Computing and Its Requirements in Deep Learning
Published in Sanjay Saxena, Sudip Paul, High-Performance Medical Image Processing, 2022
Biswajit Jena, Gopal Krishna Nayak, Sanjay Saxena
Finally, after examining the backpropagation algorithm, the treatment of simple neural networks is complete. This basic idea is extended in convolutional neural networks, which introduce specialized layers such as convolutional and pooling layers; CNNs are widely used in computer vision applications and have even changed the field of NLP. Moving on, recurrent neural networks are widely used for natural language processing applications such as machine translation, speech recognition, time-series processing, etc., where advanced variants like the LSTM, an alternative form of the RNN, perform much better than plain RNNs on these applications. The next subsection gives a brief overview of reinforcement learning, the final topic discussed in this section.
Framework for Video Summarization Using CNN-LSTM Approach in IoT Surveillance Networks
Published in Monika Mangla, Ashok Kumar, Vaishali Mehta, Megha Bhushan, Sachi Nandan Mohanty, Real-Life Applications of the Internet of Things, 2022
Chaitrali Chaudhari, Satish Devane
RNNs are used to process variable-length input sequences and encode the extracted sequential features; they are used to uncover hidden sequential patterns. A video consists of many frames that carry the visual information through which the context of the video is conveyed. RNNs can interpret such sequences, but when trained with backpropagation through time they suffer from vanishing and exploding gradients, problems that LSTMs were designed to overcome. The LSTM is a special kind of RNN architecture with a gating mechanism capable of learning long-term dependencies and preserving sequence information over time. It was created as a solution to short-term memory and can learn long-term contextual information from temporal sequences. A common LSTM architecture consists of a memory cell and three gates, an input gate, an output gate and a forget gate, each controlled by a sigmoid unit. This special structure supports the identification of long-term sequence patterns.
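To make the gating mechanism concrete, below is a minimal sketch of a single LSTM step in NumPy; the weight names and toy dimensions are illustrative, not taken from the chapter.

```python
import numpy as np

# A minimal sketch of one LSTM cell step, mirroring the description above:
# three sigmoid-controlled gates plus a cell memory carried over time.
# Weight names (W, U, b) and sizes are illustrative assumptions.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One time step; W, U, b hold weights/biases of the four units
    (input gate i, forget gate f, output gate o, candidate memory g)."""
    i = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + b["i"])   # input gate
    f = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + b["f"])   # forget gate
    o = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + b["o"])   # output gate
    g = np.tanh(W["g"] @ x_t + U["g"] @ h_prev + b["g"])   # candidate cell content
    c_t = f * c_prev + i * g        # forget part of the old memory, write new content
    h_t = o * np.tanh(c_t)          # expose a gated view of the memory as the output
    return h_t, c_t

# Toy usage: 3-dimensional frame features, 5-dimensional hidden/cell state.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 5
W = {k: rng.normal(scale=0.1, size=(n_hid, n_in)) for k in "ifog"}
U = {k: rng.normal(scale=0.1, size=(n_hid, n_hid)) for k in "ifog"}
b = {k: np.zeros(n_hid) for k in "ifog"}

h, c = np.zeros(n_hid), np.zeros(n_hid)
for x_t in rng.normal(size=(10, n_in)):      # a short sequence of frame features
    h, c = lstm_step(x_t, h, c, W, U, b)
```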
Deep Learning
Published in Peter Wlodarczak, Machine Learning and its Applications, 2019
A recurrent neural network is trained using backpropagation and gradient descent, just like a feedforward network. However, because of the loops, the backpropagation mechanism does not work in the same way as for a feedforward network. In a feedforward network, the algorithm moves backwards from the final error through the layers and adjusts the weights up or down, whichever reduces the error. To train a recurrent network, an extension of backpropagation called backpropagation through time, or BPTT, is used. Backpropagation through time is a gradient-based method for training recurrent networks. In backpropagation through time, time is defined by the ordered series of calculations moving from one time step to the next. In essence, the structure of the recurrent neural network is unfolded. A copy of the neurons that contain loops is created and the cyclic graphs of the recurrent neural network are transformed into acyclic graphs, turning the recurrent neural network into a feedforward network. Every copy shares the same parameters. A recurrent network and the unfolded network are shown in Figure 8.6.
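A minimal sketch of BPTT for a vanilla RNN is shown below, assuming tanh hidden units and a squared-error loss on the final output only; the variable names and sizes are illustrative. Note how every unfolded copy accumulates its gradient contribution into the same shared weight matrices.

```python
import numpy as np

# Backpropagation through time (BPTT) for a vanilla RNN: unfold the loop
# into T copies that share the same weights, run forward, then walk the
# unfolded graph backwards from the last time step to the first.

rng = np.random.default_rng(0)
T, n_in, n_hid, n_out = 5, 3, 4, 2                # sequence length and layer sizes

Wxh = rng.normal(scale=0.1, size=(n_hid, n_in))   # input -> hidden
Whh = rng.normal(scale=0.1, size=(n_hid, n_hid))  # hidden -> hidden (the loop)
Why = rng.normal(scale=0.1, size=(n_out, n_hid))  # hidden -> output
bh, by = np.zeros(n_hid), np.zeros(n_out)

xs = rng.normal(size=(T, n_in))                   # input sequence
target = rng.normal(size=n_out)                   # target for the final time step

# --- forward pass: unfold the recurrence over T time steps ---
hs = {-1: np.zeros(n_hid)}
for t in range(T):
    hs[t] = np.tanh(Wxh @ xs[t] + Whh @ hs[t - 1] + bh)
y = Why @ hs[T - 1] + by
loss = 0.5 * np.sum((y - target) ** 2)

# --- backward pass: move backwards through the unfolded graph ---
dWxh, dWhh, dWhy = np.zeros_like(Wxh), np.zeros_like(Whh), np.zeros_like(Why)
dbh, dby = np.zeros_like(bh), np.zeros_like(by)

dy = y - target
dWhy += np.outer(dy, hs[T - 1])
dby += dy
dh = Why.T @ dy                                   # gradient entering the last hidden state
for t in reversed(range(T)):
    draw = (1.0 - hs[t] ** 2) * dh                # backprop through tanh
    dWxh += np.outer(draw, xs[t])                 # every copy adds to the SAME shared weights
    dWhh += np.outer(draw, hs[t - 1])
    dbh += draw
    dh = Whh.T @ draw                             # pass the gradient one time step further back
```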
Semantic Role Labeling Based on Valence Structure and Deep Neural Network
Published in IETE Journal of Research, 2023
Traditional recurrent neural networks have the problem of vanishing or exploding gradients, which makes it difficult to model long-distance dependencies. The long short-term memory network [6] aims to alleviate this problem. The LSTM unit consists of a memory cell, an input gate, a forget gate and an output gate. The memory cell carries the memory content of the LSTM unit, and the gates control how much the memory content is changed and exposed. Let $x_t$ represent the input vector at time t, $h_{t-1}$ represent the hidden state output of the LSTM unit at time t-1, and $c_{t-1}$ represent the cell state at time t-1. The workflow of the LSTM at time t can be expressed as shown in Equations (7) to (12), where $i_t$, $f_t$, $o_t$ and $c_t$ represent the input gate, forget gate, output gate and cell state respectively; $\sigma$ represents the sigmoid function; ⊙ represents the element-wise product; $W$ is the weight matrix of $x_t$; $U$ is the weight matrix of $h_{t-1}$; and $b$ is the bias vector.
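For reference, a standard formulation of the LSTM update consistent with this description is given below; the exact appearance of Equations (7) to (12) in the paper may differ, and the symbols $W$, $U$ and $b$ are generic per-gate parameters rather than copied from the source.

```latex
\begin{align*}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{align*}
```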
A deep learning approach for integrated production planning and predictive maintenance
Published in International Journal of Production Research, 2023
Hassan Dehghan Shoorkand, Mustapha Nourelfath, Adnène Hajji
LSTM is in fact a specific recurrent neural network architecture, especially designed to overcome the exploding and vanishing gradient problems in classical recurrent neural networks, which typically occur while learning long-term dependencies, even in the case of very long minimal time lags (Hochreiter and Schmidhuber 1996). Instead of neurons, LSTM networks have memory blocks that are connected through layers. A block has components that make it smarter than a classical neuron and a memory for recent sequences. A block contains gates that manage the block's state and output. A block operates on an input sequence, and each gate within a block uses a sigmoid activation unit to control whether it is triggered, making the change of state and the addition of information flowing through the block conditional. The cell remembers values over arbitrary time intervals, and the three gates regulate the flow of information into and out of the cell. LSTM networks are well suited to classifying, processing and making predictions based on time series. It is well established that LSTM has a great capacity to retain memory and learn from data sequences (Aydin and Guldamlasioglu 2017).
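As an illustration of this kind of time-series use, the following is a minimal sketch, assuming TensorFlow/Keras is available, of an LSTM trained for one-step-ahead prediction on a univariate series; the window length, layer sizes and synthetic data are placeholders rather than the authors' configuration.

```python
import numpy as np
import tensorflow as tf

window = 24                                   # past steps used to predict the next value
series = np.sin(np.linspace(0, 100, 2000))    # stand-in for a real demand/sensor series

# Build (samples, time steps, features) windows and next-step targets.
X = np.stack([series[i:i + window] for i in range(len(series) - window)])[..., None]
y = series[window:]

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(window, 1)),
    tf.keras.layers.LSTM(32),                 # memory blocks with input/forget/output gates
    tf.keras.layers.Dense(1),                 # next-value regression head
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=64, verbose=0)

next_value = model.predict(X[-1:], verbose=0) # forecast one step beyond the last window
```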
Modeling the infiltration rate of wastewater infiltration basins considering water quality parameters using different artificial neural network techniques
Published in Engineering Applications of Computational Fluid Mechanics, 2022
Ghada Abdalrahman, Sai Hin Lai, Pavitra Kumar, Ali Najah Ahmed, Mohsen Sherif, Ahmed Sefelnasr, Kwok Wing Chau, Ahmed Elshafie
A recurrent neural network (RNN) learns its model from sequential or time-series training data, but unlike a feedforward network it has a memory: previous inputs influence the current input and output, and neurons in the same layer are interconnected, allowing feedback. The RNN relies on its hidden state, which retains information computed from previous elements of the sequence, so the same input can produce different outputs at different points in a series. It has the advantage of reducing complexity relative to other neural networks, because it turns independent activations into dependent ones by applying the same weights and biases at every step to produce the output. In addition, an RNN can accept more than one input vector and emit more than one output vector. On the other hand, RNNs have limitations when propagating through some activation functions over long sequences, and they are somewhat more challenging to train and run than other neural networks.
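The parameter-sharing point can be checked directly: because the same weights and biases are reused at every time step, the number of trainable parameters does not grow with the sequence length. Below is a small sketch assuming Keras; the layer sizes and sequence lengths are illustrative.

```python
import tensorflow as tf

# The SimpleRNN layer applies one shared set of weights at every time step,
# so the trainable-parameter count is identical for any sequence length.
for timesteps in (10, 100, 1000):
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(timesteps, 8)),   # (time steps, features)
        tf.keras.layers.SimpleRNN(16),
        tf.keras.layers.Dense(1),
    ])
    print(timesteps, model.count_params())             # same count each time
```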