Deep Learning for Information Retrieval
Published in Anuradha D. Thakare, Shilpa Laddha, Ambika Pawar, Hybrid Intelligent Systems for Information Retrieval, 2023
Anuradha D. Thakare, Shilpa Laddha, Ambika Pawar
LSTMs are an enhancement of RNNs. Just as prioritizing tasks is essential to completing them all on time, a network benefits from prioritizing information; an RNN does not do this. It does not keep track of which information is important; it simply adds new information and completely transforms the existing information. An LSTM, by contrast, prioritizes: it selectively remembers or forgets information depending on its importance, making only minor changes to the information through multiplications and additions. In LSTM networks, information flows in the form of cell states. An LSTM is an extension of the RNN with extended memory, which lets it learn from important experiences despite very long time lags and enables the network to store information for a long time. The LSTM performs operations such as reading, writing, and deleting information, and it assigns weights to information based on importance; these weights are also learned by the algorithm, so the LSTM learns over time which information matters. LSTM networks are made of memory blocks called cells; each cell passes its hidden state to the next cell. The memory blocks remember and manipulate the stored information using three gates. As shown in Figure 8.7, an LSTM has three gates, input, forget, and output, which control the flow of information: the input gate admits new information, the forget gate removes information from the network, and the output gate emits information at the current time step.
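The following is a minimal sketch, not part of the chapter, that illustrates the point about cells and hidden states being passed along the sequence. It uses PyTorch's nn.LSTMCell; the sizes and the toy input are arbitrary assumptions.

```python
import torch
import torch.nn as nn

# One LSTM cell unrolled over a toy sequence. The cell state c carries the
# long-term memory; the hidden state h is passed on to the next time step,
# and the three gates decide what is written, kept, and exposed at each step.
input_size, hidden_size, seq_len = 8, 16, 5
cell = nn.LSTMCell(input_size, hidden_size)

x = torch.randn(seq_len, 1, input_size)   # toy sequence, batch size 1 (assumed data)
h = torch.zeros(1, hidden_size)           # initial hidden state
c = torch.zeros(1, hidden_size)           # initial cell state (the "memory")

for t in range(seq_len):
    h, c = cell(x[t], (h, c))             # gates update c and h at every step

print(h.shape, c.shape)                   # torch.Size([1, 16]) torch.Size([1, 16])
```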
Energy Efficiency in Smart Street Lighting System for ITU
Published in Tugrul Daim, Marina Dabić, Yu-Shan Su, The Routledge Companion to Technology Management, 2023
Eren Deliaslan, M. Özgür Kayalica, Gülgün Kayakutlu
Artificial neural networks (ANN) successfully overcome this drawback of ARIMA, but have produced mixed results for purely linear time series. This means that neither ARIMA nor ANN alone is sufficient for modeling real-world time series that contain both linear and nonlinear correlation structures (Zhang 2003). An ANN is an information-processing system that contains many highly interconnected processing neurons. These neurons work together in a distributed manner to learn from the input information, to coordinate internal processing, and to optimize the final output (Jiang et al. 2010). An RNN is a type of ANN with recurrent connections to itself. RNNs have memory that records important features of the input, which is why they are preferred over other algorithms for sequential data such as time series, speech, text, financial data, audio, video, and weather (Fawaz et al. 2019). LSTM is a variant of the RNN, but unlike a plain RNN it has a long-term memory. The LSTM network was invented to address the vanishing-gradient problem: the key insight of the LSTM design is to incorporate nonlinear, data-dependent controls into the RNN cell, which ensures that the gradient of the objective function does not vanish during training (Sherstinsky 2020).
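As a minimal sketch of how an LSTM is typically applied to a univariate series of the kind discussed here (the synthetic series, window length, and layer sizes below are assumptions, not the chapter's model):

```python
import numpy as np
import tensorflow as tf

# Placeholder series standing in for, e.g., hourly energy consumption.
series = np.sin(np.linspace(0, 100, 2000)).astype("float32")
window = 24

# Sliding windows: predict the next value from the previous `window` values.
X = np.stack([series[i:i + window] for i in range(len(series) - window)])[..., None]
y = series[window:]

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(window, 1)),  # recurrent memory over the window
    tf.keras.layers.Dense(1),                            # one-step-ahead forecast
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=64, verbose=0)
```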
IoT and Deep Learning-Based Prophecy of COVID-19
Published in Anand Sharma, Sunil Kumar Jangir, Manish Kumar, Dilip Kumar Choubey, Tarun Shrivastava, S. Balamurugan, Industrial Internet of Things, 2022
LSTM consists of three gates: the forget gate, the input gate, and the output gate. These three gates determine the cell state (which acts as memory), the central part of the LSTM; information is added to or removed from the cell based on the gates. xt denotes the current input, Ct indicates the content of the latest cell state, and Ct−1 denotes the cell state of the previous LSTM unit. ht denotes the current output and ht−1 represents the previous LSTM unit's output. Wf, Wi, and Wo are the weights applied to the forget, input, and output gates, respectively, and bf, bi, and bo are the corresponding biases [35]. The unit uses two activation functions: (1) the sigmoid function (σ) and (2) the tanh function. The sigmoid activation, used in all three gates, keeps the output value in the range of 0 to 1; the tanh activation, used along the input and output paths, maps values into the range of −1 to 1. Using these functions, the network learns which parts of the data are important and which are not, so the gates play a vital role in deciding which information is kept and which is discarded.
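A minimal NumPy sketch of one LSTM step using the gates and notation above; note that the candidate (cell-input) weights Wc and bc are an assumption here, since the excerpt lists only the three gate parameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, Wf, Wi, Wo, Wc, bf, bi, bo, bc):
    """One LSTM step. Wc and bc (candidate weights) are assumed; the excerpt
    only names the forget-, input-, and output-gate parameters."""
    z = np.concatenate([h_prev, x_t])       # previous output and current input
    f_t = sigmoid(Wf @ z + bf)              # forget gate: what to discard (0..1)
    i_t = sigmoid(Wi @ z + bi)              # input gate: what new info to admit (0..1)
    C_tilde = np.tanh(Wc @ z + bc)          # candidate values (-1..1)
    C_t = f_t * C_prev + i_t * C_tilde      # updated cell state (the memory)
    o_t = sigmoid(Wo @ z + bo)              # output gate: what to expose (0..1)
    h_t = o_t * np.tanh(C_t)                # current output
    return h_t, C_t
```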
A CNN-LSTM–Based Model to Fault Diagnosis for CPR1000
Published in Nuclear Technology, 2023
Changan Ren, He Li, Jichong Lei, Jie Liu, Wei Li, Kekun Gao, Guocai Huang, Xiaohua Yang, Tao Yu
Each LSTM unit is made up of a memory cell and three primary gates: input, output, and forget. This structure allows the LSTM to control the flow of information by deciding which information to forget and which to remember, enabling it to learn long-term dependencies. The input gate i_t, along with the candidate activation g_t, manages the new information stored in the memory cell at time t. The forget gate f_t determines which past information is eliminated from or kept in the memory cell at time t, while the output gate o_t regulates the information that can be used for the output of the memory cell. To summarize, Eqs. (4) through (8) briefly describe the operations carried out by an LSTM unit:
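The excerpt does not reproduce Eqs. (4) through (8); for reference, the standard LSTM formulation they typically correspond to is as follows (a reconstruction using the symbols above, not the paper's exact notation or grouping):

```latex
\begin{align*}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) \\
g_t &= \tanh\left(W_g x_t + U_g h_{t-1} + b_g\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) \\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{align*}
```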
A long short-term memory model for forecasting the surgical case volumes at a hospital
Published in IISE Transactions on Healthcare Systems Engineering, 2023
Hieu Bui, Sandra D. Ekşioğlu, Adria A. Villafranca, Joseph A. Sanford, Kevin W. Sexton
We propose a long short-term memory (LSTM) model to predict the volume of surgical cases per week. LSTM is an artificial recurrent neural network (RNN) architecture capable of learning order dependence in sequential data. Successful implementations of LSTM require large amounts of data. Our data analysis shows that only a few procedures have the necessary data to develop, train, and test procedure-specific LSTMs (see Figure 8). For example, only 22 procedures have more than 800 records each; these procedures account for 29.5% of the observations. We develop an LSTM model for each of the 22 procedures for which we have sufficient data. The rest of the procedures are clustered together based on certain similarities, and an LSTM model is developed for each cluster. We create 26 clusters, which account for 16% of the observations. In the following sections, we provide details of the clustering algorithm and of the LSTM model.
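A minimal sketch of the split described above, using a hypothetical schema and column name (this is not the authors' code): procedures with more than 800 records get their own LSTM, and the remainder are pooled for cluster-level models.

```python
import pandas as pd

# Hypothetical data: one row per surgical case, with a `procedure` column.
cases = pd.DataFrame({
    "procedure": ["hernia_repair"] * 900 + ["appendectomy"] * 850 + ["rare_proc"] * 40
})

counts = cases["procedure"].value_counts()
own_model = counts[counts > 800].index.tolist()    # enough records for a procedure-specific LSTM
to_cluster = counts[counts <= 800].index.tolist()  # pooled into clusters, one LSTM per cluster

print(own_model)    # ['hernia_repair', 'appendectomy']
print(to_cluster)   # ['rare_proc']
```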
A recurrent neural network model for predicting two-leader car-following behavior
Published in Transportation Letters, 2023
Sanhita Das, Akhilesh Kumar Maurya, Arka Dey
LSTM and GRU are two variants of the RNN architecture with different internal mechanisms known as gates, which regulate the flow of information. An LSTM cell consists of an input gate, an output gate, a forget gate, and a cell state. Figure 3(a) shows the general structure of an LSTM cell, where x_t denotes the input vector, and c_t and h_t denote the memory cell state and output at time t, respectively. Similarly, the GRU has gated units with two gates, a reset gate and an update gate, denoted by r_t and z_t, respectively, in Figure 3(b). Both LSTM and GRU handle long-term dependencies efficiently; however, the GRU is slightly faster than the LSTM because it requires fewer tensor operations. The activation functions used inside the cell are the hyperbolic tangent and sigmoid functions, which map values into the ranges [−1, 1] and (0, 1), respectively (Ma and Qu 2020).
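For comparison with the LSTM step sketched earlier, here is a minimal NumPy sketch of a GRU update using the reset gate r_t and update gate z_t; the weight and bias names are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, Wr, Wz, Wh, br, bz, bh):
    """One GRU step: two gates and no separate cell state (weight names assumed)."""
    z_in = np.concatenate([h_prev, x_t])
    r_t = sigmoid(Wr @ z_in + br)                 # reset gate (0..1)
    z_t = sigmoid(Wz @ z_in + bz)                 # update gate (0..1)
    h_tilde = np.tanh(Wh @ np.concatenate([r_t * h_prev, x_t]) + bh)  # candidate state (-1..1)
    h_t = (1.0 - z_t) * h_prev + z_t * h_tilde    # blend previous and candidate states
    return h_t
```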