Psychiatric Chatbot for COVID-19 Using Machine Learning Approaches
Published in Roshani Raut, Salah-ddine Krit, Prasenjit Chatterjee, Machine Vision for Industry 4.0, 2022
Priyanka Jain, Subhash Tatale, Nivedita Bhirud, Sanket Sonje, Apurva Kirdatt, Mihir Gune, N. K. Jain
A general Seq2Seq model is a neural network that converts a given input sequence into another sequence (Sutskever et al., 2014). The transformer adds an attention mechanism to this encoder-decoder architecture. In the Seq2Seq transformer model, the attention mechanism examines the input sequence and decides which other parts of the sequence are important; it considers all positions of the input at the same time and assigns them weights. The encoder of the transformer cleans the input text and then encodes the sequence, padding it to a fixed length. This encoded sequence is the input to the decoder. The decoder learns the word sequence produced by the encoder, assigns weights according to importance, and passes the result to a linear layer. This linear layer learns the embedding of each word with respect to its importance in the sentence. At inference time, the transformer takes the input query, again cleans and encodes it, and then predicts the reply one word at a time, conditioning each prediction on all the words generated so far, until the final output is produced.
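The sketch below illustrates this setup in PyTorch; it is not the chapter's actual implementation, and the vocabulary size, model dimensions, padding scheme, and greedy decoding loop are illustrative assumptions. The encoder attends over the padded input query, and the decoder predicts the reply one token at a time, each step conditioned on all previously generated tokens, with a final linear layer over the vocabulary.

```python
# Minimal Seq2Seq transformer sketch (illustrative sizes and special tokens).
import torch
import torch.nn as nn

VOCAB, D_MODEL, MAX_LEN, PAD, BOS, EOS = 1000, 128, 20, 0, 1, 2

class Seq2SeqTransformer(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_MODEL, padding_idx=PAD)
        self.pos = nn.Parameter(torch.zeros(MAX_LEN, D_MODEL))  # learned positions
        self.transformer = nn.Transformer(d_model=D_MODEL, nhead=4,
                                           num_encoder_layers=2,
                                           num_decoder_layers=2,
                                           batch_first=True)
        self.out = nn.Linear(D_MODEL, VOCAB)  # linear layer over the vocabulary

    def forward(self, src, tgt):
        src_e = self.embed(src) + self.pos[: src.size(1)]
        tgt_e = self.embed(tgt) + self.pos[: tgt.size(1)]
        causal = self.transformer.generate_square_subsequent_mask(tgt.size(1))
        h = self.transformer(src_e, tgt_e, tgt_mask=causal,
                             src_key_padding_mask=(src == PAD))
        return self.out(h)

def greedy_reply(model, query_ids):
    """Predict the reply one token at a time, conditioned on all prior tokens."""
    reply = torch.tensor([[BOS]])
    for _ in range(MAX_LEN - 1):
        logits = model(query_ids, reply)
        next_id = logits[:, -1].argmax(-1, keepdim=True)
        reply = torch.cat([reply, next_id], dim=1)
        if next_id.item() == EOS:
            break
    return reply

# Input query padded to a fixed length, as described above.
query = torch.tensor([[5, 17, 42] + [PAD] * (MAX_LEN - 3)])
print(greedy_reply(Seq2SeqTransformer().eval(), query))
```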
Transformers: Basics and Introduction
Published in Uday Kamath, Kenneth L. Graham, Wael Emara, Transformers for Machine Learning, 2022
Uday Kamath, Kenneth L. Graham, Wael Emara
Machine translation and many other NLP problems produced promising results using sequence-to-sequence architecture (abbreviated as seq2seq) with a recurrent neural network (e.g., RNN) based encoder and decoder [51, 238]. Fig. 2.2 shows an example of machine translation where a French sentence, J'aime le thé., and it's English equivalent, I love tea. form the sentence pair as an example of input for the training. The tokens <eos> and <bos> are special ways to handle the end and beginning of the sentence, respectively. As shown in Fig. 2.2 the final hidden state of the encoder acts as the input for the decoder.
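A minimal sketch of this RNN-based setup is shown below, using made-up toy vocabularies for the sentence pair above: the encoder consumes the French tokens, and its final hidden state initializes the decoder, which starts from <bos> and is trained to produce the English tokens up to <eos>. GRU cells and the hidden size are illustrative choices, not the book's configuration.

```python
# RNN encoder-decoder sketch: the encoder's final hidden state feeds the decoder.
import torch
import torch.nn as nn

src_vocab = {"<pad>": 0, "j'aime": 1, "le": 2, "thé": 3, ".": 4, "<eos>": 5}
tgt_vocab = {"<pad>": 0, "<bos>": 1, "i": 2, "love": 3, "tea": 4, ".": 5, "<eos>": 6}
HID = 32

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(len(src_vocab), HID)
        self.rnn = nn.GRU(HID, HID, batch_first=True)

    def forward(self, src):
        _, h_final = self.rnn(self.embed(src))
        return h_final                      # final hidden state summarizes the source

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(len(tgt_vocab), HID)
        self.rnn = nn.GRU(HID, HID, batch_first=True)
        self.out = nn.Linear(HID, len(tgt_vocab))

    def forward(self, tgt_in, h0):
        out, _ = self.rnn(self.embed(tgt_in), h0)   # decoder starts from encoder state
        return self.out(out)

src = torch.tensor([[1, 2, 3, 4, 5]])       # j'aime le thé . <eos>
tgt_in = torch.tensor([[1, 2, 3, 4, 5]])    # <bos> i love tea .
tgt_out = torch.tensor([[2, 3, 4, 5, 6]])   # i love tea . <eos>

enc, dec = Encoder(), Decoder()
logits = dec(tgt_in, enc(src))
loss = nn.CrossEntropyLoss()(logits.view(-1, len(tgt_vocab)), tgt_out.view(-1))
print(loss.item())
```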
Time series forecasting to jointly model bridge responses
Published in Hiroshi Yokota, Dan M. Frangopol, Bridge Maintenance, Safety, Management, Life-Cycle Sustainability and Innovations, 2021
O. Bahrami, R. Hou, W. Wang, J.P. Lynch
The backbone of a Seq2Seq model is a recurrent neural network (RNN). An RNN can be viewed as a unit cell that is unrolled over a set number of time steps. At each time step, the cell takes an input x_i and the hidden state from the previous step h_{i−1}, and returns the current hidden state h_i (Figure 4). The network aims to capture sequential information in the data presented to it through the use of hidden states from previous time steps. The main computation of the RNN is the unit cell Φ, where Φ_E denotes the cell for the encoder and Φ_D the cell for the decoder. The RNN architecture is used to model an encoder whose design is optimized to output a low-dimensional context vector from the input time series x[i]; the decoder feeds this context vector to another RNN that outputs another time series y[i]. In this study, x and y correspond to the responses of the two bridges to the same truck.
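A small NumPy sketch of this unit-cell view is given below: a cell Φ is rolled over the input series x[i] to produce a context vector, which the decoder cell then unrolls into the output series y[i]. The dimensions, the tanh cell form, and the random weights are illustrative assumptions, not the paper's trained model.

```python
# Encoder-decoder RNN from unit cells: h_i = tanh(W x_i + U h_{i-1}).
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out, T = 1, 8, 1, 50

# Unit cells Phi_E (encoder) and Phi_D (decoder) with random illustrative weights.
W_E, U_E = rng.normal(size=(n_hid, n_in)), rng.normal(size=(n_hid, n_hid))
W_D, U_D = rng.normal(size=(n_hid, n_out)), rng.normal(size=(n_hid, n_hid))
V_D = rng.normal(size=(n_out, n_hid))        # maps decoder state to output y_i

def phi(W, U, x, h_prev):
    return np.tanh(W @ x + U @ h_prev)

def encode(x_series):
    h = np.zeros(n_hid)
    for x in x_series:                       # roll the cell over the input series
        h = phi(W_E, U_E, x, h)
    return h                                 # low-dimensional context vector

def decode(context, steps):
    h, y, outputs = context, np.zeros(n_out), []
    for _ in range(steps):                   # unroll the decoder cell forward
        h = phi(W_D, U_D, y, h)
        y = V_D @ h
        outputs.append(y)
    return np.array(outputs)

x = rng.normal(size=(T, n_in))               # stand-in for the first bridge's response
y_hat = decode(encode(x), steps=T)           # predicted response of the second bridge
print(y_hat.shape)
```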
Neural network-based parametric system identification: a review
Published in International Journal of Systems Science, 2023
Aoxiang Dong, Andrew Starr, Yifan Zhao
Based on NN-GC, in (Wang et al., 2018), the RNN-GC framework was developed where both linear and nonlinear Granger causalities are derived by calculating the prediction error of LSTM with and without a certain input variable. In comparison to NN-GC (Montalto et al., 2015), which uses MLP as the prediction model, RNN is less prone to overfitting due to its weight sharing over time, which makes the parameter dimension irrelevant to the length of the input time series. This makes RNN-GC less sensitive to the model order, resulting in a higher accuracy of causality detection. Furthermore, since the parameter dimension is irrelevant to the length of the input time series, one fixed RNN structure is able to fit time series with different maximum time delays, which makes it more flexible than MLP (Wang et al., 2018). A similar framework has been implemented in (Rosoł et al., 2022), except that the Wilcoxon signed-rank test is adopted to assess the significance of causality. In (Li et al., 2020), causality is expanded to include the effect of future values on present values, which is detected using a bidirectional LSTM. Recently, Liu et al. (2021) proposed the seq2seq-LSTM Granger Causality (SLGC) framework, where the LSTM-based encoder-decoder model is adopted for prediction. The seq2seq model can be adopted when the length of the input and output sequence is not fixed and equal in training. The nonlinear Granger causality is calculated by comparing the fraction of variance explained by the forecasting model with and without a certain input variable, measured by the R-squared score. However, these methods are relatively more complicated than the following fully end-to-end method.
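The core test in these RNN-GC/SLGC-style frameworks can be sketched as below: fit a forecasting model for the target once with all candidate drivers and once with one driver removed, and compare the fraction of variance explained (R-squared). A small one-step LSTM forecaster stands in here for the paper's seq2seq encoder-decoder, and the synthetic data, lag, and training settings are assumptions for illustration only.

```python
# Granger-causality strength as the drop in R^2 when a candidate driver is removed.
import numpy as np
import torch
import torch.nn as nn
from sklearn.metrics import r2_score

def make_windows(inputs, target, lag=5):
    X = np.stack([inputs[t - lag:t] for t in range(lag, len(target))])
    y = target[lag:len(target)]
    return torch.tensor(X, dtype=torch.float32), torch.tensor(y, dtype=torch.float32)

class Forecaster(nn.Module):
    def __init__(self, n_in):
        super().__init__()
        self.lstm = nn.LSTM(n_in, 16, batch_first=True)
        self.head = nn.Linear(16, 1)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.head(out[:, -1]).squeeze(-1)

def fit_r2(X, y, epochs=200):
    model = Forecaster(X.shape[-1])
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.MSELoss()(model(X), y)
        loss.backward()
        opt.step()
    return r2_score(y.numpy(), model(X).detach().numpy())

# Toy system: x1 drives y with a lag of 2, x2 is irrelevant noise.
rng = np.random.default_rng(1)
x1, x2 = rng.normal(size=500), rng.normal(size=500)
y = 0.8 * np.roll(x1, 2) + 0.1 * rng.normal(size=500)

full = np.column_stack([x1, x2, y])
restricted = np.column_stack([x2, y])            # drop the candidate driver x1
Xf, yf = make_windows(full, y)
Xr, yr = make_windows(restricted, y)
gc_strength = fit_r2(Xf, yf) - fit_r2(Xr, yr)    # drop in R^2 when x1 is removed
print(f"Granger-causality strength of x1 -> y: {gc_strength:.3f}")
```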
SATP-GAN: self-attention based generative adversarial network for traffic flow prediction
Published in Transportmetrica B: Transport Dynamics, 2021
Liang Zhang, Jianqing Wu, Jun Shen, Ming Chen, Rui Wang, Xinliang Zhou, Cankun Xu, Quankai Yao, Qiang Wu
In general, an LSTM is not used alone for time-series prediction when the goal is to improve prediction accuracy. Inspired by the success of machine translation (Cho et al. 2014), the Seq2Seq model from natural language processing (NLP) has shown great potential. More specifically, a standard Seq2Seq model consists of two key components: an encoder, which maps the source input x to a vector representation, and a decoder, which generates an output sequence based on the source input and the encoder's representation. Both the encoder and the decoder are LSTMs, so as to capture different compositional patterns.
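A brief sketch of this standard layout, assuming a traffic-flow setting, is shown below: an LSTM encoder maps the past readings to a vector representation (its final hidden and cell states), and an LSTM decoder generates the next-horizon sequence from that representation. The layer sizes, horizon, and autoregressive decoding are illustrative choices, not the SATP-GAN configuration.

```python
# LSTM encoder-decoder for multi-step time-series forecasting (illustrative sizes).
import torch
import torch.nn as nn

class LSTMSeq2Seq(nn.Module):
    def __init__(self, hid=64, horizon=12):
        super().__init__()
        self.horizon = horizon
        self.encoder = nn.LSTM(1, hid, batch_first=True)
        self.decoder = nn.LSTM(1, hid, batch_first=True)
        self.head = nn.Linear(hid, 1)

    def forward(self, x):
        # Encoder: source sequence -> vector representation (final h and c states).
        _, state = self.encoder(x)
        # Decoder: generate the output sequence step by step from that representation.
        step, outputs = x[:, -1:, :], []
        for _ in range(self.horizon):
            out, state = self.decoder(step, state)
            step = self.head(out)
            outputs.append(step)
        return torch.cat(outputs, dim=1)

past = torch.randn(8, 48, 1)       # 8 road segments, 48 past flow readings each
print(LSTMSeq2Seq()(past).shape)   # -> torch.Size([8, 12, 1])
```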
Trajectory adjustment for nonprehensile manipulation using latent space of trained sequence-to-sequence model*
Published in Advanced Robotics, 2019
K. Kutsuzawa, S. Sakaino, T. Tsuji
Sequence-to-sequence (seq2seq) models [24,25] are neural networks used for time-series conversion. Seq2seq models are used in a wide variety of applications such as natural language translation, text summarization [29], and forming associations between language commands and robot motions [30]. A seq2seq model consists of two recurrent neural networks (RNNs), as illustrated in Figure 1. The input-side RNN is called the encoder, while the output-side RNN is called the decoder. Seq2seq models impose almost no restrictions on the structure of the time series they handle.