This post builds a Stacked LSTM and uses it to learn a simple pattern in a sequence.

A few points about the Stacked LSTM:

  • It was introduced by Graves et al. in the context of speech recognition
  • The key API detail to remember is to set return_sequences=True on every LSTM layer except the last
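To see why return_sequences matters, here is a toy numpy sketch (not Keras internals, and a plain RNN cell rather than a real LSTM, purely to illustrate the shape bookkeeping): a recurrent layer that feeds another recurrent layer must hand over its hidden state at every timestep, not just the final one.

```python
import numpy as np

def toy_rnn(x, units, return_sequences, seed=0):
    """Toy recurrent layer (simple tanh RNN, not an LSTM) that only
    illustrates the output shapes controlled by return_sequences."""
    rng = np.random.default_rng(seed)
    n_steps, n_features = x.shape
    Wx = rng.normal(size=(n_features, units))  # input weights
    Wh = rng.normal(size=(units, units))       # recurrent weights
    h = np.zeros(units)
    outputs = []
    for t in range(n_steps):
        h = np.tanh(x[t] @ Wx + h @ Wh)
        outputs.append(h)
    # return_sequences=True -> one hidden vector per timestep;
    # return_sequences=False -> only the final hidden state
    return np.stack(outputs) if return_sequences else h

x = np.random.rand(50, 1)                            # one sequence: 50 steps, 1 feature
h_all = toy_rnn(x, 20, return_sequences=True)        # shape (50, 20)
h_last = toy_rnn(h_all, 20, return_sequences=False)  # shape (20,)
print(h_all.shape, h_last.shape)
```

The first layer returns the full (50, 20) sequence of states so the second layer has a timestep-by-timestep input to consume; the second layer returns only its last state, which the Dense layer maps to the forecast.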

Let us look at a damped sinusoidal time series of length 50 and predict the next 5 steps of the sequence.

Data Preparation

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from math import pi
from keras.models import Sequential
from keras.layers import LSTM, Dense

def create_seq(length, period, decay):
    return [0.5 + 0.5 * np.sin(2 * pi * i / period) * np.exp(-decay * i)
            for i in range(length)]

n_samples = 1000
n_patterns = 5
n_steps = 50
n_output = 5

np.random.seed(1234)
periods = np.random.randint(10, 20, 5)
decays = np.random.uniform(0.01, 0.1, 5)
periods = np.repeat(periods, 200).reshape(n_samples, 1)
decays = np.repeat(decays, 200).reshape(n_samples, 1)

meta_parameters = pd.DataFrame(np.c_[periods, decays])
meta_parameters.columns = ['p', 'd']
data = meta_parameters.apply(lambda x: np.array(create_seq(n_steps + n_output, x['p'], x['d'])), axis=1)

X = np.zeros((n_samples, n_steps + n_output))
for i in range(len(X)):
    X[i, :] = data.iloc[i]
X_train = X[:, :n_steps]
Y_train = X[:, n_steps:]
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
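As a quick sanity check, here is a standalone sketch that reproduces the preparation above end to end and confirms the resulting array shapes (the values lie in [0, 1] because the sine is scaled and shifted, then damped):

```python
import numpy as np
import pandas as pd
from math import pi

n_samples, n_steps, n_output = 1000, 50, 5

def create_seq(length, period, decay):
    return [0.5 + 0.5 * np.sin(2 * pi * i / period) * np.exp(-decay * i)
            for i in range(length)]

np.random.seed(1234)
# 5 distinct (period, decay) patterns, each repeated 200 times -> 1000 samples
periods = np.repeat(np.random.randint(10, 20, 5), 200).reshape(n_samples, 1)
decays = np.repeat(np.random.uniform(0.01, 0.1, 5), 200).reshape(n_samples, 1)

meta = pd.DataFrame(np.c_[periods, decays], columns=['p', 'd'])
data = meta.apply(lambda x: np.array(create_seq(n_steps + n_output, x['p'], x['d'])), axis=1)

X = np.stack(data.values)
X_train = X[:, :n_steps].reshape(n_samples, n_steps, 1)  # (samples, timesteps, features)
Y_train = X[:, n_steps:]                                 # next 5 values as targets

print(X_train.shape, Y_train.shape)  # (1000, 50, 1) (1000, 5)
```

Keras expects LSTM input as a 3-D tensor of (samples, timesteps, features), which is why the final reshape adds the trailing feature dimension of 1.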

Creating a Stacked LSTM Model

model = Sequential()
model.add(LSTM(20, return_sequences=True, input_shape=(n_steps,1)))
model.add(LSTM(20))
model.add(Dense(n_output))
model.compile(loss='mae', optimizer='adam')
model.summary()
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_47 (LSTM)               (None, 50, 20)            1760      
_________________________________________________________________
lstm_48 (LSTM)               (None, 20)                3280      
_________________________________________________________________
dense_19 (Dense)             (None, 5)                 105       
=================================================================
Total params: 5,145
Trainable params: 5,145
Non-trainable params: 0
_________________________________________________________________
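The parameter counts in the summary can be verified by hand: an LSTM layer has four gates, each with an input weight matrix, a recurrent weight matrix, and a bias vector.

```python
def lstm_params(n_input, n_units):
    # 4 gates, each with: input weights (n_input x n_units),
    # recurrent weights (n_units x n_units), and biases (n_units)
    return 4 * (n_input * n_units + n_units * n_units + n_units)

def dense_params(n_input, n_units):
    return n_input * n_units + n_units  # weights + biases

print(lstm_params(1, 20))   # first LSTM layer:  1760
print(lstm_params(20, 20))  # second LSTM layer: 3280
print(dense_params(20, 5))  # Dense output:      105
```

Note that the second LSTM sees 20 input features (the 20 hidden units of the first layer), which is why it has roughly twice the parameters of the first despite having the same number of units.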

Train the model

model.fit(X_train,Y_train, epochs=20, validation_split=0.2)

A Stacked LSTM relates to a single LSTM as a deep neural network relates to a perceptron: adding stacked layers increases the levels of abstraction the network can learn from the data.