I was able to understand the basic idea behind RNNs after working through an example from the book by Antonio Gulli.

Data Download

The dataset comprises all the characters in the text of Alice's Adventures in Wonderland, available at Alice-Gutenberg. The basic idea of this exercise is to slice the text into 10-character input sequences and train a SimpleRNN to predict the character that follows each sequence.
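To make the windowing concrete, here is a minimal sketch on a toy string of my own, with SEQLEN shortened to 5 for readability (the exercise uses 10):

toy = "alice was beginning"
SEQLEN, STEP = 5, 1
pairs = [(toy[i:i + SEQLEN], toy[i + SEQLEN])
         for i in range(0, len(toy) - SEQLEN, STEP)]
# pairs[0] == ('alice', ' '), pairs[1] == ('lice ', 'w'), ...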

import urllib.request

url = "http://www.gutenberg.org/files/11/11-0.txt"
data = urllib.request.urlopen(url)
lines = []
for line in data:
    line = line.strip().lower()            # strip whitespace, normalize case
    line = line.decode('ascii', 'ignore')  # drop non-ASCII characters
    if len(line) > 0:
        lines.append(line)

text = " ".join(lines)                     # flatten into one long string
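A quick sanity check that the download worked (the exact character count depends on which revision of the Gutenberg file is served):

print(len(text))    # total characters in the flattened corpus
print(text[:60])    # the first few characters of the text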

Preprocessing Data - Building a Character Lookup

chars = set(c for c in text)   # distinct characters = the vocabulary
nb_chars = len(chars)

# Lookup tables between characters and integer indices
char2idx = {c: i for i, c in enumerate(chars)}
idx2char = {i: c for i, c in enumerate(chars)}

# Slide a SEQLEN-wide window over the text, one character at a time
SEQLEN = 10
STEP = 1
input_chars = []
label_chars = []
for i in range(0, len(text) - SEQLEN, STEP):
    input_chars.append(text[i:i + SEQLEN])
    label_chars.append(text[i + SEQLEN])

Preparing the Training Data

import numpy as np

X = np.zeros((len(input_chars), SEQLEN, nb_chars))
Y = np.zeros((len(input_chars), nb_chars))
for i, seq in enumerate(input_chars):
    for j, c in enumerate(seq):
        X[i, j, char2idx[c]] = 1           # one-hot encode each character
    Y[i, char2idx[label_chars[i]]] = 1     # one-hot encode the next character
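Because every character is one-hot encoded, no embedding layer is needed; the input is already a vector per timestep. A quick shape check:

print(X.shape)   # (number of sequences, SEQLEN, nb_chars)
print(Y.shape)   # (number of sequences, nb_chars)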

Defining the SimpleRNN Layer

from keras.models import Sequential
from keras.layers import Dense, SimpleRNN

HIDDEN_SIZE = 128   # size of the hidden state; the book's example uses 128

model = Sequential()
model.add(SimpleRNN(HIDDEN_SIZE, return_sequences=False,
                    input_shape=(SEQLEN, nb_chars),
                    unroll=True))
model.add(Dense(nb_chars, activation="softmax"))
model.compile(loss="categorical_crossentropy", optimizer="rmsprop")
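With return_sequences=False the SimpleRNN emits only its final hidden state, which the Dense softmax maps to a probability distribution over the nb_chars possible next characters (unroll=True unrolls the ten timesteps, trading memory for speed). model.summary() shows the parameter counts:

model.summary()
# SimpleRNN parameters: nb_chars*HIDDEN_SIZE (input) + HIDDEN_SIZE*HIDDEN_SIZE (recurrent) + HIDDEN_SIZE (bias)
# Dense parameters:     HIDDEN_SIZE*nb_chars + nb_chars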

Train the Model

def get_test_input(word):
    # One-hot encode a single SEQLEN-character seed string
    X = np.zeros((1, SEQLEN, nb_chars))
    for i, c in enumerate(word):
        X[0, i, char2idx[c]] = 1
    return X

# Hyperparameters as in the book's example
BATCH_SIZE = 128
NUM_EPOCHS_PER_ITERATION = 1
NUM_PREDS_PER_EPOCH = 100
NUM_ITERATIONS = 25

for iteration in range(NUM_ITERATIONS):
    model.fit(X, Y, batch_size=BATCH_SIZE,
              epochs=NUM_EPOCHS_PER_ITERATION, verbose=False)
    print("=" * 50)
    print(f"Iteration : {iteration}")
    # Pick a random seed sequence and generate characters from it
    test_idx = np.random.randint(len(input_chars))
    test_chars = input_chars[test_idx]
    gen_chars = test_chars
    for i in range(NUM_PREDS_PER_EPOCH):
        X_test = get_test_input(test_chars)
        output_char = idx2char[np.argmax(model.predict(X_test))]
        gen_chars += output_char
        test_chars = test_chars[1:] + output_char  # slide the window forward
    print(gen_chars)
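Note that np.argmax always picks the single most likely character, which can make the generated text fall into repetitive loops. A common variant (not part of the book's example) is to sample from the softmax output with a temperature; a minimal sketch:

def sample_char(preds, temperature=0.5):
    # Sharpen/flatten the softmax output, renormalize, then sample
    preds = np.log(np.asarray(preds, dtype="float64") + 1e-8) / temperature
    probs = np.exp(preds) / np.sum(np.exp(preds))
    return idx2char[np.random.choice(len(probs), p=probs)]

# Drop-in replacement for the argmax line above:
# output_char = sample_char(model.predict(X_test)[0])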

Even though the model code looks simple, there is a lot going on behind it.
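For instance, at every timestep the SimpleRNN mixes the current input with its previous hidden state and squashes the result through tanh. A minimal NumPy sketch of that forward pass (the weight names are mine, not Keras's internals):

def simple_rnn_forward(x_seq, W_x, W_h, b):
    # x_seq: (SEQLEN, nb_chars); W_x: (nb_chars, HIDDEN_SIZE)
    # W_h: (HIDDEN_SIZE, HIDDEN_SIZE); b: (HIDDEN_SIZE,)
    h = np.zeros(W_h.shape[0])
    for x_t in x_seq:                         # one step per character
        h = np.tanh(x_t @ W_x + h @ W_h + b)  # new state from input + old state
    return h                                  # final state (return_sequences=False)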