DAY 47-100 DAYS MLCODE: Recurrent Neural Network

December 28, 2018 | 100-Days-Of-ML-Code

In the previous blog, we discussed how to use a pre-trained CNN model, and in this blog we'll discuss the Recurrent Neural Network (RNN).

Recurrent Neural Network

Up to day 45, all the networks we discussed were feedforward, i.e. activations flow in only one direction, from the input layer to the output layer. A Recurrent Neural Network (RNN) is like a feedforward network except that it also has connections pointing backward. Below is an example of a basic RNN.

[Figure: a basic RNN]

As shown in the image above, at each time step t every recurrent neuron receives the input x(t) as well as its own output from the previous step, y(t−1). Each neuron therefore has two sets of weights: one for x(t) and one for y(t−1). The output of a single recurrent neuron for a single instance can then be written as

y(t) = φ( x(t)ᵀ · w_x + y(t−1)ᵀ · w_y + b )

where φ is the activation function, w_x and w_y are the two weight vectors, and b is the bias term.
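A minimal NumPy sketch of this computation (the variable names and sizes below are illustrative, not from the original post) could look like this:

import numpy as np

def recurrent_neuron_step(x_t, y_prev, w_x, w_y, b):
    # y(t) = phi(x(t)·w_x + y(t-1)·w_y + b), using tanh as the activation
    return np.tanh(np.dot(x_t, w_x) + np.dot(y_prev, w_y) + b)

# One instance with 3 input features and a single recurrent neuron
x_t = np.array([0.5, -0.1, 0.3])   # input at time step t
y_prev = np.array([0.2])           # output from time step t-1
w_x = np.random.randn(3, 1)        # weights for the current input
w_y = np.random.randn(1, 1)        # weights for the previous output
b = np.zeros(1)                    # bias term

y_t = recurrent_neuron_step(x_t, y_prev, w_x, w_y, b)
print(y_t)                         # single output value for this time step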

Memory Cells

Since the output y(t) at time step t depends on the outputs of the previous time steps, we can say that the network has a form of memory. A part of a neural network that preserves some state across time steps is called a memory cell (or simply a cell).

An RNN can take a sequence of inputs and produce a sequence of outputs. Such a sequence-to-sequence network is useful for predicting time series such as stock prices.

Encoder

An RNN can take a sequence of inputs and produce a single output, ignoring all the outputs except the last one. Such a sequence-to-vector network is called an encoder.

Decoder

An RNN can take a single vector as input and produce a sequence of outputs. Such a vector-to-sequence network is called a decoder.
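To make the difference between these topologies concrete, here is a small NumPy sketch (purely illustrative; the shapes and names are assumptions, not code from this post) that unrolls one recurrent layer over a sequence and shows what each variant keeps:

import numpy as np

n_steps, n_inputs, n_neurons = 5, 3, 4
X_seq = np.random.randn(n_steps, n_inputs)   # one input sequence
Wx = np.random.randn(n_inputs, n_neurons)
Wy = np.random.randn(n_neurons, n_neurons)
b = np.zeros(n_neurons)

# Sequence-to-sequence: keep the output at every time step
y = np.zeros(n_neurons)
outputs = []
for t in range(n_steps):
    y = np.tanh(X_seq[t] @ Wx + y @ Wy + b)
    outputs.append(y)
seq_to_seq = np.stack(outputs)   # shape (5, 4): one output per step

# Sequence-to-vector (encoder): keep only the last output
encoder_out = outputs[-1]        # shape (4,)

# Vector-to-sequence (decoder): feed a single vector at t = 0, zeros afterwards
z = np.random.randn(n_inputs)
y = np.zeros(n_neurons)
decoded = []
for t in range(n_steps):
    x_t = z if t == 0 else np.zeros(n_inputs)
    y = np.tanh(x_t @ Wx + y @ Wy + b)
    decoded.append(y)

print(seq_to_seq.shape, encoder_out.shape, len(decoded))   # (5, 4) (4,) 5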

Let's now create a simple RNN example using TensorFlow.

Load the Fashion-MNIST data:

import tensorflow as tf

(X_train, y_train), (X_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()

Verify the data before we start processing:

# Shapes of training set
print("Training set (images) shape: {shape}".format(shape=X_train.shape))
print("Training set (labels) shape: {shape}".format(shape=y_train.shape))

# Shapes of test set
print("Test set (images) shape: {shape}".format(shape=X_test.shape))
print("Test set (labels) shape: {shape}".format(shape=y_test.shape))

Output:
Training set (images) shape: (60000, 28, 28)
Training set (labels) shape: (60000,)
Test set (images) shape: (10000, 28, 28)
Test set (labels) shape: (10000,)

Construct the model. Each 28 × 28 image is treated as 28 time steps with 28 inputs per step, and we use 200 recurrent neurons and 10 output classes:

steps = 28          # one time step per image row
inputs = 28         # one input per pixel in a row
neurons = 200       # recurrent units
out_no = 10         # output classes
learning_rate = 0.001

tf.reset_default_graph()

X = tf.placeholder(tf.float32, [None, steps, inputs])
y = tf.placeholder(tf.int32, [None])

# Basic RNN cell, unrolled dynamically over the 28 time steps
basic_cell = tf.contrib.rnn.BasicRNNCell(num_units=neurons)
outputs, states = tf.nn.dynamic_rnn(basic_cell, X, dtype=tf.float32)

# Classify using the final state of the RNN
logits = tf.layers.dense(states, out_no)
xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y,
                                                          logits=logits)
loss = tf.reduce_mean(xentropy)
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(loss)

# Accuracy: fraction of instances whose top prediction matches the label
correct = tf.nn.in_top_k(logits, y, 1)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

init = tf.global_variables_initializer()
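As a quick sanity check (this snippet is not in the original post), we can print the static shapes of the two tensors returned by dynamic_rnn. outputs holds the activations at every time step, while states holds only the final state, which is what we pass to the dense output layer:

# Assumed sanity check, not in the original code
print(outputs.get_shape())   # (?, 28, 200) -- activation at every time step
print(states.get_shape())    # (?, 200)     -- final state, fed to the dense layer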

Now train the model:
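The loop below relies on a select_batch helper that is not shown in this post. A minimal sketch of such a generator (assuming simple random shuffling; the original implementation may differ) could be:

import numpy as np

def select_batch(X, y, batch_size):
    # Yield shuffled mini-batches of (images, labels) for one epoch
    idx = np.random.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch_idx = idx[start:start + batch_size]
        yield X[batch_idx], y[batch_idx]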

n_epochs = 50
batch_size = 50

with tf.Session() as sess:
    init.run()
    for epoch in range(n_epochs):
        for X_batch, y_batch in select_batch(X_train, y_train, batch_size):
            # Reshape each batch of 28 x 28 images into 28 steps of 28 inputs
            X_batch = X_batch.reshape((-1, steps, inputs))
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
        acc_train = accuracy.eval(feed_dict={X: X_batch, y: y_batch})
        acc_test = accuracy.eval(feed_dict={X: X_test, y: y_test})
        print(epoch, "Train accuracy:", acc_train, "Test accuracy:", acc_test)

Output:
0 Train accuracy: 0.58 Test accuracy: 0.6416
1 Train accuracy: 0.62 Test accuracy: 0.6564
2 Train accuracy: 0.6 Test accuracy: 0.6666
3 Train accuracy: 0.66 Test accuracy: 0.6647
4 Train accuracy: 0.54 Test accuracy: 0.6696
5 Train accuracy: 0.58 Test accuracy: 0.6737
6 Train accuracy: 0.82 Test accuracy: 0.6637
—————–
—————–

Our model was able to achieve an accuracy of about 87%.

This is a simple example of an RNN using TensorFlow. You can find the entire code here.