Part 4: Recurrent Neural Networks (RNN)
This article mainly discusses the Recurrent Neural Network (RNN), one of the main parts of supervised learning with neural networks, which works with time series data.
(1) What are Recurrent Neural Networks?
- A recurrent neural network (RNN) is a type of artificial neural network which uses sequential data or time series data.
- These deep learning algorithms are commonly used for ordinal or temporal problems.
- Derived from feedforward networks, RNNs are loosely inspired by how human brains function.
(2) What are the applications based on Recurrent Neural Networks?
Recurrent Neural Networks mainly work with sequential and time series data. Common applications include:
- Speech recognition
- Language modelling and prediction
- Image recognition and characterization
- Machine translation and natural language processing (NLP)
- Stock market analysis, voice-based systems, and chatbots
- Music generation and sentiment classification
(3) What are the types of Recurrent Neural Network?
- Like feedforward and convolutional neural networks (CNNs), recurrent neural networks utilize training data to learn.
- They are distinguished by their “memory” as they take information from prior inputs to influence the current input and output.
- Different types of RNNs are used for different use cases, depending on how many inputs and outputs the network handles.
- The main types are one to one, one to many, many to one, and many to many.
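To make two of these types concrete, here is a minimal sketch in Python with NumPy. It assumes a tiny tanh RNN; the function name `run_rnn`, the weight names, and the sizes are all illustrative choices, not something taken from a particular library. A many to one network returns a single output for the whole sequence (for example, sentiment classification), while a many to many network returns one output per time step.

```python
import numpy as np

def run_rnn(xs, W_xh, W_hh, W_hy, mode):
    """Run a tiny tanh RNN over xs (shape: steps x input_dim).

    mode="many_to_one" returns one output for the whole sequence;
    mode="many_to_many" returns one output per time step.
    """
    h = np.zeros(W_hh.shape[0])           # hidden state starts at zero
    outputs = []
    for x in xs:
        h = np.tanh(W_xh @ x + W_hh @ h)  # update the hidden state
        outputs.append(W_hy @ h)          # output at this time step
    if mode == "many_to_one":
        return outputs[-1]
    if mode == "many_to_many":
        return np.stack(outputs)
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical sizes: 5 time steps, 3 input features, 4 hidden units, 2 outputs.
rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 3))
W_xh = rng.normal(size=(4, 3)) * 0.1
W_hh = rng.normal(size=(4, 4)) * 0.1
W_hy = rng.normal(size=(2, 4)) * 0.1

print(run_rnn(xs, W_xh, W_hh, W_hy, "many_to_one").shape)   # (2,)
print(run_rnn(xs, W_xh, W_hh, W_hy, "many_to_many").shape)  # (5, 2)
```

The one to one and one to many cases are not shown here. One to many would start from a single input and typically feed each output back in as the next input.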
(4) How does a Recurrent Neural Network work?
- In a plain feedforward network, each hidden layer has its own weights and biases, so the activations of the layers are independent of one another. An RNN converts these independent activations into dependent ones by using the same weights and biases at every step and by giving each step's output as input to the next hidden step. This keeps the number of parameters from growing with the length of the sequence and lets the network remember previous outputs.
- Because the weights and biases of all the hidden layers are the same, these layers can be joined together into a single recurrent layer.
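The weight sharing described above is commonly written as h_t = tanh(W_xh x_t + W_hh h_(t-1) + b_h). The following is a minimal sketch of that recurrence in Python with NumPy. The names `rnn_forward`, `W_xh`, `W_hh`, and `b_h`, and the layer sizes, are assumptions made for illustration. It shows that the same three parameter arrays are reused at every time step, so the parameter count does not depend on the sequence length.

```python
import numpy as np

def rnn_forward(xs, W_xh, W_hh, b_h):
    """Apply the same recurrent layer at every time step.

    xs has shape (steps, input_dim). Returns the hidden states with
    shape (steps, hidden_dim). W_xh, W_hh and b_h are shared by all steps.
    """
    h = np.zeros(W_hh.shape[0])       # initial hidden state
    states = []
    for x in xs:
        # The previous hidden state h feeds into the current step.
        h = np.tanh(W_xh @ x + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
input_dim, hidden_dim = 3, 4          # illustrative sizes
W_xh = rng.normal(size=(hidden_dim, input_dim)) * 0.1
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
b_h = np.zeros(hidden_dim)

short_states = rnn_forward(rng.normal(size=(3, input_dim)), W_xh, W_hh, b_h)
long_states = rnn_forward(rng.normal(size=(50, input_dim)), W_xh, W_hh, b_h)

# The same parameters handle a 3-step and a 50-step sequence.
n_params = W_xh.size + W_hh.size + b_h.size
print(short_states.shape, long_states.shape, n_params)  # (3, 4) (50, 4) 32
```

An unshared design would need a separate set of weights for each step, so a 50-step sequence would need 50 times as many parameters as this single recurrent layer.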
(5) Advantages of Recurrent Neural Networks
- An RNN carries information from earlier time steps forward through its hidden state. This ability to remember previous inputs is what makes it useful for time series prediction. Long Short-Term Memory (LSTM) networks are an RNN variant built to keep this memory over longer sequences.
- Recurrent neural networks are even used with convolutional layers to extend the effective pixel neighborhood.
(6) Main issues of standard RNN
- RNNs have had to deal with two major obstacles, both of which are gradient issues.
- A gradient is a partial derivative of a function with respect to its inputs. It measures how much the output of the function changes when the inputs change a little.
- The higher the gradient, the steeper the slope and the faster a model can learn. But if the slope is zero, the model stops learning. In training, the gradient measures the change in error with regard to a change in each weight.
(a) Exploding gradient issue: Exploding gradients occur when the gradients grow too large, so the algorithm assigns unreasonably high importance to the weights. This problem can usually be handled by truncating or squashing (clipping) the gradients.
(b) Vanishing gradient issue: Vanishing gradients occur when the values of a gradient are too small, so the model stops learning or takes far too long to learn. This problem is largely addressed by the LSTM architecture.
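Both issues can be illustrated with a small sketch in plain Python. It rests on a strong simplifying assumption: that each backward step through time multiplies the gradient by the same scalar weight. In a real RNN this factor is a matrix that changes from step to step. The helper names `backprop_factor` and `clip_by_norm` are illustrative and do not come from any library. The second helper shows one common way of squashing gradients, which is rescaling them when their L2 norm exceeds a threshold.

```python
import math

def backprop_factor(weight, steps):
    """Gradient scale after going back through `steps` time steps,
    assuming every step multiplies the gradient by the same scalar."""
    return weight ** steps

def clip_by_norm(grads, max_norm):
    """Rescale a list of gradient values so their L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)            # already small enough, keep as is
    scale = max_norm / norm
    return [g * scale for g in grads]

print(backprop_factor(0.5, 30))   # tiny value: the gradient vanishes
print(backprop_factor(1.5, 30))   # huge value: the gradient explodes

# A gradient with norm 50 is scaled down to norm 5; its direction is kept.
clipped = clip_by_norm([30.0, 40.0], max_norm=5.0)
print(clipped)
```

Clipping limits the size of an update but does nothing for gradients that have already shrunk toward zero, which is why the vanishing gradient issue needs an architectural fix instead.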
(7) Variants of RNN Architectures
- Several variants of the RNN architecture have been developed to solve the problems that occur in standard RNNs.
(a) Bidirectional recurrent neural networks (BRNN): A BRNN processes the sequence in both directions, so it pulls in future data as well as past data to improve the accuracy of its predictions.
(b) Long short-term memory (LSTM): This is a popular RNN architecture, introduced as a solution to the vanishing gradient problem. LSTMs have “cells” in the hidden layers of the neural network, which have three gates: an input gate, an output gate, and a forget gate. These gates control the flow of information that is needed to predict the output in the network.
(c) Gated recurrent units (GRUs): This RNN variant is similar to the LSTM, as it also works to address the short-term memory problem of RNN models. Instead of using a “cell state” to regulate information, it uses hidden states, and instead of three gates, it has two: a reset gate and an update gate.
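The gates described above can be sketched as single-step updates in Python with NumPy. This follows the commonly used textbook formulation of the LSTM and GRU equations. The function names, the parameter dictionary layout, and the sizes are assumptions made for illustration, and some references swap the roles of the update gate and its complement in the last line of the GRU step.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step. p maps a gate name to a (W, b) pair acting on [h_prev, x]."""
    z = np.concatenate([h_prev, x])
    i = sigmoid(p["input"][0] @ z + p["input"][1])     # input gate
    f = sigmoid(p["forget"][0] @ z + p["forget"][1])   # forget gate
    o = sigmoid(p["output"][0] @ z + p["output"][1])   # output gate
    g = np.tanh(p["cand"][0] @ z + p["cand"][1])       # candidate cell values
    c = f * c_prev + i * g    # cell state: keep part of the old, add part of the new
    h = o * np.tanh(c)        # hidden state passed on to the next step
    return h, c

def gru_step(x, h_prev, p):
    """One GRU step: two gates and no separate cell state."""
    z = np.concatenate([h_prev, x])
    u = sigmoid(p["update"][0] @ z + p["update"][1])   # update gate
    r = sigmoid(p["reset"][0] @ z + p["reset"][1])     # reset gate
    cand_in = np.concatenate([r * h_prev, x])          # reset gate filters the past
    h_cand = np.tanh(p["cand"][0] @ cand_in + p["cand"][1])
    return (1.0 - u) * h_prev + u * h_cand

def init_params(names, hidden_dim, input_dim, rng):
    """Small random weights and zero biases for each named gate (illustrative)."""
    return {n: (rng.normal(size=(hidden_dim, hidden_dim + input_dim)) * 0.1,
                np.zeros(hidden_dim)) for n in names}

rng = np.random.default_rng(0)
hidden_dim, input_dim = 4, 3
lstm_p = init_params(["input", "forget", "output", "cand"], hidden_dim, input_dim, rng)
gru_p = init_params(["update", "reset", "cand"], hidden_dim, input_dim, rng)

x = rng.normal(size=input_dim)
h0 = np.zeros(hidden_dim)
h_lstm, c_lstm = lstm_step(x, h0, np.zeros(hidden_dim), lstm_p)
h_gru = gru_step(x, h0, gru_p)
print(h_lstm.shape, c_lstm.shape, h_gru.shape)  # (4,) (4,) (4,)
```

The LSTM step returns two values, the hidden state and the cell state, while the GRU step returns only a hidden state. This reflects the difference between three gates with a cell state and two gates without one.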
This article mainly covered Recurrent Neural Networks (RNN), which are commonly used in supervised learning on sequential data. To address the remaining problems of RNNs, LSTMs, and GRUs, the Transformer model architecture was introduced, and it gives more accurate outputs on many problems that were previously tackled with RNNs. We will talk about that in the next articles.