Recurrent Neural Network
What’s in it for you?
• What is a Neural Network?
• Popular Neural Networks
• Why Recurrent Neural Network?
• What is a Recurrent Neural Network?
• How does a RNN work?
• Vanishing and Exploding Gradient Problem
• Long Short Term Memory (LSTM)
• Use case implementation of LSTM
Introduction to RNN
Do you know how Google’s autocomplete feature predicts the rest of the words a user is typing?
When a user types “What is the best food to eat in Las” into Google search, autocomplete suggests the next word: “Vegas”. Behind the scenes:
• A collection of large volumes of the most frequently occurring consecutive words is gathered.
• This data is fed to a Recurrent Neural Network.
• The network analyses the data by finding the sequences of words that occur frequently and builds a model to predict the next word in the sentence.
[Diagram: three unrolled RNN cells, each with input x, hidden state h, output y and parameters A, B, C]
What is a Neural Network?
Neural networks, as used in deep learning, consist of different layers connected to each other and are modeled on the structure and functions of the human brain. A neural network learns from huge volumes of data and uses complex algorithms to train itself.
[Diagram: image pixels of 2 different breeds of dog pass through the Input Layer, Hidden Layers and Output Layer, and the network identifies the dog’s breed as German Shepherd or Labrador]
Such networks do not require memorizing the past output.
Popular Neural Networks
• Feed Forward Neural Network: used for general regression and classification problems
• Convolutional Neural Network: used for image recognition
• Deep Neural Network: used for acoustic modeling
• Deep Belief Network: used for cancer detection
• Recurrent Neural Network: used for speech recognition
Feed Forward Neural Network
In a feed-forward network, information flows in only one direction: from the input nodes, through the hidden layers (if any), to the output nodes. There are no cycles or loops in the network.
[Simplified presentation: input x flows through the Input Layer, Hidden Layers and Output Layer to the predicted output ŷ]
• Decisions are based on the current input only
• No memory about the past
• No notion of future inputs
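To make this concrete, here is a minimal feed-forward classifier sketched in Keras. The layer sizes, the 784-pixel input and the two-class softmax output (the two dog breeds) are illustrative choices, not fixed by the slides:

from keras.models import Sequential
from keras.layers import Dense

# Information flows strictly forward: input -> hidden -> hidden -> output,
# with no loops, so each prediction depends only on the current input.
model = Sequential([
    Dense(64, activation='relu', input_shape=(784,)),  # hidden layer 1
    Dense(64, activation='relu'),                      # hidden layer 2
    Dense(2, activation='softmax'),                    # output layer
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])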
Why Recurrent Neural Network?
Issues in a Feed Forward Neural Network:
01 Cannot handle sequential data
02 Cannot memorize previous inputs
03 Considers only the current input
Why Recurrent Neural Network?
h
y
x
A
B
C
Recurrent Neural
Network
01
02
03
can handle
sequential data
can memorize previous
inputs due to its internal
memory
considers the current input and
also the previously received
inputs
Solution to Feed Forward Neural Network
Applications of RNN
Image captioning: an RNN can be used to caption an image by analyzing the activities present in it, e.g. “A dog catching a ball in mid air”.
Time series prediction: any time series problem, like predicting the prices of stocks in a particular month, can be solved using an RNN.
Natural Language Processing: text mining and sentiment analysis can be carried out using an RNN, e.g. classifying “When it rains, look for rainbows. When it’s dark, look for stars.” as positive sentiment.
Machine Translation: given an input in one language, an RNN can be used to translate it into different languages as output. Here the person is speaking in English, and the speech is translated into Chinese, Italian, French, German and Spanish.
What is a Recurrent Neural Network?
A Recurrent Neural Network works on the principle of saving the output of a layer and feeding it back to the input in order to predict the output of the layer.
[Diagram: a feed-forward network with Input Layer (x), Hidden Layers (h) and Output Layer (y), rolled up into a single recurrent cell with a feedback loop and parameters A, B, C]
What does a RNN look like?
A, B and C are the parameters of the network.
How does a RNN work?
The same cell is applied at every time step: at step t it takes the input vector x(t) and the old state h(t-1), and produces the new state

h(t) = f_c(h(t-1), x(t))

where:
• h(t) = new state
• h(t-1) = old state
• x(t) = input vector at time step t
• f_c = function with parameter c

[Diagram: the network unrolled over three time steps, with inputs x(t-1), x(t), x(t+1), states h(t-1), h(t), h(t+1), outputs y(t-1), y(t), y(t+1), and the parameters A, B, C shared across all steps]
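A minimal sketch of this recurrence in NumPy, taking f to be the common tanh choice; the weight names Wh, Wx and the sizes are illustrative:

import numpy as np

def rnn_step(h_prev, x_t, Wh, Wx, b):
    """One recurrence step: h(t) = f(h(t-1), x(t)), here with f = tanh."""
    return np.tanh(Wh @ h_prev + Wx @ x_t + b)

# The same parameters (Wh, Wx, b) are reused at every time step.
rng = np.random.default_rng(0)
hidden_size, input_size, steps = 4, 3, 5
Wh = rng.normal(size=(hidden_size, hidden_size))
Wx = rng.normal(size=(hidden_size, input_size))
b = np.zeros(hidden_size)

h = np.zeros(hidden_size)                    # initial state
for x_t in rng.normal(size=(steps, input_size)):
    h = rnn_step(h, x_t, Wh, Wx, b)          # the state carries context forward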
Types of Recurrent Neural Network
one to one: single input, single output. A one-to-one network is known as the Vanilla Neural Network and is used for regular machine learning problems.
one to many: single input, multiple outputs. A one-to-many network generates a sequence of outputs. Example: image captioning.
many to one: multiple inputs, single output. A many-to-one network takes in a sequence of inputs. Example: sentiment analysis, where a given sentence can be classified as expressing positive or negative sentiment.
many to many: multiple inputs, multiple outputs. A many-to-many network takes in a sequence of inputs and generates a sequence of outputs. Example: machine translation.
Vanishing Gradient Problem
While training a RNN, your slope can be either too
small or very large and this makes training difficult.
When the slope is too small, the problem is know as
Vanishing gradient.
Change in Y
Change in X
Y
X
s0
E0
x0
s1
E1
s2
E2
x2x1
s3
E3
x3
Loss of Information through time
Backpropagate the error
Exploding Gradient Problem
When the slope tends to grow exponentially instead of decaying, the problem is called the exploding gradient.
[Diagram: the error is backpropagated through states s3…s0, losing information through time]
Issues caused by the gradient problems:
• Long training time
• Poor performance
• Bad accuracy
Explaining Gradient Problem
[Diagram: an RNN unrolled through states s0, s1, s2, s3, …, st, with inputs x1…xt and outputs y1…yt]
Consider the following 2 examples to understand what should be the next word in the sequence:
The person who took my bike and …, ____ a thief. (was)
The students who got into Engineering with …, ____ from Asia. (were)
In order to work out the next word in the sequence, the RNN must memorize the previous context: whether the subject was a singular noun or a plural noun.
It might sometimes be difficult for the error to backpropagate to the beginning of the sequence to predict what the output should be.
Solution to Gradient Problem
Exploding Gradient:
1. Identity Initialization
2. Truncated Backpropagation
3. Gradient Clipping
Vanishing Gradient:
1. Weight Initialization
2. Choosing the right Activation Function
3. Long Short-Term Memory Networks (LSTMs)
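Of these remedies, gradient clipping is the simplest to show in code. In Keras it is an optimizer option; the clipnorm threshold of 1.0 and the toy model are illustrative:

from keras.models import Sequential
from keras.layers import SimpleRNN, Dense
from keras.optimizers import Adam

model = Sequential([SimpleRNN(32, input_shape=(60, 1)), Dense(1)])

# clipnorm rescales any gradient whose norm exceeds 1.0, a direct
# countermeasure to the exploding gradient problem.
model.compile(optimizer=Adam(learning_rate=0.001, clipnorm=1.0),
              loss='mean_squared_error')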
Long-Term Dependencies
Suppose we try to predict the last word in the text: “The clouds are in the ___.”
Here we do not need any further context. It’s pretty clear the last word is going to be “sky”.
Long-Term Dependencies
Now suppose we try to predict the last word in the text: “I have been staying in Spain for the last 10 years… I can speak fluent ______.”
The word we predict (“Spanish”) will depend on words much earlier in the context:
• Here we need the context of Spain to predict the last word in the text.
• It’s possible for the gap between the relevant information and the point where it is needed to become very large.
• LSTMs help us solve this problem.
Long Short-Term Memory Networks
LSTMs are a special kind of Recurrent Neural Network, capable of learning long-term dependencies. Remembering information for long periods of time is their default behavior.
All recurrent neural networks have the form of a chain of repeating modules of neural network. In standard RNNs, this repeating module has a very simple structure, such as a single tanh layer.
[Diagram: a chain of repeating modules A over inputs x(t-1), x(t), x(t+1) and outputs h(t-1), h(t), h(t+1), each module containing a single tanh layer]
Long Short-Term Memory Networks
LSTMs also have a chain-like structure, but the repeating module is different. Instead of a single neural network layer, there are four interacting layers communicating in a very special way.
[Diagram: the same chain of modules, but each module contains four interacting layers of sigmoid and tanh units, combined through pointwise multiplication (×) and addition (+)]
Long Short-Term Memory Networks
The 3-step process of LSTMs, transforming the old cell state C(t-1) into the new cell state C(t):
Step 1: Forget irrelevant parts of the previous state
Step 2: Selectively update cell state values
Step 3: Output certain parts of the cell state
Working of LSTMs
Step 1: Decides how much of the past it should remember.
The first step in the LSTM is to decide which information is to be omitted from the cell in that particular time step. This is decided by the sigmoid function: it looks at the previous state h(t-1) and the current input x(t) and computes the forget gate f(t) (in the standard formulation, f(t) = sigmoid(Wf · [h(t-1), x(t)] + bf)). The forget gate decides which information from the previous time step is not important and should be deleted.
Consider an LSTM fed with the following inputs from the previous and present time steps:
Previous output h(t-1): “Alice is good in Physics. John on the other hand is good in Chemistry.”
Current input x(t): “John plays football well. He told me yesterday over phone that he had served as the captain of his college football team.”
• The forget gate realizes there might be a change in context after encountering the first full stop.
• It compares this with the current input sentence at x(t).
• The next sentence talks about John, so the information on Alice is deleted.
• The position of the subject is vacated and assigned to John.
Working of LSTMs
Step 2: Decides how much this unit should add to the current state.
The second layer has 2 parts: a sigmoid function and a tanh function. The sigmoid function decides which values to let through (0 or 1); the tanh function gives weightage to the values that are passed, deciding their level of importance (-1 to 1). Together they form the input gate i(t), which determines which information to let through based on its significance in the current time step (in the standard formulation, i(t) = sigmoid(Wi · [h(t-1), x(t)] + bi), with candidate values C~(t) = tanh(Wc · [h(t-1), x(t)] + bc)).
Consider the current input at x(t): “John plays football well. He told me yesterday over phone that he had served as the captain of his college football team.”
• The input gate analyses the important information.
• “John plays football” and “he was the captain of his college team” are important.
• “He told me yesterday over phone” is less important, hence it is forgotten.
• This process of adding some new information is done via the input gate.
Working of LSTMs
Step 3: Decides what part of the current cell state makes it to the output.
The third step is to decide what the output will be. First, we run a sigmoid layer, which decides what parts of the cell state make it to the output: this is the output gate o(t), which allows the passed-in information to impact the output in the current time step. Then we put the cell state through tanh, to push the values to be between -1 and 1, and multiply it by the output of the sigmoid gate, so that h(t) = o(t) × tanh(C(t)).
Consider this example of predicting the next word in the sentence: “John played tremendously well against the opponent and won for his team. For his contributions, brave ____ was awarded player of the match.”
• There could be a lot of choices for the empty space.
• The current input “brave” is an adjective, and adjectives describe a noun.
• “John” could therefore be the best output after “brave”.
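Putting the three steps together, here is a single LSTM cell update sketched in NumPy, following the standard formulation; the stacked weight layout and the sizes are illustrative:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step; W stacks the four layers' weights row-wise."""
    z = W @ np.concatenate([h_prev, x_t]) + b
    H = h_prev.size
    f_t = sigmoid(z[0:H])          # Step 1: forget gate, drops irrelevant parts of C(t-1)
    i_t = sigmoid(z[H:2*H])        # Step 2: input gate, chooses what to let through
    c_tilde = np.tanh(z[2*H:3*H])  #         candidate values, weighted between -1 and 1
    o_t = sigmoid(z[3*H:4*H])      # Step 3: output gate
    c_t = f_t * c_prev + i_t * c_tilde   # selectively update the cell state
    h_t = o_t * np.tanh(c_t)             # output certain parts of the cell state
    return h_t, c_t

# Tiny usage example with random weights (sizes are illustrative).
rng = np.random.default_rng(0)
H, D = 4, 3
W = rng.normal(size=(4 * H, H + D))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
h, c = lstm_step(rng.normal(size=D), h, c, W, b)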
Use case implementation of LSTM
Let’s predict the prices of stocks using an LSTM network: based on the stock price data for 2012-2016, we predict the stock prices of 2017.
Use case implementation of LSTM
1. Import the Libraries
2. Import the training dataset
3. Feature Scaling
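A sketch of steps 1-3, assuming the 2012-2016 prices sit in a CSV file; the filename and the price-column index are assumptions:

# 1. Import the libraries
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# 2. Import the training dataset (filename and column are assumed)
dataset_train = pd.read_csv('stock_price_train.csv')
training_set = dataset_train.iloc[:, 1:2].values   # e.g. the opening-price column

# 3. Feature scaling: squash the prices into [0, 1] before feeding the LSTM
sc = MinMaxScaler(feature_range=(0, 1))
training_set_scaled = sc.fit_transform(training_set)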
Use case implementation of LSTM
4. Create a data structure with 60 timesteps and 1 output
5. Import the Keras libraries and packages
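A sketch of steps 4 and 5; the 60-step window follows the slide, while the variable names are illustrative:

# 4. Create a data structure with 60 timesteps and 1 output: each training
#    sample is the previous 60 scaled prices, the target is the next price
X_train, y_train = [], []
for i in range(60, len(training_set_scaled)):
    X_train.append(training_set_scaled[i - 60:i, 0])
    y_train.append(training_set_scaled[i, 0])
X_train, y_train = np.array(X_train), np.array(y_train)
# Reshape to (samples, timesteps, features) as Keras recurrent layers expect
X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))

# 5. Import the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout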
Use case implementation of LSTM
6. Initialize the RNN
7. Adding the LSTM layers and some Dropout regularization
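A sketch of steps 6 and 7; the number of LSTM layers, the 50 units and the 0.2 dropout rate are typical choices for this kind of tutorial setup, not dictated by the slide text:

# 6. Initialize the RNN as a sequential stack of layers
regressor = Sequential()

# 7. Add stacked LSTM layers with Dropout regularization;
#    return_sequences=True feeds the full sequence to the next LSTM layer
regressor.add(LSTM(units=50, return_sequences=True,
                   input_shape=(X_train.shape[1], 1)))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=50, return_sequences=True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units=50))   # final LSTM layer returns only the last state
regressor.add(Dropout(0.2))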
Use case implementation of LSTM
8. Adding the output layer
9. Compile the RNN
10. Fit the RNN to the training set
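A sketch of steps 8-10; the optimizer, epoch count and batch size are illustrative:

# 8. Add the output layer: a single unit for the predicted price
regressor.add(Dense(units=1))

# 9. Compile the RNN; mean squared error is the usual regression loss
regressor.compile(optimizer='adam', loss='mean_squared_error')

# 10. Fit the RNN to the training set
regressor.fit(X_train, y_train, epochs=100, batch_size=32)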
Use case implementation of LSTM
11. Load the stock price test data for 2017
12. Get the predicted stock price of 2017
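A sketch of steps 11 and 12, assuming a matching test CSV (filename and column assumed); it prepends the last 60 training days so every 2017 day has a full 60-step window:

# 11. Load the stock price test data for 2017
dataset_test = pd.read_csv('stock_price_test.csv')
real_stock_price = dataset_test.iloc[:, 1:2].values

dataset_total = pd.concat((dataset_train.iloc[:, 1], dataset_test.iloc[:, 1]),
                          axis=0)
inputs = dataset_total.values[len(dataset_total) - len(dataset_test) - 60:]
inputs = sc.transform(inputs.reshape(-1, 1))   # reuse the training-set scaler

X_test = []
for i in range(60, 60 + len(dataset_test)):
    X_test.append(inputs[i - 60:i, 0])
X_test = np.array(X_test)
X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))

# 12. Get the predicted stock price of 2017 and undo the scaling
predicted_stock_price = regressor.predict(X_test)
predicted_stock_price = sc.inverse_transform(predicted_stock_price)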
Use case implementation of LSTM
13. Visualize the results of predicted and real stock price
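A sketch of step 13 using matplotlib:

# 13. Visualize the real vs. predicted 2017 stock price
import matplotlib.pyplot as plt

plt.plot(real_stock_price, color='red', label='Real Stock Price')
plt.plot(predicted_stock_price, color='blue', label='Predicted Stock Price')
plt.title('Stock Price Prediction')
plt.xlabel('Time')
plt.ylabel('Stock Price')
plt.legend()
plt.show()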
Key Takeaways