Implementing Recurrent Neural Networks in PyTorch

Last Updated : 21 May, 2025

Recurrent Neural Networks (RNNs) are neural networks that are particularly effective for sequential data. Unlike traditional feedforward neural networks, RNNs have connections that form loops, allowing them to maintain a hidden state that captures information from previous inputs. This makes them suitable for tasks such as time series prediction, natural language processing and many more. In this article we will explore how to implement RNNs using PyTorch.
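To see the hidden state in action, here is a minimal sketch, with arbitrary dimensions chosen purely for illustration, that feeds a batch of random sequences through PyTorch's nn.RNN and prints the shapes of the per-step outputs and the final hidden state:

Python
import torch
import torch.nn as nn

# Toy RNN: 8-dimensional inputs, 16-dimensional hidden state
rnn = nn.RNN(input_size=8, hidden_size=16, batch_first=True)

# A batch of 4 random sequences, each 10 time steps long
x = torch.randn(4, 10, 8)
out, h_n = rnn(x)

print(out.shape)  # torch.Size([4, 10, 16]) - hidden state at every time step
print(h_n.shape)  # torch.Size([1, 4, 16])  - final hidden state after the last step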

Before we start implementing the RNN we need to set up our environment. Ensure you have PyTorch installed. You can install it using pip:

!pip install torch

Classifying Movie Reviews Using RNN

In this example we will use a public dataset to perform sentiment analysis on movie reviews. The goal is to classify each review as positive or negative using an RNN.

You can download the dataset from here.

1. Importing Libraries

We are importing:

  • PyTorch (torch, torch.nn, torch.optim) for building and training neural networks.
  • Pandas and NumPy for data handling and numerical operations.
  • Matplotlib for visualization.
  • Scikit-learn’s train_test_split and LabelEncoder for data splitting and label encoding.
  • PyTorch’s Dataset and DataLoader utilities for batching the data.
Python
import torch
import torch.nn as nn
import torch.optim as optim
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from torch.utils.data import Dataset, DataLoader

2. Loading and Preprocessing the Dataset

  • Load the dataset using pd.read_csv() and assign column names.
  • Lowercase and tokenize the text using pandas string methods.
  • Encode labels into numeric form with LabelEncoder().
  • Split the data into training and testing sets using train_test_split().
  • Create a vocabulary set from all unique words in the dataset.
  • Map each unique word to a unique index.
  • Define encode_and_pad() function to convert tokenized sentences into sequences of indices and pad them to the maximum sequence length.
  • Process training and testing texts with encode_and_pad() to prepare data for modeling.
Python
# Load the dataset; header=0 ensures the CSV's own header row is not read as a data sample
df = pd.read_csv("/content/IMDB Dataset.csv", header=0, names=["text", "label"])

# Lowercase and tokenize the reviews by whitespace
df['text'] = df['text'].str.lower().str.split()

# Encode the string labels as integers (e.g. negative -> 0, positive -> 1)
le = LabelEncoder()
df['label'] = le.fit_transform(df['label'])

# Split into training and testing sets (copies avoid pandas chained-assignment warnings)
train_data, test_data = train_test_split(df, test_size=0.2, random_state=42)
train_data, test_data = train_data.copy(), test_data.copy()

# Build the vocabulary and map each word to a unique index; 0 is reserved for padding
vocab = {word for phrase in df['text'] for word in phrase}
word_to_idx = {word: idx for idx, word in enumerate(vocab, start=1)}

# The longest review (in tokens) determines the padded sequence length
max_length = df['text'].str.len().max()

def encode_and_pad(text):
    # Convert tokens to indices and pad with zeros up to max_length
    encoded = [word_to_idx[word] for word in text]
    return encoded + [0] * (max_length - len(encoded))

train_data['text'] = train_data['text'].apply(encode_and_pad)
test_data['text'] = test_data['text'].apply(encode_and_pad)
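As a quick illustration, encoding a short tokenized review with encode_and_pad returns its word indices followed by zeros up to max_length; the indices shown in the comment are hypothetical, since the vocabulary is built from an unordered set and changes between runs:

Python
sample = ["great", "movie"]        # already lowercased and tokenized
encoded = encode_and_pad(sample)
print(encoded[:5])                 # e.g. [10432, 871, 0, 0, 0]
print(len(encoded))                # equals max_length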

3. Creating Dataset and Data Loader

  • Define a custom SentimentDataset class inheriting from PyTorch’s Dataset.
  • Store texts and labels from input data within the class.
  • Implement __len__ method to return total number of samples.
  • Implement __getitem__ method to retrieve a single sample by index, converting text and label to PyTorch tensors with correct data types.
  • Create dataset instances for training and testing data.
  • Wrap datasets in DataLoaders with a batch size of 32.
  • Shuffle training data in DataLoader for randomness, keep test data ordered.
  • Prepare data for efficient batch loading during model training and evaluation.
Python
class SentimentDataset(Dataset):
    """Wraps the encoded reviews and labels so a DataLoader can batch them."""
    def __init__(self, data):
        self.texts = data['text'].values
        self.labels = data['label'].values

    def __len__(self):
        # Total number of samples
        return len(self.texts)

    def __getitem__(self, idx):
        # Return one (review, label) pair as tensors
        text = self.texts[idx]
        label = self.labels[idx]
        return torch.tensor(text, dtype=torch.long), torch.tensor(label, dtype=torch.long)

train_dataset = SentimentDataset(train_data)
test_dataset = SentimentDataset(test_data)

# Batch the data; shuffle only the training set
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)
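As an optional sanity check, pulling one batch from the training loader confirms the tensor shapes: (32, max_length) for the texts and (32,) for the labels.

Python
texts, labels = next(iter(train_loader))
print(texts.shape)   # torch.Size([32, max_length])
print(labels.shape)  # torch.Size([32])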

4. Defining the RNN Model

  • Define a SentimentRNN class inheriting from PyTorch’s nn.Module.
  • Initialize an embedding layer to convert word indices into dense vectors.
  • Add an RNN layer to process the input sequences.
  • Include a fully connected layer to map RNN outputs to the final output size.
  • In the forward method pass input sequences through the embedding layer.
  • Create an initial hidden state of zeros and process the sequence using the RNN layer.
  • Take the output from the last time step and pass it through the fully connected layer to produce predictions.
  • Set the vocabulary size, embedding size, hidden size and output size.
  • Instantiate the SentimentRNN model with these parameters.
Python
class SentimentRNN(nn.Module):
    def __init__(self, vocab_size, embed_size, hidden_size, output_size):
        super(SentimentRNN, self).__init__()
        self.hidden_size = hidden_size
        self.embedding = nn.Embedding(vocab_size, embed_size)
        self.rnn = nn.RNN(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = self.embedding(x)
        # Initial hidden state: (num_layers, batch_size, hidden_size)
        h0 = torch.zeros(1, x.size(0), self.hidden_size).to(x.device)
        out, _ = self.rnn(x, h0)
        # Use the output of the last time step for classification
        out = self.fc(out[:, -1, :])
        return out

vocab_size = len(vocab) + 1  # +1 for the padding index 0
embed_size = 128
hidden_size = 128
output_size = 2
model = SentimentRNN(vocab_size, embed_size, hidden_size, output_size)
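Before training, an optional forward pass on a dummy batch verifies that the model produces logits of the expected shape; the batch size of 4 below is arbitrary.

Python
dummy_batch = torch.randint(0, vocab_size, (4, int(max_length)))  # 4 fake encoded reviews
with torch.no_grad():
    logits = model(dummy_batch)
print(logits.shape)  # torch.Size([4, 2]) - one score per class for each review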

5. Training the Model

  • Define the loss function as cross-entropy loss.
  • Set up the Adam optimizer with a learning rate of 0.001.
  • Specify the number of training epochs.
  • For each epoch set the model to training mode.
  • Initialize epoch loss to zero.
  • For each batch of texts and labels from the training loader: compute model outputs, calculate the loss and zero the optimizer gradients.
  • Perform backpropagation to compute gradients, update the model weights with the optimizer and accumulate the batch loss into the epoch loss.
Python
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

num_epochs = 10
for epoch in range(num_epochs):
    model.train()
    epoch_loss = 0
    for texts, labels in train_loader:
        # Forward pass and loss computation
        outputs = model(texts)
        loss = criterion(outputs, labels)

        # Backward pass and weight update
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        epoch_loss += loss.item()

    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {epoch_loss / len(train_loader):.4f}')

Output:

Training the model

6. Evaluating the Model

  • Set the model to evaluation mode.
  • Initialize counters for correct predictions and total samples.
  • Use torch.no_grad() to disable gradient calculations.
  • Iterate over test loader batches and compute model outputs.
  • Determine predicted classes by selecting the max output score and update total samples count.
  • Increment correct count for matching predictions and true labels.
Python
model.eval()
correct = 0
total = 0
with torch.no_grad():
    for texts, labels in test_loader:
        outputs = model(texts)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

accuracy = 100 * correct / total
print(f'Accuracy: {accuracy:.2f}%')

Output:

Accuracy: 86.64%
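To try the trained model on a new review, the sketch below, which assumes the preprocessing objects word_to_idx, encode_and_pad and le from the steps above, applies the same lowercasing, tokenization, encoding and padding, then decodes the predicted class back to its original label. Words that never appeared in the training vocabulary are simply skipped here for simplicity.

Python
def predict_sentiment(review):
    # Preprocess exactly like the training data: lowercase, split, encode, pad
    tokens = [w for w in review.lower().split() if w in word_to_idx]
    encoded = encode_and_pad(tokens)

    model.eval()
    with torch.no_grad():
        logits = model(torch.tensor([encoded], dtype=torch.long))
        pred = torch.argmax(logits, dim=1).item()
    # Map the numeric class back to its original string label
    return le.inverse_transform([pred])[0]

print(predict_sentiment("this movie was absolutely wonderful"))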

7. Visualizing Training Loss

Here we run the training loop once more, this time recording the average loss of each epoch in a list, and plot the resulting loss curve with Matplotlib.
Python
losses = []

for epoch in range(num_epochs):
    model.train()
    epoch_loss = 0
    for texts, labels in train_loader:
        outputs = model(texts)
        loss = criterion(outputs, labels)
        
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        epoch_loss += loss.item()
    
    losses.append(epoch_loss / len(train_loader))
    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {epoch_loss / len(train_loader):.4f}')

plt.figure(figsize=(10, 6))
plt.plot(range(1, num_epochs + 1), losses, marker='o')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training Loss')
plt.show()

Output:

Training loss per epoch and the plotted loss curve

Based on the training loss plot, the RNN model demonstrates good performance. Although the loss fluctuates throughout training, it shows a steady decrease by the final epochs, indicating that the model is improving over time. The consistent reduction in loss towards the end shows that the model is learning effectively and converging towards a more optimized state.

You can also build an RNN model using TensorFlow; for that you can refer to this article: Training of Recurrent Neural Networks (RNN) in TensorFlow.

