AutoEncoder with TensorFlow - ML

Last Updated : 27 May, 2025

Autoencoders are neural networks used for unsupervised learning tasks like dimensionality reduction, anomaly detection and feature extraction. They consist of two key parts:

Encoder that compresses data into a compact form.
Decoder that reconstructs the original data from this compressed representation.

The main goal is to minimize the difference between the input and its reconstruction. In this article we'll implement a Convolutional Neural Network (CNN) based autoencoder using TensorFlow and the MNIST dataset.
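The encode, decode and compare cycle described above can be illustrated without a neural network. The sketch below is not the CNN built in this article: it swaps in a linear "autoencoder" based on the singular value decomposition, and the synthetic data and the helper names (encode, decode, reconstruction_error) are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples with 10 features that vary along only 3 directions.
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
data = latent @ mixing

# Center the data and use the top-k right singular vectors as a linear encoder.
mean = data.mean(axis=0)
_, _, vt = np.linalg.svd(data - mean, full_matrices=False)


def encode(x, k):
    """Compress each row of x to k numbers."""
    return (x - mean) @ vt[:k].T


def decode(code, k):
    """Map k-number codes back to the original 10 features."""
    return code @ vt[:k] + mean


def reconstruction_error(x, k):
    """Mean squared difference between x and its reconstruction."""
    return float(np.mean((x - decode(encode(x, k), k)) ** 2))


error_k1 = reconstruction_error(data, 1)
error_k3 = reconstruction_error(data, 3)
print(f"k=1 error: {error_k1:.4f}, k=3 error: {error_k3:.4f}")
```

With a code of size 3 the reconstruction is essentially exact because the synthetic data has only three underlying directions, while a code of size 1 loses information. A neural autoencoder follows the same pattern but learns non-linear encode and decode functions by gradient descent.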

Implementing Autoencoders using TensorFlow

Let's walk through the steps involved in implementing an autoencoder using TensorFlow.

Step 1: Importing libraries

We will be using NumPy, Matplotlib and TensorFlow libraries.

Python

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
import matplotlib.pyplot as plt

Step 2: Loading the Dataset

Now we load the MNIST dataset using tf.keras.datasets.mnist.load_data(). This dataset includes 60,000 training and 10,000 testing grayscale digit images of size 28×28 pixels. To prepare the data:

Normalize the pixel values by dividing by 255.0 to scale them between 0 and 1.
Reshape the data to (28, 28, 1) so that each image has a single channel, which makes it compatible with CNN layers.

This preprocessing ensures that the input format matches what the convolutional layers expect.

Python

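(x_train, _), (x_test, _) = keras.datasets.mnist.load_data()
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
# Add the channel axis: (N, 28, 28) -> (N, 28, 28, 1)
x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)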
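The two preparation steps listed above can be checked on a small synthetic batch without downloading MNIST. This is a minimal sketch: the random array raw is an assumed stand-in for real digit images, and only NumPy is used.

```python
import numpy as np

# Synthetic stand-in for a batch of MNIST images: uint8 values in [0, 255].
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Scale to [0, 1] as float32, then add a trailing channel axis.
scaled = raw.astype("float32") / 255.0
batch = scaled.reshape(-1, 28, 28, 1)

print(batch.shape, batch.dtype, batch.min(), batch.max())
```

The same shape, dtype and value-range checks can be run on x_train and x_test before training to confirm that they match the model's input.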


Step 3: Defining the Autoencoder

In this step we are going to define our autoencoder. It consists of two important components:

Encoder: It compresses the input image into a lower-dimensional representation (latent space). It consists of three convolutional layers with ReLU activation, each followed by a MaxPooling layer that reduces the spatial size step by step, from 28×28 down to 4×4. The final output is the compressed representation of the input image.
Decoder: It reconstructs the original image from the compressed representation. It uses three convolutional layers, each followed by an UpSampling2D layer, to progressively upsample the data back to the original size. A final convolutional layer with a sigmoid activation then outputs pixel values between 0 and 1.

Python

# Encoder: each Conv2D + MaxPooling2D pair halves the spatial size (rounding up)
encoder_input = keras.Input(shape=(28, 28, 1))
x = layers.Conv2D(16, (3, 3), activation="relu", padding="same")(encoder_input)
x = layers.MaxPooling2D((2, 2), padding="same")(x)   # 28x28 -> 14x14
x = layers.Conv2D(8, (3, 3), activation="relu", padding="same")(x)
x = layers.MaxPooling2D((2, 2), padding="same")(x)   # 14x14 -> 7x7
x = layers.Conv2D(8, (3, 3), activation="relu", padding="same")(x)
encoded = layers.MaxPooling2D((2, 2), padding="same")(x)   # 7x7 -> 4x4, 8 channels

# Decoder: each Conv2D + UpSampling2D pair doubles the spatial size
x = layers.Conv2D(8, (3, 3), activation="relu", padding="same")(encoded)
x = layers.UpSampling2D((2, 2))(x)   # 4x4 -> 8x8
x = layers.Conv2D(8, (3, 3), activation="relu", padding="same")(x)
x = layers.UpSampling2D((2, 2))(x)   # 8x8 -> 16x16
# No padding="same" here on purpose: the valid convolution trims 16x16 to 14x14
x = layers.Conv2D(16, (3, 3), activation="relu")(x)
x = layers.UpSampling2D((2, 2))(x)   # 14x14 -> 28x28
decoded = layers.Conv2D(1, (3, 3), activation="sigmoid", padding="same")(x)
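It helps to trace how the spatial size changes through the layers above, in particular why one decoder convolution omits padding="same". The following sketch uses only the standard library, and the helper names (same_pool, valid_conv, upsample) are hypothetical. It relies on the usual size rules: pooling with "same" padding rounds up, a 3×3 convolution with "valid" padding trims two pixels, and upsampling doubles the size.

```python
import math


def same_pool(size, pool=2):
    """MaxPooling2D with padding="same": output size is ceil(size / pool)."""
    return math.ceil(size / pool)


def valid_conv(size, kernel=3):
    """Conv2D with padding="valid" and stride 1 trims kernel - 1 pixels."""
    return size - (kernel - 1)


def upsample(size, factor=2):
    """UpSampling2D multiplies the spatial size by the factor."""
    return size * factor


size = 28
encoder_sizes = [size]
for _ in range(3):
    # The "same" convolutions keep the size; only the pooling changes it.
    size = same_pool(size)
    encoder_sizes.append(size)

decoder_sizes = [size]
size = upsample(size)      # 4 -> 8
decoder_sizes.append(size)
size = upsample(size)      # 8 -> 16
decoder_sizes.append(size)
size = valid_conv(size)    # 16 -> 14, the convolution without padding="same"
decoder_sizes.append(size)
size = upsample(size)      # 14 -> 28
decoder_sizes.append(size)

# The bottleneck has 8 channels, so it holds 4 * 4 * 8 values per image.
latent_values = encoder_sizes[-1] ** 2 * 8

print("encoder:", encoder_sizes)
print("decoder:", decoder_sizes)
print("values in bottleneck:", latent_values)
```

Because 7 is odd, the third pooling step rounds up to 4, and three doublings of 4 would give 32 rather than 28. The unpadded convolution brings 16 down to 14, so the last upsampling lands exactly on 28.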

Step 4: Compiling and Training the Autoencoder

Here we define the autoencoder model by specifying the input (encoder_input) and output (decoded). The model is then compiled with the Adam optimizer and binary cross-entropy loss, which works for this task because both the input pixels and the sigmoid outputs lie between 0 and 1.

The call to fit() uses epochs=10, batch_size=128, shuffle=True and validation_data=(x_test, x_test): it trains the autoencoder for 10 epochs in batches of 128 samples, shuffles the training data each epoch and validates on the test data. The input and the target are the same array because the task is reconstruction.

Python

autoencoder = keras.Model(encoder_input, decoded)
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

autoencoder.fit(
    x_train, x_train,
    epochs=10,
    batch_size=128,
    shuffle=True,
    validation_data=(x_test, x_test)
)
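To make the loss more concrete, binary cross-entropy can be computed by hand on a tiny example. This is a NumPy sketch of the standard formula, not Keras's internal implementation: the function name binary_crossentropy, the eps clipping value and the 2×2 "images" are assumptions made for illustration.

```python
import numpy as np


def binary_crossentropy(target, prediction, eps=1e-7):
    """Mean per-pixel binary cross-entropy for arrays with values in [0, 1]."""
    prediction = np.clip(prediction, eps, 1.0 - eps)  # avoid log(0)
    loss = -(target * np.log(prediction)
             + (1.0 - target) * np.log(1.0 - prediction))
    return float(loss.mean())


# A tiny 2x2 "image" and two candidate reconstructions.
target = np.array([[0.0, 1.0], [1.0, 0.0]])
close = np.array([[0.1, 0.9], [0.9, 0.1]])
uninformative = np.full_like(target, 0.5)

loss_close = binary_crossentropy(target, close)
loss_uninformative = binary_crossentropy(target, uninformative)
print(loss_close, loss_uninformative)
```

The reconstruction that is close to the target gets a much lower loss than the one that predicts 0.5 everywhere, which is the signal the optimizer follows during training.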


Step 5: Visualizing Original and Reconstructed Images

In this step we pass test images through the trained autoencoder to get the reconstructed images. Then we use Matplotlib to plot the original and reconstructed images side by side for comparison.

decoded_imgs = autoencoder.predict(x_test): Uses the trained autoencoder to reconstruct images from the test dataset.

Python

decoded_imgs = autoencoder.predict(x_test)
n = 10  # number of digits to display

plt.figure(figsize=(20, 4))
for i in range(n):
    # Top row: original image
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28), cmap="gray")
    plt.axis("off")

    # Bottom row: reconstruction of the same image
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28), cmap="gray")
    plt.axis("off")
plt.show()

Output:

In the resulting figure, the top row shows the original test images and the bottom row shows the reconstructions generated by the autoencoder. The reconstructed images resemble the originals, which indicates that the model has learned the essential features of the digits.
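Visual comparison can be complemented with a number per image. The sketch below computes a per-image mean squared reconstruction error with NumPy. The function name per_image_mse is hypothetical, and the arrays are synthetic stand-ins: in this article they would be x_test and the output of autoencoder.predict(x_test).

```python
import numpy as np


def per_image_mse(originals, reconstructions):
    """Mean squared error per image, averaged over height, width and channel."""
    diff = originals - reconstructions
    return np.mean(diff ** 2, axis=(1, 2, 3))


# Synthetic stand-ins (assumption) for test images and their reconstructions.
rng = np.random.default_rng(0)
originals = rng.random((4, 28, 28, 1)).astype("float32")
reconstructions = originals.copy()
reconstructions[0] += 0.01   # image 0 is reconstructed almost perfectly
reconstructions[3] += 0.30   # image 3 is reconstructed poorly

errors = per_image_mse(originals, reconstructions)
worst = int(np.argmax(errors))
print(errors, "worst image:", worst)
```

Images with an unusually high error are the ones the model reconstructs worst, which is the basis for using autoencoders in anomaly detection.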


Vishal_V
