Image Classification using CNN

Last Updated : 05 Aug, 2025

Image classification is a key task in machine learning where the goal is to assign a label to an image based on its content. Convolutional Neural Networks (CNNs) are specifically designed to analyze and interpret images. Unlike traditional neural networks, they are good at detecting patterns, shapes and textures by breaking an image down into smaller parts and learning from these details. By processing these patterns across multiple layers, they can identify increasingly complex features, making them effective for tasks like classifying images of animals, objects or scenes. In this article, we will see how CNNs work and how to use them to build an image classifier.

Key Components of CNNs

A Convolutional Neural Network (CNN) is made up of several layers, each designed to perform a specific function in processing images (a minimal sketch mapping these components to code follows the list):

  1. Convolutional Layers: Apply filters (kernels) that detect features such as edges or textures.
  2. ReLU Activation: Adds non-linearity, helping the model learn complex patterns.
  3. Pooling Layers: Reduce the spatial dimensions of the feature maps, making the network more efficient while preserving important features.
  4. Fully Connected Layers: After feature extraction, these layers make the final prediction based on the detected patterns.
  5. Softmax Output: Converts the network’s output into probabilities, showing the likelihood of each class.
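
To make these components concrete, here is a minimal, illustrative Keras sketch (the 32x32 RGB input shape and the 10 output classes are assumptions for illustration; this is not the model we build later) that stacks each of them in order:

Python
from tensorflow.keras import layers, models

# Tiny illustrative stack: one layer per component listed above
tiny_cnn = models.Sequential([
    layers.Input(shape=(32, 32, 3)),                # assumed 32x32 RGB input
    layers.Conv2D(16, (3, 3), activation='relu'),   # convolution + ReLU: detect local features
    layers.MaxPooling2D((2, 2)),                    # pooling: downsample the feature maps
    layers.Flatten(),                               # prepare features for the dense layers
    layers.Dense(10, activation='softmax')          # fully connected + softmax over 10 assumed classes
])
tiny_cnn.summary()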

How Do CNNs Work for Image Classification?

The process of image classification with a CNN involves several stages:

  1. Preprocessing the Image: Images need to be preprocessed before feeding them into the CNN. This includes resizing, normalizing and sometimes augmenting images to make the model more robust and reduce overfitting (a short preprocessing sketch follows this list).
  2. Feature Extraction: CNNs automatically detect features from images, starting with simple features like edges and progressing to more complex patterns like objects or faces deeper in the network.
  3. Classification: After feature extraction, the fully connected layers use the learned information to classify the image. Based on the features detected, the model assigns the image to one of the predefined categories.
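
As a rough sketch of the preprocessing stage (the 48x48 dummy image, the target size and the augmentation settings below are assumptions, not the CIFAR-10 pipeline used later), Keras preprocessing layers can resize, rescale and augment images:

Python
import tensorflow as tf
from tensorflow.keras import layers

# Resize to a fixed size and scale pixels from [0, 255] to [0, 1]
resize_and_rescale = tf.keras.Sequential([
    layers.Resizing(32, 32),
    layers.Rescaling(1.0 / 255)
])

# Simple augmentation: random flips and small rotations (active only during training)
augment = tf.keras.Sequential([
    layers.RandomFlip('horizontal'),
    layers.RandomRotation(0.1)
])

example = tf.random.uniform((1, 48, 48, 3), maxval=255.0)  # dummy batch of one 48x48 image
processed = augment(resize_and_rescale(example), training=True)
print(processed.shape)  # (1, 32, 32, 3)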

Implementation of Image Classification using CNN

Let's see the implementation of image classification step by step:

Step 1: Importing Libraries

We will be using the TensorFlow and Matplotlib libraries to build and train the model and to visualize its training and validation accuracy.

Python
import tensorflow as tf
from tensorflow.keras import layers, models, datasets
import matplotlib.pyplot as plt

Step 2: Downloading and Preparing the Dataset

Next, we load the CIFAR-10 dataset and preprocess it. It consists of 60,000 32x32 color images across 10 categories.

  • Scaling: We scale the image pixel values from [0, 255] to [0, 1] by dividing by 255.
  • One-hot encoding: Converts the labels (0-9) into a one-hot vector (e.g., for label 2: [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]).
Python
(x_train, y_train), (x_test, y_test) = datasets.cifar10.load_data()

x_train, x_test = x_train / 255.0, x_test / 255.0

num_classes = 10
y_train = tf.keras.utils.to_categorical(y_train, num_classes)
y_test  = tf.keras.utils.to_categorical(y_test , num_classes)

Output:

Downloading the Dataset

Step 3: Building the CNN Model

Now we define the CNN architecture: convolutional layers followed by max-pooling layers, then a flatten layer whose output is fed into fully connected layers.

  • Flatten Layer: Converts the 2D matrix into a 1D vector for the dense layers.
  • Dense Layers: Fully connected layers used for decision making, with the final layer using softmax activation to predict probabilities.
Python
model = models.Sequential([
    # Convolutional block 1: 32 filters of size 3x3, ReLU activation
    layers.Conv2D(32, (3,3), activation='relu', padding='same', input_shape=(32,32,3)),
    layers.MaxPooling2D(2,2),

    # Convolutional block 2: 64 filters
    layers.Conv2D(64, (3,3), activation='relu', padding='same'),
    layers.MaxPooling2D(2,2),

    # Convolutional block 3: 64 filters, then flatten for the dense layers
    layers.Conv2D(64, (3,3), activation='relu', padding='same'),
    layers.Flatten(),

    # Fully connected layers: 64 hidden units, then softmax over the 10 classes
    layers.Dense(64, activation='relu'),
    layers.Dense(num_classes, activation='softmax')
])

model.summary()

Output:

Building the CNN Model

Step 4: Compiling and Training the Model

We then compile the model by defining the optimizer, loss function and evaluation metric, followed by training. The Adam optimizer is used because it adapts the learning rate for each parameter during training.

Python
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(x_train, y_train,
                    epochs=15,
                    batch_size=64,
                    validation_split=0.2,
                    verbose=2)

Output:

Training the Model

Step 5: Evaluating the Model

After training, we evaluate the model on the test dataset to check how well it performs on unseen data.

Python
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)

print(f"Test accuracy = {test_acc:.3f}")

Step 6: Plotting the Accuracy Curves

Finally, we visualize the training and validation accuracy across epochs using Matplotlib.

Python
plt.plot(history.history['accuracy'], label='train')
plt.plot(history.history['val_accuracy'], label='val')
plt.legend()
plt.title('Accuracy')
plt.show()

Output:

Accuracy Curves

Benefits of Using CNNs for Image Classification

  1. Automatic Feature Learning: CNNs automatically learn relevant features from images, reducing the need for manual feature extraction and making them more adaptable to different tasks.
  2. Translation Invariance: They can recognize objects regardless of where they appear in the image, making them more robust to shifts in an object's position.
  3. Efficiency: Pooling layers reduce the size of the feature maps, making the model more computationally efficient while retaining key features (see the short sketch after this list).
  4. Scalability: They can handle large datasets effectively, as they can process high volumes of data and continue improving with more examples.
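
For instance, here is a minimal sketch (the 32x32 feature map with 16 channels is an assumed example) showing how a single 2x2 max-pooling layer halves each spatial dimension, which is where the efficiency gain in point 3 comes from:

Python
import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.uniform((1, 32, 32, 16))        # assumed batch of one 32x32 feature map with 16 channels
pooled = layers.MaxPooling2D((2, 2))(x)       # 2x2 max pooling with stride 2
print(x.shape, '->', pooled.shape)            # (1, 32, 32, 16) -> (1, 16, 16, 16)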

Challenges in Image Classification

While CNNs have several advantages, they also come with certain challenges that we need to address when implementing them.

  1. Overfitting: CNNs can overfit if the model is too complex or if there’s limited data. Techniques like data augmentation and dropout help mitigate this (a short sketch follows this list).
  2. Computational Demands: Training CNNs requires significant computational resources, often relying on high-performance GPUs or cloud services.
  3. Data Quality: High-quality, labeled datasets are important for accurate classification. Poor data can lead to inaccurate predictions.
  4. Long Training Times: Training deep CNN models can be computationally expensive and time-consuming, especially with large datasets.
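
As a sketch of how point 1 could be addressed (the layer sizes, the augmentation settings and the 0.5 dropout rate below are illustrative assumptions, not tuned values), dropout and augmentation layers can be added to a model similar to the one built above:

Python
from tensorflow.keras import layers, models

regularized_model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.RandomFlip('horizontal'),        # augmentation: applied only during training
    layers.RandomRotation(0.1),
    layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.5),                    # dropout: randomly zeroes half the units during training
    layers.Dense(10, activation='softmax')
])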
