
Python - Central Limit Theorem

Last Updated : 01 Aug, 2025

The Central Limit Theorem (CLT) is a key result in statistics that explains why averages of many kinds of data tend to follow a normal distribution. It states that if you repeatedly draw independent random samples of a sufficiently large size from a population with finite variance, the distribution of the sample means will be approximately normal, even if the population itself is not. Simulating the theorem in Python is a simple way to see it at work.
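
For reference, the classical form of the theorem can be written as follows. This is the standard textbook statement rather than anything specific to this article; the symbols (X_i for the individual draws, mu and sigma^2 for the population mean and variance, n for the sample size) are notation introduced here.

```latex
% Assumption: X_1, ..., X_n are independent, identically distributed
% draws with mean \mu and finite variance \sigma^2 > 0.
\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i,
\qquad
\frac{\sqrt{n}\,\bigl(\bar{X}_n - \mu\bigr)}{\sigma}
\;\xrightarrow{\,d\,}\; \mathcal{N}(0, 1)
\quad \text{as } n \to \infty .

% Informally, for large n:
\bar{X}_n \;\approx\; \mathcal{N}\!\left(\mu, \frac{\sigma^2}{n}\right)
```

The second line is the practical reading: the sample mean is centred on the population mean, and its spread shrinks in proportion to 1/sqrt(n).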

Step 1: Import Required Libraries

  • NumPy is used for numerical operations including random number generation and calculating means.
  • matplotlib.pyplot is used for plotting the histograms.
Python
import numpy as np
import matplotlib.pyplot as plt

Step 2: Set Sample Sizes and Initialize Storage

  • Define different sample sizes to observe how increasing the sample size affects the sampling distribution.
  • Prepare a list (all_sample_means) to store the computed means for each sample size.
Python
sample_sizes = [1, 10, 50, 100]
all_sample_means = []

Step 3: Set Random Seed for Reproducibility and Generate Sample Means for Each Sample Size

  • Setting the random seed makes the results reproducible: every run produces the same random numbers.
  • For each sample size, draw 1,000 random samples of integers from -40 to 39 (np.random.randint excludes the upper bound, 40).
  • Compute the mean of each sample.
  • Collect all these means for later visualization.
Python
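# Fix the seed so every run draws the same random numbers
np.random.seed(1)

for size in sample_sizes:
    # 1,000 samples of `size` integers from -40 to 39; keep each sample's mean
    sample_means = [np.mean(np.random.randint(-40, 40, size))
                    for _ in range(1000)]
    all_sample_means.append(sample_means)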
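
Before plotting, it can help to check the spread of the sample means numerically. The CLT predicts that their standard deviation should be close to sigma / sqrt(n), where sigma is the population standard deviation. The sketch below is one way to do that check, not part of the original steps: it uses NumPy's newer `default_rng` generator instead of `np.random.seed` (so the random numbers differ from the run above), and the names `rng`, `population_values` and `spread_check` are illustrative.

```python
import numpy as np

# Assumption: a fresh Generator with an arbitrary seed; its numbers
# will not match those produced by np.random.seed(1).
rng = np.random.default_rng(1)

# The population: integers -40..39, each equally likely
# (the same range np.random.randint(-40, 40) draws from).
population_values = np.arange(-40, 40)
mu = population_values.mean()      # population mean
sigma = population_values.std()    # population standard deviation

spread_check = {}
for size in [1, 10, 50, 100]:
    # 1,000 samples of `size` draws each, one sample per row
    samples = rng.integers(-40, 40, size=(1000, size))
    means = samples.mean(axis=1)

    empirical_sd = means.std()
    predicted_sd = sigma / np.sqrt(size)
    spread_check[size] = (empirical_sd, predicted_sd)

    print(f"n={size:>3}  empirical SD={empirical_sd:.2f}  "
          f"sigma/sqrt(n)={predicted_sd:.2f}")
```

If the simulation behaves as the theorem suggests, the two columns should agree closely for every sample size, with both shrinking as n grows.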

Step 4: Plot Distributions

  • Create a 2x2 grid of plots so each sample size’s distribution is shown side by side for easy comparison.
  • The x-axis shows the sample means and the y-axis shows their density.
  • As the sample size increases, the histograms become more concentrated and bell-shaped.
Python
fig, axes = plt.subplots(2, 2, figsize=(10, 8))

for ax, means, size in zip(axes.flatten(), all_sample_means, sample_sizes):
    ax.hist(means, bins=20, density=True, alpha=0.75,
            color='green', edgecolor='black')
    ax.set_title(f"Sample size = {size}")
    ax.set_xlabel("Sample Mean")
    ax.set_ylabel("Density")
    ax.grid(True, linestyle='--', alpha=0.5)

plt.tight_layout()
plt.show()

Output:

[Figure: 2x2 grid of histograms of the sample means, one panel per sample size (1, 10, 50, 100)]

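The graphs show that as the sample size increases from 1 to 100, the histogram of sample means narrows around the population mean and takes on the bell shape of a normal distribution. With a sample size of 1, each "mean" is just a single draw, so the histogram simply reflects the roughly flat (uniform) population.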
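
The visual impression can also be backed by numbers. A normal distribution has skewness 0 and excess kurtosis 0, while a uniform distribution has excess kurtosis of about -1.2. The following is a minimal sketch, assuming the same population of integers from -40 to 39; the helper `skew_and_excess_kurtosis` and the dictionary `shape_stats` are hypothetical names introduced here, and the generator and seed are again arbitrary choices.

```python
import numpy as np

# Assumption: arbitrary seed; results vary slightly with other seeds.
rng = np.random.default_rng(1)


def skew_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) using simple moment estimates."""
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std()   # standardise
    skewness = float(np.mean(z ** 3))
    excess_kurtosis = float(np.mean(z ** 4) - 3.0)  # 0 for a normal
    return skewness, excess_kurtosis


shape_stats = {}
for size in [1, 10, 50, 100]:
    # 1,000 sample means, each from `size` uniform integer draws
    means = rng.integers(-40, 40, size=(1000, size)).mean(axis=1)
    shape_stats[size] = skew_and_excess_kurtosis(means)

    skewness, kurt = shape_stats[size]
    print(f"n={size:>3}  skewness={skewness:+.2f}  "
          f"excess kurtosis={kurt:+.2f}")
```

For a sample size of 1 the excess kurtosis should sit near the uniform value, and it should move toward 0 as the sample size grows, which matches the histograms becoming bell-shaped.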

