
Blind source separation using FastICA in Scikit Learn

Last Updated : 23 Jul, 2025

Independent Component Analysis (ICA) is a method for separating mixed signals into their original, statistically independent components. FastICA is a widely used, efficient algorithm for this problem, especially in Blind Source Separation, where the goal is to recover unknown source signals from observed mixtures. It is commonly applied in fields like audio processing, medical imaging and financial data analysis. FastICA comes in two variants:

  1. Deflation-based FastICA, where the components are found one by one.
  2. Symmetric FastICA, where the components are found simultaneously.

FastICA can also be run with different nonlinearity functions, and in the deflation-based version the order in which components are extracted can be optimized.
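As a rough illustration of these variants, the sketch below runs scikit-learn's `FastICA` with both values of its `algorithm` parameter (`"parallel"` is scikit-learn's name for the symmetric variant) and its three built-in nonlinearities. The toy sources and the 2×2 mixing matrix are made up for this example only.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Toy data (illustrative): two non-Gaussian sources mixed by a made-up 2x2 matrix
t = np.linspace(0, 8, 2000)
S_toy = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t))]
X_toy = S_toy @ np.array([[1.0, 0.5], [0.5, 2.0]]).T

recovered = {}
for algorithm in ("deflation", "parallel"):  # "parallel" = symmetric variant
    for fun in ("logcosh", "exp", "cube"):   # built-in nonlinearities
        ica = FastICA(n_components=2, algorithm=algorithm, fun=fun,
                      max_iter=1000, random_state=0)
        # Each run returns the estimated sources, shape (2000, 2)
        recovered[(algorithm, fun)] = ica.fit_transform(X_toy)
```

Some combinations may emit a convergence warning on particular data; that depends on the signals and is not specific to either variant.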

Blind Source Separation

Blind Source Separation (BSS) refers to the process of separating signals when:

  • The source signals are unknown.
  • The method of mixing is also unknown.

Even without knowing much about the signals or how they were mixed we can separate them using FastICA. This is useful in many areas like sound processing, medical diagnostics and more. 

Mathematical Explanation of FastICA Algorithm

Let there be n original source signals combined linearly into m observed mixed signals. The original signals (sources) are represented as a vector:

s = \left( s_1, s_2, \ldots, s_n \right)^T.

The observed mixed signals are:

x = \left( x_1, x_2, \ldots, x_m \right)^T

The mixing process is modeled as:

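\mathbf{x} = \mathbf{G}\mathbf{s}

where \mathbf{G} is an m×n matrix of mixing coefficients. The goal is to find an unmixing matrix \mathbf{U} such that:

\mathbf{y} = \mathbf{U}\mathbf{x}

where \mathbf{y} approximates the original independent sources \mathbf{s}, up to the order, sign and scale of the components, which ICA cannot determine.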
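The mixing model can be made concrete with a tiny NumPy sketch. The matrix `G` and the random sources below are made-up values for illustration. In real blind source separation `G` is unknown, so the inverse used here is not available and has to be estimated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two sources, five time samples; each column of s is one sample of the source vector
s = rng.uniform(-1, 1, size=(2, 5))

# Hypothetical mixing matrix (m = n = 2), chosen only for this example
G = np.array([[1.0, 0.5],
              [0.3, 2.0]])

x = G @ s             # observed mixtures, x = G s
U = np.linalg.inv(G)  # ideal unmixing matrix when G is known and square
y = U @ x             # y = U x recovers the sources exactly in this idealized case
```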

Step 1: Center the Data

Centering means making each observed signal zero-mean.

\tilde{\mathbf{x}} = \mathbf{x} - \mathbb{E}[\mathbf{x}]

where \mathbb{E}[\mathbf{x}] is the mean vector (mean of each signal).
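In NumPy, centering amounts to subtracting the per-signal mean. The sketch below assumes the signals are stored one per row. The data is random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three toy signals with different offsets, 1000 samples each (one signal per row)
x = rng.normal(loc=[[2.0], [-1.0], [0.5]], scale=1.0, size=(3, 1000))

x_mean = x.mean(axis=1, keepdims=True)  # estimate of E[x], shape (3, 1)
x_centered = x - x_mean                 # every row now has zero mean
```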

Step 2: Whiten the Data

Whitening removes correlations between components and sets their variances to 1, so the covariance matrix of the whitened data is the identity matrix. First compute the covariance matrix of the centered data:

\mathbf{C} = \mathbb{E}[\tilde{\mathbf{x}} \tilde{\mathbf{x}}^T]

Perform eigenvalue decomposition:

\mathbf{C} = \mathbf{E} \mathbf{D} \mathbf{E}^T

where:

  • \mathbf{E} is the matrix of eigenvectors,
  • \mathbf{D} is the diagonal matrix of eigenvalues.

The whitening matrix \mathbf{V} is:

\mathbf{V} = \mathbf{D}^{-\frac{1}{2}} \mathbf{E}^T

Apply whitening:

\mathbf{z} = \mathbf{V} \tilde{\mathbf{x}}


Now \mathbf{z} has covariance:

\mathbb{E}[\mathbf{z} \mathbf{z}^T] = \mathbf{I}

Whitening simplifies the problem: for whitened data the remaining unmixing matrix is orthogonal, so each unmixing vector \mathbf{w} only has to be searched for on the unit sphere.
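The whitening step can be sketched directly from these formulas. The mixed toy data below is an assumption made for illustration, and expectations are replaced by sample averages.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy correlated data: 3 signals (rows), 5000 samples, made by mixing uniform noise
A_toy = np.array([[1.0, 1.0, 1.0],
                  [0.5, 2.0, 1.0],
                  [1.5, 1.0, 2.0]])
x = A_toy @ rng.uniform(-1, 1, size=(3, 5000))

x_centered = x - x.mean(axis=1, keepdims=True)

# Sample covariance C = E[x x^T] and its eigendecomposition C = E D E^T
C = x_centered @ x_centered.T / x_centered.shape[1]
eigvals, E = np.linalg.eigh(C)

# Whitening matrix V = D^{-1/2} E^T and whitened data z = V x
V = np.diag(eigvals ** -0.5) @ E.T
z = V @ x_centered

cov_z = z @ z.T / z.shape[1]  # should be the identity up to rounding error
```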

Step 3: Estimate Independent Components Using Fixed-Point Iteration

The key insight in FastICA is to find vectors \mathbf{w} such that the projection \mathbf{w}^T \mathbf{z} is maximally non-Gaussian. Non-Gaussianity is measured through a nonlinear function g(\cdot), the derivative of a smooth non-quadratic contrast function such as \log\cosh(u), together with its derivative g'(\cdot). A common choice is:

g(u) = \tanh(u), \quad g'(u) = 1 - \tanh^2(u)

The iteration to update \mathbf{w} is:

\mathbf{w}^{\text{new}} = \mathbb{E}[\mathbf{z} g(\mathbf{w}^T \mathbf{z})] - \mathbb{E}[g'(\mathbf{w}^T \mathbf{z})] \mathbf{w}

Normalize:

\mathbf{w}^{\text{new}} \leftarrow \frac{\mathbf{w}^{\text{new}}}{\|\mathbf{w}^{\text{new}}\|}

Repeat until convergence.
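The one-unit update can be sketched in NumPy as follows. The helper names `whiten` and `fastica_one_unit`, the toy sources, the stopping tolerance and the convergence test (direction of \mathbf{w} unchanged up to sign) are assumptions made for this illustration, not part of any library.

```python
import numpy as np

def whiten(x):
    """Center and whiten x (one signal per row) via eigendecomposition."""
    x_centered = x - x.mean(axis=1, keepdims=True)
    C = x_centered @ x_centered.T / x_centered.shape[1]
    eigvals, E = np.linalg.eigh(C)
    return np.diag(eigvals ** -0.5) @ E.T @ x_centered

def fastica_one_unit(z, max_iter=200, tol=1e-8, seed=0):
    """Estimate one unmixing vector w from whitened data z using g = tanh."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=z.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(max_iter):
        u = w @ z                      # projections w^T z, one value per sample
        g = np.tanh(u)
        g_prime = 1.0 - g ** 2
        # w_new = E[z g(w^T z)] - E[g'(w^T z)] w, with sample means as expectations
        w_new = (z * g).mean(axis=1) - g_prime.mean() * w
        w_new /= np.linalg.norm(w_new)
        # Converged when the direction stops changing (the sign may flip)
        converged = abs(abs(w_new @ w) - 1.0) < tol
        w = w_new
        if converged:
            break
    return w

# Toy example: two non-Gaussian sources mixed by a made-up matrix
rng = np.random.default_rng(0)
s = np.vstack([rng.uniform(-1, 1, 5000), rng.laplace(0, 1, 5000)])
x = np.array([[1.0, 0.5], [0.5, 2.0]]) @ s

z = whiten(x)
w = fastica_one_unit(z)
y = w @ z  # one estimated independent component

# Absolute correlation of y with each true source; one of them should dominate
corr = [abs(np.corrcoef(y, s_i)[0, 1]) for s_i in s]
```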

Step 4: Deflation for Multiple Components

To find multiple independent components, estimate the vectors \mathbf{w}_1, \mathbf{w}_2, ..., \mathbf{w}_n one at a time. After each update of \mathbf{w}_p, orthogonalize it with respect to the previously found vectors so that it does not converge to the same component again:

\mathbf{w}_p \leftarrow \mathbf{w}_p - \sum_{j=1}^{p-1} (\mathbf{w}_p^T \mathbf{w}_j) \mathbf{w}_j

Normalize again:

\mathbf{w}_p \leftarrow \frac{\mathbf{w}_p}{\|\mathbf{w}_p\|}
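Putting the fixed-point update and the deflation step together gives the following sketch. It assumes whitened input and applies the Gram-Schmidt step after every update inside the loop. The function name `fastica_deflation`, the toy sources and the tolerance are illustrative choices, not a reference implementation.

```python
import numpy as np

def fastica_deflation(z, n_components, max_iter=200, tol=1e-8, seed=0):
    """Estimate the rows of W one by one from whitened data z (g = tanh)."""
    rng = np.random.default_rng(seed)
    W = np.zeros((n_components, z.shape[0]))
    for p in range(n_components):
        w = rng.normal(size=z.shape[0])
        w -= W[:p].T @ (W[:p] @ w)  # start orthogonal to earlier vectors
        w /= np.linalg.norm(w)
        for _ in range(max_iter):
            g = np.tanh(w @ z)
            w_new = (z * g).mean(axis=1) - (1.0 - g ** 2).mean() * w
            # Deflation: w_p <- w_p - sum_j (w_p^T w_j) w_j, then normalize
            w_new -= W[:p].T @ (W[:p] @ w_new)
            w_new /= np.linalg.norm(w_new)
            converged = abs(abs(w_new @ w) - 1.0) < tol
            w = w_new
            if converged:
                break
        W[p] = w
    return W

# Toy example: three non-Gaussian sources, made-up mixing matrix
rng = np.random.default_rng(0)
s = np.vstack([rng.uniform(-1, 1, 5000),
               rng.laplace(0, 1, 5000),
               rng.choice([-1.0, 1.0], 5000)])
x = np.array([[1.0, 1.0, 1.0], [0.5, 2.0, 1.0], [1.5, 1.0, 2.0]]) @ s

# Center and whiten as in Steps 1 and 2
xc = x - x.mean(axis=1, keepdims=True)
eigvals, E = np.linalg.eigh(xc @ xc.T / xc.shape[1])
z = np.diag(eigvals ** -0.5) @ E.T @ xc

W = fastica_deflation(z, n_components=3)
y = W @ z  # estimated components, one per row

# Absolute correlation between each true source (rows) and each estimate (columns)
corr = np.abs(np.corrcoef(s, y)[:3, 3:])
```

Because every new vector is projected away from the earlier ones, the rows of `W` come out orthonormal, which matches the orthogonality of the unmixing matrix for whitened data.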

Python Implementation of FastICA

Now let's implement it step by step:

Step 1: Import Required Libraries

We import NumPy for numerical computation, Matplotlib for plotting and the FastICA estimator from scikit-learn.

Python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import FastICA

Step 2: Generate Source Signals (Sine, Square, Noise)

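In this step we create the original signals that act as the sources we later want to separate, and mix them into the observed signals. ICA can separate the sources only if at most one of them is Gaussian, a condition this setup satisfies. The sources are basic signal types commonly used in signal processing:

  • s1: Smooth periodic signal (sine wave)
  • s2: Sharp square wave (strongly non-Gaussian)
  • s3: Random Gaussian noise
  • S: Shape = (2000, 3), each column is one independent source
  • A: 3×3 mixing matrix
  • X: Observed (mixed) signals, computed as S multiplied by the transpose of A

Python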
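np.random.seed(0)  # make the noise source reproducible
n_samples = 2000
time = np.linspace(0, 8, n_samples)

# Three independent source signals
s1 = np.sin(2 * time)  # sine wave
s2 = np.sign(np.sin(3 * time))  # square wave
s3 = np.random.normal(0, 1, n_samples)  # Gaussian noise

S = np.c_[s1, s2, s3]  # sources, shape (2000, 3)

# Mixing matrix: each row holds the weights of one observed signal
A = np.array([
    [1, 1, 1],
    [0.5, 2, 1.0],
    [1.5, 1.0, 2.0]
])

X = np.dot(S, A.T)  # observed mixtures, shape (2000, 3)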

Step 3: Apply FastICA to Recover the Signals

Now we fit the ICA model using FastICA and use it to estimate the sources and the mixing matrix.

  • fit_transform(X) performs centering, whitening and fixed-point iteration
  • S_estimated: Approximates the original signals (shape: 2000 x 3)
  • A_estimated: Estimated mixing matrix (shape: 3 x 3)

Python
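# Fit FastICA; random_state fixes the random initialization of the unmixing matrix
ica = FastICA(n_components=3, random_state=0)
S_estimated = ica.fit_transform(X)  # estimated sources, shape (2000, 3)
A_estimated = ica.mixing_  # estimated mixing matrix, shape (3, 3)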
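One way to sanity-check the fitted model is to verify that the estimated sources and mixing matrix reproduce the observed data once the mean removed during centering (`ica.mean_`) is added back. The sketch below rebuilds the same signals so that it runs on its own. The variable names `X_reconstructed` and `reconstruction_ok` are illustrative.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Rebuild the sources and mixtures from Step 2
rng = np.random.RandomState(0)
t = np.linspace(0, 8, 2000)
S = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t)), rng.normal(0, 1, 2000)]
A = np.array([[1, 1, 1], [0.5, 2, 1.0], [1.5, 1.0, 2.0]])
X = S @ A.T

ica = FastICA(n_components=3, random_state=0)
S_estimated = ica.fit_transform(X)
A_estimated = ica.mixing_

# Undo the unmixing: X should equal S_estimated A_estimated^T plus the removed mean
X_reconstructed = S_estimated @ A_estimated.T + ica.mean_
reconstruction_ok = np.allclose(X, X_reconstructed)
```

A successful reconstruction only shows that the model is consistent with the data. It does not by itself prove that the estimated components equal the true sources.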

Step 4: Plot the Results (Original, Mixed, Recovered)

Now we plot the original, mixed and recovered signals to see how well ICA performs blind source separation.

Python
plt.figure(figsize=(12, 8))

plt.subplot(3, 1, 1)
plt.title("Original Source Signals")
plt.plot(S)
plt.xlabel("Samples")

plt.subplot(3, 1, 2)
plt.title("Mixed Signals (Observed)")
plt.plot(X)
plt.xlabel("Samples")

plt.subplot(3, 1, 3)
plt.title("Recovered Signals (After ICA)")
plt.plot(S_estimated)
plt.xlabel("Samples")

plt.tight_layout()
plt.show()

Output:

Figure: Blind source separation (original source signals, mixed signals and signals recovered by ICA)

The output shows three stages of signal processing.

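  • In the first plot we can see the original source signals, i.e. a smooth sine wave, a square-shaped signal and some random noise.
  • The second plot shows them mixed together, making it hard to tell them apart.
  • The third plot shows the signals separated again using FastICA. Their shapes closely match the originals, although ICA cannot recover the original order, sign or amplitude of the sources, so the recovered signals may appear permuted, flipped or rescaled.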
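Since the recovered components come back in arbitrary order and with arbitrary sign, a simple way to compare them with the true sources is through their correlations. The sketch below rebuilds the data so that it runs on its own. The matching rule (take the most correlated component for each source) and the names `match`, `signs` and `S_aligned` are assumptions made for this example.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Rebuild the sources and mixtures from Step 2
rng = np.random.RandomState(0)
t = np.linspace(0, 8, 2000)
S = np.c_[np.sin(2 * t), np.sign(np.sin(3 * t)), rng.normal(0, 1, 2000)]
A = np.array([[1, 1, 1], [0.5, 2, 1.0], [1.5, 1.0, 2.0]])
X = S @ A.T

S_estimated = FastICA(n_components=3, random_state=0).fit_transform(X)

# Correlation of every true source (rows) with every recovered component (columns)
corr = np.corrcoef(S.T, S_estimated.T)[:3, 3:]

# For each source, pick the most strongly correlated component and its sign
match = np.abs(corr).argmax(axis=1)
signs = np.sign(corr[np.arange(3), match])

# Reorder and flip the recovered components so they line up with the sources
S_aligned = S_estimated[:, match] * signs
```

Plotting `S_aligned` against `S` column by column then gives a like-for-like visual comparison. The amplitudes may still differ because the scale of each component is not identifiable.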
