Problem Solving on Scatter Matrix
Last Updated: 13 Jun, 2024
A scatter matrix, also known as a pair plot, is a powerful visualization tool in data analysis. It provides a grid of scatter plots that display relationships between pairs of variables in a dataset, helping engineers and data scientists to identify patterns, correlations, and potential outliers.
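In classification problems, the term scatter matrix also refers to matrices that measure how the data points spread around their means. In this worked example we calculate SW (the within-class scatter matrix) and SB (the between-class scatter matrix) for two classes of data points.
SW : measures the variability inside each class (within-class scatter). For well-separated classes it should be small.
SB : measures the variability between the class means (between-class scatter). For well-separated classes it should be large.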
Figure: scatter plot of the points
X1 = (y1, y2) ={ (2,2), (1,2), (1,2), (1,2), (2,2) }
X2 = (y1,y2) ={ (9, 10), (6,8), (9,5), (8,7), (10,8) }
Within class scatter matrix:
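S_W = \sum_{i=1}^{c}S_i \\ S_i = \frac{1}{n_i}\sum_{x\in D_i} (x-m_i)(x-m_i)^{T}
Here c is the number of classes, D_i is the set of points in class i, and n_i is the number of points in class i.
Si is the class-specific covariance matrix. Many texts define the scatter matrix without the 1/n_i factor; this example keeps it, so each Si is an average of outer products.
mi is the mean of class i.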
Mean Calculation
We calculate the mean of the points in each class, coordinate by coordinate. The mean is the sum of the observations divided by the number of observations. We need it to compute the covariance matrix of each class.
m_1 = [\frac{2+1+1+1+2}{5} , \frac{2+2+2+2+2}{5} ] \\ = [1.4,2]
m_2 = [\frac{9+6+9+8+10}{5} , \frac{10+8+5+7+8}{5} ] \\ = [8.4,7.6]
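The mean calculation above can be reproduced with a short Python sketch. The helper name `class_mean` is an illustrative choice, not something defined in the article.

```python
# Data points from the example; each tuple is (y1, y2).
X1 = [(2, 2), (1, 2), (1, 2), (1, 2), (2, 2)]
X2 = [(9, 10), (6, 8), (9, 5), (8, 7), (10, 8)]

def class_mean(points):
    """Return the coordinate-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(2)]

m1 = class_mean(X1)
m2 = class_mean(X2)
print(m1)  # approximately [1.4, 2.0]
print(m2)  # approximately [8.4, 7.6]
```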
Covariance matrix computation:
We subtract the class mean from each observation, multiply each resulting deviation vector by its own transpose (an outer product), and then average the resulting matrices.
Class specific covariance for the first class :
(X1-m_1) = \begin{bmatrix} 0.6 & -0.4 & -0.4 & -0.4 & 0.6\\ 0 & 0 & 0& 0& 0 \end{bmatrix}
1) \begin{bmatrix} 0.6 \\ 0 \end{bmatrix} * \begin{bmatrix} 0.6 &0 \end{bmatrix} = \begin{bmatrix} 0.36 &0 \\ 0 &0 \end{bmatrix} \\\\ 2) \begin{bmatrix} -0.4 \\ 0 \end{bmatrix} * \begin{bmatrix} -0.4 &0 \end{bmatrix} = \begin{bmatrix} 0.16 &0 \\ 0 &0 \end{bmatrix} \\\\ 3) \begin{bmatrix} -0.4 \\ 0 \end{bmatrix} * \begin{bmatrix} -0.4 &0 \end{bmatrix} = \begin{bmatrix} 0.16 &0 \\ 0 &0 \end{bmatrix} \\\\ 4) \begin{bmatrix} -0.4 \\ 0 \end{bmatrix} * \begin{bmatrix} -0.4 &0 \end{bmatrix} = \begin{bmatrix} 0.16 &0 \\ 0 &0 \end{bmatrix} \\\\ 5) \begin{bmatrix} 0.6 \\ 0 \end{bmatrix} * \begin{bmatrix} 0.6 &0 \end{bmatrix} = \begin{bmatrix} 0.36 &0 \\ 0 &0 \end{bmatrix} \\\\
Averaging the matrices from 1, 2, 3, 4 and 5.
For each entry of S1, we sum the corresponding entries of the five matrices and divide by the number of observations, which is 5 in this computation.
matrix_{00} = \frac{ 0.36+0.16+0.16+0.16+0.36}{5}\\ =\frac{1.2}{5}\\ = 0.24\\\\ matrix_{01} = \frac{ 0+0+0+0+0}{5}\\ = 0\\\\ matrix_{10} = \frac{ 0+0+0+0+0}{5}\\ = 0\\\\ matrix_{11} = \frac{ 0+0+0+0+0}{5}\\ = 0\\\\
Therefore S1 is :
S_1 = \begin{bmatrix} 0.24 &0 \\ 0 &0 \end{bmatrix}
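As a check on the hand computation of S1, here is a minimal sketch that averages the outer products of the deviations, dividing by the number of observations as the text does. The function name `class_scatter` is hypothetical.

```python
X1 = [(2, 2), (1, 2), (1, 2), (1, 2), (2, 2)]

def class_scatter(points):
    """Average of the outer products (x - m)(x - m)^T over one class."""
    n = len(points)
    dim = len(points[0])
    mean = [sum(p[k] for p in points) / n for k in range(dim)]
    scatter = [[0.0] * dim for _ in range(dim)]
    for p in points:
        d = [p[k] - mean[k] for k in range(dim)]  # deviation from the class mean
        for r in range(dim):
            for c in range(dim):
                scatter[r][c] += d[r] * d[c] / n  # add this point's share of the average
    return scatter

S1 = class_scatter(X1)
print([[round(v, 2) for v in row] for row in S1])  # [[0.24, 0.0], [0.0, 0.0]]
```

The second row and column are zero because every point in the first class has y2 = 2, so there is no variation along that coordinate.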
Class specific covariance for the second class :
(X2-m_2) = \begin{bmatrix} 0.6 & -2.4 & 0.6 & -0.4 & 1.6\\ 2.4 & 0.4 & -2.6& -0.6& 0.4 \end{bmatrix}
1) \begin{bmatrix} 0.6 \\ 2.4 \end{bmatrix} * \begin{bmatrix} 0.6 &2.4 \end{bmatrix} = \begin{bmatrix} 0.36 &1.44 \\ 1.44 &5.76 \end{bmatrix} \\\\ 2) \begin{bmatrix} -2.4 \\ 0.4 \end{bmatrix} * \begin{bmatrix} -2.4 &0.4 \end{bmatrix} = \begin{bmatrix} 5.76 &-0.96 \\ -0.96 &0.16 \end{bmatrix} \\\\ 3) \begin{bmatrix} 0.6 \\ -2.6 \end{bmatrix} * \begin{bmatrix} 0.6 &-2.6 \end{bmatrix} = \begin{bmatrix} 0.36 &-1.56 \\ -1.56 &6.76 \end{bmatrix} \\\\ 4) \begin{bmatrix} -0.4 \\ -0.6 \end{bmatrix} * \begin{bmatrix} -0.4 &-0.6 \end{bmatrix} = \begin{bmatrix} 0.16 &0.24 \\ 0.24 &0.36 \end{bmatrix} \\\\ 5) \begin{bmatrix} 1.6 \\ 0.4 \end{bmatrix} * \begin{bmatrix} 1.6 &0.4 \end{bmatrix} = \begin{bmatrix} 2.56 &0.64 \\ 0.64 &0.16 \end{bmatrix} \\\\
Averaging the matrices from 1, 2, 3, 4 and 5.
For each entry of S2, we sum the corresponding entries of the five matrices and divide by the number of observations, which is 5 in this computation.
matrix_{00} = \frac{ 0.36+5.76+0.36+0.16+2.56}{5}\\ =\frac{9.2}{5}\\ = 1.84\\\\ matrix_{01} = \frac{ 1.44-0.96-1.56+0.24+0.64}{5}\\ =\frac{-0.2}{5}\\ = -0.04\\\\ matrix_{10} = \frac{ 1.44-0.96-1.56+0.24+0.64}{5}\\ =\frac{-0.2}{5}\\ = -0.04\\\\ matrix_{11} = \frac{ 5.76+0.16+6.76+0.36+0.16}{5}\\ =\frac{13.2}{5}\\ = 2.64\\\\
Therefore S2 is :
S_2 = \begin{bmatrix} 1.84 &-0.04 \\ -0.04 &2.64 \end{bmatrix}
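The entries of S2 can be cross-checked by a different route. Because the example divides by the number of observations, the diagonal entries should equal the population variance of each coordinate, and the off-diagonal entry should equal the population covariance. The sketch below uses `statistics.pvariance` from the Python standard library; the variable names are illustrative.

```python
from statistics import pvariance

X2 = [(9, 10), (6, 8), (9, 5), (8, 7), (10, 8)]
y1 = [p[0] for p in X2]
y2 = [p[1] for p in X2]
n = len(X2)

# Diagonal entries: population variance of each coordinate (divides by n).
var_y1 = pvariance(y1)
var_y2 = pvariance(y2)

# Off-diagonal entry: population covariance, mean(y1*y2) - mean(y1)*mean(y2).
cov_y1y2 = sum(a * b for a, b in zip(y1, y2)) / n - (sum(y1) / n) * (sum(y2) / n)

print(round(var_y1, 2), round(cov_y1y2, 2), round(var_y2, 2))  # 1.84 -0.04 2.64
```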
Within class scatter matrix Sw :
SW = S1 + S2
S_W= \begin{bmatrix} 0.24 &0 \\ 0 &0 \end{bmatrix} + \begin{bmatrix} 1.84 &-0.04 \\ -0.04 &2.64 \end{bmatrix} \\\\ =\begin{bmatrix} 2.08 &-0.04 \\ -0.04 &2.64 \end{bmatrix}
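The whole within-class computation can be sketched end to end as follows, assuming the same divide-by-n convention used in this example. The helper `class_scatter` is a hypothetical name.

```python
X1 = [(2, 2), (1, 2), (1, 2), (1, 2), (2, 2)]
X2 = [(9, 10), (6, 8), (9, 5), (8, 7), (10, 8)]

def class_scatter(points):
    """Class-specific covariance: mean of (x - m)(x - m)^T, dividing by n."""
    n = len(points)
    m = [sum(p[k] for p in points) / n for k in range(2)]
    devs = [[p[0] - m[0], p[1] - m[1]] for p in points]
    return [[sum(d[r] * d[c] for d in devs) / n for c in range(2)]
            for r in range(2)]

S1 = class_scatter(X1)
S2 = class_scatter(X2)

# The within-class scatter matrix is the entry-wise sum over the classes.
S_W = [[S1[r][c] + S2[r][c] for c in range(2)] for r in range(2)]
print([[round(v, 2) for v in row] for row in S_W])  # [[2.08, -0.04], [-0.04, 2.64]]
```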
Between class scatter matrix SB :
S_B = (m_1 - m_2) * (m_1 - m_2)^{T}\\\\ S_B = \begin{bmatrix} -7 \\ -5.6 \end{bmatrix} * \begin{bmatrix} -7 & -5.6 \end{bmatrix} \\\\ =\begin{bmatrix} 49 &39.2 \\ 39.2 &31.36 \end{bmatrix}
Total scatter matrix :
ST = SB + SW
S_T =\begin{bmatrix} 49 &39.2 \\ 39.2 &31.36 \end{bmatrix} + \begin{bmatrix} 2.08 &-0.04 \\ -0.04 &2.64 \end{bmatrix} \\ =\begin{bmatrix} 51.08 & 39.16 \\ 39.16 &34 \end{bmatrix}
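The between-class and total scatter matrices can be checked with a few lines of Python. This sketch starts from the class means and the within-class matrix computed earlier in the example. The off-diagonal entry of the outer product is 7 × 5.6 = 39.2.

```python
m1 = [1.4, 2.0]
m2 = [8.4, 7.6]
S_W = [[2.08, -0.04], [-0.04, 2.64]]

d = [a - b for a, b in zip(m1, m2)]        # m1 - m2, approximately [-7.0, -5.6]
S_B = [[di * dj for dj in d] for di in d]  # outer product: column vector times row vector

# Total scatter as defined in this example: S_T = S_B + S_W.
S_T = [[b + w for b, w in zip(row_b, row_w)] for row_b, row_w in zip(S_B, S_W)]

print([[round(v, 2) for v in row] for row in S_B])  # [[49.0, 39.2], [39.2, 31.36]]
print([[round(v, 2) for v in row] for row in S_T])  # [[51.08, 39.16], [39.16, 34.0]]
```

Note that the sign of m1 - m2 does not matter here, since the outer product of a vector with itself is unchanged when the vector is negated.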
We have now calculated the within-class, between-class and total scatter matrices for the given data points.
These matrices are used in feature extraction, where the goal is to find a projection of the points that increases the distance between the classes while decreasing the spread of the points within each class. The projection maps the data to the required number of dimensions.
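The article does not name a specific projection method. As one illustration, the sketch below assumes Fisher's linear discriminant for two classes, where the projection direction is w = SW^{-1}(m2 - m1). It uses the means and within-class matrix from this example, and the names `w` and `project` are hypothetical.

```python
X1 = [(2, 2), (1, 2), (1, 2), (1, 2), (2, 2)]
X2 = [(9, 10), (6, 8), (9, 5), (8, 7), (10, 8)]
m1 = [1.4, 2.0]
m2 = [8.4, 7.6]
S_W = [[2.08, -0.04], [-0.04, 2.64]]

# Invert the 2x2 matrix S_W in closed form.
(a, b), (c, d) = S_W
det = a * d - b * c
S_W_inv = [[d / det, -b / det], [-c / det, a / det]]

# Assumed method: Fisher direction w = S_W^{-1} (m2 - m1).
diff = [m2[0] - m1[0], m2[1] - m1[1]]
w = [S_W_inv[r][0] * diff[0] + S_W_inv[r][1] * diff[1] for r in range(2)]

def project(p):
    """Project a 2-D point onto the 1-D axis defined by w."""
    return w[0] * p[0] + w[1] * p[1]

z1 = [project(p) for p in X1]
z2 = [project(p) for p in X2]
print(max(z1) < min(z2))  # True: the two classes do not overlap on this axis
```

Projecting onto a single direction reduces the two-dimensional points to one dimension while keeping the classes apart, which is the behaviour the paragraph above describes.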
Conclusion - Scatter Matrix
Scatter matrices are invaluable tools in engineering mathematics and data science, facilitating the exploration and analysis of complex datasets. They aid in identifying correlations, detecting outliers, and selecting features, ultimately enhancing the data modeling process.