
Statistical Methods Available for a NumPy Array
This article walks through several of the statistical functions that the NumPy library provides in Python.
Statistics deals with collecting and analyzing data: it covers methods for collecting samples, describing data, and drawing conclusions from it. NumPy is the core package for scientific computing in Python, so statistical functions are a natural part of it.
NumPy offers a number of functions for statistical data analysis. A few of them are discussed below.
numpy.amin() and numpy.amax()
These functions return the minimum and the maximum of the elements in the given array. If an axis is specified, the result is computed along that axis; otherwise the whole array is considered.
Example
    import numpy as np

    # input array
    inputArray = np.array([[2, 6, 3], [1, 5, 4], [8, 12, 9]])
    print('Input Array is:')
    print(inputArray)
    # printing a blank line
    print()
    print("Minimum element in the array:", np.amin(inputArray))
    print()
    print("Maximum element in the array:", np.amax(inputArray))
    print()
    # axis 0 runs down the rows, so the result has one value per column
    print('Minimum along axis 0 (one value per column):')
    print(np.amin(inputArray, 0))
    print()
    # axis 1 runs across the columns, so the result has one value per row
    print('Minimum along axis 1 (one value per row):')
    print(np.amin(inputArray, 1))
    print()
    print('Maximum along axis 0 (one value per column):')
    print(np.amax(inputArray, 0))
    print()
    print('Maximum along axis 1 (one value per row):')
    print(np.amax(inputArray, axis=1))

Output
On executing, the above program will generate the following output:

    Input Array is:
    [[ 2  6  3]
     [ 1  5  4]
     [ 8 12  9]]

    Minimum element in the array: 1

    Maximum element in the array: 12

    Minimum along axis 0 (one value per column):
    [1 5 3]

    Minimum along axis 1 (one value per row):
    [2 1 8]

    Maximum along axis 0 (one value per column):
    [ 8 12  9]

    Maximum along axis 1 (one value per row):
    [ 6  5 12]
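The axis argument is easy to misread, so the following minimal sketch restates it on the same 3x3 array. The variable names are illustrative only; the keepdims argument is a standard NumPy option that keeps the reduced axis with length 1.

```python
import numpy as np

# Same 3x3 array as in the example above
arr = np.array([[2, 6, 3], [1, 5, 4], [8, 12, 9]])

# axis=0 collapses the rows: one minimum per column
col_min = np.amin(arr, axis=0)

# axis=1 collapses the columns: one minimum per row
row_min = np.amin(arr, axis=1)

# keepdims=True keeps the reduced axis, giving a (3, 1) column of row maxima
row_max_2d = np.amax(arr, axis=1, keepdims=True)

print(col_min)           # [1 5 3]
print(row_min)           # [2 1 8]
print(row_max_2d.shape)  # (3, 1)
```

Keeping the reduced axis is handy when the result has to be broadcast back against the original array, for example to scale each row by its own maximum.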
numpy.ptp()
The numpy.ptp() function returns the range (maximum - minimum) of values along an axis. The name ptp is an abbreviation for peak-to-peak.
Example
    import numpy as np

    # input array
    inputArray = np.array([[2, 6, 3], [1, 5, 4], [8, 12, 9]])
    print('Input Array is:')
    print(inputArray)
    print()
    print('Peak-to-peak (ptp) value of the whole array:')
    print(np.ptp(inputArray))
    print()
    print('Range (maximum - minimum) along axis 1 (one value per row):')
    print(np.ptp(inputArray, axis=1))
    print()
    print('Range (maximum - minimum) along axis 0 (one value per column):')
    print(np.ptp(inputArray, axis=0))

Output
On executing, the above program will generate the following output:

    Input Array is:
    [[ 2  6  3]
     [ 1  5  4]
     [ 8 12  9]]

    Peak-to-peak (ptp) value of the whole array:
    11

    Range (maximum - minimum) along axis 1 (one value per row):
    [4 4 4]

    Range (maximum - minimum) along axis 0 (one value per column):
    [7 7 6]
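As a quick check of the definition, the peak-to-peak value can be compared with an explicit maximum minus minimum. This is a small sketch with illustrative variable names, not part of the original example.

```python
import numpy as np

arr = np.array([[2, 6, 3], [1, 5, 4], [8, 12, 9]])

# Range of each row, computed two ways
ptp_rows = np.ptp(arr, axis=1)
manual_rows = np.amax(arr, axis=1) - np.amin(arr, axis=1)

print(ptp_rows)                               # [4 4 4]
print(manual_rows)                            # [4 4 4]
print(np.array_equal(ptp_rows, manual_rows))  # True
```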
numpy.percentile()
A percentile (or centile) is a measure used in statistics that indicates the value below which a given percentage of the observations in a group fall.
The numpy.percentile() function computes the q-th percentile of the data along the given axis.
Syntax
numpy.percentile(a, q, axis)
Parameters
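Parameter | Description |
a | Input array |
q | The percentile (or sequence of percentiles) to compute; each value must be between 0 and 100 inclusive |
axis | The axis along which the percentile is calculated; if omitted, the array is flattened first |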
Example
    import numpy as np

    # input array
    inputArray = np.array([[20, 45, 70], [30, 25, 50], [10, 80, 90]])
    print('Input Array is:')
    print(inputArray)
    print()
    print('10th percentile of the whole array:')
    print(np.percentile(inputArray, 10))
    print()
    print('10th percentile along axis 1 (one value per row):')
    print(np.percentile(inputArray, 10, axis=1))
    print()
    print('10th percentile along axis 0 (one value per column):')
    print(np.percentile(inputArray, 10, axis=0))

Output
On executing, the above program will generate the following output:

    Input Array is:
    [[20 45 70]
     [30 25 50]
     [10 80 90]]

    10th percentile of the whole array:
    18.0

    10th percentile along axis 1 (one value per row):
    [25. 26. 24.]

    10th percentile along axis 0 (one value per column):
    [12. 29. 54.]
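The value 18.0 for the whole array is not an element of the array; it comes from interpolating between sorted neighbours. The sketch below reproduces it by hand, assuming NumPy's default linear interpolation, and also shows that q may be a sequence. Variable names such as manual_p10 are illustrative.

```python
import numpy as np

data = np.array([[20, 45, 70], [30, 25, 50], [10, 80, 90]])

# q may be a sequence: 25th, 50th and 75th percentiles in one call
quartiles = np.percentile(data, [25, 50, 75])
print(quartiles)  # [25. 45. 70.]

# Manual 10th percentile with linear interpolation between sorted neighbours
flat = np.sort(data, axis=None)        # flattened and sorted: 10, 20, 25, ...
pos = 0.10 * (flat.size - 1)           # fractional position in the sorted data
lower = int(np.floor(pos))             # index of the neighbour below
frac = pos - lower                     # how far to move toward the neighbour above
manual_p10 = flat[lower] + frac * (flat[lower + 1] - flat[lower])

print(np.isclose(manual_p10, np.percentile(data, 10)))  # True
```

The 50th percentile in the first call is the same value that numpy.median() returns for this array.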
numpy.median()
Median is defined as the value separating the higher half of a data sample from the lower half.
The numpy.median() function calculates the median of a one-dimensional or multi-dimensional array, optionally along a given axis.
Example
    import numpy as np

    # input array
    inputArray = np.array([[20, 45, 70], [30, 25, 50], [10, 80, 90]])
    print('Input Array is:')
    print(inputArray)
    print()
    # printing the median of the whole array
    print('Median of the array:')
    print(np.median(inputArray))
    print()
    print('Median along axis 0 (one value per column):')
    print(np.median(inputArray, axis=0))
    print()
    print('Median along axis 1 (one value per row):')
    print(np.median(inputArray, axis=1))

Output
On executing, the above program will generate the following output:

    Input Array is:
    [[20 45 70]
     [30 25 50]
     [10 80 90]]

    Median of the array:
    45.0

    Median along axis 0 (one value per column):
    [20. 45. 70.]

    Median along axis 1 (one value per row):
    [45. 30. 80.]
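Every slice in the example above has an odd number of elements, so the median is always one of the input values. With an even number of elements there is no single middle value, and numpy.median() returns the mean of the two middle values. A small sketch with made-up data:

```python
import numpy as np

odd = np.array([10, 30, 20])       # sorted: 10, 20, 30 -> middle value is 20
even = np.array([10, 40, 20, 30])  # sorted: 10, 20, 30, 40 -> middle pair is 20, 30

print(np.median(odd))   # 20.0
print(np.median(even))  # 25.0
```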
numpy.mean()
Arithmetic mean is the sum of elements along an axis divided by the number of elements.
The numpy.mean() function returns the arithmetic mean of the elements in the array. If an axis is specified, the mean is calculated along it.
Example
    import numpy as np

    # input array
    inputArray = np.array([[20, 45, 70], [30, 25, 50], [10, 80, 90]])
    print('Input Array is:')
    print(inputArray)
    print()
    # printing the mean of the whole array
    print('Mean of the array:')
    print(np.mean(inputArray))
    print()
    print('Mean along axis 0 (one value per column):')
    print(np.mean(inputArray, axis=0))
    print()
    print('Mean along axis 1 (one value per row):')
    print(np.mean(inputArray, axis=1))

Output
On executing, the above program will generate the following output:

    Input Array is:
    [[20 45 70]
     [30 25 50]
     [10 80 90]]

    Mean of the array:
    46.666666666666664

    Mean along axis 0 (one value per column):
    [20. 50. 70.]

    Mean along axis 1 (one value per row):
    [45. 35. 60.]
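The definition "sum of elements along an axis divided by the number of elements" can be written out directly. The following sketch, with an illustrative variable name, does this for axis 0 and compares it with numpy.mean().

```python
import numpy as np

data = np.array([[20, 45, 70], [30, 25, 50], [10, 80, 90]])

# Sum down each column, then divide by the number of rows
manual_col_mean = data.sum(axis=0) / data.shape[0]

print(manual_col_mean)        # [20. 50. 70.]
print(np.mean(data, axis=0))  # [20. 50. 70.]
```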
numpy.average()
The numpy.average() function computes the weighted average of an array along the given axis, using weights supplied in a second array. If no weights are given, all elements are weighted equally and the result equals the arithmetic mean.
The function accepts an axis parameter. If the axis is not specified, the array is flattened.
Example
    import numpy as np

    # input array
    inputArray = np.array([1, 2, 3, 4])
    print('Input Array is:')
    print(inputArray)
    print()
    # printing the average of all elements in the array
    print('Average of all elements in an array:')
    print(np.average(inputArray))
    print()

Output
On executing, the above program will generate the following output:

    Input Array is:
    [1 2 3 4]

    Average of all elements in an array:
    2.5
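The example above passes no weights, so it behaves like numpy.mean(). The sketch below supplies a weights array to show the weighted case; the weights are made up for illustration and are not from the original article.

```python
import numpy as np

values = np.array([1, 2, 3, 4])
weights = np.array([4, 3, 2, 1])  # hypothetical weights, chosen for illustration

# Weighted average: sum(values * weights) / sum(weights)
weighted = np.average(values, weights=weights)
manual = np.sum(values * weights) / np.sum(weights)

print(weighted)  # 2.0
print(manual)    # 2.0
```

Because the larger weights sit on the smaller values, the weighted average (2.0) is lower than the unweighted one (2.5).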
Standard Deviation & Variance
Standard deviation
Standard deviation is the square root of the average of the squared deviations from the mean. The formula for standard deviation is as follows:
std = sqrt(mean(abs(x - x.mean())**2))
If the array is [1, 2, 3, 4], then its mean is 2.5. The squared deviations are therefore [2.25, 0.25, 0.25, 2.25]; their sum is 5, so their mean is 5/4 = 1.25, and the standard deviation is sqrt(5/4) = 1.118033988749895.
Variance
Variance is the average of squared deviations, i.e., mean(abs(x - x.mean())**2). In other words, the standard deviation is the square root of variance.
Example
    import numpy as np

    # input array (a plain Python list; NumPy converts it internally)
    inputArray = [1, 2, 3, 4]
    print("Input Array =", inputArray)
    # printing the standard deviation of the array
    print("Standard deviation of array =", np.std(inputArray))
    # printing the variance of the array
    print("Variance of array =", np.var(inputArray))

Output
On executing, the above program will generate the following output:

    Input Array = [1, 2, 3, 4]
    Standard deviation of array = 1.118033988749895
    Variance of array = 1.25
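The formulas above can be followed step by step in code. The sketch below recomputes the variance and standard deviation of [1, 2, 3, 4] by hand and compares them with numpy.var() and numpy.std(). It also shows the ddof argument, which is not covered in the text above: ddof=1 divides by N - 1 instead of N, which gives the sample variance.

```python
import numpy as np

x = np.array([1, 2, 3, 4])

deviations = x - x.mean()    # [-1.5 -0.5  0.5  1.5]
squared = deviations ** 2    # [2.25 0.25 0.25 2.25]
variance = squared.mean()    # 5 / 4 = 1.25
std = np.sqrt(variance)      # sqrt(1.25)

print(np.isclose(variance, np.var(x)))  # True
print(np.isclose(std, np.std(x)))       # True

# ddof=1 divides the sum of squared deviations by N - 1 (here 5 / 3)
sample_var = np.var(x, ddof=1)
print(np.isclose(sample_var, 5 / 3))    # True
```

By default both functions use ddof=0, which is why the example above prints 1.25 rather than 5/3.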
Conclusion
This article used worked examples to cover several of the statistical functions available for NumPy arrays.