
Python For DataScience

The document provides an overview of Python programming basics, including setting up the working directory, file handling, data types, operators, and functions. It also covers data structures such as lists, tuples, sets, and dictionaries, along with their associated functions, as well as an introduction to the Pandas library for data manipulation. Additionally, it discusses Numpy for numerical computing and Matplotlib for data visualization, along with examples of regression and classification in data analysis.

Uploaded by

Mahesh Kumar


Unit I Basics of Python 10

Introduction – Setting working directory – Creating and saving, File execution, clearing
console, removing variables from environment, clearing environment – variable creation –
Operators – Data types and its associated operations – sequence data types – conditions and
branching – Functions-Virtual Environments

Introduction to Python

Python is a versatile and popular programming language known for its simplicity and
readability. It is widely used in various fields, including web development, data analysis,
artificial intelligence, and scientific computing.

Setting Working Directory

The working directory is the folder where Python scripts are executed and where files are
read from or written to. To set the working directory:

Coding
import os
# To get the current working directory
print(os.getcwd())
# To change the working directory
os.chdir('path_to_directory')

Creating and Saving Files

Creating and saving files in Python can be done using the open() function:

# Create and write to a file
with open('example.txt', 'w') as file:
    file.write("Hello, World!")

# Read from a file
with open('example.txt', 'r') as file:
    content = file.read()
    print(content)

File Execution

To execute a Python file from the console or terminal:

python filename.py

Clearing Console
Clearing the console can be done using a system call:

import os

# Clear console (Windows)

os.system('cls')

# Clear console (Unix/Linux/MacOS)

os.system('clear')
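
The two calls above are alternatives, not a sequence: only one of them applies on a given machine. A minimal sketch of choosing between them at run time, assuming os.name reports 'nt' on Windows (the helper names clear_command and clear_console are hypothetical, introduced only for this illustration):

```python
import os


def clear_command() -> str:
    """Return the shell command that clears the console on this platform."""
    # os.name is 'nt' on Windows and 'posix' on Linux and macOS.
    return 'cls' if os.name == 'nt' else 'clear'


def clear_console() -> None:
    """Clear the console by running the platform-specific command."""
    os.system(clear_command())
```

Calling clear_console() then works unchanged on either family of systems.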

Removing Variables from Environment

To remove a variable from the environment:

# Create a variable
x = 10

# Delete the variable

del x

Clearing Environment

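Python has no built-in command for clearing the entire environment. Calling globals().clear()
also removes entries such as __name__ and __builtins__, so it is safer to delete only the
user-defined names in the global scope (in IPython or Jupyter, the %reset magic does this):

# Delete all user-defined names in the global scope
for name in list(globals()):
    if not name.startswith('_'):
        del globals()[name]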

Variable Creation

Variables in Python are created by simply assigning a value to a name:

x = 5 # Integer
y = 3.14 # Float
name = "John" # String

Operators

Python supports various operators:

• Arithmetic Operators: +, -, *, /, %, //, **
• Comparison Operators: ==, !=, >, <, >=, <=
• Logical Operators: and, or, not
• Assignment Operators: =, +=, -=, *=, /=, %=, //=, **=
• Bitwise Operators: &, |, ^, ~, <<, >>
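
A short sketch of how a few of these operators behave; the values 17 and 5 are arbitrary and chosen only for illustration:

```python
a, b = 17, 5

quotient = a / b       # true division always returns a float: 3.4
floor_q = a // b       # floor division: 3
remainder = a % b      # modulo: 2
power = a ** 2         # exponentiation: 289

bit_and = a & b        # 0b10001 & 0b00101 = 0b00001 -> 1
bit_or = a | b         # 0b10001 | 0b00101 = 0b10101 -> 21
bit_xor = a ^ b        # 0b10001 ^ 0b00101 = 0b10100 -> 20
shifted = b << 2       # shifting left by 2 multiplies by 4 -> 20

# Comparisons can be chained and combined with logical operators.
in_order = 0 < b < a and not a == b   # True

counter = 10
counter += 5           # assignment operator: same as counter = counter + 5
```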

Data Types and Associated Operations

• Numbers: Integers, Floats, Complex numbers
  o Operations: Arithmetic, type conversion, etc.
• Strings: Immutable sequences of characters
  o Operations: Concatenation, slicing, formatting, etc.
• Lists: Mutable sequences
  o Operations: Indexing, slicing, appending, inserting, removing, etc.
• Tuples: Immutable sequences
  o Operations: Indexing, slicing, etc.
• Sets: Unordered collections of unique elements
  o Operations: Union, intersection, difference, etc.
• Dictionaries: Key-value pairs
  o Operations: Accessing, updating, removing elements, etc.

Sequence Data Types

• Lists

my_list = [1, 2, 3, 4, 5]

• Tuples

my_tuple = (1, 2, 3, 4, 5)

• Strings

my_string = "Hello, World!"

• Ranges

my_range = range(1, 10)
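
All four sequence types support indexing, slicing, and len() in the same way. The sketch below reuses the variables defined above and also shows that assigning to a tuple element fails, since tuples are immutable:

```python
my_list = [1, 2, 3, 4, 5]
my_tuple = (1, 2, 3, 4, 5)
my_string = "Hello, World!"
my_range = range(1, 10)

first = my_list[0]                # indexing starts at 0 -> 1
last = my_tuple[-1]               # negative indices count from the end -> 5
hello = my_string[:5]             # slicing a string -> 'Hello'
evens = list(my_range[1::2])      # every second value, starting at index 1

lengths = [len(s) for s in (my_list, my_tuple, my_string, my_range)]

# Lists are mutable, tuples are not.
my_list[0] = 99
try:
    my_tuple[0] = 99
    tuple_is_immutable = False
except TypeError:
    tuple_is_immutable = True
```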

Conditions and Branching

Python uses if, elif, and else for conditional branching:

x = 10
if x > 0:
    print("Positive")
elif x < 0:
    print("Negative")
else:
    print("Zero")

Functions

Functions are defined using the def keyword:

def greet(name):
    return f"Hello, {name}!"

print(greet("Alice"))
Virtual Environments

Virtual environments allow you to create isolated Python environments for different projects:

# Create a virtual environment
python -m venv myenv

# Activate the virtual environment (Windows)

myenv\Scripts\activate

# Activate the virtual environment (Unix/Linux/MacOS)

source myenv/bin/activate

# Deactivate the virtual environment

deactivate
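
To confirm from inside Python that a virtual environment is active, one common check compares sys.prefix with sys.base_prefix. This is a minimal sketch for environments created with venv; the function name in_virtual_environment is hypothetical:

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtual_environment())
```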
Unit II PYTHON DATA STRUCTURES, PACKAGES 10

List – Tuples- Set – Dictionary – Its associated functions - File handling - Modes– Reading
and writing files - Introduction to Pandas – Series – Data frame – Indexing and loading –
Data manipulation – Merging – Group by – Scales – Pivot table – Date and time.

Lists

Lists are ordered, mutable collections of items.

Creating Lists:

my_list = [1, 2, 3, 4, 5]

Functions and Methods:

• append(x): Add an item to the end.
• extend(iterable): Extend list by appending elements from an iterable.
• insert(i, x): Insert an item at a given position.
• remove(x): Remove first item with value x.
• pop([i]): Remove and return item at position i (default last).
• clear(): Remove all items.
• index(x[, start[, end]]): Return index of first item with value x.
• count(x): Return number of times x appears.
• sort(key=None, reverse=False): Sort items in place.
• reverse(): Reverse the elements in place.
• copy(): Return a shallow copy.
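
The following sketch walks through several of these methods on one small list; the comments show the list after each step:

```python
items = [3, 1, 2]

items.append(4)             # [3, 1, 2, 4]
items.extend([5, 1])        # [3, 1, 2, 4, 5, 1]
items.insert(0, 0)          # [0, 3, 1, 2, 4, 5, 1]
items.remove(1)             # removes only the first 1 -> [0, 3, 2, 4, 5, 1]
last = items.pop()          # returns 1 -> [0, 3, 2, 4, 5]
position = items.index(4)   # 3

items.sort(reverse=True)    # sorts in place -> [5, 4, 3, 2, 0]
snapshot = items.copy()     # independent shallow copy
items.reverse()             # reverses in place -> [0, 2, 3, 4, 5]
```

Note that sort() and reverse() modify the list and return None, which is why the copy is taken before reversing.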

Tuples

Tuples are ordered, immutable collections of items.

Creating Tuples:

my_tuple = (1, 2, 3, 4, 5)

Functions and Methods:

• count(x): Return the number of times x appears.
• index(x): Return the index of the first item with value x.

Sets

Sets are unordered collections of unique items.

Creating Sets:

my_set = {1, 2, 3, 4, 5}

Functions and Methods:

• add(x): Add an item.
• remove(x): Remove an item; raise KeyError if it is not present.
• discard(x): Remove an item if present.
• pop(): Remove and return an arbitrary item.
• clear(): Remove all items.
• union(*others): Return the union.
• intersection(*others): Return the intersection.
• difference(*others): Return the difference.
• symmetric_difference(other): Return the symmetric difference.
• issubset(other): Check if set is subset of other.
• issuperset(other): Check if set is superset of other.
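
A brief sketch of the set operations listed above, using two small example sets chosen for illustration. It also shows the difference between remove() and discard() when the item is missing:

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a.union(b)                        # {1, 2, 3, 4, 5}
common = a.intersection(b)                # {3, 4}
only_in_a = a.difference(b)               # {1, 2}
in_one_only = a.symmetric_difference(b)   # {1, 2, 5}

subset_check = {3, 4}.issubset(a)         # True
superset_check = a.issuperset(b)          # False, because 5 is not in a

a.discard(99)                             # no error when the item is absent
try:
    a.remove(99)                          # remove() raises KeyError instead
    remove_raised = False
except KeyError:
    remove_raised = True
```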

Dictionaries

Dictionaries are mutable collections of key-value pairs. Since Python 3.7 they preserve insertion order.

Creating Dictionaries:

my_dict = {'a': 1, 'b': 2, 'c': 3}

Functions and Methods:

• keys(): Return a new view of the dictionary's keys.
• values(): Return a new view of the dictionary's values.
• items(): Return a new view of the dictionary's items.
• get(key[, default]): Return the value for key if key is in the dictionary, else default
(None if no default is given).
• setdefault(key[, default]): Insert key with a value of default if key is not
in the dictionary, and return the value for key.
• update([other]): Update the dictionary with the key/value pairs from other.
• pop(key[, default]): Remove specified key and return the corresponding
value.
• popitem(): Remove and return the most recently inserted (key, value) pair.
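
A minimal sketch of these methods on a small dictionary (the keys and values are made up for illustration):

```python
stock = {'apples': 3, 'pears': 0}

missing = stock.get('plums', 0)         # key absent, so the default 0 is returned
stock.setdefault('plums', 10)           # inserts 'plums': 10 because it was absent
stock.update({'apples': 5, 'kiwis': 2}) # overwrites 'apples', adds 'kiwis'

removed = stock.pop('pears')            # removes the key and returns its value, 0
last_key, last_value = stock.popitem()  # removes the most recently added pair

remaining = list(stock.items())         # items() is a view; list() copies it
```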

File Handling

Modes:

• 'r': Read (default).
• 'w': Write (truncate file).
• 'x': Create (fail if exists).
• 'a': Append.
• 'b': Binary mode.
• 't': Text mode (default).
• '+': Update (read and write).

Reading and Writing Files:

# Writing to a file
with open('example.txt', 'w') as file:
    file.write("Hello, World!")

# Reading from a file
with open('example.txt', 'r') as file:
    content = file.read()

Introduction to Pandas

Pandas is a powerful data manipulation library in Python.

Series

A Series is a one-dimensional labeled array capable of holding any data type.

Creating a Series:

import pandas as pd

data = [1, 2, 3, 4, 5]
series = pd.Series(data)

DataFrame

A DataFrame is a two-dimensional labeled data structure.

Creating a DataFrame:

data = {
'Column1': [1, 2, 3],
'Column2': [4, 5, 6]
}
df = pd.DataFrame(data)
Indexing and Loading

Indexing:

df['Column1'] # Access a single column

df[['Column1', 'Column2']] # Access multiple columns
df.iloc[0] # Access a row by integer position
df.loc[0] # Access a row by index label

Loading Data:

df = pd.read_csv('file.csv') # Load CSV file

df = pd.read_excel('file.xlsx') # Load Excel file
Data Manipulation

Basic Operations:

df['NewColumn'] = df['Column1'] + df['Column2'] # Add a new column

df.drop('Column1', axis=1, inplace=True) # Drop a column
df.rename(columns={'OldName': 'NewName'}, inplace=True) # Rename a column

Merging

Combining DataFrames (pd.concat stacks them row-wise; pd.merge joins them on key columns):

df1 = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

df2 = pd.DataFrame({'A': [5, 6], 'B': [7, 8]})
merged_df = pd.concat([df1, df2])
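
The example above stacks rows with pd.concat. A database-style join on a shared key uses pd.merge instead. A minimal sketch, with made-up employees and salaries frames and an emp_id key column that are assumptions for illustration:

```python
import pandas as pd

employees = pd.DataFrame({'emp_id': [1, 2, 3],
                          'name': ['John', 'Emma', 'Sam']})
salaries = pd.DataFrame({'emp_id': [2, 3, 4],
                         'salary': [60000, 55000, 70000]})

# Inner join: keep only the keys present in both frames (2 and 3).
inner = pd.merge(employees, salaries, on='emp_id', how='inner')

# Left join: keep every employee; a missing salary becomes NaN.
left = pd.merge(employees, salaries, on='emp_id', how='left')

print(inner)
print(left)
```

The how argument also accepts 'right' and 'outer' for the other join types.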

Group By

Grouping data:

grouped = df.groupby('Column1')
summary = grouped['Column2'].sum()

Scales

Scaling data can be done using libraries like sklearn.preprocessing.

Example:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
scaled_data = scaler.fit_transform(df)

Pivot Table

Creating a pivot table:

pivot = df.pivot_table(values='Value', index='Index', columns='Columns', aggfunc='mean')

Date and Time

Handling date and time data:

df['Date'] = pd.to_datetime(df['Date'])
df['Year'] = df['Date'].dt.year
df['Month'] = df['Date'].dt.month
df['Day'] = df['Date'].dt.day

Unit III: Packages for Data Analysis

Numpy – 1D and 2D numpy – Associated operations –Broadcasting - Linear algebra and

related operations – Indexing and other operations – Matplotlib – scatterplot – line plot – bar
plot – histogram – box plot – pair plot – Case study on regression and classification.

Numpy

Numpy is a powerful numerical computing library in Python, providing support for large
multi-dimensional arrays and matrices along with a large collection of high-level
mathematical functions.

1D and 2D Numpy Arrays

Creating 1D Arrays:

import numpy as np

arr_1d = np.array([1, 2, 3, 4, 5])

Creating 2D Arrays:

arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

Associated Operations

Basic Operations:

# Element-wise addition
result = arr_1d + 2

# Element-wise subtraction
result = arr_1d - 2

# Element-wise multiplication
result = arr_1d * 2

# Element-wise division
result = arr_1d / 2
Aggregations:

# Sum of elements
np.sum(arr_1d)

# Mean of elements
np.mean(arr_1d)

# Standard deviation
np.std(arr_1d)

# Maximum and minimum

np.max(arr_1d)
np.min(arr_1d)

Broadcasting

Broadcasting allows Numpy to perform element-wise operations on arrays of different
shapes.

Example:

arr = np.array([1, 2, 3])

scalar = 2

result = arr + scalar # [3, 4, 5]
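
Broadcasting also applies between arrays, not only between an array and a scalar. Shapes are compared from the trailing dimension backwards, and two dimensions are compatible when they are equal or one of them is 1. A short sketch with arbitrary example arrays:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])        # shape (2, 3)
row = np.array([10, 20, 30])          # shape (3,)
column = np.array([[100], [200]])     # shape (2, 1)

# (2, 3) + (3,): the row is added to every row of the matrix.
plus_row = matrix + row

# (2, 3) + (2, 1): the column is added to every column of the matrix.
plus_column = matrix + column

# (2, 3) + (2,): trailing dimensions 3 and 2 differ, so this fails.
try:
    matrix + np.array([1, 2])
    incompatible = False
except ValueError:
    incompatible = True
```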

Linear Algebra and Related Operations

Dot Product:

a = np.array([1, 2])
b = np.array([3, 4])

dot_product = np.dot(a, b) # 11

Matrix Multiplication:

A = np.array([[1, 2], [3, 4]])

B = np.array([[5, 6], [7, 8]])

result = np.matmul(A, B)

Inverse of a Matrix:

matrix = np.array([[1, 2], [3, 4]])

inverse = np.linalg.inv(matrix)
Indexing and Other Operations

Indexing:

arr = np.array([1, 2, 3, 4, 5])

# Accessing elements
element = arr[0] # 1

# Slicing
subarray = arr[1:3] # [2, 3]

Reshaping:

arr = np.array([[1, 2, 3], [4, 5, 6]])

reshaped = arr.reshape((3, 2)) # [[1, 2], [3, 4], [5, 6]]

Matplotlib

Matplotlib is a plotting library for creating static, interactive, and animated visualizations in
Python.

Scatter Plot

Creating a Scatter Plot:

import matplotlib.pyplot as plt

x = np.array([1, 2, 3, 4, 5])
y = np.array([5, 4, 3, 2, 1])

plt.scatter(x, y)
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Scatter Plot')
plt.show()

Line Plot

Creating a Line Plot:

plt.plot(x, y)
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Line Plot')
plt.show()
Bar Plot

Creating a Bar Plot:

categories = ['A', 'B', 'C']

values = [10, 20, 15]

plt.bar(categories, values)
plt.xlabel('Categories')
plt.ylabel('Values')
plt.title('Bar Plot')
plt.show()
Histogram

Creating a Histogram:

data = np.random.randn(1000)

plt.hist(data, bins=30)
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram')
plt.show()
Box Plot

Creating a Box Plot:

data = [np.random.normal(0, std, 100) for std in range(1, 4)]

plt.boxplot(data, vert=True, patch_artist=True)

plt.xlabel('Distribution')
plt.ylabel('Value')
plt.title('Box Plot')
plt.show()
Pair Plot

Creating a Pair Plot:

import seaborn as sns

import pandas as pd

df = pd.DataFrame({
'A': np.random.randn(100),
'B': np.random.randn(100),
'C': np.random.randn(100),
'D': np.random.randn(100)
})
sns.pairplot(df)
plt.show()

Case Study: Regression and Classification

Regression

Linear Regression Example:

from sklearn.linear_model import LinearRegression

# Sample data
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1, 2, 3, 4, 5])

# Create and train the model

model = LinearRegression()
model.fit(X, y)

# Make predictions
predictions = model.predict(X)

# Plot results
plt.scatter(X, y, color='blue')
plt.plot(X, predictions, color='red')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Linear Regression')
plt.show()
Classification

Logistic Regression Example:

from sklearn.linear_model import LogisticRegression

from sklearn.datasets import load_iris

# Load data
iris = load_iris()
X = iris.data
y = iris.target

# Create and train the model

model = LogisticRegression(max_iter=200)
model.fit(X, y)

# Make predictions
predictions = model.predict(X)

# Plot results
plt.scatter(X[:, 0], X[:, 1], c=predictions, cmap='viridis')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Logistic Regression Classification')
plt.show()

Programs for Python for Data Science

Basic python programs:

Addition of two numbers

a = int(input("enter first no "))
b = int(input("enter second no "))
c = a + b
print("the sum is", c)

Output

enter first no 5
enter second no 6
the sum is 11

Area of rectangle

l = int(input("enter the length of rectangle "))
b = int(input("enter the breadth of rectangle "))
a = l * b
print(a)

Output

enter the length of rectangle 5
enter the breadth of rectangle 6
30

Area & circumference of circle

r = float(input("enter the radius of circle "))
a = 3.14 * r * r
c = 2 * 3.14 * r
print("the area of circle", a)
print("the circumference of circle", c)

Output

enter the radius of circle 4
the area of circle 50.24
the circumference of circle 25.12

Calculate simple interest

p = float(input("enter principal amount "))
n = float(input("enter no of years "))
r = float(input("enter rate of interest "))
si = p * n * r / 100
print("simple interest is", si)

Output

enter principal amount 5000
enter no of years 4
enter rate of interest 6
simple interest is 1200.0

Calculate engineering cutoff

p = int(input("enter physics marks "))
c = int(input("enter chemistry marks "))
m = int(input("enter maths marks "))
cutoff = p/4 + c/4 + m/2
print("cutoff =", cutoff)

Output

enter physics marks 100
enter chemistry marks 99
enter maths marks 96
cutoff = 97.75

Check voting eligibility

age = int(input("enter ur age "))
if age >= 18:
    print("eligible for voting")
else:
    print("not eligible for voting")

Output

enter ur age 19
eligible for voting

Find greatest of three numbers

a = int(input("enter the value of a "))
b = int(input("enter the value of b "))
c = int(input("enter the value of c "))
if a > b:
    if a > c:
        print("the greatest no is", a)
    else:
        print("the greatest no is", c)
else:
    if b > c:
        print("the greatest no is", b)
    else:
        print("the greatest no is", c)

Output

enter the value of a 9
enter the value of b 1
enter the value of c 8
the greatest no is 9
Programs on for loop

Print n natural numbers

for i in range(1, 5, 1):
    print(i)

Output

1
2
3
4

Print n odd numbers

for i in range(1, 10, 2):
    print(i)

Output

1
3
5
7
9

Print n even numbers

for i in range(2, 10, 2):
    print(i)

Output

2
4
6
8

Print squares of numbers

for i in range(1, 5, 1):
    print(i*i)

Output

1
4
9
16

Print cubes of numbers

for i in range(1, 5, 1):
    print(i*i*i)

Output

1
8
27
64

Programs on while loop

Print n natural numbers

i = 1
while i <= 5:
    print(i)
    i = i + 1

Output

1
2
3
4
5

Print n even numbers

i = 2
while i <= 10:
    print(i)
    i = i + 2

Output

2
4
6
8
10

Print n odd numbers

i = 1
while i <= 10:
    print(i)
    i = i + 2

Output

1
3
5
7
9

Print n squares of numbers

i = 1
while i <= 5:
    print(i*i)
    i = i + 1

Output

1
4
9
16
25

Print n cubes of numbers

i = 1
while i <= 3:
    print(i*i*i)
    i = i + 1

Output

1
8
27

Find sum of n numbers

i = 1
total = 0
while i <= 10:
    total = total + i
    i = i + 1
print(total)

Output

55

Factorial of n numbers/product of n numbers

i = 1
product = 1
while i <= 10:
    product = product * i
    i = i + 1
print(product)

Output

3628800
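
The two accumulation loops above can also be written with tools from the standard library. This sketch is an alternative for comparison, not a replacement for the loop exercises; math.prod requires Python 3.8 or later:

```python
import math

# Sum of the first 10 natural numbers; range(1, 11) stops before 11.
total = sum(range(1, 11))

# Product of the first 10 natural numbers, in two equivalent ways.
product = math.prod(range(1, 11))
factorial = math.factorial(10)

print(total)
print(product)
```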
Sum of two numbers using function

def add():
    a = int(input("enter a value "))
    b = int(input("enter b value "))
    c = a + b
    print("the sum is", c)

add()

Output

enter a value 6
enter b value 4
the sum is 10

Area of rectangle using function

def area():
    l = int(input("enter the length of rectangle "))
    b = int(input("enter the breadth of rectangle "))
    a = l * b
    print("the area of rectangle is", a)

area()

Output

enter the length of rectangle 20
enter the breadth of rectangle 5
the area of rectangle is 100

Swap two values of variables

def swap():
    a = int(input("enter a value "))
    b = int(input("enter b value "))
    c = a
    a = b
    b = c
    print("a=", a, "b=", b)

swap()

Output

enter a value 3
enter b value 5
a= 5 b= 3
Check whether the number is divisible by 5 or not

def div():
    n = int(input("enter n value "))
    if n % 5 == 0:
        print("the number is divisible by 5")
    else:
        print("the number not divisible by 5")

div()

Output

enter n value 10
the number is divisible by 5

Find remainder and quotient of given numbers

def remainder():
    a = int(input("enter a "))
    b = int(input("enter b "))
    R = a % b
    print("the remainder is", R)

def quotient():
    a = int(input("enter a "))
    b = int(input("enter b "))
    Q = a / b
    print("the quotient is", Q)

remainder()
quotient()

Output

enter a 6
enter b 3
the remainder is 0
enter a 8
enter b 4
the quotient is 2.0

Convert the temperature

def ctof():
    c = float(input("enter temperature in centigrade "))
    f = (1.8 * c) + 32
    # round to 2 decimal places to hide floating-point noise
    print("the temperature in Fahrenheit is", round(f, 2))

def ftoc():
    f = float(input("enter temp in Fahrenheit "))
    c = (f - 32) / 1.8
    print("the temperature in centigrade is", round(c, 2))

ctof()
ftoc()

Output

enter temperature in centigrade 37
the temperature in Fahrenheit is 98.6
enter temp in Fahrenheit 100
the temperature in centigrade is 37.78
Program for basic calculator

def add():
    a = int(input("enter a value "))
    b = int(input("enter b value "))
    c = a + b
    print("the sum is", c)

def sub():
    a = int(input("enter a value "))
    b = int(input("enter b value "))
    c = a - b
    print("the diff is", c)

def mul():
    a = int(input("enter a value "))
    b = int(input("enter b value "))
    c = a * b
    print("the mul is", c)

def div():
    a = int(input("enter a value "))
    b = int(input("enter b value "))
    c = a / b
    print("the div is", c)

add()
sub()
mul()
div()

Output

enter a value 10
enter b value 10
the sum is 20
enter a value 10
enter b value 10
the diff is 0
enter a value 10
enter b value 10
the mul is 100
enter a value 10
enter b value 10
the div is 1.0

NUMPY ARRAYS

ALGORITHM

Step1: Start

Step2: Import numpy module

Step3: Print the basic characteristics and operations of array

Step4: Stop

PROGRAM

import numpy as np

# Creating array object
arr = np.array([[1, 2, 3],
                [4, 2, 5]])

# Printing type of arr object
print("Array is of type:", type(arr))

# Printing array dimensions (axes)
print("No. of dimensions:", arr.ndim)

# Printing shape of array
print("Shape of array:", arr.shape)

# Printing size (total number of elements) of array
print("Size of array:", arr.size)

# Printing type of elements in array
print("Array stores elements of type:", arr.dtype)

OUTPUT

Array is of type: <class 'numpy.ndarray'>
No. of dimensions: 2
Shape of array: (2, 3)
Size of array: 6
Array stores elements of type: int32

(The element type may be reported as int64, depending on the platform and Numpy version.)

PROGRAM TO PERFORM ARRAY SLICING

a = np.array([[1, 2, 3], [3, 4, 5], [4, 5, 6]])
print(a)
print("After slicing")
print(a[1:])

Output

[[1 2 3]
 [3 4 5]
 [4 5 6]]
After slicing
[[3 4 5]
 [4 5 6]]

CREATE A DATAFRAME USING A LIST OF ELEMENTS.

ALGORITHM

Step1: Start

Step2: import numpy and pandas module

Step3: Create a dataframe using the dictionary

Step4: Print the output

Step5: Stop

PROGRAM

import numpy as np
import pandas as pd

data = np.array([['', 'Col1', 'Col2'],
                 ['Row1', 1, 2],
                 ['Row2', 3, 4]])
print(pd.DataFrame(data=data[1:, 1:],
                   index=data[1:, 0],
                   columns=data[0, 1:]))

# Take a 2D array as input to your DataFrame
my_2darray = np.array([[1, 2, 3], [4, 5, 6]])
print(pd.DataFrame(my_2darray))

# Take a dictionary as input to your DataFrame
my_dict = {1: ['1', '3'], 2: ['1', '2'], 3: ['2', '4']}
print(pd.DataFrame(my_dict))

# Take a DataFrame as input to your DataFrame
my_df = pd.DataFrame(data=[4, 5, 6, 7], index=range(0, 4), columns=['A'])
print(pd.DataFrame(my_df))

# Take a Series as input to your DataFrame
my_series = pd.Series({"United Kingdom": "London", "India": "New Delhi",
                       "United States": "Washington", "Belgium": "Brussels"})
print(pd.DataFrame(my_series))

df = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6]]))

# Use the `shape` property
print(df.shape)

# Or use the `len()` function with the `index` property
print(len(df.index))

Output:

     Col1 Col2
Row1    1    2
Row2    3    4
   0  1  2
0  1  2  3
1  4  5  6
   1  2  3
0  1  1  2
1  3  2  4
   A
0  4
1  5
2  6
3  7
                         0
United Kingdom      London
India            New Delhi
United States   Washington
Belgium           Brussels
(2, 3)
2

BASIC PLOTS USING MATPLOTLIB

ALGORITHM

Step1: Start

Step2: import Matplotlib module

Step3: Create basic plots using Matplotlib

Step4: Print the output

Step5: Stop

Program:3a

# importing the required module
import matplotlib.pyplot as plt

# x axis values
x = [1, 2, 3]

# corresponding y axis values
y = [2, 4, 1]

# plotting the points
plt.plot(x, y)

# naming the x axis
plt.xlabel('x - axis')

# naming the y axis
plt.ylabel('y - axis')

# giving a title to my graph
plt.title('My first graph!')

# function to show the plot
plt.show()

Output:

Program:3b

import matplotlib.pyplot as plt

a = [1, 2, 3, 4, 5]
b = [0, 0.6, 0.2, 15, 10, 8, 16, 21]

plt.plot(a)

# "o" is for circles and "r" is for red
plt.plot(b, "or")
plt.plot(list(range(0, 22, 3)))

# naming the x-axis
plt.xlabel('Day ->')

# naming the y-axis
plt.ylabel('Temp ->')

c = [4, 2, 6, 8, 3, 20, 13, 15]
plt.plot(c, label='4th Rep')

# get the current axes
ax = plt.gca()

# hide the right and top boundary lines of the graph body
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)

# set the bounds of the left boundary line to a fixed range
ax.spines['left'].set_bounds(-3, 40)

# set the interval at which the x-axis places its marks
plt.xticks(list(range(-3, 10)))

# set the interval at which the y-axis places its marks
plt.yticks(list(range(-3, 20, 3)))

# the legend denotes which color signifies which series
ax.legend(['1st Rep', '2nd Rep', '3rd Rep', '4th Rep'])

# annotate writes text on the graph; xy denotes the position on the graph
plt.annotate('Temperature V / s Days', xy=(1.01, -2.15))

# gives a title to the graph
plt.title('All Features Discussed')
plt.show()

Output:

Program:

import matplotlib.pyplot as plt

a = [1, 2, 3, 4, 5]
b = [0, 0.6, 0.2, 15, 10, 8, 16, 21]
c = [4, 2, 6, 8, 3, 20, 13, 15]

# use fig whenever you want the output in a new window;
# figsize specifies the size of the window
fig = plt.figure(figsize=(10, 10))

# creating multiple plots in a single figure
sub1 = plt.subplot(2, 2, 1)
sub2 = plt.subplot(2, 2, 2)
sub3 = plt.subplot(2, 2, 3)
sub4 = plt.subplot(2, 2, 4)

sub1.plot(a, 'sb')

# x axis ticks of this subplot advance by 1 within the specified range
sub1.set_xticks(list(range(0, 10, 1)))
sub1.set_title('1st Rep')

sub2.plot(b, 'or')

# x axis ticks of this subplot advance by 2 within the specified range
sub2.set_xticks(list(range(0, 10, 2)))
sub2.set_title('2nd Rep')

# a list can be passed directly to plot instead of a variable
sub3.plot(list(range(0, 22, 3)), 'vg')
sub3.set_xticks(list(range(0, 10, 1)))
sub3.set_title('3rd Rep')

sub4.plot(c, 'Dm')

# similarly we can set the ticks for the y-axis:
# range(start(inclusive), end(exclusive), step)
sub4.set_yticks(list(range(0, 24, 2)))
sub4.set_title('4th Rep')

# without plt.show() no plot will be visible
plt.show()

Output:
Normal Curve

ALGORITHM

Step 1: Start the Program

Step 2: Import packages scipy and call function scipy.stats

Step 3: Import packages numpy, matplotlib and seaborn

Step 4: Create the distribution

Step 5: Visualizing the distribution

Step 6: Stop the process

Program:

# import required libraries
from scipy.stats import norm
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sb

# Creating the distribution
data = np.arange(1, 10, 0.01)
pdf = norm.pdf(data, loc=5.3, scale=1)

# Visualizing the distribution
sb.set_style('whitegrid')
# recent seaborn versions require x and y as keyword arguments
sb.lineplot(x=data, y=pdf, color='black')
plt.xlabel('Heights')
plt.ylabel('Probability Density')
plt.show()

Output:
CORRELATION AND SCATTER PLOTS

ALGORITHM

Step 1: Start the Program

Step 2: Create variable y1, y2

Step 3: Create variable x, y3 using random function

Step 4: Plot the scatter plot

Step 5: Print the result

Step 6: Stop the process

Program:

# Scatterplot and Correlations
import numpy as np
import matplotlib.pyplot as plt

# Data
x = np.random.randn(100)
y1 = x * 5 + 9
y2 = -5 * x
y3 = np.random.randn(100)

# Plot
plt.rcParams.update({'figure.figsize': (10, 8), 'figure.dpi': 100})
plt.scatter(x, y1, label=f'y1, Correlation = {np.round(np.corrcoef(x, y1)[0, 1], 2)}')
plt.scatter(x, y2, label=f'y2, Correlation = {np.round(np.corrcoef(x, y2)[0, 1], 2)}')
plt.scatter(x, y3, label=f'y3, Correlation = {np.round(np.corrcoef(x, y3)[0, 1], 2)}')

# Title, legend and display
plt.title('Scatterplot and Correlations')
plt.legend()
plt.show()

Output
SIMPLE LINEAR REGRESSION

ALGORITHM

Step 1: Start the Program

Step 2: Import numpy and matplotlib package

Step 3: Define coefficient function

Step 4: Calculate cross-deviation and deviation about x

Step 5: Calculate regression coefficients

Step 6: Plot the Linear regression and define main function

Step 7: Print the result

Step 8: Stop the process
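
Steps 4 and 5 correspond to the standard least-squares formulas for a straight line y = b_0 + b_1 x fitted to n points. Written out, with \bar{x} and \bar{y} denoting the means that the program below stores as m_x and m_y, they would read roughly as follows:

```latex
\begin{aligned}
SS_{xy} &= \sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}, &
SS_{xx} &= \sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2, \\
b_1 &= \frac{SS_{xy}}{SS_{xx}}, &
b_0 &= \bar{y} - b_1\,\bar{x}.
\end{aligned}
```

Here SS_xy is the cross-deviation of x and y, SS_xx is the deviation about x, b_1 is the slope, and b_0 is the intercept.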

PROGRAM:

import numpy as np
import matplotlib.pyplot as plt

def estimate_coef(x, y):
    # number of observations/points
    n = np.size(x)

    # mean of x and y vector
    m_x = np.mean(x)
    m_y = np.mean(y)

    # calculating cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x

    # calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x

    return (b_0, b_1)

def plot_regression_line(x, y, b):
    # plotting the actual points as scatter plot
    plt.scatter(x, y, color="m", marker="o", s=30)

    # predicted response vector
    y_pred = b[0] + b[1]*x

    # plotting the regression line
    plt.plot(x, y_pred, color="g")

    # putting labels
    plt.xlabel('x')
    plt.ylabel('y')

    # function to show plot
    plt.show()

def main():
    # observations / data
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])

    # estimating coefficients
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {}\nb_1 = {}".format(b[0], b[1]))

    # plotting regression line
    plot_regression_line(x, y, b)

if __name__ == "__main__":
    main()

Output :

Estimated coefficients:

b_0 = -0.0586206896552

b_1 = 1.45747126437

Graph:
MATPLOTLIB

Draw a line in a diagram from position (1, 3) to (2, 8) then to (6, 1) and
finally to position (8, 10):

import matplotlib.pyplot as plt

import numpy as np

xpoints = np.array([1, 2, 6, 8])

ypoints = np.array([3, 8, 1, 10])

plt.plot(xpoints, ypoints)
plt.show()

To draw a line from (1, 3) to (8, 10), pass the two arrays [1, 8] and [3, 10]
to the plot function:

import matplotlib.pyplot as plt

import numpy as np
xpoints = np.array([1, 8])
ypoints = np.array([3, 10])
plt.plot(xpoints, ypoints)
plt.show()

Markers

Use the marker keyword argument to emphasize each point with a marker. When only
one array is passed to the plot function, it is used as the y values and the
x values default to 0, 1, 2, 3:

import matplotlib.pyplot as plt

import numpy as np
ypoints = np.array([3, 8, 1, 10])
plt.plot(ypoints, marker = 'o')
plt.show()

Marker Size
Use the keyword argument ms (short for markersize) to set the size of the markers,
here 20:

import matplotlib.pyplot as plt

import numpy as np
ypoints = np.array([3, 8, 1, 10])
plt.plot(ypoints, marker = 'o', ms = 20)
plt.show()

Marker Color
Use the keyword argument mec (markeredgecolor) to set the color of the marker edge,
here red, and mfc (markerfacecolor) to set the color inside the marker:

import matplotlib.pyplot as plt

import numpy as np
ypoints = np.array([3, 8, 1, 10])
plt.plot(ypoints, marker = 'o', ms = 20, mec = 'r')
# Alternatives that set both the edge and the face color:
# plt.plot(ypoints, marker = 'o', ms = 20, mec = '#4CAF50', mfc = '#4CAF50')
# plt.plot(ypoints, marker = 'o', ms = 20, mec = 'hotpink', mfc = 'hotpink')
plt.show()

Create Labels for a Plot
With Pyplot, you can use the xlabel() and ylabel() functions to set a label
for the x- and y-axis.

import numpy as np
import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])
y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])
plt.plot(x, y)
plt.title("Sports Watch Data")
plt.xlabel("Average Pulse")
plt.ylabel("Calorie Burnage")
plt.show()
Set Font Properties for Title and Labels

You can use the fontdict parameter in xlabel(), ylabel(), and title() to set font
properties for the title and labels.

Example

Set font properties for the title and labels:

import numpy as np

import matplotlib.pyplot as plt

x = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120, 125])

y = np.array([240, 250, 260, 270, 280, 290, 300, 310, 320, 330])

font1 = {'family':'serif','color':'blue','size':20}

font2 = {'family':'serif','color':'darkred','size':15}
plt.title("Sports Watch Data", fontdict = font1)

plt.xlabel("Average Pulse", fontdict = font2)

plt.ylabel("Calorie Burnage", fontdict = font2)

plt.plot(x, y)

plt.show()

Matplotlib Scatter
With Pyplot, you can use the scatter() function to draw a scatter plot.

The scatter() function plots one dot for each observation. It needs two arrays
of the same length, one for the values of the x-axis, and one for values on
the y-axis:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y)
plt.show()

ColorMap

import matplotlib.pyplot as plt

import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array([0, 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100])
plt.scatter(x, y, c=colors, cmap='viridis')
plt.colorbar()
plt.show()

Creating Bars
With Pyplot, you can use the bar() function to draw bar graphs:

import matplotlib.pyplot as plt

import numpy as np
x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])
plt.bar(x,y)
plt.show()

The color argument sets the color of the bars:

import matplotlib.pyplot as plt
import numpy as np
x = np.array(["A", "B", "C", "D"])
y = np.array([3, 8, 1, 10])
plt.bar(x, y, color = "red")
plt.show()

Histogram
A histogram is a graph showing frequency distributions.

It is a graph showing the number of observations within each given interval.

In Matplotlib, we use the hist() function to create histograms.

The hist() function takes an array of numbers, passed as an argument, and
creates a histogram from it.

import matplotlib.pyplot as plt
import numpy as np
x = np.random.normal(170, 10, 250)
plt.hist(x)
plt.show()

Creating Pie Charts

With Pyplot, you can use the pie() function to draw pie charts:

import matplotlib.pyplot as plt

import numpy as np
y = np.array([35, 25, 25, 15])
plt.pie(y)
plt.show()

The labels parameter adds a label to each wedge:

import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
plt.pie(y, labels = mylabels)
plt.show()

Explode
The explode parameter, if specified, and not None, must be an array with one
value for each wedge.
Each value represents how far from the center each wedge is displayed:

import matplotlib.pyplot as plt

import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
myexplode = [0.2, 0, 0, 0]
plt.pie(y, labels = mylabels, explode = myexplode)
plt.show()

Legend
To add a list of explanations for the wedges, use the legend() function:

import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
plt.pie(y, labels = mylabels)
plt.legend()
plt.show()

Python program to perform Data Manipulation operations using the Pandas package.

import pandas as pd
# Create a DataFrame
data = {
    'Name': ['John', 'Emma', 'Sam', 'Lisa', 'Tom'],
    'Age': [25, 30, 28, 32, 27],
    'Country': ['USA', 'Canada', 'Australia', 'UK', 'Germany'],
    'Salary': [50000, 60000, 55000, 70000, 52000]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
# Selecting columns
name_age = df[['Name', 'Age']]
print("\nName and Age columns:")
print(name_age)
# Filtering rows
filtered_df = df[df['Country'] == 'USA']
print("\nFiltered DataFrame (Country = 'USA'):")
print(filtered_df)
# Sorting by a column
sorted_df = df.sort_values('Salary', ascending=False)
print("\nSorted DataFrame (by Salary in descending order):")
print(sorted_df)
# Aggregating data
average_salary = df['Salary'].mean()
print("\nAverage Salary:", average_salary)
# Adding a new column
df['Experience'] = [3, 6, 4, 8, 5]
print("\nDataFrame with added Experience column:")
print(df)
# Updating values
df.loc[df['Name'] == 'Emma', 'Salary'] = 65000
print("\nDataFrame after updating Emma's Salary:")
print(df)
# Deleting a column
df = df.drop('Experience', axis=1)
print("\nDataFrame after deleting Experience column:")
print(df)
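
Row filters can also be combined. The following is a minimal sketch that reuses the same sample data; the age and salary thresholds are hypothetical values chosen only for illustration.

```python
import pandas as pd

data = {
    'Name': ['John', 'Emma', 'Sam', 'Lisa', 'Tom'],
    'Age': [25, 30, 28, 32, 27],
    'Country': ['USA', 'Canada', 'Australia', 'UK', 'Germany'],
    'Salary': [50000, 60000, 55000, 70000, 52000]
}
df = pd.DataFrame(data)

# Combine two conditions with & (each condition needs its own parentheses)
mask = (df['Age'] > 26) & (df['Salary'] >= 55000)
selected = df[mask]

print(selected)
```

Use | in place of & to keep rows that satisfy either condition.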
