Python for Data Science
1 Copyright © 2022 TalentLabs Limited
Agenda
• Python
– Basics of Python
– Data structures in Python
– Control Structures in Python
– Loops in Python
– Functions
• Environment Setup
– Tools Installation
– Jupyter Notebook Setup and Usage
Agenda
• Numpy With Data Analytics
• Pandas With Python
• Introduction To Visualization With Python
– Matplotlib For Data Analytics
– Seaborn For Data Analytics
• Exploratory Data Analysis
Environment Set-up of Python
Installation
Install Python on your machine
• You can download it from the official website:
• https://www.python.org/downloads
Install Anaconda
• https://www.anaconda.com/download
Jupyter Installation & Usage
Using Command Prompt
Open the command prompt and type the following to install
Jupyter Notebook:
• > pip install jupyter
Using Anaconda
Open the Anaconda prompt and type the following to start
Jupyter Notebook:
• > jupyter notebook
Usage of Jupyter Notebook
• A notebook lets you re-run individual code snippets
and modify them before re-running.
• You can deploy a Jupyter Notebook on a remote server
and access it from your local web browser.
• You can add notes and documentation to your code in
a Jupyter Notebook in formats such as Markdown,
LaTeX, and HTML.
• A Jupyter Notebook is organized into cells, each holding
code or text.
Introduction
• Open Source general purpose programming language
• Object Oriented
• Programming as well as scripting language
Python is a general-purpose
programming language that is often
applied in scripting roles
Features
• Easy to learn & use
• Interpreted language
• Open Source
• Large Standard Library
• Large Community Support
• Extensible
• Cross-platform language
Python vs other programming languages
More user-friendly
More applications
Stability
Speed
Basics of Python
Variables
A variable is a named storage location whose value can
change while the program runs.
Assigning values to a variable:
• To assign a value to a variable in Python,
use the assignment (=) operator.
– a = 10, a = "Welcome to the class"
• There is no need to declare the data type of a variable,
as is done in some other programming languages.
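As a minimal sketch of the points above, the snippet below assigns an integer to a variable and then rebinds the same name to a string. The helper names first_type and second_type are only illustrative.

```python
# Assign an integer, then rebind the same name to a string.
# No type declaration is needed in either case.
a = 10
first_type = type(a).__name__

a = "Welcome to the class"
second_type = type(a).__name__

print(first_type, second_type)  # int str
```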
Keywords
• Keywords are reserved words that convey a special
meaning to the interpreter or compiler.
• They cannot be used as variable names.
• A few example keywords are as follows:
– def, else, if, class, continue, break, finally,
– from, return, lambda, except, import, None
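One way to check which words are reserved is the standard-library keyword module. This is a small sketch; the string "talent" is just a hypothetical non-reserved name.

```python
import keyword

# keyword.kwlist holds every reserved word of the running interpreter.
print(keyword.kwlist)

# keyword.iskeyword() reports whether a given string is reserved.
is_reserved = keyword.iskeyword("lambda")
is_free = keyword.iskeyword("talent")  # hypothetical, non-reserved name
print(is_reserved, is_free)  # True False
```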
Data Types
• Variables hold values of different data types.
• With the help of the type() function, you can check the type of a variable.
Data types fall into two groups:
• Immutable: Numbers, Strings, Tuples
• Mutable: Lists, Dictionaries, Sets
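The following sketch checks the type of one sample value per data type and contrasts a mutable list with an immutable tuple. All sample values are made up for illustration.

```python
# One sample value for each data type named above.
samples = [10, 3.5, "text", (1, 2), [1, 2], {"k": 1}, {1, 2}]
type_names = [type(value).__name__ for value in samples]
print(type_names)

# A list is mutable: it can be changed in place.
numbers = [1, 2, 3]
numbers[0] = 99

# A tuple is immutable: item assignment raises TypeError.
point = (1, 2)
try:
    point[0] = 99
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(numbers, tuple_changed)  # [99, 2, 3] False
```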
Operators & Operands
Operators are special symbols that represent
computations like addition and multiplication.
• The values an operator is applied to are called operands.
• Operators: -, +, /, *, **
• Examples: 20+32, hour-1, hour*60+minute, 5**2
– Order of precedence: PEMDAS
(Parentheses, Exponentiation, Multiplication
and Division, Addition and Subtraction)
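The expressions above can be evaluated as in the sketch below. The values of hour and minute are hypothetical sample inputs.

```python
hour = 2      # hypothetical sample values
minute = 15

total = 20 + 32                # 52
previous_hour = hour - 1       # 1
minutes = hour * 60 + minute   # multiplication happens first: 135
squared = 5 ** 2               # 25

# Parentheses override the default order of precedence.
without_parens = 2 + 3 * 4     # 14
with_parens = (2 + 3) * 4      # 20

# Exponentiation binds tighter than multiplication.
power_first = 2 * 3 ** 2       # 18

print(total, previous_hour, minutes, squared)
print(without_parens, with_parens, power_first)
```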
Data Structures in Python
Lists
A list is a sequence of values of any type.
• The values in a list are called elements or items.
[10, 20, 'Class']
• A list inside another list is called a nested list.
• Lists are mutable.
• Lists have variable length.
• List elements are accessed by index, as in arrays. The
first element is stored at index 0.
• Let's discuss the functions used with lists.
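A short sketch of common list operations, starting from the example list on this slide. The nested list [1, 2] is an illustrative addition.

```python
items = [10, 20, "Class"]

first = items[0]             # indexing starts at 0
items.append([1, 2])         # add a nested list as the last element
items[1] = 25                # lists are mutable: replace an element
length = len(items)          # the length grows as items are added
nested_value = items[3][1]   # index into the nested list
items.remove(10)             # remove the first matching value

print(first, length, nested_value, items)
```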
Tuples
Tuples are similar to lists: a sequence of values of
any type, enclosed in parentheses.
• Tuples are immutable.
• Tuples have fixed length.
• tup_1 = ('a', 1, 'df', 'b')
• Let's discuss the functions of tuples.
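A minimal sketch of the built-in operations available on the tuple from this slide. Tuples offer only the two methods count() and index(), plus indexing, len(), and unpacking.

```python
tup_1 = ('a', 1, 'df', 'b')

second = tup_1[1]              # indexing works as with lists
size = len(tup_1)              # number of items
count_a = tup_1.count('a')     # how many times 'a' occurs
index_df = tup_1.index('df')   # position of the first 'df'

# Unpacking: the first item goes to `first`, the remainder to a list.
first, *rest = tup_1

print(second, size, count_a, index_df, first, rest)
```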
Lists Vs Tuples
Lists
• Items are enclosed in square brackets [].
• Lists are mutable.
• If the content is not fixed and keeps changing, use a list.
• List objects cannot be used as dictionary keys because
keys must be hashable and immutable.
Tuples
• Items are enclosed in round brackets ().
• Tuples are immutable.
• If the content is fixed and never changes, use a tuple.
• Tuple objects can be used as dictionary keys (provided all
their items are hashable) because keys must be hashable
and immutable.
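The dictionary-key difference can be sketched as follows. The coordinates and city name are hypothetical sample data.

```python
# A tuple of coordinates works as a dictionary key.
city_by_location = {(51.5, -0.1): "London"}  # hypothetical sample data
found = city_by_location[(51.5, -0.1)]

# A list is unhashable, so using it as a key raises TypeError.
try:
    bad = {[51.5, -0.1]: "London"}
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

print(found, list_key_allowed)  # London False
```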
Sets
A set is an unordered collection of items.
• A set cannot contain duplicate items. Colors = {'red', 'blue', 'green'}
• Let's see how to use the union and intersection methods of a set.
• Let's discuss the other functions of sets.
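A small sketch of the set methods mentioned above. The second set, warm, is a hypothetical example.

```python
colors = {'red', 'blue', 'green'}
warm = {'red', 'orange'}   # hypothetical second set

colors.add('red')          # adding a duplicate leaves the set unchanged

union = colors.union(warm)             # items in either set
common = colors.intersection(warm)     # items in both sets
only_colors = colors.difference(warm)  # items in colors but not in warm

print(len(colors), union, common, only_colors)
```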
Dictionary
• A collection of data accessed by key rather than by position
(since Python 3.7, insertion order is preserved).
• Data in a dictionary is stored as key:value pairs.
• Keys must be immutable (hashable); values can be of any type.
• Dict = {"name": "begindatum", 'age': 10}
• Keys: name and age
• Values: begindatum and 10
• Let's see how to access the keys and values in a dictionary,
along with the different functions used.
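A minimal sketch of accessing keys and values, using the dictionary from this slide. The "country" lookup and the "city" entry are hypothetical additions.

```python
person = {"name": "begindatum", "age": 10}

keys = list(person.keys())       # all keys
values = list(person.values())   # all values
name = person["name"]            # look up a value by key

# get() returns a default instead of raising KeyError for a missing key.
country = person.get("country", "unknown")

person["age"] = 11               # update an existing key
person["city"] = "Kuala Lumpur"  # add a new key (hypothetical value)
pairs = list(person.items())     # (key, value) pairs

print(keys, values, name, country)
print(pairs)
```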
Python Control Statements
Control Statements
The flow control statements are divided into three categories:
1 Conditional statements
2 Iterative statements
3 Transfer statements
Conditional Statements
There are four types of conditional statements:
01 if statement
02 if-else
03 if-elif-else
04 nested if-else
if Statement
Syntax
if condition:
    statement 1
    statement 2
    statement n
Example
number = 6
if number > 5:
    print('/talentlabs')
if-else Statement
Syntax
if condition:
    statement 1
else:
    statement 2
Example
password = input("Enter password")
if password == "/talentlabs@10":
    print("Correct password")
else:
    print("Incorrect Password")
if-elif-else Statement
Syntax
if condition-1:
    statement 1
elif condition-2:
    statement 2
else:
    statement
Example
def user_check(choice):
    if choice == 1:
        print("Admin")
    elif choice == 2:
        print("Editor")
    else:
        print("Wrong entry")
Nested if-else Statement
Example
num = float(input("Enter a number: "))
if num >= 0:
    if num == 0:
        print("Zero")
    else:
        print("Positive number")
else:
    print("Negative number")
Transfer Statements
There are three types of transfer statements:
1 Break statement
2 Continue statement
3 Pass statement
Break statements
Example
for num in range(10):
    if num > 5:
        print("stop processing.")
        break
    print(num)
Continue statements
Example
for num in range(3, 8):
    if num == 5:
        continue
    else:
        print(num)
Pass statements
Example
months = ['January', 'June', 'March', 'April']
for mon in months:
    pass  # placeholder: does nothing
print(months)
Python Loops (Iterative Statements)
Loops
For loop
for index in sequence:
    statements
for i in range(1, 10):
    print(i)
for i in range(0, 5):
    print(i)
else:
    print("for loop completely exhausted, since there is no break.")
Once the for loop finishes without hitting a break, the else block is also executed.
Loops
While Loop
# Python program to illustrate while loop
count = 0
while count < 5:
    count = count + 1
    print("zepanalytics")
Loops
Nested Loop Using For & While
Nested for loop
for iterator_var in sequence:
    for iterator_var in sequence:
        statement(s)
    statement(s)
Nested while loop
while expression:
    while expression:
        statement(s)
    statement(s)
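The nested-loop syntax above can be sketched with concrete values. The multiplication table and the pair counter are illustrative examples, not part of the original slides.

```python
# Nested for loop: build a small multiplication table.
table = []
for row in range(1, 4):
    line = []
    for col in range(1, 4):
        line.append(row * col)
    table.append(line)

# Nested while loop: count every (outer, inner) combination.
pairs = 0
outer = 0
while outer < 3:
    inner = 0
    while inner < 2:
        pairs += 1
        inner += 1
    outer += 1

print(table)  # [[1, 2, 3], [2, 4, 6], [3, 6, 9]]
print(pairs)  # 6
```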
Python Functions
Functions In Python
A function is a named sequence of statements that
performs some operation.
• Example: type(32)
• Here, the name of the function is type, and the expression in
parentheses is called the argument of the function.
Type Conversion Function
• Converts values from one type to another.
• int() converts floating-point values to integers by dropping the
fractional part, and float() converts integers to floating-point values.
• int(3.4545)
• 3
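As a brief sketch, the snippet below defines a user-written function and calls a few built-in type conversion functions. The function rectangle_area is a hypothetical example.

```python
def rectangle_area(width, height):  # hypothetical example function
    """Return the area of a rectangle."""
    return width * height

area = rectangle_area(3, 4)   # call the function with two arguments

truncated = int(3.4545)       # 3: the fractional part is dropped
negative = int(-2.9)          # -2: truncation is toward zero
as_float = float(7)           # 7.0
from_text = int("42")         # 42: strings of digits convert too
as_text = str(3.5)            # "3.5"

print(area, truncated, negative, as_float, from_text, as_text)
```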
Lambda Functions In Python
• A lambda function is a small anonymous function which
makes the developer's life easier.
• It can take any number of arguments, but can only have one
expression.
Syntax
lambda arguments : expression
The expression is executed and the result is returned.
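A minimal sketch of the lambda syntax above. The names add_ten, multiply, and the word list are illustrative.

```python
# One argument, one expression.
add_ten = lambda x: x + 10

# Any number of arguments, still a single expression.
multiply = lambda a, b: a * b

result_one = add_ten(5)       # 15
result_two = multiply(3, 4)   # 12

# A common use: a short key function passed to sorted().
words = ["pandas", "numpy", "seaborn"]
by_length = sorted(words, key=lambda word: len(word))

print(result_one, result_two, by_length)
```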
Numpy For Data Analytics
Numpy
NumPy arrays can store 1-D, 2-D, 3-D, and higher-dimensional data.
Numpy
[Figure: bar chart of the time (µs) taken for the loop, comparing NumPy vectorize, map function, list comprehension, and plain "for-loop"; axis from 0 to 2500]
Numpy
• It is used for computation and processing of single and
multi-dimensional arrays of elements.
• Smaller memory consumption and better runtime behaviour
compared with plain Python lists.
• To install the module: pip install numpy
– To use NumPy:
import numpy as np
values = [20.1, 20.4, 12.3, 43.5, 54.4, 23.5]
Convert = np.array(values)
print(Convert)
print(Convert+20)
Let’s see more examples on this.
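As one further example, the sketch below creates arrays of one, two, and three dimensions and applies a few element-wise and aggregate operations. It assumes NumPy is installed, and all array contents are made-up sample values.

```python
import numpy as np

one_d = np.array([20.1, 20.4, 12.3])
two_d = np.array([[1, 2, 3], [4, 5, 6]])
three_d = np.zeros((2, 3, 4))   # 2 blocks of 3 rows and 4 columns

dims = (one_d.ndim, two_d.ndim, three_d.ndim)   # number of dimensions
shape = two_d.shape                             # (rows, columns)

shifted = two_d + 20            # arithmetic applies to every element
col_sums = two_d.sum(axis=0)    # sum down each column
mean = two_d.mean()             # mean of all elements

print(dims, shape)
print(shifted)
print(col_sums, mean)
```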
Pandas For Data Analytics
Pandas
Pandas is a fast, powerful, flexible, and easy-to-use
open-source data analysis and manipulation tool.
To install pandas: pip install pandas
• Used to handle single and multi-dimensional
data structures.
• Provides the powerful DataFrame.
• A DataFrame is a two-dimensional, table-like structure used to
store data, whereas a Series is a one-dimensional array.
Let's see more practical use of Pandas.
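A minimal sketch of a Series and a DataFrame, assuming pandas is installed. The names and ages are hypothetical sample data.

```python
import pandas as pd

# Series: a one-dimensional labelled array.
ages = pd.Series([25, 32, 47], name="age")

# DataFrame: a two-dimensional table with named columns.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara"],   # hypothetical sample data
    "age": [25, 32, 47],
})

shape = df.shape             # (rows, columns)
older = df[df["age"] > 30]   # boolean filtering of rows

print(ages.ndim, df.ndim, shape)
print(older)
```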
Pandas
Combining data frames
• Merge: pd.merge(left_df, right_df, on='Customer_id', how='inner')
– how="inner": natural join
– how="outer": full outer join
– how="left": left outer join
– how="right": right outer join
• Concat
frames = [df1, df2]
result = pd.concat(frames)
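The four join types and concat can be compared on two small frames, as in this sketch. It assumes pandas is installed; the frames and their contents are hypothetical.

```python
import pandas as pd

# Hypothetical sample frames sharing the key column Customer_id.
left_df = pd.DataFrame({"Customer_id": [1, 2, 3],
                        "name": ["Ann", "Bob", "Cara"]})
right_df = pd.DataFrame({"Customer_id": [2, 3, 4],
                         "amount": [50, 75, 20]})

inner = pd.merge(left_df, right_df, on="Customer_id", how="inner")
outer = pd.merge(left_df, right_df, on="Customer_id", how="outer")
left = pd.merge(left_df, right_df, on="Customer_id", how="left")
right = pd.merge(left_df, right_df, on="Customer_id", how="right")

# concat stacks frames on top of each other.
more_customers = pd.DataFrame({"Customer_id": [5], "name": ["Dan"]})
result = pd.concat([left_df, more_customers], ignore_index=True)

print(len(inner), len(outer), len(left), len(right), len(result))
```

Keys missing on one side of an outer, left, or right join produce NaN in the columns that come from the other frame.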
Introduction To Visualization With Python
Matplotlib
Bar Graph
• A bar graph helps to visualize a numeric
feature across categories.
Matplotlib
Scatter Plot
• The scatter function plots
one dot for each observation
Matplotlib
Histogram
• A histogram is a graph
showing frequency
distributions.
Matplotlib
Box & Whisker Plot
• A box and whisker plot is defined as a graphical method of
displaying variation in a set of data.
• A box and whisker plot (also called a box plot) shows the
five-number summary of a set of data: minimum, lower
quartile, median, upper quartile, and maximum.
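The four Matplotlib plot types described on these slides can be drawn side by side, as in the sketch below. It assumes matplotlib is installed; the category labels are hypothetical and the values are sample data.

```python
import matplotlib.pyplot as plt

values = [20.1, 20.4, 12.3, 43.5, 54.4, 23.5]
labels = ["a", "b", "c", "d", "e", "f"]   # hypothetical category names

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

axes[0, 0].bar(labels, values)                   # one bar per category
axes[0, 0].set_title("Bar graph")

axes[0, 1].scatter(range(len(values)), values)   # one dot per observation
axes[0, 1].set_title("Scatter plot")

axes[1, 0].hist(values, bins=4)                  # frequency distribution
axes[1, 0].set_title("Histogram")

axes[1, 1].boxplot(values)                       # five-number summary
axes[1, 1].set_title("Box & whisker plot")

fig.tight_layout()
```

In a Jupyter Notebook the figure is normally displayed when the cell finishes; in a script, call plt.show() or fig.savefig() with a file name of your choice.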
Seaborn
Scatter Plot
• A scatter plot can be combined with
several semantic groupings (for
example hue, size, and style), which
make a graph of continuous or
categorical data easier to understand.
• It draws a two-dimensional graph.
Seaborn
Box Plot
• A box plot consists of 5 things.
• Minimum
• First Quartile or 25%
• Median (Second Quartile)
or 50%
• Third Quartile or 75%
• Maximum
Seaborn
Point Plot
• A point plot shows point
estimates and confidence
intervals using scatter plot
glyphs.
Seaborn
Violin Plot
• A violin plot is similar to a
boxplot.
• It shows the distribution of
quantitative data across one or
more categorical variables such
that those distributions can
be compared.
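The four Seaborn plot types described on these slides can be sketched on one small dataset. This assumes seaborn, pandas, and matplotlib are installed; the DataFrame and its column names are hypothetical.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical sample data: two groups measured at three x positions.
df = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B"],
    "x": [1, 2, 3, 1, 2, 3],
    "y": [2.0, 2.5, 3.1, 3.0, 3.8, 4.4],
})

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

# Scatter plot with a semantic grouping (colour by group).
sns.scatterplot(data=df, x="x", y="y", hue="group", ax=axes[0, 0])

# Box plot: five-number summary per group.
sns.boxplot(data=df, x="group", y="y", ax=axes[0, 1])

# Point plot: point estimate and confidence interval per group.
sns.pointplot(data=df, x="group", y="y", ax=axes[1, 0])

# Violin plot: distribution of y per group.
sns.violinplot(data=df, x="group", y="y", ax=axes[1, 1])

fig.tight_layout()
```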
Matplotlib For Data Analytics
• Graphical representation of values
• To install: pip install matplotlib
• To use in a program: import matplotlib
• For example, to plot the values from our previous NumPy array (Convert), we can use:
import matplotlib.pyplot as plt
plt.plot(Convert)
plt.show()
Seaborn For Data Analytics
• Seaborn helps to visualize statistical relationships. To
understand how variables in a dataset are related to one
another and how that relationship depends on other
variables, we perform statistical analysis.
• This statistical analysis helps to visualize trends and
identify patterns in the dataset.
• To install: pip install seaborn
• To use in a program: import seaborn
• Import as: import seaborn as sns
Thank You