www.edureka.co/python EDUREKA PYTHON CERTIFICATION TRAINING
Data Analysis With Python
Agenda
Python Applications
Data Life-cycle
Python For Data Analysis
What is Pandas? – Numpy, Scipy
Pandas Operations
Python for Statistics
Python for Hadoop
Python Applications
Web Scraping
Testing
Web Development
Data Analysis
Data Life-Cycle
[Figure: stages of the data life-cycle. Recoverable stage labels: Data Warehousing, Data Analysis, Data Visualization]
What is Data Analysis?
Percentage increase in unemployed youth in Afghanistan between 2010-2011
Data of unemployed youth across the globe from 2010-2014
What is Pandas?
Data Analysis Using Python
Pandas is a software library written for the Python programming language for data manipulation and analysis.
It builds on NumPy and is commonly used alongside SciPy and Matplotlib.
Pandas is well suited for many different kinds of data:
 Tabular data with heterogeneously-typed columns.
 Ordered and unordered time series data.
 Arbitrary matrix data with row and column labels
 Any other form of observational / statistical data sets. The data actually
need not be labeled at all to be placed into a pandas data structure
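The kinds of data listed above can be sketched in a few lines. This is only an illustration: every value and label below (the `country` column, the dates, `r1`/`c1`) is made up and does not come from the slides.

```python
import pandas as pd

# Tabular data with heterogeneously typed columns (all values are made up).
df = pd.DataFrame(
    {
        "country": ["A", "B", "C"],          # strings
        "population_m": [1.5, 20.0, 7.25],   # floats
        "is_island": [False, True, False],   # booleans
    }
)

# Time series data: a Series indexed by dates.
ts = pd.Series([10, 12, 11], index=pd.date_range("2010-01-01", periods=3, freq="D"))

# Matrix data with row and column labels.
m = pd.DataFrame([[1, 2], [3, 4]], index=["r1", "r2"], columns=["c1", "c2"])

# Unlabeled data also works: pandas assigns default integer labels.
unlabeled = pd.DataFrame([[1, 2], [3, 4]])

print(df.dtypes)
print(ts)
print(m.loc["r2", "c1"])
print(list(unlabeled.columns))
```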
Pandas Operations
Changing the Index
Concatenation
Slicing the DataFrame
Data conversion
Changing the column headers
Joining and Merging
Slicing
Original DataFrame:
Index  Int rate  US GDP Thousands
2001   2         50
2002   3         55
2003   2         65
2004   2         55
Slicing the first 2 rows:
Index  Int rate  US GDP Thousands
2001   2         50
2002   3         55
Slicing the last 2 rows:
Index  Int rate  US GDP Thousands
2003   2         65
2004   2         55
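A minimal sketch of the two slices, using the table from the slide. The slides do not show which calls were used, so `head` and `tail` here are one reasonable choice; positional slicing with `iloc` gives the same rows.

```python
import pandas as pd

# Table from the slide: index is the year.
df = pd.DataFrame(
    {"Int rate": [2, 3, 2, 2], "US GDP Thousands": [50, 55, 65, 55]},
    index=[2001, 2002, 2003, 2004],
)

first_two = df.head(2)   # same rows as df.iloc[:2]
last_two = df.tail(2)    # same rows as df.iloc[-2:]

print(first_two)
print(last_two)
```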
Merging
First DataFrame:
Index  HPI  Int rate  US GDP Thousands
2001   80   2         50
2002   85   3         55
2003   88   2         65
2004   85   2         55
Second DataFrame:
Index  HPI  Int rate  US GDP Thousands
2005   80   2         50
2006   85   3         55
2007   88   2         65
2008   85   2         55
Merging:
Index  HPI  Int rate  US GDP Thousands x  US GDP Thousands y
0      80   2         50                  50
1      85   3         55                  55
2      88   2         65                  65
3      85   2         55                  55
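The merged table can be reproduced with `pd.merge`. The slides do not state which columns were used as keys; merging on both `HPI` and `Int rate` is an assumption, chosen because it yields the four rows and the two GDP columns shown. pandas names the overlapping columns with `_x` and `_y` suffixes by default.

```python
import pandas as pd

# The two tables from the slide; only the index (years) differs.
df1 = pd.DataFrame(
    {"HPI": [80, 85, 88, 85], "Int rate": [2, 3, 2, 2], "US GDP Thousands": [50, 55, 65, 55]},
    index=[2001, 2002, 2003, 2004],
)
df2 = pd.DataFrame(
    {"HPI": [80, 85, 88, 85], "Int rate": [2, 3, 2, 2], "US GDP Thousands": [50, 55, 65, 55]},
    index=[2005, 2006, 2007, 2008],
)

# Assumed key columns: HPI and Int rate. merge() ignores the original
# indexes and returns a fresh 0..n-1 index.
merged = pd.merge(df1, df2, on=["HPI", "Int rate"])

print(merged)
```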
Joining
First DataFrame:
Index  Int rate  US GDP Thousands
2001   2         50
2002   3         55
2003   2         65
2004   2         55
Second DataFrame:
Index  Low tier HPI  Unemployment
2001   50            7
2003   52            8
2004   50            9
2005   43            6
Joining:
Index  Int rate  US GDP Thousands  Low tier HPI  Unemployment
2001   2.0       50.0              50.0          7.0
2002   3.0       55.0              NaN           NaN
2003   2.0       65.0              52.0          8.0
2004   2.0       55.0              50.0          9.0
2005   NaN       NaN               43.0          6.0
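A sketch of the join on the index, using the values from the two input tables. The result on the slide keeps every year from both tables and fills the gaps with NaN, which corresponds to an outer join; the exact call used in the session is not shown, so `how="outer"` is an assumption.

```python
import pandas as pd

# Values copied from the two input tables on the slide.
df1 = pd.DataFrame(
    {"Int rate": [2, 3, 2, 2], "US GDP Thousands": [50, 55, 65, 55]},
    index=[2001, 2002, 2003, 2004],
)
df2 = pd.DataFrame(
    {"Low tier HPI": [50, 52, 50, 43], "Unemployment": [7, 8, 9, 6]},
    index=[2001, 2003, 2004, 2005],
)

# join() aligns rows on the index. An outer join keeps years that appear
# in either table; missing cells become NaN, so columns turn into floats.
joined = df1.join(df2, how="outer")

print(joined)
```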
Changing the Index and Column Headers
Index  Int rate  US GDP Thousands
2001   2         50
2002   3         55
2003   2         65
2004   2         55

Index  US GDP Thousands
2001   50
2002   55
2003   65
2004   55
Changing the Index:
Index  US GDP Thousands
2      50
3      55
2      65
2      55
Changing the column headers:
Index  GDP
2001   50
2002   55
2003   65
2004   55
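Both operations can be sketched with `set_index` and `rename`. The slide tables suggest that the interest rate column becomes the new index and that "US GDP Thousands" is renamed to "GDP"; the exact calls used in the session are not shown, so treat this as one possible way to get the same tables.

```python
import pandas as pd

df = pd.DataFrame(
    {"Int rate": [2, 3, 2, 2], "US GDP Thousands": [50, 55, 65, 55]},
    index=[2001, 2002, 2003, 2004],
)

# Changing the index: use the "Int rate" column as row labels.
# The year index is dropped and "Int rate" is no longer a regular column.
by_rate = df.set_index("Int rate")

# Changing the column headers: keep only the GDP column and rename it.
gdp = df[["US GDP Thousands"]].rename(columns={"US GDP Thousands": "GDP"})

print(by_rate)
print(gdp)
```

Neither call modifies `df` in place; each returns a new DataFrame.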
Concatenation
Student Data: Name, Age, Sex, Phone number
Concatenate: E-mail
Result: Name, Age, Sex, Phone number, E-mail
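The student example can be sketched with `pd.concat` along the column axis. The slides give only the field names, so every student record below is hypothetical.

```python
import pandas as pd

# Hypothetical student records; only the column names come from the slide.
students = pd.DataFrame(
    {
        "Name": ["Student 1", "Student 2"],
        "Age": [20, 22],
        "Sex": ["F", "M"],
        "Phone number": ["000-0001", "000-0002"],
    }
)
emails = pd.DataFrame({"E-mail": ["s1@example.com", "s2@example.com"]})

# axis=1 places the frames side by side, aligning rows on the index.
combined = pd.concat([students, emails], axis=1)

print(combined)
```

With the default `axis=0`, `pd.concat` stacks frames vertically instead, which suits tables that share columns but cover different rows.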
Data Munging
Use-Case
Example: Youth Unemployment Data
Problem Statement
Find the change in the percentage of unemployed youth for every country from 2010 to 2011.
There is an approx. 3.1% increase in unemployed youth in ‘Arab World’.
Column 1 – Country Name
Column 2 – Country Code
Column 3 – 2010
Column 4 – 2011
Column 5 – 2012
Column 6 – 2013
Column 7 – 2014
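A minimal sketch of the calculation, assuming the data has the seven columns listed above. The real dataset is not included in the slides, so the rows below are made-up placeholders and the change is computed as a simple difference in percentage points between the 2011 and 2010 columns.

```python
import pandas as pd

# Placeholder rows with the column layout from the slide; values are made up.
# With the real file, this frame would instead come from pd.read_csv(...).
data = pd.DataFrame(
    {
        "Country Name": ["Country A", "Country B", "Country C"],
        "Country Code": ["AAA", "BBB", "CCC"],
        "2010": [20.0, 15.5, 30.0],
        "2011": [22.5, 15.0, 30.0],
        "2012": [23.0, 14.0, 31.0],
        "2013": [24.0, 13.5, 32.0],
        "2014": [25.0, 13.0, 33.0],
    }
)

# Index by country code so the result is labeled per country.
by_code = data.set_index("Country Code")

# Change from 2010 to 2011, in percentage points.
change = by_code["2011"] - by_code["2010"]

print(change)
```

A bar chart of `change` (for example with `change.plot(kind="bar")`, which requires Matplotlib) is one way to compare countries at a glance.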
Python For Statistics
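Mean
from statistics import mean
print(mean([1,1,1,1,3,4,4,4,5,2]))  # arithmetic average: 2.6
Median
from statistics import median
print(median([1,1,1,1,3,4,4,4,5,2]))  # average of the two middle values: 2.5
High Median
from statistics import median_high
print(median_high([1,1,1,1,3,4,4,4,5,2]))  # larger of the two middle values: 3
Low Median
from statistics import median_low
print(median_low([1,1,1,1,3,4,4,4,5,2]))  # smaller of the two middle values: 2
Mode
from statistics import mode
print(mode([1,1,1,1,3,4,4,4,5,2]))  # most frequent value: 1
Variance
from statistics import variance
print(variance([1,1,1,1,3,4,4,4,5,2]))  # sample variance of the data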
Python For Hadoop: Pydoop
Pydoop is a Python interface to Hadoop that allows you to write MapReduce applications and interact with HDFS in pure Python.
Topics covered
Python Applications
What Is Data Analysis
What Is Pandas
Pandas Operations
Data Analysis Use-Case
Python For Statistics And Python For Hadoop
Python For Data Analysis | Python Pandas Tutorial | Learn Python | Python Training | Edureka