www.edureka.co/python EDUREKA PYTHON CERTIFICATION TRAINING
Data Analysis With Python
Agenda
Python Applications
Data Life-cycle
Python For Data Analysis
What is Pandas? – Numpy, Scipy
Pandas Operations
Python for Statistics
Python for Hadoop
Python Applications
Web Scraping
Testing
Web Development
Data Analysis
Data Life-Cycle
[Figure: data life-cycle diagram. Recoverable stage labels: Data Warehousing, Data Analysis, Data Visualization]
What is Data Analysis?
Percentage increase in unemployed youth in Afghanistan between 2010-2011
Data of unemployed youth across the globe from 2010-2014
What is Pandas?
Data Analysis Using Python
Pandas is a software library written for the Python programming language for data manipulation and analysis.
NumPy, SciPy and Matplotlib
Pandas is well suited for many different kinds of data:
• Tabular data with heterogeneously-typed columns.
• Ordered and unordered time series data.
• Arbitrary matrix data with row and column labels.
• Any other form of observational / statistical data sets. The data actually need not be labeled at all to be placed into a pandas data structure.
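The kinds of data listed above can be sketched in a few lines. This is a minimal illustration, not taken from the slides: every name and value below (`table`, `series`, `matrix`, `unlabeled`, and their contents) is made up for the example.

```python
import numpy as np
import pandas as pd

# Tabular data with heterogeneously typed columns (made-up values).
table = pd.DataFrame({
    "country": ["A", "B", "C"],          # strings
    "population_m": [1.5, 20.0, 3.2],    # floats
    "landlocked": [True, False, True],   # booleans
})

# Time series data: values indexed by dates.
dates = pd.date_range("2010-01-01", periods=4, freq="D")
series = pd.Series([2, 3, 2, 2], index=dates)

# Matrix data with row and column labels.
matrix = pd.DataFrame(
    np.arange(6).reshape(2, 3),
    index=["r1", "r2"],
    columns=["a", "b", "c"],
)

# Unlabeled data: pandas falls back to a default integer index.
unlabeled = pd.Series([10, 20, 30])

print(table.dtypes)
print(series)
print(matrix)
print(unlabeled.index.tolist())
```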
Pandas Operations
Slicing the DataFrame
Joining and Merging
Concatenation
Changing the Index
Changing the column headers
Data conversion
Slicing
Index  Int rate  US GDP Thousands
2001   2         50
2002   3         55
2003   2         65
2004   2         55
Slicing the starting 2 rows:
Index  Int rate  US GDP Thousands
2001   2         50
2002   3         55

Slicing the last 2 rows:
Index  Int rate  US GDP Thousands
2003   2         65
2004   2         55
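The two slices can be reproduced with a short sketch. It assumes the table is held in a DataFrame named `df` (a name chosen here for illustration) with the years as the index; `head` and `tail` are one way to take the rows, and plain slice notation gives the same result.

```python
import pandas as pd

# The example table from the slides, with the years as the index.
df = pd.DataFrame(
    {"Int rate": [2, 3, 2, 2], "US GDP Thousands": [50, 55, 65, 55]},
    index=[2001, 2002, 2003, 2004],
)

first_two = df.head(2)  # same rows as df[:2]
last_two = df.tail(2)   # same rows as df[-2:]

print(first_two)
print(last_two)
```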
Merging
Index  HPI  Int rate  US GDP Thousands
2001   80   2         50
2002   85   3         55
2003   88   2         65
2004   85   2         55

Index  HPI  Int rate  US GDP Thousands
2005   80   2         50
2006   85   3         55
2007   88   2         65
2008   85   2         55
Merging
Index  HPI  Int rate  US GDP Thousands x  US GDP Thousands y
0      80   2         50                  50
1      85   3         55                  55
2      88   2         65                  65
3      85   2         55                  55
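A merge that yields a table of this shape might look like the sketch below. The slides do not state the merge keys; merging on both `HPI` and `Int rate` is an assumption inferred from the result, where those two columns appear once and the GDP column appears twice. The names `df1`, `df2` and `merged` are chosen here for illustration, and pandas writes the suffixes as `_x` and `_y`.

```python
import pandas as pd

df1 = pd.DataFrame(
    {"HPI": [80, 85, 88, 85],
     "Int rate": [2, 3, 2, 2],
     "US GDP Thousands": [50, 55, 65, 55]},
    index=[2001, 2002, 2003, 2004],
)
df2 = pd.DataFrame(
    {"HPI": [80, 85, 88, 85],
     "Int rate": [2, 3, 2, 2],
     "US GDP Thousands": [50, 55, 65, 55]},
    index=[2005, 2006, 2007, 2008],
)

# Assumed keys: rows are matched where both HPI and Int rate agree.
# The non-key column shared by both frames gets the default
# suffixes _x (left frame) and _y (right frame).
merged = pd.merge(df1, df2, on=["HPI", "Int rate"])
print(merged)
```

Merging on columns discards the original year index, which is why the result is numbered from 0.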
Joining
Index  Int rate  US GDP Thousands
2001   2         50
2002   3         55
2003   2         65
2004   2         55

Index  Low tier HPI  Unemployment
2001   50            7
2003   52            8
2004   50            9
2005   43            6
Joining
Index  Int rate  US GDP Thousands  Low tier HPI  Unemployment
2001   2.0       50.0              50.0          7.0
2002   3.0       55.0              NaN           NaN
2003   2.0       65.0              52.0          8.0
2004   2.0       55.0              50.0          9.0
2005   NaN       NaN               43.0          6.0
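A join that keeps every year from both tables can be sketched as follows. The slides do not name the join type; `how="outer"` is an assumption inferred from the result, which contains the union of both indexes with `NaN` where a table has no row for that year. The names `df1`, `df2` and `joined` are illustrative, and the values are those of the two input tables.

```python
import pandas as pd

df1 = pd.DataFrame(
    {"Int rate": [2, 3, 2, 2], "US GDP Thousands": [50, 55, 65, 55]},
    index=[2001, 2002, 2003, 2004],
)
df2 = pd.DataFrame(
    {"Low tier HPI": [50, 52, 50, 43], "Unemployment": [7, 8, 9, 6]},
    index=[2001, 2003, 2004, 2005],
)

# Join on the index; an outer join keeps years present in either frame.
joined = df1.join(df2, how="outer")
print(joined)
```

The integer columns turn into floats in the result because `NaN` is a floating-point value.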
Changing the Index and Column Headers
Index  Int rate  US GDP Thousands
2001   2         50
2002   3         55
2003   2         65
2004   2         55

Index  US GDP Thousands
2001   50
2002   55
2003   65
2004   55
Changing the Index
Index (Int rate)  US GDP Thousands
2                 50
3                 55
2                 65
2                 55

Changing the column headers
Index  GDP
2001   50
2002   55
2003   65
2004   55
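Both changes can be sketched with `set_index` and `rename`. This assumes the interest-rate column is the one promoted to the index, which is inferred from the index values 2, 3, 2, 2 in the result table; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"Int rate": [2, 3, 2, 2], "US GDP Thousands": [50, 55, 65, 55]},
    index=[2001, 2002, 2003, 2004],
)

# Changing the index: the "Int rate" column becomes the row index.
reindexed = df.set_index("Int rate")

# Changing the column headers: keep the GDP column and rename it to "GDP".
renamed = df[["US GDP Thousands"]].rename(columns={"US GDP Thousands": "GDP"})

print(reindexed)
print(renamed)
```

Both calls return new DataFrames and leave `df` unchanged.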
Concatenation
Student Data
Student: Name, Age, Sex, Phone number
Concatenate: E-mail
Student: Name, Age, Sex, Phone number, E-mail
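Adding an E-mail field to the student record corresponds to a column-wise concatenation. In the sketch below the student rows, phone numbers and addresses are hypothetical placeholders invented for illustration; only the field names come from the slide.

```python
import pandas as pd

# Hypothetical student records (placeholder values).
students = pd.DataFrame({
    "Name": ["Student A", "Student B"],
    "Age": [20, 22],
    "Sex": ["F", "M"],
    "Phone number": ["555-0100", "555-0101"],
})

# Hypothetical extra field to attach to the same rows.
emails = pd.DataFrame({"E-mail": ["a@example.com", "b@example.com"]})

# axis=1 places the frames side by side, aligning rows on the index.
combined = pd.concat([students, emails], axis=1)
print(combined)
```

With the default `axis=0`, `pd.concat` would instead stack the frames as additional rows.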
Data Munging
Use-Case
Example: Youth Unemployment Data
Problem Statement
Find the percentage change in unemployed youth for every country from 2010 to 2011.
There is an approx. 3.1% increase in unemployed youth in the ‘Arab World’.
Column 1 – Country Name
Column 2 – Country Code
Column 3 – 2010
Column 4 – 2011
Column 5 – 2012
Column 6 – 2013
Column 7 – 2014
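With this column layout, the problem statement can be sketched as below. The rows are hypothetical placeholders, not values from the real data set, and the slides do not give the formula; the sketch assumes the relative change, (2011 - 2010) / 2010 * 100. If a difference in percentage points is intended instead, a plain subtraction of the two columns would replace it.

```python
import pandas as pd

# Hypothetical placeholder rows that only mimic the seven-column layout;
# these are NOT values from the real youth unemployment data set.
data = pd.DataFrame({
    "Country Name": ["Country A", "Country B"],
    "Country Code": ["AAA", "BBB"],
    "2010": [20.0, 10.0],
    "2011": [21.0, 9.0],
    "2012": [22.0, 9.5],
    "2013": [21.5, 9.0],
    "2014": [21.0, 8.5],
})

# Assumed formula: relative change from 2010 to 2011, in percent.
change = (data["2011"] - data["2010"]) / data["2010"] * 100

result = data[["Country Name", "Country Code"]].copy()
result["Change 2010-2011 (%)"] = change.round(2)
print(result)
```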
Python For Statistics
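Mean
from statistics import mean
print(mean([1,1,1,1,3,4,4,4,5,2]))  # arithmetic mean: 26 / 10 = 2.6
Median
from statistics import median
print(median([1,1,1,1,3,4,4,4,5,2]))  # average of the two middle values: 2.5
High Median
from statistics import median_high
print(median_high([1,1,1,1,3,4,4,4,5,2]))  # larger of the two middle values: 3
Low Median
from statistics import median_low
print(median_low([1,1,1,1,3,4,4,4,5,2]))  # smaller of the two middle values: 2
Mode
from statistics import mode
print(mode([1,1,1,1,3,4,4,4,5,2]))  # most frequent value: 1
Variance
from statistics import variance
print(variance([1,1,1,1,3,4,4,4,5,2]))  # sample variance (divides by n - 1)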
Python For Hadoop : Pydoop
Pydoop is a Python interface to Hadoop that allows you to write MapReduce applications and interact with HDFS in pure Python.
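To make the MapReduce idea concrete, here is a plain-Python word count that imitates the map, shuffle and reduce steps in memory. It deliberately does not use Pydoop or Hadoop; the function names and the input lines are made up for illustration, and a real Pydoop job would express the same logic through Pydoop's own mapper and reducer classes.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


lines = ["python for data analysis", "python for hadoop"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(counts)
```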
Python Applications
What Is Data Analysis
What Is Pandas
Pandas Operations
Data Analysis Use-Case
Python For Statistics And Python For Hadoop
Python For Data Analysis | Python Pandas Tutorial | Learn Python | Python Training | Edureka