Statistics in Data Science with Python
Who Am I?
Mahe Karim
Front End Developer
ID - 162-15-7770
Area of Interest:
• Full Stack Developer
• Data Analyst
• Animation
• Why Not Jump Into Passive Income? ;)
Statistics in Data Science with Python
Implementation of Our Course
• Step 1: Statistics
• Step 2: Data Science
• Step 3: Python
Basic Road to Data Science
• Statistics
• Machine Learning
• Deep Learning
• Programming Language (Python / R)
• Data Science
Smartest Way to Be a Data Scientist / Analyst
Step 1: Statistics
• Core Statistics
• Statistical Machine Learning
• Probabilistic Modeling
Step 2: Computing
• Database
• Data Mining
• Data Design
Step 3: ML
• Deep Learning
• NLP
• Data Analysis
3 steps to learning the statistics and probability required for data science:
• Core Statistics Concepts: descriptive statistics, distributions, hypothesis testing, and regression.
• Bayesian Thinking: conditional probability, priors, posteriors, and maximum likelihood.
• Intro to Statistical Machine Learning: learn basic machine learning concepts and how statistics fits in.
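As a small illustration of the first two steps, the sketch below computes a few descriptive statistics and then applies Bayes' rule to turn a prior into a posterior. The sample values and the probabilities are made up for this example and do not come from the course material.

```python
import numpy as np

# Hypothetical sample, invented for illustration only.
sample = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Descriptive statistics
mean = sample.mean()
median = np.median(sample)
std = sample.std()  # population standard deviation (ddof=0)

# Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E)
prior = 0.01           # P(H), assumed
likelihood = 0.95      # P(E | H), assumed
false_positive = 0.05  # P(E | not H), assumed

# Total probability of the evidence over both hypotheses
evidence = likelihood * prior + false_positive * (1 - prior)
posterior = likelihood * prior / evidence

print(mean, median, std, round(posterior, 3))
```

Even with a 95% likelihood, the posterior stays low because the assumed prior is small, which is the kind of reasoning the Bayesian step is meant to build.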
Verified courses include Statistics
Most Important Topics in Statistics
• Part 1 - Simple Linear Regression
• Part 2 - Multivariate Linear Regression
• Part 3 - Logistic Regression
• Part 4 - Multivariate Logistic Regression
• Part 5 - Neural Networks
• Part 6 - Support Vector Machines
• Part 7 - K-Means Clustering & PCA
• Part 8 - Anomaly Detection & Recommendation
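import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Jupyter/IPython magic: render plots inline (omit in a plain script)
%matplotlib inline
# Build the path portably; assumes the file lives at ./data/ex1data1.txt
# (the path separators were lost in the slide text)
path = os.path.join(os.getcwd(), 'data', 'ex1data1.txt')
data = pd.read_csv(path, header=None, names=['Population', 'Profit'])
data.head()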
Data Set:
data.plot(kind='scatter', x='Population', y='Profit',
figsize=(12,8))
Implementing Simple Linear Regression
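def computeCost(X, y, theta):
    # squared error between predictions (X * theta.T) and targets y
    inner = np.power(((X * theta.T) - y), 2)
    # sum over the training set, divided by 2m by convention
    return np.sum(inner) / (2 * len(X))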
# append a ones column to the front of the data set
data.insert(0, 'Ones', 1)
# set X (training data) and y (target variable)
cols = data.shape[1]
X = data.iloc[:,0:cols-1]
y = data.iloc[:,cols-1:cols]
# convert from data frames to numpy matrices
X = np.matrix(X.values)
y = np.matrix(y.values)
theta = np.matrix(np.array([0,0]))
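The slides define the cost function and later plot a fitted line from parameters named g, but the step that computes g is not shown. A minimal sketch of batch gradient descent that could fill that gap follows. It assumes plain NumPy arrays rather than np.matrix, uses synthetic data invented for this example instead of the slides' data set, and the names compute_cost and gradient_descent, the learning rate alpha, and the iteration count are all hypothetical choices.

```python
import numpy as np

def compute_cost(X, y, theta):
    """Cost J(theta) = sum((X @ theta - y)^2) / (2m)."""
    residuals = X @ theta - y
    return float(residuals @ residuals) / (2 * len(X))

def gradient_descent(X, y, theta, alpha, iters):
    """Batch gradient descent; returns fitted theta and the cost history."""
    m = len(X)
    costs = []
    for _ in range(iters):
        # Gradient of J with respect to theta
        gradient = X.T @ (X @ theta - y) / m
        theta = theta - alpha * gradient
        costs.append(compute_cost(X, y, theta))
    return theta, costs

# Synthetic data (hypothetical): y = 2 + 3x with no noise.
x = np.linspace(0.0, 1.0, 50)
X = np.column_stack([np.ones_like(x), x])  # ones column for the intercept
y = 2.0 + 3.0 * x

theta0 = np.zeros(2)
g, costs = gradient_descent(X, y, theta0, alpha=0.5, iters=5000)
print(g)
```

Here g is a 1-D array, so the intercept is g[0] and the slope is g[1]; with the np.matrix setup used in the slides the same values would be indexed as g[0, 0] and g[0, 1].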
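# g is assumed to hold the fitted parameters (a 1x2 matrix: intercept, slope),
# e.g. the result of running gradient descent; the slides do not show that step
x = np.linspace(data.Population.min(), data.Population.max(), 100)
f = g[0, 0] + (g[0, 1] * x)
fig, ax = plt.subplots(figsize=(12,8))
ax.plot(x, f, 'r', label='Prediction')
ax.scatter(data.Population, data.Profit, label='Training Data')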
ax.legend(loc=2)
ax.set_xlabel('Population')
ax.set_ylabel('Profit')
ax.set_title('Predicted Profit vs. Population Size')
Prediction ;) :D :p <3
Resources:
• https://elitedatascience.com/learn-statistics-for-data-science
• https://github.com/datasciencemasters/go
• An Introduction to Statistical Learning with Applications in R, by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani
• http://www.johnwittenauer.net/machine-learning-exercises-in-python-part-1/
• Think Stats