Naive Bayes
Techno Main Salt Lake
By
Name : Mahim Majee
Stream : CSE
Semester : 7th
Section : A
Roll No. : 13000119055
Registration No. : 016704 of 2019-20
Introduction
● Classification technique based on Bayes’ theorem
● Makes the “naive” assumption that the predictors are independent of one another given the class
● Easy to build
● Particularly useful for very large data sets
● Can outperform even highly sophisticated classification methods on some tasks, e.g. it was an early method for spam detection
Introduction - Bayes theorem
● P(c|x) - the posterior probability of the class (c, target) given the predictor (x, attributes).
● P(c) - the prior probability of the class.
● P(x|c) - the likelihood, i.e. the probability of the predictor given the class.
● P(x) - the prior probability of the predictor.
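Written out, the four quantities above are related by Bayes' theorem. The second expression is the standard naive factorisation for several predictors x_1, ..., x_n; the slide text does not spell it out, so treat it as a reconstruction rather than the author's own notation.

```latex
P(c \mid x) = \frac{P(x \mid c)\, P(c)}{P(x)}
\qquad
P(c \mid x_1, \dots, x_n) \propto P(c) \prod_{i=1}^{n} P(x_i \mid c)
```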
How Naive Bayes algorithm works
● We have a training data set of weather conditions and the corresponding target variable ‘Play’ (whether a game was played).
● We need to classify whether players will play or not based on the weather condition.
● Convert the data set into a frequency table.
● Create a likelihood table by computing the probabilities, e.g. the probability of Overcast is 0.29 and the probability of playing is 0.64.
● Use the Naive Bayes equation to calculate the posterior probability for each class. The class with the highest posterior probability is the prediction.
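The frequency-table and likelihood-table steps can be sketched in a few lines of Python. The data set itself is not reproduced in the text, so the counts below are partly an assumption: the Sunny row and the totals (14 rows, 9 Yes, 5 No, 4 Overcast) match the figures quoted in the slides, while the Yes/No split for Overcast and Rainy is assumed for illustration.

```python
# Frequency table: weather value -> count of Play = Yes / No.
# Sunny and the totals follow the slides; the other splits are ASSUMED.
counts = {
    "Sunny":    {"Yes": 3, "No": 2},
    "Overcast": {"Yes": 4, "No": 0},  # assumed split
    "Rainy":    {"Yes": 2, "No": 3},  # assumed split
}

total = sum(sum(row.values()) for row in counts.values())

# Number of rows per class, e.g. 9 "Yes" and 5 "No".
class_totals = {c: sum(row[c] for row in counts.values()) for c in ("Yes", "No")}

# Marginal probability of each weather value, P(x).
p_weather = {w: sum(row.values()) / total for w, row in counts.items()}

# Prior probability of each class, P(c).
p_class = {c: n / total for c, n in class_totals.items()}

# Likelihood table, P(x | c).
likelihood = {
    w: {c: row[c] / class_totals[c] for c in row} for w, row in counts.items()
}

print(round(p_weather["Overcast"], 2))  # 0.29
print(round(p_class["Yes"], 2))         # 0.64
```

The two printed values reproduce the Overcast probability and the probability of playing quoted in the likelihood-table step.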
Problem: Players will play if the weather is sunny. Is this statement correct?
P(Yes | Sunny) = P(Sunny | Yes) * P(Yes) / P(Sunny)
Here we have P(Sunny | Yes) = 3/9 = 0.33, P(Sunny) = 5/14 = 0.36, P(Yes) = 9/14 = 0.64
Now, P(Yes | Sunny) = 0.33 * 0.64 / 0.36 = 0.60 (high probability)
For comparison, the posterior for the opposite class:
P(No | Sunny) = P(Sunny | No) * P(No) / P(Sunny)
Here we have P(Sunny | No) = 2/5 = 0.4, P(Sunny) = 5/14 = 0.36, P(No) = 5/14 = 0.36
Now, P(No | Sunny) = 0.4 * 0.36 / 0.36 = 0.40 (low probability)
Since P(Yes | Sunny) > P(No | Sunny), the prediction is that players will play when it is sunny, so the statement is correct.
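The same arithmetic can be checked with exact fractions, which avoids the rounding in the hand calculation. This is a small sketch; the helper name `posterior` is made up for illustration, and the inputs are the fractions quoted above.

```python
from fractions import Fraction


def posterior(p_x_given_c, p_c, p_x):
    """Bayes' theorem: P(c | x) = P(x | c) * P(c) / P(x)."""
    return p_x_given_c * p_c / p_x


p_sunny = Fraction(5, 14)

# P(Yes | Sunny) from P(Sunny | Yes) = 3/9 and P(Yes) = 9/14.
p_yes_given_sunny = posterior(Fraction(3, 9), Fraction(9, 14), p_sunny)

# P(No | Sunny) from P(Sunny | No) = 2/5 and P(No) = 5/14.
p_no_given_sunny = posterior(Fraction(2, 5), Fraction(5, 14), p_sunny)

# The predicted class is the one with the larger posterior.
prediction = "Yes" if p_yes_given_sunny > p_no_given_sunny else "No"

print(p_yes_given_sunny, p_no_given_sunny, prediction)  # 3/5 2/5 Yes
```

The exact values are 3/5 and 2/5, matching the rounded 0.60 and 0.40, and the two posteriors sum to 1 as expected for a two-class problem.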
Zero Frequency Problem
● What if any of the counts is 0? The corresponding likelihood becomes 0, which forces the whole posterior for that class to 0.
● Add 1 to all counts
● This is a form of Laplace smoothing
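A minimal sketch of add-one smoothing follows. The function name and the `alpha` parameter are illustrative, and the example counts (Overcast never occurring with Play = No, out of 5 No rows and 3 weather values) are an assumption, not figures from the slides. Adding 1 to every count also adds the number of distinct values to the denominator, so the smoothed likelihoods still sum to 1.

```python
def smoothed_likelihood(count, class_total, n_values, alpha=1):
    """Estimate P(value | class) with additive (Laplace) smoothing.

    count       -- times the value was seen together with the class
    class_total -- number of training rows in the class
    n_values    -- number of distinct values the feature can take
    alpha       -- pseudo-count added to every cell (1 = add-one smoothing)
    """
    return (count + alpha) / (class_total + alpha * n_values)


# ASSUMED counts for class "No": Sunny 2, Overcast 0, Rainy 3 (5 rows in total).
no_counts = {"Sunny": 2, "Overcast": 0, "Rainy": 3}
class_total = sum(no_counts.values())

unsmoothed = no_counts["Overcast"] / class_total  # 0.0, wipes out the posterior
smoothed = {
    value: smoothed_likelihood(n, class_total, len(no_counts))
    for value, n in no_counts.items()
}

print(unsmoothed, smoothed["Overcast"])  # 0.0 0.125
```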
Using Python
# Import the Gaussian Naive Bayes model
from sklearn.naive_bayes import GaussianNB
import numpy as np

# Assign predictor and target variables
X = np.array([[-3, 7], [1, 5], [1, 2], [-2, 0], [2, 3], [-4, 0],
              [-1, 1], [1, 1], [-2, 2], [2, 7], [-4, 1], [-2, 7]])
y = np.array([3, 3, 3, 3, 4, 3, 3, 4, 3, 4, 4, 4])

# Create a Gaussian classifier
model = GaussianNB()

# Train the model using the training set
model.fit(X, y)

# Predict the class of two new points
predicted = model.predict([[1, 2], [3, 4]])
print(predicted)
Tips to improve the Naive Bayes Model
● If continuous features do not follow a normal distribution, use a transformation or another method to bring them closer to one
● If the test data set has a zero-frequency issue, apply a smoothing technique such as Laplace smoothing
● Remove correlated features: highly correlated features are effectively counted twice in the model, which can over-inflate their importance
● The Naive Bayes classifier has limited options for parameter tuning
● Ensembling (e.g. bagging or boosting) usually helps little, because Naive Bayes has very little variance to reduce
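The first tip can be sketched with a scikit-learn pipeline. The choice of `PowerTransformer` (Yeo-Johnson) is an assumption made here as one possible transformation, not something the slides prescribe, and the log-normal data is synthetic and purely illustrative.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PowerTransformer

rng = np.random.default_rng(0)

# Synthetic right-skewed (log-normal) feature for two classes; illustrative only.
X = np.concatenate([
    rng.lognormal(mean=0.0, sigma=0.5, size=100),
    rng.lognormal(mean=1.5, sigma=0.5, size=100),
]).reshape(-1, 1)
y = np.array([0] * 100 + [1] * 100)

# Make the feature more Gaussian-like before fitting Gaussian Naive Bayes.
model = make_pipeline(PowerTransformer(method="yeo-johnson"), GaussianNB())
model.fit(X, y)

# Predict the class of the first five training points.
pred = model.predict(X[:5])
print(pred)
```

Putting the transformer inside the pipeline means the same transformation learned on the training data is applied automatically at prediction time.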
Variants
● Gaussian: used for classification with continuous features; it assumes that the features follow a normal distribution.
● Multinomial: used for discrete counts. It implements the naive Bayes algorithm for multinomially distributed data and is one of the two classic naive Bayes variants used in text classification.
● Bernoulli: useful if your feature vectors are binary (i.e. zeros and ones). One application is text classification with a ‘bag of words’ model, where the 1s and 0s mean “word occurs in the document” and “word does not occur in the document” respectively.
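To make the difference between the two text-oriented variants concrete, here is a small sketch using scikit-learn's `MultinomialNB` and `BernoulliNB` on a made-up four-document corpus. The documents and labels are invented for illustration; `BernoulliNB` binarises the word counts with its default threshold, so it only sees whether a word occurs.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

# Tiny made-up corpus; illustrative only.
docs = [
    "win money now",
    "free money offer",
    "meeting at noon",
    "lunch meeting tomorrow",
]
labels = ["spam", "spam", "ham", "ham"]

# Bag-of-words counts: one column per vocabulary word.
vectorizer = CountVectorizer()
X_counts = vectorizer.fit_transform(docs)
X_new = vectorizer.transform(["free money"])

# Multinomial NB models how often each word occurs.
multinomial = MultinomialNB().fit(X_counts, labels)

# Bernoulli NB models only whether each word occurs (counts are binarised).
bernoulli = BernoulliNB().fit(X_counts, labels)

multinomial_pred = multinomial.predict(X_new)[0]
bernoulli_pred = bernoulli.predict(X_new)[0]
print(multinomial_pred, bernoulli_pred)
```

Both models use add-one smoothing by default (`alpha=1.0`), so words that never appear in a class do not zero out its posterior.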
Applications
● Real-time prediction: Naive Bayes is an eager learning classifier and it is fast, so it can be used for making predictions in real time.
● Multi-class prediction: it handles problems with more than two classes naturally.
● Text classification / spam filtering / sentiment analysis: Naive Bayes is used mostly in text classification, where it often achieves a high success rate compared with other algorithms. It is widely used in spam filtering (identifying spam e-mail) and sentiment analysis.
● Recommendation system: a Naive Bayes classifier and collaborative filtering together can build a recommendation system that uses machine learning and data mining techniques to filter unseen information and predict whether a user would like a given resource or not.
Conclusions
● The naive Bayes model is tremendously appealing because of its simplicity, elegance, and robustness.
● It is one of the oldest formal classification algorithms, and yet even in its simplest form it is often surprisingly effective.
● It is widely used in areas such as text classification and spam filtering. A large number of modifications have been introduced by the statistical, data mining, machine learning, and pattern recognition communities in an attempt to make it more flexible.
● One has to recognize, however, that such modifications are necessarily complications, which detract from its basic simplicity.