Bayesian Learning
Understanding randomness, uncertainty, and learning from data
By
Sharmila Chidaravalli
Assistant Professor
Department of ISE
Global Academy of Technology
Introduction to Probability-based Learning
Definition:
Probability-based learning is a practical learning method that combines prior knowledge
(or prior probabilities) with observed data.
Concept:
It uses probability theory to model randomness, uncertainty, and noise for predicting future
events.
Application:
Useful for handling large datasets and for making inferences using Bayes' Rule.
Probabilistic Models:
•Involve randomness and provide solutions in the form of probability distributions.
•Can handle uncertain and noisy data effectively.
Deterministic Models:
•Do not involve randomness.
•Produce the same output every time for the same initial conditions.
•Result in a single, fixed outcome.
Introduction to Probability-based Learning
Bayesian Learning
Difference from General Probabilistic Learning:
•Bayesian learning uses subjective probabilities—based on an individual’s belief
or interpretation.
•These probabilities can change over time with new information.
Key Algorithms:
•Naïve Bayes Learning
•Bayesian Belief Network (BBN)
These use prior probabilities and apply Bayes’ Rule to draw conclusions and make
predictions.
Fundamentals of Bayes Theorem
Bayes Theorem
• Goal: To determine the most probable hypothesis, given the data D plus any initial knowledge
about the prior probabilities of the various hypotheses in H.
•P(h∣D): Posterior probability of hypothesis h given data D
→ Updated belief after seeing the data
•P(D∣h): Likelihood of data D given hypothesis h
→ How well h explains the data
•P(h): Prior probability of h
→ Initial belief before seeing the data
•P(D): Evidence or marginal likelihood
→ Total probability of data under all hypotheses
Fundamentals of Bayes Theorem
Bayes Theorem
• Prior probability of h, P(h): it reflects any background knowledge we have about the chance that h is a
correct hypothesis (before having observed the data).
• Prior probability of D, P(D): it reflects the probability that training data D will be observed given no
knowledge about which hypothesis h holds.
• Conditional Probability of observation D, P(D|h): it denotes the probability of observing data D given
some world in which hypothesis h holds.
• Posterior probability of h, P(h|D): it represents the probability that h holds given the observed training
data D. It reflects our confidence that h holds after we have seen the training data D and it is the
quantity that Machine Learning researchers are interested in.
Bayes Theorem allows us to compute P(h|D):
P(h|D) = [P(D|h) × P(h)] / P(D)
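The rule is a one-line computation. As a quick illustrative sketch in Python (the numbers are taken from the worked example that follows):

```python
def posterior(likelihood, prior, evidence):
    """Bayes' rule: P(h|D) = P(D|h) * P(h) / P(D)."""
    return likelihood * prior / evidence

# Values from the example on the next slides: P(B|A) = 1/2, P(A) = 4/7, P(B) = 3/7
print(posterior(1/2, 4/7, 3/7))  # 0.666... = 2/3 = P(A|B)
```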
Bayes Theorem - Example
Fundamentals of Bayes Theorem
Step 1: Calculated Probabilities
P(A) = 4/7
P(B) = 3/7
P(B | A) = 2/4 = 1/2
P(A | B) = 2/3
Is Bayes’ Theorem correct?
Bayes Theorem - Example
Fundamentals of Bayes Theorem
Step 2: Bayes’ Theorem Verification
Bayes’ Theorem: P(A | B) = [P(B | A) × P(A)] / P(B)
Substituting values:
P(A | B) = [(1/2) × (4/7)] / (3/7) = (2/7) / (3/7) = 2/3
P(B | A) = [(2/3) × (3/7)] / (4/7) = (2/7) / (4/7) = 1/2
✅ Both results match the directly calculated values → Bayes’ Theorem holds true
Consider a boy who has a volleyball tournament the next day, but today he feels sick. Since he is a healthy boy, there is only a 40% chance that he falls sick on any given day. The boy is very interested in volleyball, so there is a 90% probability that he participates in tournaments, and a 20% probability that he falls sick given that he participates in the tournament. Find the probability that the boy participates in the tournament given that he is sick.
The probability that the boy participates in the tournament given that he is sick is:
P(Participates | Sick) = [P(Participates) × P(Sick | Participates)] / P(Sick)
Solution:
P(Participates) = 90%
P(Sick | Participates) = 20%
P(Sick) = 40%
P(Participates | Sick) = (0.9 × 0.2) / 0.4 = 0.45
Hence, there is a 45% probability that the boy participates in the tournament given that he is sick.
Assume the following probabilities: the probability of a person having Malaria is 0.02%; the probability of the test being positive, given that the person has Malaria, is 98%; and the probability of the test being negative, given that the person does not have Malaria, is 95%. Find the probability that a person has Malaria, given that the test result is positive.
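The slide leaves this one as an exercise. A minimal sketch of the calculation in Python, using the total-probability rule to obtain the evidence term:

```python
p_malaria = 0.0002        # P(Malaria) = 0.02%
p_pos_given_m = 0.98      # P(Positive | Malaria)
p_neg_given_not_m = 0.95  # P(Negative | No Malaria)
p_pos_given_not_m = 1 - p_neg_given_not_m  # P(Positive | No Malaria) = 0.05

# Evidence by total probability: P(Positive)
p_pos = p_pos_given_m * p_malaria + p_pos_given_not_m * (1 - p_malaria)

# Bayes' rule: P(Malaria | Positive)
p_m_given_pos = (p_pos_given_m * p_malaria) / p_pos
print(round(p_m_given_pos, 4))  # ~0.0039, i.e. about 0.39%
```

Despite the test's 98% sensitivity, the posterior is under 0.4% because the disease itself is so rare; this base-rate effect is exactly what Bayes' rule captures.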
Classification Using Bayes Model
Maximum A Posteriori (MAP) Hypothesis, hMAP
The learner considers a set of candidate hypotheses H and is interested in finding the most probable hypothesis h ∈ H given the observed data D.
Any such maximally probable hypothesis is called a maximum a posteriori (MAP) hypothesis, hMAP:
hMAP = argmax h∈H P(h|D) = argmax h∈H [P(D|h) P(h)] / P(D) = argmax h∈H P(D|h) P(h)
(The evidence term P(D) can be dropped because it is the same constant for every hypothesis.)
We can determine the MAP hypothesis by using Bayes’ theorem to calculate the posterior probability of each candidate hypothesis.
Maximum Likelihood (ML) Hypothesis, hML
If we assume that every hypothesis in H is equally probable a priori, i.e. P(hi) = P(hj) for all hi and hj in H, then we need only consider P(D|h) to find the most probable hypothesis.
P(D|h) is often called the likelihood of the data D given h.
Any hypothesis that maximizes P(D|h) is called a maximum likelihood (ML) hypothesis:
hML = argmax h∈H P(D|h)
NAÏVE BAYES ALGORITHM
Naïve Bayes is a supervised classification algorithm (binary or multi-class) that works on the principle of Bayes’ theorem.
It is really a family of classifiers built on a common principle: the features are assumed to be independent of one another given the class, and each feature is given equal weight.
It works particularly well on large datasets and is very fast; it is one of the simplest and most effective classification algorithms. The algorithm treats all features as independent of each other even though, in reality, they may depend on one another.
Because each feature contributes its probability value independently during classification, the algorithm is called “naïve”.
Important applications include text classification, recommendation systems, and face recognition.
NAÏVE BAYES ALGORITHM
Naïve Bayes Assumption
Problem 1:
Consider the new instance (Sunny, Cool, High, Strong). Yes or No? (Worked in the sketch below.)
Consider the new instance (Overcast, Hot, High, Strong). Yes or No?
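The training table for this problem appears only as an image in the original slides. The sketch below therefore assumes the classic 14-example PlayTennis dataset (Outlook, Temperature, Humidity, Wind → PlayTennis), from which these two test instances are drawn:

```python
# Assumed training data: the classic 14-example PlayTennis set (Mitchell)
data = [
    ("Sunny","Hot","High","Weak","No"),          ("Sunny","Hot","High","Strong","No"),
    ("Overcast","Hot","High","Weak","Yes"),      ("Rain","Mild","High","Weak","Yes"),
    ("Rain","Cool","Normal","Weak","Yes"),       ("Rain","Cool","Normal","Strong","No"),
    ("Overcast","Cool","Normal","Strong","Yes"), ("Sunny","Mild","High","Weak","No"),
    ("Sunny","Cool","Normal","Weak","Yes"),      ("Rain","Mild","Normal","Weak","Yes"),
    ("Sunny","Mild","Normal","Strong","Yes"),    ("Overcast","Mild","High","Strong","Yes"),
    ("Overcast","Hot","Normal","Weak","Yes"),    ("Rain","Mild","High","Strong","No"),
]

def naive_bayes(instance):
    scores = {}
    for label in {"Yes", "No"}:
        rows = [r for r in data if r[-1] == label]
        score = len(rows) / len(data)            # prior P(label)
        for i, value in enumerate(instance):     # likelihoods P(x_i | label)
            score *= sum(1 for r in rows if r[i] == value) / len(rows)
        scores[label] = score
    return scores

print(naive_bayes(("Sunny", "Cool", "High", "Strong")))
# {'Yes': ~0.0053, 'No': ~0.0206} -> No
print(naive_bayes(("Overcast", "Hot", "High", "Strong")))
# 'No' scores 0 because P(Overcast | No) = 0 in this data -> Yes
# (in practice Laplace smoothing avoids such zero counts)
```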
Problem 2: Assess student performance using the Naïve Bayes algorithm with the given dataset.
Predict whether a student gets a job offer in the final year of the course, given CGPA ≥ 9, Interactiveness = Yes, Practical Knowledge = Average, Communication Skills = Average.
Problem 3
New instance to be classified: (Red, SUV, Domestic). Yes or No?
Problem 4
New instance to be classified: (Single Parent, Young, Low). Yes or No?
Bayes Optimal Classifier
• Normally we ask: What is the most probable hypothesis given the training data?
• We can also ask: What is the most probable classification of the new instance given the training data?
Consider
• Three possible hypotheses:
P(h1|D) = 0.4, P(h2|D) = 0.3, P(h3|D) = 0.3
• Given the new instance x:
h1(x) = +, h2(x) = −, h3(x) = −
• What is the most probable classification of x? (See the worked sketch below.)
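Working through it (following Mitchell's classic example): the total posterior weight for + is P(h1|D) = 0.4, while for − it is P(h2|D) + P(h3|D) = 0.6, so the Bayes optimal classification is −, even though the single most probable (MAP) hypothesis h1 predicts +. A minimal sketch in Python:

```python
# Hypothesis posteriors and their predictions for x (values from the slide)
posterior = {"h1": 0.4, "h2": 0.3, "h3": 0.3}
prediction = {"h1": "+", "h2": "-", "h3": "-"}

# Bayes optimal: sum the posterior probability behind each candidate label
votes = {}
for h, p in posterior.items():
    votes[prediction[h]] = votes.get(prediction[h], 0.0) + p

print(votes)                      # {'+': 0.4, '-': 0.6}
print(max(votes, key=votes.get))  # '-' wins, although MAP hypothesis h1 says '+'
```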
Bayes Optimal Classifier
Given a hypothesis space with four hypotheses h1, h2, h3, and h4, determine whether the patient is diagnosed as COVID positive or COVID negative using the Bayes Optimal classifier.
Naïve Bayes Algorithm for Continuous Attributes (Gaussian Naïve Bayes)
Naïve Bayes classifiers work on the principle of Bayes’ Theorem, assuming conditional independence between features.
For continuous attributes, we cannot use frequency counts directly as with categorical attributes. Instead, we assume that the values follow a Gaussian (Normal) distribution.
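Concretely, for each class c the mean μc and variance σc² of each continuous attribute are estimated from the training examples of that class, and the likelihood of an observed value x is taken from the Gaussian density (the standard Gaussian Naïve Bayes estimate, stated here since the slide does not spell it out):
P(x | c) = (1 / √(2πσc²)) × exp(−(x − μc)² / (2σc²))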
Based on the numerical data below, find the gender of a person with Height = 6 ft, Weight = 130 lbs, and Foot Size = 8 inches, using the Naïve Bayes algorithm.
Gender Height (ft) Weight (lbs) FootSize (in)
Male 6.00 180 12
Male 5.92 190 11
Male 5.58 170 12
Male 5.92 165 10
Female 5.00 100 6
Female 5.50 150 8
Female 5.42 130 7
Female 5.75 150 9
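A runnable sketch of this calculation (Gaussian Naïve Bayes with sample variances and priors taken from the class counts; on this well-known dataset the instance comes out Female):

```python
import math

# Training data from the slide: (Height ft, Weight lbs, FootSize in)
data = {
    "Male":   [(6.00, 180, 12), (5.92, 190, 11), (5.58, 170, 12), (5.92, 165, 10)],
    "Female": [(5.00, 100, 6),  (5.50, 150, 8),  (5.42, 130, 7),  (5.75, 150, 9)],
}

def mean_var(xs):
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)  # sample variance
    return m, v

def gaussian_pdf(x, m, v):
    return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

test = (6.00, 130, 8)  # Height = 6 ft, Weight = 130 lbs, FootSize = 8 in
total = sum(len(rows) for rows in data.values())

scores = {}
for cls, rows in data.items():
    score = len(rows) / total                 # prior P(class) = 4/8
    for i, x in enumerate(test):
        m, v = mean_var([row[i] for row in rows])
        score *= gaussian_pdf(x, m, v)        # conditional-independence assumption
    scores[cls] = score

print(scores)                       # Female's score is orders of magnitude larger
print(max(scores, key=scores.get))  # -> Female
```

The tall height pulls toward Male, but the low weight and small foot size dominate, so the posterior for Female wins decisively.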
Analyze the student performance using the Naïve Bayes algorithm for continuous attributes. Predict whether a student with test instance (CGPA = 8.5, Interactiveness = Yes) will get a job offer in the final year.