Instance Based Learning
Instance Based Learning in machine learning
Unsupervised Learning:
Customer Segmentation: Unsupervised learning groups customers by their
buying behaviour, so a company can identify its distinct customer segments
and target its advertising at each one.
Market Basket Analysis: This explores the relationships between products
that are usually bought together, and it also drives product suggestions.
Think of a store placing peanut butter and jelly next to each other because
customers tend to buy them together.
Model-Based Learning:
• Model-based learning involves creating a mathematical model that can
predict outcomes based on input data.
• The model is trained on a large dataset and then used to make
predictions on new data.
• The model can be thought of as a set of rules that the machine uses to
make predictions.
• The model is typically created using algorithms such as linear
regression, logistic regression, decision trees, and neural networks.
• Parametric: a model is parametric if it learns by fitting the parameters
of a predefined mapping function.
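To make the contrast with instance-based methods concrete, here is a minimal sketch of model-based learning: a straight line fitted by ordinary least squares. The data points and the names `fit_line` and `predict` are illustrative assumptions, not part of the slides.

```python
# Minimal sketch of model-based (parametric) learning: fit y = slope * x + intercept.
# The data below is illustrative only.

def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Predict with the learned parameters only; no training data needed."""
    slope, intercept = model
    return slope * x + intercept

xs = [1, 2, 3, 4, 5]
ys = [2, 3, 5, 4, 7]
model = fit_line(xs, ys)      # training: the data is summarized into two numbers
print(predict(model, 3.5))    # prediction uses only those two numbers
```

Once `fit_line` has run, the training data can be discarded, which is exactly what an instance-based learner does not do.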
Instance-based learning:
• Instance-based learning, sometimes called memory-based learning, is a
family of learning algorithms that, instead of performing explicit
generalization, compare new problem instances with instances seen in
training, which have been stored in memory.
• Instead of summarizing the training data into a model, these algorithms
use the training instances themselves to make predictions.
• Lazy Learning: Unlike eager learning algorithms (which generalize the
training data into a model), instance-based learning algorithms delay
processing until a prediction is needed.
• Some instance-based learning algorithms are:
K Nearest Neighbor (KNN)
Self-Organizing Map (SOM)
Learning Vector Quantization (LVQ)
Locally Weighted Learning (LWL)
Case-Based Reasoning
KNN Algorithm:
• The K-nearest neighbours (KNN) algorithm is a supervised ML algorithm
that can be used for both classification and regression problems.
In industry it is mainly used for classification.
• Lazy learning algorithm: KNN has no specialized training phase. It
stores all the training data and uses it at prediction time.
• Non-parametric learning algorithm: KNN makes no assumptions about the
distribution of the underlying data.
• It makes predictions based on the similarity (typically distance)
between the new data point (the new instance) and the stored instances.
Classification Using KNN
NAME     AGE  GENDER  CLASS OF SPORTS
Ajay     32   0       Football
Mark     40   0       Neither
Sara     16   1       Cricket
Zaira    34   1       Cricket
Sachin   55   0       Neither
Rahul    40   0       Cricket
Pooja    20   1       Neither
Smith    15   0       Cricket
Laxmi    55   1       Football
Michael  15   0       Football
Let's find the class of a new person, Angelina, with k = 3, age 5 and
gender 1 (the gender value used in the calculation below).
We use the Euclidean distance between two points:
d = √((x2-x1)² + (y2-y1)²)
Distance between Ajay and Angelina:
d = √((age2-age1)² + (gender2-gender1)²)
d = √((5-32)² + (1-0)²)
d = √(729 + 1)
d = 27.02
Similarly, we find all the distances one by one.
Distance between Angelina and each person:
NAME     DISTANCE  CLASS (3 NEAREST)
Ajay     27.02
Mark     35.01
Sara     11.00     Cricket
Zaira    29.00
Sachin   50.01
Rahul    35.01
Pooja    15.00
Smith    10.05     Cricket
Laxmi    50.00
Michael  10.05     Football
The three nearest neighbours are Smith (Cricket), Michael (Football) and
Sara (Cricket). Two of the three play Cricket, so Angelina is assigned to
Cricket.
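The worked example above can be reproduced with a short Python sketch. The function name `knn_classify` is an assumption chosen for illustration; the rows are copied from the table, and Angelina is taken to have age 5 and gender 1 as in the distance calculation.

```python
import math
from collections import Counter

# (name, age, gender, sport) rows copied from the table above
people = [
    ("Ajay", 32, 0, "Football"), ("Mark", 40, 0, "Neither"),
    ("Sara", 16, 1, "Cricket"), ("Zaira", 34, 1, "Cricket"),
    ("Sachin", 55, 0, "Neither"), ("Rahul", 40, 0, "Cricket"),
    ("Pooja", 20, 1, "Neither"), ("Smith", 15, 0, "Cricket"),
    ("Laxmi", 55, 1, "Football"), ("Michael", 15, 0, "Football"),
]

def knn_classify(rows, age, gender, k):
    """Return the majority class among the k rows closest to (age, gender)."""
    # Sort by Euclidean distance; Python's sort is stable, so ties keep table order.
    ranked = sorted(rows, key=lambda r: math.dist((r[1], r[2]), (age, gender)))
    nearest = ranked[:k]
    votes = Counter(r[3] for r in nearest)
    return votes.most_common(1)[0][0], nearest

label, nearest = knn_classify(people, age=5, gender=1, k=3)
print([r[0] for r in nearest])  # names of the 3 nearest neighbours
print(label)                    # predicted class
```

Note that age spans a much wider range than the 0/1 gender code, so it dominates the distance. Scaling the features before computing distances is a common remedy.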
Regression Using Knn
Training data:
BRIGHTNESS  SATURATION  CLASS
40          20          Red
50          50          Blue
60          90          Blue
10          25          Red
70          70          Blue
60          10          Red
25          80          Blue
New instance to classify (K=5):
BRIGHTNESS  SATURATION  CLASS
20          35          ?
Euclidean distance from the new instance to each training instance:
BRIGHTNESS  SATURATION  CLASS  DISTANCE
40          20          Red    25
50          50          Blue   33.54
60          90          Blue   68.01
10          25          Red    14.14
70          70          Blue   61.03
60          10          Red    47.17
25          80          Blue   45.28
Sorted by distance:
BRIGHTNESS  SATURATION  CLASS  DISTANCE
10          25          Red    14.14
40          20          Red    25
50          50          Blue   33.54
25          80          Blue   45.28
60          10          Red    47.17
70          70          Blue   61.03
60          90          Blue   68.01
Among the 5 nearest neighbours, 3 are Red and 2 are Blue, so the new
instance is classified as Red:
BRIGHTNESS  SATURATION  CLASS
40          20          Red
50          50          Blue
60          90          Blue
10          25          Red
70          70          Blue
60          10          Red
25          80          Blue
20          35          Red
How it Works:
Training Phase:
In k-NN, there is no explicit training phase. The algorithm simply stores
the training data.
Prediction Phase:
When a new instance is introduced for prediction, the algorithm follows these steps:
• Compute Distances: Calculate the distance between the new instance and every instance in the
training set. Common distance metrics include Euclidean and Manhattan distance for continuous variables,
and Hamming distance for categorical variables.
• Identify Neighbors: Select the 'k' instances from the training set that are closest to the new instance (the
'k' nearest neighbors).
• Aggregate the Output:
For classification: Perform a majority vote among the 'k' nearest neighbors. The class that appears most
frequently among the neighbours is assigned to the new instance.
For regression: Calculate the average of the values of the 'k' nearest neighbors and assign this average to
the new instance.
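The three prediction steps can be sketched as one small function with a pluggable distance metric. The names `knn_predict`, `euclidean`, `manhattan` and `hamming` are assumptions made for this illustration; the data is the brightness/saturation table from the earlier example.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def hamming(a, b):
    # Number of positions where two categorical vectors differ
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train, query, k, metric=euclidean, task="classification"):
    """train is a list of (features, target) pairs."""
    # Steps 1 and 2: rank all stored instances by distance, keep the k closest
    neighbours = sorted(train, key=lambda pair: metric(pair[0], query))[:k]
    targets = [target for _, target in neighbours]
    # Step 3: aggregate by majority vote (classification) or mean (regression)
    if task == "classification":
        return Counter(targets).most_common(1)[0][0]
    return sum(targets) / len(targets)

colours = [
    ((40, 20), "Red"), ((50, 50), "Blue"), ((60, 90), "Blue"),
    ((10, 25), "Red"), ((70, 70), "Blue"), ((60, 10), "Red"),
    ((25, 80), "Blue"),
]
print(knn_predict(colours, (20, 35), k=5))
print(knn_predict(colours, (20, 35), k=5, metric=manhattan))
```

Every call scans the whole training set, which is the cost of having no training phase.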
Step 1: Dataset and New Point
Dataset:
x  y
1  2
2  3
3  5
4  4
5  7
New Point:
x_new = 3.5
Distance between the new instance and each sample in the dataset:
D1 = sqrt((3.5-1)**2) = 2.5
D2 = 1.5
D3 = 0.5
D4 = 0.5
D5 = 1.5
Select 3 nearest neighbours
(x3,y3)=(3,5)
(x4,y4)=(4,4)
(x2,y2)=(2,3) (x5 is tied with x2 at distance 1.5; x2 is chosen here)
Compute Weights
Weights are the inverse of the distances. To avoid division by zero, we add a
small value (0.00001) to the distances.
W3 = 1/(0.5+0.00001) = 1.99996
W4 = 1/(0.5+0.00001) = 1.99996
W2 = 1/(1.5+0.00001) = 0.66666
Compute Weighted Average
Compute the weighted sum of the target values and the sum of weights:
Weighted sum of Y = (5 * 1.99996) + (4 * 1.99996) + (3 * 0.66666) = 19.99962
Sum of weights = 1.99996 + 1.99996 + 0.66666 = 4.66658
Weighted average = 19.99962/4.66658 = 4.2857, so the prediction is the point (3.5, 4.2857).
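The calculation above can be checked with a short sketch. The function name `weighted_knn_regress` and the parameter name `eps` are assumptions for illustration; the data, k = 3 and the small constant 0.00001 come from the example.

```python
def weighted_knn_regress(train, x_new, k, eps=1e-5):
    """Distance-weighted k-NN regression for one-dimensional inputs."""
    # Rank by distance; the sort is stable, so the tie between x=2 and x=5
    # is broken in favour of x=2, as in the worked example.
    nearest = sorted(train, key=lambda p: abs(p[0] - x_new))[:k]
    # Weight = inverse distance, with eps added to avoid division by zero
    weights = [1.0 / (abs(x - x_new) + eps) for x, _ in nearest]
    weighted_sum = sum(w * y for w, (_, y) in zip(weights, nearest))
    return weighted_sum / sum(weights)

data = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 7)]
prediction = weighted_knn_regress(data, 3.5, k=3)
print(round(prediction, 4))  # 4.2857
```

Had the tie been broken in favour of x=5 instead, the prediction would change, so a real implementation should document its tie-breaking rule.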
• Once we add distance weighting, there is really no harm in allowing all
training examples to influence the classification of the query instance x_q,
because very distant examples will have very little effect on f(x_q).
• If all training examples are considered, the algorithm is a global method
(known as Shepard's method); if only the nearest examples are considered, it
is a local method.
• The only drawback of considering all examples is that the classifier will
run more slowly.
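A minimal sketch of the global variant follows, in which every stored example contributes to the prediction. The function name `shepard_predict` and the `power` parameter are assumptions for illustration (power=1 gives plain inverse-distance weights; power=2 is another common choice). The five points are the one-dimensional regression dataset from the worked example, repeated so the block is self-contained.

```python
def shepard_predict(train, x_query, power=1):
    """Inverse-distance-weighted average over ALL training examples."""
    numerator = 0.0
    denominator = 0.0
    for x, y in train:
        d = abs(x - x_query)
        if d == 0:
            # Query coincides with a stored example: return its value directly
            return float(y)
        w = 1.0 / d ** power
        numerator += w * y
        denominator += w
    return numerator / denominator

data = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 7)]
print(shepard_predict(data, 3.5))           # all 5 points vote, weighted by 1/d
print(shepard_predict(data, 3.5, power=2))  # distant points count even less
```

Because the loop visits every stored example for every query, prediction time grows linearly with the size of the training set.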
CASE-BASED REASONING:
• Instance-based methods such as k-NEAREST NEIGHBOR share three key properties:
(1) they are lazy learning methods, (2) they classify new query instances by
analysing similar instances while ignoring instances that are very different from the
query, and (3) they represent instances as real-valued points in an n-dimensional
Euclidean space.
• Case-based reasoning (CBR) is a learning paradigm based on the first two of these
principles, but not the third.
• In CBR, instances are typically represented using richer symbolic descriptions,
and the methods used to retrieve similar instances are correspondingly more
elaborate.
• CBR has been applied to problems such as:
• Conceptual design of mechanical devices based on a stored library of previous designs.
• Reasoning about new legal cases based on previous rulings.
• Solving planning and scheduling problems by reusing and combining
portions of previous solutions to similar problems.
• The CADET system: CADET is a Case-based Design Tool. It aids the conceptual
design of electro-mechanical devices and is based on the paradigm of
Case-based Reasoning.
• CADET uses a library containing approximately 75 previous designs and design
fragments to suggest conceptual designs that meet the specifications of new
design problems. Each instance stored in memory (e.g., a water pipe) is
represented by describing both its structure and its qualitative function.
• Given the functional specification for a new design problem, CADET
searches its library for stored cases whose functional descriptions match
the design problem.
• If an exact match is found, indicating that some stored case implements
exactly the desired function, then this case can be returned as a suggested
solution to the design problem.
• If no exact match occurs, CADET may find cases that match various
subgraphs of the desired functional specification.
• For example, the T-junction function matches a subgraph of the water
faucet function graph.
• By retrieving multiple cases that match different subgraphs, the entire
design can sometimes be pieced together.
• Piecing a design together may also require backtracking on earlier
choices of design subgoals and, therefore, rejecting cases that were
previously retrieved.
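The retrieval idea (exact match first, otherwise partial matches on subgraphs) can be illustrated with a toy sketch. Everything here is hypothetical: the case names, the edge labels and the `retrieve` function are invented for illustration and are not CADET's actual representation. In particular, the sketch assumes quantity names are already aligned, whereas real subgraph matching must also handle renaming.

```python
from typing import Dict, FrozenSet, List, Tuple

# A function graph as a set of edges: (source quantity, qualitative sign, target quantity)
Edge = Tuple[str, str, str]

# Hypothetical case library; names and edges are made up for this sketch
LIBRARY: Dict[str, FrozenSet[Edge]] = {
    "t_junction": frozenset({("Q1", "+", "Q3"), ("Q2", "+", "Q3")}),
    "heater": frozenset({("power", "+", "T_out")}),
}

def retrieve(spec: FrozenSet[Edge]) -> Tuple[List[str], List[str]]:
    """Return (exact matches, partial matches) for a functional specification."""
    # Exact match: a stored case implements exactly the desired function
    exact = [name for name, graph in LIBRARY.items() if graph == spec]
    # Partial match: a stored case covers a proper subgraph of the specification
    partial = [name for name, graph in LIBRARY.items() if graph < spec]
    return exact, partial

# Hypothetical specification that contains the T-junction as a subgraph
spec = frozenset({
    ("Q1", "+", "Q3"), ("Q2", "+", "Q3"), ("valve", "+", "Q1"),
})
exact, partial = retrieve(spec)
print(exact)    # no stored case implements the whole specification
print(partial)  # cases that could be combined into a full design
```

A fuller system would then try to combine the partial matches and backtrack when a combination fails to cover the specification.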
