Venkat Java Projects
Mobile:+91 9966499110
Visit:www.venkatjavaprojects.com Email:venkatjavaprojects@gmail.com
Prediction of Quality for Different Type of Wine based on Different
Feature Sets Using Supervised Machine Learning Techniques
In this paper, the author predicts wine quality using supervised machine
learning algorithms such as SVM, Random Forest, and Naïve Bayes. The prediction
accuracy of these algorithms can be improved by adding a feature selection
algorithm such as the Genetic Algorithm (GA) or Simulated Annealing (SA).
Feature selection algorithms are applied to the dataset to remove non-relevant
attributes or missing values and keep only those attributes that are important
for making predictions. By removing non-relevant data, feature selection
reduces the dataset size, which can make prediction both more accurate and
faster.
The Genetic Algorithm works in a way similar to selection on chromosomes:
relevant genes are carried forward to form the next generation, and unhealthy
or non-relevant genes are removed. GA iterates repeatedly over the dataset
attributes, applying fitness evaluation, reproduction, and mutation. Only
those attributes that have high fitness (that is, that contribute most to
predicting the dataset values) are used for mutation and reproduction; unfit
attributes are removed.
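
The GA loop described above can be sketched as follows. This is a minimal
illustration, not the project's actual code: the function name
genetic_feature_selection, the population size, the mutation rate, and the toy
fitness function are all assumptions. In the real project the fitness would
presumably be the accuracy of a classifier trained on the selected attributes.

```python
import random

def genetic_feature_selection(n_features, fitness, pop_size=20, generations=30,
                              mutation_rate=0.1, seed=0):
    """Return the best feature mask (list of 0/1) found by a simple GA."""
    rng = random.Random(seed)
    # Each chromosome is a bit mask: 1 keeps the attribute, 0 drops it.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Fitness: rank the masks and keep the fitter half as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            # Reproduction: single-point crossover of two distinct parents.
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = parents + children
        best = max(population + [best], key=fitness)
    return best

# Hypothetical fitness: attributes 0, 2 and 5 are "relevant" (+1 each),
# every other selected attribute costs 0.5.
RELEVANT = {0, 2, 5}

def toy_fitness(mask):
    return sum(1.0 if i in RELEVANT else -0.5 for i, g in enumerate(mask) if g)

best_mask = genetic_feature_selection(8, toy_fitness)
print(best_mask, toy_fitness(best_mask))
```

The returned mask marks which attributes to keep; masks with low fitness never
become parents, which is the "unfit attributes are removed" step.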
Simulated annealing (SA) is a global search/selection method that makes small
random changes (i.e. perturbations) to an initial candidate solution (here, a
set of dataset attributes). If the performance value for the perturbed (new)
solution is better than that of the previous solution, the new solution is
accepted. If not, an acceptance probability is determined based on the
difference between the two performance values and the current iteration of the
search. In this way, a sub-optimal solution can be accepted on the off chance
that it may eventually produce a better solution, or a better set of
attributes, in subsequent iterations.
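
A minimal sketch of this acceptance rule is shown below. The exact probability
formula is an assumption (exp(-difference * iteration) is one simple choice
that shrinks with both the performance gap and the iteration number), and the
names simulated_annealing and toy_score are hypothetical.

```python
import math
import random

def simulated_annealing(n_features, score, iterations=200, seed=0):
    """Search for a good feature mask; higher score is better."""
    rng = random.Random(seed)
    current = [rng.randint(0, 1) for _ in range(n_features)]
    current_score = score(current)
    best, best_score = current[:], current_score
    for it in range(1, iterations + 1):
        # Perturbation: flip one randomly chosen attribute in or out.
        candidate = current[:]
        flip = rng.randrange(n_features)
        candidate[flip] = 1 - candidate[flip]
        cand_score = score(candidate)
        if cand_score >= current_score:
            accepted = True
        else:
            # Worse candidate: accept with a probability that decreases with
            # the performance gap and with the iteration number (assumed form).
            prob = math.exp(-(current_score - cand_score) * it)
            accepted = rng.random() < prob
        if accepted:
            current, current_score = candidate, cand_score
        if current_score > best_score:
            best, best_score = current[:], current_score
    return best, best_score

# Hypothetical score: attributes 1 and 4 help (+1 each), the rest cost 0.5.
def toy_score(mask):
    return sum(1.0 if i in (1, 4) else -0.5 for i, g in enumerate(mask) if g)

best, best_score = simulated_annealing(6, toy_score)
print(best, best_score)
```

Early in the search a worse mask is accepted fairly often, which lets the
search leave a poor starting point; later iterations accept almost only
improvements.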
SVM Algorithm: Machine learning involves predicting and classifying data, and
to do so we employ various machine learning algorithms according to the
dataset. SVM, or Support Vector Machine, is a supervised model for
classification and regression problems. In its basic form it is a linear model,
and with kernel functions it can also solve non-linear problems; it works well
for many practical problems. The idea of SVM is simple: the algorithm creates a
line or a hyperplane which separates the data into classes. In machine
learning, the radial basis function kernel, or RBF kernel, is a popular kernel
function used in various kernelized learning algorithms. In particular, it is
commonly used in support vector machine classification. As a simple example,
for a classification task with only two features (like the image above), you
can think of a hyperplane as a line that linearly separates and classifies a
set of data.
Intuitively, the further from the hyperplane our data points lie, the more
confident we are that they have been correctly classified. We therefore want
our data points to be as far away from the hyperplane as possible, while still
being on the correct side of it.
So when new testing data is added, whichever side of the hyperplane it lands
on decides the class that we assign to it.
How do we find the right hyperplane?
Or, in other words, how do we best segregate the two classes within the data?
The distance between the hyperplane and the nearest data point from either set
is known as the margin. The goal is to choose a hyperplane with the greatest
possible margin between the hyperplane and any point within the training set,
giving a greater chance of new data being classified correctly.
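
To make the margin concrete, the sketch below computes it for two candidate
hyperplanes over a small made-up set of two-feature points. The points and both
hyperplanes are hypothetical, and this only evaluates given hyperplanes; it does
not perform the optimization an SVM solver uses to find the best one.

```python
import math

def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def margin(w, b, points):
    """Distance from the hyperplane to the nearest point of either class."""
    return min(abs(signed_distance(w, b, x)) for x in points)

def classify(w, b, x):
    """The side of the hyperplane a point lands on decides its class."""
    return 1 if signed_distance(w, b, x) >= 0 else -1

# Hypothetical training points: the first two belong to class -1,
# the last two to class +1.
points = [(1.0, 1.0), (2.0, 3.0), (4.0, 5.0), (5.0, 4.0)]

diagonal = ((1.0, 1.0), -7.0)   # the line x + y = 7
vertical = ((1.0, 0.0), -3.0)   # the line x = 3

margin_diagonal = margin(*diagonal, points)
margin_vertical = margin(*vertical, points)
print(margin_diagonal, margin_vertical)
```

Both lines separate the two classes, but the diagonal one leaves a larger
margin (about 1.41 versus 1.0), so it is the one a maximum-margin classifier
would prefer among these two.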
Random Forest Algorithm: This is an ensemble algorithm, which means that
internally it combines multiple classifiers to build a more accurate
classifier model. Internally it uses the decision tree algorithm to generate
its trained model for classification.
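
The ensemble idea can be sketched as below. This is a deliberately simplified
stand-in, not a full random forest: each "tree" is a one-split decision stump
trained on a bootstrap sample and one randomly chosen feature, and the final
class is the majority vote. All function names and the data are hypothetical.

```python
import random
from collections import Counter

def train_stump(rows, labels, feature):
    """Pick the threshold on one feature that misclassifies the fewest rows."""
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        for low, high in ((0, 1), (1, 0)):
            errors = sum((low if r[feature] <= threshold else high) != y
                         for r, y in zip(rows, labels))
            if best is None or errors < best[0]:
                best = (errors, threshold, low, high)
    _, threshold, low, high = best
    return lambda r: low if r[feature] <= threshold else high

def train_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    n = len(rows)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        feature = rng.randrange(len(rows[0]))        # random feature choice
        trees.append(train_stump([rows[i] for i in idx],
                                 [labels[i] for i in idx], feature))
    return trees

def predict(trees, row):
    """Majority vote over all trees."""
    votes = Counter(tree(row) for tree in trees)
    return votes.most_common(1)[0][0]

# Hypothetical two-feature data: class 0 clusters low, class 1 clusters high.
rows = [(1, 1), (2, 1), (1, 2), (2, 2), (8, 8), (9, 8), (8, 9), (9, 9)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

forest = train_forest(rows, labels)
print(predict(forest, (1, 1)), predict(forest, (9, 9)))
```

An individual stump can be wrong when its bootstrap sample is unlucky; voting
across many of them is what makes the combined model more reliable.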
Naive Bayes: Naive Bayes, one of the most commonly used algorithms for
classification problems, is a simple probabilistic classifier based on Bayes'
Theorem. It determines the probability of each feature occurring in each class
and returns the class with the highest probability.
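
As an illustration, here is a small Gaussian Naive Bayes sketch. The Gaussian
likelihood is an assumption (the text does not say which variant the project
uses), and the feature values and the "good"/"bad" labels are made up for the
example.

```python
import math
from collections import defaultdict

def fit(rows, labels):
    """Estimate the class prior and per-feature mean/variance for each class."""
    grouped = defaultdict(list)
    for row, y in zip(rows, labels):
        grouped[y].append(row)
    model = {}
    for y, members in grouped.items():
        stats = []
        for column in zip(*members):
            mean = sum(column) / len(column)
            # Small constant avoids division by zero for constant features.
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[y] = (len(members) / len(rows), stats)
    return model

def predict(model, row):
    """Return the class with the highest (log) posterior probability."""
    def log_posterior(y):
        prior, stats = model[y]
        total = math.log(prior)
        for v, (mean, var) in zip(row, stats):
            total += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)

# Hypothetical rows of (alcohol, volatile acidity) with made-up labels.
rows = [(12.0, 0.30), (12.8, 0.28), (11.9, 0.35),
        (9.2, 0.70), (9.5, 0.85), (9.0, 0.78)]
labels = ["good", "good", "good", "bad", "bad", "bad"]

model = fit(rows, labels)
print(predict(model, (12.5, 0.30)), predict(model, (9.0, 0.80)))
```

The "naive" part is the sum inside log_posterior: each feature contributes
independently, as if the features were unrelated given the class.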
Dataset Information
We downloaded the wine dataset from the UCI Machine Learning website and saved
it inside the dataset folder. Each machine learning algorithm takes the dataset
and trains a model by splitting the dataset into a train part and a test part.
The train part is used to train the model, and the trained model is then
applied to the test part to predict its class values.
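
The split-and-evaluate step can be sketched as below. The 20% test fraction,
the fixed seed, and the helper names are assumptions; the text does not state
the split ratio the project uses.

```python
import random

def train_test_split(rows, labels, test_fraction=0.2, seed=42):
    """Shuffle the row indices and hold out a fraction of them for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(round(len(rows) * test_fraction)))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(idx, seq):
        return [seq[i] for i in idx]

    return (pick(train_idx, rows), pick(test_idx, rows),
            pick(train_idx, labels), pick(test_idx, labels))

def accuracy(predicted, actual):
    """Fraction of test rows whose predicted class matches the true class."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical data: 10 single-feature rows with alternating labels.
rows = [(i,) for i in range(10)]
labels = [i % 2 for i in range(10)]

X_train, X_test, y_train, y_test = train_test_split(rows, labels)
print(len(X_train), len(X_test))
```

The accuracy percentages reported for each algorithm would be computed in this
way: predictions on the held-out test part compared against its true labels.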
Screenshots
To run this project, double-click the ‘run.bat’ file to get the screen below.
In the above screen, click the ‘upload White/Red Wine Dataset’ button to upload
the red or white wine dataset.
In the above screen I am uploading the red wine dataset; after the upload, the
screen below appears.
Now click the ‘Run SVM with GA’ button to run the SVM algorithm with genetic
feature selection. After clicking this button, 5 empty windows will open; just
close all 5 windows and keep the original one running.
In the above screen we got 60% accuracy for SVM with GA. Now run SVM with the
SA (Simulated Annealing) algorithm.
In the above screen, SVM with SA got 50% accuracy. Now run Random Forest with
GA.
Random Forest with GA got 30% accuracy. Now run Random Forest with SA.
In the above screen, Random Forest with SA got the same accuracy (30%). Now
click on Naïve Bayes with GA.
In the above screen, Naïve Bayes with GA got 40% accuracy. Now run Naïve Bayes
with SA.
Naïve Bayes with SA got 40% accuracy. Now click the ‘Accuracy Graph’ button to
get the accuracy graph for all algorithms.
In the above graph, the x-axis represents the algorithm name and the y-axis
represents the accuracy of each algorithm. From the graph we can conclude that
SVM with GA achieved better accuracy (60%) than all the other algorithms.
