Venkat Java Projects
Mobile: +91 9966499110
Visit: www.venkatjavaprojects.com Email: venkatjavaprojects@gmail.com
Performance Analysis of Machine Learning Algorithms on Self Localization Systems
In this paper the author uses SVM (Support Vector Machine), Decision Tree Classifier, K-Neighbors Classifier, Naïve Bayes, Random Forest Classifier, Bagging Classifier, AdaBoost Classifier and MLP Classifier.
All the algorithms build a model from the training dataset, and new data is then applied to the trained model to predict its class. The Random Forest algorithm gives better prediction accuracy than all the other algorithms.
Support vector machine:
Machine learning involves predicting and classifying data, and to do so we employ various machine learning algorithms according to the dataset. SVM, or Support Vector Machine, is a linear model for classification and regression problems. It can solve linear and non-linear problems and works well for many practical problems. The idea of SVM is simple: the algorithm creates a line or a hyperplane which separates the data into classes. In machine learning, the radial basis function kernel, or RBF kernel, is a popular kernel function used in various kernelized learning algorithms; in particular, it is commonly used in support vector machine classification. As a simple example, for a classification task with only two features, you can think of a hyperplane as a line that linearly separates and classifies a set of data.
Intuitively, the further from the hyperplane our data points lie, the more confident we are that they have been correctly classified. We therefore want our data points to be as far away from the hyperplane as possible, while still being on the correct side of it.
So when new testing data is added, whichever side of the hyperplane it lands on decides the class that we assign to it.
How do we find the right hyperplane?
Or, in other words, how do we best segregate the two classes within the data?
The distance between the hyperplane and the nearest data point from either set is known as the margin. The goal is to choose a hyperplane with the greatest possible margin between the hyperplane and any point within the training set, giving a greater chance of new data being classified correctly.
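The ideas above can be sketched with scikit-learn's SVC. This is only an illustrative sketch: the iris dataset and the parameter values below are stand-ins chosen for the example, not the paper's 'student data' setup.

```python
# Illustrative sketch only: iris and these parameters are stand-ins,
# not the paper's 'student data' configuration.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# kernel='rbf' is the radial basis function kernel mentioned above;
# it lets the SVM separate classes that are not linearly separable.
model = SVC(kernel='rbf', C=1.0, gamma='scale')
model.fit(X_train, y_train)

acc = model.score(X_test, y_test)
print(f"SVM test accuracy: {acc:.2f}")
```

The C parameter trades margin width against training errors, so in practice it is tuned (e.g. with GridSearchCV) rather than fixed at 1.0.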
Naïve Bayes Classifier Algorithm
It would be difficult, and practically impossible, to classify a web page, a document, an email or any other lengthy text manually. This is where the Naïve Bayes Classifier machine learning algorithm comes to the rescue. A classifier is a function that assigns an element of a population to one of the available categories. For instance, spam filtering is a popular application of the Naïve Bayes algorithm: the spam filter is a classifier that assigns a label "Spam" or "Not Spam" to each email.
The Naïve Bayes Classifier is among the most popular learning methods grouped by similarities. It works on the well-known Bayes theorem of probability to build machine learning models, particularly for disease prediction and document classification. It is a simple classification of words based on Bayes' probability theorem for subjective analysis of content.
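The spam-filter idea can be sketched with scikit-learn's MultinomialNB. The tiny corpus below is invented purely for illustration; it is not the paper's data.

```python
# Toy sketch of a Naive Bayes spam filter; the emails and labels
# below are made up for illustration only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

emails = ["win money now", "cheap money offer",
          "meeting at noon", "project status report"]
labels = ["Spam", "Spam", "Not Spam", "Not Spam"]

vec = CountVectorizer()                 # bag-of-words counts per email
X = vec.fit_transform(emails)
clf = MultinomialNB().fit(X, labels)    # applies Bayes' theorem per word

pred = clf.predict(vec.transform(["win cheap money"]))[0]
print(pred)  # → Spam
```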
Decision tree:
A decision tree is a graphical representation that uses a branching methodology to exemplify all possible outcomes of a decision, based on certain conditions. In a decision tree, an internal node represents a test on an attribute, each branch of the tree represents an outcome of the test, and a leaf node represents a particular class label, i.e. the decision made after computing all of the attributes.
The classification rules are represented by the paths from the root to the leaf nodes.
Types of Decision Trees
Classification Trees - These are considered the default kind of decision tree, used to separate a dataset into different classes based on the response variable. They are generally used when the response variable is categorical in nature.
Regression Trees - When the response or target variable is continuous or numerical, regression trees are used. These are generally used for predictive problems rather than classification.
Decision trees can also be classified into two types based on the type of target variable: Continuous Variable Decision Trees and Binary Variable Decision Trees. It is the target variable that helps decide what kind of decision tree is required for a particular problem.
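The root-to-leaf rules can be seen directly by printing a small classification tree. In this sketch the iris dataset is a stand-in; the depth limit is an illustrative choice.

```python
# Sketch: fit a shallow classification tree and print its rules,
# one root-to-leaf path per rule. Iris is a stand-in dataset.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(data.data, data.target)

# Each internal node is a test on an attribute; each leaf is a class label
rules = export_text(tree, feature_names=list(data.feature_names))
print(rules)
```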
Random forest:
Random Forest is a go-to machine learning algorithm that uses a bagging approach to create a collection of decision trees, each built from a random subset of the data. A model is trained several times on random samples of the dataset to achieve good prediction performance. In this ensemble learning method, the outputs of all the decision trees in the random forest are combined to make the final prediction. The final prediction is derived by polling the results of each decision tree, i.e. by going with the prediction that appears the most times among the trees.
For instance, if 5 friends decide that you will like restaurant R but only 2 friends decide that you will not like the restaurant, then the final prediction is that you will like restaurant R, as the majority always wins.
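The majority-vote idea can be made concrete by inspecting the individual trees of a small forest. Iris and the tree count here are illustrative stand-ins, not the paper's setup.

```python
# Sketch: each tree in the forest votes; the forest returns the
# plurality class. Dataset and tree count are illustrative choices.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=7, random_state=1)
forest.fit(X, y)

sample = X[:1]                       # one flower to classify
votes = [int(t.predict(sample)[0]) for t in forest.estimators_]
pred = int(forest.predict(sample)[0])
print(votes, "-> majority:", pred)
```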
K-nearest neighbor:
The k-nearest neighbors algorithm (k-NN) is a non-parametric method used for classification and regression. In both cases, the input consists of the k closest training examples in the feature space. The output depends on whether k-NN is used for classification or regression:
 In k-NN classification, the output is a class membership. An object is classified by a plurality vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors (k is a positive integer, typically small). If k = 1, then the object is simply assigned to the class of its single nearest neighbor.
 In k-NN regression, the output is the property value for the object. This value is the average of the values of its k nearest neighbors.
K-NN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is deferred until classification.
Both for classification and regression, a useful technique can be to assign weights to the contributions of the neighbors, so that the nearer neighbors contribute more to the average than the more distant ones. For example, a common weighting scheme consists in giving each neighbor a weight of 1/d, where d is the distance to the neighbor.
The neighbors are taken from a set of objects for which the class (for k-NN classification) or the object property value (for k-NN regression) is known. This can be thought of as the training set for the algorithm, though no explicit training step is required.
A peculiarity of the k-NN algorithm is that it is sensitive to the local structure of the data.
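A sketch of k-NN with the 1/d weighting scheme mentioned above, which scikit-learn exposes as weights='distance'. The dataset, k, and split are illustrative choices.

```python
# Sketch: k-NN classification with distance (1/d) weighting.
# Iris and k=5 are illustrative stand-ins.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# weights='distance' gives each neighbor a weight of 1/d
knn = KNeighborsClassifier(n_neighbors=5, weights='distance')
knn.fit(X_train, y_train)   # lazy learning: fit() just stores the examples

acc = knn.score(X_test, y_test)
print(f"k-NN test accuracy: {acc:.2f}")
```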
Bagging classifier:
A Bagging classifier is an ensemble meta-estimator that fits base classifiers, each on a random subset of the original dataset, and then aggregates their individual predictions (either by voting or by averaging) to form a final prediction. Such a meta-estimator can typically be used to reduce the variance of a black-box estimator (e.g., a decision tree) by introducing randomization into its construction procedure and then making an ensemble out of it.
Each base classifier is trained in parallel on a training set generated by randomly drawing, with replacement, N examples (or data points) from the original training dataset, where N is the size of the original training set. The training set for each base classifier is independent of the others. Many of the original data points may be repeated in a resulting training set, while others may be left out.
Bagging reduces overfitting (variance) by averaging or voting; however, this leads to an increase in bias, which is compensated for by the reduction in variance.
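The bootstrap-and-aggregate procedure above can be sketched with scikit-learn's BaggingClassifier, whose default base estimator is a decision tree. Iris and the ensemble size are illustrative stand-ins.

```python
# Sketch: bagging over decision trees. bootstrap=True draws each base
# classifier's training set with replacement, as described above.
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier

X, y = load_iris(return_X_y=True)
bag = BaggingClassifier(n_estimators=10,   # 10 base decision trees
                        bootstrap=True,    # sample N examples with replacement
                        random_state=0)
bag.fit(X, y)

acc = bag.score(X, y)
print(f"Bagging training accuracy: {acc:.2f}")
```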
AdaBoost:
Adaptive boosting (AdaBoost) is a machine learning meta-algorithm formulated by Yoav Freund and Robert Schapire. It can be used in conjunction with many other types of learning algorithms to improve performance. The output of the other learning algorithms ('weak learners') is combined into a weighted sum that represents the final output of the boosted classifier. AdaBoost is adaptive in the sense that subsequent weak learners are tweaked in favor of those instances misclassified by previous classifiers. AdaBoost is sensitive to noisy data and outliers, but in some problems it can be less susceptible to overfitting than other learning algorithms. The individual learners can be weak, but as long as the performance of each one is slightly better than random guessing, the final model can be proven to converge to a strong learner.
Every learning algorithm tends to suit some problem types better than others, and typically has many different parameters and configurations to adjust before it achieves optimal performance on a dataset; AdaBoost with decision trees as the weak learners is often referred to as the best out-of-the-box classifier. When used with decision tree learning, information gathered at each stage of the AdaBoost algorithm about the relative 'hardness' of each training sample is fed into the tree-growing algorithm, so that later trees tend to focus on harder-to-classify examples.
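The boosting loop can be sketched with scikit-learn's AdaBoostClassifier, whose default weak learner is a depth-1 decision tree (a "stump"). Iris and the round count are illustrative stand-ins.

```python
# Sketch: AdaBoost over decision stumps (the default weak learner);
# each round re-weights the examples earlier stumps misclassified.
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier

X, y = load_iris(return_X_y=True)
ada = AdaBoostClassifier(n_estimators=50, random_state=0)
ada.fit(X, y)

acc = ada.score(X, y)
print(f"AdaBoost training accuracy: {acc:.2f}")
```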
Multilayer perceptron (MLP):
A multilayer perceptron (MLP) is a class of feedforward artificial neural network (ANN). The term MLP is used ambiguously: sometimes loosely, to refer to any feedforward ANN, and sometimes strictly, to refer to networks composed of multiple layers of perceptrons (with threshold activation). Multilayer perceptrons are sometimes colloquially referred to as "vanilla" neural networks, especially when they have a single hidden layer.
An MLP consists of at least three layers of nodes: an input layer, a hidden layer and an output layer. Except for the input nodes, each node is a neuron that uses a nonlinear activation function. An MLP utilizes a supervised learning technique called backpropagation for training. Its multiple layers and non-linear activation distinguish an MLP from a linear perceptron; it can distinguish data that is not linearly separable.
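The three-layer structure above maps directly onto scikit-learn's MLPClassifier. The dataset, hidden-layer size, and solver below are illustrative choices, not the paper's configuration.

```python
# Sketch: input layer -> one hidden layer of 10 ReLU neurons -> output
# layer, trained with backpropagation. Iris is a stand-in dataset.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1)

mlp = MLPClassifier(hidden_layer_sizes=(10,), activation='relu',
                    solver='lbfgs', max_iter=2000, random_state=1)
mlp.fit(X_train, y_train)

acc = mlp.score(X_test, y_test)
print(f"MLP test accuracy: {acc:.2f}")
```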
To implement all of the above algorithms we have used Python and the 'student data' dataset. This dataset is available inside the dataset folder, which contains the test dataset along with a dataset information file.
Python packages and libraries used: NumPy, pandas, tkinter, and the packages listed below (installed version first, latest available version second):
PyVISA 1.10.1 1.10.1
PyVISA-py 0.3.1 0.3.1
cycler 0.10.0 0.10.0
imutils 0.5.3 0.5.3
joblib 0.14.1 0.14.1
kiwisolver 1.1.0 1.1.0
matplotlib 3.1.2 3.1.2
nltk 3.4.5 3.4.5
numpy 1.18.1 1.18.1
opencv-python 4.1.2.30 4.1.2.30
pandas 0.25.3 0.25.3
pip 19.0.3 20.0.1
pylab 0.0.2 0.0.2
pyparsing 2.4.6 2.4.6
python-dateutil 2.8.1 2.8.1
pytz 2019.3 2019.3
pyusb 1.0.2 1.0.2
scikit-learn 0.22.1 0.22.1
scipy 1.4.1 1.4.1
seaborn 0.9.0 0.9.0
setuptools 40.8.0 45.1.0
six 1.14.0 1.14.0
sklearn 0.0 0.0
style 1.1.6 1.1.6
styled 0.2.0.post1 0.2.0.post1
From scikit-learn we use: classification_report, confusion_matrix, accuracy_score, train_test_split, KFold, cross_val_score, GridSearchCV, DecisionTreeClassifier, KNeighborsClassifier, SVC, naive_bayes, RandomForestClassifier, BaggingClassifier, AdaBoostClassifier and MLPClassifier.
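A sketch of how these imports fit together for the comparison step. Since the 'student data' dataset is not reproduced here, iris stands in, and only three of the classifiers are shown.

```python
# Sketch of the model-comparison step using k-fold cross-validation.
# Iris stands in for the paper's 'student data' dataset.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
models = {"KNN": KNeighborsClassifier(),
          "SVM": SVC(),
          "RF": RandomForestClassifier(random_state=0)}

kfold = KFold(n_splits=5, shuffle=True, random_state=0)
results = {name: cross_val_score(m, X, y, cv=kfold).mean()
           for name, m in models.items()}
for name, score in results.items():
    print(f"{name}: {score:.3f}")
```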
Screenshots
When we run the code it displays the window below.
Now click on ‘Upload Dataset’ to upload the data.
Now click on ‘Read Data’ to read the data.
Now click on ‘Train_Test_Split’ to split the data into training and testing sets.
Now click on ‘All Classifiers’ to train and evaluate all the models:
KNN Predicted Values on Test Data is 98%
CART Predicted Values on Test Data is 97.31%
SVM Predicted Values on Test Data is 98.18%
RF Predicted Values on Test Data is 98.62%
Bagging Predicted Values on Test Data is 97.41%
Ada Predicted Values on Test Data is 87.43%
MLP Predicted Values on Test Data is 98.00%
Now click on ‘Model Comparison’ to see the comparison between the models.