Classification: MNIST, training a Binary classifier, performance measure, multiclass classification, error analysis, multi label classification, multi output classification.
The document discusses classifying handwritten digits from the MNIST dataset using various machine learning classifiers and evaluation metrics. It begins with binary classification of the digit 5 using SGDClassifier, evaluating accuracy which is misleading due to class imbalance. The document then introduces confusion matrices and precision/recall metrics to better evaluate performance. It demonstrates how precision and recall can be traded off by varying the decision threshold, and introduces ROC curves to visualize this tradeoff. Finally, it compares SGDClassifier and RandomForestClassifier on this binary classification task.
The document discusses machine learning classification using the MNIST dataset of handwritten digits. It begins by defining classification and providing examples. It then describes the MNIST dataset and how it is fetched in scikit-learn. The document outlines the steps of classification which include dividing the data into training and test sets, training a classifier on the training set, testing it on the test set, and evaluating performance. It specifically trains a stochastic gradient descent (SGD) classifier on the MNIST data. The performance is evaluated using cross validation accuracy, confusion matrix, and metrics like precision and recall.
The document discusses the MNIST dataset and classification techniques. It contains:
- An overview of the MNIST dataset containing 70,000 handwritten digit images
- Preparation of the data for classification, including splitting into training and test sets
- Binary classification to detect the digit 5 using a linear SGD classifier
- Evaluation metrics like accuracy, precision, recall, F1 score, and confusion matrices
- Tuning the classifier by adjusting decision thresholds to balance precision and recall
1. Introduction to Classification with
MNIST
Understanding the Basics of Image Classification in
Machine Learning
By.
Dr. Ravi Kumar B N
Assistant Professor, Dept. of ISE
Module 2, Part I
2. Definition of Classification
Classification is a supervised machine learning task where the goal is to assign input
data to predefined categories or classes. A model learns from labeled training data and
then predicts the class labels for new, unseen data.
Types of Classification
1. Binary Classification
•Involves only two classes (e.g., Yes/No, Spam/Not Spam, Fraud/Not Fraud).
•Example: Email spam detection (spam or not spam).
2. Multi-Class Classification
•Involves more than two classes, where each instance belongs to exactly one class.
•Example: Handwritten digit recognition (0-9).
3. Multi-Label Classification
•Each instance can belong to multiple classes simultaneously.
•Example: A movie can be categorized as both "Action" and "Sci-Fi".
4. Multi-Output Classification (also called multi-target classification)
• The model predicts multiple labels (outputs) for each input sample. Unlike multi-label classification
(where each instance belongs to multiple categories), here each output can be treated as a
separate classification problem.
• Example: a model that takes an image and predicts both the digit (0-9) and its color (red, blue, etc.).
3. Classification Algorithms
1.Logistic Regression – A linear model that estimates probabilities using the sigmoid function.
Example: Predicting if a customer will buy a product (Yes/No).
2.K-Nearest Neighbors (KNN) – Classifies a data point based on the majority class of its k-nearest
neighbors.
Example: Handwriting recognition.
3.Support Vector Machine (SVM) – Finds the optimal hyperplane that maximizes the margin between
classes.
Example: Face detection in images.
4.Decision Tree – A tree-like model that splits data based on feature conditions.
Example: Loan approval prediction.
5.Random Forest – An ensemble of decision trees that improves accuracy and reduces overfitting.
Example: Medical diagnosis classification.
6.Naïve Bayes – A probabilistic model based on Bayes' theorem with the assumption of feature
independence.
Example: Spam email filtering.
7.Artificial Neural Networks (ANNs) – Deep learning models inspired by the human brain for complex
patterns.
Example: Image classification (e.g., cats vs. dogs).
4. MNIST Dataset Overview
The MNIST (Modified National Institute of Standards and Technology)
dataset of handwritten digits is a famous benchmark in machine
learning and computer vision. It consists of:
•70,000 grayscale images (28x28 pixels each).
•10 classes (digits 0-9).
•60,000 training images and 10,000 test images.
•Handwritten digit recognition is a common task using MNIST.
This set is often called the “hello world” of Machine Learning: whenever people come up
with a new classification algorithm, they are curious to see how it will perform on MNIST.
6. What is Scikit-Learn?
"Scientific Toolkit for Machine Learning"
Scikit-Learn is a popular Python machine learning library built on NumPy, SciPy,
and Matplotlib. It provides simple and efficient tools for data mining, analysis, and
machine learning tasks.
Scikit-Learn follows a simple "fit → predict → evaluate" workflow.
7. The following code fetches the MNIST dataset:
The fetch_openml function in Scikit-Learn allows you to download datasets directly
from OpenML, an online repository of machine learning datasets. It is useful for
quickly accessing datasets for experiments and model training.
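The fetching code itself is not captured in this transcript. A minimal sketch of what it typically looks like (assuming the standard 'mnist_784' dataset name on OpenML, and casting the string labels to integers so that comparisons such as y == 5 work later):
from sklearn.datasets import fetch_openml
import numpy as np

mnist = fetch_openml('mnist_784', version=1)   # downloads the 70,000-image MNIST dataset from OpenML
X, y = mnist["data"], mnist["target"]          # X: 70,000 x 784 pixel features, y: digit labels
y = y.astype(np.uint8)                         # labels arrive as strings; cast them to integers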
8. There are 70,000 images, and each image has 784 features. This is because each image is 28
× 28 pixels, and each feature simply represents one pixel’s intensity, from 0 (white) to 255
(black). Let’s take a peek at one digit from the dataset: grab an instance’s feature vector, reshape it
to a 28 × 28 array, and display it using Matplotlib’s imshow() function:
9. import matplotlib as mpl
import matplotlib.pyplot as plt
X_array = X.to_numpy() # Convert DataFrame to NumPy array
some_digit = X_array[0].reshape(28, 28)
plt.imshow(some_digit, cmap="binary")
plt.axis("off")
plt.show()
10. The MNIST dataset is actually already split into a training set (the first 60,000
images) and a test set (the last 10,000 images):
X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]
The training set is already shuffled, which guarantees that all cross-validation
folds will be similar (you don’t want one fold to be missing some digits). Moreover, some
learning algorithms are sensitive to the order of the training instances, and they
perform poorly if they get many similar instances in a row.
X[:60000] → X_train
Takes the first 60,000 images for training.
X[60000:] → X_test
Takes the remaining 10,000 images for testing.
y[:60000] → y_train
Takes the first 60,000 labels for training.
y[60000:] → y_test
Takes the last 10,000 labels for testing.
11. Training a Binary Classifier
“5-detector” will be an example of a binary classifier, capable of distinguishing
between just two classes, 5 and not-5. Let’s create the target vectors for this
classification task:
y_train_5 = (y_train == 5) # True for all 5s, False for all other digits
y_test_5 = (y_test == 5)
Now pick a classifier and train it. A good place to start is the Stochastic Gradient Descent (SGD)
classifier, using Scikit-Learn’s SGDClassifier class.
- This classifier has the advantage of being capable of handling very large datasets
efficiently. This is in part because SGD deals with training instances independently, one
at a time.
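The training call itself is not captured in the transcript. A minimal sketch, assuming the sgd_clf name that the later slides rely on (random_state=42 is an arbitrary value used only for reproducibility):
from sklearn.linear_model import SGDClassifier

sgd_clf = SGDClassifier(random_state=42)  # linear classifier trained with stochastic gradient descent
sgd_clf.fit(X_train, y_train_5)           # learn to separate 5s from non-5s
sgd_clf.predict([X_array[0]])             # one 784-pixel feature vector; should return array([ True]) if that image is a 5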
13. Performance Measures
There are many performance measures available
• Measuring Accuracy Using Cross-Validation
• Confusion Matrix
• Precision and Recall
• Precision/Recall Trade-off
• The ROC Curve
14. Measuring Accuracy Using Cross-Validation -
Implementing Cross-Validation
Cross-validation is used to evaluate the accuracy of a machine learning model by testing it on different
subsets of the data. Instead of using just one train-test split, the data is divided
into multiple parts (folds), and the model is trained and tested multiple times.
How Does It Work? (K-Fold Cross-Validation)
1.Split the data into K equal parts (folds).
2.Train the model on K-1 folds and test on the remaining fold.
3.Repeat the process K times, each time using a different fold as the test set.
4.Calculate the average accuracy across all K iterations.
15. from sklearn.model_selection import StratifiedKFold
from sklearn.base import clone

skfolds = StratifiedKFold(n_splits=3)

for train_index, test_index in skfolds.split(X_train, y_train_5):
    clone_clf = clone(sgd_clf)
    X_train_folds = X_train[train_index]
    y_train_folds = y_train_5[train_index]
    X_test_fold = X_train[test_index]
    y_test_fold = y_train_5[test_index]
    clone_clf.fit(X_train_folds, y_train_folds)
    y_pred = clone_clf.predict(X_test_fold)  # Predict labels for the validation fold
    n_correct = sum(y_pred == y_test_fold)   # Count correct predictions
    print(n_correct / len(y_pred))           # prints 0.9502, 0.96565, and 0.96495
The StratifiedKFold class performs stratified sampling to produce folds that contain a
representative ratio of each class. At each iteration the code creates a clone of the classifier,
trains that clone on the training folds, and makes predictions on the test fold. Then it counts
the number of correct predictions and outputs the ratio of correct predictions.
16. Use the cross_val_score() function to evaluate our
SGDClassifier model, using K-fold cross-validation with
three folds.
>>> from sklearn.model_selection import cross_val_score
>>> cross_val_score(sgd_clf, X_train, y_train_5, cv=3, scoring="accuracy")
array([0.96355, 0.93795, 0.95615])
cv=3-> 3-fold cross-validation, where:
1.The training dataset (X_train, y_train_5) is split into 3 equal parts (folds).
2.The model is trained on 2 folds and tested on the remaining 1 fold.
3.This process is repeated 3 times, each time using a different fold for testing.
4.The function returns 3 accuracy scores (one for each test fold).
17. from sklearn.base import BaseEstimator
import numpy as np  # needed for np.zeros below

class Never5Classifier(BaseEstimator):
    def fit(self, X, y=None):
        return self
    def predict(self, X):
        return np.zeros((len(X), 1), dtype=bool)
Can you guess this model’s accuracy?
>>> never_5_clf = Never5Classifier()
>>> cross_val_score(never_5_clf, X_train, y_train_5, cv=3, scoring="accuracy")
array([0.91125, 0.90855, 0.90915])
it has over 90% accuracy! This is simply because only about 10% of the images are 5s, so
if you always guess that an image is not a 5, you will be right about 90% of the time.
This demonstrates why accuracy is generally not the preferred performance measure for
classifiers, especially when you are dealing with skewed datasets (i.e., when some classes
are much more frequent than others).
This is a dummy classifier that never predicts the digit 5
18. Confusion Matrix
• The general idea is to count the number of times instances of class A are
classified as class B.
• For example, to know the number of times the classifier confused images of
5s with 3s.
• To compute the confusion matrix, you first need to have a set of predictions
so that they can be compared to the actual targets.
• We can use the cross_val_predict() function:
Just like the cross_val_score() function, cross_val_predict() performs K-fold cross-
validation, but instead of returning the evaluation scores, it returns the predictions
made on each test fold.
from sklearn.model_selection import cross_val_predict
y_train_pred = cross_val_predict(sgd_clf, X_train, y_train_5, cv=3)
array([1, 0, 0, ..., 1, 0, 0])
19. To get the confusion matrix, use the confusion_matrix() function. Just pass it the
target classes (y_train_5) and the predicted classes (y_train_pred):
>>> from sklearn.metrics import confusion_matrix
>>> confusion_matrix(y_train_5, y_train_pred)
array([[53057, 1522],
[ 1325, 4096]])
Each row in a confusion matrix represents an actual class, while each column represents
a predicted class.
• The first row of this matrix considers non-5 images (the negative class): 53,057 of
them were correctly classified as non-5s (they are called true negatives), while the
remaining 1,522 were wrongly classified as 5s (false positives).
• The second row considers the images of 5s (the positive class): 1,325 were
wrongly classified as non-5s (false negatives), while the remaining 4,096 were
correctly classified as 5s (true positives).
20. A perfect classifier would have only true positives and true negatives, so its
confusion matrix would have nonzero values only on its main diagonal (top left to
bottom right):
>>> y_train_perfect_predictions = y_train_5 # pretend we reached perfection
>>> confusion_matrix(y_train_5, y_train_perfect_predictions)
array([[54579, 0],
[ 0, 5421]])
21. Recall - also called sensitivity or the true positive rate (TPR): the ratio of positive
instances that are correctly detected by the classifier.
Precision - the accuracy of the positive predictions.
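In terms of the confusion-matrix counts, precision = TP / (TP + FP) and recall = TP / (TP + FN). A quick sanity check using the counts from slide 19 (4,096 TP, 1,522 FP, 1,325 FN):
tp, fp, fn = 4096, 1522, 1325
print(tp / (tp + fp))   # precision ≈ 0.729
print(tp / (tp + fn))   # recall ≈ 0.756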
23. Precision and Recall
Precision tells us how many of the predicted positives were actually correct, while recall
tells us how many of the actual positives the model was able to find.
•Precision: Out of everything the model said was positive, how much was right?
•Recall: Out of all the actual positives, how many did the model find?
For example, if a spam filter marks 100 emails as spam and 90 are actually spam, precision
is high. But if there were 200 spam emails in total and the filter only caught 90, recall is
low.
24. >>> from sklearn.metrics import precision_score, recall_score
>>> precision_score(y_train_5, y_train_pred) # == 4096 / (4096 + 1522)
0.7290850836596654
>>> recall_score(y_train_5, y_train_pred) # == 4096 / (4096 + 1325)
0.7555801512636044
When it claims an image represents a 5, it is correct only 72.9% of the time. Moreover,
it only detects 75.6% of the 5s.
🔹 y_train_5 → True labels (ground truth).
🔹 y_train_pred → Predicted labels by the model
25. F1 score - combines precision and recall into a single metric.
• The F1 score is the harmonic mean of precision and recall
• Whereas the regular mean treats all values equally, the harmonic mean gives
much more weight to low values
In some contexts you mostly care about precision, and in other contexts you really care
about recall. For example, if you trained a classifier to detect videos that are safe for
kids, you would probably prefer a classifier that rejects many good videos (low recall) but
keeps only safe ones (high precision), rather than a classifier that has a much higher
recall but lets a few really bad videos show up in your product (in such cases, you may
even want to add a human pipeline to check the classifier’s video selection). In other
applications (such as detecting shoplifters in surveillance images), a fairly low precision
may be acceptable as long as recall is very high.
The harmonic mean is a type of average that is especially useful when dealing with rates
and ratios, which is why it is used to combine precision and recall.
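As a formula, F1 = 2 × precision × recall / (precision + recall). Scikit-Learn computes it directly; a minimal sketch using the predictions from earlier:
from sklearn.metrics import f1_score
f1_score(y_train_5, y_train_pred)   # ≈ 0.742 given the precision and recall values above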
26. Precision/Recall Trade-off
The decision threshold determines the minimum score (or probability) required for a positive prediction.
There is often a tradeoff between precision and recall: improving one can lower the other.
🔹 If you increase precision, you become more selective, reducing false positives but
potentially missing some actual positives (lower recall).
🔹 If you increase recall, you try to catch as many positives as possible, but this may result in
more false positives (lower precision).
Example: Spam Email Detection
•If the spam filter is very strict (high precision), it only marks emails as spam when it’s very
sure, but it might miss some actual spam emails (low recall).
•If the spam filter flags too many emails as spam (high recall), it catches all spam emails, but
it might also wrongly mark some important emails as spam (low precision).
Adjusting the Tradeoff
•This balance is controlled using a decision threshold (e.g., probability score from a classifier).
•Lowering the threshold → More positives detected (higher recall, lower precision).
•Raising the threshold → Fewer false positives (higher precision, lower recall).
27. some real-world examples to help understand precision and recall:
1. Medical Diagnosis (Cancer Detection)
•Precision: Out of all patients diagnosed with cancer, how many actually have cancer?
•Recall: Out of all patients who truly have cancer, how many did the test detect?
• A high-precision test avoids alarming healthy patients with a wrong diagnosis.
• A high-recall test ensures that very few actual cancer cases are missed.
2. Fraud Detection (Credit Card Transactions)
•Precision: Out of all transactions flagged as fraud, how many were actually fraudulent?
•Recall: Out of all actual fraud cases, how many did the system detect?
• High precision avoids wrongly blocking legitimate transactions.
• High recall ensures most fraudulent transactions are caught, even if some false alarms
occur.
3. Autonomous Vehicles (Pedestrian Detection)
•Precision: Out of all detected pedestrians, how many were actually pedestrians?
•Recall: Out of all pedestrians on the road, how many did the system detect?
• High precision avoids unnecessary braking for non-pedestrians.
• High recall ensures all pedestrians are detected to prevent accidents.
28. 4. Job Resume Filtering (HR Systems)
•Precision: Out of all resumes selected for an interview, how many were actually qualified?
•Recall: Out of all qualified candidates, how many were selected for an interview?
• High precision avoids interviewing unqualified applicants.
• High recall ensures all strong candidates are considered.
5. Fake News Detection
•Precision: Out of all news articles flagged as fake, how many were actually fake?
•Recall: Out of all fake news articles available, how many were caught?
• High precision avoids wrongly marking real news as fake.
• High recall ensures most fake news is detected, even if some mistakes happen.
6. Voice Assistant (Wake Word Detection - e.g., "Hey Siri")
•Precision: Out of all times the assistant woke up, how many were actually triggered by "Hey Siri"?
•Recall: Out of all times the user said "Hey Siri", how many did the assistant recognize?
• High precision avoids waking up due to background noise.
• High recall ensures the assistant responds every time it's needed.
30. Decision Score
A decision score represents the confidence level or raw output of a machine learning
model before applying a classification threshold. It helps determine whether a given
instance belongs to a particular class.
Instead of calling predict(), you can call the classifier’s decision_function() method, which
returns a score for each instance, and then use any threshold you want to make predictions
based on those scores.
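A minimal sketch of using decision_function() with a manual threshold (X_array[0] is the first image’s feature vector; the 8000 threshold is an arbitrary value chosen purely for illustration):
y_scores = sgd_clf.decision_function([X_array[0]])   # raw score instead of a hard prediction
threshold = 0
print(y_scores > threshold)   # a threshold of 0 reproduces what predict() does
threshold = 8000              # a much stricter threshold
print(y_scores > threshold)   # raising the threshold trades recall for precision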
34. The ROC Curve
• The receiver operating characteristic (ROC) curve is another
common tool used with binary classifiers.
• It is very similar to the precision/recall curve, but instead of plotting
precision versus recall, the ROC curve plots the true positive
rate (TPR) (another name for recall) against the false positive rate
(FPR).
• The FPR is the ratio of negative instances that are incorrectly
classified as positive.
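A minimal sketch for plotting the ROC curve, assuming the decision scores are obtained with cross_val_predict(..., method="decision_function"):
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_curve
import matplotlib.pyplot as plt

y_scores = cross_val_predict(sgd_clf, X_train, y_train_5, cv=3,
                             method="decision_function")   # scores, not hard predictions
fpr, tpr, thresholds = roc_curve(y_train_5, y_scores)      # FPR and TPR for every possible threshold

plt.plot(fpr, tpr, linewidth=2)
plt.plot([0, 1], [0, 1], 'k--')   # dashed diagonal = a purely random classifier
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate (Recall)")
plt.show()
The area under this curve (roc_auc_score) summarizes it in a single number: 1.0 for a perfect classifier, 0.5 for a purely random one.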
38. Multiclass Classification
Binary classifiers distinguish between two classes (algorithms: Logistic Regression,
Support Vector Machine classifiers).
Multiclass classifiers (also called multinomial classifiers)
can distinguish between more than two classes
(algorithms: SGD classifiers, Random Forest classifiers,
and naive Bayes classifiers).
39. There are various strategies that you can use to perform multiclass classification
with multiple binary classifiers.
1) one-versus-the-rest (OvR) strategy (also called one-versus-all).
• One way to create a system that can classify the digit images into 10 classes (from
0 to 9) is to train 10 binary classifiers, one for each digit (a 0-detector, a 1-detector,
a 2-detector, and so on).
• Then, when you want to classify an image, you get the decision score from each
classifier for that image and you select the class whose classifier outputs the
highest score.
Example (MNIST = 10 classes):
•You train 10 classifiers:
•0 vs not-0
•1 vs not-1
...
•9 vs not-9
40. 2) one-versus-one (OvO) strategy
• train a binary classifier for every pair of digits: one to distinguish 0s and 1s,
another to distinguish 0s and 2s, another for 1s and 2s, and so on. This is called the
one-versus-one (OvO) strategy. If there are N classes, you need to train N × (N –
1) / 2 classifiers.
• For the MNIST problem, this means training 45 binary classifiers! When you want
to classify an image, you have to run the image through all 45 classifiers and see
which class wins the most duels.
For MNIST this means 45 classifiers.
How it works:
1. Train classifiers like: 0 vs 1, 0 vs 2, ..., 8 vs 9
41. Support Vector Machine classifiers scale poorly with the size of the training set.
For these algorithms OvO is preferred, because it is faster to train many classifiers on
small training sets than to train a few classifiers on large training sets.
>>> from sklearn.svm import SVC
>>> svm_clf = SVC()
>>> svm_clf.fit(X_train, y_train)
>>> svm_clf.predict([some_digit])
array([5], dtype=uint8)
The decision_function() method in scikit-learn classifiers gives you the
confidence scores (also called decision values) instead of just the predicted
class.
42. If you call the decision_function() method, you will see that it
returns 10 scores per instance (instead of just 1). That’s one score
per class:
>>> some_digit_scores = svm_clf.decision_function([some_digit])
>>> some_digit_scores
array([[ 2.92492871, 7.02307409, 3.93648529, 0.90117363, 5.96945908,
         9.5       , 1.90718593, 8.02755089, -0.13202708, 4.94216947]])
The highest score is indeed the one corresponding to class 5:
>>> np.argmax(some_digit_scores)
5
>>> svm_clf.classes_ # unique class labels seen during training.
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=uint8) # unsigned 8-bit integers
>>> svm_clf.classes_[5]
5
43. If you want to force Scikit-Learn to use one-versus-one or one-versus-the-
rest, you can use the OneVsOneClassifier or OneVsRestClassifier classes.
This code creates a multiclass classifier using the OvR strategy, based on an
SVC:
>>> from sklearn.multiclass import OneVsRestClassifier
>>> ovr_clf = OneVsRestClassifier(SVC())
>>> ovr_clf.fit(X_train, y_train)
>>> ovr_clf.predict([some_digit])
array([5], dtype=uint8)
>>> len(ovr_clf.estimators_) #trained 10 binary classifiers
10
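Forcing the OvO strategy works the same way, just with the other wrapper class; a minimal sketch (not shown on the slides):
from sklearn.svm import SVC
from sklearn.multiclass import OneVsOneClassifier

ovo_clf = OneVsOneClassifier(SVC())
ovo_clf.fit(X_train, y_train)
len(ovo_clf.estimators_)   # 45 binary classifiers, one per pair of digits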
44. Error Analysis
Once we have trained a promising model, the next step is to analyze what kinds of
mistakes it makes. This can help you uncover hidden patterns and guide your next
improvements.
1. A confusion matrix shows how often predictions matched actual labels. It’s your
first tool for spotting error patterns.
we need to make predictions using the cross_val_predict() function, then call the
confusion_matrix() function
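A minimal sketch of this step (training on the full 10-class labels y_train; feature scaling, which usually helps SGD-based models, is omitted here for brevity):
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt

y_train_pred = cross_val_predict(sgd_clf, X_train, y_train, cv=3)   # multiclass predictions
conf_mx = confusion_matrix(y_train, y_train_pred)

plt.matshow(conf_mx, cmap=plt.cm.gray)   # image view of the matrix: brighter cell = more instances
plt.show()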
47. Most images are on the main diagonal, which means that they were classified
correctly. The 5s look slightly darker than the other digits, which could mean that
there are fewer images of 5s in the dataset or that the classifier does not perform as
well on 5s as on other digits. In fact, you can verify that both are the case
48. Looking at this plot, it seems that your efforts should be spent on reducing the false 8s.
For example, you could try to gather more training data for digits that look like 8s (but are
not) so that the classifier can learn to distinguish them from real 8s.
Or you could engineer new features that would help the classifier—for example, writing
an algorithm to count the number of closed loops (e.g., 8 has two, 6 has one, 5 has none).
Or you could preprocess the images (e.g., using Scikit-Image, Pillow, or OpenCV) to make
some patterns, such as closed loops, stand out more.
50. Multilabel Classification
In some cases you may want your classifier to output multiple classes for each
instance.
Consider a face recognition classifier: what should it do if it recognizes several
people in the same picture? It should attach one tag per person it recognizes. Say
the classifier has been trained to recognize three faces, Alice, Bob, and Charlie.
Then when the classifier is shown a picture of Alice and Charlie, it should output
[1, 0, 1] (meaning “Alice yes, Bob no, Charlie yes”). Such a classification system
that outputs multiple binary tags is called a multilabel classification system.
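A minimal sketch of a multilabel setup on MNIST (the two labels, "large digit (7, 8 or 9)" and "odd digit", are illustrative choices; KNeighborsClassifier is used because it supports multilabel targets):
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

y_train_large = (y_train >= 7)                     # first tag: is the digit 7, 8 or 9?
y_train_odd = (y_train % 2 == 1)                   # second tag: is the digit odd?
y_multilabel = np.c_[y_train_large, y_train_odd]   # two binary tags per image

knn_clf = KNeighborsClassifier()
knn_clf.fit(X_train, y_multilabel)
knn_clf.predict([X_array[0]])   # e.g. array([[False,  True]]) if that image is a 5 (not large, odd)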
52. Multioutput Classification
It is simply a generalization of multilabel classification where each label can be
multiclass (i.e., it can have more than two possible values).
Key Difference:
•Multilabel classification: Each instance can have multiple binary labels (e.g., [1, 0, 1]).
•Multioutput classification: Each output label can have more than two classes (i.e.,
multiclass per output).
53. Let’s build a system that removes noise from images.
It will take as input a noisy digit image, and it will (hopefully) output a clean digit image,
represented as an array of pixel intensities, just like the MNIST images. Notice that the
classifier’s output is multilabel (one label per pixel) and each label can have multiple
values (pixel intensity ranges from 0 to 255). It is thus an example of a multioutput
classification system.
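A minimal sketch of such a system (the noise amplitude and the choice of a KNN classifier are illustrative; the training targets are simply the original clean images, one intensity label per pixel):
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

X_train_arr = np.asarray(X_train, dtype=float)
X_test_arr = np.asarray(X_test, dtype=float)

rng = np.random.RandomState(42)                                       # fixed seed, only for reproducibility
X_train_mod = X_train_arr + rng.randint(0, 100, X_train_arr.shape)    # noisy training inputs
X_test_mod = X_test_arr + rng.randint(0, 100, X_test_arr.shape)       # noisy test inputs
y_train_mod = X_train_arr.astype(np.uint8)                            # targets: the clean images, one pixel label per output

knn_clf = KNeighborsClassifier()
knn_clf.fit(X_train_mod, y_train_mod)              # learns to map noisy images to clean ones
clean_digit = knn_clf.predict([X_test_mod[0]])     # a (hopefully) denoised 784-pixel image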