阮山松 – NGUYEN SON TUNG - F112169103
Chapter 5: Introduction to Machine Learning with Scikit-learn
What is Machine Learning?
• Definition: Enabling computers to learn from data without explicit
programming.
• Key idea: Machine learning algorithms adapt their behavior based on
data, improving over time with experience.
• Quote: “A computer program is said to learn from experience E with
respect to some class of tasks T and performance measure P, if its
performance at tasks in T, as measured by P, improves with experience E.”
(Tom Mitchell)
• Examples: Email spam detection, chess-playing algorithms.
Learning from Data: Two Approaches
• Objective: Machine learning focuses on discovering patterns or
predicting outcomes based on data.
• Supervised Learning: Uses labeled data where the outcome is known.
  o Example: Spam detection with labeled emails.
• Unsupervised Learning: Identifies patterns in unlabeled data.
  o Example: Market basket analysis (finding items that often appear
  together).
Supervised Learning: Overview
• A machine learning approach that uses labeled datasets to train
models, allowing them to predict outcomes based on labeled
examples.
• Types:
  o Classification: Predicts discrete labels (e.g., spam or not spam).
  o Regression: Predicts continuous values (e.g., house prices).
• Key Feature: Requires labeled data from domain experts or logs.
Classification in Supervised Learning
• Assigning each data point to a predefined category based on training
data.
• Process: Model is trained with labeled data to classify new data into
categories.
• Examples:
  o Fraud detection: Classify transactions as fraudulent or legitimate.
  o Sentiment analysis: Classify text as positive, negative, or neutral.
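As a rough illustration of the fraud-detection example, the sketch below trains a decision tree on a handful of made-up transactions. The feature layout ([amount, hour of day]) and every value are hypothetical, chosen only to keep the example small; a real system would need far more data and richer features.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical labeled transactions: [amount, hour_of_day]
X_train = [[20, 14], [35, 10], [15, 16], [900, 3], [1200, 2], [950, 4]]
y_train = ["legitimate", "legitimate", "legitimate",
           "fraudulent", "fraudulent", "fraudulent"]

# Train the classifier on the labeled examples
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

# Classify two new, unseen transactions into the predefined categories
predictions = clf.predict([[25, 13], [1000, 3]])
print(predictions)
```

The model only ever outputs one of the labels it saw during training, which is what makes this a classification task.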
Regression in Supervised Learning
• Captures relationships between dependent and independent
variables to predict continuous outcomes.
• Goal: Predict a continuous output variable based on input features.
• Example: Predicting stock prices based on historical data.
• Key Differentiator: Output is a continuous value rather than a
discrete class.
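A minimal regression sketch, assuming a single input feature and a perfectly linear relationship. The numbers are invented (area in square meters, price in thousands, generated as price = 2 * area + 10) so that the fitted line is easy to check by hand.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data following price = 2 * area + 10
X = np.array([[50], [60], [80], [100]])   # independent variable (area)
y = np.array([110, 130, 170, 210])        # dependent variable (price)

reg = LinearRegression()
reg.fit(X, y)

# The prediction is a continuous number, not a class label
predicted = reg.predict(np.array([[70]]))
print(predicted[0])  # approximately 150
```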
Unsupervised Learning: Overview
• A method of exploring data without labeled outcomes to find hidden
structures.
• Discover underlying patterns or groupings within the data.
• No labeled data: Learns from the structure and similarities in the
dataset itself.
Applications of Unsupervised Learning
• Clustering: Grouping similar data points together to identify natural
clusters within the data.
  o Example: Customer segmentation in marketing.
• Association: Finds items that frequently occur together.
  o Example: Market basket analysis in retail (e.g., milk and bread
  bought together).
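Clustering can be sketched with scikit-learn's KMeans. The customer data below is hypothetical ([annual spend, visits per month]), and the choice of two clusters is an assumption made for the example; in practice the number of clusters usually has to be explored.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical customers: [annual_spend, visits_per_month]; no labels are given
X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])

# Group the customers into two segments based on similarity alone
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(X)

labels = kmeans.labels_
print(labels)                   # cluster index assigned to each customer
print(kmeans.cluster_centers_)  # center of each discovered segment
```

The cluster indices themselves are arbitrary; what matters is which customers end up grouped together.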
Structure of a Machine Learning System
• Offline Process: Training stage where the model learns from historical
data.
  o Objective: Learn patterns and relationships.
• Online Process: Prediction stage where the model makes predictions
on new, unseen data.
  o Example: A spam classifier trained on old emails predicts if new
  emails are spam or not.
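The offline/online split can be sketched as follows. The emails and labels are invented, the bag-of-words plus naive Bayes model is just one convenient choice, and pickle stands in for whatever storage a real system would use between training and prediction.

```python
import pickle

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# --- Offline process: train on historical (old) emails ---
old_emails = [
    "win free money now",
    "free prize claim now",
    "meeting at noon tomorrow",
    "project report attached",
]
old_labels = ["spam", "spam", "not spam", "not spam"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(old_emails, old_labels)
saved_model = pickle.dumps(model)  # persist the trained model

# --- Online process: load the model and predict on new emails ---
loaded_model = pickle.loads(saved_model)
new_predictions = loaded_model.predict(
    ["claim your free money", "report for the meeting"]
)
print(new_predictions)
```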
The Machine Learning Process
• Six Key Stages:
1. Problem Understanding: Define and understand the problem.
2. Data Collection: Gather relevant data.
3. Data Annotation and Preparation: Clean, label, and prepare data.
4. Data Wrangling: Transform data into the required format.
5. Model Development, Training, and Evaluation: Train and assess
the model.
6. Model Deployment and Maintenance: Integrate into production
and monitor.
Step 1: Problem Understanding
• Clarify goals and set the scope for the machine learning project.
• Actions:
  o Define the problem with stakeholders.
  o Determine measurable outcomes and success criteria.
  o Examples: Identify spam emails, classify product reviews.
Step 2: Data Collection
• Gather quality data relevant to the problem.
• Data sources: Transaction logs, user behavior logs, public datasets,
etc.
• Key Consideration: Quality and relevance of data directly affect model
performance.
Step 3: Data Annotation and Data Preparation
• Data Annotation: Label data for supervised learning.
  o Example: Tagging images with objects for object detection.
• Data Preparation: Cleaning, reformatting, and normalizing raw data
for model compatibility.
• Objective: Ensure data quality and consistency.
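A small sketch of the cleaning and normalizing part of data preparation. The raw table is hypothetical, and filling missing values with the column mean followed by standardization is only one of several reasonable choices.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Hypothetical raw data with one missing value (np.nan)
raw = np.array([[1.0, 200.0],
                [2.0, np.nan],
                [3.0, 400.0]])

# Cleaning: replace the missing value with the mean of its column
imputed = SimpleImputer(strategy="mean").fit_transform(raw)

# Normalizing: rescale each column to zero mean and unit variance
scaled = StandardScaler().fit_transform(imputed)

print(imputed)
print(scaled)
```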
Step 4: Data Wrangling
• Transform data into a numeric format suitable for model training.
• Process: Convert data into feature vectors (numeric arrays) using
libraries like NumPy.
• Ensures compatibility with algorithms and standardizes input format.
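One way to picture this step is turning mixed records into a numeric NumPy array. The record fields (amount, country) and the hand-written one-hot encoding are assumptions made for illustration; scikit-learn also offers encoders for this purpose.

```python
import numpy as np

# Hypothetical raw records mixing numeric and categorical fields
records = [
    {"amount": 20.0, "country": "TW"},
    {"amount": 900.0, "country": "US"},
    {"amount": 35.0, "country": "TW"},
]

# Fixed, sorted list of categories so every vector has the same layout
countries = sorted({r["country"] for r in records})


def to_vector(record):
    """Convert one record into a numeric feature vector."""
    one_hot = [1.0 if record["country"] == c else 0.0 for c in countries]
    return [record["amount"]] + one_hot


# Feature matrix: one row per record, one column per feature
X = np.array([to_vector(r) for r in records])
print(X.shape)  # (3, 3)
print(X)
```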
Step 5: Model Development, Training, and
Evaluation
• Model Development: Selecting and configuring algorithms (e.g.,
Scikit-learn’s SVM, decision trees).
• Training: The model learns from the training set, usually the larger
portion of the data.
• Evaluation: Test the model on held-out data it has not seen to assess
performance.
• Tuning: Adjust hyperparameters based on the evaluation results to
improve accuracy.
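The train, evaluate, and tune loop might look like the sketch below. The Iris dataset, the 80/20 split, and the small parameter grid are assumptions chosen for brevity, not settings taken from the chapter.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Hold out 20% of the data as unseen test data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Try several parameter combinations using cross-validation on the training set
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X_train, y_train)

# Evaluate the best model found on the held-out test data
test_accuracy = search.score(X_test, y_test)
print(search.best_params_)
print(test_accuracy)
```

Tuning here uses cross-validation inside the training set, so the test set is touched only once, for the final evaluation.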
Step 6: Model Deployment
• Integration: Incorporate the model into production systems.
• Inference: Model processes new data and generates predictions.
• Data Collection: Gather new data from real-world usage.
• Model Improvement: Use collected data to refine the model for
future iterations.
Scikit-Learn: The Python Library for Machine
Learning
• Scikit-learn is a highly popular library for machine learning that
provides tools for supervised and unsupervised learning.
• It builds on NumPy and SciPy and works alongside the rest of the
SciPy stack (Matplotlib, pandas, etc.).
Installing Scikit-Learn
• Installation:
  o Create a new cell in Jupyter Notebook.
  o Run the install command in that cell: !pip install scikit-learn
  o Check that the library can be imported: import sklearn
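Once the package is installed, a quick check like the one below confirms that the import works and shows which version is in use. The version printed depends on the environment.

```python
import sklearn

# If this runs without an ImportError, the installation succeeded
version = sklearn.__version__
print(version)
```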
Understanding the API
• Features: Numerical variables representing data points.
• Estimators: Learn patterns from data (e.g., classification, regression).
• Predictors: Make predictions on new data.
• Transformers: Preprocess and transform data (e.g., scaling, feature
extraction).
• Chaining Estimators: Combine multiple estimators for complex tasks.
• Pipeline Objects: Simplify the process of chaining multiple estimators
into a single one.
• pipe = Pipeline([('scaler', StandardScaler()), ('svc', SVC())])
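The pipeline line above needs a few imports and some data before it can run. The sketch below fills those in, assuming the Iris dataset as input; it also shows the fit, predict, and score methods that estimators and predictors share.

```python
from sklearn.datasets import load_iris
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Chain a transformer (scaler) and a final estimator (SVM) into one object
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

# fit() runs every step in order: scale the data, then train the SVM
pipe.fit(X, y)

# predict() applies the same scaling before the SVM makes its prediction
first_prediction = pipe.predict(X[:1])
train_accuracy = pipe.score(X, y)

step_names = [name for name, _ in pipe.steps]
print(step_names)  # ['scaler', 'svc']
print(first_prediction, train_accuracy)
```

Because the pipeline behaves like a single estimator, it can be passed to tools such as cross-validation without handling the scaling step separately.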
Your First Scikit-learn Experiment
• Use a simple dataset: the Iris dataset.
• Example 1: View the components of the iris object, which behaves
like a dictionary.
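The original code for Example 1 was shown as an image and is not reproduced here. A minimal sketch, assuming the standard load_iris loader was used, could be:

```python
from sklearn.datasets import load_iris

# load_iris() returns a dictionary-like object
iris = load_iris()

# List the components (keys) available in the object
keys = list(iris.keys())
print(keys)
```

The exact set of keys varies slightly between scikit-learn versions, but it includes data, target, feature_names, and target_names.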
• Example 2: View the data, target, feature names, and target names.
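Example 2 can be sketched in the same way, again assuming load_iris; the slide's own code is not available, so the choice of what to print is a guess at its intent.

```python
from sklearn.datasets import load_iris

iris = load_iris()

# Feature matrix: 150 flowers, 4 measurements each
print(iris.data.shape)     # (150, 4)
print(iris.data[:5])       # first five feature vectors

# Target vector: one integer class label per flower
print(iris.target[:5])

# Human-readable names for the columns and the classes
print(iris.feature_names)
print(iris.target_names)
```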
• Example 3: Train a Support Vector Machines (SVM) classifier.
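For Example 3, a plausible minimal version is sketched below. The linear kernel, the 70/30 split, and the random seed are assumptions, since the slide's actual settings are not recoverable.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = load_iris()

# Keep 30% of the flowers aside to test the classifier
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0
)

# Train a Support Vector Machine classifier
clf = SVC(kernel="linear", C=1.0)
clf.fit(X_train, y_train)

# Predict the species of the unseen flowers and measure accuracy
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(accuracy)
```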
Personal Reflections on this chapter
Chapter 5 gives an overview of machine learning (ML) basics through
Scikit-learn, focusing on practical and accessible applications. Key
insights include understanding supervised vs. unsupervised learning and
the importance of data quality for accurate predictions. Real-world
examples, such as spam detection, highlight ML's practical value. The
six-stage ML process (problem understanding, data collection, annotation,
wrangling, model development, and deployment) emphasizes planning and
iteration. Scikit-learn's simplicity, demonstrated through exercises like
the Iris dataset, reinforces concepts like hyperparameter tuning and
pipeline use. The chapter also stresses ML's iterative nature and ethical
considerations for responsible model development.
