Machine_Learning.pptx

ML in a Nutshell
• Tens of thousands of machine learning algorithms
• Hundreds of new ones every year
• Every machine learning algorithm has three components:
• Representation
• Evaluation
• Optimization

Representation (Algorithms)
• Decision trees
• Sets of rules / Logic programs
• Instances
• Graphical models (Bayes/Markov nets)
• Neural networks
• Support vector machines
• Model ensembles
• Etc.

Evaluation (Model Fit Statistics Calculation)
• Accuracy
• Precision and recall
• Squared error
• Likelihood
• Posterior probability
• Cost / Utility
• Margin
• Entropy
• K-L divergence
• Etc.
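As a minimal sketch of the first two measures in this list, accuracy, precision and recall can all be computed from the four counts of a binary confusion matrix. The function name and the counts below are hypothetical and chosen only for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or never occurs.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts: 40 true positives, 10 false positives,
# 20 false negatives, 30 true negatives.
accuracy, precision, recall = classification_metrics(tp=40, fp=10, fn=20, tn=30)
print(accuracy, precision, recall)
```

Precision asks how many predicted positives were correct; recall asks how many actual positives were found.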

Optimization (Methods for Parameter Estimation)
• Combinatorial optimization
• E.g.: Greedy search
• Convex optimization
• E.g.: Gradient descent
• Constrained optimization
• E.g.: Linear programming
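The three components can be seen together in one small sketch. Here the representation is a straight line y = w*x + b, the evaluation is mean squared error, and the optimization is gradient descent. The function name, learning rate, step count and data are assumptions made for this illustration, not values from the slides.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Evaluation: prediction error of the current line on every example.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradient of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Optimization: step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Illustrative data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = gradient_descent(xs, ys)
print(round(w, 4), round(b, 4))
```

Because squared error is convex in w and b, a small enough learning rate lets the iteration settle on the single best line.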

ML in Practice
• Understanding domain, prior knowledge, and goals
• Data integration, selection, cleaning, pre-processing, etc.
• Learning models
• Interpreting results
• Consolidating and deploying discovered knowledge
• Loop

Machine Learning
• How did our brain process the images?
• How did the grouping happen?
• The human brain processed the given images: this is learning
• After learning, the brain looked at each new image, compared it with the existing groups, and assigned it to the closest group: this is classification
• If a machine has to perform the same operation, we use machine learning
• We write programs for learning and then for classification; this is machine learning
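The "learn the groups, then assign a new item to the closest group" idea can be sketched as a nearest-centroid classifier. This is only one simple way to realise it; the function names, feature vectors and labels are hypothetical.

```python
import math


def learn_centroids(examples):
    """Learning step: average the feature vectors of each labelled group."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def classify(centroids, features):
    """Classification step: return the label of the closest group centre."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


# Hypothetical two-feature examples for two groups, "A" and "B".
training = [([1.0, 1.0], "A"), ([1.2, 0.8], "A"),
            ([5.0, 5.0], "B"), ([4.8, 5.2], "B")]
centroids = learn_centroids(training)
prediction = classify(centroids, [4.5, 4.9])
print(prediction)
```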

Machine Learning
[Diagram: training data → learning algorithm → trained machine; a query goes into the trained machine, which returns an answer]

The machine learning framework
• Apply a prediction function to a feature representation of the image to get the desired output:
Slide credit: L.

The Machine Learning framework
y = f(x)
where y is the output, f is the prediction function, and x is the image feature
• Training: given a training set of labeled examples {(x1,y1), …, (xN,yN)}, estimate the prediction function f by minimizing the prediction error on the training set
• Testing: apply f to a never-before-seen test example x and output the predicted value y = f(x)
Methods include parametric methods and non-parametric methods
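A minimal sketch of training and testing, with one parametric and one non-parametric choice of f. The parametric version summarises the training set in two numbers (slope and intercept); the non-parametric version keeps the training set and copies the closest example's output. The function names and data are assumptions for illustration.

```python
def fit_line(train):
    """Parametric: estimate slope and intercept by ordinary least squares."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_nearest_neighbour(train):
    """Non-parametric: predict the y of the training example with the closest x."""
    return lambda x: min(train, key=lambda pair: abs(pair[0] - x))[1]


# Training: labelled examples (x, y), here generated from y = 2x.
train = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
f_line = fit_line(train)
f_nn = fit_nearest_neighbour(train)

# Testing: apply each f to an x that was not in the training set.
x_test = 2.6
print(f_line(x_test), f_nn(x_test))
```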

Classification: predicting a category
Some techniques:
- Naïve Bayes
- Decision Tree
- Logistic Regression
- SGD
- Support Vector Machines
- Neural Network
- Ensembles

Classification in R
• Decision Trees – rpart, party, tree, CHAID
• Random Forest – randomForest, party
• Neural Networks – nnet, neuralnet, RSNNS
• SVM and Kernel estimation – e1071, kernlab
• Performance Evaluation – ROCR, etc.

Why R/Python
• Highest market share
• Data handling and analytics capabilities
• Easy to use/learn

Table of Contents
1) Business Problem
2) Base knowledge
3) Difference between Linear Regression & Logistic Regression
4) Logistic Regression
5) Logistic Model Development
6) Model Validation
7) Hands on – R/Python
8) Quizzes
9) Case Solution

Business Problem
We have data for 900 banking customers, recording whether each applicant was rated as “Good Credit/Non-Default” or “Bad Credit/Default” based on some given attributes.
Analytical Problem: We need to perform exploratory data analysis and build a model.
Here are the definitions of the columns in the data:
⮚ Age – Age of applicant
⮚ Education – Range 1-5
⮚ Employment – Range 0-33
⮚ Address – Range 0-34
⮚ Income – Monthly salary (in thousands)
⮚ Debtinc – Debt-to-income ratio
⮚ Credit debt – Credit card debt (credit loans)
⮚ Other debts – Any other loan liabilities
⮚ Default – 1 (Loan default), 0 (Non-default)
1. Understanding business problem
2. What is Default/Non-Default
3. Need of predictive model
4. Other problems that can be answered by similar scenario
5. What is Target/Dependent Variable
6. What are independent variables
7. Exploratory Analysis
8. Understanding Logistic Regression

Logistic Regression Modeling Applications
Models can be developed in domains such as risk, marketing or collections.
• Risk: Models to assess the creditworthiness of an application
✓ Credit card & Loans: Bad vs. Good
✓ Fraud Detection: Fraud vs. Not Fraud
• Collections: Models to identify cardholders that are likely to default and thus need collection effort (Payment Projection Models)
✓ Credit card & Loans: Default vs. Non Default
• Insurance: Models to identify claims that are Fraud or Not Fraud
• Marketing & Sales: Models to identify responders to promotional campaigns
✓ Marketing: Response vs. Non Response
✓ Sales: Buying vs. Not Buying
• Operations: Models to identify employees who are likely to attrite
✓ Operations: Attrition vs. Retention
• Website: Models to identify whether a visitor will click or not
✓ Website: Click vs. Not Click
• Gaming: Models to identify who will win
✓ Gaming: Win vs. Loss
• Health Care: Models to identify whether a patient will be cured or not
✓ Health Care: Cure vs. Not Cure

Meaning of Modeling
By “Modeling” we mean developing a set of equations or a mathematical formulation by which we can:
• Predict certain events
• Identify the drivers of certain events based on some explanatory variables
For example, we can build models to predict the drivers of sales or the risk of a borrower.
[Diagram: Historical Data → Statistical Analyses → Predict Future Events]

Recall: Modeling Process
Building Blocks of the Modeling Process
1) Data Preparation
• Data Extraction
• Data Collation
• Data Sampling and holdouts
2) Data Analysis
• Univariate Analysis
• Extreme value treatment
• Missing Value Treatment
• Graphical Representation
3) Model Development
• Multivariate Analysis
• Variable Creation
• Variable Transformation and Binning
• Multicollinearity Check
• Variable Selection
• Technique Selection
4) Model Validation
• Performance Analysis and charting
• Multiple model comparison
• Out of Sample Validation

Recall: What is Regression?
What?
Regression is formulating a functional relationship between a set of independent or explanatory variables (Xs) and a dependent or response variable (Y).
Y = f(X1, X2, X3, …, Xn)
Why?
▪ Knowledge of Y is crucial for decision making.
▪ Will he/she buy or not?
▪ Shall I offer him/her the loan or not?
▪ X is available at the time of decision making and is related to Y, thus making it possible to have a prediction of Y.

Recall: Types of Regression
• Y is continuous → Ordinary Least Squares Regression
  Examples: Sales Volume, Claim Amount, Number of Equipments, % Increase in Growth, etc.
• Y is binary (1/0) → Logistic Regression
  Examples: Buy/No Buy, Delinquent/Non Delinquent, Growth/No Growth, etc.
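The difference between the two output types can be sketched in a few lines: a linear model returns an unbounded number, while a logistic model passes the same linear score through the logistic (sigmoid) function to get a probability between 0 and 1, which is then turned into a 1/0 decision. The coefficients, feature values and the 0.5 cut-off below are hypothetical.

```python
import math


def linear_prediction(x, weights, intercept):
    """Linear form: the output is an unbounded continuous number."""
    return intercept + sum(w * v for w, v in zip(weights, x))


def logistic_prediction(x, weights, intercept):
    """Logistic form: squash the linear score into a probability in (0, 1)."""
    score = linear_prediction(x, weights, intercept)
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical coefficients and feature values, for illustration only.
weights, intercept = [0.8, -0.5], 0.1
x = [2.0, 1.0]

score = linear_prediction(x, weights, intercept)          # continuous output
probability = logistic_prediction(x, weights, intercept)  # between 0 and 1
label = 1 if probability >= 0.5 else 0                    # binary decision
print(score, probability, label)
```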

Quiz 1
Is Logistic regression mainly used for Regression?
A. True
B. False

Quiz 2
Which of the following methods do we use to best fit the data in Logistic Regression?
A. Least Square Error
B. Maximum Likelihood
C. Jaccard distance
D. Both A and B

Steps of Modelling
1. Outlier treatment
• Box plot
• Percentile (99)
2. Missing value treatment
3. Normalization of data
4. Handling multicollinearity
• Performed VIF
5. Dummy variable creation
• Handle categorical variables
6. Logistic model
• Variable selection
• Optimising model
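Several of these preparation steps can be sketched in plain Python. The helpers below are hypothetical and make simplifying assumptions: the percentile uses the nearest-rank rule, normalization is min-max scaling, dummy creation drops the first level, and the VIF shortcut 1 / (1 - r^2) holds only when there are exactly two predictors. The example caps at the 80th percentile instead of the 99th only because the list is tiny.

```python
def cap_at_percentile(values, pct=99):
    """Outlier treatment: cap values above the pct-th percentile (nearest rank)."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # ceiling of pct% of n
    cap = ordered[rank - 1]
    return [min(v, cap) for v in values]


def min_max_normalise(values):
    """Normalization: rescale values to the range 0..1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def dummy_variables(categories):
    """Dummy creation: one 0/1 column per level, dropping the first level."""
    levels = sorted(set(categories))[1:]
    return [{f"is_{level}": int(c == level) for level in levels} for c in categories]


def vif_two_predictors(x1, x2):
    """Multicollinearity: with two predictors, VIF = 1 / (1 - r^2)."""
    m1, m2 = sum(x1) / len(x1), sum(x2) / len(x2)
    sxy = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    sxx = sum((a - m1) ** 2 for a in x1)
    syy = sum((b - m2) ** 2 for b in x2)
    r_squared = sxy ** 2 / (sxx * syy)
    return 1.0 / (1.0 - r_squared)


# Hypothetical values, for illustration only.
capped = cap_at_percentile([20, 25, 30, 35, 400], pct=80)
scaled = min_max_normalise(capped)
dummies = dummy_variables(["LA", "SF", "LA"])
vif = vif_two_predictors([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(capped, scaled, dummies, vif)
```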

Missing Value Analysis
Console Output

Outlier Detection: identify abnormal patterns
Example: identify engine anomalies
Features:
- Heat generated
- Vibration of engine

Outlier Detection
Target function: outlier factor (0…1)
Some techniques:
- Statistical techniques
- Local outlier factor
- One-class SVM
ID    Total$   Age   City   OF
101   $200     25    SF     0.1
102   $350     35    LA     0.05
103   $25      15    LA     0.2
…     …        …     …      0.1
Further outlier factor values listed on the slide: 0.9, 0.2, 0.15, 0.1
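As a rough sketch of a statistical technique, each value can be scored by its absolute z-score, rescaled so that the scores fall between 0 and 1. This scaling is an assumption made for illustration; it is not the method behind the outlier factors on the slide, and the totals below are hypothetical.

```python
import statistics


def outlier_factors(values):
    """Scale absolute z-scores to 0..1; the most extreme value gets 1.0."""
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        # All values are identical, so nothing stands out.
        return [0.0 for _ in values]
    z_scores = [abs(v - mean) / spread for v in values]
    largest = max(z_scores)
    return [z / largest for z in z_scores]


# Hypothetical purchase totals; one value is far from the rest.
totals = [200, 350, 25, 210, 190, 5000]
factors = outlier_factors(totals)
most_unusual = totals[factors.index(max(factors))]
print(most_unusual)
```

Local outlier factor and one-class SVM replace this single global distance with density-based and boundary-based notions of "unusual".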

Quiz 3
Which of the following measures of central tendency will always change if a single value in the data changes?
A. Mean
B. Median
C. Mode
D. All

Quiz 4
The correlation between the age and health of a person is found to be -1.09. On the basis of this, what would you tell the doctors?
A. The age is a good predictor of health
B. The age is a poor predictor of health
C. None of these

Logistic Procedure-Variable Significance

Logistic Procedure-Concordance
Association of Predicted Probabilities and Observed Responses
Percent Concordant 79.0
Percent Discordant 19.1
Percent Tied 1.9
Pairs 3627468
✓ Concordance is used to assess how well scorecards separate the good and bad accounts in the development sample.
✓ The higher the concordance, the larger the separation of scores between good and bad accounts.
✓ The concordance ratio is a non-negative number, which theoretically may lie between 0 and 1.
Concordance Determination:
Among all pairs formed from one 0 observation and one 1 observation of the dependent variable, concordance is the percentage of pairs where the probability assigned to the observation with value 1 is greater than that assigned to the observation with value 0.
The percentage of concordant pairs should be greater than 60.
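The pair-counting definition above can be sketched directly. The function name and the predicted probabilities are hypothetical; the brute-force double loop is fine for a small example but would be slow for the millions of pairs in a real development sample.

```python
def concordance(probabilities, outcomes):
    """Compare every (event, non-event) pair by predicted probability."""
    events = [p for p, y in zip(probabilities, outcomes) if y == 1]
    non_events = [p for p, y in zip(probabilities, outcomes) if y == 0]
    concordant = discordant = tied = 0
    for p1 in events:
        for p0 in non_events:
            if p1 > p0:
                concordant += 1   # the event received the higher probability
            elif p1 < p0:
                discordant += 1   # the non-event received the higher probability
            else:
                tied += 1
    pairs = len(events) * len(non_events)
    return {
        "concordant": 100.0 * concordant / pairs,
        "discordant": 100.0 * discordant / pairs,
        "tied": 100.0 * tied / pairs,
        "pairs": pairs,
    }


# Hypothetical predicted probabilities and observed 1/0 outcomes.
probs = [0.9, 0.7, 0.4, 0.4, 0.2, 0.6]
labels = [1, 1, 1, 0, 0, 0]
result = concordance(probs, labels)
print(result)
```

The three percentages always sum to 100, as they do in the table above (79.0 + 19.1 + 1.9).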

Logistic Procedure-Confusion Matrix/ROC Curve

Solution
1. An AUC of 87% means that the model distinguishes between defaulters and non-defaulters 37 percentage points better than the baseline (random) model, whose AUC is 50%.
2. The initial event rate (default rate) of 26% is the baseline result, while the precision of the model is 55%, so the model provides about 2x lift.
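The arithmetic behind these two statements, using the figures quoted in the solution and assuming that a random model has an AUC of 50%:

```python
auc = 0.87          # AUC of the model, from the solution
random_auc = 0.50   # AUC of a random model (assumed baseline)
event_rate = 0.26   # share of defaulters in the data
precision = 0.55    # share of predicted defaulters who actually default

auc_gain = auc - random_auc    # improvement over random, in AUC points
lift = precision / event_rate  # how much better than picking at random

print(round(auc_gain, 2), round(lift, 2))
```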

Quiz 5
For the below confusion matrix, what is the recall?
1. 0.7
2. 0.8
3. 0.9
4. 0.95
