Machine Learning with Python - Machine Learning Algorithms - Logistic Regression
1. Machine Learning with Python
Machine Learning Algorithms - Logistic Regression
Prof. Shibdas Dutta,
Associate Professor,
DCG Data-Core Systems India Pvt Ltd,
Kolkata
Company Confidential: Data-Core Systems, Inc. | datacoresystems.com
2. Machine Learning Algorithms – Classification Algorithms – Logistic Regression
Logistic Regression - Introduction
Logistic regression is a supervised learning classification algorithm used to predict the
probability of a target variable. The nature of the target or dependent variable is
dichotomous, which means there are only two possible classes.
In simple words, the dependent variable is binary in nature, with data coded as either
1 (stands for success/yes) or 0 (stands for failure/no).
Mathematically, a logistic regression model predicts P(Y=1) as a function of X.
It is one of the simplest ML algorithms and can be used for various classification
problems such as spam detection, diabetes prediction, cancer detection, etc.
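As a quick, hedged illustration of predicting P(Y=1) as a function of X (the weights and feature values below are invented), the model simply passes a linear score through the sigmoid function:

import numpy as np

def sigmoid(z):
    # squashes any real-valued score into the (0, 1) range
    return 1 / (1 + np.exp(-z))

theta = np.array([-1.5, 0.8, 2.0])  # hypothetical learned weights, intercept first
x = np.array([1.0, 0.5, 1.2])       # one example, with a leading 1 for the intercept

print(sigmoid(np.dot(theta, x)))    # ~0.79 = P(Y=1 | x), so this example is classified as 1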
3. Types of Logistic Regression
Generally, logistic regression means binary logistic regression having a binary target
variable, but there are two more categories of target variable that can be predicted
by it. Based on the number of categories, logistic regression can be divided into the
following types:
Binary or Binomial
In this kind of classification, the dependent variable has only two possible types,
either 1 or 0. For example, these variables may represent success or failure, yes or no,
win or loss, etc.
Multinomial
In this kind of classification, the dependent variable can have 3 or more possible
unordered types, i.e. types having no quantitative significance. For example, these
variables may represent “Type A”, “Type B” or “Type C”.
Ordinal
In this kind of classification, the dependent variable can have 3 or more possible ordered
types, i.e. types having a quantitative significance. For example, these variables may
represent “poor”, “good”, “very good” or “excellent”, and each category can have a
score like 0, 1, 2, 3.
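To make the three types concrete, here is a small sketch of how such target variables might be encoded (the labels simply mirror the examples above):

import numpy as np

# Binary/Binomial: exactly two possible types
y_binary = np.array([1, 0, 1, 1, 0])       # e.g. win = 1, loss = 0

# Multinomial: 3 or more unordered types; the integer codes carry no order
y_multinomial = np.array([0, 2, 1, 0, 2])  # e.g. Type A = 0, Type B = 1, Type C = 2

# Ordinal: 3 or more ordered types; here the codes do carry meaning
y_ordinal = np.array([0, 1, 3, 2, 1])      # poor = 0, good = 1, very good = 2, excellent = 3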
4. Logistic Regression Assumptions
Before diving into the implementation of logistic regression, we must be aware of the following assumptions:
• In the case of binary logistic regression, the target variable must always be binary, and the desired outcome is
represented by the factor level 1.
• There should not be any multicollinearity in the model, which means the independent variables must be independent
of each other.
• We must include meaningful variables in our model.
• We should choose a large sample size for logistic regression.
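One quick, informal way to check the multicollinearity assumption is to inspect pairwise correlations between the independent variables; a sketch with pandas (the column names and values are invented):

import pandas as pd

# hypothetical predictor table; in practice use your own independent variables
df = pd.DataFrame({
    'age':    [25, 32, 47, 51, 62],
    'income': [30, 42, 80, 95, 110],
    'tenure': [1, 3, 10, 12, 20],
})

# absolute pairwise correlations; values close to 1 suggest multicollinearity,
# in which case one of the offending variables should be dropped or combined
print(df.corr().abs().round(2))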
5. Binary Logistic Regression model
The simplest form of logistic regression is binary or binomial logistic regression, in which the target or dependent variable
can have only 2 possible types, either 1 or 0. It allows us to model a relationship between multiple predictor variables and a
binary/binomial target variable. In logistic regression, the linear function is used as the input to another
function g in the following relation:
hθ(x) = g(θᵀx), where 0 ≤ hθ(x) ≤ 1
Here, g is the logistic or sigmoid function: g(z) = 1 / (1 + e^(−z)), where e ≈ 2.718 is the base of the natural logarithm.
The y-values of the sigmoid curve lie between 0 and 1, and the curve crosses y = 0.5 at z = 0.
The classes can be divided into positive or negative.
Since the output lies between 0 and 1, it can be interpreted as the probability of the positive class.
For our implementation, we interpret the output of the hypothesis function as positive if it is ≥ 0.5, and negative otherwise.
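A tiny sketch confirming the two properties just stated: the sigmoid's outputs stay strictly between 0 and 1, and it crosses 0.5 exactly at z = 0:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

z = np.linspace(-10, 10, 201)
g = sigmoid(z)
print(g.min() > 0, g.max() < 1)  # True True: outputs lie strictly inside (0, 1)
print(sigmoid(0))                # 0.5: the curve crosses the midpoint at z = 0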
6. We also need to define a loss function to measure how well the algorithm performs, using the weights (represented by theta in our notation).
Loss function
Functions have parameters/weights and we want to find the best values for them. To start
we pick random values, and we need a way to measure how well the algorithm performs using those random weights. That
measure is computed using the loss function, defined as:
J(θ) = mean( −y · log(hθ(x)) − (1 − y) · log(1 − hθ(x)) )
In code, with h the predicted probabilities and y the true labels (assuming numpy is imported as np):
def loss(h, y):
    return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean()
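A quick worked check of this loss on two hand-picked predictions shows the intended behavior: confident correct predictions are penalized lightly, confident wrong ones heavily (the numbers are invented):

import numpy as np

def loss(h, y):
    return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean()

h = np.array([0.9, 0.1])          # predicted P(Y=1) for two examples
print(loss(h, np.array([1, 0])))  # ~0.105: both predictions match their labels
print(loss(h, np.array([0, 1])))  # ~2.303: both predictions are confidently wrong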
7. Now, after defining the loss function, our prime goal is to minimize it. We do so
by fitting the weights, which means increasing or decreasing them. The
derivatives of the loss function w.r.t. each weight tell us which
parameters should have a higher weight and which a smaller one.
Gradient descent
The following gradient tells us how the loss would change if we modified
the parameters:
Partial derivative: ∂J(θ)/∂θ = (1/m) · Xᵀ(hθ(x) − y), where m is the number of training examples.
gradient = np.dot(X.T, (h - y)) / y.shape[0]
Then we update the weights by subtracting from them the derivative times the learning rate:
lr = 0.01
theta -= lr * gradient
We should repeat these steps several times until we reach the optimal solution.
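The analytic gradient above can be sanity-checked numerically; the following sketch compares it against centered finite differences on random toy data:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def loss(theta, X, y):
    h = sigmoid(X @ theta)
    return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean()

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))              # 20 toy examples, 3 features
y = (rng.random(20) > 0.5).astype(float)  # random binary labels
theta = rng.normal(size=3)

# analytic gradient from the slide: X^T (h - y) / m
analytic = X.T @ (sigmoid(X @ theta) - y) / y.shape[0]

# centered finite differences, one parameter at a time
eps = 1e-6
numeric = np.zeros_like(theta)
for j in range(theta.size):
    d = np.zeros_like(theta)
    d[j] = eps
    numeric[j] = (loss(theta + d, X, y) - loss(theta - d, X, y)) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-6))  # True: the formula checks out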
8. Predictions
By calling the sigmoid function we get the probability that some input x belongs to class 1.
Let’s take all probabilities ≥ 0.5 as class 1 and all probabilities < 0.5 as class 0.
This threshold should be defined depending on the business problem we are working on.
def predict_probs(X, theta):
    return sigmoid(np.dot(X, theta))

def predict(X, theta, threshold=0.5):
    return predict_probs(X, theta) >= threshold
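For example, assuming theta has already been fitted, moving the threshold trades precision against recall (predict_probs and predict are the functions defined above):

probs = predict_probs(X, theta)             # raw probabilities in (0, 1)
strict = predict(X, theta, threshold=0.7)   # flag class 1 only when fairly sure: fewer false positives
lenient = predict(X, theta, threshold=0.3)  # flag class 1 more readily: fewer false negatives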
9. Implementation in Python
Now we will implement the above concept of binomial logistic regression in Python. For this
purpose, we are using a multivariate flower dataset named ‘iris’, which has 3 classes of 50
instances each, but we will be using only the first two feature columns. Every class represents a type of
iris flower.
First, we need to import the necessary libraries as follows:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import datasets
Next, load the iris dataset as follows:
iris = datasets.load_iris()
X = iris.data[:, :2]
y = (iris.target != 0) * 1
We can plot our training data as follows, to see whether it can be separated with a decision boundary or not:
plt.figure(figsize=(10, 6))
plt.scatter(X[y == 0][:, 0], X[y == 0][:, 1], color='g', label='0')
plt.scatter(X[y == 1][:, 0], X[y == 1][:, 1], color='y', label='1')
plt.legend();
10. It seems that the classes can be separated using a decision boundary; now let's define our class.
Next, we will define the sigmoid function, loss function and gradient descent as follows:
11. class LogisticRegression:
    # defining parameters such as learning rate, number of iterations, whether to include an intercept,
    # and verbose, which says whether to print anything or not, like the loss etc.
    def __init__(self, lr=0.01, num_iter=100000, fit_intercept=True, verbose=False):
        self.lr = lr
        self.num_iter = num_iter
        self.fit_intercept = fit_intercept
        self.verbose = verbose

    def add_intercept(self, X):  # function to define the intercept value
        intercept = np.ones((X.shape[0], 1))  # initially we set it as all 1's
        # then we concatenate it to X: we don't add the values, we prepend the column of 1's
        return np.concatenate((intercept, X), axis=1)

    def sigmoid(self, z):  # this is our actual sigmoid function which predicts our probabilities
        return 1 / (1 + np.exp(-z))

    def loss(self, h, y):  # this is the loss function which we use to minimize the error of our model
        return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean()

    def fit(self, X, y):  # this is the function which trains our model
        if self.fit_intercept:
            X = self.add_intercept(X)  # as said, if we want our intercept term added we use fit_intercept=True
12. Now, initialize the weights as follows:
        # weights initialization of our normal vector; initially we set it to 0, then we learn it eventually
        self.theta = np.zeros(X.shape[1])

        for i in range(self.num_iter):  # this for loop runs for the number of iterations provided
            z = np.dot(X, self.theta)  # this is our theta * Xi
            h = self.sigmoid(z)  # this is where we predict the values of y based on theta and Xi
            # this is where the gradient is calculated from the error generated by our model
            gradient = np.dot(X.T, (h - y)) / y.size
            # this is where we update our values of theta, so that we can use the new values for the next iteration
            self.theta -= self.lr * gradient

            z = np.dot(X, self.theta)  # this is our new theta * Xi
            h = self.sigmoid(z)
            loss = self.loss(h, y)  # this is where the loss is calculated
            # as mentioned above, if we want to print something we set verbose=True, and our loss gets printed
            if self.verbose == True and i % 10000 == 0:
                print(f'loss: {loss} \t')
13. With the help of the following script, we can predict the output probabilities:
    # this is where we predict the probability values based on the theta values learned
    # over all those iterations
    def predict_prob(self, X):
        # as said, if we want our intercept term added we use fit_intercept=True
        if self.fit_intercept:
            X = self.add_intercept(X)
        # this is the final prediction that is generated based on the values learned
        return self.sigmoid(np.dot(X, self.theta))

    # this is where we predict the actual values 0 or 1 using round:
    # anything less than 0.5 becomes 0, anything 0.5 or more becomes 1
    def predict(self, X):
        return self.predict_prob(X).round()
14. Next, we can train the model, evaluate it and plot the decision boundary as follows:
model = LogisticRegression(lr=0.1, num_iter=300000)
model.fit(X, y)  # train the model first; without this call there is no theta to predict with
preds = model.predict(X)  # how well our predictions work
(preds == y).mean()

plt.figure(figsize=(10, 6))
plt.scatter(X[y == 0][:, 0], X[y == 0][:, 1], color='g', label='0')
plt.scatter(X[y == 1][:, 0], X[y == 1][:, 1], color='y', label='1')
plt.legend()
x1_min, x1_max = X[:, 0].min(), X[:, 0].max()
x2_min, x2_max = X[:, 1].min(), X[:, 1].max()
xx1, xx2 = np.meshgrid(np.linspace(x1_min, x1_max), np.linspace(x2_min, x2_max))
grid = np.c_[xx1.ravel(), xx2.ravel()]
probs = model.predict_prob(grid).reshape(xx1.shape)
plt.contour(xx1, xx2, probs, [0.5], linewidths=1, colors='red');
17. We can also compute and visualize the confusion matrix for our predictions:
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y, model.predict(X))
fig, ax = plt.subplots(figsize=(8, 8))
ax.imshow(cm)
ax.grid(False)
ax.xaxis.set(ticks=(0, 1), ticklabels=('Predicted 0s', 'Predicted 1s'))
ax.yaxis.set(ticks=(0, 1), ticklabels=('Actual 0s', 'Actual 1s'))
ax.set_ylim(1.5, -0.5)
for i in range(2):
    for j in range(2):
        ax.text(j, i, cm[i, j], ha='center', va='center', color='white')
plt.show()
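To turn the confusion matrix into the usual summary metrics, a short sketch using scikit-learn, reusing the model and data from the slides above:

from sklearn.metrics import classification_report

# per-class precision, recall and F1, plus overall accuracy
print(classification_report(y, model.predict(X)))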
18. Multinomial Logistic Regression Model
Another useful form of logistic regression is multinomial logistic regression, in which the target or dependent variable can
have 3 or more possible unordered types, i.e. types having no quantitative significance.
Implementation in Python
Now we will implement the above concept of multinomial logistic regression in Python. For this
purpose, we are using a dataset from sklearn named digits.
First, we need to import the necessary libraries as follows:
import sklearn
from sklearn import datasets
from sklearn import linear_model
from sklearn import metrics
from sklearn.model_selection import train_test_split
Next, we need to load the digits dataset:
digits = datasets.load_digits()
19. Now, define the feature matrix (X) and response vector (y) as follows:
X = digits.data
y = digits.target
With the help of the next line of code, we can split X and y into training and testing sets:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=1)
Now create a logistic regression object as follows:
digreg = linear_model.LogisticRegression()
Now, we need to train the model by using the training set as follows:
digreg.fit(X_train, y_train)
Next, make the predictions on the testing set as follows:
y_pred = digreg.predict(X_test)
20. Next, print the accuracy of the model as follows:
print("Accuracy of Logistic Regression model is:",
      metrics.accuracy_score(y_test, y_pred) * 100)
Output
Accuracy of Logistic Regression model is: 95.6884561891516
From the above output we can see that the accuracy of our model is around 96 percent.
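As in the binary case, it can be worth going beyond a single accuracy number; a short sketch that prints the 10x10 confusion matrix for the digits model above, reusing y_test and y_pred:

from sklearn.metrics import confusion_matrix

# rows are the actual digits 0-9, columns the predicted ones;
# off-diagonal entries show which digits the model mixes up
print(confusion_matrix(y_test, y_pred))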