Machine Learning
REGRESSION
MR. U. A. NULI
COMPUTER SCIENCE AND ENGINEERING DEPARTMENT
TEXTILE AND ENGINEERING INSTITUTE, ICHALKARANJI
What is Regression?
Regression is a technique used to model and analyse the relationships between
variables and how they jointly contribute to producing a particular outcome.
Regression analysis is a form of predictive modelling technique which investigates the
relationship between a dependent (target) variable and one or more independent (predictor) variables.
Regression analysis is a conceptually simple method for investigating functional
relationships among variables.
Regression predicts a real and continuous value y for a given set of inputs X (X = x1, x2, …).
Regression is a Supervised Learning technique.
Regression Fundamentals
The simplest case to examine is one in which a variable Y, referred to as the dependent
or target variable, is related to one variable X, called an independent or
explanatory variable, a predictor variable, or simply a regressor.
In the simplest terms, the purpose of regression is to find the best-fit line or equation
that expresses the relationship between Y and X.
The simplest way to express a linear relation between Y and X is a line equation:
Y = W0 + W1*X
The relationship is expressed in the form of an equation or a model connecting the
response or dependent variable and one or more explanatory or predictor variables.
Regression Typical Examples
This technique is used for forecasting, time series modelling and finding
relationships between variables.
For example:
The relationship between rash driving and the number of road accidents caused by a driver is best
studied through regression.
A real estate appraiser may wish to relate the sale price of a home to selected
physical characteristics of the building and the taxes (local, school, county) paid on the
building.
Regression Applications
Predicting stock prices
Forecasting monthly sales
Predicting airfare
….
Regression Fundamentals
A regression model establishes a relation between the response/dependent variable y and the
independent/predictor variable x.
We can write the relationship using a hypothesis function h as
y = h(x)
where h is called the hypothesis function.
The hypothesis function describes the relationship between the x and y variables.
If the relationship is linear, the regression is called Linear Regression.
If the relationship is non-linear, the regression is called Non-Linear Regression.
Sometimes h(x) is also written as f(x).
(Block diagram: x → h → y)
Regression Fundamentals
h(x) can be expressed in different ways, e.g.:
h(x) = w0 + w1x -------------- (1)
h(x) = w0 + w1x1 + w2x2 + w3x3 + …. -------- (2)
h(x) = w0 + w1x² ----------------- (3)
h(x) = w0 + w1x1 + w2x2² ----------------- (4)
Here w0, w1, w2 are called the coefficients of regression or model parameters;
x, x1, x2 are independent/predictor variables.
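To make these hypothesis forms concrete, here is a minimal Python/NumPy sketch; the function names and weight values are illustrative assumptions, not from the slides:

```python
import numpy as np

# Equation (1): simple linear hypothesis with one predictor x
def h_linear(x, w0, w1):
    return w0 + w1 * x

# Equation (2): multiple linear hypothesis with predictors x1..xn
def h_multiple(x_vec, w0, w_vec):
    return w0 + np.dot(w_vec, x_vec)

# Equation (3): a non-linear hypothesis in a single predictor
def h_quadratic(x, w0, w1):
    return w0 + w1 * x**2

print(h_linear(3.0, 1.0, 2.0))                                       # 7.0
print(h_multiple(np.array([3.0, 4.0]), 1.0, np.array([2.0, 0.5])))   # 9.0
print(h_quadratic(3.0, 1.0, 2.0))                                    # 19.0
```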
Types of Regression:
Based on the hypothesis function used, regression can be categorized as:
Linear Regression:
The relation between the independent and dependent variables is linear, usually expressed
by a straight-line equation.
Example – equations 1 and 2
Simple Linear Regression:
There exists only one dependent variable, related to only one independent variable.
For example:
y = h(x) = w0 + w1x
(Figure: example of a linear regression line fitted to data. Source:
https://in.mathworks.com/help/matlab/data_analysis/linear-regression.html)
Multiple Linear Regression:
Most of the time the output Y cannot be predicted from a single independent variable but needs
multiple independent variables.
Regression that has one output variable and more than one input/independent
variable, with a linear relationship between inputs and output, is called multiple linear
regression.
Example:
y = h(x) = w0 + w1x1 + w2x2 + w3x3 + ….
Prediction of house price based on size of the house, age of the house, distance from the center
of the city, etc.
A graph with more than two independent variables is difficult to plot.
Non-linear Regression:
Non-linear regression has a non-linear relationship between the independent variable(s) and the
dependent variable.
The number of independent variables can be one or more than one.
A straight line cannot fit the data properly, so a linear equation is not suitable; instead,
non-linear regression is often expressed by a polynomial, and is hence also called polynomial
regression.
Examples:
y = h(x)
h(x) = w0 + w1x²    h(x) = w0 + w1 ln(x)    h(x) = w0 + w1e^x
h(x) = w0 + w1x1 + w2x2²    h(x) = w0 + w1 sin(x)
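As a hedged illustration of polynomial regression, a quadratic hypothesis can be fitted with NumPy's least-squares polynomial fit; the data below is synthetic, invented for this sketch:

```python
import numpy as np

# Synthetic data roughly following y = 2 + 0.5*x^2 (illustrative only)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2 + 0.5 * x**2 + rng.normal(scale=2.0, size=x.size)

# Fit h(x) = w0 + w1*x + w2*x^2; polyfit returns the highest power first
coeffs = np.polyfit(x, y, deg=2)
print(coeffs)                    # ≈ [0.5, ~0, 2]

y_hat = np.polyval(coeffs, x)    # predictions of the fitted polynomial
```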
Which Regression to Select?
It depends on the number of independent variables and their relationship with the dependent
variable in the data.
Single input variable and single output variable with a linear relationship
– Simple Linear Regression
Multiple input variables and one output variable with a linear relationship
– Multiple Linear Regression
Single/multiple input variables and one output variable with a nonlinear relationship
– Nonlinear or Polynomial Regression
Assumptions for Linear Regression:
Before analysing data using linear regression, it is necessary to make sure that the data
you want to analyse can actually be analysed using linear regression.
This can be checked with the following assumptions:
1. Variables used should be measured at the continuous level (variables need to be continuous
variables).
2. There needs to be a linear relationship between the independent and dependent variables.
(Check whether there exists a linear relationship using a suitable statistical test, e.g. the correlation
coefficient.)
3. Little or no multi-collinearity.
Multi-collinearity – one independent variable is correlated with another independent variable.
4. There should be no significant outliers.
An outlier is an observed data point whose dependent-variable value is very different from
the value predicted by the regression equation.
As such, an outlier will be a point on a scatterplot that is (vertically) far away from the regression
line, indicating that it has a large residual, as illustrated at:
https://statistics.laerd.com/spss-tutorials/linear-regression-using-spss-statistics.php
Correlation coefficient:
For a data set comprising n points of two variables x and y, the covariance is computed as:

cov(x, y) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
However, covariance can be a very large number. It is best to express it as a normalized
number between -1 and 1 to understand the relation between the quantities. This is
achieved by normalizing covariance with the standard deviations of both variables (s_x and s_y):

r = \frac{cov(x, y)}{s_x s_y}

This is called the correlation coefficient between x and y.
It is also called the Pearson correlation.
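A minimal NumPy sketch of this computation, using the house size/price pairs that appear in a later slide as sample data:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: covariance normalized by both standard deviations."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return cov / (x.std() * y.std())

x = [200, 300, 400, 500, 600]                  # house size (ft^2)
y = [250000, 350000, 450000, 550000, 650000]   # house price
print(pearson_r(x, y))            # 1.0 – a perfectly linear relationship
print(np.corrcoef(x, y)[0, 1])    # same value from NumPy's built-in
```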
Correlation measures the strength of linear dependence between X and Y and lies
between -1 and 1. (Figure: scatter plots illustrating how different correlation values
correspond to different strengths of linear dependence.)
Simple Linear Regression:
Simple linear regression has only one independent variable and one dependent variable.

Training Dataset:
    House Size (ft²) – x    House Price – y
1   200                     250000
2   300                     350000
3   400                     450000
4   500                     550000
5   600                     650000

Terminology:
n = total number of training examples (ex: 5)
x: input/independent/predictor variable
y: actual output variable
(x, y): one training example
(x(i), y(i)): i-th training example
Ex: x(1) = 200, y(1) = 250000
Simple linear regression
The response or target variable is predicted as
ŷ = h(x) = w0 + w1x
Since there is a possibility of a difference between the actual output value and the
predicted value, we can write the actual output as
y = ŷ + e = w0 + w1x + e
e = y − ŷ = y − (w0 + w1x);  if e is negative, take e = ŷ − y
Cost Function:
Objective: the error e ≈ 0, i.e. the difference between the
predicted output value and the actual output value
should be nearly zero.
A measure of how well the line fits the data, or how
well the hypothesis function predicts the output,
is specified by the cost function.
Different values of the weights (w0, w1) give us
different lines, and our task is to find the weights for
which we get the best fit.
Cost Function
For linear regression, the most commonly used cost function is the Mean
Squared Error (MSE) cost function.
It is the average over the data points (x_i, y_i) of the squared error
between the predicted value h(x_i) and y_i:

J(w_0, w_1) = \frac{1}{2n} \sum_{i=1}^{n} \big( h(x_i) - y_i \big)^2,   where h(x_i) = w_0 + w_1 x_i

Without the averaging factor, the sum of squared errors is also called the Residual Sum of Squares (RSS).
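A minimal sketch of this cost in NumPy; the 1/2n factor follows the formula above, and the sample data is the house-price table from the earlier slide:

```python
import numpy as np

def cost_J(w0, w1, x, y):
    """J(w0, w1) = (1/2n) * sum over i of (h(x_i) - y_i)^2, with h(x) = w0 + w1*x."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    h = w0 + w1 * x                             # predictions for all training examples
    return np.sum((h - y) ** 2) / (2 * len(x))

x = [200, 300, 400, 500, 600]
y = [250000, 350000, 450000, 550000, 650000]
print(cost_J(50000.0, 1000.0, x, y))   # 0.0 – this line fits the data exactly
print(cost_J(0.0, 1000.0, x, y))       # a worse line gives a larger cost
```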
Cost Function
Goal: to find values of w0 and w1 that minimize J(w0, w1).
Different values of w0, w1 give different lines fitting the data, and these different lines have
different costs.
Cost vs. w0, w1
(Figures: plots of the cost J(w0, w1) over the parameter space.)
How to estimate parameters w0, w1
The Least Squares Approach, using Normal Equations
The predicted output in simple linear regression is
ŷ = h(x) = w0 + w1x
The observed or actual output is y.
The error is e = ŷ − y.
Least squares starts with the sum of squared errors:

J(w_0, w_1) = \frac{1}{2n} \sum_{i=1}^{n} \big( h(x_i) - y_i \big)^2
1. Take the partial derivatives of J with respect to w0 and w1 and equate them to zero:

\frac{\partial J(w_0, w_1)}{\partial w_0} = 0        (1)

\frac{\partial J(w_0, w_1)}{\partial w_1} = 0        (2)

where J(w_0, w_1) = \frac{1}{2n} \sum_{i=1}^{n} \big( h(x_i) - y_i \big)^2
From equation 1:

\frac{\partial J}{\partial w_0} = \frac{1}{n} \sum_{i=1}^{n} \big( h(x_i) - y_i \big) \, \frac{\partial h(x_i)}{\partial w_0} = 0,   with h(x_i) = w_0 + w_1 x_i

Since \partial h(x_i) / \partial w_0 = 1, this gives

\sum_{i=1}^{n} \big( w_0 + w_1 x_i - y_i \big) = 0        (3)
From equation 2:

\frac{\partial J}{\partial w_1} = \frac{1}{n} \sum_{i=1}^{n} \big( h(x_i) - y_i \big) \, \frac{\partial h(x_i)}{\partial w_1} = 0,   with h(x_i) = w_0 + w_1 x_i

Since \partial h(x_i) / \partial w_1 = x_i, this gives

\sum_{i=1}^{n} \big( w_0 + w_1 x_i - y_i \big) x_i = 0        (4)
These equations are called the normal equations:

\sum_{i=1}^{n} \big( w_0 + w_1 x_i - y_i \big) = 0        (5)

\sum_{i=1}^{n} \big( w_0 + w_1 x_i - y_i \big) x_i = 0        (6)
From the first normal equation (equation 5):

\sum_{i=1}^{n} \big( w_0 + w_1 x_i - y_i \big) = 0
n w_0 + w_1 \sum_i x_i - \sum_i y_i = 0
w_0 = \frac{1}{n} \sum_i y_i - w_1 \frac{1}{n} \sum_i x_i
w_0 = \bar{y} - w_1 \bar{x}        (7)
From the second normal equation (equation 6), substituting w_0 = \bar{y} - w_1 \bar{x} (equation 7):

\sum_i \big( \bar{y} - w_1 \bar{x} + w_1 x_i - y_i \big) x_i = 0
\sum_i \big( \bar{y} - y_i + w_1 (x_i - \bar{x}) \big) x_i = 0
\sum_i \big( \bar{y} - y_i \big) x_i + w_1 \sum_i \big( x_i - \bar{x} \big) x_i = 0
w_1 \sum_i \big( x_i - \bar{x} \big) x_i = \sum_i \big( y_i - \bar{y} \big) x_i
w_1 = \frac{\sum_i (y_i - \bar{y}) \, x_i}{\sum_i (x_i - \bar{x}) \, x_i}
    = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}

The final equation is obtained from
PROBABILITY AND STATISTICS FOR COMPUTER SCIENTISTS, Second Edition, by Michael Baron,
CRC Press, page 366, equations 11.4 and 11.5.
w_0 = \frac{\sum_i y_i}{n} - w_1 \frac{\sum_i x_i}{n} = \bar{y} - w_1 \bar{x}

w_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}
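A minimal NumPy sketch of these closed-form estimates, applied to the house-price data from the earlier slide (a sketch of the formulas above, not the author's code):

```python
import numpy as np

def fit_simple_linear(x, y):
    """w1 = sum((x-xbar)(y-ybar)) / sum((x-xbar)^2), then w0 = ybar - w1*xbar."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xbar, ybar = x.mean(), y.mean()
    w1 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
    w0 = ybar - w1 * xbar
    return w0, w1

x = [200, 300, 400, 500, 600]
y = [250000, 350000, 450000, 550000, 650000]
w0, w1 = fit_simple_linear(x, y)
print(w0, w1)          # 50000.0 1000.0 for this perfectly linear data
print(w0 + w1 * 450)   # predicted price of a 450 ft^2 house: 500000.0
```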
How to estimate parameters w0, w1
Gradient Descent Algorithm
Optimization:
Optimization refers to the task of minimizing/maximizing an objective
function f(x) parameterized by x.
In machine/deep learning terminology, it is the task of minimizing the
cost/loss function J(w) parameterized by the model's parameters w ∈ R^d.
Optimization algorithms (in the case of minimization) have one of the
following goals:
• Find the global minimum of the objective function. This is feasible if the
objective function is convex, i.e. any local minimum is a global
minimum.
• Find the lowest possible value of the objective function within its
neighbourhood. That is usually the case if the objective function is not
convex, as it is in most deep learning problems.
There are three kinds of optimization algorithms:
• Optimization algorithms that are not iterative and simply solve for one point.
• Optimization algorithms that are iterative in nature and converge to an
acceptable solution regardless of the parameter initialization, such as
gradient descent applied to regression.
• Optimization algorithms that are iterative in nature and applied to a set of
problems that have non-convex cost functions, such as neural networks.
Here, parameter initialization plays a critical role in speeding up
convergence and achieving lower error rates.
Gradient Descent is the most common optimization algorithm in machine
learning and deep learning. It is used to find the values of the function
parameters (coefficients) that minimize the cost function as far as possible.
It is a first-order optimization algorithm. This means it only takes into account
the first derivative when performing the updates on the parameters.
On each iteration, we update the parameters in the opposite direction of
the gradient of the objective function J(w) w.r.t. the parameters, where the
gradient gives the direction of the steepest ascent.
The size of the step we take on each iteration to reach the local minimum is
determined by the learning rate α. Therefore, we follow the direction of the
slope downhill until we reach a local minimum.
Simplified example of Gradient Descent
Suppose you are at the top of a mountain and have to reach a lake at the lowest point of
the mountain (a.k.a. the valley). The twist is that you are blindfolded, with zero visibility of
where you are headed. So, what approach will you take to reach the lake?
The best way is to check the ground near you and observe where the land tends to descend.
This gives an idea of the direction in which to take your first step. If you keep following the
descending path, it is very likely you will reach the lake.
https://www.analyticsvidhya.com/blog/2017/03/introduction-to-gradient-descent-algorithm-along-its-variants/
Gradient Descent for Simple Linear Regression
The predicted output in simple linear regression is
ŷ = h(x) = w0 + w1x
The observed or actual output is y, and the error is e = ŷ − y.
The cost function used is:

J(w_0, w_1) = \frac{1}{2n} \sum_{i=1}^{n} \big( h(x_i) - y_i \big)^2
(Figure: graph of the cost J versus the weight w.)
General equation for Gradient Descent:

W_{new} = W - \alpha \nabla J(W)

Here:
J(W) is the cost function J(w_0, w_1),
\nabla J(W) can be written as \frac{\partial J}{\partial W},
\alpha is called the learning rate.
Hence the above equation can be written as:

W_{new} = W - \alpha \frac{\partial J}{\partial W}
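As a toy illustration of this update rule, the sketch below minimizes the convex function f(w) = (w − 3)², whose gradient is 2(w − 3); the function, learning rate and iteration count are made up for the example:

```python
alpha = 0.1   # learning rate
w = 0.0       # initial guess

for _ in range(100):
    grad = 2 * (w - 3)    # gradient of f(w) = (w - 3)^2 at the current w
    w = w - alpha * grad  # update step: move against the gradient
print(w)                  # ≈ 3.0, the minimizer of f
```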
For the cost function

J(w_0, w_1) = \frac{1}{2n} \sum_{i=1}^{n} \big( w_0 + w_1 x_i - y_i \big)^2

the gradient descent update can be written as

w_k^{new} = w_k - \alpha \frac{\partial J(w_0, w_1)}{\partial w_k},   where k = 0, 1.

We can write the equations for w0 and w1 as:

w_0^{new} = w_0 - \alpha \frac{\partial J(w_0, w_1)}{\partial w_0}

w_1^{new} = w_1 - \alpha \frac{\partial J(w_0, w_1)}{\partial w_1}
The required partial derivatives, with h(x_i) = w_0 + w_1 x_i, are:

\frac{\partial J(w_0, w_1)}{\partial w_0} = \frac{1}{n} \sum_{i=1}^{n} \big( h(x_i) - y_i \big) = \frac{1}{n} \sum_{i=1}^{n} \big( w_0 + w_1 x_i - y_i \big)

\frac{\partial J(w_0, w_1)}{\partial w_1} = \frac{1}{n} \sum_{i=1}^{n} \big( h(x_i) - y_i \big) x_i = \frac{1}{n} \sum_{i=1}^{n} \big( w_0 + w_1 x_i - y_i \big) x_i
Basic Gradient Descent Algorithm:

Repeat until convergence {
    w_k := w_k - \alpha \frac{\partial J}{\partial w_k}
}

This can be written as:

Repeat until convergence {
    w_0 := w_0 - \alpha \frac{\partial J(w_0, w_1)}{\partial w_0}
    w_1 := w_1 - \alpha \frac{\partial J(w_0, w_1)}{\partial w_1}
}
Substituting the derivatives:

Repeat until convergence {
    w_0 := w_0 - \alpha \frac{1}{n} \sum_{i=1}^{n} \big( w_0 + w_1 x_i - y_i \big)
    w_1 := w_1 - \alpha \frac{1}{n} \sum_{i=1}^{n} \big( w_0 + w_1 x_i - y_i \big) x_i
}
(Both updates use the current values of w_0 and w_1, i.e. the parameters are updated simultaneously.)
Learning Rate (α)
(Figures: two plots of J(W) versus W, illustrating how the choice of the learning rate α affects
the size of the update steps and the convergence towards the minimum.)
Steps in the Gradient Descent Algorithm:
1. Initialize W0 and W1 with random initial values.
2. Initialize the learning rate α.
3. Set the number of epochs.
4. Calculate the predicted output ŷ = h(x) for all the samples in the training dataset.
5. Calculate the cost J.
6. Estimate the new parameters W0new and W1new.
7. Set new values for the parameters W0 and W1 from W0new and W1new.
8. Repeat from step 4 until the number of epochs is over (see the sketch after this list).
Example split: a complete dataset of samples {1, 2, 3, 4, 5, 6, 7} might be divided into a
training dataset {2, 4, 5, 7} and a testing dataset {1, 3, 6}.
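A minimal NumPy sketch of these steps for simple linear regression; the toy data (following y = 2x + 1), learning rate and epoch count are illustrative assumptions:

```python
import numpy as np

# Illustrative training data following y = 2x + 1
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])

rng = np.random.default_rng(42)
w0, w1 = rng.normal(), rng.normal()   # step 1: random initial parameters
alpha = 0.05                          # step 2: learning rate
epochs = 2000                         # step 3: number of epochs
n = len(x)

for _ in range(epochs):
    h = w0 + w1 * x                              # step 4: predictions for all samples
    J = np.sum((h - y) ** 2) / (2 * n)           # step 5: cost (useful for monitoring)
    w0_new = w0 - alpha * np.sum(h - y) / n      # step 6: estimate new parameters
    w1_new = w1 - alpha * np.sum((h - y) * x) / n
    w0, w1 = w0_new, w1_new                      # step 7: simultaneous update

print(w0, w1)   # ≈ 1.0 and 2.0 after convergence
```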
Multiple Linear Regression
Multiple regression is used to predict the value of one output variable based on two
or more input variables.
Multiple linear regression (MLR), also known simply as multiple regression, is a
statistical technique that uses several explanatory variables to predict the
outcome of a response variable.
The goal of multiple linear regression (MLR) is to model the linear relationship
between the explanatory (independent) variables and the response (dependent)
variable.
Examples:
1. House price prediction based on the size of the house, the number of rooms in the
house, the number of floors, the age of the building, and the open space around the
building.
Here:
y = House Price
X1 = Size of the House
X2 = Number of Rooms
X3 = Number of floors
X4 = Age of the building
X5 = Open space
2. Prediction of a person's income based on education, work-class, country,
experience, etc.

The training data is laid out with one row per sample:

Sr. No.   X1      X2      X3      X4      X5      y
0         X1(0)   X2(0)   X3(0)   X4(0)   X5(0)   y(0)
1         X1(1)   X2(1)   X3(1)   X4(1)   X5(1)   y(1)
2         X1(2)   X2(2)   X3(2)   X4(2)   X5(2)   y(2)
3         X1(3)   X2(3)   X3(3)   X4(3)   X5(3)   y(3)
4         X1(4)   X2(4)   X3(4)   X4(4)   X5(4)   y(4)
The response or target variable is predicted as
ŷ = h(x) = w0 + w1x1 + w2x2 + w3x3 + … + wnxn
where
x1, x2, x3, ….., xn are input/independent/predictor variables,
ŷ is the predicted output variable,
w0, w1, w2, ….., wn are parameters or coefficients of regression.
Since there is a possibility of a difference between the actual output value and the
predicted value, we can write the actual output as
y = ŷ + e = w0 + w1x1 + w2x2 + w3x3 + … + wnxn + e
e = y − ŷ = y − (w0 + w1x1 + w2x2 + w3x3 + … + wnxn);  if e is negative, take e = ŷ − y
Parameter Estimation in Multiple Linear Regression:
The Gradient Descent Algorithm is used to estimate the parameters in
multiple linear regression.
Basic Gradient Descent Algorithm:

Repeat until convergence {
    w_k := w_k - \alpha \frac{\partial J(W)}{\partial w_k}
}

The cost function is:

J(W) = \frac{1}{2n} \sum_{i=1}^{n} \big( h(X_i) - y_i \big)^2

where X_i is the i-th input in the dataset and y_i is the i-th output in the dataset.
Basic Gradient Descent Algorithm:

Repeat until convergence {
    w_0 := w_0 - \alpha \frac{\partial J(w_0, w_1, \ldots, w_n)}{\partial w_0}
    w_k := w_k - \alpha \frac{\partial J(w_0, w_1, \ldots, w_n)}{\partial w_k},   where k = 1, 2, …, n
}
The required partial derivatives, with h(X_i) = w_0 + w_1 x_1^{(i)} + w_2 x_2^{(i)} + \ldots + w_n x_n^{(i)}, are:

\frac{\partial J(w_0, w_1, \ldots, w_n)}{\partial w_0} = \frac{1}{n} \sum_{i=1}^{n} \big( h(X_i) - y_i \big)

\frac{\partial J(w_0, w_1, \ldots, w_n)}{\partial w_1} = \frac{1}{n} \sum_{i=1}^{n} \big( h(X_i) - y_i \big) x_1^{(i)}

\frac{\partial J(w_0, w_1, \ldots, w_n)}{\partial w_k} = \frac{1}{n} \sum_{i=1}^{n} \big( h(X_i) - y_i \big) x_k^{(i)},   where k = 1, 2, 3, …, n
Repeat until convergence {
    w_0 := w_0 - \alpha \frac{1}{n} \sum_{i=1}^{n} \big( w_0 + w_1 x_1^{(i)} + w_2 x_2^{(i)} + \ldots + w_n x_n^{(i)} - y_i \big)
    w_1 := w_1 - \alpha \frac{1}{n} \sum_{i=1}^{n} \big( w_0 + w_1 x_1^{(i)} + \ldots + w_n x_n^{(i)} - y_i \big) x_1^{(i)}
    ⋮
    w_k := w_k - \alpha \frac{1}{n} \sum_{i=1}^{n} \big( w_0 + w_1 x_1^{(i)} + \ldots + w_n x_n^{(i)} - y_i \big) x_k^{(i)}
}
where k = 1, 2, 3, …, n and x_k^{(i)} denotes the k-th input variable of the i-th sample.
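A vectorized NumPy sketch of these updates; the data (generated from y = 1 + 2x1 + 3x2) and the hyperparameters are illustrative assumptions, not from the slides:

```python
import numpy as np

# Illustrative data: 5 samples, 2 features, generated from y = 1 + 2*x1 + 3*x2
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [4.0, 2.0], [5.0, 4.0]])
y = 1 + 2 * X[:, 0] + 3 * X[:, 1]

n, d = X.shape
Xb = np.hstack([np.ones((n, 1)), X])   # prepend a column of 1s so w[0] plays the role of w0
w = np.zeros(d + 1)                    # [w0, w1, ..., wn]
alpha, epochs = 0.05, 20000

for _ in range(epochs):
    residuals = Xb @ w - y                  # h(X_i) - y_i for every sample at once
    w = w - alpha * (Xb.T @ residuals) / n  # simultaneous update of all w_k

print(np.round(w, 3))   # ≈ [1. 2. 3.]
```

On real data, scaling the features to similar ranges usually makes this loop converge much faster.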