Correlation and regression
Correlation
• We are going to examine the linear correlation between two
variables, that is, how strong a linear relationship there is
between them.
• If the relationship between the two variables is sufficiently strong
(and there is cause and effect), we can calculate a
“regression line”, which is given by:
$E[Y \mid x] = y = \alpha + \beta x$
E(S2)
P(X=x)=p(X=Head)=1/2
• The sample correlation coefficient r is given by:
• $r = S_{xy} / \sqrt{S_{xx} S_{yy}}$
• r is such that -1≤r ≤1
• r is a measure of linear association and does not itself indicate “cause
and effect”
• $\rho = \mathrm{corr}(X, Y) = \mathrm{cov}(X, Y) / \sqrt{\mathrm{var}(X)\,\mathrm{var}(Y)}$
• The sample correlation coefficient, r, is an estimator of the population
correlation coefficient, ρ.
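• $r = S_{xy} / \sqrt{S_{xx} S_{yy}}$
• $S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - n\bar{x}^2$
• $S_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - n\bar{y}^2$
• $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - n\bar{x}\bar{y}$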
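The second form of each sum follows by expanding the brackets and using $\sum x_i = n\bar{x}$ and $\sum y_i = n\bar{y}$. The slides state the result without this step; a short sketch for $S_{xx}$ and $S_{xy}$ ($S_{yy}$ works the same way as $S_{xx}$):

```latex
\begin{align*}
S_{xx} &= \sum (x_i - \bar{x})^2
        = \sum x_i^2 - 2\bar{x}\sum x_i + n\bar{x}^2
        = \sum x_i^2 - 2n\bar{x}^2 + n\bar{x}^2
        = \sum x_i^2 - n\bar{x}^2 \\
S_{xy} &= \sum (x_i - \bar{x})(y_i - \bar{y})
        = \sum x_i y_i - \bar{y}\sum x_i - \bar{x}\sum y_i + n\bar{x}\bar{y}
        = \sum x_i y_i - n\bar{x}\bar{y}
\end{align*}
```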
• A new computerized ultrasound scanning technique has enabled
doctors to monitor the weight of unborn babies. The table below
shows the estimated weights for one particular foetus at fortnightly
intervals during the pregnancy.
Gestation period (weeks), x:       30    32    34    36    38    40
Estimated foetal weight (kg), y:   1.6   1.7   2.5   2.8   3.2   3.5
• Calculate $S_{xx}$, $S_{yy}$ and $S_{xy}$ for the foetal weights example, and hence r:
• $r = S_{xy} / \sqrt{S_{xx} S_{yy}}$
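One way to carry out this calculation is sketched below in Python, using only the six data pairs from the table and the definitions of $S_{xx}$, $S_{yy}$ and $S_{xy}$. The variable names are illustrative and do not come from the slides.

```python
from math import sqrt

# Data from the foetal weights table.
x = [30, 32, 34, 36, 38, 40]          # gestation period (weeks)
y = [1.6, 1.7, 2.5, 2.8, 3.2, 3.5]    # estimated foetal weight (kg)

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Sample correlation coefficient.
r = s_xy / sqrt(s_xx * s_yy)

print(f"Sxx = {s_xx:.3f}, Syy = {s_yy:.3f}, Sxy = {s_xy:.3f}, r = {r:.4f}")
```

For these data the sums come out as Sxx = 70, Syy = 3.015 and Sxy = 14.3, giving r of about 0.984, a strong positive linear association.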
Coefficient of Determination
• The proportion of the total variability of the responses “explained” by
a model is called the coefficient of determination, denoted $R^2$.
• The proportion is:
$R^2 = SS_{REG} / SS_{TOT} = S_{xy}^2 / (S_{xx} S_{yy})$
$R^2$ can take values between 0% and 100% inclusive.
Goodness of fit
• Partitioning the variability of the responses
• To help understand the “goodness of fit” of the model to the data,
the total variation in the responses, $S_{yy} = \sum (y_i - \bar{y})^2$, should be studied.
Some of the variation in the responses can be attributed to the
relationship with x (e.g. y may be high when x is high, low when x is
low) and some is random variation (unmodellable).
How much of the variation is attributable to, or explained by, the model is a
measure of the goodness of fit of the model.
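• We start from an identity involving $y_i$ (the observed y value), $\bar{y}$ (the
overall average of the y values) and $\hat{y}_i$ (the predicted value of y):
• $y_i - \bar{y} = (y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})$
• Squaring both sides and summing over i gives:
• $\sum (y_i - \bar{y})^2 = \sum (y_i - \hat{y}_i)^2 + \sum (\hat{y}_i - \bar{y})^2$
• The cross-product term, $2\sum (y_i - \hat{y}_i)(\hat{y}_i - \bar{y})$, vanishes.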
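The slides do not show why the cross-product term vanishes. A sketch of the argument, assuming the line is fitted by least squares so that $\hat{\beta} = S_{xy}/S_{xx}$ and $\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$ (these estimators are standard but are not stated in the slides):

```latex
\begin{align*}
\hat{y}_i - \bar{y} &= \hat{\alpha} + \hat{\beta}x_i - \bar{y} = \hat{\beta}(x_i - \bar{x}) \\
y_i - \hat{y}_i &= (y_i - \bar{y}) - \hat{\beta}(x_i - \bar{x}) \\
\sum (y_i - \hat{y}_i)(\hat{y}_i - \bar{y})
  &= \hat{\beta}\sum \left[(y_i - \bar{y}) - \hat{\beta}(x_i - \bar{x})\right](x_i - \bar{x}) \\
  &= \hat{\beta}\left(S_{xy} - \hat{\beta}S_{xx}\right) = 0
\end{align*}
```

The last step uses $\hat{\beta}S_{xx} = S_{xy}$. For a line fitted in some other way the cross-product term need not be zero.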
• The sum on the left is the “total sum of squares” of the responses, denoted
here by $SS_{TOT}$.
• The second sum on the right is the sum of the squares of the deviations of
the fitted responses from the overall mean. It summarises the variability
accounted for, or “explained”, by the model. It is called the regression sum
of squares, denoted here by $SS_{REG}$.
• The first sum on the right is the sum of the squares of the estimated errors
(response minus fitted response, generally referred to as the residual from the fit). It
summarises the remaining variability, that between the responses and
their fitted values, which is unexplained by the model. It is called the residual
sum of squares, denoted by $SS_{RES}$.
• So:
• $SS_{TOT} = SS_{RES} + SS_{REG}$
• In this case (the simple linear regression model), note that the value of
the coefficient of determination is the square of the correlation
coefficient for the data.
• $r = S_{xy} / \sqrt{S_{xx} S_{yy}}$
• $R^2 = SS_{REG} / SS_{TOT} = S_{xy}^2 / (S_{xx} S_{yy}) = r^2$
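The partition and the identity $R^2 = r^2$ can be checked numerically on the foetal weights data. The sketch below assumes the line is fitted by least squares, with slope Sxy/Sxx and intercept ȳ − slope·x̄. The slides do not give these estimators, so they are an assumption here, and the variable names are illustrative.

```python
from math import sqrt

# Foetal weights data from the earlier example.
x = [30, 32, 34, 36, 38, 40]          # gestation period (weeks)
y = [1.6, 1.7, 2.5, 2.8, 3.2, 3.5]    # estimated foetal weight (kg)

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Assumed least-squares estimates (not stated in the slides).
beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar
y_hat = [alpha_hat + beta_hat * xi for xi in x]

# The three sums of squares in the partition.
ss_tot = sum((yi - y_bar) ** 2 for yi in y)
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
ss_reg = sum((fi - y_bar) ** 2 for fi in y_hat)

r = s_xy / sqrt(s_xx * ss_tot)        # ss_tot equals Syy
r_squared = ss_reg / ss_tot

print(f"SSTOT = {ss_tot:.4f}, SSRES + SSREG = {ss_res + ss_reg:.4f}")
print(f"R^2 = {r_squared:.4f}, r^2 = {r ** 2:.4f}")
```

Both printed pairs agree up to floating-point rounding, with $R^2$ close to 0.969, so about 97% of the variation in the estimated weights is explained by the fitted line.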
• A sample of ten claims and corresponding payments on settlement
for household policies is taken from the business of an insurance
company.
The amounts, in units of $100, are as follows:
Claim, x:     2.10  2.40  2.50  3.20  3.60  3.80  4.10  4.20  4.50  5.00
Payment, y:   2.18  2.06  2.54  2.61  3.67  3.25  4.02  3.71  4.38  4.45
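The slide gives the data without stating what is to be calculated. A natural exercise, assumed here, is to compute r and $R^2$ and to fit the regression line of payment on claim. The sketch below uses the shortcut forms of $S_{xx}$, $S_{yy}$ and $S_{xy}$. The least-squares estimates are the standard ones and are not given in the slides, and all variable names are illustrative.

```python
from math import sqrt

# Data from the table, in units of $100.
claims = [2.10, 2.40, 2.50, 3.20, 3.6, 3.8, 4.1, 4.2, 4.5, 5.0]
payments = [2.18, 2.06, 2.54, 2.61, 3.67, 3.25, 4.02, 3.71, 4.38, 4.45]

n = len(claims)
x_bar = sum(claims) / n
y_bar = sum(payments) / n

# Shortcut forms: sum of squares (or products) minus n times the mean term.
s_xx = sum(x * x for x in claims) - n * x_bar ** 2
s_yy = sum(y * y for y in payments) - n * y_bar ** 2
s_xy = sum(x * y for x, y in zip(claims, payments)) - n * x_bar * y_bar

r = s_xy / sqrt(s_xx * s_yy)
r_squared = s_xy ** 2 / (s_xx * s_yy)

# Assumed least-squares fit of payment on claim (not stated in the slides).
beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar

print(f"Sxx = {s_xx:.4f}, Syy = {s_yy:.4f}, Sxy = {s_xy:.4f}")
print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
print(f"fitted line: y = {alpha_hat:.3f} + {beta_hat:.3f} x")
```

For these data r is about 0.958 and $R^2$ about 0.918, so roughly 92% of the variation in payments is explained by the linear relationship with the claim amount.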
