
Covariance and Correlation in R Programming

Last Updated : 12 Jul, 2025

Covariance and correlation are statistical measures of the relationship between two random variables. Both quantify linear dependence in bivariate data, but each captures a different aspect of it: covariance shows how the two variables vary together, while correlation describes both the strength and the direction of the linear relationship.

Difference between Covariance and Correlation

The main differences between them are summarized below:

| Feature | Covariance | Correlation |
|---|---|---|
| Definition | Shows how two variables change together. | Shows the strength and direction of a linear relationship. |
| Scale | Not standardized and depends on the units of variables. | Standardized and always lies between -1 and 1. |
| Comparability | Difficult to compare across different datasets. | Easy to compare across different datasets. |
| Value Range | Can take any value from negative to positive infinity. | Values always range between -1 and 1. |
| Use Cases | Common in risk analysis and portfolio management. | Used in data analysis, regression and prediction models. |
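The scale dependence of covariance can be illustrated with a small sketch. The vectors below are the same ones used in the examples later in this article; the factor of 100 is an arbitrary choice standing in for a change of units (for instance, metres to centimetres), and the variable names are illustrative only.

```r
x <- c(1, 3, 5, 10)
y <- c(2, 4, 6, 20)

# Rescale x by an arbitrary factor, as a change of units would
x_scaled <- x * 100

cov_original <- cov(x, y)
cov_scaled   <- cov(x_scaled, y)
cor_original <- cor(x, y)
cor_scaled   <- cor(x_scaled, y)

# Covariance grows by the same factor as x
print(c(cov_original, cov_scaled))

# Correlation is unaffected by the rescaling
print(c(cor_original, cor_scaled))
```

Because correlation divides out the spread of each variable, it stays the same under this kind of rescaling, which is why it is easier to compare across datasets.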

Covariance in R Programming Language

In R, covariance is computed with the cov() function. Covariance measures the direction of the linear relationship between two data vectors: a positive value means they tend to increase together, and a negative value means one tends to decrease as the other increases.

Mathematically,

\operatorname{Cov}(x, y)=\frac{\sum\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)}{N-1}

Where:

  • x represents the x data vector
  • y represents the y data vector
  • \bar{x} represents mean of x data vector
  • \bar{y} represents mean of y data vector
  • N represents total observations. R's cov() computes the sample covariance, so the denominator is N - 1 rather than N.

Syntax:

cov(x, y, method)

Where:

  • x and y represent the data vectors
  • method defines the method used to compute covariance: "pearson", "kendall" or "spearman". Default is "pearson".

Example: 

R
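# Two numeric vectors of equal length
x <- c(1, 3, 5, 10)
y <- c(2, 4, 6, 20)

# Default method is "pearson", so the first two calls are equivalent
print(cov(x, y))
print(cov(x, y, method = "pearson"))
# Rank-based alternatives
print(cov(x, y, method = "kendall"))
print(cov(x, y, method = "spearman"))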

Output: 

[1] 30.66667
[1] 30.66667
[1] 12
[1] 1.666667
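The Pearson result can be reproduced by hand, which also makes the denominator used by cov() explicit. The following is a minimal sketch using the same vectors; the names cross_products, sample_cov and population_cov are illustrative and not part of R.

```r
x <- c(1, 3, 5, 10)
y <- c(2, 4, 6, 20)
n <- length(x)

# Products of deviations from the respective means
cross_products <- (x - mean(x)) * (y - mean(y))

# Sample covariance (divide by n - 1): this is what cov() returns
sample_cov <- sum(cross_products) / (n - 1)

# Population covariance (divide by n), shown only for comparison
population_cov <- sum(cross_products) / n

print(sample_cov)
print(cov(x, y))
print(population_cov)

# The "spearman" method is the Pearson covariance of the ranks
print(cov(rank(x), rank(y)))
```

Here the sum of cross products is 92, so dividing by n - 1 = 3 gives the 30.66667 printed above, while dividing by n = 4 would give 23.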

Correlation in R Programming Language

The cor() function in R computes the correlation coefficient. Correlation normalizes the covariance by the spread of each vector, so it measures how strongly, and in which direction, the two vectors are linearly related.

Mathematically,

\operatorname{Corr}(x, y)=\frac{\sum\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)}{\sqrt{\sum\left(x_{i}-\bar{x}\right)^{2} \sum\left(y_{i}-\bar{y}\right)^{2}}} 

Where:

  • x represents the x data vector
  • y represents the y data vector
  • \bar{x} represents mean of x data vector
  • \bar{y} represents mean of y data vector

Syntax:

cor(x, y, method)

Where:

  • x and y represent the data vectors
  • method defines the method used to compute correlation: "pearson", "kendall" or "spearman". Default is "pearson".

Example: 

R
x <- c(1, 3, 5, 10)
y <- c(2, 4, 6, 20)

# Default method is "pearson", so the first two calls are equivalent
print(cor(x, y))
print(cor(x, y, method = "pearson"))
# Rank-based alternatives
print(cor(x, y, method = "kendall"))
print(cor(x, y, method = "spearman"))

Output: 

[1] 0.9724702
[1] 0.9724702
[1] 1
[1] 1
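The link between the two measures can be checked directly: the Pearson correlation is the covariance divided by the product of the two standard deviations. The sketch below uses the same vectors; manual_cor and spearman_manual are illustrative names. It also shows why the rank-based coefficients equal 1 here: y increases whenever x increases, so the ranks of the two vectors agree exactly.

```r
x <- c(1, 3, 5, 10)
y <- c(2, 4, 6, 20)

# Pearson correlation as standardized covariance
manual_cor <- cov(x, y) / (sd(x) * sd(y))
print(manual_cor)
print(cor(x, y))

# Spearman correlation is the Pearson correlation of the ranks
print(rank(x))
print(rank(y))
spearman_manual <- cor(rank(x), rank(y))
print(spearman_manual)
```

The Pearson value stays below 1 because the relationship is monotonic but not perfectly linear.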

Covariance and Correlation for a Data Frame

When cor() and cov() are given a data frame of numeric columns, they return a matrix with the correlation or covariance of every pair of columns.

R
data(iris)
library(dplyr)

# Drop the non-numeric Species column
data <- select(iris, -Species)

# Pairwise correlation and covariance matrices
cor(data)
cov(data)

Output:

[Image: correlation and covariance matrices of the numeric iris columns]
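If dplyr is not available, the same matrices can be obtained with base R alone. This is a sketch of one possible approach, not the only one; the names numeric_cols, cor_matrix and cov_matrix are illustrative.

```r
data(iris)

# Keep only the numeric columns using base R
numeric_cols <- iris[, sapply(iris, is.numeric)]

cor_matrix <- cor(numeric_cols)
cov_matrix <- cov(numeric_cols)

# Rounded for readability
print(round(cor_matrix, 3))
print(round(cov_matrix, 3))
```

Both results are symmetric 4 x 4 matrices, and the diagonal of the correlation matrix is 1 because each column is perfectly correlated with itself.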

Conversion of Covariance to Correlation in R

The cov2cor() function in R converts a covariance matrix into the corresponding correlation matrix.

Syntax:

cov2cor(X)

Where:

  • X represents a square covariance matrix

Example: 

R
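# Fix the seed so the random draws are reproducible
set.seed(1)
# Ten observations per variable; with only two, the correlation is always +1 or -1
x <- rnorm(10)
y <- rnorm(10)

mat <- cbind(x, y)
X <- cov(mat)

print(X)                # covariance matrix
print(cor(mat))         # correlation computed directly from the data
print(cov2cor(X))       # correlation derived from the covariance matrix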

Output:

[Image: the covariance matrix, followed by two identical correlation matrices]
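The conversion amounts to dividing each covariance by the product of the two corresponding standard deviations, which are the square roots of the diagonal entries. The sketch below checks this under assumed inputs: the seed, the sample size of 20, and the names std_devs and manual_cor are arbitrary choices for illustration.

```r
set.seed(42)
mat <- cbind(x = rnorm(20), y = rnorm(20))

cov_matrix <- cov(mat)

# Standard deviations are the square roots of the diagonal (the variances)
std_devs <- sqrt(diag(cov_matrix))

# Divide entry (i, j) by sd_i * sd_j
manual_cor <- cov_matrix / outer(std_devs, std_devs)

# Both comparisons should report TRUE
print(all.equal(manual_cor, cor(mat)))
print(all.equal(cov2cor(cov_matrix), cor(mat)))
```

This is useful when only a covariance matrix is available, for example from a fitted model, and the original data cannot be passed to cor().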

Correlation describes the intensity and direction of the linear link between two variables, whereas covariance shows how much two variables vary together.

