In this article, we will learn how to calculate the covariance matrix using Python. But before that, let’s try to understand what a covariance matrix is.
What is covariance?
Let’s say there are two random variables, X and Y, and we want to know how the two vary together. In other words, we want to know whether larger values of X tend to go with larger or smaller values of Y. To measure that, we use covariance.
If the covariance of two random variables, X and Y, is positive, X and Y are positively related. In other words, when the value of X increases, the value of Y tends to increase as well.
On the other hand, if the covariance of X and Y is negative, X and Y are negatively related. In other words, when the value of X increases, the value of Y tends to decrease.
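The magnitude of the covariance is harder to interpret, because it depends on the units in which X and Y are measured. A large absolute value does not by itself indicate a strong relationship, and a small one does not by itself indicate a weak relationship. To compare the strength of relationships, we divide the covariance by the standard deviations of X and Y, which gives the correlation coefficient.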
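To illustrate how the sign of the covariance behaves, here is a minimal sketch using NumPy’s `np.cov`. The data values and the variable names (`y_up`, `y_down`) are made up for illustration. Note that `np.cov` uses the (n – 1) denominator by default, which does not affect the sign.

```python
import numpy as np

# Hypothetical data, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_up = np.array([2.0, 4.0, 5.0, 4.0, 6.0])    # tends to rise as x rises
y_down = np.array([6.0, 4.0, 5.0, 4.0, 2.0])  # tends to fall as x rises

# np.cov returns a 2x2 matrix; the off-diagonal entry is cov(X, Y).
cov_up = np.cov(x, y_up)[0, 1]
cov_down = np.cov(x, y_down)[0, 1]
print("cov(x, y_up)   =", cov_up)    # positive
print("cov(x, y_down) =", cov_down)  # negative

# Changing the units of x (multiplying by 100) rescales the covariance,
# but leaves the correlation coefficient unchanged.
cov_scaled = np.cov(100 * x, y_up)[0, 1]
corr = np.corrcoef(x, y_up)[0, 1]
corr_scaled = np.corrcoef(100 * x, y_up)[0, 1]
print("cov after rescaling x  =", cov_scaled)
print("corr before and after  =", corr, corr_scaled)
```

The rescaling step shows why the raw size of a covariance should be read together with the units of the data.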
How do we calculate the covariance between two random variables, X and Y?
Now, let’s say the random variable X takes the values [x1, x2, x3, … xn] and the random variable Y takes the values [y1, y2, y3, … yn], observed in pairs. Let’s also say x̄ is the mean value of X and ȳ is the mean value of Y. So,

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$
The covariance of X and Y, in that case, will be

$$\operatorname{cov}(X, Y) = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$
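The calculation described above, with n as the denominator, can be sketched in plain Python as follows. The function name `population_covariance` and the sample values are hypothetical, chosen only to make the steps concrete.

```python
def population_covariance(xs, ys):
    """Covariance of paired values, using n as the denominator."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    x_bar = sum(xs) / n  # mean of X
    y_bar = sum(ys) / n  # mean of Y
    # Sum the products of the paired deviations from the means, then divide by n.
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n


# Hypothetical paired observations.
xs = [2.0, 4.0, 6.0, 8.0]
ys = [1.0, 3.0, 5.0, 11.0]

result = population_covariance(xs, ys)
print(result)  # 8.0
```

Here both means are 5, the products of the deviations are 12, 2, 0 and 18, and their sum 32 divided by n = 4 gives 8.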
Please note that in inferential statistics, we often calculate the covariance of a sample and use the result to estimate the covariance of the population. In that case, when we calculate the sample covariance, we use (n – 1) as the denominator instead of n. So, for sample covariance, …
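To make the difference between the n and (n – 1) denominators concrete, here is a minimal sketch with NumPy. The data values are made up for illustration. It relies on `np.cov`, which returns the full covariance matrix and uses the (n – 1) denominator by default; passing `ddof=0` switches to n.

```python
import numpy as np

# Hypothetical paired observations.
xs = np.array([2.0, 4.0, 6.0, 8.0])
ys = np.array([1.0, 3.0, 5.0, 11.0])
n = len(xs)

# Sum of the products of the paired deviations from the means.
sum_of_products = np.sum((xs - xs.mean()) * (ys - ys.mean()))

population_cov = sum_of_products / n    # denominator n
sample_cov = sum_of_products / (n - 1)  # denominator n - 1

# np.cov returns a 2x2 matrix:
# [[var(X),    cov(X, Y)],
#  [cov(Y, X), var(Y)   ]]
cov_matrix_sample = np.cov(xs, ys)              # default: n - 1
cov_matrix_population = np.cov(xs, ys, ddof=0)  # ddof=0: n

print("population covariance:", population_cov)
print("sample covariance:    ", sample_cov)
print("np.cov (sample):\n", cov_matrix_sample)
print("np.cov (population):\n", cov_matrix_population)
```

The off-diagonal entries of each matrix match the corresponding hand-computed value, and the sample covariance is always larger in magnitude than the population covariance by a factor of n / (n – 1).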





