Aim
To implement covariance between two variables X and Y using np.cov() with ddof=1 (sample covariance), display the full 2 × 2 covariance matrix, and extract the covariance value from its off-diagonal element.
CO Mapping: CO1, CO2, CO5
Theory
Covariance measures how two variables move together. For samples it is defined as
cov(X, Y) = Σ (xᵢ − x̄)(yᵢ − ȳ) / (n − 1)
Each term pairs X's deviation from its mean with Y's deviation from its mean. When both deviations share a sign (both above or both below their means) the product is positive; opposite signs give negative products. So the sign of covariance is fully interpretable: positive → the variables rise together, negative → one rises as the other falls, near zero → no linear co-movement.
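The definition above can be traced step by step in NumPy. This is a minimal sketch, assuming the six-pair dataset tabulated in the Dataset section; the names dx, dy, products and cov_manual are illustrative and not part of the practical's own program.

```python
import numpy as np

# Dataset from this practical
x = np.array([12, 15, 18, 21, 24, 27], dtype=float)
y = np.array([22, 25, 28, 31, 35, 36], dtype=float)

# Deviation of each observation from its own sample mean
dx = x - x.mean()
dy = y - y.mean()

# One product per observation: positive when both deviations share a sign
products = dx * dy

# Sample covariance: sum of the products divided by n - 1
n = len(x)
cov_manual = products.sum() / (n - 1)

print("deviation products:", products)
print("manual covariance :", round(cov_manual, 4))
print("np.cov value      :", round(np.cov(x, y, ddof=1)[0, 1], 4))
```

Every product is positive for this data, which is why the covariance is positive, and the manual value agrees with the off-diagonal entry returned by np.cov.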
The magnitude, however, is nearly uninterpretable, and that is the key limitation this practical exposes. Covariance carries the product of the units of X and Y (cm × kg, marks²…), so its size changes if you merely rescale a variable — measure X in metres instead of centimetres and the covariance shrinks 100-fold while the relationship itself is unchanged. There is no fixed "large" or "small". This is exactly the defect the Pearson correlation of Practical 12 repairs: dividing by σ_X σ_Y cancels the units and pins the result into [−1, 1].
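The unit dependence is easy to check numerically. The sketch below assumes, purely for illustration, that X was recorded in centimetres (the practical does not state units) and converts it to metres; x_cm and x_m are hypothetical names.

```python
import numpy as np

# Assumption for illustration only: treat X as a length in centimetres
x_cm = np.array([12, 15, 18, 21, 24, 27], dtype=float)
y = np.array([22, 25, 28, 31, 35, 36], dtype=float)

# Same measurements expressed in metres
x_m = x_cm / 100.0

# Covariance depends on the units of X
cov_cm = np.cov(x_cm, y, ddof=1)[0, 1]
cov_m = np.cov(x_m, y, ddof=1)[0, 1]

# Pearson correlation is unit-free
r_cm = np.corrcoef(x_cm, y)[0, 1]
r_m = np.corrcoef(x_m, y)[0, 1]

print("covariance ratio (cm / m):", round(cov_cm / cov_m, 4))
print("correlation unchanged    :", bool(np.isclose(r_cm, r_m)))
```

The covariance changes by the rescaling factor of 100 while the correlation stays the same, which is the contrast the paragraph describes.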
The divisor n − 1 (ddof=1) is Bessel's correction: sample deviations are measured from the sample mean, which is itself fitted to the data, consuming one degree of freedom. Dividing by n instead shrinks the estimate by the factor (n − 1)/n, so it systematically understates the magnitude of the population covariance. Finally, np.cov(x, y) returns a 2 × 2 matrix, not a single number: the diagonal holds Var(X) and Var(Y) (a variable's covariance with itself is its variance), and the two off-diagonal cells both hold cov(X, Y), so the matrix is symmetric.
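Both points in this paragraph, the effect of the divisor and the layout of the matrix, can be verified directly. A minimal sketch on this practical's data; cov_sample and cov_population are illustrative names.

```python
import numpy as np

x = np.array([12, 15, 18, 21, 24, 27], dtype=float)
y = np.array([22, 25, 28, 31, 35, 36], dtype=float)
n = len(x)

# Same data, two divisors: n - 1 (sample) and n (population)
cov_sample = np.cov(x, y, ddof=1)
cov_population = np.cov(x, y, ddof=0)

print("cov with ddof=1:", round(cov_sample[0, 1], 4))
print("cov with ddof=0:", round(cov_population[0, 1], 4))

# The ddof=0 matrix is the ddof=1 matrix scaled by (n - 1) / n
print("scaled by (n-1)/n:", np.allclose(cov_population, cov_sample * (n - 1) / n))

# Diagonal entries are the sample variances of X and Y
print("diag is variance :", np.allclose(np.diag(cov_sample),
                                         [np.var(x, ddof=1), np.var(y, ddof=1)]))

# Off-diagonal entries are equal, so the matrix is symmetric
print("symmetric        :", np.allclose(cov_sample, cov_sample.T))
```

With n = 6 the two divisors differ by a factor of 5/6, a gap that narrows as the sample grows.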
Dataset
| Index | X | Y |
|---|---|---|
| 0 | 12 | 22 |
| 1 | 15 | 25 |
| 2 | 18 | 28 |
| 3 | 21 | 31 |
| 4 | 24 | 35 |
| 5 | 27 | 36 |
Means: x̄ = 117 / 6 = 19.5, ȳ = 177 / 6 = 29.5.
Procedure
- Define x and y as float NumPy arrays of 6 values each and wrap them in the DataFrame df for a clean printout.
- Print the data table.
- Compute cov_matrix = np.cov(df["X"], df["Y"], ddof=1), a 2 × 2 sample covariance matrix.
- Print the matrix and identify its parts: [0, 0] is Var(X), [1, 1] is Var(Y), and [0, 1] = [1, 0] is cov(X, Y).
- Extract cov_xy = cov_matrix[0, 1] and print it rounded to 4 decimals.
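The procedure above can be sketched as a short program. It follows the steps and variable names given in the list (x, y, df, cov_matrix, cov_xy); the print labels are illustrative, and the exact layout of the practical's original program is not shown in this write-up, so treat this as one plausible version.

```python
import numpy as np
import pandas as pd

# Step 1: define the data as float arrays and wrap them in a DataFrame
x = np.array([12, 15, 18, 21, 24, 27], dtype=float)
y = np.array([22, 25, 28, 31, 35, 36], dtype=float)
df = pd.DataFrame({"X": x, "Y": y})

# Step 2: print the data table
print(df)

# Step 3: 2 x 2 sample covariance matrix (divisor n - 1)
cov_matrix = np.cov(df["X"], df["Y"], ddof=1)

# Step 4: print the matrix and label its parts
print("Covariance matrix:")
print(cov_matrix)
print("Var(X)  = cov_matrix[0, 0] =", round(cov_matrix[0, 0], 4))
print("Var(Y)  = cov_matrix[1, 1] =", round(cov_matrix[1, 1], 4))

# Step 5: extract the off-diagonal element, which is cov(X, Y)
cov_xy = cov_matrix[0, 1]
print("cov(X, Y) = cov_matrix[0, 1] =", round(cov_xy, 4))
```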
Interpretation of Results
Working the formula by hand: the X deviations from 19.5 are (−7.5, −4.5, −1.5, 1.5, 4.5, 7.5) and the Y deviations from 29.5 are (−7.5, −4.5, −1.5, 1.5, 5.5, 6.5). Every pair shares its sign, so all six products are positive: 56.25 + 20.25 + 2.25 + 2.25 + 24.75 + 48.75 = 154.5, and cov(X, Y) = 154.5 / 5 = 30.9 — matching the program's output. The full printed matrix is [[31.5, 30.9], [30.9, 30.7]]: Var(X) = 31.5 and Var(Y) = 30.7 on the diagonal. The positive 30.9 confirms X and Y climb together — visible in the raw data, where Y increases in near-lockstep as X steps up by 3. But is 30.9 "strong"? Covariance alone cannot say; normalising gives r = 30.9 / √(31.5 × 30.7) ≈ 0.9937, revealing an almost perfectly linear relationship. The pair of numbers — cov = 30.9, r = 0.99 — is the whole lesson: covariance detects the direction, correlation quantifies the strength.
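The normalisation step in this paragraph can be cross-checked against NumPy's own correlation routine. A minimal sketch; r_from_cov and r_numpy are illustrative names, and np.corrcoef is used here only as an independent check.

```python
import numpy as np

x = np.array([12, 15, 18, 21, 24, 27], dtype=float)
y = np.array([22, 25, 28, 31, 35, 36], dtype=float)

cov_matrix = np.cov(x, y, ddof=1)
var_x = cov_matrix[0, 0]
var_y = cov_matrix[1, 1]
cov_xy = cov_matrix[0, 1]

# Divide the covariance by the product of the standard deviations
r_from_cov = cov_xy / np.sqrt(var_x * var_y)

# Independent check with NumPy's Pearson correlation
r_numpy = np.corrcoef(x, y)[0, 1]

print("r from covariance matrix:", round(r_from_cov, 4))
print("r from np.corrcoef      :", round(r_numpy, 4))
```

The two routes agree because the divisor (n − 1) appears in the covariance and in both variances, so it cancels in the ratio.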
Common Mistakes
- Reporting the whole matrix (or the diagonal variance 31.5) as "the covariance": cov(X, Y) is specifically the off-diagonal element cov_matrix[0, 1].
- Judging relationship strength from covariance magnitude: 30.9 is unit-dependent and would change under any rescaling; use correlation for strength.
- Using ddof=0 (population divisor n) for sample data: it biases the estimate low; sample statistics need Bessel's n − 1.
🎯 Viva Questions
- What does the sign of covariance tell you? Positive: variables move together; negative: they move oppositely; near zero: no linear co-movement.
- Why is covariance magnitude hard to interpret? It carries the product of the two variables' units and changes under rescaling, so there is no universal "large" value.
- What lies on the diagonal of np.cov(x, y)? The variances: Var(X) = 31.5 and Var(Y) = 30.7 here; a variable's covariance with itself is its variance.
- What is Bessel's correction and why divide by n − 1? Deviations are taken from the sample mean, which uses up one degree of freedom; dividing by n − 1 removes the resulting downward bias.
- How do you get correlation from this matrix? r = cov(X, Y) / √(Var(X) · Var(Y)) = 30.9 / √(31.5 × 30.7) ≈ 0.9937.
- Why is the covariance matrix symmetric? Because cov(X, Y) = cov(Y, X) — the deviation products are the same regardless of order.