Matrix Operations
Matrix operations are the computational backbone of Machine Learning. Understanding how matrices are added, multiplied, and decomposed is essential for grasping how algorithms work internally.
---
1. Matrix Addition & Subtraction
Rule: Add (or subtract) corresponding elements. Both matrices must have the same dimensions.
A + B = [aᵢⱼ + bᵢⱼ]
Example: [[1,2],[3,4]] + [[5,6],[7,8]] = [[6,8],[10,12]]
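The rule can be sketched in plain Python, assuming matrices are stored as lists of rows. The helper name `mat_add` is made up for this illustration and does not come from any library.

```python
def mat_add(A, B):
    """Element-wise sum of two matrices given as lists of rows."""
    # Both matrices must have the same dimensions.
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same dimensions")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


total = mat_add([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(total)  # [[6, 8], [10, 12]]
```

Subtraction follows the same pattern with `a - b` in place of `a + b`.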
---
2. Scalar Multiplication
Rule: Multiply every element of the matrix by the scalar.
c × A = [c × aᵢⱼ]
Example: 3 × [[1,2],[3,4]] = [[3,6],[9,12]]
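As a small sketch of the same rule in plain Python (the helper name `scalar_mul` is illustrative only):

```python
def scalar_mul(c, A):
    """Multiply every element of matrix A (a list of rows) by the scalar c."""
    return [[c * a for a in row] for row in A]


scaled = scalar_mul(3, [[1, 2], [3, 4]])
print(scaled)  # [[3, 6], [9, 12]]
```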
---
3. Matrix Multiplication
Rule: The number of columns in the first matrix must equal the number of rows in the second matrix. If A is m×n and B is n×p, the result C is m×p.
Cᵢⱼ = Σₖ Aᵢₖ × Bₖⱼ (dot product of row i of A and column j of B)
Key Properties:
- Not Commutative: A × B ≠ B × A (in general).
- Associative: (A × B) × C = A × (B × C).
- Distributive: A × (B + C) = A×B + A×C.
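The formula and the three properties can be checked on small integer matrices. This is a minimal sketch in plain Python; the helper `matmul` and the example matrices are assumptions chosen for illustration.

```python
def matmul(A, B):
    """Product of A (m×n) and B (n×p), both given as lists of rows."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    p = len(B[0])
    # C[i][j] is the dot product of row i of A and column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]

AB = matmul(A, B)
BA = matmul(B, A)
print(AB)  # [[2, 1], [4, 3]]
print(BA)  # [[3, 4], [1, 2]]  (differs from AB: not commutative)

# Associative: (A × B) × C equals A × (B × C).
associative = matmul(AB, C) == matmul(A, matmul(B, C))

# Distributive: A × (B + C) equals A×B + A×C.
B_plus_C = [[b + c for b, c in zip(rb, rc)] for rb, rc in zip(B, C)]
AB_plus_AC = [[x + y for x, y in zip(r1, r2)]
              for r1, r2 in zip(AB, matmul(A, C))]
distributive = matmul(A, B_plus_C) == AB_plus_AC

print(associative, distributive)  # True True
```

Integer entries keep the equality checks exact; with floating-point entries a tolerance-based comparison would be more appropriate.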
Matrix Multiplication Dimension Guide:
| Matrix A | Matrix B | Result | Valid? |
|---|---|---|---|
| 2×3 | 3×4 | 2×4 | ✅ Yes |
| 3×2 | 3×4 | undefined | ❌ No (2 ≠ 3) |
| 4×4 | 4×1 | 4×1 | ✅ Yes |
| 1×5 | 5×1 | 1×1 (scalar) | ✅ Yes |
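The dimension rule in the table can be expressed as a short shape check. The function name `product_shape` is hypothetical and shapes are assumed to be `(rows, columns)` tuples.

```python
def product_shape(shape_a, shape_b):
    """Return the shape of A × B, or None if the product is undefined."""
    (m, n), (rows_b, p) = shape_a, shape_b
    # Inner dimensions must match: columns of A == rows of B.
    return (m, p) if n == rows_b else None


cases = [((2, 3), (3, 4)), ((3, 2), (3, 4)), ((4, 4), (4, 1)), ((1, 5), (5, 1))]
results = [product_shape(a, b) for a, b in cases]
print(results)  # [(2, 4), None, (4, 1), (1, 1)]
```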
Use in Data Science:
- Neural Networks are essentially chains of matrix multiplications: Output = Activation(W × X + b)
- In Linear Regression: ŷ = X × β (prediction = data matrix × coefficient vector)
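Both formulas reduce to a matrix-vector product. The sketch below assumes a single input vector, ReLU as the activation, and made-up weights, bias, and coefficients; none of these values come from a real model.

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def relu(z):
    """ReLU activation, used here only as an example activation."""
    return [max(0.0, t) for t in z]


# One neural-network layer: Output = Activation(W × X + b)
W = [[0.5, -1.0, 2.0],
     [1.0, 0.0, -0.5]]   # 2×3 weight matrix (illustrative values)
b = [0.1, -0.2]          # bias vector
x = [1.0, 2.0, 3.0]      # input vector

z = [wx + bi for wx, bi in zip(matvec(W, x), b)]
output = relu(z)
print(output)  # roughly [4.6, 0.0]

# Linear regression prediction: ŷ = X × β
X = [[1.0, 2.0],
     [1.0, 3.0]]         # first column of ones models the intercept
beta = [1.0, 2.0]        # illustrative coefficients
y_hat = matvec(X, beta)
print(y_hat)  # [5.0, 7.0]
```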
---
4. Determinant of a Matrix
Definition: The determinant is a scalar value that can be computed from a square matrix. It indicates whether the matrix is invertible and by how much the corresponding linear transformation scales area or volume.
For a 2×2 matrix: det([[a,b],[c,d]]) = ad - bc
Key Facts:
- If det(A) = 0, the matrix is singular (non-invertible): it "collapses" space.
- If det(A) ≠ 0, the matrix is invertible.
- The determinant tells you the "scaling factor" of the linear transformation.
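A minimal sketch of the 2×2 formula and the singularity test, using illustrative helper names (`det2`, `is_invertible`) and example matrices:

```python
def det2(M):
    """Determinant of a 2×2 matrix [[a, b], [c, d]]: ad - bc."""
    (a, b), (c, d) = M
    return a * d - b * c


def is_invertible(M):
    """A matrix is invertible exactly when its determinant is non-zero."""
    return det2(M) != 0


regular = [[1, 2], [3, 4]]    # det = 1*4 - 2*3 = -2
singular = [[1, 2], [2, 4]]   # second row is twice the first, so det = 0

print(det2(regular), is_invertible(regular))    # -2 True
print(det2(singular), is_invertible(singular))  # 0 False

# |det| is the area scaling factor: the unit square maps to a
# parallelogram of area 2 under `regular`.
area_scale = abs(det2(regular))
```

The exact comparison with zero is safe here because the entries are integers; for floating-point matrices a small tolerance would normally be used instead.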
---
5. Inverse of a Matrix
Definition: The inverse of a matrix A (written A⁻¹) is the matrix such that A × A⁻¹ = I (Identity Matrix).
Conditions:
- Only square matrices can have inverses.
- The matrix must be non-singular (det(A) ≠ 0).
For a 2×2 matrix [[a,b],[c,d]]: A⁻¹ = (1/det(A)) × [[d, -b], [-c, a]]
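The 2×2 formula can be tried out directly. This sketch assumes integer entries and uses `fractions.Fraction` from the standard library so that the check A × A⁻¹ = I is exact; `inverse2` and `matmul` are illustrative helper names.

```python
from fractions import Fraction


def inverse2(M):
    """Inverse of a 2×2 matrix [[a, b], [c, d]] via (1/det) × [[d, -b], [-c, a]]."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular and has no inverse")
    f = Fraction(1) / det
    return [[f * d, -f * b], [-f * c, f * a]]


def matmul(A, B):
    """Matrix product for lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


A = [[4, 7], [2, 6]]              # det = 4*6 - 7*2 = 10
A_inv = inverse2(A)
print([[str(x) for x in row] for row in A_inv])
# [['3/5', '-7/10'], ['-1/5', '2/5']]

identity = matmul(A, A_inv)
print(identity == [[1, 0], [0, 1]])  # True
```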
Use in Data Science:
- Solving systems of linear equations: x = A⁻¹ × b
- The Normal Equation in Linear Regression: β = (XᵀX)⁻¹ × Xᵀy
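Both uses can be sketched on tiny examples. The code below assumes a 2×2 system and a one-feature regression with an intercept column, so that XᵀX is 2×2 and the closed-form inverse applies. The data points are made up and lie exactly on y = 1 + 2x, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse2(M):
    """Closed-form inverse of a 2×2 matrix (raises if singular)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    f = Fraction(1) / det
    return [[f * d, -f * b], [-f * c, f * a]]


# Solving A x = b as x = A⁻¹ × b
A = [[2, 1], [1, 3]]
b = [[3], [5]]
x = matmul(inverse2(A), b)
print([str(v[0]) for v in x])  # ['4/5', '7/5']

# Normal Equation: β = (XᵀX)⁻¹ × Xᵀy
X = [[1, 1], [1, 2], [1, 3]]   # column of ones + one feature
y = [[3], [5], [7]]            # generated from y = 1 + 2x
Xt = transpose(X)
beta = matmul(matmul(inverse2(matmul(Xt, X)), Xt), y)
print([str(v[0]) for v in beta])  # ['1', '2']  (intercept 1, slope 2)
```

For larger or ill-conditioned problems, numerical libraries typically solve the linear system directly instead of forming the inverse explicitly; the explicit inverse is used here only because it mirrors the formulas above.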
---
Eigenvalues & Eigenvectors
This is one of the most powerful concepts in Linear Algebra for Data Science.
Definition: For a square matrix A, a non-zero vector v is an eigenvector if multiplying it by A only scales it, leaving it on the same line through the origin (a negative scaling factor reverses it):
A × v = λ × v
Where:
- v is the eigenvector (the direction that doesn't change).
- λ (lambda) is the eigenvalue (the scaling factor).
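The defining equation can be checked numerically for a candidate pair. This sketch assumes integer entries so that the equality test is exact; the matrix and the helper `is_eigenpair` are chosen for illustration.

```python
def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def is_eigenpair(A, v, lam):
    """Check A × v == λ × v for a non-zero vector v."""
    if all(x == 0 for x in v):
        return False  # the zero vector is excluded by definition
    return matvec(A, v) == [lam * x for x in v]


A = [[2, 1], [1, 2]]

print(matvec(A, [1, 1]))             # [3, 3]  = 3 × [1, 1]
print(is_eigenpair(A, [1, 1], 3))    # True
print(is_eigenpair(A, [1, -1], 1))   # True
print(matvec(A, [1, 0]))             # [2, 1]  not a multiple of [1, 0]
```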
Intuition:
Imagine stretching a rubber sheet. Most points move in complex ways. But some arrows (eigenvectors) only get longer or shorter — they maintain their direction. The factor by which they stretch is the eigenvalue.
---
How to Find Eigenvalues
- Start with the equation: A × v = λ × v
- Rearrange: (A - λI) × v = 0
- For non-trivial solutions: det(A - λI) = 0 (the Characteristic Equation)
- Solve for λ → these are the eigenvalues.
- Substitute each λ back to find the corresponding eigenvector v.
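For a 2×2 matrix [[a,b],[c,d]] the characteristic equation expands to λ² - (a + d)λ + (ad - bc) = 0, so the steps above can be sketched with the quadratic formula. The function `eigen_2x2` is a hypothetical helper that assumes real eigenvalues; it is not a general-purpose solver.

```python
import math


def eigen_2x2(A):
    """Eigenvalues and eigenvectors of a real 2×2 matrix with real eigenvalues."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    # det(A - λI) = λ² - trace·λ + det = 0
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("complex eigenvalues are not handled in this sketch")
    root = math.sqrt(disc)
    pairs = []
    for lam in ((trace + root) / 2, (trace - root) / 2):
        # Substitute λ back: find a non-zero v with (A - λI) × v = 0.
        if b != 0:
            v = [b, lam - a]
        elif c != 0:
            v = [lam - d, c]
        else:
            # Diagonal matrix: standard basis vectors (assumes a != d).
            v = [1, 0] if lam == a else [0, 1]
        pairs.append((lam, v))
    return pairs


pairs = eigen_2x2([[2, 1], [1, 2]])
print(pairs)  # [(3.0, [1, 1.0]), (1.0, [1, -1.0])]
```

Eigenvectors are only determined up to a non-zero scale factor, so any multiple of the returned vectors is equally valid.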
---
Eigenvalues & Eigenvectors in Data Science
| Application | How Eigen Concepts are Used |
|---|---|
| PCA (Principal Component Analysis) | Eigenvectors of the covariance matrix give the principal components (directions of maximum variance) |
| Google PageRank | The PageRank vector is the dominant eigenvector of the web link matrix |
| Spectral Clustering | Uses eigenvectors of the graph Laplacian matrix to identify clusters |
| Stability Analysis | Eigenvalues determine if a system is stable or unstable |
| Matrix Decomposition | SVD (Singular Value Decomposition) is closely related: the singular values of A are the square roots of the eigenvalues of AᵀA |
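To illustrate the idea of a dominant eigenvector from the PageRank row, here is a sketch of power iteration on a made-up three-page web. It assumes a column-stochastic link matrix and leaves out the damping factor used in the real PageRank algorithm, so it is a simplification for illustration only.

```python
def power_iteration(M, steps=100):
    """Approximate the dominant eigenvector of M by repeated multiplication."""
    n = len(M)
    r = [1.0 / n] * n                      # start from a uniform vector
    for _ in range(steps):
        r = [sum(M[i][j] * r[j] for j in range(n)) for i in range(n)]
        total = sum(r)
        r = [x / total for x in r]         # renormalise so entries sum to 1
    return r


# Hypothetical links: page 0 -> 1 and 2, page 1 -> 2, page 2 -> 0.
# M[i][j] is the probability of moving from page j to page i.
M = [[0.0, 0.0, 1.0],
     [0.5, 0.0, 0.0],
     [0.5, 1.0, 0.0]]

rank = power_iteration(M)
print([round(x, 3) for x in rank])  # [0.4, 0.2, 0.4]
```

The resulting vector satisfies M × r = r up to rounding, which means it is an eigenvector with eigenvalue 1.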
Why PCA Uses Eigenvalues:
PCA is one of the most important techniques for dimensionality reduction. It works by:
- Computing the covariance matrix of the data.
- Finding the eigenvectors of this covariance matrix — these are the "principal components".
- The eigenvalues tell you how much variance each principal component captures.
- You keep only the top-k eigenvectors (with the largest eigenvalues) to reduce dimensionality while retaining maximum information.
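The four steps can be traced on a tiny 2-D dataset. This sketch uses made-up points, the sample covariance (dividing by n - 1), and the 2×2 characteristic equation to get the eigenvalues. It assumes the off-diagonal covariance is non-zero and is not meant as a general PCA implementation.

```python
import math

data = [(7.0, 5.0), (3.0, 1.0), (6.0, 2.0), (4.0, 4.0)]  # illustrative points
n = len(data)

# Step 1: centre the data and compute the 2×2 covariance matrix.
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n
centered = [(x - mean_x, y - mean_y) for x, y in data]
sxx = sum(x * x for x, _ in centered) / (n - 1)
syy = sum(y * y for _, y in centered) / (n - 1)
sxy = sum(x * y for x, y in centered) / (n - 1)

# Step 2: eigenvalues of [[sxx, sxy], [sxy, syy]] from λ² - trace·λ + det = 0.
trace = sxx + syy
det = sxx * syy - sxy * sxy
root = math.sqrt(trace * trace - 4 * det)
lam1 = (trace + root) / 2   # largest eigenvalue
lam2 = (trace - root) / 2

# Eigenvector for lam1 (valid because sxy != 0), normalised to unit length.
vx, vy = sxy, lam1 - sxx
norm = math.hypot(vx, vy)
pc1 = (vx / norm, vy / norm)

# Step 3: share of the total variance captured by the first component.
explained = lam1 / (lam1 + lam2)

# Step 4: keep only the top component, reducing each point from 2-D to 1-D.
projected = [x * pc1[0] + y * pc1[1] for x, y in centered]

print(round(lam1, 3), round(lam2, 3))        # 5.333 1.333
print(round(pc1[0], 3), round(pc1[1], 3))    # 0.707 0.707
print(round(explained, 3))                   # 0.8
```

Here the first principal component points along the diagonal and captures 80% of the variance, so projecting onto it keeps most of the spread in the data.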
Summary
- Matrix addition requires same dimensions; multiplication requires inner dimensions to match.
- The determinant tells you if a matrix is invertible.
- Matrix inverses are used in Linear Regression (Normal Equation).
- Eigenvalues and eigenvectors reveal the fundamental behavior of linear transformations.
- PCA, Google PageRank, and spectral clustering all rely on eigendecomposition.