Linear Algebra: Vectors & Matrices
Linear Algebra is the branch of mathematics dealing with vectors, matrices, and linear transformations. It is arguably the most important mathematical discipline for Data Science and Machine Learning, because datasets are fundamentally represented as matrices, and most ML algorithms operate on these matrix representations.
---
Why Linear Algebra for Data Science?
- Every tabular dataset (such as a spreadsheet) can be treated as a matrix: rows are samples, columns are features.
- Images are represented as matrices of pixel values.
- Neural networks perform millions of matrix multiplications.
- Dimensionality reduction techniques such as PCA rely on eigenvalues and eigenvectors.
- Recommendation systems use matrix factorization.
---
Scalars, Vectors, and Matrices
Scalars
- A single number.
- Example: x = 5, temperature = 36.7
- Represented by lowercase letters: a, b, x, y
Vectors
Definition: A vector is an ordered list (array) of numbers. It represents a point or a direction in space.
- Row Vector: v = [1, 2, 3] (written horizontally).
- Column Vector: The same elements written vertically.
- Dimension: A vector with n elements is called an n-dimensional vector.
Real-World Analogy: A student's exam scores across 5 subjects can be represented as a vector: scores = [85, 92, 78, 96, 88]
Types of Vectors:
| Type | Description | Example |
|---|---|---|
| Zero Vector | All elements are zero | [0, 0, 0] |
| Unit Vector | Has a magnitude of 1 | [1, 0, 0] in 3D space |
| Sparse Vector | Most elements are zero | [0, 0, 5, 0, 0, 3, 0] (common in NLP) |
Vector Operations:
| Operation | Formula | Example |
|---|---|---|
| Addition | [a₁+b₁, a₂+b₂] | [1,2] + [3,4] = [4,6] |
| Scalar Multiplication | c × [a₁, a₂] | 3 × [1,2] = [3,6] |
| Dot Product | Σ aᵢ × bᵢ | [1,2] · [3,4] = 3+8 = 11 |
| Magnitude (Norm) | ‖v‖ = √(Σ vᵢ²) | ‖[3,4]‖ = √(9+16) = 5 |
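The four operations in the table can be sketched in plain Python to make the arithmetic concrete. The function names (`add`, `scale`, `dot`, `norm`) are chosen here for illustration only; in practice a numerical library such as NumPy would normally be used instead.

```python
import math

def add(a, b):
    """Element-wise sum of two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return [x + y for x, y in zip(a, b)]

def scale(c, a):
    """Multiply every element of vector a by the scalar c."""
    return [c * x for x in a]

def dot(a, b):
    """Sum of the element-wise products of a and b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    """Euclidean magnitude: the square root of v · v."""
    return math.sqrt(dot(v, v))

total = add([1, 2], [3, 4])      # [4, 6]
tripled = scale(3, [1, 2])       # [3, 6]
product = dot([1, 2], [3, 4])    # 11
length = norm([3, 4])            # 5.0
```

Note that the norm is defined through the dot product, which is why the two operations appear together so often.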
Dot Product — Why It Matters: The dot product is one of the most fundamental operations in ML:
- Similarity Measurement: Cosine similarity uses dot products to measure how similar two documents or items are.
- Neural Networks: Each neuron computes the dot product of its weight vector and its input vector (usually plus a bias) before applying an activation function.
- Projections: Projecting one vector onto another uses the dot product.
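The three uses listed above all reduce to a few lines built on the same dot product. The following is a minimal sketch in plain Python; the helper names and the sample weights, inputs, and bias are made up for illustration, and zero vectors (which would cause a division by zero) are not handled.

```python
import math

def dot(a, b):
    """Sum of the element-wise products of a and b."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """cos(theta) = (a · b) / (‖a‖ ‖b‖); assumes neither vector is all zeros."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def neuron_output(weights, inputs, bias=0.0):
    """Pre-activation value of one neuron: w · x + b (no activation applied)."""
    return dot(weights, inputs) + bias

def project(a, b):
    """Projection of a onto b: ((a · b) / (b · b)) * b; assumes b is non-zero."""
    factor = dot(a, b) / dot(b, b)
    return [factor * x for x in b]

sim = cosine_similarity([1, 2], [3, 4])                 # about 0.984
z = neuron_output([0.5, -1.0], [2.0, 1.0], bias=0.25)   # 0.25
p = project([3, 4], [1, 0])                             # [3.0, 0.0]
```

A cosine similarity close to 1 means the two vectors point in nearly the same direction, regardless of their lengths.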
---
Matrices
Definition: A matrix is a rectangular array of numbers arranged in rows and columns. An m × n matrix has m rows and n columns.
Notation: A matrix is typically denoted by an uppercase bold letter like A, B, X.
Example: A 2×3 matrix: A = [[1, 2, 3], [4, 5, 6]]
Types of Matrices:
| Type | Description | Example |
|---|---|---|
| Row Matrix | Only 1 row | [1, 2, 3] (1×3) |
| Column Matrix | Only 1 column | [[1], [2], [3]] (3×1) |
| Square Matrix | Same number of rows and columns | 3×3 matrix |
| Identity Matrix (I) | Diagonal elements are 1, rest 0 | [[1,0],[0,1]] |
| Zero Matrix | All elements are 0 | [[0,0],[0,0]] |
| Diagonal Matrix | Non-zero elements only on diagonal | [[3,0],[0,7]] |
| Symmetric Matrix | A = Aᵀ (equal to its own transpose) | Covariance matrices |
| Sparse Matrix | Most elements are zero | Text data (TF-IDF) |
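Several of the types in the table can be checked programmatically. This sketch represents a matrix as a list of rows in plain Python; the helper names and the sample matrices `cov_like` and `text_like` are illustrative assumptions, not part of any library.

```python
def identity(n):
    """n × n matrix with 1 on the diagonal and 0 elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def is_square(m):
    """True when every row has as many entries as there are rows."""
    return all(len(row) == len(m) for row in m)

def is_diagonal(m):
    """True when m is square and every off-diagonal entry is zero."""
    n = len(m)
    return is_square(m) and all(
        m[i][j] == 0 for i in range(n) for j in range(n) if i != j
    )

def is_symmetric(m):
    """True when m is square and m[i][j] == m[j][i] for all i, j."""
    n = len(m)
    return is_square(m) and all(
        m[i][j] == m[j][i] for i in range(n) for j in range(n)
    )

def sparsity(m):
    """Fraction of entries that are zero."""
    cells = [x for row in m for x in row]
    return sum(1 for x in cells if x == 0) / len(cells)

cov_like = [[2.0, 0.5], [0.5, 1.0]]   # symmetric, like a covariance matrix
text_like = [[0, 0, 5], [0, 3, 0]]    # mostly zeros, like TF-IDF rows

eye = identity(2)                          # [[1, 0], [0, 1]]
diag_ok = is_diagonal([[3, 0], [0, 7]])    # True
sym_ok = is_symmetric(cov_like)            # True
zero_fraction = sparsity(text_like)        # 4 of 6 entries are zero
```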
Matrices in Data Science:
- A dataset with 1000 samples and 10 features is a 1000 × 10 matrix.
- A grayscale image of 28×28 pixels (like in MNIST) is a 28 × 28 matrix.
- A color image is a 3D tensor (Height × Width × 3 color channels).
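To make these shapes concrete, the sketch below builds placeholder data as nested Python lists and reads off the dimensions. The `shape` helper is hypothetical, the values are all zeros, and the 32 × 32 size of the color image is an arbitrary choice for illustration.

```python
def shape(a):
    """Dimensions of a regular nested list, outermost first."""
    dims = []
    while isinstance(a, list):
        dims.append(len(a))
        if not a:          # stop at an empty list
            break
        a = a[0]
    return tuple(dims)

dataset = [[0.0] * 10 for _ in range(1000)]                   # 1000 samples, 10 features
gray = [[0] * 28 for _ in range(28)]                          # MNIST-sized grayscale image
color = [[[0, 0, 0] for _ in range(32)] for _ in range(32)]   # height × width × 3 channels

dataset_shape = shape(dataset)   # (1000, 10)
gray_shape = shape(gray)         # (28, 28)
color_shape = shape(color)       # (32, 32, 3)
```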
Transpose of a Matrix
Definition: The transpose of a matrix A (written Aᵀ) is obtained by swapping its rows and columns.
If A is an m × n matrix, then Aᵀ is an n × m matrix.
Example: A = [[1, 2, 3], [4, 5, 6]] → Aᵀ = [[1, 4], [2, 5], [3, 6]]
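The example above can be reproduced in plain Python. One common idiom, shown here as a sketch, uses `zip(*rows)` to regroup the elements column by column; the function name `transpose` is chosen for illustration.

```python
def transpose(m):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

A = [[1, 2, 3], [4, 5, 6]]    # 2 × 3
A_T = transpose(A)            # 3 × 2: [[1, 4], [2, 5], [3, 6]]
round_trip = transpose(A_T)   # transposing twice returns the original
```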
Use in Data Science:
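- Computing covariance matrices: Cov = (1/n) × Xᵀ × X, where X is mean-centered (each column has its mean subtracted). The sample covariance uses 1/(n−1) in place of 1/n.
- Many ML formulas require the transpose so that matrix dimensions are compatible for multiplication.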
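The covariance formula can be traced step by step in plain Python. The formula only gives a covariance matrix when each column of X has had its mean subtracted, so this sketch centers the data first. The helper names and the small sample matrix `X` are assumptions made for illustration.

```python
def transpose(m):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Matrix product: entry (i, j) is the dot product of row i of a and column j of b."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]

def covariance(X):
    """Population covariance (1/n) * Xc^T * Xc, where Xc is X with column means removed."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    centered = [[x - mu for x, mu in zip(row, means)] for row in X]
    product = matmul(transpose(centered), centered)
    return [[value / n for value in row] for row in product]

# 3 samples, 2 features; the second feature is always twice the first
X = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
cov = covariance(X)   # 2 × 2 and symmetric
```

The transpose is what makes the shapes line up: a 2 × 3 matrix times a 3 × 2 matrix yields the 2 × 2 feature-by-feature result.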
Summary
- Scalars, vectors, and matrices are the building blocks of linear algebra.
- Vectors represent data points; matrices represent entire datasets.
- The dot product is essential for similarity, neural networks, and projections.
- Matrix types (Identity, Diagonal, Sparse, Symmetric) appear frequently in ML.
- The transpose is a fundamental operation used in formulas across data science.