Basic Mathematics for Data Science
Before diving into Linear Algebra or Probability, it is essential to be comfortable with fundamental mathematical concepts that appear repeatedly across data science workflows.
---
1. Functions
Definition: A function is a mathematical relationship that maps each input to exactly one output.
Notation: f(x) = y, meaning "function f takes input x and produces output y."
Types of Functions Important in Data Science:
| Function Type | Formula | Use in Data Science |
|---|---|---|
| Linear | f(x) = mx + b | Linear Regression (predicting continuous values) |
| Quadratic | f(x) = ax² + bx + c | Cost functions, optimization |
| Exponential | f(x) = aˣ | Growth modeling (population, viral spread) |
| Logarithmic | f(x) = log(x) | Feature scaling, information theory |
| Sigmoid | f(x) = 1 / (1 + e⁻ˣ) | Logistic Regression, Neural Networks |
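The function types in the table can be written directly in code. The following is a minimal sketch; the default parameter values (m, b, a, c) and the sample inputs are illustrative assumptions, not values from the text.

```python
import math

def linear(x, m=2.0, b=1.0):
    """f(x) = mx + b (m and b are illustrative defaults)."""
    return m * x + b

def quadratic(x, a=1.0, b=0.0, c=0.0):
    """f(x) = ax² + bx + c (a, b, c are illustrative defaults)."""
    return a * x ** 2 + b * x + c

def exponential(x, a=2.0):
    """f(x) = aˣ (base a is an illustrative default)."""
    return a ** x

def logarithmic(x):
    """f(x) = log(x), here the natural log; defined only for x > 0."""
    return math.log(x)

def sigmoid(x):
    """f(x) = 1 / (1 + e⁻ˣ); squashes any real input into (0, 1).

    Note: math.exp(-x) overflows for very negative x (below about -709),
    so production code usually uses a numerically stable variant.
    """
    return 1.0 / (1.0 + math.exp(-x))

# Evaluate every function at a few sample inputs.
for x in (0.5, 1.0, 2.0):
    print(x, linear(x), quadratic(x), exponential(x), logarithmic(x), sigmoid(x))
```

The sigmoid is the one to watch: whatever the input, its output stays strictly between 0 and 1, which is why it is used to turn a score into a probability.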
---
2. Logarithms
Definition: A logarithm answers the question: "To what power must the base be raised to produce a given number?"
log_b(x) = y means bʸ = x
Common Bases:
- Base 10 (Common Log): log₁₀(1000) = 3 because 10³ = 1000
- Base 2 (Binary Log): Used in information theory and decision trees.
- Base e (Natural Log, ln): Most common in ML; e ≈ 2.718
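As a quick check, Python's standard `math` module provides all three bases. This sketch uses only standard-library calls; the variable names are illustrative.

```python
import math

log10_of_1000 = math.log10(1000)   # base 10: which power of 10 gives 1000?
log2_of_8 = math.log2(8)           # base 2: which power of 2 gives 8?
ln_of_e = math.log(math.e)         # base e: math.log defaults to the natural log

print(log10_of_1000, log2_of_8, ln_of_e)
print(math.e)                      # Euler's number, approximately 2.718
```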
Why Logarithms Matter in Data Science:
- Compressing Large Ranges: Income data ranging from ₹10,000 to ₹10,00,00,000 can be compressed using log transformation.
- Log Loss (Cross-Entropy): The most common loss function for classification models.
- Information Gain: Decision Trees use log₂ to calculate entropy and information gain.
- Feature Engineering: Applying log transformation to skewed data to make it more normally distributed.
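The uses listed above can be sketched in a few lines. The income values are hypothetical points spanning the range mentioned, and `binary_entropy` and `log_loss` are illustrative helper names written for this example, not functions from any particular library.

```python
import math

# Hypothetical incomes (in rupees) from ₹10,000 up to ₹10,00,00,000.
incomes = [10_000, 100_000, 1_000_000, 10_000_000, 100_000_000]

# Log transformation: a range spanning four orders of magnitude
# collapses to values between 4 and 8.
log_incomes = [math.log10(x) for x in incomes]

def binary_entropy(p):
    """Entropy (in bits) of a two-class split where one class has probability p."""
    if p in (0.0, 1.0):
        return 0.0  # a pure split carries no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def log_loss(y_true, y_prob, eps=1e-15):
    """Average binary cross-entropy; eps keeps log() away from 0."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

print(log_incomes)
print(binary_entropy(0.5))
print(log_loss([1, 0], [0.9, 0.1]))
```

Log loss punishes confident mistakes heavily: predicting 0.1 for an example whose true label is 1 costs far more than predicting 0.9.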
Properties of Logarithms:
| Property | Formula | Example |
|---|---|---|
| Product Rule | log(a × b) = log(a) + log(b) | log(2 × 5) = log(2) + log(5) |
| Quotient Rule | log(a / b) = log(a) - log(b) | log(10/2) = log(10) - log(2) |
| Power Rule | log(aⁿ) = n × log(a) | log(8) = log(2³) = 3 × log(2) |
| Change of Base | log_b(a) = log(a) / log(b) | log₂(8) = log(8) / log(2) = 3 |
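Each property in the table can be checked numerically. The sketch below uses the natural log and illustrative values for a, b, and n; `math.isclose` is used because floating-point results may differ in the last digits.

```python
import math

a, b, n = 2.0, 5.0, 3  # illustrative values; any positive a and b would do

# Product rule: log(a × b) = log(a) + log(b)
product_ok = math.isclose(math.log(a * b), math.log(a) + math.log(b))

# Quotient rule: log(a / b) = log(a) - log(b)
quotient_ok = math.isclose(math.log(a / b), math.log(a) - math.log(b))

# Power rule: log(aⁿ) = n × log(a)
power_ok = math.isclose(math.log(a ** n), n * math.log(a))

# Change of base: log₂(8) = log(8) / log(2)
change_of_base_ok = math.isclose(math.log(8) / math.log(2), math.log2(8))

print(product_ok, quotient_ok, power_ok, change_of_base_ok)
```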
---
3. Summation Notation (Sigma Notation)
Definition: The Greek letter Σ (sigma) represents the sum of a series of terms.
Σᵢ₌₁ⁿ xᵢ = x₁ + x₂ + x₃ + ... + xₙ
Examples:
- Mean (Average): μ = (1/n) × Σᵢ₌₁ⁿ xᵢ
- Sum of Squares: Σᵢ₌₁ⁿ xᵢ²
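In code, Σ is simply a loop that accumulates a running total. A minimal sketch, using a small hypothetical data set `xs`:

```python
xs = [2.0, 4.0, 4.0, 5.0]  # hypothetical data points x₁ ... xₙ
n = len(xs)

# Σ xᵢ written out as an explicit loop, to mirror the notation.
total = 0.0
for x in xs:
    total += x

# Mean: μ = (1/n) × Σ xᵢ
mean = total / n

# Sum of squares: Σ xᵢ², using the built-in sum() over a generator.
sum_of_squares = sum(x ** 2 for x in xs)

print(total, mean, sum_of_squares)
```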
Why It Matters:
- Almost every statistical formula (mean, variance, standard deviation) uses summation.
- Cost functions in ML are expressed using sigma notation.
- Understanding it is essential for reading research papers and textbooks.
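To make the points above concrete, here is a sketch of two formulas built on summation: the variance and the mean squared error (MSE), a common cost function. The function names are illustrative, and the variance shown is the population form (dividing by n), which is an assumption; sample variance divides by n - 1 instead.

```python
import math

def variance(xs):
    """Population variance: (1/n) × Σ (xᵢ - μ)²."""
    n = len(xs)
    mu = sum(xs) / n
    return sum((x - mu) ** 2 for x in xs) / n

def standard_deviation(xs):
    """Square root of the population variance."""
    return math.sqrt(variance(xs))

def mean_squared_error(y_true, y_pred):
    """MSE cost: (1/n) × Σ (yᵢ - ŷᵢ)²."""
    n = len(y_true)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n

print(variance([1, 2, 3, 4]))
print(standard_deviation([1, 2, 3, 4]))
print(mean_squared_error([1, 2, 3], [1, 2, 5]))
```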
---
4. Derivatives (Calculus Basics)
Definition: A derivative measures the rate of change of a function with respect to its input. In Data Science, derivatives are used to minimize (or maximize) functions — the core of model training.
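A derivative can be approximated numerically by measuring how much the output changes for a tiny change in the input. The sketch below uses a central difference; the helper name `numerical_derivative`, the step size `h`, and the example function are illustrative assumptions.

```python
def numerical_derivative(f, x, h=1e-5):
    """Approximate f'(x) as the slope between two nearby points."""
    return (f(x + h) - f(x - h)) / (2 * h)

def square(x):
    """Example function f(x) = x², whose exact derivative is 2x."""
    return x ** 2

slope_at_3 = numerical_derivative(square, 3.0)
print(slope_at_3)  # close to the exact value 2 × 3 = 6
```

A positive slope means the function is rising at that point and a negative slope means it is falling, which is the information an optimizer needs to decide which way to move.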
Key Concept — Gradient Descent:
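- Gradient descent, together with variants such as stochastic gradient descent, is the most widely used family of optimization algorithms for training machine learning models.
- It uses derivatives to step toward a minimum of a cost function; for non-convex functions this may be only a local minimum.
- With a single input, the "gradient" is simply the derivative. With multiple inputs, it is the vector of partial derivatives, one per input.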
Intuition:
Imagine you are standing on a hilly landscape blindfolded. You want to reach the lowest point (valley). You feel the slope under your feet (the derivative) and take a step in the downhill direction. You repeat until you reach the bottom. That is Gradient Descent.
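The blindfolded walk can be sketched in a few lines: repeatedly read the slope and step the opposite way. This is a minimal one-dimensional illustration; the cost function, starting point, learning rate, and step count are all hypothetical choices, not values from the text.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Take repeated steps downhill, i.e. against the sign of the slope."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)  # step size scales with the slope
    return x

def cost(x):
    """Hypothetical bowl-shaped cost with its lowest point at x = 3."""
    return (x - 3.0) ** 2 + 1.0

def cost_grad(x):
    """Derivative of the cost above: 2(x - 3)."""
    return 2.0 * (x - 3.0)

x_min = gradient_descent(cost_grad, x0=10.0)
print(x_min, cost(x_min))  # x_min ends up very close to 3
```

The learning rate controls the step length: too small and progress is slow, too large and the steps can overshoot the valley and fail to settle.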
---
Summary
- Functions, logarithms, summation notation, and basic calculus are the mathematical "alphabet" of Data Science.
- Logarithms are used in loss functions, feature engineering, and information theory.
- Sigma notation is the language of statistical formulas.
- Derivatives power gradient descent, the engine behind training ML models.