Parametric Tests: Z and T Distributions
1. What makes a test "Parametric"?
Parametric tests assume that the underlying population follows a specific probability distribution (usually the Normal distribution) and draw conclusions about that distribution's parameters (such as the mean and variance).
Study Deep: Welch's T-Test (Unequal Variance)
The standard Independent T-test assumes that both groups have the same variance (Homoscedasticity). If this assumption is violated, the test can give misleading results.
- The Solution: Use Welch's T-Test.
- Difference: It adjusts the degrees of freedom (df) based on the variances and sizes of the two groups. It is now often treated as the "modern default" because it performs almost as well as the standard test when variances are equal, but is much safer when they are not.
2. The Z-Test
Used to compare a sample mean to a population mean. Conditions: the population standard deviation (σ) is KNOWN, OR the sample size is large (n ≥ 30), in which case s stands in for σ and the CLT justifies the Normal approximation.
Formula: Z = (x̄ - μ) / (σ / √n)
- Example: We know historical server response time has μ=200ms and σ=20ms. A new routing algorithm is tested on 50 requests (n=50), giving x̄=190ms.
Z = (190 - 200) / (20 / √50) = -10 / 2.83 = -3.54
- Since |Z| = 3.54 > 1.96, the difference is significant at α = 0.05 (two-tailed). Because x̄ is below μ, the new algorithm's mean response time is significantly lower.
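The server response-time calculation above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `z_test` is an illustrative choice, not something defined in these notes.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, pop_mean, pop_sd, n):
    """Return (z, two-sided p-value) for a one-sample Z-test."""
    standard_error = pop_sd / sqrt(n)          # sigma / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Two-sided p-value from the standard Normal distribution
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# Values from the example: mu = 200 ms, sigma = 20 ms, n = 50, x-bar = 190 ms
z, p = z_test(sample_mean=190, pop_mean=200, pop_sd=20, n=50)
print(f"Z = {z:.2f}")          # Z = -3.54
print(f"reject H0 at 0.05: {abs(z) > 1.96}")
```

The p-value is returned as well, so the same conclusion can be reached by comparing it to α instead of comparing |Z| to 1.96.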
3. The Student's T-Test
Used when σ is UNKNOWN and must be estimated using the sample standard deviation (s). It matters most for small datasets (n < 30), where the extra uncertainty in s is too large to ignore.
Formula (One-Sample): t = (x̄ - μ) / (s / √n) with df = n - 1
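As a rough illustration of the one-sample formula, the sketch below computes t and df from raw data. The function name `one_sample_t` and the latency values are hypothetical, made up purely for the example.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t-test of H0: mean == mu0."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)                  # sample SD, uses the n - 1 denominator
    t = (x_bar - mu0) / (s / sqrt(n))
    return t, n - 1


# Hypothetical data: five latency readings in ms, tested against mu0 = 200
latencies_ms = [198, 205, 190, 202, 195]
t, df = one_sample_t(latencies_ms, mu0=200)
print(f"t = {t:.3f}, df = {df}")
```

The resulting t is then compared with the critical value for df = n - 1 from a t-table.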
4. Independent (Two-Sample) T-Test
Used to compare the means of two entirely different groups (e.g., Mac users vs. Windows users).
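- Formula (unpooled standard error):
t = (x̄₁ - x̄₂) / √((s₁²/n₁) + (s₂²/n₂))
- Assumption of Homoscedasticity: The standard (Student's) version assumes both groups have equal variances. It replaces s₁² and s₂² in the formula with the pooled variance s_p² = ((n₁ - 1)s₁² + (n₂ - 1)s₂²) / (n₁ + n₂ - 2) and uses df = n₁ + n₂ - 2. When the two groups are the same size, the pooled and unpooled statistics coincide.
- Welch's T-Test: If the variances differ noticeably (Heteroscedasticity), we keep the unpooled formula and use Welch's adjustment, which lowers the degrees of freedom to prevent an inflated false-positive rate.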
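To make the difference between the two versions concrete, here is a minimal sketch of both: the pooled (equal-variance) statistic and Welch's statistic with the Welch-Satterthwaite degrees of freedom. The function names and the sample data are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Student's two-sample t-test (assumes equal variances). Returns (t, df)."""
    n1, n2 = len(a), len(b)
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def welch_t(a, b):
    """Welch's t-test (no equal-variance assumption). Returns (t, df)."""
    n1, n2 = len(a), len(b)
    q1, q2 = variance(a) / n1, variance(b) / n2   # s1^2/n1 and s2^2/n2
    t = (mean(a) - mean(b)) / sqrt(q1 + q2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (q1 + q2) ** 2 / (q1 ** 2 / (n1 - 1) + q2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical task times for two groups with visibly different spread
mac = [12, 15, 11, 14, 13]
windows = [20, 9, 25, 7, 18, 11]

t_w, df_w = welch_t(mac, windows)
t_p, df_p = pooled_t(mac, windows)
print(f"Welch:  t = {t_w:.3f}, df = {df_w:.2f}")
print(f"Pooled: t = {t_p:.3f}, df = {df_p}")
```

Welch's df is generally not an integer and always lies between min(n₁, n₂) - 1 and n₁ + n₂ - 2, which is why it is the more conservative choice when one group is much noisier than the other.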
5. Paired (Matched) T-Test
Used to compare means from the SAME group at two different times (e.g., System latency Before vs. After a patch).
- Method: We calculate the difference (d) for each pair, then run a one-sample t-test on those differences to see if the average difference is 0.
- Formula: t = d̄ / (s_d / √n) with df = n - 1 (where d̄ is the mean of the differences, s_d is their standard deviation, and n is the number of pairs).
- Why use it? It controls for variation between individuals, which usually makes it more statistically powerful than an independent t-test when the paired measurements are positively correlated.
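The paired procedure reduces to a one-sample test on the differences, which the following sketch shows. The function name `paired_t` and the before/after latency readings are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Return (t, df) for a paired t-test of H0: mean difference == 0."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]   # d = after - before
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))       # d-bar / (s_d / sqrt(n))
    return t, n - 1


# Hypothetical latency (ms) for the same six servers before and after a patch
before_ms = [210, 195, 220, 205, 198, 215]
after_ms = [200, 190, 212, 199, 195, 204]

t, df = paired_t(before_ms, after_ms)
print(f"t = {t:.2f}, df = {df}")
```

Because each server is compared with itself, differences between servers drop out of s_d, which is where the extra power comes from.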
6. Z-Test vs T-Test: The Complete Decision Guide
This is one of the most common exam questions. Use this step-by-step decision process:
Step 1: Is population standard deviation (σ) known?
- YES → Use Z-Test (if n ≥ 30 or population is normally distributed)
- NO → Use T-Test (you must use sample s instead)
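Step 2: If σ is unknown, what is the sample size?
- n ≥ 30 → T-Test remains correct; a Z-Test with s in place of σ is a common approximation (the t-distribution is close to the Normal at large df)
- n < 30 → T-Test is mandatory (its heavier tails account for the extra uncertainty in s; the population should be approximately Normal)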
| Criteria | Z-Test | T-Test |
|---|---|---|
| σ known? | Yes | No (use sample s) |
| Sample Size | n ≥ 30 | Any (essential for n < 30) |
| Distribution | Standard Normal Z ~ N(0,1) | Student's t with df=n-1 |
| Critical Value (α=0.05, two-tail) | ±1.96 | ±2.064 (df=24), ±2.262 (df=9) |
| Formula | Z = (x̄-μ)/(σ/√n) | t = (x̄-μ)/(s/√n) |
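
The decision process can also be written as a small helper. This is only a sketch of the guide's Step 1 logic: the function name `choose_test`, the `population_normal` flag, and the "assumptions not met" return value are illustrative assumptions, not part of the guide itself.

```python
def choose_test(sigma_known: bool, n: int, population_normal: bool = False) -> str:
    """Suggest 'Z-test' or 'T-test' following the decision guide."""
    if sigma_known:
        # Sigma known: Z-test needs a large sample or a Normal population
        if n >= 30 or population_normal:
            return "Z-test"
        return "assumptions not met"
    # Sigma unknown: it must be estimated by s, so use the t-distribution
    return "T-test"


print(choose_test(sigma_known=True, n=50))    # Z-test
print(choose_test(sigma_known=False, n=16))   # T-test
```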
7. Worked Numerical Example (Exam Style)
Problem: A random sample of 16 students was taken. Their mean score was 62 and the sample standard deviation was 10. The university claims the average score is 65. Test at α = 0.05 (two-tailed).
Step 1: H₀: μ = 65, H₁: μ ≠ 65
Step 2: σ is UNKNOWN, n = 16 < 30 → Use T-Test
Step 3: t = (62 - 65) / (10 / √16) = -3 / 2.5 = -1.2
Step 4: df = 16 - 1 = 15. Critical value t₀.₀₅,₁₅ = ±2.131
Step 5: |t| = 1.2 < 2.131 → Fail to Reject H₀
Conclusion: There is insufficient evidence to say the mean score differs from 65.
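The same steps can be reproduced from the summary statistics alone. In this sketch the critical value 2.131 is taken from the t-table as quoted in the problem rather than computed, and the function name `t_from_summary` is an illustrative choice.

```python
from math import sqrt


def t_from_summary(x_bar, mu0, s, n):
    """Return (t, df) for a one-sample t-test given summary statistics."""
    t = (x_bar - mu0) / (s / sqrt(n))
    return t, n - 1


# Values from the problem: n = 16, x-bar = 62, s = 10, claimed mean = 65
t, df = t_from_summary(x_bar=62, mu0=65, s=10, n=16)

T_CRIT = 2.131                 # two-tailed critical value, alpha = 0.05, df = 15 (from table)
reject = abs(t) > T_CRIT

print(f"t = {t:.2f}, df = {df}, reject H0: {reject}")
# t = -1.20, df = 15, reject H0: False
```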