2.4 Rigorous Hypothesis Testing

Lesson 12 of 32 in the free Data Visualisation and Analytics notes on Siksha Sarovar, written by Rohit Jangra.

Hypothesis Testing: The Decision Framework

1. The Philosophical Framework

Hypothesis testing operates like a criminal trial: "Innocent until proven guilty."

  • Null Hypothesis (H₀): The assumption of innocence. It states that there is no effect, no difference, or no relationship, so any observed difference is attributed to random sampling error.
  • Alternative Hypothesis (H₁ or Hₐ): The claim we are seeking evidence for. As in a trial, the verdict is either "reject H₀" or "fail to reject H₀"; H₀ is never proven true.

Study Deep: Statistical Power (1 - β)

Statistical Power is the probability that a test will correctly reject a false null hypothesis.

  • High Power: You are likely to detect a real effect if it exists.
  • How to increase Power:
  1. Increase Sample Size (most common).
  2. Increase Alpha (but this increases Type I error risk).
  3. Increase Effect Size (look for bigger differences).
  4. Reduce Noise (standardize conditions).
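The four levers in the Study Deep list can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a one-sided, one-sample z-test with a known standard deviation, and the function name `z_test_power` and every number in it (effect 2.0, sigma 10.0, n = 50) are hypothetical, not taken from the lesson.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a one-sided, one-sample z-test with known sigma."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha)  # cutoff for rejecting H0
    shift = effect * sqrt(n) / sigma          # mean of the z statistic when H1 is true
    return 1 - std_normal.cdf(critical - shift)


# Hypothetical baseline: true effect 2.0, noise sigma 10.0, n = 50, alpha = 0.05
baseline = z_test_power(effect=2.0, sigma=10.0, n=50)

# Change one lever at a time and keep everything else at the baseline.
levers = {
    "1. larger sample (n = 200)": z_test_power(2.0, 10.0, 200),
    "2. larger alpha (0.10)": z_test_power(2.0, 10.0, 50, alpha=0.10),
    "3. larger effect (4.0)": z_test_power(4.0, 10.0, 50),
    "4. less noise (sigma = 5.0)": z_test_power(2.0, 5.0, 50),
}

print(f"baseline power: {baseline:.3f}")
for name, power in levers.items():
    print(f"{name}: {power:.3f}")
```

Each lever raises power above the baseline, but only lever 2 does so at the cost of more false positives.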

2. Type I and Type II Errors (Machine Learning Context)

No test is 100% accurate because we rely on samples.

| Metric | Statistical Term | ML/System Equivalent | Definition |
|---|---|---|---|
| False Positive | Type I Error (α) | Spam filter marks a normal email as spam | Rejecting H₀ when H₀ is actually true. |
| False Negative | Type II Error (β) | Spam filter lets a spam email through to the inbox | Failing to reject H₀ when H₁ is true. |

In the spam-filter example, H₀ is "this email is normal" and H₁ is "this email is spam".

  • Alpha (α): The probability of making a Type I error (usually set at 0.05). Also called the Significance Level.
  • Beta (β): The probability of making a Type II error.
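To connect the two error types to the spam-filter example, the following sketch estimates α and β from confusion-matrix counts, assuming H₀ is "this email is normal". The helper `error_rates` and the counts passed to it are invented for illustration only.

```python
def error_rates(tp, fp, tn, fn):
    """Estimate Type I and Type II error rates from confusion-matrix counts.

    Positive class = spam, so H0 = "email is normal" and H1 = "email is spam".
    """
    alpha_hat = fp / (fp + tn)  # share of normal emails wrongly flagged (Type I)
    beta_hat = fn / (fn + tp)   # share of spam emails that were missed (Type II)
    return {"alpha_hat": alpha_hat, "beta_hat": beta_hat, "power_hat": 1 - beta_hat}


# Hypothetical evaluation set: 100 normal emails and 100 spam emails
rates = error_rates(tp=80, fp=5, tn=95, fn=20)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

With these made-up counts, 5 of 100 normal emails are flagged and 20 of 100 spam emails slip through, so the estimated rates are 0.05 and 0.20.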

3. Statistical Power (1 - β)

Power is the probability of correctly rejecting a false Null Hypothesis (i.e., finding an effect when it truly exists). In ML terms, this is Recall or Sensitivity.

  • By convention, a test should have a Power of at least 0.80 (80%).
  • How to increase Power: Increase the sample size (n), increase Alpha (trade-off: more false positives), or increase the standardized effect size (a larger true difference, or less noise through more precise measurement).
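The 0.80 target is often used in reverse, to decide how many observations to collect. The sketch below uses the standard normal-approximation formula n = ((z₁₋α/₂ + z_power) · σ / δ)² for a two-tailed, one-sample z-test. The function name `required_sample_size` and the example values (difference 2.0, sigma 10.0) are assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(effect, sigma, alpha=0.05, power=0.80):
    """Approximate n for a two-tailed, one-sample z-test with known sigma."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_power = std_normal.inv_cdf(power)          # quantile matching the target power
    n = ((z_alpha + z_power) * sigma / effect) ** 2
    return ceil(n)                               # round up to a whole observation


# Hypothetical study: detect a true difference of 2.0 when sigma is 10.0
n_strict = required_sample_size(effect=2.0, sigma=10.0)              # alpha = 0.05
n_relaxed = required_sample_size(effect=2.0, sigma=10.0, alpha=0.10)
print(n_strict, n_relaxed)
```

Relaxing α from 0.05 to 0.10 lowers the required n, which is the same trade-off described above, seen from the planning side.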

4. The Rejection Region (Critical Region)

The Critical Region is the set of test-statistic values, in the tail(s) of the sampling distribution under H₀, that are so extreme we reject H₀. The total area of this region under the curve equals α.

  • Critical Values: The boundaries of this region (e.g., Z = +1.96 and Z = -1.96 for a two-tailed test at α = 0.05, i.e., 95% confidence).
  • If your calculated test statistic falls inside the critical region, you reject the null.
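As a quick check of the ±1.96 boundaries, the sketch below derives the two-tailed critical value from the standard normal distribution and tests whether a statistic lands in the critical region. The function names and the sample z values are illustrative assumptions.

```python
from statistics import NormalDist


def two_tailed_critical_value(alpha=0.05):
    """Positive boundary of the two-tailed critical region for a z statistic."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def in_critical_region(z_stat, alpha=0.05):
    """True when the statistic is beyond either boundary, i.e. reject H0."""
    return abs(z_stat) > two_tailed_critical_value(alpha)


critical = two_tailed_critical_value(0.05)
print(f"critical values: -{critical:.2f} and +{critical:.2f}")

# Hypothetical test statistics
for z_stat in (2.30, -1.50):
    verdict = "reject H0" if in_critical_region(z_stat) else "fail to reject H0"
    print(f"z = {z_stat:+.2f}: {verdict}")
```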

5. Steps for Formal Testing

  1. Formulate H₀ and H₁: Ensure they are mutually exclusive.
  2. Set α and select the test: (e.g., α = 0.05, two-tailed t-test).
  3. Verify Assumptions: (Normality, Independence, Homoscedasticity).
  4. Compute Test Statistic & p-value: Calculate both from the sample data.
  5. Draw Conclusion: Compare p-value to α, or test statistic to critical value. State the conclusion in business/domain terms.
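The five steps can be sketched end to end. To stay within the Python standard library, this example swaps the t-test mentioned in step 2 for a two-tailed, one-sample z-test with a known σ; that substitution, the function `one_sample_z_test`, the delivery-time data, and the values μ₀ = 30 and σ = 4 are all assumptions made for illustration. Step 3 is assumed to hold here rather than checked.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-tailed, one-sample z-test with known population sigma."""
    std_normal = NormalDist()
    n = len(sample)
    z_stat = (mean(sample) - mu0) / (sigma / sqrt(n))  # step 4: test statistic
    p_value = 2 * (1 - std_normal.cdf(abs(z_stat)))    # step 4: two-tailed p-value
    critical = std_normal.inv_cdf(1 - alpha / 2)       # boundary of critical region
    return {"z": z_stat, "p_value": p_value, "critical": critical,
            "reject_h0": p_value < alpha}              # step 5: decision


# Step 1: H0: mean delivery time = 30 minutes; H1: mean delivery time != 30 minutes
# Step 2: alpha = 0.05, two-tailed z-test (sigma = 4 assumed known)
# Step 3: observations assumed independent and roughly normal (not verified here)
delivery_minutes = [31.2, 29.8, 33.1, 30.5, 32.4, 31.9,
                    28.7, 33.6, 32.0, 30.8, 31.5, 32.9]  # hypothetical data

result = one_sample_z_test(delivery_minutes, mu0=30.0, sigma=4.0)
print(f"z = {result['z']:.3f}, p = {result['p_value']:.3f}")

# Step 5: state the conclusion in domain terms
if result["reject_h0"]:
    print("Mean delivery time differs from 30 minutes.")
else:
    print("No evidence that mean delivery time differs from 30 minutes.")
```

Comparing the p-value to α and comparing |z| to the critical value are equivalent decision rules, so either check in step 5 gives the same verdict.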