Siksha Sarovar

Siksha Sarovar (sikshasarovar.com) is a free educational web application that helps students in India learn programming and prepare for academic and competitive exams. The platform offers structured coding courses (C, C++, Python, Java, HTML, CSS, PHP, Power BI, AI, Machine Learning, Data Science), complete university curriculum notes for BCA/MCA students with previous year question papers, Class 10 and Class 12 CBSE/HBSE school notes, and dedicated preparation material for SSC, UPSC, Banking, Railway and other government exams. Browsing the site is completely free and requires no account. Users may optionally sign in with Google solely to save their learning progress, quiz scores and personal preferences across devices.

Privacy Policy | Terms of Service | Contact Siksha Sarovar | About Siksha Sarovar

v4.0.9 · PWA
Siksha Sarovar logo
Siksha Sarovar
Your Learning Universe

Siksha Sarovar is a free e-learning platform for coding courses, BCA university notes and competitive exam preparation. Optional Google sign-in saves your learning progress across devices.

Initializing knowledge base…
Compiling modules 0%

4.3 White-Box Testing & Cyclomatic Complexity

Lesson 22 of 24 in the free Software Engineering notes on Siksha Sarovar, written by Rohit Jangra.

4.3 White-Box Testing & Cyclomatic Complexity

What is White-Box Testing?

White-box testing (also glass-box, clear-box, structural testing) examines the internal structure of the code. The tester reads the source code and designs test cases that exercise specific paths, branches and statements.

    Test cases come from understanding
    the code's structure, not the spec.

When to use

  • Unit testing (primary)
  • Some integration testing
  • Security analysis
  • Code-coverage measurement

Advantages

  • Reveals bugs in implementation (not visible from spec)
  • Achieves code coverage targets
  • Validates that all logic paths execute

Disadvantages

  • Requires programming skill
  • Time-consuming
  • May miss missing functionality (if code doesn't have it, no path covers it)
  • Test cases need rewriting if code changes

---

Code Coverage — what we measure

White-box testing uses coverage criteria as goals:

Coverage TypeWhat is coveredStrength
Statement coverageEvery executable statementWeakest
Branch / Decision coverageEvery branch (T and F) of every decisionStronger
Condition coverageEvery Boolean sub-expression's T and FStronger
Multiple Condition coverageEvery combination of conditionsStrong
Path coverageEvery executable path through the codeStrongest (often impractical)
Modified Condition / Decision (MC/DC)Each condition independently affects decisionRequired for DO-178C

Example function

int classify(int x, int y) {
    if (x > 0 && y > 0)         // L2
        return 1;               // L3
    else if (x < 0 || y < 0)    // L4
        return -1;              // L5
    else
        return 0;               // L7
}

100% statement coverage needs 3 tests covering lines 3, 5, 7:

  • (1, 1) → 1 ✓ covers L3
  • (−1, 5) → −1 ✓ covers L5 (via x < 0)
  • (0, 0) → 0 ✓ covers L7

100% branch coverage also needs each decision's true and false:

  • For x > 0 && y > 0: covered (T by 1,1; F by −1,5 and 0,0)
  • For x < 0 || y < 0: covered (T by −1,5; F by 0,0)

---

Cyclomatic Complexity — McCabe (1976)

The most-tested metric in IPU papers. Cyclomatic complexity V(G) measures the number of linearly independent paths through a program. Three equivalent formulas:

V(G) = E − N + 2       (E = edges, N = nodes in control-flow graph)
V(G) = P + 1           (P = number of predicate / decision nodes)
V(G) = number of regions in the control-flow graph

Why it matters

  • V(G) = minimum number of test cases for basis path coverage
  • V(G) = number of independent paths through the code
  • High V(G) means hard-to-test, hard-to-maintain code

Recommended thresholds (McCabe)

  • 1–10: simple, low risk
  • 11–20: moderate, more test effort
  • 21–50: complex, high risk
  • > 50: untestable, must be refactored

---

Step-by-step example

int max(int a, int b, int c) {
    int m = a;                  // S1
    if (b > m)                  // S2 (decision)
        m = b;                  // S3
    if (c > m)                  // S4 (decision)
        m = c;                  // S5
    return m;                   // S6
}

Step 1: Draw control flow graph (CFG)

        ┌────────┐
        │  S1    │
        └───┬────┘
            │
        ┌───▼────┐
        │  S2    │ ◄── decision
        └─┬────┬─┘
       T  │    │  F
        ┌─▼──┐ │
        │S3  │ │
        └─┬──┘ │
          └─┬──┘
            │
        ┌───▼────┐
        │  S4    │ ◄── decision
        └─┬────┬─┘
       T  │    │  F
        ┌─▼──┐ │
        │S5  │ │
        └─┬──┘ │
          └─┬──┘
            │
        ┌───▼────┐
        │  S6    │
        └────────┘

Step 2: Count

  • Nodes (N): S1, S2, S3, S4, S5, S6, and merge points after S2 and S4 → 8 nodes
  • Wait — usually we collapse merge nodes. Simpler count: 6 statement nodes, plus the structure of the diamond gives the count by edges/regions.

Using V(G) = P + 1 (cleanest):

  • Decision nodes: S2 and S4 → P = 2
  • V(G) = 2 + 1 = 3

So 3 independent paths, 3 test cases needed:

Testa, b, cPath
15, 3, 2S2 false, S4 false (a is max)
23, 5, 2S2 true, S4 false (b is max)
33, 2, 5S2 false, S4 true (c is max)

Step 3: Verify with V(G) = E − N + 2

If we draw the CFG cleanly with merge nodes, E (edges) and N (nodes) will satisfy E − N + 2 = 3. All three formulas agree.

---

Bigger example

void categorize(int score) {
    if (score >= 90)              // D1
        printf("A");
    else if (score >= 75)         // D2
        printf("B");
    else if (score >= 50)         // D3
        printf("C");
    else if (score >= 35)         // D4
        printf("D");
    else
        printf("F");
}

Decision nodes: D1, D2, D3, D4 → P = 4 V(G) = 4 + 1 = 5

Test cases for basis paths:

  1. score = 95 → "A"
  2. score = 80 → "B"
  3. score = 60 → "C"
  4. score = 40 → "D"
  5. score = 20 → "F"

5 independent paths, 5 test cases.

---

Basis Path Testing — McCabe's procedure

  1. Draw the CFG from code
  2. Compute V(G)
  3. Determine a basis set of V(G) independent paths
  4. Prepare a test case for each basis path
  5. Execute and verify

The basis set is a mathematical basis for the path space — any other path can be expressed as a combination of basis paths.

---

Common white-box techniques

TechniqueIdea
Statement coverageExecute every statement at least once
Branch coverageExecute every branch (T and F) of every decision
Path coverageExecute every possible path (often infeasible)
Basis path testingMcCabe's method — V(G) paths
Data flow testingTrace variables — definitions and uses
Loop testingTest loops at 0, 1, n, n−1, n+1, and max iterations
Mutation testingInject small bugs into code; tests should catch them

---

Mutation Testing — the bug-injector approach

Idea: generate small mutations of the code (change + to -, < to <=, etc.). A good test suite should "kill" every mutant by failing. Mutation score = killed mutants / total mutants.

Mutation testing is the most rigorous coverage criterion but is computationally expensive. Tools: PIT (Java), Stryker (JS), Mutmut (Python).

---

Loop Testing — checklist

For each loop, design tests for:

TestPurpose
0 iterationsLoop skipped
1 iterationSingle execution
2 iterationsMultiple, baseline
(n−1, n, n+1) iterationsAround expected count
Max iterationsBoundary at max
Nested loopsInner full, outer minimal; vice versa

Loops are a famous source of off-by-one defects. Many production bugs trace back to insufficient loop testing.

---

Black-Box vs White-Box — comparison

AspectBlack-BoxWhite-Box
View of systemExternalInternal
Source code needed?NoYes
Done byTesters, usersDevelopers
StageSystem, acceptanceUnit, integration
FocusFunctionalityLogic / code paths
Test basisSRSCode structure
Programming skillNot requiredRequired
Catches missing function?YesNo
Catches dead code?NoYes

---

Grey-Box Testing

A hybrid — tester has partial knowledge of internals:

  • Knows the data flow / database schema but not the exact code
  • Can write tests that exercise specific paths via inputs
  • Common in security testing (knowing the architecture but treating individual components as black boxes)

---

Key Terms — Lesson 4.3

The vocabulary below covers white-box testing, McCabe's complexity metric, coverage criteria, and the modern automation around them.

White-Box Testing — A testing approach where the internal structure of the code is known to the tester and test cases are designed to exercise specific paths, branches, and statements. Also called glass-box testing, clear-box testing, structural testing, or code-based testing.

Code Coverage — The percentage of the source code that is executed by the test suite. Coverage is measured by instrumenting the code (recording which lines/branches run) during test execution. Industry target: ~80% line coverage as a healthy baseline.

Statement Coverage / Line Coverage — The weakest coverage criterion: every executable statement is executed at least once by some test. Easy to achieve and to measure but does not guarantee that all decision outcomes are tested.

Branch Coverage / Decision Coverage — A stronger criterion than statement coverage: every branch of every decision (the true path and the false path of every if, every loop entry/skip) is exercised. Branch coverage is typically the industry's "useful target."

Condition Coverage — A criterion requiring that each Boolean sub-expression in a compound condition takes both true and false values. Example: if (a > 0 && b > 0) — each of a > 0 and b > 0 must independently be tested true and false.

Multiple Condition Coverage — A still-stronger criterion: every combination of Boolean sub-expression values is tested. For n conditions there are up to 2ⁿ combinations. Usually impractical for real code.

MC/DC (Modified Condition / Decision Coverage) — A specific criterion required by DO-178C Level A (aviation) and ISO 26262 ASIL D (automotive): every Boolean operand must be shown to independently affect the outcome of the decision. MC/DC is the gold standard for safety-critical code.

Path Coverage — The strongest criterion: every executable path through the control-flow graph is exercised. Often infeasible because programs with loops have exponentially or infinitely many paths. Approximated by basis path testing.

Control Flow Graph (CFG) — A directed graph where nodes are basic blocks of code (sequences with one entry and one exit) and edges are control transfers between them (branches, jumps, function calls). The CFG is the structural representation that white-box techniques operate on.

Basic Block — A maximal sequence of statements with one entry and one exit — control enters at the top, exits at the bottom, no jumps in between. Basic blocks are the nodes in the CFG.

Cyclomatic Complexity / V(G) (McCabe, 1976) — A metric for the number of linearly independent paths through a function's control-flow graph. Three equivalent formulas: V(G) = E − N + 2 (edges minus nodes plus 2), V(G) = P + 1 (predicates plus 1), V(G) = number of regions in the CFG. V(G) = minimum number of test cases for basis path coverage.

Predicate Node / Decision Node — A CFG node where control branches based on a conditionif, while, for, switch. The count of predicate nodes plus one equals cyclomatic complexity.

McCabe's Thresholds — McCabe's recommended interpretation of V(G): 1–10 simple, low risk; 11–20 moderate, more test effort; 21–50 complex, high risk; > 50 untestable, must be refactored. Modern industry uses V(G) > 10 as a refactoring signal.

Linearly Independent Path — A path through the CFG that introduces at least one new edge not in any previously chosen path. The basis set is a maximal collection of linearly independent paths; its size equals V(G). Any path in the program is a linear combination of basis paths.

Basis Path Testing — McCabe's procedure: draw the CFG, compute V(G), identify a basis set of V(G) paths, and write one test case per path. Guarantees branch coverage and exercises each independent path.

Region — In a planar drawing of the CFG, a bounded area enclosed by edges (plus the unbounded outer region). The number of regions equals V(G), giving the third equivalent formula.

Data Flow Testing — A white-box technique that traces the lifecycle of each variable — where it is defined (assigned) and where it is used (read). Tests are designed to exercise every definition-use (DU) chain. Catches bugs that statement coverage misses — uninitialised variables, dead definitions, dangerous shadowing.

Loop Testing — A specific white-box technique focused on loops, the most common source of off-by-one defects. Test cases exercise the loop at 0 iterations (skip), 1 iteration, 2 iterations, n−1 / n / n+1, and maximum iterations.

Mutation Testing — A technique that deliberately introduces small bugs (mutants) into the code — change + to -, change < to <=, remove a statement — and runs the test suite. A test suite that kills (fails on) most mutants is judged more thorough than one that misses them. Mutation score = killed / total mutants.

Mutant — A version of the program with one small synthetic bug inserted. Mutants are generated automatically by mutation testing tools (PIT for Java, Stryker for JavaScript, Mutmut for Python).

Killed vs Surviving Mutant — A mutant is killed if at least one test in the suite fails on it (the suite detected the bug). A mutant survives if all tests pass — meaning the suite is blind to that kind of change. Surviving mutants indicate weak tests.

Coverage Tool — Software that instruments the code to record which lines/branches/paths execute during tests, then produces a coverage report. JaCoCo for Java, Coverage.py for Python, Istanbul / nyc for JavaScript, gcov / lcov for C/C++, Coverlet for .NET.

Code Coverage Threshold (in CI) — A configured minimum coverage percentage below which the CI build fails. Common configurations: fail if line coverage falls under 80%, or fail if any single file is under 60%. Coverage thresholds prevent silent erosion of test rigour.

Grey-Box Testing — A hybrid testing approach where the tester has partial knowledge of internals — data flow, database schema, architecture — but still treats individual components as black boxes. Common in security testing and integration testing.

Symbolic Execution — An advanced white-box technique where the program is executed with symbolic inputs (e.g., x = α rather than x = 5) and the tool automatically derives the input constraints that lead to each path. Used by tools like KLEE and SAGE for automatic test case generation, especially in security.

Fuzz Testing / Fuzzing — A testing technique that feeds random or semi-random inputs into a program to look for crashes, hangs, or unexpected outputs. Modern coverage-guided fuzzers (AFL, libFuzzer, honggfuzz) use coverage feedback to evolve inputs that exercise new paths. Hugely successful in finding security bugs.

Static Analysis (White-Box Context) — Analysing the source code without executing it to find bugs, security issues, and code smells. Tools: SonarQube, Coverity, Klocwork, CodeQL. Static analysis complements white-box testing because it can find defects in code paths that tests may never reach.

Inspection vs Testing (White-Box Context) — Two complementary defect-finding activities. Inspection is the static white-box activity — reading code by humans (Fagan, peer review). Testing is the dynamic activity — running code with specific inputs. Inspections find defects testing typically misses (intent-related, design-level); testing finds defects inspections miss (behavioural, environment-dependent).

---

Study deep

  1. Coverage is a means, not an end. 100% statement coverage doesn't mean the code is bug-free — it means every line was executed, not that every behaviour was verified. Coverage targets are a useful baseline (~80% is industry typical) but not a quality guarantee.
  1. V(G) approximates testability. Functions with V(G) ≤ 10 are usually manageable. Above 20 the function is likely doing too much and should be split. Cyclomatic complexity is a refactoring trigger.
  1. MC/DC is required for life-critical software. Aviation (DO-178C Level A) and automotive (ISO 26262 ASIL D) mandate Modified Condition / Decision Coverage — each Boolean operand must independently influence the decision outcome. Far stricter than basic branch coverage.
  1. Mutation testing is the gold standard. Studies show that test suites with high mutation scores catch real bugs better than test suites with merely high coverage. The cost is significant — entire test suite reruns for each mutation — but tools are improving.
  1. Modern white-box automation. Tools like JaCoCo (Java), Coverage.py (Python), and Istanbul (JS) report coverage automatically. CI pipelines fail if coverage drops below a threshold (e.g. 80%).
PYQ pattern (numerical, very common): "For the given code, draw the control flow graph and compute cyclomatic complexity." — Identify decision nodes (P), compute V(G) = P + 1. Verify with E − N + 2 or regions. Identify a basis set of paths.