Aim
To draw a bar graph of India's CWG-2018 medal tally (Gold 26, Silver 20, Bronze 20, Total 66) with matplotlib and to read it using sound perceptual principles.
CO Mapping: CO1, CO2, CO3
Theory
A bar chart encodes each category's value as the length of a bar rising from a common zero baseline. Cleveland & McGill's classic graphical-perception experiments ranked position along a common scale and length as the judgements humans decode most accurately — far better than angle (pie charts) or area/colour. That is why bars are the default choice for comparing quantities across categories.
Two integrity rules follow directly:
- The value axis must start at 0. Length is the encoding; truncating the axis (say, starting at 15) would make Gold look several times Silver instead of 1.3×. The distortion is measured by Tufte's lie factor — visual effect size divided by data effect size, which should stay ≈ 1.
- Do not mix parts with wholes carelessly. "Total" (66) is the sum of the other three bars. Plotting it alongside them is acceptable in a scoreboard, but a reader must recognise it is a different kind of quantity — otherwise the chart appears to claim Total is simply the best category.
Colour here works as identity encoding: gold, silver and bronze hues match the medals, and steel-blue marks Total as different — a legend-free labelling trick, not decoration.
Dataset
| Type | Count |
|---|---|
| Gold | 26 |
| Silver | 20 |
| Bronze | 20 |
| Total | 66 |
Procedure
- Build the
medalsDataFrame with columnsTypeandCount; print it as the data record. - Select the non-interactive backend with
matplotlib.use("Agg")before importingpyplot, so the script also runs on headless/online compilers. - Call
plt.bar(medals["Type"], medals["Count"], color=[...]), passing four colours — one per bar, in data order. - Add
title,xlabel("Medal Type") andylabel("Count"); callplt.tight_layout(). - Save with
plt.savefig("cwg_medal_tally.png"); thetry/exceptprints a friendly message if no plotting backend exists.
Interpretation of Results
Gold (26) stands tallest of the three medal bars — India converted finals into wins at a high rate, with silver and bronze tied at 20. Because every bar rises from zero, the 26:20 ratio is directly readable as a length ratio. The Total bar towers at 66 but conveys no new information — it is a redundant encoding of the sum; an annotation ("Total: 66") often serves better than a fourth bar. Analytical reading of any bar chart means: compare lengths, check the baseline, and ask whether every bar answers the same question.
Common Mistakes
- Truncating the y-axis to "zoom in", which visually exaggerates small differences — the cardinal sin of bar charts.
- Reaching for a pie chart: the angles for 26/20/20 are nearly indistinguishable, while bar lengths are compared trivially.
- Importing
pyplotbefore setting theAggbackend on a server, which crashes with a display error.
🎯 Viva Questions
- Why must bar charts start at zero? Length is the encoding; a cut baseline distorts every length ratio.
- Which perceptual channels are decoded most accurately? Position on a common scale, then length (Cleveland & McGill).
- Why is the Total bar special? It is the sum of the parts — a whole plotted next to its own components.
- What does the Agg backend do? Renders figures to raster files without a display — ideal for headless environments.
- When would a horizontal bar chart be better? Long category labels or many categories.
- What is Tufte's lie factor? The ratio of the effect shown in the graphic to the effect in the data; honest charts keep it ≈ 1.