Aim
To create a DataFrame aid storing aid (toys, books, uniforms, shoes) sent by NGOs to different states, and to display (a) Books and Uniforms only and (b) Shoes only, using column-subset selection with a list of column names.
CO Mapping: CO1, CO2, CO5
Theory
Column projection — choosing which variables to look at — is the vertical twin of the row filtering practised in earlier practicals. In relational terms it is SQL's SELECT col1, col2 as opposed to WHERE. pandas offers a compact idiom: indexing a DataFrame with a list of column names, aid[["State", "Books", "Uniform"]], returns a new DataFrame containing exactly those columns, in exactly the order listed.
Two syntactic subtleties matter:
- Single brackets vs double brackets.
aid["Shoes"](one pair) returns a Series;aid[["Shoes"]](a one-element list) returns a one-column DataFrame. The distinction changes what methods are available and how the output prints — a recurring viva favourite. - Keep the identifier column. Selecting
aid[["Books", "Uniform"]]alone would produce numbers with no state attached. Including"State"in every projection keeps each row interpretable — a discipline analysts call never separating measurements from their keys.
Projection is not merely cosmetic. Narrowing a wide table to the two or three columns a question actually needs reduces memory, speeds later operations, and — most importantly — removes the temptation to eyeball irrelevant columns and drift off the analytical question.
Dataset
The 4-state, 4-item aid table built as the aid DataFrame:
| Index | State | Toys | Books | Uniform | Shoes |
|---|---|---|---|---|---|
| 0 | Delhi | 120 | 300 | 160 | 110 |
| 1 | UP | 80 | 250 | 140 | 95 |
| 2 | Haryana | 65 | 190 | 130 | 70 |
| 3 | Rajasthan | 90 | 210 | 150 | 85 |
Procedure
- Import pandas as
pd. - Build the
aidDataFrame from a column dictionary with State, Toys, Books, Uniform and Shoes; pandas assigns the default index 0–3. - Print the full DataFrame as the reference record.
- For part (a), run
aid[["State", "Books", "Uniform"]]— a three-column projection that pairs each state with its Books and Uniform counts. - For part (b), run
aid[["State", "Shoes"]]— a two-column projection isolating Shoes. - Confirm both outputs still have 4 rows: projection changes the columns, never the rows.
Interpretation of Results
Both projections keep all four states — the row count is untouched because no filtering condition was applied. Reading part (a): Books dominate the aid mix everywhere (Delhi's 300 books against 160 uniforms), and Delhi receives the most of both items while Haryana receives the least (190 books, 130 uniforms). Part (b) shows the Shoes column ranges from 70 (Haryana) to 110 (Delhi) — the same Delhi-first, Haryana-last ordering, suggesting aid volume scales with the state rather than with the item type. That is the kind of pattern projection makes visible: with only the relevant columns on screen, cross-state comparison becomes a single glance instead of a hunt through a wide table.
Common Mistakes
- Writing
aid["State", "Books"]with one pair of brackets — pandas treats the tuple as a single column label and raisesKeyError; the inner list is mandatory. - Dropping the
Statecolumn from the projection and presenting orphaned numbers that no reader can attribute to a state. - Misspelling a column name (
"Uniforms"instead of"Uniform") — pandas column selection is exact and case-sensitive, and fails withKeyError.
🎯 Viva Questions
- What is the difference between
aid["Shoes"]andaid[["Shoes"]]? The first returns a Series; the second returns a one-column DataFrame. - Does column selection change the number of rows? No — projection affects columns only; all 4 rows survive.
- What is the SQL equivalent of this operation? The
SELECTcolumn list (projection), as opposed toWHERE(row selection). - How would you select Books and Uniform using
loc?aid.loc[:, ["State", "Books", "Uniform"]]—:keeps all rows. - Can the column list reorder columns? Yes — the output follows the order of the list, not the original DataFrame order.
- How would you show only states where Shoes > 90? Combine filtering with projection:
aid.loc[aid["Shoes"] > 90, ["State", "Shoes"]].