Siksha Sarovar

Siksha Sarovar (sikshasarovar.com) is a free educational web application that helps students in India learn programming and prepare for academic and competitive exams. The platform offers structured coding courses (C, C++, Python, Java, HTML, CSS, PHP, Power BI, AI, Machine Learning, Data Science), complete university curriculum notes for BCA/MCA students with previous year question papers, Class 10 and Class 12 CBSE/HBSE school notes, and dedicated preparation material for SSC, UPSC, Banking, Railway and other government exams. Browsing the site is completely free and requires no account. Users may optionally sign in with Google solely to save their learning progress, quiz scores and personal preferences across devices.

Unit 2: IEEE 754 Floating Point & FP Arithmetic

Lesson 7 of 17 in the free Logical Organization of Computer-II notes on Siksha Sarovar, written by Rohit Jangra.

3.1 Why Floating Point?

A fixed-point format of practical width cannot hold both 0.000004 and 4 × 10¹². Floating point stores a number as ± mantissa × 2^exponent, trading uniform precision for enormous range. IEEE 754 standardised the encoding so that conforming machines represent numbers the same way and agree on the results of the basic operations.

3.2 The IEEE 754 Formats

| Field | Single (32-bit) | Double (64-bit) |
|---|---|---|
| Sign (s) | 1 bit | 1 bit |
| Exponent (e) | 8 bits, bias 127 | 11 bits, bias 1023 |
| Fraction (f) | 23 bits | 52 bits |
| Value (normalized) | (−1)ˢ × 1.f × 2^(e−127) | (−1)ˢ × 1.f × 2^(e−1023) |

Two ideas carry all the marks:

  • Hidden bit: a normalized binary number always starts "1." — so the leading 1 is not stored, buying one free bit of precision.
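  • Biased exponent: the field stores the true exponent plus the bias (e = E + 127 in single precision, where E is the true exponent), so the stored value is always an unsigned integer. For numbers of the same sign, a larger magnitude then means a larger raw bit pattern, so comparison hardware can reuse an integer comparator. That is why bias beats 2's complement here, a classic "why" question.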
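The two ideas above can be checked on real bit patterns. The following is a minimal sketch using Python's standard `struct` module; the helper names `float_to_bits` and `split_fields` are illustrative and not part of the notes.

```python
import struct

def float_to_bits(x):
    """Return the 32-bit single-precision pattern of x as an unsigned int."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def split_fields(bits):
    """Split a 32-bit pattern into (sign, exponent field e, fraction field f)."""
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # 8-bit biased exponent
    fraction = bits & 0x7FFFFF       # 23 stored bits; the hidden 1 is absent
    return sign, exponent, fraction

# Positive floats in increasing order have increasing raw bit patterns,
# which is what lets an integer comparator order them.
values = [0.15625, 1.0, 2.5, 1024.0]
patterns = [float_to_bits(v) for v in values]
print(patterns == sorted(patterns))
print(split_fields(float_to_bits(1.0)))   # e field holds 0 + 127
```

Note that 1.0 has an all-zero fraction field: its only significant bit is the hidden one.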

3.3 Worked Encoding: −13.25 → Single Precision

  1. Binary: 13 = 1101₂ and 0.25 = 0.01₂, so 13.25 = 1101.01₂.
  2. Normalize: 1101.01 = 1.10101 × 2³ (moved the point 3 places left).
  3. Sign: negative → s = 1.
  4. Exponent: e = 3 + 127 = 130 = 1000 0010₂.
  5. Fraction: drop the hidden 1 → f = 10101 000…0 (padded to 23 bits).
  6. Assemble and group into nibbles:
1 10000010 10101000000000000000000
= 1100 0001 0101 0100 0000 0000 0000 0000
=    C    1    5    4    0    0    0    0

−13.25 = 0xC1540000.
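The six steps can be mirrored in code to check a hand encoding. This is a sketch under stated assumptions: the function name `encode_single` is hypothetical, and it handles only finite, non-zero, normalized values that fit in 23 fraction bits without rounding.

```python
import math

def encode_single(x):
    """Encode x as a 32-bit IEEE 754 single-precision pattern (no rounding)."""
    if x == 0 or math.isinf(x) or math.isnan(x):
        raise ValueError("sketch handles only finite, non-zero values")
    sign = 1 if x < 0 else 0                 # step 3: sign bit
    m, exp2 = math.frexp(abs(x))             # abs(x) = m * 2**exp2, 0.5 <= m < 1
    significand = m * 2                      # step 2: normalize to 1.f form
    true_exp = exp2 - 1
    biased = true_exp + 127                  # step 4: add the bias
    if not 1 <= biased <= 254:
        raise ValueError("outside the normalized single-precision range")
    scaled = (significand - 1) * 2 ** 23     # step 5: drop the hidden 1
    if scaled != int(scaled):
        raise ValueError("value would need rounding; not handled here")
    return (sign << 31) | (biased << 23) | int(scaled)   # step 6: assemble

print(hex(encode_single(-13.25)))
```

For −13.25 this reproduces the pattern derived by hand above.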

3.4 Worked Decoding: 0x41C80000

0x41C80000 = 0100 0001 1100 1000 0000…₂ → s = 0; e = 1000 0011 = 131 → exponent 131 − 127 = 4; f = 1001 0…0 → mantissa 1.1001₂. Value = 1.1001 × 2⁴ = 11001₂ = 25.0. ✓
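Decoding is the same procedure in reverse. A minimal sketch follows, assuming a normalized pattern (e field between 1 and 254); `decode_single` is an illustrative name.

```python
def decode_single(bits):
    """Decode a normalized 32-bit single-precision pattern to a Python float."""
    sign = bits >> 31
    e = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    if not 1 <= e <= 254:
        raise ValueError("sketch handles normalized patterns only")
    mantissa = 1 + f / 2 ** 23           # restore the hidden 1
    return (-1) ** sign * mantissa * 2.0 ** (e - 127)

print(decode_single(0x41C80000))
print(decode_single(0xC1540000))
```

Both worked examples from these notes decode back to their original values.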

3.5 Special Values (Memorise the Table)

| e field | f field | Meaning |
|---|---|---|
| all 0s | 0 | ±0 (signed zero) |
| all 0s | ≠ 0 | denormal (no hidden bit, gradual underflow) |
| 1–254 | any | normalized number |
| all 1s | 0 | ±∞ (e.g., 1.0/0.0) |
| all 1s | ≠ 0 | NaN (e.g., 0/0, √−1) |
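The table translates directly into a classifier over the e and f fields. The sketch below is for single precision only; `classify_single` and its return strings are illustrative choices.

```python
def classify_single(bits):
    """Name the class of a 32-bit single-precision pattern per the table."""
    sign = "-" if bits >> 31 else "+"
    e = (bits >> 23) & 0xFF
    f = bits & 0x7FFFFF
    if e == 0:                       # exponent field all 0s
        return sign + "0" if f == 0 else "denormal"
    if e == 0xFF:                    # exponent field all 1s
        return sign + "inf" if f == 0 else "NaN"
    return "normalized"              # e in 1..254

for pattern in (0x00000000, 0x80000000, 0x00000001, 0x7F800000, 0x7FC00000):
    print(f"{pattern:#010x} -> {classify_single(pattern)}")
```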

Ranges: single-precision normalized magnitudes span ≈1.18 × 10⁻³⁸ to ≈3.4 × 10³⁸ with ~7 decimal digits of precision; double gives ~15–16 digits.
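These figures follow from the field widths. As a rough check, the sketch below derives them from the extreme normalized patterns (e = 1 with f = 0, and e = 254 with f all ones) and from the significand width; the constant names are illustrative.

```python
import math

# Smallest normalized magnitude: e field = 1, fraction = 0.
MIN_NORMAL = 2.0 ** (1 - 127)
# Largest finite magnitude: e field = 254, fraction all ones.
MAX_NORMAL = (2 - 2.0 ** -23) * 2.0 ** (254 - 127)
# Decimal digits carried by a p-bit significand: p * log10(2),
# where p counts the hidden bit (24 for single, 53 for double).
SINGLE_DIGITS = 24 * math.log10(2)
DOUBLE_DIGITS = 53 * math.log10(2)

print(MIN_NORMAL, MAX_NORMAL)
print(SINGLE_DIGITS, DOUBLE_DIGITS)
```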

3.6 Floating-Point Addition, Step by Step

Algorithm: (1) align — shift the mantissa of the smaller exponent right until exponents match; (2) add/subtract mantissas; (3) normalize the result (shift + adjust exponent); (4) round.

Example: 2.5 + 4.75.

2.5  = 10.1    = 1.01   x 2^1
4.75 = 100.11  = 1.0011 x 2^2
Align:  1.01 x 2^1  ->  0.1010 x 2^2
Add:    0.1010 + 1.0011 = 1.1101
Result: 1.1101 x 2^2 = 111.01 = 7.25  (already normalized)
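The align/add/normalize steps can be sketched with integer significands. The assumptions here are loud ones: both operands are positive and normalized, bits shifted out during alignment are simply truncated (no rounding step), and the names `fp_add`, `to_value` and `FRAC_BITS` are hypothetical.

```python
FRAC_BITS = 23  # fraction bits after the binary point, as in single precision

def to_value(sig, exp):
    """Interpret (sig, exp) as sig * 2**(exp - FRAC_BITS)."""
    return sig * 2.0 ** (exp - FRAC_BITS)

def fp_add(sig_a, exp_a, sig_b, exp_b):
    """Add two positive normalized numbers given as (significand, true exponent)."""
    # 1. Align: make `a` the operand with the larger exponent, shift `b` right.
    if exp_a < exp_b:
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a
    sig_b >>= exp_a - exp_b          # low-order bits may be lost here
    # 2. Add the aligned significands.
    total = sig_a + sig_b
    exp = exp_a
    # 3. Normalize: a carry past the hidden-bit position means total >= 2.0.
    if total >> (FRAC_BITS + 1):
        total >>= 1
        exp += 1
    return total, exp

# 2.5 = 1.01 x 2^1 and 4.75 = 1.0011 x 2^2, significands scaled by 2**FRAC_BITS.
two_point_five = (0b101 << (FRAC_BITS - 2), 1)
four_point_seven_five = (0b10011 << (FRAC_BITS - 4), 2)
sig, exp = fp_add(*two_point_five, *four_point_seven_five)
print(to_value(sig, exp))
```

A full adder would also keep extra bits during the alignment shift so that the final rounding step has something to work with.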

**Why alignment shifts the smaller number:** shifting the larger one left would push its leading bits out of the register; shifting the smaller one right only risks losing low-order bits, which rounding then handles. Multiplication is simpler: XOR the signs, add the stored exponents and subtract one bias, multiply the mantissas, then normalize. Both stored exponents already include the bias, so their sum contains it twice; forgetting to subtract the extra bias is the standard trap.
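The bias bookkeeping in multiplication is easiest to see on the stored fields. The sketch below assumes positive, normalized single-precision operands and truncates the product instead of rounding it; `multiply_fields` is an illustrative name.

```python
BIAS = 127

def multiply_fields(e_a, f_a, e_b, f_b):
    """Multiply two positive normalized singles given as (e field, f field)."""
    sig_a = (1 << 23) | f_a          # restore the hidden 1: 24-bit significand
    sig_b = (1 << 23) | f_b
    e = e_a + e_b - BIAS             # both fields carry a bias; remove one
    product = sig_a * sig_b          # 46 fraction bits, value in [1, 4)
    if product >> 47:                # product >= 2.0: normalize
        product >>= 1
        e += 1
    sig = product >> 23              # back to 23 fraction bits (truncating)
    return e, sig & 0x7FFFFF         # drop the hidden 1 again

# 2.5 is (e=128, f=0x200000) and 4.75 is (e=129, f=0x180000).
e, f = multiply_fields(128, 0x200000, 129, 0x180000)
print(e, hex(f))   # fields of 11.875 = 1.484375 x 2^3
```

Without the `- BIAS` term the exponent field would come out as 257, which does not even fit in 8 bits.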

3.7 Rounding

Hardware typically keeps three extra bits (guard, round and sticky) beyond the mantissa during arithmetic, then applies one of the IEEE 754 rounding modes:

| Mode | Rule |
|---|---|
| Round to nearest, ties to even (default) | choose nearer value; on a tie pick the even LSB (avoids statistical drift) |
| Round toward 0 | truncate |
| Round toward +∞ / −∞ | directed (used in interval arithmetic) |
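The default mode can be sketched on an integer significand that carries some extra low-order bits. This is an illustration of the rule only, assuming at least one extra bit and ignoring the renormalization a carry may require; `round_ties_to_even` is a hypothetical helper.

```python
def round_ties_to_even(sig, extra_bits):
    """Drop `extra_bits` low bits of sig using round-to-nearest, ties-to-even."""
    kept = sig >> extra_bits                      # bits that survive
    remainder = sig & ((1 << extra_bits) - 1)     # bits being discarded
    half = 1 << (extra_bits - 1)                  # exactly halfway point
    if remainder > half:
        kept += 1                                 # nearer value is above
    elif remainder == half and kept & 1:
        kept += 1                                 # tie: move to the even LSB
    return kept

print(bin(round_ties_to_even(0b10110, 2)))  # tie, kept part is odd: rounds up
print(bin(round_ties_to_even(0b10010, 2)))  # tie, kept part is even: stays
```

Half of all ties round up and half round down, which is why this mode avoids the drift that always rounding halves upward would cause.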

🎯 Exam Focus

  1. Encode −0.375 and 18.625 in IEEE 754 single precision (show every step and the final hex).
  2. Decode the single-precision pattern 0xC2290000 to decimal.
  3. Why does IEEE 754 use a biased exponent and a hidden mantissa bit? Give both hardware justifications.
  4. List the bit patterns for +0, −∞ and NaN in single precision.
  5. Add 9.5 and 0.6875 using binary floating-point addition; show align/add/normalize/round.
  6. Differentiate overflow, underflow and denormal numbers in IEEE 754.