Floating Point & Errors
Understanding how computers store numbers and why $0.1 + 0.2 \neq 0.3$.
🎯 Objectives
1. Introduce floating-point representation in its basic form.
2. Examine a second source of error: rounding (roundoff) error.
3. Highlight the IEEE-754 standard.
Why this matters
Numerical errors come in two forms: truncation error and rounding error.
- They are unavoidable: every finite-precision computation carries them.
- Case study: the Intel Pentium bug.
- Our job as developers: understand and reduce their impact.
Base Representations
We are familiar with base 10 representation. For example, $1234.1234$:

$$1234.1234 = 1 \times 10^3 + 2 \times 10^2 + 3 \times 10^1 + 4 \times 10^0 + 1 \times 10^{-1} + 2 \times 10^{-2} + 3 \times 10^{-3} + 4 \times 10^{-4}$$
General Base $\beta$
In a system with base $\beta$ (e.g., binary $\beta=2$, hex $\beta=16$), we represent a number as:

$$(a_n a_{n-1} \dots a_1 a_0 \,.\, b_1 b_2 b_3 \dots)_\beta = \sum_{k=0}^{n} a_k \beta^k + \sum_{k=1}^{\infty} b_k \beta^{-k}$$

where each digit satisfies $0 \le a_k, b_k < \beta$.
Conversion Algorithms
Integer to Binary (Base 10 → 2)
Repeatedly divide by 2 and keep the remainders.
Example: Convert $(11)_{10}$
- $11 / 2 = 5$ R 1 ($a_0$)
- $5 / 2 = 2$ R 1 ($a_1$)
- $2 / 2 = 1$ R 0 ($a_2$)
- $1 / 2 = 0$ R 1 ($a_3$)

Reading the remainders from last to first gives $(11)_{10} = (1011)_2$.
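The repeated-division steps above can be sketched in Python (the function name `int_to_binary` is ours, not a library routine):

```python
def int_to_binary(n):
    """Convert a non-negative integer to a binary digit string
    by repeatedly dividing by 2 and collecting the remainders."""
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        n, r = divmod(n, 2)          # quotient and remainder
        bits.append(str(r))          # a_0 is produced first, a_k last
    return "".join(reversed(bits))   # read remainders bottom-to-top

print(int_to_binary(11))  # → 1011
```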
Fraction to Binary (Base 10 → 2)
Repeatedly multiply by 2 and keep the integer part.
Example: Convert $0.625$
- $0.625 \times 2 = \mathbf{1}.25$ ($b_1 = 1$)
- $0.25 \times 2 = \mathbf{0}.5$ ($b_2 = 0$)
- $0.5 \times 2 = \mathbf{1}.0$ ($b_3 = 1$)

Reading the integer parts in order gives $(0.625)_{10} = (0.101)_2$.
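The fraction algorithm admits an equally short sketch (again, `frac_to_binary` is a name of our choosing; the bit cap guards against expansions that never terminate):

```python
def frac_to_binary(x, max_bits=12):
    """Convert a fraction 0 <= x < 1 to a binary string by repeatedly
    multiplying by 2 and keeping the integer part as the next digit."""
    bits = []
    while x > 0 and len(bits) < max_bits:
        x *= 2
        bit = int(x)           # integer part is the next digit b_k
        bits.append(str(bit))
        x -= bit               # keep only the fractional part
    return "0." + "".join(bits)

print(frac_to_binary(0.625))  # → 0.101
```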
🧠 Quick Check +10 XP
Use MATLAB commands (e.g., `dec2bin`) to check conversions:
What is the binary representation of $(5)_{10}$?
The Precision Problem
Converting fractions doesn't always terminate. Applying the multiplication algorithm above to some very simple decimals produces a repeating pattern that never ends.
The Infinite Series
For simple numbers like $\frac{1}{5} = 0.2$, we need an infinite number of binary digits:
$0.2 \rightarrow .0011\, 0011\, 0011\, \dots$
⚠️ Storing only finitely many of these bits cuts the expansion short: this is **roundoff error** in its most basic form.
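This is exactly why the title's claim holds: neither $0.1$ nor $0.2$ has a finite binary expansion, so the stored doubles are rounded before any arithmetic happens. A quick check in Python:

```python
from decimal import Decimal

# The stored values of 0.1 and 0.2 are already rounded, so their
# sum is not exactly the stored value of 0.3.
print(0.1 + 0.2)          # → 0.30000000000000004
print(0.1 + 0.2 == 0.3)   # → False

# Decimal reveals the exact binary value actually stored for 0.2:
print(Decimal(0.2))
```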
Case Study: The Intel Pentium Bug
June 1994
Intel engineers discover a division error in the FPU. Managers decide it's minor and keep it internal. Simultaneously, Dr. Thomas Nicely at Lynchburg College notices computation problems.
October 1994
After months of testing, Nicely confirms the bug is in the processor. He contacts Intel (Oct 24), but after no action, sends a public email (Oct 30).
November 1994
- Nov 1: Phar Lap Software receives Nicely's email and alerts Microsoft, Borland, Watcom.
- Nov 2: Email goes global.
- Nov 15: USC reverse-engineers the chip to expose the problem. Intel denies severity. Stock falls.
- Nov 22: CNN Moneyline interviews Intel; they claim the problem is minor.
- Nov 23: The MathWorks develops a software fix.
- Nov 24: The New York Times runs the story. Intel is still shipping flawed chips.
December 1994
IBM halts shipments. Intel admits fault and sets aside $420 million to fix/replace chips.
Types of Numerical Bugs
Roundoff Error
Occurs when digits are lost due to limited memory (e.g., storing $0.3333...$ as $0.3333$).
Truncation Error
Occurs when discrete values/finite series are used to approximate a mathematical expression (e.g., stopping a Taylor series at term 3).
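The Taylor-series example can be made concrete. Below, a 3-term expansion of $e^x$ is compared with the true value; the gap is pure truncation error, independent of how the numbers are stored:

```python
import math

def exp_taylor(x, n_terms):
    """Approximate e^x with the first n_terms of its Taylor series.
    Stopping the infinite series early causes truncation error."""
    return sum(x**k / math.factorial(k) for k in range(n_terms))

x = 1.0
approx = exp_taylor(x, 3)            # 1 + x + x^2/2 = 2.5
error = abs(math.exp(x) - approx)
print(approx, error)                 # truncation error ≈ 0.218
```

Adding more terms shrinks the truncation error, but roundoff error remains: the two error sources are independent.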
Uncertainty & Conditioning
Well-conditioned
Numerical results are insensitive to small variations in the input.
Ill-conditioned
Small variations in the input lead to drastically different numerical results.
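The contrast is easy to demonstrate. The sketch below (our own illustration, not a formal condition-number computation) perturbs the input by the same tiny amount for a well-conditioned function, $\sqrt{x}$, and for an ill-conditioned one, $\tan x$ evaluated near $\pi/2$:

```python
import math

def rel_change(f, x, dx):
    """Relative change in f(x) caused by perturbing the input by dx."""
    return abs(f(x + dx) - f(x)) / abs(f(x))

dx = 1e-6  # same tiny input perturbation for both functions

# Well-conditioned: sqrt's output moves about as little as the input
print(rel_change(math.sqrt, 2.0, dx))    # → ~2.5e-7

# Ill-conditioned: tan near pi/2 amplifies the perturbation enormously
print(rel_change(math.tan, 1.5707, dx))  # → ~1e-2
```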
Floating Point Representation
Normalized Floating-Point Form

$$x = \pm\, (0.d_1 d_2 d_3 \dots d_n)_\beta \times \beta^e,$$

where $d_1 \neq 0$ for normalization.
Toy Example: 4-Bit System
Suppose we have 3 bits for mantissa ($d_1, d_2, d_3$) and 1 bit for exponent.
| $.d_1$ | $d_2$ | $d_3$ | $e_1$ |
|:------:|:-----:|:-----:|:-----:|
| mantissa | mantissa | mantissa | exponent |
Possible values range from $0.000_2 \times 2^{-1}$ to $0.111_2 \times 2^1$. This creates discrete values on the number line.
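We can enumerate every value this toy system can represent. The sketch assumes the single exponent bit selects $e \in \{-1, +1\}$, matching the range $2^{-1}$ to $2^{1}$ stated above:

```python
# Enumerate the toy system: mantissa 0.d1d2d3 (base 2), exponent e.
# Assumption: the one exponent bit selects e = -1 or e = +1.
values = set()
for d1 in (0, 1):
    for d2 in (0, 1):
        for d3 in (0, 1):
            for e in (-1, 1):
                mantissa = d1/2 + d2/4 + d3/8   # 0.d1d2d3 in binary
                values.add(mantissa * 2**e)

print(sorted(values))  # discrete points, with uneven gaps between them
```

Notice the gaps: any real number falling between two representable points must be rounded to one of them.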
Underflow
Computations produce results too close to zero to represent; these typically flush to $0$.
Overflow
Computations too large to represent. Considered a severe error.
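Both behaviors are easy to observe in Python, which uses IEEE-754 double precision for its floats:

```python
import math
import sys

# IEEE-754 double-precision limits on this machine
print(sys.float_info.max)   # largest finite double (~1.8e308)
print(sys.float_info.min)   # smallest positive normalized double (~2.2e-308)

# Underflow: the true result 1e-400 is too close to zero, so it
# silently becomes 0.0
print(1e-200 * 1e-200)      # → 0.0

# Overflow: the true result 1e400 is too large, so it becomes inf
print(1e200 * 1e200)        # → inf
```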
🧠 Final Check +20 XP
Why do we use "Normalized" floating point numbers (where $d_1 \neq 0$)?
Interactive Playground
Experiment with number representation below. You can run MATLAB code via Octave Online or Python code directly in your browser.
Python (Client-Side)