CMPT 295
- Unit - Data Representation
- Lecture 6 – Representing fractional numbers in memory
- IEEE floating point representation – cont’d
Have you heard of that new band “1023 Megabytes”?
They’re pretty good, but they don’t have a gig just yet. 😭
Last Lecture
- Representing integral numbers in memory
- Can encode a small range of values exactly (in 1, 2, 4, 8 bytes)
- For example: We can represent the values -128 to 127 exactly in 1 byte using a signed char in C
- Representing fractional numbers in memory
- Positional notation has some advantages, but also disadvantages -> so not used!
- IEEE floating point representation: can encode a much larger range of values approximately (in 4 or 8 bytes), e.g., single precision: [$10^{-38}$ .. $10^{38}$]
- Overview of IEEE floating point representation
- Precision options (float 32-bit, double 64-bit)
- s –> sign bit
- exp encodes E (but != E)
- frac encodes M (but != M)
We interpret the bit vector (expressed in IEEE floating point encoding) stored in memory using this equation: $V = (-1)^{s} \times M \times 2^{E}$
Today’s Menu
- Representing data in memory – Most of this is review
- “Under the Hood” - Von Neumann architecture
- Bits and bytes in memory
- How to diagram memory -> Used in this course and other references
- How to represent series of bits -> In binary, in hexadecimal (conversion)
- What kind of information (data) do series of bits represent -> Encoding scheme
- Order of bytes in memory -> Endian
- Bit manipulation – bitwise operations
- Boolean algebra + Shifting
- Representing integral numbers in memory
- Unsigned and signed
- Converting, expanding and truncating
- Arithmetic operations
- Representing real numbers in memory
- IEEE floating point representation
- Floating point in C – casting, rounding, addition, …
IEEE Floating Point Representation Three “kinds” of values
We interpret the bit vector (expressed in IEEE floating point encoding) stored in memory using this equation: $V = (-1)^{s} \times M \times 2^{E}$
Bit breakdown (exp and frac interpreted as unsigned):
- s = 1 bit
- exp = k bits
- If exp != 00…00 and exp != 11…11 (exp range: [00000001 .. 11111110]) => normalized. Equations: $E = exp - bias$ and $M = 1 + frac$
- If exp = 00…00 (all 0’s) => denormalized. Equations: $E = 1 - bias$ and $M = frac$
- If exp = 11…11 (all 1’s) => special cases.
  - Case 1: frac = 000…0
  - Case 2: frac != 000…0
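The three-way split on exp can be sketched in C. This is only an illustration for single precision (k = 8); the enum and the function name `classify_single` are hypothetical names, not part of the lecture material.

```c
#include <stdint.h>

/* Hypothetical names for this sketch (not from the lecture). */
typedef enum { KIND_NORMALIZED, KIND_DENORMALIZED, KIND_SPECIAL } value_kind;

/* Classify a 32-bit single precision bit pattern by its exp field.
   Layout assumed: s (1 bit) | exp (8 bits) | frac (23 bits). */
value_kind classify_single(uint32_t bits) {
    uint32_t exp = (bits >> 23) & 0xFFu;   /* the 8 exp bits, read as unsigned */
    if (exp == 0x00u) return KIND_DENORMALIZED;  /* exp = 00...00 */
    if (exp == 0xFFu) return KIND_SPECIAL;       /* exp = 11...11 */
    return KIND_NORMALIZED;                      /* exp in [1 .. 254] */
}
```

Only the exp field decides the kind; frac matters afterwards, to tell the two special cases apart.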
IEEE floating point representation - normalized
Numerical Form: $V = (-1)^{s} \times M \times 2^{E}$
Bit breakdown:
- s = 1 bit
- exp = k bits
- If exp != 00…00 and exp != 11…11 (exp range: [00000001 .. 11111110]) => normalized. Equations: $E = exp - bias$ and $M = 1 + frac$
Why is E biased?
Using single precision as an example (s = 1 bit, exp = 8 bits, frac = 23 bits):
- exp range: [00000001 .. 11111110] => [1 .. 254]
- If E is not biased (i.e. E = exp), then
  - E range: [1 .. 254]
  - V range (taking M = 1): [$2^{1}$ .. $2^{254}$], which starts at 2 (so cannot express numbers < 2)
- By biasing E (i.e. E = exp - bias), then
  - E range: [1-127 .. 254-127] == [-126 .. 127] (since k = 8, bias = $2^{k-1} - 1$ = 127)
  - V range (taking M = 1): [$2^{-126}$ .. $2^{127}$] ≈ [$10^{-38}$ .. $10^{38}$] (so can now express very small (and very large) numbers)
- Why adding 1 to frac? Because the number (or value) V is first normalized before it is converted.
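The bias arithmetic can be checked with a few lines of C. This is a minimal sketch assuming the usual IEEE 754 definition bias = 2^(k-1) - 1; the function names are made up for illustration.

```c
/* bias = 2^(k-1) - 1, where k is the width of the exp field in bits. */
int bias_for(int k) {
    return (1 << (k - 1)) - 1;
}

/* Smallest E of a normalized value: smallest normalized exp is 1. */
int min_normalized_E(int k) {
    return 1 - bias_for(k);
}

/* Largest E of a normalized value: largest normalized exp is 2^k - 2
   (all 1's is reserved for the special cases). */
int max_normalized_E(int k) {
    return ((1 << k) - 2) - bias_for(k);
}
```

For k = 8 these give bias = 127 and the E range [-126 .. 127] quoted above.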
Review: Scientific Notation and normalization
- From Wikipedia:
- Scientific notation is a way of expressing numbers that are too large or too small to be conveniently written in decimal form (as they are long strings of digits).
- In scientific notation, nonzero numbers are written in the form $\pm M \times 10^{n}$
- In normalized notation, the exponent n is chosen such that the absolute value of the significand M is at least 1 (M ≥ 1.0) but less than the base
- M range for base 10 => [1.0 .. 10.0 – ε ]
- M range for base 2 => [1.0 .. 2.0 – ε ]
- Examples:
  - A proton’s mass is 0.0000000000000000000000000016726 kg -> $1.6726 \times 10^{-27}$ kg
  - Speed of light is 299,792,458 m/s -> $2.99792458 \times 10^{8}$ m/s
Syntax of normalized notation:
Name | Notation |
---|---|
Sign | +/- |
Significand | … |
Base | b |
Exponent | |
- Let’s try: -> ___
Let’s try normalizing these fractional binary numbers!
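Normalizing a binary number means shifting the binary point until the significand lands in [1.0 .. 2.0 – ε] and counting the shifts in the exponent. The sketch below does this on a `double` by repeated halving or doubling; the function name `normalize_base2` is hypothetical, and the code assumes a positive, finite, nonzero input.

```c
/* Normalize v > 0 so that v = M * 2^E with 1.0 <= M < 2.0.
   Returns M and writes E through the pointer.
   Assumption: v is positive, finite and nonzero. */
double normalize_base2(double v, int *E) {
    int e = 0;
    while (v >= 2.0) { v /= 2.0; e++; }  /* binary point moves left  */
    while (v <  1.0) { v *= 2.0; e--; }  /* binary point moves right */
    *E = e;
    return v;
}
```

For instance, 101111.00111 in binary (47.21875) needs five left shifts, so it normalizes to 1.0111100111 × 2^5.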
IEEE floating point representation
- Once V is normalized, we apply the equations
- $V = (-1)^{s} \times M \times 2^{E}$ where
  - s = ???
  - exp = E + bias = ___
  - M = 1 + frac = ___
- s = 1 bit, exp = k bits => 8 bits, frac = n bits => 23 bits
- bit vector in memory:
Why adding 1 to frac (or subtracting 1 from M)?
- Because the number (or value) V is first normalized before it is converted.
- As part of this normalization process, we transform our binary number such that its significand M is within the range [1.0 .. 2.0 – ε ]
- Remember: M range for base 2 => [1.0 … 2.0 – ε]
- This implies that M is always at least 1.0, so its integral part always has the value 1
- So since this bit is always part of M, IEEE 754 does not explicitly save it in its bit pattern (i.e., in memory)
- Instead, this bit is implied!
Implying this bit has the following effects:
We get the leading bit for free!
- We save 1 bit when we convert (represent) a fractional decimal number into a bit pattern using IEEE 754 floating point representation
- We have to add this 1 bit back when we convert from a bit pattern (IEEE 754 floating point representation) back to a fractional decimal
Example:
M = 1.01011010101 => M = 1 + frac
The leading 1 is implied, hence not stored in the bit pattern produced by the IEEE 754 floating point representation; what we store in the frac part of the IEEE 754 bit pattern is 01011010101 (followed by trailing 0’s to fill the 23 bits in single precision).
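Dropping and restoring the implied bit can be sketched as a pair of C helpers for single precision (23 frac bits). The function names are hypothetical, and the first helper assumes M is in [1.0 .. 2.0) and needs no more than 23 fractional bits.

```c
#include <stdint.h>

/* Storing: drop the implied leading 1 and keep the 23 fractional bits.
   Assumption: 1.0 <= M < 2.0 and M fits in 23 fractional bits. */
uint32_t frac_field_from_M(double M) {
    return (uint32_t)((M - 1.0) * 8388608.0);   /* 8388608 = 2^23 */
}

/* Reading back: scale the stored field to a fraction and add the 1 back. */
double M_from_frac_field(uint32_t frac) {
    return 1.0 + (double)frac / 8388608.0;
}
```

With M = 1.01011010101 in binary (1.35400390625 in decimal), the stored field is 01011010101000000000000, and reading it back restores the same M.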
IEEE floating point representation (single precision)
- What if the 4 bytes starting at M[0x0000] represented a fractional decimal number (encoded as an IEEE floating point number, single precision) -> value?
Numerical Form: $V = (-1)^{s} \times M \times 2^{E}$
Field | Value | Notes |
---|---|---|
s | 1 | 1 bit |
exp | 10000111 | k=8 bits, interpreted as unsigned |
frac | 01011010101000000000000 | n=23 bits, interpreted as unsigned |
- exp ≠ 0 and exp ≠ $11111111_2$ -> normalized
- s = ___
- E = exp – bias where bias = $2^{k-1} - 1 = 127$
- E = ____ - 127 = ___
- M = 1 + frac = 1 + ___
- V = ____
Little endian memory layout:
Address | M[] |
---|---|
size-1 | |
… | … |
0x0003 | 11000011 |
0x0002 | 10101101 |
0x0001 | 01010000 |
0x0000 | 00000000 |
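The decoding steps above can be sketched in C: first assemble the four bytes in little endian order, then split the 32 bits into s, exp and frac and apply the equations. The function names are hypothetical, and the decoder handles the normalized case only.

```c
#include <stdint.h>

/* Assemble 4 bytes stored little endian: mem[0] is the byte at the
   lowest address and holds the least significant bits. */
uint32_t load_le32(const uint8_t mem[4]) {
    return  (uint32_t)mem[0]
         | ((uint32_t)mem[1] << 8)
         | ((uint32_t)mem[2] << 16)
         | ((uint32_t)mem[3] << 24);
}

/* Decode a normalized single precision bit pattern.
   Assumption: exp is neither all 0's nor all 1's. */
double decode_normalized_single(uint32_t bits) {
    uint32_t s = bits >> 31;                               /* sign bit     */
    int E = (int)((bits >> 23) & 0xFFu) - 127;             /* exp - bias   */
    double M = 1.0 + (double)(bits & 0x7FFFFFu) / 8388608.0; /* 1 + frac   */
    double V = M;
    for (int i = 0; i < E; i++) V *= 2.0;                  /* M * 2^E, E > 0 */
    for (int i = 0; i > E; i--) V /= 2.0;                  /* M * 2^E, E < 0 */
    return s ? -V : V;
}
```

Passing the bytes at addresses 0x0000 through 0x0003 of the memory layout above, lowest address first, gives a value that can be compared against the blanks filled in by hand.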
Let’s give it a go!
- What if the 4 bytes starting at M[0x0000] represented a fractional decimal number (encoded as an IEEE floating point number) -> value?
Numerical form (single precision): $V = (-1)^{s} \times M \times 2^{E}$
Field | Value | Notes |
---|---|---|
s | 0 | 1 bit |
exp | 10001100 | k=8 bits, interpreted as unsigned |
frac | 11011011011000000000000 | n=23 bits, interpreted as unsigned |
- exp ≠ 0 and exp ≠ $11111111_2$ -> normalized
- s = ____
- E = exp - bias where bias = $2^{k-1} - 1 = 127$
- E = ____ - 127 = ___
- M = 1 + frac = 1 + ____
- V = ____
Little endian memory map:
Address | M[] |
---|---|
size-1 | |
… | … |
0x0003 | 01000110 |
0x0002 | 01101101 |
0x0001 | 10110000 |
0x0000 | 00000000 |
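One way to cross-check a hand-decoded value is to let the machine reinterpret the same 32 bits as a `float`. The sketch below assumes that `float` is a 32-bit IEEE 754 single precision type on the target platform (true on common systems, but not guaranteed by the C standard); the function name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float without changing any bits.
   Assumption: float is 32-bit IEEE 754 single precision. */
float bits_to_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);   /* copy the raw bytes; no value conversion */
    return f;
}
```

Reading the memory map above from address 0x0003 down to 0x0000 gives the pattern 0x466DB000, which can be passed to `bits_to_float` and printed with `printf("%f\n", ...)`.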
IEEE floating point representation (single precision)
How would 47.21875 be encoded as an IEEE floating point number?
- Convert 47.21875 to binary (using the positional notation R2B(X)) => $101111.00111_2$
- Normalize binary number: $101111.00111_2$ => $1.0111100111_2 \times 2^{5}$
- Determine …
- s = 0
- E = exp – bias where bias = $2^{k-1} - 1 = 127$
- exp = E + bias = 5 + 127 = 132 => U2B(132) => 10000100
- M = 1 + frac -> frac = M - 1 => $0.0111100111_2$, stored as 01111001110000000000000
- 32 bits organized in s|exp|frac: [0|10000100|01111001110000000000000]
- 0x423CE000
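The encoding steps above can be sketched as one C function that works by hand rather than relying on the hardware `float` type. The name `encode_normalized_single` is hypothetical, and the sketch assumes the value lies in the normalized range and fits exactly in 23 frac bits, so no rounding is attempted.

```c
#include <stdint.h>

/* Encode v as a single precision bit pattern: s | exp | frac.
   Assumptions: v is in the normalized range and needs no rounding. */
uint32_t encode_normalized_single(double v) {
    if (v == 0.0) return 0u;               /* zero is not a normalized value */
    uint32_t s = (v < 0.0) ? 1u : 0u;      /* sign bit */
    double M = s ? -v : v;
    int E = 0;
    while (M >= 2.0) { M /= 2.0; E++; }    /* normalize: 1.0 <= M < 2.0 */
    while (M <  1.0) { M *= 2.0; E--; }
    uint32_t exp  = (uint32_t)(E + 127);   /* exp = E + bias */
    uint32_t frac = (uint32_t)((M - 1.0) * 8388608.0); /* drop implied 1, keep 23 bits */
    return (s << 31) | (exp << 23) | frac;
}
```

For 47.21875 the intermediate values match the steps above: E = 5, exp = 132, frac = 01111001110000000000000, and the packed result is 0x423CE000.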
IEEE floating point representation (single precision)
How would 12345.75 be encoded as an IEEE floating point number?
- Convert 12345.75 to binary
- 12345 => ____ .75 => ____
- Normalize binary number:
- Determine …
- s = ____
- E = exp – bias where bias = $2^{k-1} - 1 = 127$
- exp = E + bias = ____
- M = 1 + frac -> frac = M - 1
- [_|________|_______________________]
- Express in hex:
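A hand-computed encoding can be checked by asking the compiler to store the value as a `float` and then reading back its raw bits. This is a minimal sketch assuming `float` is a 32-bit IEEE 754 single precision type on the target platform; the function name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit pattern of a float.
   Assumption: float is 32-bit IEEE 754 single precision. */
uint32_t float_to_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* copy the raw bytes; no value conversion */
    return bits;
}
```

Printing `float_to_bits(12345.75f)` with `printf("0x%08X\n", ...)` shows the hex form to compare with the answer worked out by hand; `float_to_bits(47.21875f)` reproduces the 0x423CE000 from the previous example.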
Summary
- IEEE Floating Point Representation
- Denormalized
- Special cases
- Normalized => exp ≠ 000…0 and exp ≠ 111…1
- Single precision: bias = 127, exp: [1..254], E: [-126..127] => [$10^{-38}$ … $10^{38}$]
- Called “normalized” because binary numbers are normalized
- Effect: “We get the leading bit for free”
- Leading bit is always assumed (never part of bit pattern)
- IEEE floating point number as encoding scheme
- Fractional decimal number <-> IEEE 754 (bit pattern)
- $V = (-1)^{s} \times M \times 2^{E}$, with bias = $2^{k-1} - 1$
- s is sign bit, M = 1 + frac, E = exp – bias, and k is width of exp
Next Lecture
- Representing data in memory – Most of this is review
- “Under the Hood” - Von Neumann architecture
- Bits and bytes in memory
- How to diagram memory -> Used in this course and other references
- How to represent series of bits -> In binary, in hexadecimal (conversion)
- What kind of information (data) do series of bits represent -> Encoding scheme
- Order of bytes in memory -> Endian
- Bit manipulation – bitwise operations
- Boolean algebra + Shifting
- Representing integral numbers in memory
- Unsigned and signed
- Converting, expanding and truncating
- Arithmetic operations
- Representing real numbers in memory
- IEEE floating point representation
- Floating point in C – casting, rounding, addition, …