10 KiB

Raw Blame History Unescape Escape

layout	math
simple	true

CMPT 295

Unit - Data Representation
- Lecture 6 – Representing fractional numbers in memory
- IEEE floating point representation – cont’d

Have you heard of that new band "1023 Megabytes"?

They're pretty good, but they don't have a gig just yet. 😭

Last Lecture

Representing integral numbers in memory
- Can encode a small range of values exactly (in 1, 2, 4, 8 bytes)
  - For example: We can represent the values -128 to 127 exactly in 1 byte using a signed char in C
Representing fractional numbers in memory
1. Positional notation has some advantages, but also disadvantages -> so not used!
2. IEEE floating point representation: can encode a much larger range of e.g., single precision: [10-38..1038] values approximately (in 4 or 8 bytes)
Overview of IEEE floating point representation
- Precision options (float 32-bit, double 64-bit)
- V = (-1)s x M x 2E
- s –> sign bit
- exp encodes E (but != E)
- frac encodes M (but != M)

We interpret the bit vector (expressed in IEEE floating point encoding) stored in memory using this equation.

Representing data in memory – Most of this is review
- “Under the Hood” - Von Neumann architecture
- Bits and bytes in memory
  - How to diagram memory -> Used in this course and other references
  - How to represent series of bits -> In binary, in hexadecimal (conversion)
  - What kind of information (data) do series of bits represent -> Encoding scheme
  - Order of bytes in memory -> Endian
- Bit manipulation – bitwise operations
  - Boolean algebra + Shifting
Representing integral numbers in memory
- Unsigned and signed
- Converting, expanding and truncating
- Arithmetic operations
Representing real numbers in memory 4
- IEEE floating point representation
- Floating point in C – casting, rounding, addition, …

IEEE Floating Point Representation Three “kinds” of values

We interpret the bit vector (expressed in IEEE floating point encoding) stored in memory using this equation:

{% katex display %} V = (-1)^{s} M 2^{E} {% endkatex %}

Bit breakdown--exp and frac interpreted as unsigned:

s = 1 bit
exp = k bits
1. If exp != 0 and exp != 11...11 (exp range: [0000001...11111110]). Equations:
- ```
E = \text{exp} - \text{bias}
```
- ```
M = 1 + \text{frac}
```
1. If exp = 00...00 (all 0's) => denormalized. Equations:
- ```
E = 1 - \text{bias}
```
- ```
M = \text{frac}
```
1. If exp 11...11 (all 1's) => special cases.
- Case 1: frac = 000...0
- Case 2: frac != 000...0

IEEE floating point representation - normalized

Numerical Form: $V = (–1)^{s} M 2^{E}$

Bit breakdown:

s = 1 bit
exp = k bits
- If exp != 0 and exp != 11...11 (exp range: [00000001...11111110]) => normalized. Equations:
  - ```
  E = \text{exp} - \text{bias}
```
- ```
M = 1 + \text{frac}
```

Why is E biased?

Using single precision as an example (s = 1 bit, exp = 8 bits, frac = 23 bits):

(exp range: [00000001 .. 11111110]) => [1_{10}...254_{10}]
If E is not biased (i.e. E = exp), then E range [1_{10} ... 254_{10}]
V range [$2^{1}...2^{254}] = \[2...\approx 2.89 \times 10^{76}$] (so cannot express numbers < 2)
By biasing E (i.e. E = exp - bias), then E range: [1-127...254-127] == [-126...127] (since k = 8, bias = $2^{8-1} - 1$ = 127)
V range: [$2^{-126}...2^{127}] = [\approx 1.18 \times 10^{-38}...\approx 1.7 \times 10^{38}$ (so can now express very small (and very large) numbers)
Why adding 1 to frac? Because the number (or value) V is first normalized before it is converted.

Review: Scientific Notation and normalization

From Wikipedia:
- Scientific notation is a way of expressing numbers that are too large or too small to be conveniently written in decimal form (as they are long strings of digits).
- In scientific notation, nonzero numbers are written in the form +/- M × 10n
- In normalized notation, the exponent n is chosen such that the absolute value of the significand M is at least 1 (M = 1.0) but less than the base
  - M range for base 10 => [1.0 .. 10.0 – ε ]
  - M range for base 2 => [1.0 .. 2.0 – ε ]
Examples:
- A proton's mass is 0.0000000000000000000000000016726 kg -> $1.6726 \times 10^{−27}$ kg
- Speed of light is 299,792,458 m/s -> $2.99792,458 \times 10^{8}$ m/s

Syntax of normalized notation:

Name	Notation
Sign	+/-
Significant	`$d_{0}, d_{-1}, d_{-2}, d_{-3}...d_{-n}`$
Base	`b`
Exponent	^{\text{exp}}

Let’s try: $101011010.101_{2}$ -> ___

Let’s try normalizing these fractional binary numbers!

```
101011010.101_{2}
```
```
0.000000001101_{2}
```
```
11000000111001_{2}
```

IEEE floating point representation

Once V is normalized, we apply the equations
- ```
V = (–1)^{s} M 2^{E} = 1.01011010101_{2} \times 2^{8}
```
- s = ???
- ```
E = \text{exp} - \text{bias}
```
- exp = E + bias = ___
- M = 1 + frac = ___
- s = 1 bit, exp = k bits => 8 bits, frac n bits => 23 bits
- bit vector in memory:

Why adding 1 to frac (or subtracting 1 from M)?

Because the number (or value) V is first normalized before it is converted.
- As part of this normalization process, we transform our binary number such that its significand M is within the range [1.0 .. 2.0 – ε ]
- Remember: M range for base 2 => [1.0 ... 2.0 – ε]
- This implies that M is always at least 1.0, so its integral part always has the value 1
- So since this bit is always part of M, IEEE 754 does not explicitly save it in its bit pattern (i.e., in memory)
- Instead, this bit is implied!

Why adding 1 to frac (or subtracting 1 from M)?

Implying this bit has the following effects:

We get the leading bit for free!

We save 1 bit when we convert (represent) a fractional decimal number into a bit pattern using IEEE 754 floating point representation
We have to add this 1 bit back when we convert from a bit pattern (IEEE 754 floating point representation) back to a fractional decimal

Example: $V = (–1)^{s} M 2^{E} = 1.01011010101 \times 2^{8}$

M = 1. 01011010101 => M = 1 + frac

This bit is implied hence not stored in the bit pattern produced by the IEEE 754 floating point representation, and what we store in the frac part of the IEEE 754 bit pattern is 01011010101

IEEE floating point representation (single precision)

What if the 4 bytes starting at M[0x0000] represented a fractional decimal number (encoded as an IEEE floating point number) -> value? single precision

Numerical Form: $V = (–1)^{s} M 2^{E}$

Value	Notes
1
10000111	k=8 bits, interpreted as unsigned
01011010101000000000000	n=23 bits, interpreted as unsigned

exp ≠ 0 and exp ≠ 111111112 -> normalized
s = ___
E = exp – bias where bias = $2^{k-1} – 1 = 2^{7} – 1 = 128 – 1 = 127
E = ____ - 127 =
M = 1 + frac = 1 + ___
V = ____

Little endian memory layout:

Address	M[]
size-1
...	...
0x0003	11000011
0x0002	10101101
0x0001	01010000
0x0000	00000000

Let’s give it a go!

What if the 4 bytes starting at M[0x0000] represented a fractional decimal number (encoded as an IEEE floating point number) -> value?

Numerical form: $V = (-1)^{s} M 2^{E}$

single precision

Value	Notes
0
10001100	k=8 bits, interpreted as unsigned
11011011011000000000000	n=23 bits, interpreted as unsigned

exp ≠ 0 and exp ≠ 111111112 -> normalized
s = ____
E = exp - bias where \text{bias} = 2^{7} - 1 = 128 - 1 = 127$$
E = ____ - 127 = ___
M = 1 + frac = 1 + ____
V = ____

Little endian memory map:

Address	M[]
size-1
...	...
0x0003	01000110
0x0002	01101101
0x0001	10110000
0x0000	00000000

IEEE floating point representation (single precision)

How would 47.21875 be encoded as IEEE floating point number?

Convert 47.28 to binary (using the positional notation R2B(X)) =>

```
47 = 101111_{2}
```
```
.21875 = .00111_{2}
```

Normalize binary number: 101111.00111 => $1.0111100111_{2} \times 2^{5}$

```
V = (–1)^{s} M 2^{E}
```

Determine …

s = 0
E = exp – bias where \text{bias} = 2^{k-1} – 1 = 2^{7} – 1 = 128 – 1 = 127$$
exp = E + bias = 5 + 127 = 132 => U2B(132) => 10000100
M = 1 + frac -> frac = M - 1 => $1.0111100111_{2} - 1 = .0111100111_{2}$

32 bits organized in s|exp|frac: [0|10000100|01111001110000000000000}
0x423CE000

IEEE floating point representation (single precision)

How would 12345.75 be encoded as IEEE floating point number?

{% katex display %} V = (-1)^{s} M 2^{E} {% endkatex %}

Convert 12345.75 to binary

12345 => ____ .75 => ____

Normalize binary number:
Determine …

s = ____
E = exp – bias where \text{bias} = 2^{k-1} – 1 = 2^{7} – 1 = 128 – 1 = 127$$
exp = E + bias = ____
M = 1 + frac -> frac = M - 1

\_\|\_\_\_\_\_\_\_\_\|\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_

Express in hex:

Summary

IEEE Floating Point Representation
1. Denormalized
2. Special cases
3. Normalized => exp ≠ 000…0 and exp ≠ 111…1
- Single precision: bias = 127, exp: [1..254], E: [-126..127] => [10-38 … 1038]
- Called “normalized” because binary numbers are normalized
  - Effect: “We get the leading bit for free”
    - Leading bit is always assumed (never part of bit pattern)
IEEE floating point number as encoding scheme
- Fractional decimal number  IEEE 754 (bit pattern)
- ```
V = (–1)^{s} M 2^{E}
```
  - s is sign bit, M = 1 + frac, E = exp – bias, \text{bias} = 2^{k-1} – 1$$ and k is width of exp

Next Lecture

Representing data in memory – Most of this is review
- “Under the Hood” - Von Neumann architecture
- Bits and bytes in memory
  - How to diagram memory -> Used in this course and other references
  - How to represent series of bits -> In binary, in hexadecimal (conversion)
  - What kind of information (data) do series of bits represent -> Encoding scheme
  - Order of bytes in memory -> Endian
- Bit manipulation – bitwise operations
  - Boolean algebra + Shifting
Representing integral numbers in memory
- Unsigned and signed
- Converting, expanding and truncating
- Arithmetic operations
Representing real numbers in memory
- IEEE floating point representation
- Floating point in C – casting, rounding, addition, …

10 KiB Raw Blame History Unescape Escape

CMPT 295

Have you heard of that new band "1023 Megabytes"?

Last Lecture

Today’s Menu

IEEE Floating Point Representation Three “kinds” of values

IEEE floating point representation - normalized

Review: Scientific Notation and normalization

Let’s try normalizing these fractional binary numbers!

IEEE floating point representation

Why adding 1 to frac (or subtracting 1 from M)?

Why adding 1 to frac (or subtracting 1 from M)?

IEEE floating point representation (single precision)

Let’s give it a go!

IEEE floating point representation (single precision)

IEEE floating point representation (single precision)

Summary

Next Lecture

10 KiB

Raw Blame History Unescape Escape