12 KiB

Raw Blame History Unescape Escape

layout	math
simple	true

CMPT 295

Unit - Data Representation

Lecture 3 – Representing integral numbers in memory - unsigned and signed

Last Lecture

Von Neumann architecture
- Architecture of most computers
- Its components: CPU, memory, input and ouput, bus
- One of its characteristics: Data and code (programs) both stored in memory
A look at memory: defined byte-addressable memory, diagram of (compressed) memory
- Word size (w): size of a series of bits (or bit vector) we manipulate, also size of machine words (see Section 2.1.2)
A look at bits in memory
- Why binary numeral system (0 and 1 -> two values) is used to represent information in memory
- Algorithm for converting binary to hexadecimal (hex)
  1. Partition bit vector into groups of 4 bits, starting from right, i.e., least significant byte (LSB)
  - If most significant “byte” (MSB) does not have 8 bits, pad it: add 0’s to its left
  1. Translate each group of 4 bits into its hex value
- What do bits represent? Encoding scheme gives meaning to bits
- Order of bytes in memory: little endian versus big endian
Bit manipulation – regardless of what bit vectors represent
- Boolean algebra: bitwise operations => AND (&), OR (|), XOR (^), NOT (~)
- Shift operations: left shift, right logical shift and right arithmetic shift
  - Logical shift: Fill x with y 0’s on left
  - Arithmetic shift: Fill x with y copies of x‘s sign bit on left
  - Sign bit: Most significant bit (MSb) before shifting occurred

NOTE: C logical operators and C bitwise (bit-level) operators behave differently! Watch out for && versus &, || versus |, …

Representing data in memory – Most of this is review
- “Under the Hood” - Von Neumann architecture
- Bits and bytes in memory
  - How to diagram memory -> Used in this course and other references
  - How to represent series of bits -> In binary, in hexadecimal (conversion)
  - What kind of information (data) do series of bits represent -> Encoding scheme
  - Order of bytes in memory -> Endian
- Bit manipulation – bitwise operations
  - Boolean algebra + Shifting
Representing integral numbers in memory
- Unsigned and signed
- Converting, expanding and truncating
- Arithmetic operations
Representing real numbers in memory
- IEEE floating point representation
- Floating point in C – casting, rounding, addition, …

Warm up exercise!

As a warm up exercise, fill in the blanks!

If the context is C (on our target machine)
- char => _____ bits/ _____ byte
- short => _____ bits/ _____ bytes
- int => _____ bits/ _____ bytes
- long => _____ bits/ _____ bytes
- float => _____ bits/ _____ bytes
- double => _____ bits/ _____ bytes
- pointer (e.g. char *) => _____ bits/ _____ bytes

Unsigned integral numbers

Remember:

Address|M[] size-1| ...| 0x0003|01000010 0x0002|01101001 0x0001|01110100 0x0000|01110011

What if the byte at M[0x0002] represented an unsigned integral number, what would be its value? x = a series of bits = bit vector w = width of bit vector
```
X = 01101001_{2}, w=8
```
Let’s apply the encoding scheme:

{% katex display %} \it \text{B2U}(X) = \sum_{i=0}^{w-1} X_{i} \times 2^i {% endkatex %}

Example: $0 \times 2^7 + 1 \times 2^6 + 1 \times 2^5 + 0 \times 2^4 + 1 \times 2^3 + 0 \times 2^2 + 1 \times 2^0 = $

For w = 8, range of possible unsigned values: [ blank ]
For any w, range of possible unsigned values: [ blank ]
Conclusion: w bits can only represent a fixed # of possible values, but these w bits represent these values exactly

B2U(X) Conversion (Encoding scheme)

Positional notation: expand and sum all terms

Decimal:

d_{i}

d_{i-1}

...|...

d_{2}

d_{1}

d_{0}

Binary:

b_{i}

b_{i-1}

...|...

b_{2}

b_{1}

b_{0}

Remember: {% katex display %} \it \text{B2U}(X) = \sum_{i=0}^{w-1} X_{i} \times 2^i {% endkatex %}

Example: $246_{10} = 2 \times 10^{2} + 4 \times 10^{1} + 6 \times 10^{0}$

Range of possible values?

If the context is C (on our target machine)
- unsigned char?
- unsigned short?
- unsigned int?
- unsigned long?

Examples of “Show your work”

U2B(X) Conversion (into 8-bit binary # => w = 8)

Method 1 - Using subtraction: subtracting decreasing power of 2 until reach 0:

Starting number: 246

```
246 – 128 = 118
```
```
118 – 64 = 54
```
```
54 – 32 = 22
```
```
22 – 16 = 6
```
```
6 – 8 = \text{nop!}
```
```
6 – 4 = 2
```
```
2 – 2 = 0
```
```
0 – 1 = \text{nop!}
```
```
246_{10} = 11110110_{2}
```

Method 2 - Using division: dividing by 2 until reach 0

Start with 246

```
246 \div 2 = 123,R=0
```
```
123 \div 2 = 61,R=1
```
```
61 \div 2 = 30,R=1
```
```
30 \div 2 = 15,R=0
```
```
15 \div 2 = 7,R=1
```
```
7 \div 2 = 3,R=1
```
```
3 \div 2 = 1,R=1
```
```
1 \div 2 = 0,R=1
```
```
246_{10} = 11110110_{2}
```

U2B(X) Conversion – A few tricks

Decimal -> binary
- Trick: When decimal number is 2n, then its binary representation is 1 followed by n zero’s
- Let’s try: if X = 32 => X = 25, then n = 5 => 100002 (w = 5) What if w = 8? Check: 1 x 24 = 32
Decimal -> hex
- Trick: When decimal number is 2n, then its hexadecimal representation is 2i followed by j zero’s, where n = i + 4j and 0 <= i <=3
- Let try: if X = 8192 => X = 213, then n = 13 and 13 = i + 4j => 1 + 4 x 3 => 0x2000. Convert 0x2000 into a binary number: Check: 2 x 163 = 2 x 4096 = 8192

Signed integral numbers

Remember:

Address|M[] size-1| ...| 0x0003|01000010 0x0002|01101001 0x0001|01110100 0x0000|01110011

What if the byte at M[0x0001] represented a signed integral number, what would be its value?
X = 111101002 w = 8
T => Two’s Complement, w => width of the bit vector (annotation: first part of equaltion [everything before the plus sign] is the "sign bit")
Let’s apply the encoding scheme:

{% katex display %} \it {B2T}(X) = -x_{w-1} \times 2^{w-1} + \sum_{i=0}^{w-2} x_{i} \times 2^{i} {% endkatex %}

Example: -1 \times 2^{7} + 1 \times 2^{6} + 1 \times 2^{5} + 1 \times 2^{4} + 0 \times 2^{3} + 1 \times 2^{2} + 0 \times 2^{1} + 0 \times 2^{0} = ?

What would be the bit pattern of the …
- Most negative value:
- Most positive value:
For w = 8, range of possible signed values: [ blank ]
For any w, range of possible signed values: [ blank ]
Conclusion: same as for unsigned integral numbers

Examples of “Show your work”

T2B(X) Conversion -> Two’s Complement

Annotation: w = 8

Method 1: If X < 0, (~(U2B(|X|)))+1

If X = -14 (and 8 bit binary #s)

```
\lvert X\rvert => \lvert -14 \vert = 
```
```
\text{U2B}(14)
```
(first symbol is a tilde) \sim(00001110_{2}) =>
```
(11110001_{2})+1
```

Binary addition:

  11110001
+ 00000001
= ????????

Method 2: If X = -14 (and 8 bit binary #s)

```
X + 2^{w} => -14 +
```
```
\text{U2B}(242)
```

Using subtraction:

```
242 - 128 = 114
```
```
114 - 64 = 50
```
```
50 – 32 = 18
```
```
18 – 16 = 2
```
```
2 – 8 = \text{nop!}
```
```
2 – 4 = \text{nop!}
```
```
2 – 2 = 0
```
```
0 – 1 = \text{nop!}
```

Properties of unsigned & signed conversions

Annotation: w = 4

X|B2U(X)|B2T(X) 0000|0|0 0001|1|1 0010|2|2 0011|3|3 0100|4|4 0101|5|5 0110|6|6 0111|7|7 1000|8|-8 1001|9|-7 1010|10|-6 1011|11|-5 1100|12|-4 1101|13|-3 1110|14|-2 1111|15|-1

Equivalence
- Both encoding schemes (B2U and B2T ) produce the same bit patterns for nonnegative values
Uniqueness
- Every bit pattern produced by these encoding schemes (B2U and B2T) represents a unique (and exact) integer value
- Each representable integer has unique bit pattern

Converting between signed & unsigned of same size (same data type)

Unsigned
- w=8
- if unsigned ux = 129₁₀
- U2T(X) = B2T(U2B(X))
- then x = ???
- Maintain Same Bit Pattern
Signed (Two’s Complement)
- w=4
- if signed (2's C) $x = -5_{10}$
- T2U(X) = B2U(T2B(X))
- then unsigned ux = ???
- Maintain Same Bit Pattern
Conclusion - Converting between unsigned and signed numbers: Both have same bit pattern, however, this bit pattern may be interpreted differently, i.e., producing a different value

Converting signed to unsigned (and back) with `$w = 4`$

Signed|Bits|Unsigned|Note 0|0000|0|All rows from 0-7 inclusive can be converted from signed to unsigned with T2U(X), and unsigned to signed with U2T(X). 1|0001|1| 2|0010|2| 3|0011|3| 4|0100|4| 5|0101|5| 6|0110|6| 7|0111|7| -8|1000|8|All rows from here to 15 inclusive can be converted to the other like so: T2U(signed + 16/$2^{4}) -> unsigned, U2T(unsigned - 16/2^{4}$) -> signed. -7|1001|9| -6|1010|10| -5|1011|11| -4|1100|12| -3|1101|13| -2|1110|14| -1|1111|15|

Visualizing the relationship between signed & unsigned

If $w = 4, 2^{4} = 16$

Signed (2's Complement) Range: TMin to TMax (0 is the center)
Unsigned range: 0 to UMax (TMax is the center)

Sign extension

Converting unsigned (or signed) of different sizes (different data types)
1. Small data type -> larger
- Sign extension
  - Unsigned: zero extension
  - Signed: sign bit extension
Conclusion: Value unchanged
Let’s try:
- Going from a data type that has a width of 3 bits (w = 3) to a data type that has a width of 5 bits (w = 5)
Unsigned: $X = 3 = 011_{2},w=3, X = 4 = 100_{2},w = 3$
- New: $X = ? = ?_{2},w=5, X = ? + ?_{2},w=5$
Signed: $X = 3 = 011_{2},w=3, X=-3 = 101_{2},w=3$
- New: $X = ? = ?_{2},w=5, x = ? = ?_{2}, w=5$

Truncation

Converting unsigned (or signed) of different sizes(different data types) 2. Large data type -> smaller
- Truncation
Conclusion: Value may be altered
- A form of overflow
Let’s try:
- Going from a data type that has a width of 5 bits (w = 5) to a data type that has a width of 3 bits (w = 3)
Unsigned: $X = 27 = 11011_{2},w = 5$
- New: $X = ? = ?_{2},w=3$
Signed: $X = -15 = 10001_{2},w=3, X = -1 = 11111_{2}, w=5$
- New: $X = ? = ?_{2}, w=3, X = ? = ?_{2}, w=3$

Summary

Interpretation of bit pattern B into either unsigned value U or signed value T
- B2U(X) and U2B(X) encoding schemes (conversion)
- B2T(X) and T2B(X) encoding schemes (conversion)
  - Signed value expressed as two’s complement => T
Conversions from unsigned <-> signed values
- U2T(X) and T2U(X) => adding or subtracting $2^{w}$
Implication in C: when converting (implicitly via promotion and explicitly via casting):
- Sign:
  - Unsigned <-> signed (of same size) -> Both have same bit pattern, however, this bit pattern may be interpreted differently
    - Can have unexpected effects -> producing a different value
- Size:
  - Small -> large (for signed, e.g., short to int and for unsigned, e.g., unsigned short to unsigned int)
    - sign extension: For unsigned -> zeros extension, for signed -> sign bit extension
    - Both yield expected result –> resulting value unchanged
  - Large -> small (e.g., unsigned int to unsigned short)
    - truncation: Unsigned/signed -> most significant bits are truncated (discarded)
    - May not yield expected results -> original value may be altered
- Both (sign and size): 1) size conversion is first done then 2) sign conversion is done

Next Lecture

Representing data in memory – Most of this is review
- “Under the Hood” - Von Neumann architecture
- Bits and bytes in memory
  - How to diagram memory -> Used in this course and other references
  - How to represent series of bits -> In binary, in hexadecimal (conversion)
  - What kind of information (data) do series of bits represent -> Encoding scheme
  - Order of bytes in memory -> Endian
- Bit manipulation – bitwise operations
  - Boolean algebra + Shifting
Representing integral numbers in memory
- Unsigned and signed
- Converting, expanding and truncating
- Arithmetic operations
Representing real numbers in memory 19
- IEEE floating point representation
- Floating point in C – casting, rounding, addition, …

12 KiB Raw Blame History Unescape Escape

CMPT 295

Last Lecture

Today’s Menu

Warm up exercise!

Unsigned integral numbers

B2U(X) Conversion (Encoding scheme)

Range of possible values?

Examples of “Show your work”

U2B(X) Conversion – A few tricks

Signed integral numbers

Remember:

Examples of “Show your work”

Properties of unsigned & signed conversions

Converting between signed & unsigned of same size (same data type)

Converting signed to unsigned (and back) with $w = 4$

Visualizing the relationship between signed & unsigned

Sign extension

Truncation

Summary

Next Lecture

12 KiB

Raw Blame History Unescape Escape

Converting signed to unsigned (and back) with `$w = 4`$