CMPT 295

Unit - Data Representation

Lecture 3 – Representing integral numbers in memory - unsigned and signed

Last Lecture

Von Neumann architecture
- Architecture of most computers
- Its components: CPU, memory, input and output, bus
- One of its characteristics: Data and code (programs) both stored in memory
A look at memory: defined byte-addressable memory, diagram of (compressed) memory
- Word size (w): size of a series of bits (or bit vector) we manipulate, also size of machine words (see Section 2.1.2)
A look at bits in memory
- Why binary numeral system (0 and 1 -> two values) is used to represent information in memory
- Algorithm for converting binary to hexadecimal (hex)
  1. Partition bit vector into groups of 4 bits, starting from right, i.e., from the least significant bit (LSb)
    - If the leftmost (most significant) group does not have 4 bits, pad it: add 0’s to its left
  2. Translate each group of 4 bits into its hex value
- What do bits represent? Encoding scheme gives meaning to bits
- Order of bytes in memory: little endian versus big endian
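The binary-to-hexadecimal algorithm recapped above can be sketched in a few lines of Python. This is only an illustration: the function name `bits_to_hex` and the choice to pass the bit vector as a string are assumptions made here, not part of the lecture.

```python
def bits_to_hex(bits):
    """Convert a string of '0'/'1' characters to hex, one group of 4 bits at a time."""
    # Step 1: pad on the left with 0s so the length is a multiple of 4.
    padded = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    # Step 2: translate each group of 4 bits into its hex digit.
    digits = "0123456789ABCDEF"
    return "0x" + "".join(digits[int(g, 2)] for g in groups)


print(bits_to_hex("01101001"))  # an 8-bit vector: two full groups
print(bits_to_hex("1101001"))   # 7 bits: the leftmost group is padded first
```

Both calls describe the same value, so they should print the same hex string.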
Bit manipulation – regardless of what bit vectors represent
- Boolean algebra: bitwise operations => AND (&), OR (|), XOR (^), NOT (~)
- Shift operations: left shift, right logical shift and right arithmetic shift
  - Logical right shift (x >> y): fill the y vacated bit positions on the left of x with 0’s
  - Arithmetic right shift (x >> y): fill the y vacated bit positions on the left of x with copies of x’s sign bit
  - Sign bit: Most significant bit (MSb) before shifting occurred
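As a rough sketch of the difference between the two right shifts, the following Python models a w-bit vector as a non-negative integer. The function names and the explicit `w` parameter are assumptions for illustration, and the code assumes 0 <= y <= w.

```python
def shift_right_logical(x, y, w):
    """Shift the w-bit vector x right by y, filling the left with 0s."""
    mask = (1 << w) - 1
    return (x & mask) >> y


def shift_right_arithmetic(x, y, w):
    """Shift the w-bit vector x right by y, filling the left with the sign bit."""
    mask = (1 << w) - 1
    x &= mask
    sign = x >> (w - 1)            # MSb before shifting
    shifted = x >> y
    if sign:
        fill = mask & ~(mask >> y)  # the y leftmost bit positions
        shifted |= fill
    return shifted


x = 0b11110100
print(format(shift_right_logical(x, 2, 8), "08b"))
print(format(shift_right_arithmetic(x, 2, 8), "08b"))
```

When the sign bit is 0, both functions return the same bit vector.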

NOTE: C logical operators and C bitwise (bit-level) operators behave differently! Watch out for && versus &, || versus |, …

Today’s Menu

Representing data in memory – Most of this is review
- “Under the Hood” - Von Neumann architecture
- Bits and bytes in memory
  - How to diagram memory -> Used in this course and other references
  - How to represent series of bits -> In binary, in hexadecimal (conversion)
  - What kind of information (data) do series of bits represent -> Encoding scheme
  - Order of bytes in memory -> Endian
- Bit manipulation – bitwise operations
  - Boolean algebra + Shifting
Representing integral numbers in memory
- Unsigned and signed
- Converting, expanding and truncating
- Arithmetic operations
Representing real numbers in memory
- IEEE floating point representation
- Floating point in C – casting, rounding, addition, …

Warm up exercise!

As a warm up exercise, fill in the blanks!

If the context is C (on our target machine)
- char => _____ bits/ _____ byte
- short => _____ bits/ _____ bytes
- int => _____ bits/ _____ bytes
- long => _____ bits/ _____ bytes
- float => _____ bits/ _____ bytes
- double => _____ bits/ _____ bytes
- pointer (e.g. char *) => _____ bits/ _____ bytes

Unsigned integral numbers

Remember:

Address	M[]
size-1
…
0x0003	01000010
0x0002	01101001
0x0001	01110100
0x0000	01110011

What if the byte at M[0x0002] represented an unsigned integral number, what would be its value?
X = a series of bits = bit vector, w = width of bit vector
$X = 01101001_{2}, w=8$
Let’s apply the encoding scheme:

$\it \text{B2U}(X) = \sum_{i=0}^{w-1} X_{i} \times 2^i$

Example: $0 \times 2^7 + 1 \times 2^6 + 1 \times 2^5 + 0 \times 2^4 + 1 \times 2^3 + 0 \times 2^2 + 0 \times 2^1 + 1 \times 2^0 =$

For w = 8, range of possible unsigned values: [ blank ]
For any w, range of possible unsigned values: [ blank ]
Conclusion: w bits can only represent a fixed # of possible values, but these w bits represent these values exactly
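The B2U encoding scheme can be written almost directly from its summation. The sketch below is one possible Python rendering; the name `b2u` and the string representation of the bit vector (most significant bit first) are choices made here for illustration.

```python
def b2u(bits):
    """Unsigned value of a bit vector given as a string, MSb first."""
    w = len(bits)
    # bits[0] is x_{w-1} and bits[w-1] is x_0, so index from the right.
    return sum(int(bits[w - 1 - i]) * 2**i for i in range(w))


x = "01101001"
print(b2u(x))
# w bits give exactly 2**w distinct bit patterns, hence 2**w values.
print(2**len(x))
```

Running it on the byte at M[0x0002] is a quick way to check the sum worked out by hand.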

B2U(X) Conversion (Encoding scheme)

Positional notation: expand and sum all terms

Decimal:

$d_{i}$	$10^{i}$
$d_{i-1}$	$10^{i-1}$
…	…
$d_{2}$	100
$d_{1}$	10
$d_{0}$	1

Binary:

$b_{i}$	$2^{i}$
$b_{i-1}$	$2^{i-1}$
…	…
$b_{2}$	4
$b_{1}$	2
$b_{0}$	1

Remember: $\it \text{B2U}(X) = \sum_{i=0}^{w-1} X_{i} \times 2^i$

Example: $246_{10} = 2 \times 10^{2} + 4 \times 10^{1} + 6 \times 10^{0}$

Range of possible values?

If the context is C (on our target machine)
- unsigned char?
- unsigned short?
- unsigned int?
- unsigned long?
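To explore the ranges, the sketch below computes [0, 2^w - 1] for a given width. The widths in the table are an assumption (typical of an LP64 machine such as x86-64 Linux); the lecture does not state them, so check them against the warm up exercise for the actual target machine.

```python
# Assumed widths in bits (LP64, e.g. x86-64 Linux); NOT stated in the lecture.
ASSUMED_WIDTHS = {
    "unsigned char": 8,
    "unsigned short": 16,
    "unsigned int": 32,
    "unsigned long": 64,
}


def unsigned_range(w):
    """Smallest and largest values representable with w unsigned bits."""
    return 0, 2**w - 1


for name, w in ASSUMED_WIDTHS.items():
    lo, hi = unsigned_range(w)
    print(f"{name:15s} w={w:2d}  [{lo}, {hi}]")
```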

Examples of “Show your work”

U2B(X) Conversion (into 8-bit binary # => w = 8)

Method 1 - Using subtraction: subtract decreasing powers of 2 until reaching 0:

Starting number: 246

$246 - 128 = 118$ -> $1 \times 2^{7}$
$118 - 64 = 54$ -> $1 \times 2^{6}$
$54 - 32 = 22$ -> $1 \times 2^{5}$
$22 - 16 = 6$ -> $1 \times 2^{4}$
$6 - 8 = \text{nop!}$ -> $0 \times 2^{3}$
$6 - 4 = 2$ -> $1 \times 2^{2}$
$2 - 2 = 0$ -> $1 \times 2^{1}$
$0 - 1 = \text{nop!}$ -> $0 \times 2^{0}$
$246_{10} = 11110110_{2}$
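Method 1 translates into a short loop over decreasing powers of 2. This is a minimal sketch; the name `u2b_subtract` and the range check are additions made here, not part of the lecture.

```python
def u2b_subtract(x, w):
    """w-bit binary string for unsigned x, by subtracting decreasing powers of 2."""
    if not 0 <= x < 2**w:
        raise ValueError("x does not fit in w unsigned bits")
    bits = []
    for i in range(w - 1, -1, -1):
        power = 2**i
        if x >= power:       # the power fits: subtract it, this bit is 1
            x -= power
            bits.append("1")
        else:                # the power does not fit: this bit is 0
            bits.append("0")
    return "".join(bits)


print(u2b_subtract(246, 8))
```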

Method 2 - Using division: divide by 2 until reaching 0, then read the remainders from last to first

Start with 246

$246 \div 2 = 123,R=0$
$123 \div 2 = 61,R=1$
$61 \div 2 = 30,R=1$
$30 \div 2 = 15,R=0$
$15 \div 2 = 7,R=1$
$7 \div 2 = 3,R=1$
$3 \div 2 = 1,R=1$
$1 \div 2 = 0,R=1$
$246_{10} = 11110110_{2}$
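Method 2 can be sketched the same way. The first remainder produced is the least significant bit, so the remainders are reversed at the end. The name `u2b_divide` is illustrative only.

```python
def u2b_divide(x, w):
    """w-bit binary string for unsigned x, by repeated division by 2."""
    if not 0 <= x < 2**w:
        raise ValueError("x does not fit in w unsigned bits")
    remainders = []
    while x > 0:
        x, r = divmod(x, 2)
        remainders.append(str(r))
    # Last remainder is the most significant bit; pad on the left up to w bits.
    return "".join(reversed(remainders)).zfill(w)


print(u2b_divide(246, 8))
```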

U2B(X) Conversion – A few tricks

Decimal -> binary
- Trick: When decimal number is $2^{n}$, then its binary representation is 1 followed by n zero’s
- Let’s try: if X = 32 => X = $2^{5}$, then n = 5 => $100000_{2}$ (w = 6). What if w = 8? Check: $1 \times 2^{5} = 32$
Decimal -> hex
- Trick: When decimal number is $2^{n}$, then its hexadecimal representation is $2^{i}$ followed by j zero’s, where n = i + 4j and 0 <= i <= 3
- Let’s try: if X = 8192 => X = $2^{13}$, then n = 13 and 13 = i + 4j => 1 + 4 x 3 => 0x2000. Convert 0x2000 into a binary number: Check: $2 \times 16^{3} = 2 \times 4096 = 8192$
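Both power-of-two tricks can be checked mechanically. In the sketch below the function names are hypothetical, and `divmod(n, 4)` is simply a convenient way to obtain j and i with n = i + 4j and 0 <= i <= 3.

```python
def pow2_to_binary(n):
    """Binary digits of 2**n: a 1 followed by n zeros."""
    return "1" + "0" * n


def pow2_to_hex(n):
    """Hex digits of 2**n: the digit 2**i followed by j zeros, where n = i + 4j."""
    j, i = divmod(n, 4)
    return "0x" + str(2**i) + "0" * j


# Check each trick by converting the digits back to an integer.
print(pow2_to_binary(5), int(pow2_to_binary(5), 2))
print(pow2_to_hex(13), int(pow2_to_hex(13), 16))
```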

Signed integral numbers

Remember:

Address	M[]
size-1
…
0x0003	01000010
0x0002	01101001
0x0001	11110100
0x0000	01110011

What if the byte at M[0x0001] represented a signed integral number, what would be its value?
$X = 11110100_{2}, w = 8$
T => Two’s Complement, w => width of the bit vector (annotation: the first term of the equation, i.e., everything before the plus sign, is the “sign bit” term)
Let’s apply the encoding scheme:

$\it {B2T}(X) = -x_{w-1} \times 2^{w-1} + \sum_{i=0}^{w-2} x_{i} \times 2^{i}$

Example: $-1 \times 2^{7} + 1 \times 2^{6} + 1 \times 2^{5} + 1 \times 2^{4} + 0 \times 2^{3} + 1 \times 2^{2} + 0 \times 2^{1} + 0 \times 2^{0} = ?$

What would be the bit pattern of the …
- Most negative value:
- Most positive value:
For w = 8, range of possible signed values: [ blank ]
For any w, range of possible signed values: [ blank ]
Conclusion: same as for unsigned integral numbers
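The B2T encoding scheme differs from B2U only in the weight of the most significant bit. A minimal Python sketch follows; the names `b2t` and `signed_range` and the string representation of the bit vector are illustrative assumptions.

```python
def b2t(bits):
    """Two's complement value of a bit vector given as a string, MSb first."""
    w = len(bits)
    sign_term = -int(bits[0]) * 2**(w - 1)   # the sign bit has negative weight
    rest = sum(int(bits[w - 1 - i]) * 2**i for i in range(w - 1))
    return sign_term + rest


def signed_range(w):
    """Most negative and most positive values representable with w signed bits."""
    return -2**(w - 1), 2**(w - 1) - 1


print(b2t("11110100"))
print(signed_range(8))
```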

Examples of “Show your work”

T2B(X) Conversion -> Two’s Complement

Annotation: w = 8

Method 1: If X < 0, (~(U2B(|X|)))+1

If X = -14 (and 8 bit binary #s)

$\lvert X\rvert => \lvert -14 \rvert =$
$\text{U2B}(14)$ =>
(first symbol is a tilde) $\sim(00001110_{2})$ =>
$(11110001_{2})+1$ =>

Binary addition:

  11110001
+ 00000001
= ????????
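Method 1 can be sketched as follows, which is one way to check the binary addition once it has been done by hand. The function name `t2b_invert_add_one` is hypothetical; the mask keeps every intermediate result within w bits.

```python
def t2b_invert_add_one(x, w):
    """w-bit two's complement pattern of x: for x < 0, invert U2B(|x|) and add 1."""
    if not -2**(w - 1) <= x < 2**(w - 1):
        raise ValueError("x does not fit in w signed bits")
    mask = (1 << w) - 1
    if x >= 0:
        pattern = x
    else:
        inverted = ~abs(x) & mask        # flip every bit of U2B(|x|)
        pattern = (inverted + 1) & mask  # add 1, discarding any carry out
    return format(pattern, f"0{w}b")


print(t2b_invert_add_one(-14, 8))
```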

Method 2: If X = -14 (and 8 bit binary #s)

$X + 2^{w} => -14 +$
$\text{U2B}(242)$ =>

Using subtraction:

$242 - 128 = 114$ -> $1 \times 2^{7}$
$114 - 64 = 50$ -> $1 \times 2^{6}$
$50 - 32 = 18$ -> $1 \times 2^{5}$
$18 - 16 = 2$ -> $1 \times 2^{4}$
$2 - 8 = \text{nop!}$ -> $0 \times 2^{3}$
$2 - 4 = \text{nop!}$ -> $0 \times 2^{2}$
$2 - 2 = 0$ -> $1 \times 2^{1}$
$0 - 1 = \text{nop!}$ -> $0 \times 2^{0}$
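Method 2 is even shorter in code: add 2^w to a negative value, then convert the result as an unsigned number. The sketch below uses a hypothetical name, `t2b_add_pow2`, and Python's built-in `format` for the unsigned-to-binary step.

```python
def t2b_add_pow2(x, w):
    """w-bit two's complement pattern of x: for x < 0, convert x + 2**w as unsigned."""
    if not -2**(w - 1) <= x < 2**(w - 1):
        raise ValueError("x does not fit in w signed bits")
    if x < 0:
        x += 2**w
    return format(x, f"0{w}b")


print(t2b_add_pow2(-14, 8))
```

Both methods should yield the same bit pattern for the same X and w.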

Properties of unsigned & signed conversions

Annotation: w = 4

X	B2U(X)	B2T(X)
0000	0	0
0001	1	1
0010	2	2
0011	3	3
0100	4	4
0101	5	5
0110	6	6
0111	7	7
1000	8	-8
1001	9	-7
1010	10	-6
1011	11	-5
1100	12	-4
1101	13	-3
1110	14	-2
1111	15	-1

Equivalence
- Both encoding schemes (B2U and B2T) use the same bit patterns for nonnegative values, i.e., when the most significant bit is 0
Uniqueness
- Every bit pattern produced by these encoding schemes (B2U and B2T) represents a unique (and exact) integer value
- Each representable integer has unique bit pattern
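For a small width, both properties can be checked exhaustively. The sketch below enumerates every w-bit pattern; the helper names are illustrative, and `b2t` is computed here as B2U minus 2^w when the sign bit is 1, which matches the table above.

```python
from itertools import product


def b2u(bits):
    return int(bits, 2)


def b2t(bits):
    w = len(bits)
    return b2u(bits) - (2**w if bits[0] == "1" else 0)


def check_properties(w):
    patterns = ["".join(p) for p in product("01", repeat=w)]
    # Equivalence: when the MSb is 0, both schemes give the same value.
    equivalent = all(b2u(p) == b2t(p) for p in patterns if p[0] == "0")
    # Uniqueness: 2**w patterns map to 2**w distinct values under each scheme.
    unique = (len({b2u(p) for p in patterns}) == 2**w
              and len({b2t(p) for p in patterns}) == 2**w)
    return equivalent, unique


print(check_properties(4))
```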

Converting between signed & unsigned of same size (same data type)

Unsigned
- w=8
- if unsigned ux = 129₁₀
- U2T(X) = B2T(U2B(X))
- then x = ???
- Maintain Same Bit Pattern
Signed (Two’s Complement)
- w=4
- if signed (2’s C) $x = -5_{10}$
- T2U(X) = B2U(T2B(X))
- then unsigned ux = ???
- Maintain Same Bit Pattern
Conclusion - Converting between unsigned and signed numbers: Both have same bit pattern, however, this bit pattern may be interpreted differently, i.e., producing a different value
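Since the bit pattern is kept, U2T and T2U reduce to adding or subtracting 2^w when the most significant bit is 1. A minimal sketch, with illustrative function names and range checks added here:

```python
def u2t(ux, w):
    """Reinterpret the w-bit pattern of unsigned ux as a two's complement value."""
    if not 0 <= ux < 2**w:
        raise ValueError("ux does not fit in w unsigned bits")
    return ux - 2**w if ux >= 2**(w - 1) else ux


def t2u(x, w):
    """Reinterpret the w-bit pattern of signed x as an unsigned value."""
    if not -2**(w - 1) <= x < 2**(w - 1):
        raise ValueError("x does not fit in w signed bits")
    return x + 2**w if x < 0 else x


print(u2t(129, 8))
print(t2u(-5, 4))
```

Applying one conversion and then the other returns the original value, because the bit pattern never changes.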

Converting signed to unsigned (and back) with $w = 4$

Signed	Bits	Unsigned	Note
0	0000	0	Rows 0 to 7 inclusive: the value is unchanged by T2U(X) (signed to unsigned) and by U2T(X) (unsigned to signed).
1	0001	1
2	0010	2
3	0011	3
4	0100	4
5	0101	5
6	0110	6
7	0111	7
-8	1000	8	Rows -8 to -1 inclusive (unsigned 8 to 15): T2U(X) = X + 16 (i.e., X + $2^{4}$) -> unsigned, U2T(X) = X - 16 (i.e., X - $2^{4}$) -> signed.
-7	1001	9
-6	1010	10
-5	1011	11
-4	1100	12
-3	1101	13
-2	1110	14
-1	1111	15

Visualizing the relationship between signed & unsigned

If $w = 4, 2^{4} = 16$

Signed (2’s Complement) Range: TMin to TMax (0 is the center)
Unsigned range: 0 to UMax (TMax is the center)

Sign extension

Converting unsigned (or signed) of different sizes (different data types)
1. Small data type -> larger
  - Sign extension
    - Unsigned: zero extension
    - Signed: sign bit extension
Conclusion: Value unchanged
Let’s try:
- Going from a data type that has a width of 3 bits (w = 3) to a data type that has a width of 5 bits (w = 5)
Unsigned: $X = 3 = 011_{2}, w = 3$ , $X = 4 = 100_{2}, w = 3$
- New: $X = ? = ?_{2}, w = 5$ , $X = ? = ?_{2}, w = 5$
Signed: $X = 3 = 011_{2}, w = 3$ , $X = -3 = 101_{2}, w = 3$
- New: $X = ? = ?_{2}, w = 5$ , $X = ? = ?_{2}, w = 5$
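Zero extension and sign bit extension can be tried out on bit strings of any width. The sketch below uses hypothetical helper names and reads the values back with small `b2u` and `b2t` helpers, to confirm that the value is unchanged.

```python
def b2u(bits):
    return int(bits, 2)


def b2t(bits):
    return int(bits, 2) - (2**len(bits) if bits[0] == "1" else 0)


def zero_extend(bits, new_w):
    """Unsigned: pad on the left with 0s."""
    return "0" * (new_w - len(bits)) + bits


def sign_extend(bits, new_w):
    """Signed: pad on the left with copies of the sign bit."""
    return bits[0] * (new_w - len(bits)) + bits


for x in ("011", "100"):
    wide = zero_extend(x, 5)
    print("unsigned", x, b2u(x), "->", wide, b2u(wide))
for x in ("011", "101"):
    wide = sign_extend(x, 5)
    print("signed  ", x, b2t(x), "->", wide, b2t(wide))
```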

Truncation

Converting unsigned (or signed) of different sizes (different data types)
1. Large data type -> smaller
  - Truncation
Conclusion: Value may be altered
- A form of overflow
Let’s try:
- Going from a data type that has a width of 5 bits (w = 5) to a data type that has a width of 3 bits (w = 3)
Unsigned: $X = 27 = 11011_{2}, w = 5$
- New: $X = ? = ?_{2}, w = 3$
Signed: $X = -15 = 10001_{2}, w = 5$ , $X = -1 = 11111_{2}, w = 5$
- New: $X = ? = ?_{2}, w = 3$ , $X = ? = ?_{2}, w = 3$
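Truncation keeps only the least significant bits, so it can be sketched as a string slice. The helper names below are illustrative; comparing the value before and after shows when truncation alters it.

```python
def b2u(bits):
    return int(bits, 2)


def b2t(bits):
    return int(bits, 2) - (2**len(bits) if bits[0] == "1" else 0)


def truncate(bits, new_w):
    """Keep the new_w least significant bits; the most significant bits are discarded."""
    if not 1 <= new_w <= len(bits):
        raise ValueError("new_w must be between 1 and the current width")
    return bits[len(bits) - new_w:]


x = "11011"
print("unsigned", b2u(x), "->", truncate(x, 3), b2u(truncate(x, 3)))
for x in ("10001", "11111"):
    print("signed  ", b2t(x), "->", truncate(x, 3), b2t(truncate(x, 3)))
```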

Summary

Interpretation of bit pattern B into either unsigned value U or signed value T
- B2U(X) and U2B(X) encoding schemes (conversion)
- B2T(X) and T2B(X) encoding schemes (conversion)
  - Signed value expressed as two’s complement => T
Conversions from unsigned <-> signed values
- U2T(X) and T2U(X) => adding or subtracting $2^{w}$
Implication in C: when converting (implicitly via promotion and explicitly via casting):
- Sign:
  - Unsigned <-> signed (of same size) -> Both have same bit pattern, however, this bit pattern may be interpreted differently
    - Can have unexpected effects -> producing a different value
- Size:
  - Small -> large (for signed, e.g., short to int and for unsigned, e.g., unsigned short to unsigned int)
    - sign extension: For unsigned -> zero extension, for signed -> sign bit extension
    - Both yield expected result –> resulting value unchanged
  - Large -> small (e.g., unsigned int to unsigned short)
    - truncation: Unsigned/signed -> most significant bits are truncated (discarded)
    - May not yield expected results -> original value may be altered
- Both (sign and size): 1) size conversion is done first, then 2) sign conversion is done
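The order matters when a negative value changes both size and sign. The sketch below models converting a signed 16-bit value to an unsigned 32-bit value; the widths 16 and 32 (e.g., short to unsigned int) are assumptions about the target machine, and the function names are hypothetical. `sign_then_size` is included only for contrast.

```python
def t2b(x, w):
    """w-bit two's complement pattern of x, as a string (MSb first)."""
    return format(x % 2**w, f"0{w}b")


def size_then_sign(sx, small_w=16, large_w=32):
    """Sign-extend first, then reinterpret the wider pattern as unsigned."""
    bits = t2b(sx, small_w)
    extended = bits[0] * (large_w - small_w) + bits
    return int(extended, 2)


def sign_then_size(sx, small_w=16, large_w=32):
    """The other order, for contrast: reinterpret as unsigned, then zero-extend."""
    bits = t2b(sx, small_w)
    extended = "0" * (large_w - small_w) + bits
    return int(extended, 2)


sx = -12345
print(size_then_sign(sx))   # the order described in the summary above
print(sign_then_size(sx))   # a different value
```

For nonnegative inputs both orders agree, since the sign bit being extended is 0.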

Next Lecture

Representing data in memory – Most of this is review
- “Under the Hood” - Von Neumann architecture
- Bits and bytes in memory
  - How to diagram memory -> Used in this course and other references
  - How to represent series of bits -> In binary, in hexadecimal (conversion)
  - What kind of information (data) do series of bits represent -> Encoding scheme
  - Order of bytes in memory -> Endian
- Bit manipulation – bitwise operations
  - Boolean algebra + Shifting
Representing integral numbers in memory
- Unsigned and signed
- Converting, expanding and truncating
- Arithmetic operations
Representing real numbers in memory
- IEEE floating point representation
- Floating point in C – casting, rounding, addition, …
