
CMPT 295
Unit - Data Representation

Lecture 3 – Representing integral numbers in memory - unsigned and signed


Last Lecture
 Von Neumann architecture
 Architecture of most computers
 Its components: CPU, memory, input and output, bus
 One of its characteristics: Data and code (programs) both stored in memory
 A look at memory: defined byte-addressable memory, diagram of (compressed) memory

 Word size (w): size of a series of bits (or bit vector) we manipulate, also size of machine words
(see Section 2.1.2)
 A look at bits in memory
 Why binary numeral system (0 and 1 -> two values) is used to represent information in memory
 Algorithm for converting binary to hexadecimal (hex)

1. Partition bit vector into groups of 4 bits, starting from right, i.e., from the least significant bit (LSb)
 If the most significant (leftmost) group does not have 4 bits, pad it: add 0’s to its left
2. Translate each group of 4 bits into its hex value

 What do bits represent? Encoding scheme gives meaning to bits
 Order of bytes in memory: little endian versus big endian
 Bit manipulation – regardless of what bit vectors represent
 Boolean algebra: bitwise operations => AND (&), OR (|), XOR (^), NOT (~)
 Shift operations: left shift, right logical shift and right arithmetic shift


 Logical shift: Fill x with y 0’s on left
 Arithmetic shift: Fill x with y copies of x‘s sign bit on left
 Sign bit: Most significant bit (MSb) before shifting occurred

NOTE: C logical operators and C bitwise (bit-level) operators behave differently! Watch out for && versus &, || versus |, …

Today’s Menu
 Representing data in memory – Most of this is review
 “Under the Hood” - Von Neumann architecture

 Bits and bytes in memory
 How to diagram memory -> Used in this course and other references
 How to represent series of bits -> In binary, in hexadecimal (conversion)
 What kind of information (data) do series of bits represent -> Encoding scheme
 Order of bytes in memory -> Endian

 Bit manipulation – bitwise operations
 Boolean algebra + Shifting

 Representing integral numbers in memory
 Unsigned and signed
 Converting, expanding and truncating
 Arithmetic operations

 Representing real numbers in memory

 IEEE floating point representation
 Floating point in C – casting, rounding, addition, …

Warm up exercise!
As a warm up exercise, fill in the blanks!
 If the context is C (on our target machine)
 char => _____ bits/ _____ byte
 short => _____ bits/ _____ bytes
 int => _____ bits/ _____ bytes
 long => _____ bits/ _____ bytes
 float => _____ bits/ _____ bytes
 double => _____ bits/ _____ bytes
 pointer (e.g. char *) => _____ bits/ _____ bytes

Unsigned integral numbers
Remember: a series of bits => bit vector; w => width of the bit vector
 What if the byte at M[0x0002] represented an unsigned integral number, what would be its value?
 X = $01101001_2$, w = 8
 Let’s apply the encoding scheme:

$$B2U(X) = \sum_{i=0}^{w-1} x_i \cdot 2^i$$

$0 \times 2^7 + 1 \times 2^6 + 1 \times 2^5 + 0 \times 2^4 + 1 \times 2^3 + 0 \times 2^2 + 0 \times 2^1 + 1 \times 2^0$ = ______

 For w = 8, range of possible unsigned values: [ ______ ]
 For any w, range of possible unsigned values: [ ______ ]
 Conclusion: w bits can only represent a fixed # of possible values, but these w bits represent these values exactly

B2U(X) Conversion (Encoding scheme)
 Positional notation: expand and sum all terms

Digit:            $d_i$    $d_{i-1}$    •••   $d_2$   $d_1$   $d_0$
Decimal weight:   $10^i$   $10^{i-1}$   •••   100     10      1
Binary weight:    $2^i$    $2^{i-1}$    •••   4       2       1

1’s = $10^0$, 10’s = $10^1$, 100’s = $10^2$
Example: $246_{10} = 2 \times 10^2 + 4 \times 10^1 + 6 \times 10^0$

$$B2U(X) = \sum_{i=0}^{w-1} x_i \cdot 2^i$$

Range of possible values?
 If the context is C (on our target machine)
unsigned char?
unsigned short?

unsigned int?
unsigned long?


U2B(X) Conversion (into 8-bit binary # => w = 8)
Examples of “Show your work”

Method 1 - Using subtraction: subtracting decreasing powers of 2 until reaching 0
246 – 128 = 118 -> 128 = 1 x $2^7$
118 – 64 = 54 -> 64 = 1 x $2^6$
54 – 32 = 22 -> 32 = 1 x $2^5$
22 – 16 = 6 -> 16 = 1 x $2^4$
6 – 8 = nop! -> 8 = 0 x $2^3$
6 – 4 = 2 -> 4 = 1 x $2^2$
2 – 2 = 0 -> 2 = 1 x $2^1$
0 – 1 = nop! -> 1 = 0 x $2^0$
246 => $1 1 1 1 0 1 1 0_2$

Method 2 - Using division: dividing by 2 until reaching 0
246 / 2 = 123 -> R = 0
123 / 2 = 61 -> R = 1
61 / 2 = 30 -> R = 1
30 / 2 = 15 -> R = 0
15 / 2 = 7 -> R = 1
7 / 2 = 3 -> R = 1
3 / 2 = 1 -> R = 1
1 / 2 = 0 -> R = 1
Reading the remainders from last to first: 246 => $1 1 1 1 0 1 1 0_2$

U2B(X) Conversion – A few tricks
 Decimal -> binary
 Trick: When decimal number is $2^n$, then its binary representation is 1 followed by n zero’s
 Let’s try: if X = 32 => X = $2^5$, then n = 5 => $100000_2$ (w = 6)
Check: 1 x $2^5$ = 32
What if w = 8?

 Decimal -> hex
 Trick: When decimal number is $2^n$, then its hexadecimal representation is $2^i$ followed by j zero’s, where n = i + 4j and 0 <= i <= 3
 Let’s try: if X = 8192 => X = $2^{13}$, then n = 13 and 13 = i + 4j => 1 + 4 x 3 => 0x2000
Check: 2 x $16^3$ = 2 x 4096 = 8192
Convert 0x2000 into a binary number:

Signed integral numbers
Remember: T => Two’s Complement; w => width of the bit vector
 What if the byte at M[0x0001] represented a signed integral number, what would be its value?
 X = $11110100_2$, w = 8
 Let’s apply the encoding scheme ($x_{w-1}$ is the sign bit):

$$B2T(X) = -x_{w-1} \cdot 2^{w-1} + \sum_{i=0}^{w-2} x_i \cdot 2^i$$

$-1 \times 2^7 + 1 \times 2^6 + 1 \times 2^5 + 1 \times 2^4 + 0 \times 2^3 + 1 \times 2^2 + 0 \times 2^1 + 0 \times 2^0$ = ______

 What would be the bit pattern of the …
 Most negative value:
 Most positive value:
 For w = 8, range of possible signed values: [ ______ ]
 For any w, range of possible signed values: [ ______ ]
 Conclusion: same as for unsigned integral numbers

T2B(X) Conversion -> Two’s Complement (w = 8)
Examples of “Show your work”

Method 1: If X < 0, (~(U2B(|X|))) + 1
If X = -14 (and 8 bit binary #s)
1. |X| => |-14| = ______
2. U2B(14) => ______
3. ~($00001110_2$) => ______
4. ($11110001_2$) + 1 => ______
Binary addition:
  11110001
+ 00000001
  ________
Check: ______

Method 2: If X < 0, U2B(X + $2^w$)
If X = -14 (and 8 bit binary #s)
1. X + $2^w$ => -14 + ______
2. U2B(242) => ______
Using subtraction:
242 – 128 = 114 -> 1 x $2^7$
114 – 64 = 50 -> 1 x $2^6$
50 – 32 = 18 -> 1 x $2^5$
18 – 16 = 2 -> 1 x $2^4$
2 – 8 -> nop! -> 0 x $2^3$
2 – 4 -> nop! -> 0 x $2^2$
2 – 2 = 0 -> 1 x $2^1$
0 – 1 -> nop! -> 0 x $2^0$

Properties of unsigned & signed conversions
w = 4

X       B2U(X)   B2T(X)
0000       0        0
0001       1        1
0010       2        2
0011       3        3
0100       4        4
0101       5        5
0110       6        6
0111       7        7
1000       8       –8
1001       9       –7
1010      10       –6
1011      11       –5
1100      12       –4
1101      13       –3
1110      14       –2
1111      15       –1

 Equivalence
 Both encoding schemes (B2U and B2T) produce the same bit patterns for nonnegative values
 Uniqueness
 Every bit pattern produced by these encoding schemes (B2U and B2T) represents a unique (and exact) integer value
 Each representable integer has unique bit pattern

Converting between signed & unsigned of same size (same data type)
 Unsigned -> Signed (Two’s Complement), w = 8
U2T: ux -> U2B -> X -> B2T -> x (Maintain Same Bit Pattern)
If ux = $129_{10}$ then x = ______
 Signed (Two’s Complement) -> Unsigned, w = 4
T2U: x -> T2B -> X -> B2U -> ux (Maintain Same Bit Pattern)
If x = $-5_{10}$ then ux = ______
 Conclusion - Converting between unsigned and signed numbers:
Both have same bit pattern, however, this bit pattern may be interpreted differently, i.e., producing a different value

Converting signed <-> unsigned with w = 4

Signed   Bits   Unsigned
   0     0000       0
   1     0001       1
   2     0010       2
   3     0011       3
   4     0100       4
   5     0101       5
   6     0110       6
   7     0111       7
  -8     1000       8
  -7     1001       9
  -6     1010      10
  -5     1011      11
  -4     1100      12
  -3     1101      13
  -2     1110      14
  -1     1111      15

 Bit patterns 0000 to 0111: T2U(X) and U2T(X) leave the value unchanged
 Bit patterns 1000 to 1111: T2U(X) => + 16 ($+2^4$), U2T(X) => - 16 ($-2^4$)

Visualizing the relationship between signed & unsigned
If w = 4, $2^4$ = 16
[Figure: the signed (2’s complement) range, running from TMin up through –2, –1, 0 to TMax, drawn beside the unsigned range, running from 0 up through TMax and TMax + 1 to UMax – 1 and UMax.]

Sign extension
 Converting unsigned (or signed) of different sizes (different data types)
1. Small data type -> larger
 Sign extension
 Unsigned: zero extension
 Signed: sign bit extension
 Conclusion: Value unchanged
 Let’s try:
 Going from a data type that has a width of 3 bits (w = 3) to a data type that has a width of 5 bits (w = 5)
 Unsigned:
X = 3 => $011_2$ (w = 3), new X = ______ <= ______ (w = 5)
X = 4 => $100_2$ (w = 3), new X = ______ <= ______ (w = 5)
 Signed:
X = 3 => $011_2$ (w = 3), new X = ______ <= ______ (w = 5)
X = -3 => $101_2$ (w = 3), new X = ______ <= ______ (w = 5)

Truncation
 Converting unsigned (or signed) of different sizes (different data types)
2. Large data type -> smaller
 Truncation
 Conclusion: Value may be altered (a form of overflow)
 Let’s try:
 Going from a data type that has a width of 5 bits (w = 5) to a data type that has a width of 3 bits (w = 3)
 Unsigned:
X = 27 => $11011_2$ (w = 5), new X = ______ <= ______ (w = 3)
 Signed:
X = -15 => $10001_2$ (w = 5), new X = ______ <= ______ (w = 3)
X = -1 => $11111_2$ (w = 5), new X = ______ <= ______ (w = 3)

Summary
 Interpretation of bit pattern B into either unsigned value U or signed value T
 B2U(X) and U2B(X) encoding schemes (conversion)
 B2T(X) and T2B(X) encoding schemes (conversion)
 Signed value expressed as two’s complement => T

 Conversions from unsigned <-> signed values
 U2T(X) and T2U(X) => adding or subtracting $2^w$

 Implication in C: when converting (implicitly via promotion and explicitly via casting):
 Sign:
 Unsigned <-> signed (of same size) -> Both have same bit pattern, however, this bit pattern may be interpreted differently

 Can have unexpected effects -> producing a different value

 Size:
 Small -> large (for signed, e.g., short to int and for unsigned, e.g., unsigned short to unsigned int)
 sign extension: For unsigned -> zeros extension, for signed -> sign bit extension
 Both yield expected result –> resulting value unchanged

 Large -> small (e.g., unsigned int to unsigned short)
 truncation: Unsigned/signed -> most significant bits are truncated (discarded)
 May not yield expected results -> original value may be altered


 Both (sign and size): 1) size conversion is done first, then 2) sign conversion is done

Next Lecture
 Representing data in memory – Most of this is review
 “Under the Hood” - Von Neumann architecture

 Bits and bytes in memory
 How to diagram memory -> Used in this course and other references
 How to represent series of bits -> In binary, in hexadecimal (conversion)
 What kind of information (data) do series of bits represent -> Encoding scheme
 Order of bytes in memory -> Endian

 Bit manipulation – bitwise operations
 Boolean algebra + Shifting

 Representing integral numbers in memory
 Unsigned and signed
 Converting, expanding and truncating
 Arithmetic operations

 Representing real numbers in memory

 IEEE floating point representation
 Floating point in C – casting, rounding, addition, …