CMPT 295
Unit - Data Representation
Lecture 3 Representing integral numbers in memory - unsigned and signed
Last Lecture
- Von Neumann architecture
  - Architecture of most computers
  - Its components: CPU, memory, input and output, bus
  - One of its characteristics: data and code (programs) are both stored in memory
- A look at memory: defined byte-addressable memory, diagram of (compressed) memory
  - Word size (w): size of a series of bits (or bit vector) we manipulate, also size of machine words (see Section 2.1.2)
- A look at bits in memory
  - Why the binary numeral system (0 and 1 -> two values) is used to represent information in memory
  - Algorithm for converting binary to hexadecimal (hex)
    1. Partition the bit vector into groups of 4 bits, starting from the right, i.e., from the least significant bit (LSb)
       - If the leftmost (most significant) group does not have 4 bits, pad it: add 0s to its left
    2. Translate each group of 4 bits into its hex value
  - What do bits represent? Encoding scheme gives meaning to bits
  - Order of bytes in memory: little endian versus big endian
- Bit manipulation - regardless of what bit vectors represent
  - Boolean algebra: bitwise operations => AND (&), OR (|), XOR (^), NOT (~)
  - Shift operations: left shift, right logical shift and right arithmetic shift
    - Logical shift: fill x with y 0s on the left
    - Arithmetic shift: fill x with y copies of x's sign bit on the left
    - Sign bit: most significant bit (MSb) before shifting occurred
NOTE: C logical operators and C bitwise (bit-level) operators behave differently! Watch out for && versus &, || versus |, …
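To make the note above concrete, here is a minimal C sketch contrasting a logical operator with its bitwise counterpart, plus a logical right shift. The function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Logical AND: 1 when both operands are nonzero, 0 otherwise. */
int logical_and(int a, int b) { return a && b; }

/* Bitwise AND: ANDs the two operands bit by bit. */
int bitwise_and(int a, int b) { return a & b; }

/* Right shift of an unsigned value is a logical shift: 0s fill in on the left.
   Assumes y is smaller than the width of int. */
uint8_t logical_shift_right(uint8_t x, unsigned y) { return (uint8_t)(x >> y); }
```

With a = 0x0F and b = 0xF0, both operands are nonzero, so the logical AND is true, yet they share no set bits, so the bitwise AND is 0. Right-shifting a negative signed value is implementation-defined in C, although most compilers perform an arithmetic shift.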
Today's Menu
- Representing data in memory - Most of this is review
  - “Under the Hood” - Von Neumann architecture
  - Bits and bytes in memory
    - How to diagram memory -> Used in this course and other references
    - How to represent series of bits -> In binary, in hexadecimal (conversion)
    - What kind of information (data) do series of bits represent -> Encoding scheme
    - Order of bytes in memory -> Endian
  - Bit manipulation - bitwise operations
    - Boolean algebra + Shifting
- Representing integral numbers in memory
  - Unsigned and signed
  - Converting, expanding and truncating
  - Arithmetic operations
- Representing real numbers in memory
  - IEEE floating point representation
  - Floating point in C - casting, rounding, addition, …
Warm up exercise!
As a warm up exercise, fill in the blanks!
- If the context is C (on our target machine)
  - char    => _____ bits / _____ byte
  - short   => _____ bits / _____ bytes
  - int     => _____ bits / _____ bytes
  - long    => _____ bits / _____ bytes
  - float   => _____ bits / _____ bytes
  - double  => _____ bits / _____ bytes
  - pointer (e.g. char *) => _____ bits / _____ bytes
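One way to check the answers on a given machine is to ask the compiler. The sketch below is an illustration only: `print_size` and `print_type_sizes` are hypothetical helper names, and the numbers printed depend on the platform and compiler.

```c
#include <limits.h>
#include <stdio.h>

/* Prints the width of one type in bits and in bytes. */
static void print_size(const char *name, size_t bytes) {
    printf("%-8s => %zu bits / %zu bytes\n", name, bytes * CHAR_BIT, bytes);
}

/* Prints the sizes of the types from the exercise on the current machine. */
void print_type_sizes(void) {
    print_size("char", sizeof(char));
    print_size("short", sizeof(short));
    print_size("int", sizeof(int));
    print_size("long", sizeof(long));
    print_size("float", sizeof(float));
    print_size("double", sizeof(double));
    print_size("pointer", sizeof(char *));
}
```

Calling `print_type_sizes()` from `main` prints one line per type. Only `sizeof(char) == 1` is fixed by the C standard; every other size is up to the implementation.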
Unsigned integral numbers
Remember: a series of bits => bit vector; w => width of the bit vector
- What if the byte at M[0x0002] represented an unsigned integral number, what would be its value?
  - X = 01101001_2, w = 8
- Let's apply the encoding scheme:
  $B2U(X) = \sum_{i=0}^{w-1} x_i \cdot 2^i$
  0 x 2^7 + 1 x 2^6 + 1 x 2^5 + 0 x 2^4 + 1 x 2^3 + 0 x 2^2 + 0 x 2^1 + 1 x 2^0 = _____
- For w = 8, range of possible unsigned values: [ _____ ]
- For any w, range of possible unsigned values: [ _____ ]
- Conclusion: w bits can only represent a fixed # of possible values, but these w bits represent these values exactly
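The B2U encoding scheme can be sketched in C as a loop that sums x_i * 2^i. This is a minimal illustration, assuming the bit vector is given as a string of '0' and '1' characters, most significant bit first; the function name `b2u` is borrowed from the slide notation and is not a library function.

```c
#include <stddef.h>
#include <string.h>

/* B2U: interprets a string of '0'/'1' characters (most significant bit first)
   as an unsigned value by summing x_i * 2^i. Assumes at most 32 bits. */
unsigned long b2u(const char *bits) {
    size_t w = strlen(bits);
    unsigned long value = 0;
    for (size_t i = 0; i < w; i++) {
        /* bits[w - 1 - i] is x_i, the bit with weight 2^i. */
        if (bits[w - 1 - i] == '1') {
            value += 1UL << i;
        }
    }
    return value;
}
```

Calling `b2u("01101001")` evaluates the expansion written out above, which makes it a quick way to check a hand calculation.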
B2U(X) Conversion (Encoding scheme)
- Positional notation: expand and sum all terms

  Place value (base 10):  •••  10^i  10^(i-1)  •••  100  10   1
  Place value (base 2):   •••  2^i   2^(i-1)   •••  4    2    1
  Digit:                  •••  d_i   d_(i-1)   •••  d_2  d_1  d_0

  Example: 246_10 = 2 x 10^2 + 4 x 10^1 + 6 x 10^0
  (1s = 10^0, 10s = 10^1, 100s = 10^2)

  $B2U(X) = \sum_{i=0}^{w-1} x_i \cdot 2^i$
Range of possible values?
- If the context is C (on our target machine)
  - unsigned char?
  - unsigned short?
  - unsigned int?
  - unsigned long?
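The ranges asked about above can be read from `<limits.h>`, and the general formula 2^w - 1 for the largest w-bit unsigned value can be checked against them. The sketch below is an illustration; `umax_for_width` and `print_unsigned_ranges` are hypothetical helper names, and the printed values depend on the target machine.

```c
#include <limits.h>
#include <stdio.h>

/* Largest unsigned value representable in w bits: 2^w - 1.
   Assumes 1 <= w <= the width of unsigned long long (no padding bits). */
unsigned long long umax_for_width(unsigned w) {
    /* Shifting the all-ones pattern right avoids an out-of-range 1 << w. */
    return ULLONG_MAX >> (sizeof(unsigned long long) * CHAR_BIT - w);
}

/* Prints the range [0, max] of each unsigned type on the current machine. */
void print_unsigned_ranges(void) {
    printf("unsigned char:  [0, %u]\n", (unsigned)UCHAR_MAX);
    printf("unsigned short: [0, %u]\n", (unsigned)USHRT_MAX);
    printf("unsigned int:   [0, %u]\n", UINT_MAX);
    printf("unsigned long:  [0, %lu]\n", ULONG_MAX);
}
```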
Examples of “Show your work”
U2B(X) Conversion (into 8-bit binary # => w = 8)

Method 1 - Using subtraction: subtracting decreasing powers of 2 until we reach 0
246 => 246 - 128 = 118  -> 128 = 1 x 2^7
       118 - 64 = 54    -> 64 = 1 x 2^6
       54 - 32 = 22     -> 32 = 1 x 2^5
       22 - 16 = 6      -> 16 = 1 x 2^4
       6 - 8 = nop!     -> 8 = 0 x 2^3
       6 - 4 = 2        -> 4 = 1 x 2^2
       2 - 2 = 0        -> 2 = 1 x 2^1
       0 - 1 = nop!     -> 1 = 0 x 2^0
246 => 1 1 1 1 0 1 1 0_2

Method 2 - Using division: dividing by 2 until we reach 0 (read the remainders from last to first)
246 => 246 / 2 = 123  -> R = 0
       123 / 2 = 61   -> R = 1
       61 / 2 = 30    -> R = 1
       30 / 2 = 15    -> R = 0
       15 / 2 = 7     -> R = 1
       7 / 2 = 3      -> R = 1
       3 / 2 = 1      -> R = 1
       1 / 2 = 0      -> R = 1
246 => 1 1 1 1 0 1 1 0_2
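Method 2 translates almost directly into code. The following is a minimal sketch, assuming the caller supplies a buffer of at least w + 1 characters; the name `u2b` mirrors the slide notation and is not a library function.

```c
#include <stddef.h>

/* U2B by repeated division: writes the w-bit binary representation of x
   into out (most significant bit first) and null-terminates it.
   out must have room for w + 1 characters; bits above w are discarded. */
void u2b(unsigned x, size_t w, char *out) {
    for (size_t i = 0; i < w; i++) {
        /* The remainder of x / 2 is the next bit, filled in from right to left. */
        out[w - 1 - i] = (char)('0' + (x % 2));
        x /= 2;
    }
    out[w] = '\0';
}
```

For example, with `char buf[9];`, the call `u2b(246, 8, buf)` fills `buf` with the same eight bits derived by hand above.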
U2B(X) Conversion - A few tricks
- Decimal -> binary
  - Trick: When the decimal number is 2^n, its binary representation is 1 followed by n zeros
  - Let's try: if X = 32 => X = 2^5, then n = 5 => 100000_2 (w = 6)
    What if w = 8?
    Check: 1 x 2^5 = 32
- Decimal -> hex
  - Trick: When the decimal number is 2^n, its hexadecimal representation is 2^i followed by j zeros, where n = i + 4j and 0 <= i <= 3
  - Let's try: if X = 8192 => X = 2^13, then n = 13 and 13 = i + 4j => 1 + 4 x 3 => 0x2000
    Convert 0x2000 into a binary number: _____
    Check: 2 x 16^3 = 2 x 4096 = 8192
Signed integral numbers
Remember: T => Two's Complement; w => width of the bit vector
- What if the byte at M[0x0001] represented a signed integral number, what would be its value?
  - X = 11110100_2, w = 8
- Let's apply the encoding scheme:
  $B2T(X) = -x_{w-1} \cdot 2^{w-1} + \sum_{i=0}^{w-2} x_i \cdot 2^i$   (x_{w-1} is the sign bit)
  -1 x 2^7 + 1 x 2^6 + 1 x 2^5 + 1 x 2^4 + 0 x 2^3 + 1 x 2^2 + 0 x 2^1 + 0 x 2^0 = _____
- What would be the bit pattern of the …
  - Most negative value: _____
  - Most positive value: _____
- For w = 8, range of possible signed values: [ _____ ]
- For any w, range of possible signed values: [ _____ ]
- Conclusion: same as for unsigned integral numbers
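The B2T encoding scheme differs from B2U only in the weight of the most significant bit. A minimal C sketch, assuming the bit vector is given as a string of '0' and '1' characters, most significant bit first (the name `b2t` follows the slide notation and is not a library function):

```c
#include <stddef.h>
#include <string.h>

/* B2T: interprets a string of '0'/'1' characters (most significant bit first)
   as a two's complement value. Assumes 1 <= w <= 31 bits. */
long b2t(const char *bits) {
    size_t w = strlen(bits);
    long value = 0;
    /* Bits x_0 .. x_(w-2) carry the positive weights 2^i. */
    for (size_t i = 0; i + 1 < w; i++) {
        if (bits[w - 1 - i] == '1') {
            value += 1L << i;
        }
    }
    /* The sign bit x_(w-1) carries the negative weight -2^(w-1). */
    if (bits[0] == '1') {
        value -= 1L << (w - 1);
    }
    return value;
}
```

Calling `b2t("11110100")` evaluates the expansion written out above; trying `b2t` on the all-ones pattern and on a 1 followed by zeros is a quick way to explore the two extremes of the range.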
Examples of “Show your work”
T2B(X) Conversion -> Two's Complement (w = 8)

Method 1 - If X < 0, (~(U2B(|X|))) + 1
If X = -14 (and 8 bit binary #s)
1. |X| => |-14| = _____
2. U2B(14) => _____
3. ~(00001110_2) => _____
4. (11110001_2) + 1 => _____
   Binary addition:
       11110001
     + 00000001
     = ________
Check: _____

Method 2 - If X < 0, U2B(X + 2^w)
If X = -14 (and 8 bit binary #s)
1. X + 2^w => -14 + _____
2. U2B(242) => _____
   Using subtraction:
   242 - 128 = 114  -> 1 x 2^7
   114 - 64 = 50    -> 1 x 2^6
   50 - 32 = 18     -> 1 x 2^5
   18 - 16 = 2      -> 1 x 2^4
   2 - 8 -> nop!    -> 0 x 2^3
   2 - 4 -> nop!    -> 0 x 2^2
   2 - 2 = 0        -> 1 x 2^1
   0 - 1 -> nop!    -> 0 x 2^0
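Both methods should produce the same bit pattern. The sketch below encodes each one for w = 8 so they can be compared; the function names are hypothetical, and the result is returned as a `uint8_t` holding the 8-bit pattern.

```c
#include <stdint.h>

/* Method 1: for negative x, complement the bit pattern of |x| and add 1.
   Assumes -128 <= x <= 127 (w = 8). */
uint8_t t2b_method1(int x) {
    if (x >= 0) {
        return (uint8_t)x;
    }
    uint8_t magnitude = (uint8_t)(-x);      /* U2B(|x|) */
    return (uint8_t)(~magnitude + 1);       /* flip the bits, then add 1 */
}

/* Method 2: for negative x, encode x + 2^w as an unsigned value (2^8 = 256). */
uint8_t t2b_method2(int x) {
    if (x >= 0) {
        return (uint8_t)x;
    }
    return (uint8_t)(x + 256);
}
```

Looping x over -128 to 127 and comparing the two results is an easy way to confirm that the methods agree for every 8-bit value.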
Properties of unsigned & signed conversions
w = 4

X      B2U(X)  B2T(X)
0000   0       0
0001   1       1
0010   2       2
0011   3       3
0100   4       4
0101   5       5
0110   6       6
0111   7       7
1000   8       -8
1001   9       -7
1010   10      -6
1011   11      -5
1100   12      -4
1101   13      -3
1110   14      -2
1111   15      -1

- Equivalence
  - Both encoding schemes (B2U and B2T) produce the same bit patterns for nonnegative values
- Uniqueness
  - Every bit pattern produced by these encoding schemes (B2U and B2T) represents a unique (and exact) integer value
  - Each representable integer has a unique bit pattern
Converting between signed & unsigned of same size (same data type)

U2T: Unsigned ux -> U2B -> X -> B2T -> x Signed (Two's Complement)   [Maintain Same Bit Pattern]
  w = 8: if ux = 129_10 then x = _____

T2U: Signed (Two's Complement) x -> T2B -> X -> B2U -> ux Unsigned   [Maintain Same Bit Pattern]
  w = 4: if x = -5_10 then ux = _____

- Conclusion - Converting between unsigned and signed numbers:
  Both have the same bit pattern; however, this bit pattern may be interpreted differently, i.e., producing a different value
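Because the bit pattern stays the same, the two conversions reduce to adding or subtracting 2^w whenever the sign bit is set. The sketch below expresses that arithmetic directly for an arbitrary width w; the names `t2u` and `u2t` follow the slide notation, and the use of `long` with w <= 16 is an assumption made to keep the arithmetic in range.

```c
/* T2U: reinterprets a w-bit two's complement value as unsigned.
   Negative values gain 2^w; nonnegative values are unchanged.
   Assumes 1 <= w <= 16 and that x fits in w bits. */
long t2u(long x, unsigned w) {
    return x < 0 ? x + (1L << w) : x;
}

/* U2T: reinterprets a w-bit unsigned value as two's complement.
   Values at or above 2^(w-1) have the sign bit set, so they lose 2^w.
   Assumes 1 <= w <= 16 and that ux fits in w bits. */
long u2t(long ux, unsigned w) {
    return ux >= (1L << (w - 1)) ? ux - (1L << w) : ux;
}
```

Applying `u2t` and then `t2u` (or the reverse) with the same w returns the original value, which reflects the fact that the underlying bit pattern never changed.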
Converting signed <-> unsigned with w = 4

Signed  Bits  Unsigned
 0      0000   0
 1      0001   1
 2      0010   2
 3      0011   3
 4      0100   4
 5      0101   5
 6      0110   6
 7      0111   7
-8      1000   8
-7      1001   9
-6      1010  10
-5      1011  11
-4      1100  12
-3      1101  13
-2      1110  14
-1      1111  15

- T2U(X): + 16 (+2^4) for the negative signed values (-8 to -1)
- U2T(X): - 16 (-2^4) for the unsigned values 8 to 15
Visualizing the relationship between
signed & unsigned
[Figure, w = 4 (2^4 = 16): the signed (2's complement) range, running from TMin through 0 to TMax, drawn beside the unsigned range, running from 0 through TMax and TMax + 1 up to UMax = 15.]
Sign extension
- Converting unsigned (or signed) of different sizes (different data types)
  1. Small data type -> larger
  - Sign extension
    - Unsigned: zero extension
    - Signed: sign bit extension
  - Conclusion: Value unchanged
- Let's try:
  - Going from a data type that has a width of 3 bits (w = 3) to a data type that has a width of 5 bits (w = 5)
  - Unsigned:
    X = 3 => 011_2 (w = 3), new X = _____ (w = 5)
    X = 4 => 100_2 (w = 3), new X = _____ (w = 5)
  - Signed:
    X = 3 => 011_2 (w = 3), new X = _____ (w = 5)
    X = -3 => 101_2 (w = 3), new X = _____ (w = 5)
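The two kinds of extension can be sketched on raw bit patterns, which makes small widths such as w = 3 and w = 5 easy to try. This is an illustration only: `zero_extend` and `sign_extend` are hypothetical helpers that keep the pattern in the low bits of a `uint32_t`.

```c
#include <stdint.h>

/* Zero extension: keeps the from_w low-order bits; every bit above them is 0.
   Assumes 1 <= from_w <= 31. */
uint32_t zero_extend(uint32_t x, unsigned from_w) {
    return x & ((UINT32_C(1) << from_w) - 1);
}

/* Sign extension from from_w bits to to_w bits: copies of the sign bit
   fill the new high-order bits. Assumes 1 <= from_w <= to_w <= 31. */
uint32_t sign_extend(uint32_t x, unsigned from_w, unsigned to_w) {
    uint32_t from_mask = (UINT32_C(1) << from_w) - 1;
    uint32_t to_mask = (UINT32_C(1) << to_w) - 1;
    uint32_t low = x & from_mask;
    if ((low >> (from_w - 1)) & 1u) {
        low |= to_mask & ~from_mask;    /* set bits from_w .. to_w - 1 */
    }
    return low;
}
```

In C itself the same thing happens automatically when a narrower integer type is converted to a wider one of the same signedness, for example `short` to `int`.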
Truncation
- Converting unsigned (or signed) of different sizes (different data types)
  2. Large data type -> smaller
  - Truncation
  - Conclusion: Value may be altered - a form of overflow
- Let's try:
  - Going from a data type that has a width of 5 bits (w = 5) to a data type that has a width of 3 bits (w = 3)
  - Unsigned:
    X = 27 => 11011_2 (w = 5), new X = _____ (w = 3)
  - Signed:
    X = -15 => 10001_2 (w = 5), new X = _____ (w = 3)
    X = -1 => 11111_2 (w = 5), new X = _____ (w = 3)
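Truncation simply discards the high-order bits, and the remaining pattern is then interpreted in the narrower width. The sketch below shows both steps on raw bit patterns; `truncate_bits` and `as_signed` are hypothetical helper names, and the width limits are assumptions that keep the shifts in range.

```c
#include <stdint.h>

/* Truncation to w bits: keeps the w least significant bits, discards the rest.
   Assumes 1 <= w <= 30. */
uint32_t truncate_bits(uint32_t x, unsigned w) {
    return x & ((UINT32_C(1) << w) - 1);
}

/* Interprets a w-bit pattern as a two's complement value.
   Assumes 1 <= w <= 30 and that pattern fits in w bits. */
int32_t as_signed(uint32_t pattern, unsigned w) {
    uint32_t sign_weight = UINT32_C(1) << (w - 1);
    if (pattern & sign_weight) {
        /* Sign bit set: the value is pattern - 2^w. */
        return (int32_t)pattern - (int32_t)(sign_weight << 1);
    }
    return (int32_t)pattern;
}
```

Comparing `as_signed(truncate_bits(p, 3), 3)` with `as_signed(p, 5)` for a 5-bit pattern p shows which signed values survive truncation and which are altered.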
Summary
- Interpretation of bit pattern B into either unsigned value U or signed value T
  - B2U(X) and U2B(X) encoding schemes (conversion)
  - B2T(X) and T2B(X) encoding schemes (conversion)
    - Signed value expressed as two's complement => T
- Conversions from unsigned <-> signed values
  - U2T(X) and T2U(X) => adding or subtracting 2^w
- Implication in C: when converting (implicitly via promotion and explicitly via casting):
  - Sign:
    - Unsigned <-> signed (of same size) -> Both have the same bit pattern; however, this bit pattern may be interpreted differently
    - Can have unexpected effects -> producing a different value
  - Size:
    - Small -> large (for signed, e.g., short to int and for unsigned, e.g., unsigned short to unsigned int)
      - sign extension: For unsigned -> zero extension, for signed -> sign bit extension
      - Both yield expected result -> resulting value unchanged
    - Large -> small (e.g., unsigned int to unsigned short)
      - truncation: Unsigned/signed -> most significant bits are truncated (discarded)
      - May not yield expected results -> original value may be altered
  - Both (sign and size): 1) size conversion is done first, then 2) sign conversion is done
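The last point, size first and then sign, can be observed in C by converting a negative `short` to `unsigned int`. The sketch below is an illustration that assumes `int` is wider than `short` and that the machine uses two's complement; the function names are hypothetical.

```c
/* short -> unsigned int in a single conversion, as C performs it. */
unsigned int short_to_unsigned(short sx) {
    return (unsigned int)sx;
}

/* The same conversion spelled out in two explicit steps. */
unsigned int size_then_sign(short sx) {
    int widened = (int)sx;              /* 1) size: sign extension */
    return (unsigned int)widened;       /* 2) sign: reinterpret as unsigned */
}

/* The opposite order, for comparison: sign first, then size (zero extension). */
unsigned int sign_then_size(short sx) {
    unsigned short usx = (unsigned short)sx;
    return (unsigned int)usx;
}
```

For a negative `sx`, `short_to_unsigned` and `size_then_sign` agree, while `sign_then_size` produces a smaller value because the high-order bits are filled with 0s instead of copies of the sign bit.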
Next Lecture
- Representing data in memory - Most of this is review
  - “Under the Hood” - Von Neumann architecture
  - Bits and bytes in memory
    - How to diagram memory -> Used in this course and other references
    - How to represent series of bits -> In binary, in hexadecimal (conversion)
    - What kind of information (data) do series of bits represent -> Encoding scheme
    - Order of bytes in memory -> Endian
  - Bit manipulation - bitwise operations
    - Boolean algebra + Shifting
- Representing integral numbers in memory
  - Unsigned and signed
  - Converting, expanding and truncating
  - Arithmetic operations
- Representing real numbers in memory
  - IEEE floating point representation
  - Floating point in C - casting, rounding, addition, …