CMPT 295
Unit - Data Representation
Lecture 3 Representing integral numbers in memory - unsigned and signed
Last Lecture
- Von Neumann architecture
  - Architecture of most computers
  - Its components: CPU, memory, input and output, bus
  - One of its characteristics: Data and code (programs) both stored in memory
- A look at memory: defined byte-addressable memory, diagram of (compressed) memory
  - Word size (w): size of a series of bits (or bit vector) we manipulate, also size of machine words (see Section 2.1.2)
- A look at bits in memory
  - Why binary numeral system (0 and 1 -> two values) is used to represent information in memory
  - Algorithm for converting binary to hexadecimal (hex)
    1. Partition bit vector into groups of 4 bits, starting from right, i.e., from the least significant bit (LSb)
       - If the leftmost (most significant) group does not have 4 bits, pad it: add 0s to its left
    2. Translate each group of 4 bits into its hex value
  - What do bits represent? Encoding scheme gives meaning to bits
  - Order of bytes in memory: little endian versus big endian
- Bit manipulation - regardless of what bit vectors represent
  - Boolean algebra: bitwise operations => AND (&), OR (|), XOR (^), NOT (~)
  - Shift operations: left shift, right logical shift and right arithmetic shift
    - Logical shift: Fill x with y 0s on left
    - Arithmetic shift: Fill x with y copies of x's sign bit on left
    - Sign bit: Most significant bit (MSb) before shifting occurred
NOTE: C logical operators and C bitwise (bit-level) operators behave differently! Watch out for && versus &, || versus |, …
Today's Menu
- Representing data in memory - Most of this is review
  - “Under the Hood” - Von Neumann architecture
  - Bits and bytes in memory
    - How to diagram memory -> Used in this course and other references
    - How to represent series of bits -> In binary, in hexadecimal (conversion)
    - What kind of information (data) do series of bits represent -> Encoding scheme
    - Order of bytes in memory -> Endian
  - Bit manipulation - bitwise operations
    - Boolean algebra + Shifting
- Representing integral numbers in memory
  - Unsigned and signed
  - Converting, expanding and truncating
  - Arithmetic operations
- Representing real numbers in memory
  - IEEE floating point representation
  - Floating point in C - casting, rounding, addition, …
Warm up exercise!
As a warm up exercise, fill in the blanks!
- If the context is C (on our target machine)
  - char                  => _____ bits/ _____ byte
  - short                 => _____ bits/ _____ bytes
  - int                   => _____ bits/ _____ bytes
  - long                  => _____ bits/ _____ bytes
  - float                 => _____ bits/ _____ bytes
  - double                => _____ bits/ _____ bytes
  - pointer (e.g. char *) => _____ bits/ _____ bytes
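One way to check the filled-in blanks on a given machine is to ask the compiler. The sketch below is not part of the slides: the helper names `bits_of` and `print_type_sizes` are made up for illustration, and the numbers it prints depend on the compiler and target machine.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: converts a size in bytes to a size in bits. */
static size_t bits_of(size_t bytes)
{
    assert(bytes <= (size_t)-1 / CHAR_BIT);  /* the product must not wrap around */
    return bytes * CHAR_BIT;                 /* CHAR_BIT is the number of bits per byte */
}

/* Hypothetical helper: prints one "type => bits / bytes" line per type in the exercise. */
static void print_type_sizes(void)
{
    printf("char    => %zu bits / %zu bytes\n", bits_of(sizeof(char)), sizeof(char));
    printf("short   => %zu bits / %zu bytes\n", bits_of(sizeof(short)), sizeof(short));
    printf("int     => %zu bits / %zu bytes\n", bits_of(sizeof(int)), sizeof(int));
    printf("long    => %zu bits / %zu bytes\n", bits_of(sizeof(long)), sizeof(long));
    printf("float   => %zu bits / %zu bytes\n", bits_of(sizeof(float)), sizeof(float));
    printf("double  => %zu bits / %zu bytes\n", bits_of(sizeof(double)), sizeof(double));
    printf("char *  => %zu bits / %zu bytes\n", bits_of(sizeof(char *)), sizeof(char *));
}
```

Calling `print_type_sizes()` from `main` prints the table. Only `sizeof(char) == 1` is guaranteed by the C standard; every other line may differ between machines.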
Unsigned integral numbers
- What if the byte at M[0x0002] represented an unsigned integral number, what would be its value?
- X = 01101001_2, w = 8
  Remember: a series of bits => bit vector, w => width of the bit vector
- Let's apply the encoding scheme:
  $B2U(X) = \sum_{i=0}^{w-1} x_i 2^i$
  0 x 2^7 + 1 x 2^6 + 1 x 2^5 + 0 x 2^4 + 1 x 2^3 + 0 x 2^2 + 0 x 2^1 + 1 x 2^0 =
- For w = 8, range of possible unsigned values: [          ]
- For any w, range of possible unsigned values: [          ]
- Conclusion: w bits can only represent a fixed # of possible values, but these w bits represent these values exactly
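The B2U sum can be evaluated mechanically, which is handy for checking a hand computation. The following is a minimal sketch, assuming the bit vector is passed in the low-order w bits of a `uint64_t`; the function names `b2u` and `umax` are illustrative and do not come from the slides.

```c
#include <assert.h>
#include <stdint.h>

/* B2U: sums x_i * 2^i over the low-order w bits of `bits`. */
static uint64_t b2u(uint64_t bits, unsigned w)
{
    assert(w >= 1 && w <= 64);
    uint64_t value = 0;
    for (unsigned i = 0; i < w; i++) {
        uint64_t x_i = (bits >> i) & 1u;  /* bit i of the bit vector */
        value += x_i << i;                /* x_i * 2^i */
    }
    return value;
}

/* Largest unsigned value that fits in w bits: 2^w - 1. */
static uint64_t umax(unsigned w)
{
    assert(w >= 1 && w <= 64);
    return (w == 64) ? UINT64_MAX : (UINT64_C(1) << w) - 1;
}
```

For the slide's example, `b2u(0x69, 8)` evaluates the byte 01101001_2, and `umax(8)` gives the upper end of the range for w = 8.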
B2U(X) Conversion (Encoding scheme)
- Positional notation: expand and sum all terms

  Decimal place values:  •••  10^i  10^(i-1)  •••  100  10   1
  Binary place values:   •••  2^i   2^(i-1)   •••  4    2    1
  Digits:                •••  d_i   d_(i-1)   •••  d_2  d_1  d_0

  1s = 10^0, 10s = 10^1, 100s = 10^2
  Example: 246_10 = 2 x 10^2 + 4 x 10^1 + 6 x 10^0

  $B2U(X) = \sum_{i=0}^{w-1} x_i 2^i$
Range of possible values?
- If the context is C (on our target machine)
  unsigned char?
  unsigned short?
  unsigned int?
  unsigned long?
Examples of “Show your work”
U2B(X) Conversion (into 8-bit binary # => w = 8)

Method 1 - Using subtraction: subtracting decreasing power of 2 until reach 0
246 => 246 - 128 = 118  -> 128 = 1 x 2^7
       118 - 64 = 54    -> 64 = 1 x 2^6
       54 - 32 = 22     -> 32 = 1 x 2^5
       22 - 16 = 6      -> 16 = 1 x 2^4
       6 - 8 = nop!     -> 8 = 0 x 2^3
       6 - 4 = 2        -> 4 = 1 x 2^2
       2 - 2 = 0        -> 2 = 1 x 2^1
       0 - 1 = nop!     -> 1 = 0 x 2^0
246 => 1 1 1 1 0 1 1 0_2

Method 2 - Using division: dividing by 2 until reach 0
246 => 246 / 2 = 123  -> R = 0
       123 / 2 = 61   -> R = 1
       61 / 2 = 30    -> R = 1
       30 / 2 = 15    -> R = 0
       15 / 2 = 7     -> R = 1
       7 / 2 = 3      -> R = 1
       3 / 2 = 1      -> R = 1
       1 / 2 = 0      -> R = 1
246 => 1 1 1 1 0 1 1 0_2 (remainders read from last to first)
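Both hand methods translate directly into loops. The sketch below is one possible rendering, not code from the course: the function names are invented, the result is written as a string of '0'/'1' characters, and the caller is assumed to supply a buffer of at least w + 1 characters.

```c
#include <assert.h>
#include <stdint.h>

/* Method 1: subtract decreasing powers of 2, most significant bit first. */
static void u2b_subtract(uint32_t x, unsigned w, char *out)
{
    assert(w >= 1 && w <= 32);
    assert(w == 32 || x < (UINT32_C(1) << w));        /* x must fit in w bits */
    for (unsigned i = 0; i < w; i++) {
        uint32_t power = UINT32_C(1) << (w - 1 - i);  /* 2^(w-1), ..., 2^1, 2^0 */
        if (x >= power) {   /* subtraction is possible: this bit is 1 */
            x -= power;
            out[i] = '1';
        } else {            /* "nop!": this bit is 0 */
            out[i] = '0';
        }
    }
    out[w] = '\0';
}

/* Method 2: divide by 2 repeatedly; each remainder is the next bit from the right. */
static void u2b_divide(uint32_t x, unsigned w, char *out)
{
    assert(w >= 1 && w <= 32);
    assert(w == 32 || x < (UINT32_C(1) << w));        /* x must fit in w bits */
    for (unsigned i = 0; i < w; i++) {
        out[w - 1 - i] = (char)('0' + (x % 2));       /* remainder R */
        x /= 2;                                       /* quotient feeds the next step */
    }
    out[w] = '\0';
}
```

With `char buf[9];`, both `u2b_subtract(246, 8, buf)` and `u2b_divide(246, 8, buf)` fill `buf` with the same 8-character bit string as the worked example.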
U2B(X) Conversion - A few tricks
- Decimal -> binary
  - Trick: When decimal number is 2^n, then its binary representation is 1 followed by n zeros
  - Let's try: if X = 32 => X = 2^5, then n = 5 => 100000_2 (w = 6)
    Check: 1 x 2^5 = 32
    What if w = 8?
- Decimal -> hex
  - Trick: When decimal number is 2^n, then its hexadecimal representation is 2^i followed by j zeros, where n = i + 4j and 0 <= i <= 3
  - Let's try: if X = 8192 => X = 2^13, then n = 13 and 13 = i + 4j => 1 + 4 x 3 => 0x2000
    Check: 2 x 16^3 = 2 x 4096 = 8192
    Convert 0x2000 into a binary number:
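The two power-of-two tricks can be written down as string builders. This is only an illustrative sketch: `pow2_to_binary` and `pow2_to_hex` are hypothetical names, and each assumes the caller passes a buffer large enough for the result (n + 2 characters for binary, n/4 + 4 for hex).

```c
#include <assert.h>
#include <stddef.h>

/* Binary form of 2^n: a 1 followed by n zeros. */
static void pow2_to_binary(unsigned n, char *out)
{
    assert(out != NULL);
    out[0] = '1';
    for (unsigned k = 0; k < n; k++) {
        out[1 + k] = '0';
    }
    out[1 + n] = '\0';
}

/* Hex form of 2^n: digit 2^i followed by j zeros, where n = i + 4j and 0 <= i <= 3. */
static void pow2_to_hex(unsigned n, char *out)
{
    assert(out != NULL);
    unsigned i = n % 4;                /* 0 <= i <= 3 */
    unsigned j = n / 4;                /* so that n = i + 4j */
    out[0] = '0';
    out[1] = 'x';
    out[2] = (char)('0' + (1u << i));  /* leading hex digit: 1, 2, 4 or 8 */
    for (unsigned k = 0; k < j; k++) {
        out[3 + k] = '0';              /* j trailing zeros */
    }
    out[3 + j] = '\0';
}
```

For n = 13 the hex builder splits 13 into i = 1 and j = 3, matching the 0x2000 example on the slide.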
Signed integral numbers
- What if the byte at M[0x0001] represented a signed integral number, what would be its value?
- X = 11110100_2, w = 8
  Remember: T => Two's Complement, w => width of the bit vector
- Let's apply the encoding scheme:
  $B2T(X) = -x_{w-1} 2^{w-1} + \sum_{i=0}^{w-2} x_i 2^i$   (x_{w-1} is the sign bit)
  -1 x 2^7 + 1 x 2^6 + 1 x 2^5 + 1 x 2^4 + 0 x 2^3 + 1 x 2^2 + 0 x 2^1 + 0 x 2^0 =
- What would be the bit pattern of the …
  - Most negative value:
  - Most positive value:
- For w = 8, range of possible signed values: [          ]
- For any w, range of possible signed values: [          ]
- Conclusion: same as for unsigned integral numbers
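The B2T formula differs from B2U only in the weight of the most significant bit, which a short loop makes explicit. The sketch below assumes the bit vector sits in the low-order w bits of a `uint64_t` and limits w to 63 to keep the arithmetic inside `int64_t`; the names `b2t`, `tmin` and `tmax` are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* B2T: the sign bit x_(w-1) has weight -2^(w-1); bits 0..w-2 have weight +2^i. */
static int64_t b2t(uint64_t bits, unsigned w)
{
    assert(w >= 1 && w <= 63);
    int64_t value = 0;
    for (unsigned i = 0; i + 1 < w; i++) {
        int64_t x_i = (int64_t)((bits >> i) & 1u);
        value += x_i * (INT64_C(1) << i);                 /* + x_i * 2^i */
    }
    int64_t sign = (int64_t)((bits >> (w - 1)) & 1u);
    value -= sign * (INT64_C(1) << (w - 1));              /* - x_(w-1) * 2^(w-1) */
    return value;
}

/* Most negative value: bit pattern 100...0. */
static int64_t tmin(unsigned w)
{
    assert(w >= 1 && w <= 63);
    return -(INT64_C(1) << (w - 1));
}

/* Most positive value: bit pattern 011...1. */
static int64_t tmax(unsigned w)
{
    assert(w >= 1 && w <= 63);
    return (INT64_C(1) << (w - 1)) - 1;
}
```

`b2t(0xF4, 8)` evaluates the slide's byte 11110100_2, and `tmin(8)` and `tmax(8)` give the two ends of the range for w = 8.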
Examples of “Show your work”
T2B(X) Conversion -> Two's Complement, w = 8

Method 1 - If X < 0, (~(U2B(|X|))) + 1
If X = -14 (and 8 bit binary #s)
1. |X| => |-14| =
2. U2B(14) =>
3. ~(00001110_2) =>
4. (11110001_2) + 1 =>
   Binary addition:
     11110001
   + 00000001
Check:

Method 2 - If X < 0, U2B(X + 2^w)
If X = -14 (and 8 bit binary #s)
1. X + 2^w => -14 +
2. U2B(242) =>
   Using subtraction:
   242 - 128 = 114  -> 1 x 2^7
   114 - 64 = 50    -> 1 x 2^6
   50 - 32 = 18     -> 1 x 2^5
   18 - 16 = 2      -> 1 x 2^4
   2 - 8 -> nop!    -> 0 x 2^3
   2 - 4 -> nop!    -> 0 x 2^2
   2 - 2 = 0        -> 1 x 2^1
   0 - 1 -> nop!    -> 0 x 2^0
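The two T2B methods should always agree, and a small sketch can confirm that for any value in range. The function names below are made up for illustration; the code assumes w is at most 31 and returns the bit pattern in the low-order w bits of a `uint32_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Method 1: for X < 0, invert the bits of |X| and add 1 (kept to w bits). */
static uint32_t t2b_invert_add_one(int32_t x, unsigned w)
{
    assert(w >= 1 && w <= 31);
    assert(x >= -(INT64_C(1) << (w - 1)) && x <= (INT64_C(1) << (w - 1)) - 1);
    if (x >= 0) {
        return (uint32_t)x;                      /* nonnegative: same as U2B */
    }
    uint32_t mask = (UINT32_C(1) << w) - 1;      /* keeps the low-order w bits */
    uint32_t magnitude = (uint32_t)(-(int64_t)x);/* |X| */
    return (~magnitude + 1) & mask;              /* (~(U2B(|X|))) + 1 */
}

/* Method 2: for X < 0, take U2B(X + 2^w). */
static uint32_t t2b_add_pow2(int32_t x, unsigned w)
{
    assert(w >= 1 && w <= 31);
    assert(x >= -(INT64_C(1) << (w - 1)) && x <= (INT64_C(1) << (w - 1)) - 1);
    if (x >= 0) {
        return (uint32_t)x;
    }
    return (uint32_t)((int64_t)x + (INT64_C(1) << w));  /* X + 2^w is in [0, 2^w) */
}
```

Calling both functions with x = -14 and w = 8 reproduces the two worked examples and lets you compare the resulting bit patterns.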
Properties of unsigned & signed conversions
w = 4

X       B2U(X)   B2T(X)
0000      0        0
0001      1        1
0010      2        2
0011      3        3
0100      4        4
0101      5        5
0110      6        6
0111      7        7
1000      8       -8
1001      9       -7
1010     10       -6
1011     11       -5
1100     12       -4
1101     13       -3
1110     14       -2
1111     15       -1

- Equivalence
  - Both encoding schemes (B2U and B2T) produce the same bit patterns for nonnegative values
- Uniqueness
  - Every bit pattern produced by these encoding schemes (B2U and B2T) represents a unique (and exact) integer value
  - Each representable integer has unique bit pattern
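For w = 4 there are only 16 bit patterns, so both properties can be checked exhaustively. The sketch below does that; `b2u4`, `b2t4`, `equivalence_holds` and `uniqueness_holds` are hypothetical helper names, not from the slides.

```c
#include <assert.h>
#include <stdbool.h>

/* 4-bit B2U: the pattern value itself, 0..15. */
static int b2u4(unsigned x)
{
    assert(x < 16);
    return (int)x;
}

/* 4-bit B2T: patterns with the sign bit set are 16 less than their unsigned value. */
static int b2t4(unsigned x)
{
    assert(x < 16);
    return (x & 0x8u) ? (int)x - 16 : (int)x;
}

/* Equivalence: patterns 0000..0111 decode to the same value under both schemes. */
static bool equivalence_holds(void)
{
    for (unsigned x = 0; x < 8; x++) {
        if (b2u4(x) != b2t4(x)) {
            return false;
        }
    }
    return true;
}

/* Uniqueness: no two of the 16 patterns decode to the same signed value. */
static bool uniqueness_holds(void)
{
    bool seen[16] = { false };
    for (unsigned x = 0; x < 16; x++) {
        int index = b2t4(x) + 8;   /* shifts the range [-8, 7] onto [0, 15] */
        if (seen[index]) {
            return false;
        }
        seen[index] = true;
    }
    return true;
}
```

The same exhaustive approach works for any small w; for larger widths the properties follow from the B2U and B2T formulas rather than from enumeration.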
Converting between signed & unsigned of same size (same data type)
- Unsigned -> Signed (Two's Complement): U2T
    ux -> [U2B] -> X -> [B2T] -> x      (Maintain Same Bit Pattern)
  w = 8: If ux = 129_10 then x =
- Signed (Two's Complement) -> Unsigned: T2U
    x -> [T2B] -> X -> [B2U] -> ux      (Maintain Same Bit Pattern)
  w = 4: If x = -5_10 then ux =
- Conclusion - Converting between unsigned and signed numbers: Both have same bit pattern, however, this bit pattern may be interpreted differently, i.e., producing a different value
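Because the bit pattern is kept, U2T and T2U reduce to adding or subtracting 2^w whenever the most significant bit is set. The sketch below computes both arithmetically for an arbitrary width w up to 32, which avoids relying on what a C cast does for out-of-range values; the names `u2t` and `t2u` follow the slides' notation but the code itself is only an illustration.

```c
#include <assert.h>
#include <stdint.h>

/* U2T: values with the most significant bit set (ux >= 2^(w-1)) map to ux - 2^w. */
static int64_t u2t(uint64_t ux, unsigned w)
{
    assert(w >= 1 && w <= 32);
    assert(ux < (UINT64_C(1) << w));                 /* ux must fit in w bits */
    uint64_t half = UINT64_C(1) << (w - 1);          /* 2^(w-1) */
    return (ux < half) ? (int64_t)ux
                       : (int64_t)ux - (INT64_C(1) << w);
}

/* T2U: negative values map to x + 2^w; nonnegative values are unchanged. */
static uint64_t t2u(int64_t x, unsigned w)
{
    assert(w >= 1 && w <= 32);
    assert(x >= -(INT64_C(1) << (w - 1)) && x < (INT64_C(1) << (w - 1)));
    return (x >= 0) ? (uint64_t)x
                    : (uint64_t)(x + (INT64_C(1) << w));
}
```

The two exercises on the slide correspond to `u2t(129, 8)` and `t2u(-5, 4)`.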
Converting signed <-> unsigned with w = 4

Signed   Bits   Unsigned
   0     0000      0
   1     0001      1
   2     0010      2
   3     0011      3
   4     0100      4
   5     0101      5
   6     0110      6
   7     0111      7
  -8     1000      8
  -7     1001      9
  -6     1010     10
  -5     1011     11
  -4     1100     12
  -3     1101     13
  -2     1110     14
  -1     1111     15

T2U(X): + 16 (+2^4) for the rows with bits 1000 to 1111
U2T(X): - 16 (-2^4) for the rows with bits 1000 to 1111
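Visualizing the relationship between signed & unsigned
[Figure: for w = 4 (2^4 = 16), the signed (2's complement) range, from TMin to TMax, drawn next to the unsigned range, from 0 to UMax. Labels on the unsigned side: 0, TMax, TMax + 1, UMax - 1, UMax. Labels on the signed side: TMin, -2, -1, 0, TMax.]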
Sign extension
- Converting unsigned (or signed) of different sizes (different data types)
  1. Small data type -> larger
- Sign extension
  - Unsigned: zero extension
  - Signed: sign bit extension
  [Figure: X of width w is copied into the low-order bits of the wider X; the new high-order bits are filled with 0s (unsigned) or with copies of the sign bit (signed)]
- Conclusion: Value unchanged
- Let's try:
  - Going from a data type that has a width of 3 bits (w = 3) to a data type that has a width of 5 bits (w = 5)
  - Unsigned:
    X = 3  => 011_2 (w = 3)    new X = __________ <= w = 5
    X = 4  => 100_2 (w = 3)    new X = __________ <= w = 5
  - Signed:
    X = 3  => 011_2 (w = 3)    new X = __________ <= w = 5
    X = -3 => 101_2 (w = 3)    new X = __________ <= w = 5
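Zero extension and sign bit extension can be sketched for arbitrary widths by working on the low-order bits of a wider integer. The code below is an illustration under that assumption; `low_mask`, `zero_extend` and `sign_extend` are hypothetical names, and widths are limited to 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low-order w bits set. */
static uint32_t low_mask(unsigned w)
{
    assert(w >= 1 && w <= 32);
    return (w == 32) ? UINT32_MAX : (UINT32_C(1) << w) - 1;
}

/* Unsigned: the new high-order bits are all 0. */
static uint32_t zero_extend(uint32_t bits, unsigned from_w, unsigned to_w)
{
    assert(from_w >= 1 && from_w <= to_w && to_w <= 32);
    return bits & low_mask(from_w);
}

/* Signed: the new high-order bits are copies of the old sign bit. */
static uint32_t sign_extend(uint32_t bits, unsigned from_w, unsigned to_w)
{
    assert(from_w >= 1 && from_w <= to_w && to_w <= 32);
    bits &= low_mask(from_w);
    uint32_t sign = (bits >> (from_w - 1)) & 1u;       /* old most significant bit */
    if (sign) {
        bits |= low_mask(to_w) & ~low_mask(from_w);    /* set bits from_w .. to_w-1 */
    }
    return bits;
}
```

For the exercise, the 3-bit patterns are passed as 0x3, 0x4 and 0x5 with `from_w = 3` and `to_w = 5`; the returned value holds the 5-bit pattern in its low-order bits.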
Truncation
- Converting unsigned (or signed) of different sizes (different data types)
  2. Large data type -> smaller
- Truncation
  [Figure: only the low-order bits of the wider X are kept in the narrower X; the high-order bits are discarded]
- Conclusion: Value may be altered
  - A form of overflow
- Let's try:
  - Going from a data type that has a width of 5 bits (w = 5) to a data type that has a width of 3 bits (w = 3)
  - Unsigned:
    X = 27  => 11011_2 (w = 5)    new X = __________ <= w = 3
  - Signed:
    X = -15 => 10001_2 (w = 5)    new X = __________ <= w = 3
    X = -1  => 11111_2 (w = 5)    new X = __________ <= w = 3
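Truncation keeps the low-order bits and then reinterprets them in the narrower type. The following sketch separates those two steps; the names `truncate_bits` and `as_signed` are invented for illustration, and the signed reinterpretation is limited to widths of at most 31 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Keeps the low-order to_w bits; the high-order bits are discarded. */
static uint32_t truncate_bits(uint32_t bits, unsigned to_w)
{
    assert(to_w >= 1 && to_w <= 32);
    uint32_t mask = (to_w == 32) ? UINT32_MAX : (UINT32_C(1) << to_w) - 1;
    return bits & mask;
}

/* Reads the low-order w bits as a two's complement value (B2T). */
static int32_t as_signed(uint32_t bits, unsigned w)
{
    assert(w >= 1 && w <= 31);
    bits &= (UINT32_C(1) << w) - 1;
    uint32_t sign = (bits >> (w - 1)) & 1u;
    return sign ? (int32_t)((int64_t)bits - (INT64_C(1) << w))  /* subtract 2^w */
                : (int32_t)bits;
}
```

For the unsigned exercise the result of `truncate_bits(0x1B, 3)` is already the new value; for the signed ones, wrap the truncated bits in `as_signed(..., 3)` to see whether the value was altered.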
Summary
- Interpretation of bit pattern B into either unsigned value U or signed value T
  - B2U(X) and U2B(X) encoding schemes (conversion)
  - B2T(X) and T2B(X) encoding schemes (conversion)
    - Signed value expressed as two's complement => T
- Conversions from unsigned <-> signed values
  - U2T(X) and T2U(X) => adding or subtracting 2^w
- Implication in C: when converting (implicitly via promotion and explicitly via casting):
  - Sign:
    - Unsigned <-> signed (of same size) -> Both have same bit pattern, however, this bit pattern may be interpreted differently
    - Can have unexpected effects -> producing a different value
  - Size:
    - Small -> large (for signed, e.g., short to int and for unsigned, e.g., unsigned short to unsigned int)
      - sign extension: For unsigned -> zero extension, for signed -> sign bit extension
      - Both yield expected result -> resulting value unchanged
    - Large -> small (e.g., unsigned int to unsigned short)
      - truncation: Unsigned/signed -> most significant bits are truncated (discarded)
      - May not yield expected results -> original value may be altered
  - Both (sign and size): 1) size conversion is first done then 2) sign conversion is done
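The "size first, then sign" order can be observed by converting a negative 16-bit signed value to a 32-bit unsigned one. The sketch below uses the fixed-width types `int16_t`, `int32_t` and `uint32_t` as stand-ins for short, int and unsigned int, which is an assumption about the target machine; the function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Spells out the two steps: widen as signed, then reinterpret as unsigned. */
static uint32_t widen_then_reinterpret(int16_t sx)
{
    int32_t widened = sx;        /* 1) size: sign bit extension */
    assert(widened == sx);       /*    the value is unchanged by this step */
    return (uint32_t)widened;    /* 2) sign: value taken modulo 2^32, same bit pattern */
}

/* The single cast a C program would normally contain. */
static uint32_t direct_cast(int16_t sx)
{
    return (uint32_t)sx;
}
```

Both functions return the same result for every `int16_t` input. For a negative input the result is a large value with its upper 16 bits set, which is what sign extension followed by reinterpretation predicts, rather than the small value zero extension would give.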
Next Lecture
- Representing data in memory - Most of this is review
  - “Under the Hood” - Von Neumann architecture
  - Bits and bytes in memory
    - How to diagram memory -> Used in this course and other references
    - How to represent series of bits -> In binary, in hexadecimal (conversion)
    - What kind of information (data) do series of bits represent -> Encoding scheme
    - Order of bytes in memory -> Endian
  - Bit manipulation - bitwise operations
    - Boolean algebra + Shifting
- Representing integral numbers in memory
  - Unsigned and signed
  - Converting, expanding and truncating
  - Arithmetic operations
- Representing real numbers in memory
  - IEEE floating point representation
  - Floating point in C - casting, rounding, addition, …