CMPT 295
Unit - Data Representation
Lecture 3 – Representing integral numbers in memory - unsigned and signed

Last Lecture
- Von Neumann architecture
  - Architecture of most computers
  - Its components: CPU, memory, input and output, bus
  - One of its characteristics: Data and code (programs) both stored in memory
- A look at memory: defined byte-addressable memory, diagram of (compressed) memory
- Word size (w): size of a series of bits (or bit vector) we manipulate, also size of machine words (see Section 2.1.2)
- A look at bits in memory
  - Why binary numeral system (0 and 1 -> two values) is used to represent information in memory
  - Algorithm for converting binary to hexadecimal (hex)
    1. Partition bit vector into groups of 4 bits, starting from right, i.e., least significant byte (LSB)
       - If most significant "byte" (MSB) does not have 8 bits, pad it: add 0's to its left
    2. Translate each group of 4 bits into its hex value
  - What do bits represent? Encoding scheme gives meaning to bits
  - Order of bytes in memory: little endian versus big endian
- Bit manipulation – regardless of what bit vectors represent
  - Boolean algebra: bitwise operations => AND (&), OR (|), XOR (^), NOT (~)
  - Shift operations: left shift, right logical shift and right arithmetic shift
    - Logical shift: Fill x with y 0's on left
    - Arithmetic shift: Fill x with y copies of x's sign bit on left
    - Sign bit: Most significant bit (MSb) before shifting occurred

NOTE: C logical operators and C bitwise (bit-level) operators behave differently!
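The difference between the two operator families, and between the two right shifts, can be made concrete with a small C sketch. The function names below are hypothetical helpers written for this illustration, not part of the course material, and the shift amount `y` is assumed to be between 0 and 7. Because C leaves the right shift of a negative signed value implementation-defined, the arithmetic shift is built by hand from the rule "fill with copies of the sign bit".

```c
#include <stdint.h>

/* Bitwise AND: combines the operands bit position by bit position. */
uint8_t bitwise_and(uint8_t a, uint8_t b) {
    return a & b;
}

/* Logical AND: only asks "are both operands nonzero?" and yields 0 or 1. */
int logical_and(uint8_t a, uint8_t b) {
    return a && b;
}

/* Right logical shift: the y vacated bits on the left are filled with 0's.
 * Assumes 0 <= y <= 7. */
uint8_t shift_right_logical(uint8_t x, unsigned y) {
    return (uint8_t)(x >> y);
}

/* Right arithmetic shift: the y vacated bits on the left are filled with
 * copies of the sign bit (the MSb before shifting). Assumes 0 <= y <= 7.
 * The final cast back to int8_t assumes a two's complement machine. */
int8_t shift_right_arithmetic(int8_t x, unsigned y) {
    uint8_t bits = (uint8_t)x;              /* same bit pattern, unsigned view */
    uint8_t shifted = (uint8_t)(bits >> y); /* zero-filled shift first */
    if (bits & 0x80u) {
        /* Sign bit was 1: set the y leftmost bits. */
        shifted |= (uint8_t)~(0xFFu >> y);
    }
    return (int8_t)shifted;
}
```

With the operands 0x0F and 0xF0, `bitwise_and` returns 0 (no bit position is 1 in both), whereas `logical_and` returns 1 (both operands are nonzero).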
Watch out for && versus &, || versus |, …

Today's Menu
- Representing data in memory – Most of this is review
  - "Under the Hood" - Von Neumann architecture
  - Bits and bytes in memory
    - How to diagram memory -> Used in this course and other references
    - How to represent series of bits -> In binary, in hexadecimal (conversion)
    - What kind of information (data) do series of bits represent -> Encoding scheme
    - Order of bytes in memory -> Endian
  - Bit manipulation – bitwise operations
    - Boolean algebra + Shifting
- Representing integral numbers in memory
  - Unsigned and signed
  - Converting, expanding and truncating
  - Arithmetic operations
- Representing real numbers in memory
  - IEEE floating point representation
  - Floating point in C – casting, rounding, addition, …

Warm up exercise!
As a warm up exercise, fill in the blanks!
- If the context is C (on our target machine)
  - char => _____ bits / _____ byte
  - short => _____ bits / _____ bytes
  - int => _____ bits / _____ bytes
  - long => _____ bits / _____ bytes
  - float => _____ bits / _____ bytes
  - double => _____ bits / _____ bytes
  - pointer (e.g. char *) => _____ bits / _____ bytes

Remember: Unsigned integral numbers
- What if the byte at M[0x0002] represented an unsigned integral number, what would be its value?
  - A series of bits => bit vector
  - w => width of the bit vector
  - X = $01101001_2$, w = 8
- Let's apply the encoding scheme:
  $B2U(X) = \sum_{i=0}^{w-1} x_i \cdot 2^i$
  $0 \times 2^7 + 1 \times 2^6 + 1 \times 2^5 + 0 \times 2^4 + 1 \times 2^3 + 0 \times 2^2 + 0 \times 2^1 + 1 \times 2^0 =$ _____
- For w = 8, range of possible unsigned values: [ ]
- For any w, range of possible unsigned values: [ ]
- Conclusion: w bits can only represent a fixed # of possible values, but these w bits represent these values exactly

B2U(X) Conversion (Encoding scheme)
- Positional notation: expand and sum all terms
  Place value (decimal):  •••  $10^i$  $10^{i-1}$  •••  100    10     1
  Place value (binary):   •••  $2^i$   $2^{i-1}$   •••  4      2      1
  Digit:                  •••  $d_i$   $d_{i-1}$   •••  $d_2$  $d_1$  $d_0$
- Example: $246_{10} = 2 \times 10^2 + 4 \times 10^1 + 6 \times 10^0$
  - 1's = $10^0$, 10's = $10^1$, 100's = $10^2$
- $B2U(X) = \sum_{i=0}^{w-1} x_i \cdot 2^i$

Range of possible values?
- If the context is C (on our target machine)
  - unsigned char?
  - unsigned short?
  - unsigned int?
  - unsigned long?

Examples of "Show your work"
U2B(X) Conversion (into 8-bit binary # => w = 8)

Method 1 - Using subtraction: subtracting decreasing powers of 2 until we reach 0
246 =>
  246 - 128 = 118   -> 128 = 1 x $2^7$
  118 - 64  = 54    -> 64  = 1 x $2^6$
  54  - 32  = 22    -> 32  = 1 x $2^5$
  22  - 16  = 6     -> 16  = 1 x $2^4$
  6   - 8   = nop!  -> 8   = 0 x $2^3$
  6   - 4   = 2     -> 4   = 1 x $2^2$
  2   - 2   = 0     -> 2   = 1 x $2^1$
  0   - 1   = nop!  -> 1   = 0 x $2^0$
246 => $11110110_2$

Method 2 - Using division: dividing by 2 until we reach 0 (the remainders, read from last to first, give the bits from most to least significant)
246 =>
  246 / 2 = 123  -> R = 0
  123 / 2 = 61   -> R = 1
  61  / 2 = 30   -> R = 1
  30  / 2 = 15   -> R = 0
  15  / 2 = 7    -> R = 1
  7   / 2 = 3    -> R = 1
  3   / 2 = 1    -> R = 1
  1   / 2 = 0    -> R = 1
246 => $11110110_2$

U2B(X) Conversion – A few tricks
- Decimal -> binary
  - Trick: When the decimal number is $2^n$, its binary representation is 1 followed by n zeros
  - Let's try: if X = 32 => X = $2^5$, then n = 5 => $100000_2$ (w = 6)
    What if w = 8?
    Check: 1 x $2^5$ = 32
- Decimal -> hex
  - Trick: When the decimal number is $2^n$, its hexadecimal representation is $2^i$ followed by j zeros, where n = i + 4j and 0 <= i <= 3
  - Let's try: if X = 8192 => X = $2^{13}$, then n = 13 and 13 = i + 4j => 1 + 4 x 3 => 0x2000
    Convert 0x2000 into a binary number:
    Check: 2 x $16^3$ = 2 x 4096 = 8192

Remember: Signed integral numbers
- What if the byte at M[0x0001] represented a signed integral number, what would be its value?
  - T => Two's Complement
  - w => width of the bit vector
  - X = $11110100_2$, w = 8
- Let's apply the encoding scheme (the first term is the sign bit):
  $B2T(X) = -x_{w-1} \cdot 2^{w-1} + \sum_{i=0}^{w-2} x_i \cdot 2^i$
  $-1 \times 2^7 + 1 \times 2^6 + 1 \times 2^5 + 1 \times 2^4 + 0 \times 2^3 + 1 \times 2^2 + 0 \times 2^1 + 0 \times 2^0 =$ _____
- What would be the bit pattern of the …
  - Most negative value:
  - Most positive value:
- For w = 8, range of possible signed values: [ ]
- For any w, range of possible signed values: [ ]
- Conclusion: same as for unsigned integral numbers

Examples of "Show your work"
T2B(X) Conversion -> Two's Complement, w = 8

Method 1: If X < 0, (~(U2B(|X|))) + 1
If X = -14 (and 8 bit binary #s)
  1. |X| => |-14| =
  2. U2B(14) =>
  3. ~($00001110_2$) =>
  4. ($11110001_2$) + 1 =>
     Binary addition: 11110001 + 00000001
  Check:

Method 2: If X < 0, U2B(X + $2^w$)
If X = -14 (and 8 bit binary #s)
  1. X + $2^w$ => -14 +
  2. U2B(242) =>
     Using subtraction:
       242 - 128 = 114  -> 1 x $2^7$
       114 - 64  = 50   -> 1 x $2^6$
       50  - 32  = 18   -> 1 x $2^5$
       18  - 16  = 2    -> 1 x $2^4$
       2   - 8   nop!   -> 0 x $2^3$
       2   - 4   nop!   -> 0 x $2^2$
       2   - 2   = 0    -> 1 x $2^1$
       0   - 1   nop!   -> 0 x $2^0$

Properties of unsigned & signed conversions (w = 4)

  X      B2U(X)   B2T(X)
  0000      0        0
  0001      1        1
  0010      2        2
  0011      3        3
  0100      4        4
  0101      5        5
  0110      6        6
  0111      7        7
  1000      8       -8
  1001      9       -7
  1010     10       -6
  1011     11       -5
  1100     12       -4
  1101     13       -3
  1110     14       -2
  1111     15       -1

- Equivalence
  - Both encoding schemes (B2U and B2T) produce the same bit patterns for nonnegative values
- Uniqueness
  - Every bit pattern produced by these encoding schemes (B2U and B2T) represents a unique (and exact) integer value
  - Each representable integer has a unique bit pattern

Converting between signed & unsigned of same size (same data type)
- U2T: Unsigned ux -> U2B -> X -> B2T -> Signed (Two's Complement) x (maintain same bit pattern)
  - w = 8: If ux = $129_{10}$ then x = _____
- T2U: Signed (Two's Complement) x -> T2B -> X -> B2U -> Unsigned ux (maintain same bit pattern)
  - w = 4: If x = $-5_{10}$ then ux = _____
- Conclusion - Converting between unsigned and signed numbers: Both have same bit pattern, however, this bit pattern may be interpreted differently, i.e., producing a different value

Converting signed <-> unsigned with w = 4

  Signed   Bits   Unsigned
     0     0000       0
     1     0001       1
     2     0010       2
     3     0011       3
     4     0100       4
     5     0101       5
     6     0110       6
     7     0111       7
    -8     1000       8
    -7     1001       9
    -6     1010      10
    -5     1011      11
    -4     1100      12
    -3     1101      13
    -2     1110      14
    -1     1111      15

- Bits 0000 to 0111: the signed and unsigned values are the same
- Bits 1000 to 1111: T2U(X) adds 16 ($+2^4$) to the signed value; U2T(X) subtracts 16 ($-2^4$) from the unsigned value

Visualizing the relationship between signed & unsigned
[Figure: for w = 4 ($2^4$ = 16), the signed (2's complement) range, from TMin through -2, -1, 0 up to TMax, drawn next to the unsigned range, from 0 through TMax, TMax + 1 up to UMax - 1, UMax = 15]

Sign extension
- Converting unsigned (or signed) of different sizes (different data types)
  1. Small data type -> larger
     - Sign extension
       - Unsigned: zero extension
       - Signed: sign bit extension
     - Conclusion: Value unchanged
[Figure: bit vector X of width w, with its sign bit copied into every added bit position on the left of the wider bit vector]
- Let's try: Going from a data type that has a width of 3 bits (w = 3) to a data type that has a width of 5 bits (w = 5)
  - Unsigned:
    - X = 3 => $011_2$ (w = 3), new X = _____ <= w = 5
    - X = 4 => $100_2$ (w = 3), new X = _____ <= w = 5
  - Signed:
    - X = 3 => $011_2$ (w = 3), new X = _____ <= w = 5
    - X = -3 => $101_2$ (w = 3), new X = _____ <= w = 5

Truncation
- Converting unsigned (or signed) of different sizes (different data types)
  2. Large data type -> smaller
     - Truncation
     - Conclusion: Value may be altered -> a form of overflow
[Figure: bit vector X of width w, with its leftmost bits discarded to fit the narrower bit vector]
- Let's try: Going from a data type that has a width of 5 bits (w = 5) to a data type that has a width of 3 bits (w = 3)
  - Unsigned:
    - X = 27 => $11011_2$ (w = 5), new X = _____ <= w = 3
  - Signed:
    - X = -15 => $10001_2$ (w = 5), new X = _____ <= w = 3
    - X = -1 => $11111_2$ (w = 5), new X = _____ <= w = 3

Summary
- Interpretation of bit pattern B into either unsigned value U or signed value T
  - B2U(X) and U2B(X) encoding schemes (conversion)
  - B2T(X) and T2B(X) encoding schemes (conversion)
    - Signed value expressed as two's complement => T
- Conversions from unsigned <-> signed values
  - U2T(X) and T2U(X) => adding or subtracting $2^w$
- Implication in C: when converting (implicitly via promotion and explicitly via casting):
  - Sign:
    - Unsigned <-> signed (of same size) -> Both have same bit pattern, however, this bit pattern may be interpreted differently
    - Can have unexpected effects -> producing a different value
  - Size:
    - Small -> large (for signed, e.g., short to int and for unsigned, e.g., unsigned short to unsigned int)
      - Sign extension: for unsigned -> zero extension, for signed -> sign bit extension
      - Both yield expected result -> resulting value unchanged
    - Large -> small (e.g., unsigned int to unsigned short)
      - Truncation: unsigned/signed -> most significant bits are truncated (discarded)
      - May not yield expected results -> original value may be altered
  - Both (sign and size): 1) size conversion is first done then 2) sign conversion is done

Next Lecture
- Representing data in memory – Most of this is review
  - "Under the Hood" - Von Neumann architecture
  - Bits and bytes in memory
    - How to diagram memory -> Used in this course and other references
    - How to represent series of bits -> In binary, in hexadecimal (conversion)
    - What kind of information (data) do series of bits represent -> Encoding scheme
    - Order of bytes in memory -> Endian
  - Bit manipulation – bitwise operations
    - Boolean algebra + Shifting
- Representing integral numbers in memory
  - Unsigned and signed
  - Converting, expanding and truncating
  - Arithmetic operations
- Representing real numbers in memory
  - IEEE floating point representation
  - Floating point in C – casting, rounding, addition, …
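The encoding schemes from the "Remember: Unsigned integral numbers", "Remember: Signed integral numbers" and "Show your work" slides can be sketched in C as follows. This is only an illustration under stated assumptions: the function names `b2u`, `b2t`, `u2b` and `t2b` are hypothetical helpers named after the slide notation, the bit vector is passed in the low-order bits of a `uint32_t`, and the width is assumed to satisfy 1 <= w <= 16.

```c
#include <stdint.h>

/* B2U: sum of x_i * 2^i for i = 0 .. w-1. Assumes 1 <= w <= 16. */
uint32_t b2u(uint32_t bits, unsigned w) {
    uint32_t value = 0;
    for (unsigned i = 0; i < w; i++) {
        uint32_t x_i = (bits >> i) & 1u;   /* bit i of the vector */
        value += x_i << i;                 /* x_i * 2^i */
    }
    return value;
}

/* B2T: same sum for bits 0 .. w-2, but the sign bit x_(w-1) carries
 * the negative weight -2^(w-1). Assumes 1 <= w <= 16. */
int32_t b2t(uint32_t bits, unsigned w) {
    int32_t value = 0;
    for (unsigned i = 0; i + 1 < w; i++) {
        value += (int32_t)(((bits >> i) & 1u) << i);
    }
    if ((bits >> (w - 1)) & 1u) {
        value -= (int32_t)(1u << (w - 1));
    }
    return value;
}

/* U2B, division method: divide by 2 repeatedly; each remainder is the
 * next bit, least significant first. Writes w characters plus '\0',
 * so out must hold at least w + 1 chars. */
void u2b(uint32_t x, unsigned w, char out[]) {
    for (unsigned i = 0; i < w; i++) {
        out[w - 1 - i] = (char)('0' + (x % 2));
        x /= 2;
    }
    out[w] = '\0';
}

/* T2B, Method 2 from the slides: if X < 0, use U2B(X + 2^w). */
void t2b(int32_t x, unsigned w, char out[]) {
    uint32_t ux = (x < 0) ? (uint32_t)(x + (int32_t)(1u << w))
                          : (uint32_t)x;
    u2b(ux, w, out);
}
```

For example, `b2u(0x69, 8)` evaluates the unsigned expansion of $01101001_2$ and `b2t(0xF4, 8)` the signed expansion of $11110100_2$. Calling `b2u` and `b2t` on the same 4-bit pattern reproduces one row of the w = 4 table, and `t2b(-14, 8, buf)` can be used to check the two hand-worked T2B methods.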
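The "Implication in C" points from the Summary can be tried out with a few one-line conversions. The sketch below is not taken from the course material: all function names are hypothetical, the fixed-width types from `<stdint.h>` stand in for the C types named on the slides, and a two's complement target machine is assumed. The last two helpers redo sign extension and truncation on an arbitrary width (assuming 1 <= from <= to <= 16) so that the 3-bit and 5-bit "Let's try" exercises can be checked.

```c
#include <stdint.h>

/* Sign change, same size (T2U): the bit pattern is kept and reinterpreted.
 * C defines this conversion as adding 2^8 to a negative value. */
uint8_t t2u8(int8_t x) {
    return (uint8_t)x;
}

/* Sign change, same size (U2T): values above 127 do not fit in int8_t.
 * The result is implementation-defined before C23; a two's complement
 * machine keeps the bit pattern (assumed here). */
int8_t u2t8(uint8_t ux) {
    return (int8_t)ux;
}

/* Small -> large, signed: sign bit extension, value unchanged. */
int16_t extend_signed(int8_t x) {
    return x;
}

/* Small -> large, unsigned: zero extension, value unchanged. */
uint16_t extend_unsigned(uint8_t ux) {
    return ux;
}

/* Large -> small: the most significant bits are discarded. */
uint8_t truncate_u16(uint16_t ux) {
    return (uint8_t)ux;
}

/* Both sign and size change: 1) size conversion, then 2) sign conversion. */
uint16_t extend_then_reinterpret(int8_t x) {
    int16_t wider = x;          /* 1) sign-extend to 16 bits */
    return (uint16_t)wider;     /* 2) reinterpret as unsigned */
}

/* Sign-extend the low `from` bits of `bits` to `to` bits.
 * Assumes 1 <= from <= to <= 16. */
uint32_t sign_extend_bits(uint32_t bits, unsigned from, unsigned to) {
    uint32_t low = bits & ((1u << from) - 1u);
    if ((low >> (from - 1)) & 1u) {
        /* Sign bit is 1: set bit positions from .. to-1. */
        low |= ((1u << to) - 1u) & ~((1u << from) - 1u);
    }
    return low;
}

/* Keep only the low `to` bits. Assumes 1 <= to <= 16. */
uint32_t truncate_bits(uint32_t bits, unsigned to) {
    return bits & ((1u << to) - 1u);
}
```

Passing the bit patterns from the sign extension and truncation slides to `sign_extend_bits(bits, 3, 5)` and `truncate_bits(bits, 3)` gives the new bit pattern, which can then be interpreted as unsigned or as two's complement to see whether the value changed.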