CMPT 295
Unit - Data Representation
Lecture 3 – Representing integral numbers in memory - unsigned and signed
Last Lecture
- Von Neumann architecture
- Architecture of most computers
- Its components: CPU, memory, input and output, bus
- One of its characteristics: Data and code (programs) both stored in memory
- A look at memory: defined byte-addressable memory, diagram of (compressed) memory
- Word size (w): size of a series of bits (or bit vector) we manipulate, also size of machine words (see Section 2.1.2)
- A look at bits in memory
- Why binary numeral system (0 and 1 -> two values) is used to represent information in memory
- Algorithm for converting binary to hexadecimal (hex)
- Partition bit vector into groups of 4 bits, starting from right, i.e., from the least significant bit (LSb)
- If the most significant group does not have 4 bits, pad it: add 0’s to its left
- Translate each group of 4 bits into its hex value
- What do bits represent? Encoding scheme gives meaning to bits
- Order of bytes in memory: little endian versus big endian
- Bit manipulation – regardless of what bit vectors represent
- Boolean algebra: bitwise operations => AND (&), OR (|), XOR (^), NOT (~)
- Shift operations: left shift, right logical shift and right arithmetic shift
- Logical right shift (x >> y): Fill x with y 0’s on left
- Arithmetic right shift (x >> y): Fill x with y copies of x’s sign bit on left
- Sign bit: Most significant bit (MSb) before shifting occurred
NOTE: C logical operators and C bitwise (bit-level) operators behave differently! Watch out for && versus &, || versus |, …
Today’s Menu
- Representing data in memory – Most of this is review
- “Under the Hood” - Von Neumann architecture
- Bits and bytes in memory
- How to diagram memory -> Used in this course and other references
- How to represent series of bits -> In binary, in hexadecimal (conversion)
- What kind of information (data) do series of bits represent -> Encoding scheme
- Order of bytes in memory -> Endian
- Bit manipulation – bitwise operations
- Boolean algebra + Shifting
- Representing integral numbers in memory
- Unsigned and signed
- Converting, expanding and truncating
- Arithmetic operations
- Representing real numbers in memory
- IEEE floating point representation
- Floating point in C – casting, rounding, addition, …
Warm up exercise!
As a warm up exercise, fill in the blanks!
- If the context is C (on our target machine)
- char => _____ bits/ _____ byte
- short => _____ bits/ _____ bytes
- int => _____ bits/ _____ bytes
- long => _____ bits/ _____ bytes
- float => _____ bits/ _____ bytes
- double => _____ bits/ _____ bytes
- pointer (e.g. char *) => _____ bits/ _____ bytes
Unsigned integral numbers
Remember:
Address | M[] |
size-1 | |
… | |
0x0003 | 01000010 |
0x0002 | 01101001 |
0x0001 | 01110100 |
0x0000 | 01110011 |
- What if the byte at M[0x0002] represented an unsigned integral number, what would be its value?
- x = a series of bits = bit vector, w = width of bit vector
- Let’s apply the encoding scheme:
Example:
- For w = 8, range of possible unsigned values: [ blank ]
- For any w, range of possible unsigned values: [ blank ]
- Conclusion: w bits can only represent a fixed # of possible values, but these w bits represent these values exactly
B2U(X) Conversion (Encoding scheme)
- Positional notation: expand and sum all terms
Decimal:
… | … |
100 | |
10 | |
1 |
Binary:
… | … |
4 | |
2 | |
1 |
Remember:
Example:
Range of possible values?
- If the context is C (on our target machine)
- unsigned char?
- unsigned short?
- unsigned int?
- unsigned long?
Examples of “Show your work”
U2B(X) Conversion (into 8-bit binary # => w = 8)
Method 1 - Using subtraction: subtract decreasing powers of 2 until reaching 0:
Starting number: 246
- ->
- ->
- ->
- ->
- ->
- ->
- ->
- ->
Method 2 - Using division: divide by 2 until reaching 0
Start with 246
U2B(X) Conversion – A few tricks
- Decimal -> binary
- Trick: When decimal number is 2^n, then its binary representation is 1 followed by n zeros
- Let’s try:
if X = 32 => X = 2^5, then n = 5 => 100000_2 (w = 6)
What if w = 8? Check: 1 x 2^5 = 32
- Decimal -> hex
- Trick: When decimal number is 2^n, then its hexadecimal representation is the hex digit 2^i followed by j zeros, where n = i + 4j and 0 <= i <= 3
- Let’s try: if X = 8192 => X = 2^13, then n = 13 and 13 = i + 4j => 1 + 4 x 3 => 0x2000. Convert 0x2000 into a binary number: Check: 2 x 16^3 = 2 x 4096 = 8192
Signed integral numbers
Remember:
Address | M[] |
size-1 | |
… | |
0x0003 | 01000010 |
0x0002 | 01101001 |
0x0001 | 01110100 |
0x0000 | 01110011 |
- What if the byte at M[0x0001] represented a signed integral number, what would be its value?
- X = 11110100_2, w = 8
- T => Two’s Complement, w => width of the bit vector (annotation: first part of equation [everything before the plus sign] is the “sign bit”)
- Let’s apply the encoding scheme:
Example:
- What would be the bit pattern of the …
- Most negative value:
- Most positive value:
- For w = 8, range of possible signed values: [ blank ]
- For any w, range of possible signed values: [ blank ]
- Conclusion: same as for unsigned integral numbers
Examples of “Show your work”
T2B(X) Conversion -> Two’s Complement
Annotation: w = 8
Method 1: If X < 0, (~(U2B(|X|)))+1
If X = -14 (and 8 bit binary #s)
- =>
- (first symbol is a tilde) =>
- =>
Binary addition:
11110001 + 00000001 = ????????
Method 2: If X = -14 (and 8 bit binary #s)
- =>
Using subtraction:
- ->
- ->
- ->
- ->
- ->
- ->
- ->
- ->
Properties of unsigned & signed conversions
Annotation: w = 4
X | B2U(X) | B2T(X) |
0000 | 0 | 0 |
0001 | 1 | 1 |
0010 | 2 | 2 |
0011 | 3 | 3 |
0100 | 4 | 4 |
0101 | 5 | 5 |
0110 | 6 | 6 |
0111 | 7 | 7 |
1000 | 8 | -8 |
1001 | 9 | -7 |
1010 | 10 | -6 |
1011 | 11 | -5 |
1100 | 12 | -4 |
1101 | 13 | -3 |
1110 | 14 | -2 |
1111 | 15 | -1 |
- Equivalence
- Both encoding schemes (B2U and B2T) produce the same bit patterns for nonnegative values that both can represent (0 to TMax)
- Uniqueness
- Every bit pattern produced by these encoding schemes (B2U and B2T) represents a unique (and exact) integer value
- Each representable integer has unique bit pattern
Converting between signed & unsigned of same size (same data type)
- Unsigned
- w=8
- if unsigned ux = 129_10
- U2T(X) = B2T(U2B(X))
- then x = ???
- Maintain Same Bit Pattern
- Signed (Two’s Complement)
- w=4
- if signed (2’s C)
- T2U(X) = B2U(T2B(X))
- then unsigned ux = ???
- Maintain Same Bit Pattern
- Conclusion - Converting between unsigned and signed numbers: Both have same bit pattern, however, this bit pattern may be interpreted differently, i.e., producing a different value
Converting signed to unsigned (and back) with w = 4
Signed | Bits | Unsigned | Note |
0 | 0000 | 0 | Rows 0 to 7 inclusive: T2U(X) and U2T(X) leave the value unchanged. |
1 | 0001 | 1 | |
2 | 0010 | 2 | |
3 | 0011 | 3 | |
4 | 0100 | 4 | |
5 | 0101 | 5 | |
6 | 0110 | 6 | |
7 | 0111 | 7 | |
-8 | 1000 | 8 | Rows from here to unsigned 15 inclusive: add or subtract 2^4 = 16. T2U(X) = X + 16 -> unsigned, U2T(X) = X - 16 -> signed. |
-7 | 1001 | 9 | |
-6 | 1010 | 10 | |
-5 | 1011 | 11 | |
-4 | 1100 | 12 | |
-3 | 1101 | 13 | |
-2 | 1110 | 14 | |
-1 | 1111 | 15 |
Visualizing the relationship between signed & unsigned
- Signed (2’s Complement) Range: TMin to TMax (0 is the center)
- Unsigned range: 0 to UMax (TMax is the center)
Sign extension
- Converting unsigned (or signed) of different sizes (different data types)
- Small data type -> larger
- Sign extension
- Unsigned: zero extension
- Signed: sign bit extension
- Conclusion: Value unchanged
- Let’s try:
- Going from a data type that has a width of 3 bits (w = 3) to a data type that has a width of 5 bits (w = 5)
- Unsigned: ,
- New: ,
- Signed: ,
- New: ,
Truncation
- Converting unsigned (or signed) of different sizes (different data types)
- Large data type -> smaller
- Truncation
- Conclusion: Value may be altered
- A form of overflow
- Let’s try:
- Going from a data type that has a width of 5 bits (w = 5) to a data type that has a width of 3 bits (w = 3)
- Unsigned:
- New:
- Signed: ,
- New: ,
Summary
- Interpretation of bit pattern B into either unsigned value U or signed value T
- B2U(X) and U2B(X) encoding schemes (conversion)
- B2T(X) and T2B(X) encoding schemes (conversion)
- Signed value expressed as two’s complement => T
- Conversions from unsigned <-> signed values
- U2T(X) and T2U(X) => adding or subtracting 2^w
- Implication in C: when converting (implicitly via promotion and explicitly via casting):
- Sign:
- Unsigned <-> signed (of same size) -> Both have same bit pattern, however, this bit pattern may be interpreted differently
- Can have unexpected effects -> producing a different value
- Size:
- Small -> large (for signed, e.g., short to int and for unsigned, e.g., unsigned short to unsigned int)
- sign extension: For unsigned -> zero extension, for signed -> sign bit extension
- Both yield expected result –> resulting value unchanged
- Large -> small (e.g., unsigned int to unsigned short)
- truncation: Unsigned/signed -> most significant bits are truncated (discarded)
- May not yield expected results -> original value may be altered
- Both (sign and size): 1) size conversion is done first, then 2) sign conversion is done
Next Lecture
- Representing data in memory – Most of this is review
- “Under the Hood” - Von Neumann architecture
- Bits and bytes in memory
- How to diagram memory -> Used in this course and other references
- How to represent series of bits -> In binary, in hexadecimal (conversion)
- What kind of information (data) do series of bits represent -> Encoding scheme
- Order of bytes in memory -> Endian
- Bit manipulation – bitwise operations
- Boolean algebra + Shifting
- Representing integral numbers in memory
- Unsigned and signed
- Converting, expanding and truncating
- Arithmetic operations
- Representing real numbers in memory
- IEEE floating point representation
- Floating point in C – casting, rounding, addition, …