CMPT 295 - Unit - Instruction Set Architecture
Lecture 25
- ISA Design + Evaluation
Last Lecture
- Looked at an example of a RISC instruction set: MIPS
- From its Instruction Set Architecture (ISA):
- Registers
- Memory model
- (Sub)set of instructions
- Assembly instructions
- Machine instructions
- Format of R-format of a MIPS machine instruction
- Size of its fields
- Format of R-format of a MIPS machine instruction
- From its Instruction Set Architecture (ISA):
Format of assembly instruction not necessarily == format of machine instruction
From last lecture!
Example of an ISA: MIPS
- Function call conventions
- caller saved registers (4-15, 24, 25)
- callee saved registers (16-23, 30, 31)
- Model of Computation
- Sequential
Register name | Number | Usage |
---|---|---|
$zero |
0 | constant 0 |
$at |
1 | reserved for assembler |
$v0 |
2 | expression evaluation and results of a function |
$v1 |
3 | expression evaluation and results of a function |
$a0 |
4 | argument 1 |
$a1 |
5 | argument 2 |
$a2 |
6 | argument 3 |
$a3 |
7 | argument 4 |
$t0 |
8 | temporary (not preserved across call) |
$t1 |
9 | temporary (not preserved across call) |
$t2 |
10 | temporary (not preserved across call) |
$t3 |
11 | temporary (not preserved across call) |
$t5 |
12 | temporary (not preserved across call) |
$t6 |
13 | temporary (not preserved across call) |
$t7 |
14 | temporary (not preserved across call) |
$s0 |
15 | saved temporary (preserved across call) |
$s1 |
16 | saved temporary (preserved across call) |
$s2 |
17 | saved temporary (preserved across call) |
$s3 |
18 | saved temporary (preserved across call) |
$s4 |
19 | saved temporary (preserved across call) |
$s5 |
20 | saved temporary (preserved across call) |
$s6 |
21 | saved temporary (preserved across call) |
$s7 |
22 | saved temporary (preserved across call) |
$t8 |
24 | temporary (not preserved across call) |
$t9 |
25 | temporary (not preserved across call) |
$k0 |
26 | reserved for OS kernel |
$k1 |
27 | reserved for OS kernel |
$gp |
28 | pointer to global area |
$sp |
29 | stack pointer |
$fp |
30 | frame pointer |
$ra |
31 | return address (used by function call) |
From last lecture! – MIPS - Design guidelines
- 3) In terms of machine instruction format:
- Create as few of them as possible
- Have them all of the same length and same format!
- If we have different machine instruction formats, then position the fields that
have the same purpose in the same location in the format
- Can all MIPS machine instructions have the same length and same format?
- For example:
lw $s1, 20($s2)
=> opcode, rs, rt, rd, shamt, func? - When designing its corresponding machine instruction …
- Must specify source register using 5 bits -> OK!
- Must specify destination register using 5 bits -> OK!
- Must specify a constant using 5 bits -> Hum…
- For example:
- Value of constant limited to [0..25-1]
- Often use to access array elements so needs to be > 25 = 32
- Can all MIPS machine instructions have the same length and same format?
From last lecture! – MIPS ISA designers compromise
- Keep all machine instructions format the same length
- Consequence -> different formats for different kinds of MIPS
instructions
- R-format for register
-
Label Size opcode 6 bits rs 5 bits rt 5 bits shamt 5 bits func 6 bits
-
- I-format for immediate
-
Label Size opcode 6 bits rs 5 bits rt 5 bits Address/immediate 16 bits
-
- J-format for jump
-
Label Size opcode 6 bits Target address 26 bits
-
- R-format for register
- opcode indicates the format of the instruction
- This way, the hardware knows whether to treat the last half of the instruction as 3 fields (R-format) or as 1 field (I-format)
- Also, position of fields with same purpose are in the same location in the 3 formats ☺
Today’s Menu
- Instruction Set Architecture (ISA)
- Definition of ISA
- Instruction Set design
- Design guidelines
- Example of an instruction set: MIPS
- Create our own instruction sets
- ISA evaluation
- Implementation of a microprocessor (CPU) based on an ISA
- Execution of machine instructions (datapath)
- Intro to logic design + Combinational logic + Sequential logic circuit
- Sequential execution of machine instructions
- Pipelined execution of machine instructions + Hazards
Let’s design our own ISA - x295M (1 of 2)
- Registers and Memory model
- # of registers -> 0 registers – model called “Memory Only” (except
$sp
) - Each memory address has -> 32 bits (m = 32)
- Word size -> 32 bits
- Byte-addressable memory so address resolution -> n = 1 byte (8 bits)
- Memory size -> -> byte OR bits
- # of registers -> 0 registers – model called “Memory Only” (except
Let’s design our own ISA - x295M (2 of 2)
- (Sub)set of instructions
x295M assembly language instructions | Semantic (i.e., Meaning) | Corresponding x295M machine instructions |
---|---|---|
ADD a, b, c |
M[c] = M[a] + M[b] | 01 <30 bits> 00 <30 bits> 00 <30 bits> |
SUB a, b, c |
M[c] = M[a] - M[b] | 10 <30 bits> 00 <30 bits> 00 <30 bits> |
MUL a, b, c |
M[c] = M[a] * M[b] | 11 <30 bits> 00 <30 bits> 00 <30 bits> |
- Memory addressing mode -> Direct (absolute), base + displacement and indirect
- Operand model -> “3 Operand” model
- Format of corresponding x295M machine instructions:
-
01 memory address of dest word 1 00 memory address of src1 word 2 00 memory address of src2 word 3
-
- Size of opcode -> 2 bits Size of operand -> 30 bits
- Length of 1 instruction -> ___________ bits
Note about the machine code format
- This ISA has two machine code formats:
Format 1 (within a word):
opcode | memory address |
Format 2 (within a word):
X | memory address |
About Design guideline:
3) In terms of machine instruction format:
- Create as few of them as possible -> 2 formats
- Have them all of the same length -> 32 bits
- Since we have two
different machine
instruction formats, fields
with same purpose are
positioned in the same
location in the 2 formats ->
- operand field (purpose -> memory address) positioned in the same location in the 2 formats
Evaluation of our ISA x295M versus MIPS
Sample C code:
- C code -> Assembly code
- ADD 0($sp), 4($sp), 12($sp)
- SUB 0($sp), 4($sp), 16($sp)
- MUL 12($sp), 16($sp), 8($sp)
- Meaning
- M[$sp+12] = M[$sp+0] + M[$sp+4]
- M[$sp+16] = M[$sp+0] – M[$sp+4]
- M[$sp+8] = M[$sp+12] * M[$sp+16]
Evaluation of our ISA x295M versus MIPS
Sample C code:
Stack segment:
Memory | Stack Segment |
0x7fffff00<- $sp -> | |
Text Segment:
Memory | Text Segment |
0x00400000 -> |
Opcode | Encoding |
---|---|
ADD | 01 |
SUB | 10 |
MUL | 11 |
No op | 00 |
Assembly code | Memory address/machine code pt. 1 | Memory address/machine code pt. 2 | Memory Address/machine code pt. 3 |
---|---|---|---|
ADD 0($sp),4($sp),12($sp) |
0x00400000 / 01 0x7fffff0c |
0x00400004 / 00 0x7fffff00 |
0x00400008 / 00 0x7fffff04 |
SUB 0($sp),4($sp),16($sp) |
0x0040000c / 10 0x7fffff10 |
0x00400010 / 00 0x7fffff00 |
0x00400014 / 00 0x7fffff04 |
MUL 12($sp),16($sp),8($sp) |
0x00400018 / 11 0x7fffff08 |
0x0040001c / 00 0x7fffff0c |
0x00400020 / 00 0x7fffff10 |
Evaluation of our ISA x295M versus MIPS
Sample C code:
C code -> Assembly code | Meaning |
---|---|
lw $s1,0($sp) |
$s1 = M[$sp + 0] |
lw $s2,4($sp) |
$s2 = M[$sp + 4] |
add $a3,$s1,$s2 |
$s3 = $s1 + $s2 |
sub $s4,$s1,$s2 |
$s4 = $s1 - $s2 |
mul $s5,$s3,$s4 |
$s5 = $s3 * $s4 |
sw $s5,8($sp) |
M[$sp + 8] = $s5 |
Evaluation of our ISA x295M versus MIPS
Opcode table:
Opcode + func | Encoding |
---|---|
lw |
|
sw |
|
add |
|
sub |
|
mul |
Register table:
Register | Number |
---|---|
$s1 |
|
$s2 |
|
$s3 |
|
$s4 |
|
$s5 |
|
$sp |
Sample C code:
I-format | src | dest | |
---|---|---|---|
opcode 6 bits | rs 5 bits | rt 5 bits | Address/immediate 16 bits |
Assembly code | Machine code |
---|---|
lw $s1,0($sp) |
100011 11101 10001 0x0000 |
lw $s2,4($sp) |
100011 11101 10010 0x0004 |
sw $s5,8($sp) |
101011 10101 11101 0x0008 |
R-format | src1 | src2 | dest | ||
---|---|---|---|---|---|
opcode 6 bits | rs 5 bits | rt 5 bits | rd 5 bits | shamt 5 bits | func 6 bits |
Assembly code | Machine code |
---|---|
add $s3,$s1,$s2 |
000000 10001 10010 10011 00000 100000 |
sub $s4,$s1,$s2 |
000000 10001 10010 10100 00000 100010 |
mul $s5,$s3,$s4 |
000000 10011 10100 10101 00000 100100 |
Evaluation of our ISA x295M versus MIPS
Sample C code:
x295M in machine code:
Memory address of each instruction | at address | +1 | +2 | +3 | +4 | +5 | +6 | +7 |
---|---|---|---|---|---|---|---|---|
0x00400000 |
01 11 |
1111 |
1111 |
1111 |
1111 |
1111 |
0000 |
1100 |
0x00400004 |
00 11 |
1111 |
1111 |
1111 |
1111 |
1111 |
0000 |
0000 |
0x00400008 |
00 11 |
1111 |
1111 |
1111 |
1111 |
1111 |
0000 |
0100 |
0x0040000c |
10 11 |
1111 |
1111 |
1111 |
1111 |
1111 |
0001 |
0000 |
0x00400010 |
00 11 |
1111 |
1111 |
1111 |
1111 |
1111 |
0000 |
0000 |
0x00400014 |
00 11 |
1111 |
1111 |
1111 |
1111 |
1111 |
0000 |
0100 |
0x00400018 |
11 11 |
1111 |
1111 |
1111 |
1111 |
1111 |
0000 |
1000 |
0x0040001c |
00 11 |
1111 |
1111 |
1111 |
1111 |
1111 |
0000 |
1100 |
0x00400020 |
00 11 |
1111 |
1111 |
1111 |
1111 |
1111 |
0001 |
0000 |
MIPS in machine code:
Memory address of each instruction | Machine code |
---|---|
0x00400000 |
100011 11101 10001 0000 0000 0000 0000 |
0x00400004 |
100011 11101 10010 0000 0000 0000 0100 |
0x00400008 |
000000 10001 10010 10011 00000 100000 |
0x0040000c |
000000 10001 10010 10100 00000 100010 |
0x00400010 |
000000 10011 10100 10101 00000 100100 |
0x00400014 |
101011 10101 11101 0000 0000 0000 0008 |
Which criteria shall we use when comparing/evaluating ISAs?
- Whether or not the Instruction set (IS) design guidelines
have been satisfied:
- Each instruction of IS have an unambiguous binary encoding
- IS is functionally complete -> i.e., it is “Turing complete”
- In terms of machine instruction format:
- Create as few of them as possible
- Have them all of the same length
- If we have different machine instruction formats, then position the fields that have the same purpose in the same location in the format
Which criteria shall we use when comparing/evaluating ISAs?
- Program performance -> usually measured using time
- If an ISA design results in faster program execution then it is deemed “better”
- What can affect the time a program takes to execute?
- Since accessing memory is slow (slower than accessing registers), the number of memory accesses a program does will affect its execution time
- Therefore, possible criteria: number of memory accesses
- The fewer memory accesses our program makes, the faster it executes, hence the “better” it is
Why is memory access slow!
- Memory access is the most time constraining aspect of program execution
- Why? Because of transfer rate limitation of the bus between
memory and CPU
- Memory is “far away” from the CPU so it takes time to transfer instructions and data from memory to microprocessor
- This is known as the von Neumann bottleneck
- From the above diagram, we can gather that register access is faster than memory access! Why?
Diagram:
A box labled CPU contains Registers, Condition Codes, PC. It is on the left. A box labled Memory is on the right. There is an arrow from CPU to Memory labled “Address Bus”. There is a double-sided arrow from CPU and Memory, labled “Data Bus”. Finally, there is an arrow from Memory to the CPU labled “Instruction Bus”.
How is the von Neumann Bottleneck created?
- It is created when memory is accessed
- During fetch stage
- An instruction is retrieved from memory
- During decode/execute stages
- The value of operands may be read from memory
- The result may be written to memory
Evaluation of our ISA x295M versus MIPS
- Sample C code:
- Let’s count the number of memory accesses:
x295M | fetch | decode/execute | MIPS | fetch | decode/execute |
---|---|---|---|---|---|
ADD 0($sp), 4($sp), 12($sp) |
lw $s1,0($sp) |
||||
SUB 0($sp), 4($sp), 16($sp) |
lw $s2,4($sp) |
||||
MUL 12($sp), 16($sp), 8($sp) |
add $s3,$s1,$s2 |
||||
sub $s4,$s1,$s2 |
|||||
mul $s5,$s3,$s4 |
|||||
sw $s5,8($sp) |
Total: _____
Summary
- ISA design
- MIPS
- Created our own x295M: “Memory only”
- ISA Evaluation
- Examining the effect of the von Neumann bottleneck on the
execution time of our program by counting number of
memory accesses
- The fewer memory accesses our program makes, the faster it executes, hence the “better” it is
- Improvements:
- Decreasing effect of von Neumann bottleneck by reducing the number of memory accesses
- Examining the effect of the von Neumann bottleneck on the
execution time of our program by counting number of
memory accesses
Next Lecture
- Instruction Set Architecture (ISA)
- Definition of ISA
- Instruction Set design
- Design principles
- Look at an example of an instruction set: MIPS
- Create our own
- ISA evaluation
- (highlighted) Implementation of a microprocessor (CPU) based on an ISA
- (highlighted) Execution of machine instructions (datapath)
- (highlighted) Intro to logic design + Combinational logic + Sequential logic circuit
- Sequential execution of machine instructions
- Pipelined execution of machine instructions + Hazards