Assignment title: Information
may work in pairs with a partner on this assignment if you wish or you may work alone. If you work with a partner,
only submit one PDF file with both of your names in the document; you will each earn the same number of points. Your
PDF file must be uploaded to Blackboard by the assignment deadline. Section 3 describes what to submit and by when.
2 Chapter 4 Exercises: Pipeline Design
1. (14 pts) See the final pipeline design diagram on p. 45 of the Ch. 4 lecture notes (this diagram is also in your book).
Consider the pipelined sequence of instructions shown below. This pipeline diagram shows the instructions executing
under ideal conditions, i.e., if instruction In enters the IF stage in clock cycle c then instruction In+1 enters the pipe-
line in the IF stage in clock cycle c+1.
Assume registers $t1 and $t2 are not equal when the beq instruction on line 4 is executed. In the pipeline design on
p. 45, the ellipse with the = sign in it is the comparator circuit that compares the contents of registers rs and rt to
determine if they are equal. Note also that the branch target address adder has been moved from the EX stage
(where it was at in the first pipeline design diagram on p. 25 of the Ch. 4 notes) to the ID stage. This means that
when beq moves from the ID stage to the EX stage, we know the correct address to write to PC: it will be PC+4 if
the branch is not taken or it will be the branch target address if the branch is taken. If we assume the design does
not perform branch prediction, beq always requires one bubble. This bubble forms what is called the branch delay
slot and when possible, the compiler will reorder instructions around a beq to try to place an instruction that has to
be executed, whether the branch is taken or not, into the branch delay slot.
CC 01 02 03 04 05 06 07 08 09 10
1 lw $t0, 8($sp) IF ID EX ME WB
2 and $t1, $t0, $s9 IF ID EX ME WB
3 sub $t2, $t1, $t0 IF ID EX ME WB
4 beq $t1, $t2, false IF ID EX ME WB
5 add $t3, $t1, $t2 IF ID EX MW WB
6 j end_loop IF ID EX ME WB
7 false:
8 addi $sp, $sp, -4
...
37 end_loop:
Assume the pipeline design does not perform forwarding to resolve EX and MEM data hazards, i.e., the only way we
have of resolving a data hazard is to stall the pipeline. Remember, we are assuming that $t1 ≠≠ $t2 on line 4 and we
are not performing branch prediction. There are some data and/or control hazards in this instruction sequence. For
this exercise, you will identify and specify all data and control hazards. For each data hazard, specify: (1) the name
or type of data hazard: EX, MEM, load use, and store word (see Note 1 below); we can also have data hazards that
are not one of these types, which we shall call unnamed data hazards); (2) the line numbers of the two involved
instructions; (3) Write a brief 1-2 sentence description of why the hazard will occur, e.g., instruction 1 (lw) will write
to $t0 in clock 5 but instruction 2 (and) will read the wrong value of $t0 in clock 3. Hint: there are six data hazards,
two of which are unnamed data hazards.
2. (7 pts) Continuing with Ex. 1, draw the instruction sequence pipeline diagram when only stalling is performed.
3. (3 pts) Assume there were no data hazards in the given instruction sequence, so the pipeline diagram shown above is
correct. What would be the average CPI when the instructions on lines 1-6 are executed?
4. (3 pts) For the pipelined instructions of Ex. 2 what would be the average CPI when instructions 1-6 are executed?
5. (7 pts) As we discussed, load use and branch hazards always require 1 bubble (see §4.9 of the Ch. 4 notes). Store
word and unnamed data hazards can only be resolved by stalling. Assume now that EX and MEM hazards can be re-
solved by forwarding. Redraw the pipeline diagram showing the sequence of instructions being executed. Insert the
(c) Kevin R. Burger :: Computer Science & Engineering :: Arizona State University Page 1
CSE/EEE 230 Computer Organization and Assembly Language Homework 6 :: 50 pts
minimum number of nop instructions or bubbles when required to resolve hazards. Hint: See the top of p. 41 in the
Ch. 4 notes. In that instruction sequence, there are three hazards: EX between 1 and 2; MEM between 1 and 3; EX be-
tween 2 and 3. The EX hazard between 1 and 2 can be resolved by forwarding, which leaves us only the last two haz-
ards. Since instruction 2 is going to overwrite the value of $t0 that instruction 1 writes there (when it is in the WB
stage in clock 5), the value of $t0 that should be sent into the ALU for 3 must be the value coming out of the EX
stage of 2 and not the value coming from the MEM stage of 1. To summarize: when instruction K on line n has both
a MEM hazard with instruction I on line n-2 and an EX hazard with instruction J on line n-1, the EX hazard must
take precedence and be resolved by forwarding.
6. (3 pts) Forwarding is a powerful technique for reducing pipeline stalls. For the pipelined sequence of Ex. 4 what is
the average CPI for instructions 1-6?
7. (5 pts) In designing a computer system, an important goal is to minimize CPI (which maximizes IPC). For this
instruction sequence, the optimal CPI is the answer from Ex. 3, call this CPIopt. Consider the CPI for the instruction
sequence where only stalling was used, i.e, the answer to Ex. 4, call this CPIstall. Then how much greater is CPIstall
expressed as a percentage of CPIopt? That is, calculate |CPIstall - CPIopt| / CPIopt. This number tells us how many how
many more clocks the stalled sequence requires as a percentage of the number of clocks for the optimal sequence. To
determine how much slower the stalled sequence is than the optimal sequence, expressed as a percentage of the clocks
for the optimal sequence, calculate |CPIstall - CPIopt| / CPIstall.
8. (5 pts) Let CPIfwd be the CPI for the instruction sequence using forwarding, i.e., the answer to Ex. 6. Then how
much greater is CPIfwd expressed as a percentage of CPIopt? That is, calculate |CPIfwd - CPIopt| / CPIopt. This number
tells us how many how many more clocks the sequence using forwarding requires as a percentage of the number of
clocks for the optimal sequence. To determine how much slower the sequence using forwarding is than the optimal
sequence, expressed as a percentage of the clocks for the optimal sequence, calculate |CPIfwd - CPIopt| / CPIfwd.
9. (3 pts) How much faster is the instruction sequence using forwarding relative to the instruction sequence using only
stalling? That is, calculate |CPIfwd - CPIstall| / CPIfwd.
Notes
1. What I am calling a store word hazard occurs when a sw instruction on line n would read the old value of register r
in the ID stage, but r is written in the WB stage by an instruction on lines n - 1 or n - 2. For example (r is $t0),
CC 01 02 03 04 05 06
[n-1] add $t0, $t1, $t2 IF ID EX ME WB add writes $t0 in clock 5
[n] sw $t0, 8($sp) IF ID EX ME WB sw would read the old value of $t0 in clock 3
CC 01 02 03 04 05 06 07
[n-2] add $t0, $t1, $t2 IF ID EX ME WB add writes to $t0 in clock 5
[n-1] I IF ID EX ME WB instruction I does not access $t0
[n] sw $t0, 8($sp) IF ID EX ME WB sw would read the old value of $t0 in clock 4
In the first diagram, add will write to $t0 in C05 but sw would read the old value of $t0 in C03. In the second diagram,
add will write to $t0 in C05 but sw would read the old value in C04. If there were two instructions (both not accessing
$t0) between the add and sw then there would be no hazard. These are not EX nor MEM hazards because the new value
of $t0 generated by the add ($t1 + $t2) does not need to be forwarded to be an ALU input for sw; rather, during sw the
ALU calculates the address in the data memory to write to by adding $sp and 8.
If the only way to resolve store word hazards is by stalling (and that is the only way as far as the pipeline design of Ch. 4
is concerned), then a store word hazard either requires 2 bubbles for the first sequence of instructions and 1 bubble for
the second.
(c) Kevin R. Burger :: Computer Science & Engineering :: Arizona State University Page 2
CSE/EEE 230 Computer Organization and Assembly Language Homework 6 :: 50 pts
3 Submission Instructions
Create a word processing document or text file and neatly type your solutions to each of the exercises. Make certain to
put the names of both partners in the document if you worked with a partner. Convert this document into a PDF file
named cse230-f16-h06-asurite.pdf or cse230-f16-h06-asurite1-asurite2.pdf, where asurite is the user name you
use to log into My ASU and Blackboard. If you work with a partner, put both user names in the file name. Upload the
PDF to Blackboard using the homework submission link by the deadline, which is 4:00am Sun 4 Dec. Note: this is a
hard deadline; no assignment will be accepted for grading after this deadline. I will publish the solution
and grading rubric document on Sun morning so those of you studying for the final exam on Mon will
have the solutions to study with on Sun. Consult the online syllabus for the late and academic integrity policies.