Consider the design of a processor with a maximum instruction length of 600 ps. The propagation delay to load a register is 25 ps.
(a) What is the minimum clock cycle time, the instruction latency and the CPU throughput using serial execution?
(b) What is the minimum clock cycle time, the instruction latency and the CPU throughput using pipelined execution with 8 equal stages?
(c) Consider a design that uses n equal stages. What is the minimum clock cycle time, the instruction latency and the CPU throughput expressed as a function of n? (You may wish to check that your generalization agrees with your results from parts (a) and (b), i.e., by substituting n = 1, 8.)
In the case of serial execution, minimum clock cycle time = maximum instruction length = 600 ps
Instruction latency = time taken to finish an instruction = 600 ps
CPU throughput = number of instructions processed per second = 1/(time taken to finish an instruction) = 1/(600*10^-12) = 1.67*10^9 instructions per second
When the pipeline is divided into 8 equal stages, the time taken by each stage = 600/8 = 75 ps
Passing the result from one stage of the pipeline to the next requires loading a register, which takes 25 ps. In an 8-stage pipeline this delay occurs 7 times.
Minimum clock cycle time = (maximum instruction length)/(number of stages) + time taken to load a register = 600/8 + 25 = 100 ps
Hence instruction latency = total instruction length + 7*(time taken to load a register) = 600 + 7*25 = 775 ps
Once the pipeline is full, one instruction completes every clock cycle, hence CPU throughput = 1/(clock cycle time) = 1/(100*10^-12) = 10*10^9 instructions per second
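To double-check the arithmetic for the 8-stage case, here is a minimal Python sketch. It assumes the same model as this answer (one register load added to each cycle, and 7 register loads counted in the latency); the variable names are only illustrative.

```python
# 8-stage pipeline under this answer's model. All times are in picoseconds.
LOGIC_PS = 600.0   # maximum instruction length, from the problem
REG_PS = 25.0      # register load delay, from the problem
STAGES = 8

# Each cycle must fit one stage of logic plus one register load.
cycle_ps = LOGIC_PS / STAGES + REG_PS

# Assumption from this answer: only the 7 registers between stages add latency.
latency_ps = LOGIC_PS + (STAGES - 1) * REG_PS

# One instruction completes per cycle once the pipeline is full (1 s = 1e12 ps).
throughput_ips = 1e12 / cycle_ps

print(f"cycle = {cycle_ps:.0f} ps")
print(f"latency = {latency_ps:.0f} ps")
print(f"throughput = {throughput_ips:.2e} instructions per second")
```

Working in picoseconds and converting only at the end keeps the numbers easy to compare with the hand calculation.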
If there are n stages in the pipeline, there will be (n-1) register loads and hence a delay of 25*(n-1) ps.
Minimum clock cycle time = (maximum instruction length)/n + delay due to one register load = (600/n + 25) ps
So instruction latency = maximum instruction length + delay due to register loads = 600 + 25*(n-1) ps
CPU throughput = 1/((600/n + 25)*10^-12) instructions per second
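Substituting n = 8 reproduces the pipelined results: 100 ps, 775 ps and 10*10^9 instructions per second. For n = 1 the formulas give a latency of 600 ps, which matches the serial case, but a clock cycle time of 625 ps and a throughput of 1.6*10^9 instructions per second instead of 600 ps and 1.67*10^9. The difference arises because the serial calculation assumed no register load delay. If the serial design is also charged one register load per cycle, the n = 1 values agree.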
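The check for n = 1 and n = 8 can also be done with a short script. This is only a sketch of the formulas in this answer; `pipeline_metrics` is a hypothetical helper name, and the (n-1) register loads in the latency are this answer's assumption.

```python
def pipeline_metrics(n, logic_ps=600.0, reg_ps=25.0):
    """Return (cycle time in ps, latency in ps, throughput in instructions/s)
    for a pipeline with n equal stages, using this answer's formulas."""
    if n < 1:
        raise ValueError("n must be at least 1")
    cycle_ps = logic_ps / n + reg_ps            # 600/n + 25
    latency_ps = logic_ps + reg_ps * (n - 1)    # 600 + 25*(n-1), answer's assumption
    throughput_ips = 1e12 / cycle_ps            # 1 s = 1e12 ps
    return cycle_ps, latency_ps, throughput_ips


for n in (1, 8):
    cycle, latency, ips = pipeline_metrics(n)
    print(f"n={n}: cycle={cycle:.0f} ps, latency={latency:.0f} ps, "
          f"throughput={ips:.2e} instructions per second")
```

Trying larger values of n shows the throughput approaching 1/(25 ps) = 40*10^9 instructions per second, since the register load delay does not shrink as stages are added.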