Assume for arithmetic, load/store, and branch instructions, a
processor has CPIs of 1, 12, and 5, respectively. Also assume that
on a single processor a program requires the execution of 2.56E9
arithmetic instructions, 1.28E9 load/store instructions, and 256
million branch instructions. Assume that each processor has a 2 GHz
clock frequency.
Assume that, as the program is parallelized to run over multiple
cores, the number of arithmetic and load/store instructions per
processor is divided by 0.7 x p (where p is the number of
processors) but the number of branch instructions per processor
remains the same.
Find the total execution time for this program on 1, 2, 4, and 8 processors, and show the relative speedup of the 2, 4, and 8 processor result relative to the single processor result.
We know that clock cycles = number of instructions * CPI.
There are several instruction types, so the total clock cycles are the sum of the clock cycles for each type.
So, the number of clock cycles for one processor:
Clock cycles = (2.56 * 10^9 * 1) + (1.28 * 10^9 * 12) + (256 * 10^6 * 5) = 1.92 * 10^10
Execution time = Clock cycles / Clock speed.
Clock speed = 2 GHz = 2 * 10^9 Hz [as we know, 1 GHz = 10^9 Hz]
So, execution time for one processor = (1.92 * 10^10) / (2 * 10^9) = 9.6 seconds
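The single-processor arithmetic above can be checked with a short Python sketch. The instruction counts, CPIs, and clock rate come from the problem statement; the names `counts`, `cpi`, and `total_cycles` are made up for this illustration.

```python
# Values from the problem statement; variable names are illustrative only.
CLOCK_HZ = 2e9  # 2 GHz

counts = {"arithmetic": 2.56e9, "load_store": 1.28e9, "branch": 256e6}
cpi = {"arithmetic": 1, "load_store": 12, "branch": 5}


def total_cycles(counts, cpi):
    """Sum instruction_count * CPI over every instruction type."""
    return sum(counts[kind] * cpi[kind] for kind in counts)


cycles_1 = total_cycles(counts, cpi)  # clock cycles on one processor
time_1 = cycles_1 / CLOCK_HZ          # execution time in seconds

print(f"cycles = {cycles_1:.3e}, time = {time_1:.2f} s")
```

Running it should reproduce 1.92 * 10^10 cycles and 9.6 seconds.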
The clock cycles for p processors:
According to the question, the arithmetic and load/store instruction counts are divided by 0.7p, while the branch count is unchanged.
So, Clock cycles = (2.56 * 10^9 * 1) / (0.7p) + (1.28 * 10^9 * 12) / (0.7p) + 256 * 10^6 * 5
or, Clock cycles = (1.792 * 10^10) / (0.7p) + 1.28 * 10^9 = (2.56 * 10^10) / p + 1.28 * 10^9
The execution time for p processors:
Execution time = Clock cycles / Clock speed.
= ((2.56 * 10^10) / p + 1.28 * 10^9) / (2 * 10^9)
= 12.8 / p + 0.64
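As a sanity check on this simplification, here is a minimal Python sketch that computes the time directly from the instruction counts and compares it with the closed form 12.8 / p + 0.64. The function names `execution_time` and `closed_form` are assumptions for this example, not part of the original problem.

```python
def execution_time(p, clock_hz=2e9):
    """Execution time in seconds on p processors, from the raw counts."""
    # Arithmetic and load/store counts are divided by 0.7 * p.
    parallel_cycles = (2.56e9 * 1 + 1.28e9 * 12) / (0.7 * p)
    # Branch count per processor stays the same.
    branch_cycles = 256e6 * 5
    return (parallel_cycles + branch_cycles) / clock_hz


def closed_form(p):
    """Simplified expression derived above."""
    return 12.8 / p + 0.64


for p in (2, 4, 8):
    # The two values should agree up to floating-point rounding.
    print(p, execution_time(p), closed_form(p))
```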
Let's find execution time for p = 2.
Execution time = 12.8 / 2 + 0.64 = 7.04 seconds
Speedup (compared to single processor) = 9.6 / 7.04 = 1.36
Let's find execution time for p = 4.
Execution time = 12.8 / 4 + 0.64 = 3.84 seconds
Speedup (compared to single processor) = 9.6 / 3.84 = 2.5
Let's find execution time for p = 8.
Execution time = 12.8 / 8 + 0.64 = 2.24 seconds
Speedup (compared to single processor) = 9.6 / 2.24 = 4.29
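The three speedups can be tabulated in one pass with a short Python sketch. It assumes, as the answer above does, that the baseline is the undivided single-processor run (9.6 seconds) rather than the p = 1 case of the 0.7p formula. The names `time_single`, `time_parallel`, and `speedups` are illustrative.

```python
CLOCK_HZ = 2e9  # 2 GHz, from the problem statement


def time_single():
    """Single-processor time in seconds using the original instruction counts."""
    return (2.56e9 * 1 + 1.28e9 * 12 + 256e6 * 5) / CLOCK_HZ


def time_parallel(p):
    """Time in seconds on p processors; only branches are left undivided."""
    return ((2.56e9 * 1 + 1.28e9 * 12) / (0.7 * p) + 256e6 * 5) / CLOCK_HZ


baseline = time_single()
# Speedup = single-processor time / p-processor time.
speedups = {p: baseline / time_parallel(p) for p in (2, 4, 8)}

for p, s in speedups.items():
    print(f"p = {p}: time = {time_parallel(p):.2f} s, speedup = {s:.2f}")
```

The speedup stays well below p because the branch instructions (0.64 seconds) are not divided and the 0.7 factor adds overhead to the parallel part.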