In: Computer Science
Given the following code:
for (i=2;i<100;i=i+1)
{ a[i] = b[i] + a[i]; /* S1 */
c[i-1] = a[i] + d[i]; /* S2 */
a[i-1] = 2 * b[i]; /* S3 */
b[i+1] = 2 * b[i]; /* S4 */ }
a. List all the dependencies by their types (TD: true-data, AD: anti-data, OD: output-data dependencies).
b. Show how Software Pipelining can exploit parallelism in this code to its fullest potential. How many functional units would be sufficient to achieve the fastest execution time?
c. Is this a parallel loop? If not, can it be parallelized? How?
a. Listing all the dependencies:
There are seven dependences in the C loop given:
1. The true dependence from S1 to S2 on a.
2. The Loop that carried true dependence from S4 to S1 on b.
3. The Loop that carried true dependence from S4 to S3 on b.
4. The Loop that carried true dependence from S4 to S4 on b.
5. The Loop that carried output dependence from S1 to S3 on a.
6. The Loop that carried antidependence from S1 to S3 on a.
7. The Loop that carried antidependence from S2 to S3 on a.
b&c. It is for a loop to be parallel, every iteration must be independent of all others iterations, which will not be the case in the code used for this problem code. Because dependences 1, 2, 3 and 4 are 'true' dependences, however they can not be removed through renaming. Along with these, as dependences 2, 3 and 4 are loop-carried, they will imply that iterations of the loop are not independent. All these factors together imply the loop that it can not be made parallel as the loop is written. Maybe after rewriting the loop it may be possible to find a loop that is functionally equivalent to the original loop that can be made parallel.