parallel processing - How is Loop skewing making loop parallelizable? -


i reading loop tranformation techniques , have hard time understanding how loop skewing makes loop parallelizable here are 2 loops, initial 1 , seconde one, difference betwwen 2 ? 2 of them depends on previous iteration on both , j, make second loop parallisable ? or why can make interchange on second 1 , not first 1 ? both of them have dependencies on , j

for(int =2; < 5; i++){             for(int j =2; j < 5; j++){                 a[i][j] = a[i-1][j] + a[i][j-1];             }         } for(int =2; < 5; i++){             for(int j =2+i; j < 5+i; j++){                 a[i][j-i] = a[i-1][j-i] + a[i][j-1-i];             }         } 

i have no credit this, formatted , copied source, , hope you

[source: ece 1754, survey of loop transformation techniques, eric laforest, march 19, 2010]

it distance between 2 executive iterations, in first distance 1 between 1 outer loop , inner loop, there dependency between them.

loop skewing says: skews execution of inner loop relative outer one. useful if inner loop has dependence on outer loop prevents running in parallel. example, following code has dependency vector of {(1, 0),(0, 1)} .neither loop can parallelized since each carry dependency. interchanging loops merely interchange indices holding dependencies, accomplishing nothing.

do = 2, n-1 j = 2, m-1 a[i,j] =       (a[a-1,j] + a[i,j-1] + a[i+1,j] + a[i,j+1]) / 4 end end 

loop skewing implemented adding index of outer loop, times skewing factor f, bounds of inner loop , subtracting same value uses of inner loop index. subtraction keeps indices within new loop bounds, preserving correctness of program. effect on inner loop iterations shift position in array forwards f relative current outer loop, increasing dependency distance outer loop in same manner. in other words, given dependency vector (a, b), skewing transforms (a, f + b). since transformation preserves lexicographic order of dependencies, legal. applying skew factor of 1 above inner loop yields following code:

do = 2, n-1 j = 2+i, m-1+i a[i,j-i] = (a[a-1,j-i] + a[i,j-1-i] + a[i+1,j-i] + a[i,j+1-i]) / 4 end end 

this new code executes in same manner, dependencies of {(1, 1),(0, 1)}. both loops still carry dependency. however, interchanging loops @ point yields dependence vector {(1, 0),(1, 1)}, shown in following code:

do j = 4, m+n-2 = max(2, j-m+1), min(n-1, j-2) a[i,j-i] = (a[a-1,j-i] + a[i,j-1-i] + a[i+1,j-i] + a[i,j+1-i]) / 4 end end 

the inner loop can parallelized since has no loop-carried dependency on j, , dependency carried outer loop.note interchanging skewed loop bounds no longer straightforward: each loop must take account upper , lower bounds of other loop.


Comments

Popular posts from this blog

javascript - jquery or ashx not working -

opencv - DataType<cv::detail::deriv_type>::depth what is it used for -

python 3.x - Mapping specific letters onto a list of words -