The Queen's University of Belfast
Parallel Computer Centre


3 Dependencies


A dependency is a primary inhibitor of vectorization; it occurs when the results of an operation could differ between scalar and vector processing. A recurrence is a data dependency between loop iterations: it occurs when one loop iteration requires a value that was defined in a previous iteration.

Example

DO I = 2, 4

IB(I) = IA(I-1)

IA(I) = IC(I)

ENDDO

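This code contains a recurrence caused by the subscript (I-1): each iteration reads IA(I-1), which the previous iteration has just redefined. In the vector version every IA(I-1) would be read before any element of IA is redefined, so it would produce incorrect results.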
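The difference between the two orders of execution can be seen by running the loop both ways. The sketch below is illustrative only: the subroutine names SCALAR and VECTOR are hypothetical, and vector processing is emulated by completing the first statement for every I before starting the second, which is an assumption about how a vectorized version would behave rather than actual compiler output.

```fortran
      SUBROUTINE SCALAR(IA, IB, IC)
!     Scalar order: both statements run for I = 2, then 3, then 4.
      INTEGER IA(4), IB(4), IC(4), I
      DO 10 I = 2, 4
         IB(I) = IA(I-1)
         IA(I) = IC(I)
   10 CONTINUE
      END

      SUBROUTINE VECTOR(IA, IB, IC)
!     Emulated vector order (an assumption): the first statement
!     runs for every I before the second statement starts.
      INTEGER IA(4), IB(4), IC(4), I
      DO 10 I = 2, 4
         IB(I) = IA(I-1)
   10 CONTINUE
      DO 20 I = 2, 4
         IA(I) = IC(I)
   20 CONTINUE
      END
```

With IA = (1, 2, 3, 4) and IC = (10, 20, 30, 40), SCALAR gives IB(3) = 20 and IB(4) = 30, whereas VECTOR gives IB(3) = 2 and IB(4) = 3, because the emulated vector version reads IA before any element has been redefined.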

3.1 Testing for dependency

It is necessary to determine whether two appearances of an array in a loop can create a dependency conflict.

Example

DO 20 J = 2, M

Z(J) = YY(J) + TEMPA

R(J) = Z(J + 1)/TEMPB

20 CONTINUE

Array Z is both defined and referenced within the loop. The appearance where Z is defined, Z(J), is called the key definition; the appearance where Z is referenced, Z(J+1), is called the other reference. Using the key definition and the other reference, the loop can be tested in the following steps:

Relative to the array's key definition, determine whether the other reference is in the previous or the subsequent area, as illustrated in the diagram (that is, whether it appears before or after the key definition in the loop body).

In the example Z(J) is the key definition and the other reference, Z(J+1), is subsequent to the key definition.

Determine whether the subscript of the other reference is greater than or less than the subscript of the key definition, eg the subscript of Z(J+1) is greater than the subscript of Z(J).

Determine whether the array subscripts are incrementing or decrementing on each iteration of the loop, eg J is incrementing on each iteration.

The use of an array in a loop has the following characteristics:

An array's other reference is either Previous or Subsequent to its key definition

The subscript on the other reference is either Greater or Less than on the key definition

The array's subscript is either Incrementing or Decrementing

These characteristics can be abbreviated (P or S, G or L, I or D) to summarize a total of 8 possibilities for loop-dependency analysis. Four of these cases indicate dependencies that inhibit vectorization: SLD, SGI, PLI and PGD.
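The eight combinations can be captured in a small lookup routine. This is a minimal sketch, not part of any compiler: the function name INHIB, its argument names and the local variable AHEAD are hypothetical, and the rule it encodes is simply the list of four inhibiting cases (SLD, SGI, PLI, PGD) restated.

```fortran
      LOGICAL FUNCTION INHIB(POS, REL, DIR)
!     POS: 'P' or 'S' (other reference Previous / Subsequent)
!     REL: 'G' or 'L' (its subscript Greater / Less)
!     DIR: 'I' or 'D' (subscript Incrementing / Decrementing)
      CHARACTER*1 POS, REL, DIR
      LOGICAL AHEAD
!     AHEAD: the other reference touches an element the loop has
!     not reached yet (Greater with Incrementing, or Less with
!     Decrementing).
      AHEAD = (REL .EQ. 'G') .EQV. (DIR .EQ. 'I')
      IF (POS .EQ. 'S') THEN
!        Subsequent: SGI and SLD inhibit vectorization.
         INHIB = AHEAD
      ELSE
!        Previous: PLI and PGD inhibit vectorization.
         INHIB = .NOT. AHEAD
      END IF
      END
```

For example, INHIB('S', 'G', 'I') is true, while INHIB('P', 'G', 'I') is false. The caller must declare INHIB as LOGICAL.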

The following examples illustrate situations which inhibit vectorization:

SUBROUTINE SGI(A, B, C) ! Subsequent, Greater, Incrementing

DIMENSION A(100), B(100), C(100)

DO 10 I = 1, 99

A(I) = B(I)

C(I) = A(I+1)

10 CONTINUE

END

SUBROUTINE SLD(A, B, C) ! Subsequent, Less, Decrementing

DIMENSION A(100), B(100), C(100)

DO 10 I = 100, 2, -1

A(I) = B(I)

C(I) = A(I-1)

10 CONTINUE

END

SUBROUTINE PLI(A, B, C) ! Previous, Less, Incrementing

DIMENSION A(100), B(100), C(100)

DO 10 I = 2, 100

B(I) = A(I-1)

A(I) = C(I)

10 CONTINUE

END

SUBROUTINE PGD(A, B, C) ! Previous, Greater, Decrementing

DIMENSION A(100), B(100), C(100)

DO 10 I = 99, 1, -1

B(I) = A(I+1)

A(I) = C(I)

10 CONTINUE

END

The following examples of code do not inhibit vectorization:

SUBROUTINE SGD(A, B, C) ! Subsequent, Greater, Decrementing

DIMENSION A(100), B(100), C(100)

DO 10 I = 99, 1, -1

A(I) = B(I)

C(I) = A(I+1)

10 CONTINUE

END

SUBROUTINE SLI(A, B, C) ! Subsequent, Less, Incrementing

DIMENSION A(100), B(100), C(100)

DO 10 I = 2, 100

A(I) = B(I)

C(I) = A(I-1)

10 CONTINUE

END

SUBROUTINE PLD(A, B, C) ! Previous, Less, Decrementing

DIMENSION A(100), B(100), C(100)

DO 10 I = 100, 2, -1

B(I) = A(I-1)

A(I) = C(I)

10 CONTINUE

END

SUBROUTINE PGI(A, B, C) ! Previous, Greater, Incrementing

DIMENSION A(100), B(100), C(100)

DO 10 I = 1, 99

B(I) = A(I+1)

A(I) = C(I)

10 CONTINUE

END

3.2 Rigorous testing for dependency

The previous tests may not be sufficient to cover all possible situations. A more rigorous test takes into account the stride of the array indexes, as in the following example:

DO 20 J = 2, M, 2

Z(J) = Y(J) + TEMPA

R(J) = Z(J+1) / TEMPB

20 CONTINUE

A potential dependency exists in this loop because the Z array is both defined and referenced in the loop:

Z(J) is the key definition and Z(J+1) is the other reference.

Take these to be the most previous reference, ref1, and the most subsequent reference, ref2. Define index1 as the index of ref1 (the index of Z(J) is J), index2 as the index of ref2 (the index of Z(J+1) is J+1), and stride as the loop increment, which is 2 here.

If the sign of index2 minus index1 equals the sign of stride there may be a dependency, so proceed to the next step; otherwise no dependency exists, eg

index1 = J, index2 = J+1, stride = 2

The sign of index2 - index1 = the sign of ((J+1) - (J)) = the sign of (1), which is positive

The sign of stride = the sign of 2, which is positive

There may be a dependency in the example because the sign of index2 minus index1 equals the sign of stride (both positive), so it is necessary to do the next step.

If ((index2 minus index1) mod stride) equals 0 there is a dependency; otherwise no dependency exists, eg

index1 = J, index2 = J+1, stride = 2

(index2 - index1) mod stride = ((J+1) - (J)) mod 2 = 1

There is no data dependency as the remainder does not equal 0: the loop defines Z(2), Z(4), ... and references Z(3), Z(5), ..., so the two appearances never touch the same element.
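The two steps of the test can be sketched as a short function. The name DEPEND and its arguments are hypothetical. It assumes both subscripts have the form J plus a constant offset, so that index2 - index1 is simply the difference of the offsets, and that the stride is non-zero. A zero difference is treated here as no dependency, which is an assumption the text does not address.

```fortran
      LOGICAL FUNCTION DEPEND(IOFF1, IOFF2, ISTRID)
!     index1 = J + IOFF1 (ref1, the most previous reference)
!     index2 = J + IOFF2 (ref2, the most subsequent reference)
!     ISTRID = loop increment, assumed non-zero
      INTEGER IOFF1, IOFF2, ISTRID, IDIFF
      IDIFF = IOFF2 - IOFF1
      DEPEND = .FALSE.
!     Step 1: the sign of (index2 - index1) must equal the sign
!     of the stride, ie their product must be positive.
      IF (IDIFF*ISTRID .GT. 0) THEN
!        Step 2: the difference must be a multiple of the stride.
         DEPEND = MOD(IDIFF, ISTRID) .EQ. 0
      END IF
      END
```

For the loop above, DEPEND(0, 1, 2) is false. With a stride of 1, as in the loop of section 3.1, DEPEND(0, 1, 1) is true.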

3.3 Vectorizing recurrences

A recurrence can be vectorized by preventing the recurrence from affecting the loop's result. The threshold of a recurrence is the number of iterations that occur before a defined value is used. If the vector length is no greater than the threshold, the recurrence does not affect the results of the vectorized version, eg a recurrence whose threshold is 64 is fully vectorized. If the compiler can detect a threshold value k in the range 2 < k < 64, the loop is vectorized with a vector length of k. The following code illustrates a short vector loop.

SUBROUTINE SHORT_VL(A)

DIMENSION A(100)

DO 20 I = 7, 100

A(I) = A(I-6) + 1.0

20 CONTINUE

END

The recurrence here has a threshold of 6, so the loop is vectorizable with a shortened vector length of 6.
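One way to picture a vector length of 6 is to split the loop into strips of six iterations. The following is a hand-written sketch of that idea, not the code the compiler generates; the name SHORT6 and the strip variable IS are hypothetical.

```fortran
      SUBROUTINE SHORT6(A)
!     Same arithmetic as SHORT_VL, split into strips of at most 6.
      REAL A(100)
      INTEGER I, IS
      DO 30 IS = 7, 100, 6
!        A(I-6) always lies in an earlier strip (or in A(1)..A(6)),
!        so the inner loop carries no recurrence and each strip
!        could be processed as one vector of length 6 or less.
         DO 20 I = IS, MIN(IS + 5, 100)
            A(I) = A(I-6) + 1.0
   20    CONTINUE
   30 CONTINUE
      END
```

The outer loop still runs in order, so each strip sees the values stored by the strip before it, which is what keeps the results identical to the scalar loop.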

The compiler can include a run-time test to determine a safe vector length, which offers better performance on some loops that involve potential recurrences. A safe vector length is one which is less than or equal to the recurrence threshold. An important consideration when employing this option is that the safe vector length may turn out to be 1 or 2, which can degrade performance to the point of being slower than scalar code. The run-time test itself also involves significant overhead, so consider modifying any loop that performs poorly because of it.
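The idea of a run-time safe vector length can be sketched by hand for a loop whose recurrence distance K is known only at run time. Everything here is illustrative: the name SAFEVL, the work array T and the constant MAXVL = 64 (an assumed maximum vector length) are not taken from the compiler, K is assumed to be at least 1, and a vector operation is emulated by loading a whole strip before storing it.

```fortran
      SUBROUTINE SAFEVL(A, N, K)
!     Computes A(I) = A(I-K) + 1.0 for I = K+1, ..., N.
!     MAXVL = 64 is an assumed maximum vector length; K >= 1.
      INTEGER N, K, MAXVL, IVL, I, IS, M
      PARAMETER (MAXVL = 64)
      REAL A(N), T(MAXVL)
!     Run-time test: a safe length never exceeds the threshold K.
      IVL = MIN(K, MAXVL)
      DO 30 IS = K + 1, N, IVL
         M = MIN(IVL, N - IS + 1)
!        Emulated vector load and add for the whole strip.
         DO 10 I = 1, M
            T(I) = A(IS + I - 1 - K) + 1.0
   10    CONTINUE
!        Emulated vector store, only after every load is done.
         DO 20 I = 1, M
            A(IS + I - 1) = T(I)
   20    CONTINUE
   30 CONTINUE
      END
```

When K is 1 or 2 the strips hold only one or two elements, which is the case where the overhead can outweigh any gain from vector processing.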

3.4 Data dependency directives

Data dependency directives can be used to provide the compiler, CF77, with additional information so that code can be fully optimized.

CFPP$ NODEPCHK directs the compiler to ignore potential data dependencies in a loop. It is only safe to use when you are absolutely sure that no recurrence exists.

CFPP$ NOEQCHK directs the compiler to examine equivalence statements for recurrences (recurrences can be hidden in EQUIVALENCE statements and use of this directive suppresses any potentially unsafe transformations)

CFPP$ RELATION is used to provide additional information about array subscript ranges, which helps determine whether or not a loop is safe to vectorize, eg CFPP$ RELATION(J.GE.N). This is useful when you are unsure whether NODEPCHK is safe to use but have some information about the relative values of index variables.
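As an illustration of where such a directive might be used, the sketch below applies NODEPCHK to a loop with indirect addressing. Several things here are assumptions rather than documented CF77 behaviour: the subroutine name SCATAD is hypothetical, the directive is placed on the line immediately before the loop, and no scope qualifier is given, so the compiler manual should be consulted for the exact form. The code is fixed-form source, in which a line beginning with C in column 1 is a comment, so other compilers simply ignore the directive.

```fortran
      SUBROUTINE SCATAD(A, B, IX, N)
C     Adds B(I) into A(IX(I)).  The compiler cannot prove that IX
C     holds no repeated values, so it must assume a recurrence.
C     The caller guarantees that every IX(I) is distinct.
      INTEGER N, IX(N), I
      REAL A(*), B(N)
CFPP$ NODEPCHK
      DO 10 I = 1, N
         A(IX(I)) = A(IX(I)) + B(I)
   10 CONTINUE
      END
```

If IX did contain a repeated value, the directive would be unsafe: a vectorized version could lose one of the updates to the repeated element.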


All documents are the responsibility of, and copyright, © their authors and do not represent the views of The Parallel Computer Centre, nor of The Queen's University of Belfast.
Maintained by Alan Rea, email A.Rea@qub.ac.uk