Example
DO I = 2, 4
IB(I) = IA(I-1)
IA(I) = IC(I)
ENDDO
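This code contains a recurrence caused by the subscript (I-1): each iteration reads IA(I-1), the element that the preceding iteration has just assigned. A vector version would load all the IA(I-1) values before storing any IA(I), and hence would produce incorrect results.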
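The difference can be seen by running the loop both ways. The following is a minimal sketch in free-form Fortran, assuming illustrative data values that are not part of the original example; the "vector" form is modelled with array assignments, which complete one statement over the whole range before starting the next.

```fortran
program recurrence_demo
  implicit none
  integer :: i
  integer :: ia_s(4), ib_s(4), ia_v(4), ib_v(4), ic(4)

  ! Illustrative data (assumed, not from the original text).
  ic   = [100, 200, 300, 400]
  ia_s = [1, 2, 3, 4]
  ib_s = 0
  ia_v = ia_s
  ib_v = 0

  ! Scalar order: iteration i reads ia(i-1) after iteration i-1 stored it.
  do i = 2, 4
     ib_s(i) = ia_s(i-1)
     ia_s(i) = ic(i)
  end do

  ! Naive vector order: each statement runs over the whole range in turn,
  ! so every ia(i-1) is loaded before any ia(i) is stored.
  ib_v(2:4) = ia_v(1:3)
  ia_v(2:4) = ic(2:4)

  print '(a,4i5)', 'scalar IB:', ib_s
  print '(a,4i5)', 'vector IB:', ib_v

  ! Self-check: the two orders are expected to disagree for this loop.
  if (all(ib_s == ib_v)) error stop 'expected the two versions to differ'
end program recurrence_demo
```

With these values the scalar loop leaves IB(3) and IB(4) holding values copied from IC, while the vector form leaves them holding the original contents of IA.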
Example
DO 20 J = 2, M
Z(J) = YY(J) + TEMPA
R(J) = Z(J + 1)/TEMPB
20 CONTINUE
Array Z is both defined and referenced within the loop, ie one appearance is the key definition, where array Z is assigned, and the other appearance is the other reference, where array Z is read. Using the key definition and the other reference, the loop can be tested in the following steps:
Relative to the array's key definition, determine whether the other reference is in the previous or the subsequent area as illustrated in the diagram. In the example Z(J) is the key definition and Z(J+1), the other reference, is subsequent to the key definition.
Determine whether the subscript of the other reference is greater than or less than the subscript of the key definition eg the subscript of Z(J+1) is greater than the subscript of Z(J).
Determine whether the array subscripts are incrementing or decrementing on each iteration of the loop eg J is incrementing on each iteration.
The use of an array in a loop has the following characteristics:
An array's other reference is either Previous or Subsequent to its key definition
The subscript on the other reference is either Greater or Less than on the key definition
The array's subscript is either Incrementing or Decrementing
Abbreviating each characteristic to its initial letter gives a total of 8 possible cases for loop-dependency analysis. Four of these cases, namely SLD, SGI, PLI and PGD, indicate dependencies that inhibit vectorization.
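The eight combinations can be enumerated mechanically. The sketch below is one possible way to do so in free-form Fortran; the module and function names (loop_dependency, inhibits_vectorization) are hypothetical and simply look a three-letter code up against the four inhibiting cases named in the text.

```fortran
module loop_dependency
  implicit none
contains
  ! Returns .true. when the three-letter code is one of the four cases
  ! that the text lists as inhibiting vectorization.
  logical function inhibits_vectorization(code)
    character(len=3), intent(in) :: code
    inhibits_vectorization = any(code == ['SLD', 'SGI', 'PLI', 'PGD'])
  end function inhibits_vectorization
end module loop_dependency

program classify_cases
  use loop_dependency
  implicit none
  character(len=1), parameter :: pos(2) = ['P', 'S']  ! Previous / Subsequent
  character(len=1), parameter :: sub(2) = ['L', 'G']  ! Less / Greater
  character(len=1), parameter :: dir(2) = ['D', 'I']  ! Decrementing / Incrementing
  character(len=3) :: code
  integer :: i, j, k, n_inhibit

  n_inhibit = 0
  do i = 1, 2
     do j = 1, 2
        do k = 1, 2
           code = pos(i) // sub(j) // dir(k)
           if (inhibits_vectorization(code)) then
              n_inhibit = n_inhibit + 1
              print '(a,a)', code, ': inhibits vectorization'
           else
              print '(a,a)', code, ': does not inhibit vectorization'
           end if
        end do
     end do
  end do

  ! Self-check: exactly four of the eight cases should be flagged.
  if (n_inhibit /= 4) error stop 'expected exactly four inhibiting cases'
end program classify_cases
```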
The following examples illustrate situations which inhibit vectorization:
SUBROUTINE SGI(A, B, C) ! Subsequent, Greater, Incrementing
DIMENSION A(100), B(100), C(100)
DO 10 I = 1, 99
A(I) = B(I)
C(I) = A(I+1)
10 CONTINUE
END
SUBROUTINE SLD(A, B, C) ! Subsequent, Less, Decrementing
DIMENSION A(100), B(100), C(100)
DO 10 I = 100, 2, -1
A(I) = B(I)
C(I) = A(I-1)
10 CONTINUE
END
SUBROUTINE PLI(A, B, C) ! Previous, Less, Incrementing
DIMENSION A(100), B(100), C(100)
DO 10 I = 2, 100
B(I) = A(I-1)
A(I) = C(I)
10 CONTINUE
END
SUBROUTINE PGD(A, B, C) ! Previous, Greater, Decrementing
DIMENSION A(100), B(100), C(100)
DO 10 I = 99, 1, -1
B(I) = A(I+1)
A(I) = C(I)
10 CONTINUE
END
The following examples of code do not inhibit vectorization:
SUBROUTINE SGD(A, B, C) ! Subsequent, Greater, Decrementing
DIMENSION A(100), B(100), C(100)
DO 10 I = 99, 1, -1
A(I) = B(I)
C(I) = A(I+1)
10 CONTINUE
END
SUBROUTINE SLI(A, B, C) ! Subsequent, Less, Incrementing
DIMENSION A(100), B(100), C(100)
DO 10 I = 2, 100
A(I) = B(I)
C(I) = A(I-1)
10 CONTINUE
END
SUBROUTINE PLD(A, B, C) ! Previous, Less, Decrementing
DIMENSION A(100), B(100), C(100)
DO 10 I = 100, 2, -1
B(I) = A(I-1)
A(I) = C(I)
10 CONTINUE
END
SUBROUTINE PGI(A, B, C) ! Previous, Greater, Incrementing
DIMENSION A(100), B(100), C(100)
DO 10 I = 1, 99
B(I) = A(I+1)
A(I) = C(I)
10 CONTINUE
END
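To see why one case inhibits vectorization and a closely related one does not, the loops can be run in scalar order and in statement-by-statement vector order and the results compared. The following free-form sketch does this for SGI and SLI; the data values are assumptions chosen so that old and new values of A are easy to tell apart, and the vector order is modelled with array assignments.

```fortran
program compare_cases
  implicit none
  integer, parameter :: n = 100
  real :: a0(n), b(n)
  real :: a_s(n), c_s(n), a_v(n), c_v(n)
  integer :: i

  ! Illustrative data (assumed): original A values are negative and
  ! B values positive, so old and new values of A are distinguishable.
  do i = 1, n
     a0(i) = -real(i)
     b(i)  =  real(i)
  end do

  ! SGI in scalar order: A(i+1) has not been overwritten yet when read.
  a_s = a0
  c_s = 0.0
  do i = 1, n - 1
     a_s(i) = b(i)
     c_s(i) = a_s(i+1)
  end do
  ! SGI in vector order: all of A(1:n-1) is stored before C reads it.
  a_v = a0
  c_v = 0.0
  a_v(1:n-1) = b(1:n-1)
  c_v(1:n-1) = a_v(2:n)
  print '(a,l2)', 'SGI results match:', all(c_s == c_v)

  ! SLI in scalar order: A(i-1) was stored by the preceding iteration.
  a_s = a0
  c_s = 0.0
  do i = 2, n
     a_s(i) = b(i)
     c_s(i) = a_s(i-1)
  end do
  ! SLI in vector order: C also sees the newly stored values of A.
  a_v = a0
  c_v = 0.0
  a_v(2:n) = b(2:n)
  c_v(2:n) = a_v(1:n-1)
  print '(a,l2)', 'SLI results match:', all(c_s == c_v)
end program compare_cases
```

For SGI the two orders disagree, because the vector form reads values of A that the scalar loop would not yet have stored; for SLI both orders read the same values.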
DO 20 J = 2, M, 2
Z(J) = Y(J) + TEMPA
R(J) = Z(J+1) / TEMPB
20 CONTINUE
A potential dependency exists in this loop in that the Z array is both defined and referenced in the loop:
Z(J) is the key definition and Z(J+1) is the other reference.
Take the earlier of these two references in the loop body as ref1, ie the most previous reference, and the later as ref2, ie the most subsequent reference. Define index1 as the index of ref1 eg the index of Z(J) is J, define index2 as the index of ref2 eg the index of Z(J+1) is J+1, and define stride as the loop increment, which is 2 here.
If the sign of index2 minus index1 equals the sign of stride there may be a dependency, so proceed to the next step; otherwise no dependency exists eg
index1 = J, index2 = J+1, stride = 2
The sign of index2 - index1 = sign of ((J+1)-(J)) = the sign of (1), which is positive
The sign of stride = the sign of 2, which is positive
There may be a dependency in the example because the sign of index2 minus index1 equals the sign of stride (both positive), so it is necessary to do the next step.
If (index2 minus index1) mod stride equals 0 there is a dependency, otherwise no dependency exists eg
index1 = J, index2 = J+1, stride = 2
(index2 - index1) mod stride = ((J+1) - (J)) mod 2 = 1
There is no data dependency as the remainder does not equal 0.
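The two-step test above can be written as a small function. This is a sketch only, in free-form Fortran; the names stride_test, sgn and has_dependency are hypothetical, and the function takes the difference index2 minus index1 directly rather than the two subscript expressions.

```fortran
module stride_test
  implicit none
contains
  ! Sign of an integer as -1, 0 or +1.
  pure integer function sgn(k)
    integer, intent(in) :: k
    sgn = 0
    if (k > 0) sgn = 1
    if (k < 0) sgn = -1
  end function sgn

  ! diff   = index2 - index1 (subscript of ref2 minus subscript of ref1)
  ! stride = loop increment, assumed to be non-zero
  pure logical function has_dependency(diff, stride)
    integer, intent(in) :: diff, stride
    if (sgn(diff) /= sgn(stride)) then
       ! Step 1: the signs differ, so no dependency exists.
       has_dependency = .false.
    else
       ! Step 2: a dependency exists only if stride divides diff exactly.
       has_dependency = (mod(diff, stride) == 0)
    end if
  end function has_dependency
end module stride_test

program stride_demo
  use stride_test
  implicit none
  ! The example from the text: Z(J) and Z(J+1) with a stride of 2.
  print '(a,l2)', 'diff =  1, stride = 2:', has_dependency(1, 2)
  ! Hypothetical variations, not from the text: Z(J+2) and Z(J-2).
  print '(a,l2)', 'diff =  2, stride = 2:', has_dependency(2, 2)
  print '(a,l2)', 'diff = -2, stride = 2:', has_dependency(-2, 2)
end program stride_demo
```

Only the second call reports a dependency: with a stride of 2 the element Z(J+2) read in one iteration is the element defined in the next.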
SUBROUTINE SHORT_VL(A)
DIMENSION A(100)
DO 20 I = 7, 100
A(I) = A(I-6) + 1.0
20 CONTINUE
END
The recurrence here has a threshold of 6: the value stored in A(I) is not read again until 6 iterations later. The loop is therefore vectorizable with a shortened vector length of 6.
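The effect of the shortened vector length can be sketched by processing the loop in strips of at most 6 elements. The free-form program below is an illustration under assumed starting values, with array assignments standing in for vector operations; it is not a description of the code any particular compiler generates.

```fortran
program short_vl_demo
  implicit none
  integer, parameter :: n = 100, threshold = 6
  real :: a_s(n), a_v(n), a_full(n)
  integer :: i, lo, hi

  ! Illustrative starting values (assumed).
  do i = 1, n
     a_s(i) = real(i)
  end do
  a_v    = a_s
  a_full = a_s

  ! Reference result: the scalar loop from SHORT_VL.
  do i = 7, n
     a_s(i) = a_s(i-6) + 1.0
  end do

  ! Strip-mined form: each array assignment covers at most 6 elements,
  ! so everything it reads was stored by an earlier strip (or is original).
  do lo = 7, n, threshold
     hi = min(lo + threshold - 1, n)
     a_v(lo:hi) = a_v(lo-threshold:hi-threshold) + 1.0
  end do

  ! A single full-length vector operation reads only the original values.
  a_full(7:n) = a_full(1:n-6) + 1.0

  print '(a,l2)', 'strips of 6 match scalar:', all(a_s == a_v)
  print '(a,l2)', 'full length matches scalar:', all(a_s == a_full)
end program short_vl_demo
```

The strip-mined version reproduces the scalar result, whereas the full-length operation does not, since it never sees the values stored earlier in the same loop.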
The compiler can include a run-time test to determine a safe vector length, thus offering better performance on some loops that involve potential recurrences. A safe vector length is one which is less than or equal to the recurrence threshold. One important consideration when employing this option is that the safe vector length may turn out to be as short as 1 or 2, which can degrade performance to the point of being slower than scalar code. The run-time test itself also involves significant overhead, so you should consider modifying any loop that performs poorly because of it.
CFPP$ NODEPCHK directs the compiler to ignore potential data dependencies in a loop. It is only safe to use when you are absolutely sure that no recurrence exists.
CFPP$ NOEQCHK directs the compiler to examine EQUIVALENCE statements for recurrences. Recurrences can be hidden in EQUIVALENCE statements, and use of this directive suppresses any potentially unsafe transformations.
CFPP$ RELATION is used to provide additional information about array subscript ranges, which helps the compiler determine whether or not a loop is safe to vectorize eg CFPP$ RELATION(J.GE.N). This is useful when you are unsure whether NODEPCHK is safe to use but have some information about the relative values of index variables.