does the program use significant resources, and which ones?
is the program destined for frequent long-term use?
how can one get the maximum return (execution speedup) on investment (your time)?
learning experience to apply to the development phase of future programs
gather execution performance statistics for properly executing code, i.e. the job accounting and procstat utilities give information which should be retained and compared with the results obtained after the optimization efforts
maintain separately a working version of the program
do not begin optimization until properly executing code has been obtained
operate on a computationally short model which is representative of the whole
determine if the program is I/O bound
determine if the program is vectorized
determine if the program efficiently utilises memory
compare loopmark listings - run the preprocessor (fpp) to determine additional speedup, e.g.
cf77 -Zv -Wf"-em" *.f
verify the results produced by fpp
check loopmark for vectorization information
vectorization is often the single most effective method of reducing overall program execution time
look for high percentage loops or regions
look for high percentage regions which are not included in a vectorized inner loop
look for high percentage regions which are vectorized but include complex index calculations and/or divisions
look for high percentage regions which are vectorized, but include IF statements, and/or GOTO branches
NOTE
You may have to restructure the code so that the significant work takes place in the innermost loop rather than outside it. Complex index calculations can cause problems. Loops containing IF statements will generally run slower than the same loops without them.
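The note on IF statements can be illustrated with a minimal sketch. The routine names SCALE1 and SCALE2 and the MODE flag are hypothetical, chosen only for this example. SCALE1 tests a loop-invariant condition on every pass of the inner loop; SCALE2 makes the test once and leaves two branch-free inner loops. Whether a given compiler vectorizes either form is an assumption that should be checked against the loopmark listing.

```fortran
!     Hypothetical example: same result, IF inside vs. outside the loop.
      SUBROUTINE SCALE1(A, N, M, MODE)
!     Original form: MODE is tested on every pass of the inner loop.
      IMPLICIT NONE
      INTEGER N, M, MODE, I, J
      REAL A(N, M)
      DO 20 J = 1, M
         DO 10 I = 1, N
            IF (MODE .EQ. 1) THEN
               A(I, J) = 2.0 * A(I, J)
            ELSE
               A(I, J) = A(I, J) + 1.0
            END IF
   10    CONTINUE
   20 CONTINUE
      RETURN
      END

      SUBROUTINE SCALE2(A, N, M, MODE)
!     Restructured form: MODE does not change inside the loops, so
!     it is tested once and each inner loop is free of branches.
      IMPLICIT NONE
      INTEGER N, M, MODE, I, J
      REAL A(N, M)
      IF (MODE .EQ. 1) THEN
         DO 40 J = 1, M
            DO 30 I = 1, N
               A(I, J) = 2.0 * A(I, J)
   30       CONTINUE
   40    CONTINUE
      ELSE
         DO 60 J = 1, M
            DO 50 I = 1, N
               A(I, J) = A(I, J) + 1.0
   50       CONTINUE
   60    CONTINUE
      END IF
      RETURN
      END
```

The restructuring only applies when the condition does not depend on the loop indices; a test on A(I, J) itself cannot be moved out this way.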
CALL, RETURN, STOP, or PAUSE statements
function references
branching statements (IF, GOTO)
data dependencies and recurrences
ambiguous subscript references
long loops - the aggress option of -Wf can be used, but the results have to be verified, e.g.
cf77 -Wf"-o aggress" *.f
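As a minimal sketch of the "data dependencies and recurrences" item above, the two hypothetical routines below (RECUR and NODEP are names invented for this example) differ only in what the right-hand side reads. In RECUR each pass needs the value stored by the previous pass, so the passes cannot be carried out independently. In NODEP the right-hand side reads only B and C, so, assuming A does not overlap B or C in memory, no pass depends on another.

```fortran
!     Hypothetical example: a recurrence next to an independent loop.
      SUBROUTINE RECUR(A, B, N)
!     Recurrence: A(I) needs A(I-1), which the previous pass has
!     just stored, so the passes must run one after another.
      IMPLICIT NONE
      INTEGER N, I
      REAL A(N), B(N)
      DO 10 I = 2, N
         A(I) = A(I-1) + B(I)
   10 CONTINUE
      RETURN
      END

      SUBROUTINE NODEP(A, B, C, N)
!     No dependence: A(I) is built from B and C only, so the order
!     in which the passes are carried out does not matter.
      IMPLICIT NONE
      INTEGER N, I
      REAL A(N), B(N), C(N)
      DO 10 I = 2, N
         A(I) = B(I-1) + C(I)
   10 CONTINUE
      RETURN
      END
```

The loopmark listing should be consulted to confirm how each loop was actually treated; the sketch only shows the source pattern to look for.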
inline often called subroutines
promote scalars to vectors
use parameter statements to define constants especially those which define loop lengths
switch loop dimensions when appropriate
perform optimization comparisons on a short run
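Two of the items above, PARAMETER statements for loop lengths and switching loop dimensions, can be sketched together. The routine names ADDSHT and ADDLNG and the sizes NLONG and NSHORT are hypothetical, and the sketch assumes that "switching" means making the long dimension the innermost loop. Both routines compute the same result; only the loop order differs.

```fortran
!     Hypothetical example: loop lengths fixed by PARAMETER, with the
!     long dimension moved from the outer loop to the inner loop.
      SUBROUTINE ADDSHT(A, B)
!     Before: the inner loop runs over the short dimension, so each
!     inner loop has only NSHORT passes.
      IMPLICIT NONE
      INTEGER NLONG, NSHORT
      PARAMETER (NLONG = 1000, NSHORT = 4)
      REAL A(NLONG, NSHORT), B(NLONG, NSHORT)
      INTEGER I, J
      DO 20 I = 1, NLONG
         DO 10 J = 1, NSHORT
            A(I, J) = A(I, J) + B(I, J)
   10    CONTINUE
   20 CONTINUE
      RETURN
      END

      SUBROUTINE ADDLNG(A, B)
!     After: the loops are switched, so the inner loop has NLONG
!     passes and steps through the first subscript, which is
!     contiguous in memory for a Fortran array.
      IMPLICIT NONE
      INTEGER NLONG, NSHORT
      PARAMETER (NLONG = 1000, NSHORT = 4)
      REAL A(NLONG, NSHORT), B(NLONG, NSHORT)
      INTEGER I, J
      DO 20 J = 1, NSHORT
         DO 10 I = 1, NLONG
            A(I, J) = A(I, J) + B(I, J)
   10    CONTINUE
   20 CONTINUE
      RETURN
      END
```

Because the loop lengths are PARAMETER constants, they are known at compile time. Any gain from the switched order is an expectation, not a measurement, and should be confirmed on a short run as the list recommends.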
cross reference listings (cf77 -Wf"-ex" prog.f creates the prog.l file)
program listings and utilities - Loopmark listing and ftref
Dynamic Analysis Tools - give run-time information
ja - job accounting
flowtrace - subroutine level analysis of the program giving the dynamic calling tree
prof and profview - sample at a fixed time interval, so the resulting statistics are probabilistic rather than exact
procstat and procrpt
CRAY TOOLS SUMMARY