cf77 -Zp -Wd"-l listfile" prog.f OR more simply fpp -l listfile prog.f
This gives a routine-by-routine breakdown of what fpp found and what action it took. Each loop is marked according to whether it is vectorizable, parallelisable, or inhibited from optimization. The routine is then shown in translated form, which may reveal code restructuring, routine inlining or loop unrolling. Finally, a loop summary is printed showing how many loops were optimized and indicating which problems remain. These problems can then be tackled with compiler directives or command-line options (see later notes).
cf77 -Wf"-e m" prog.f OR cft77 -e m prog.f
This gives what is called a "Loopmark" listing of the code, similar to the fpp listing discussed above. More options may be enabled, such as a cross-reference listing, but this makes the listing file much larger while adding little useful information.
xbrowse &
cf77 -Wf"-esx" -c progname.f
ftref -c full -tfull progname.l > progname.xref
cf77 -Wf"-ez" -ltrace prog.f (tracing must be enabled by linking with the libtrace library)
jt a.out
jumpview -Lumch > jump.report (non-interactive version)
The -L flag requests a non-interactive report; use jumpview by itself for an interactive X Windows session.
Jumptracing requires recompilation and reloading. It also incurs significant CPU overhead.
Although the UNICOS Performance Utilities Reference Manual claims that Jumptracing works with multitasked codes, the author has found otherwise: the NCPUS environment variable should be set to 1.
Jumptracing timings are exact and reproducible, in contrast to the operating system timings in Flowtracing and probabilistic timings in Profiling.
Unlike flowview, jumpview gives times for library routines.
In practice, jumptracing helps us to see the ratio of vector to scalar operations in each subroutine. Combined with the megaflops rating, this gives what is probably the clearest indicator of vector performance.
Example
JUMPTRACE DATA REPORT
Showing Routines Sorted by Total CPU Time
(CPU Times are Shown in Seconds)

Name       Called      Time  Avg Time   EX %  ACM %  Mflops  Mmems
--------  -------  --------  --------  -----  -----  ------  -----
$WFV$%        303  1.56E-02  5.13E-05   20.1   20.1     0.0    2.0  *****
NOCV%         306  8.94E-03  2.92E-05   11.5   31.6     0.0    1.7  **
fwrite        316  8.89E-03  2.81E-05   11.5   43.1     0.8    2.4  **
$WFI          303  6.35E-03  2.10E-05    8.2   51.3     0.0    5.7  **
f_wch         309  6.16E-03  1.99E-05    8.0   59.2     0.0    3.3  *
$gtdsp        306  4.73E-03  1.54E-05    6.1   65.3     0.0    2.3  *
_xflsbuf      331  4.11E-03  1.24E-05    5.3   70.6     0.6    2.3  *
__flsbu       309  3.58E-03  1.16E-05    4.6   75.2     0.7    2.2  *
$pack         306  2.61E-03  8.52E-06    3.4   78.6     0.0    3.3
$FFS          303  2.11E-03  6.96E-06    2.7   81.3     0.0    1.3
$WFF          303  1.91E-03  6.30E-06    2.5   83.8     0.0    3.5
EDMSS%        274  1.77E-03  6.46E-06    2.3   86.0     1.7    0.0
write         331  1.59E-03  4.80E-06    2.0   88.1     0.0    1.7
etc
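A report like the one above can be post-processed to pick out the routines that account for most of the CPU time. The following is only a rough sketch in Python, not part of the Cray tools: the helper names parse_rows and hot_routines are hypothetical, and the column order (name, calls, total time, average time, EX %, ACM %, Mflops, Mmems) is assumed to match the sample shown here.

```python
# Sketch only: parse_rows and hot_routines are hypothetical helpers,
# and the column layout is assumed to match the sample report above.

# First four rows copied from the report; the trailing asterisks are the
# report's bar chart and are ignored by the parser.
REPORT = """\
$WFV$%        303  1.56E-02  5.13E-05   20.1   20.1     0.0    2.0  *****
NOCV%         306  8.94E-03  2.92E-05   11.5   31.6     0.0    1.7  **
fwrite        316  8.89E-03  2.81E-05   11.5   43.1     0.8    2.4  **
$WFI          303  6.35E-03  2.10E-05    8.2   51.3     0.0    5.7  **
"""


def parse_rows(text):
    """Turn report lines into dicts, skipping headers and rules."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # blank line, title line, or truncated row
        try:
            row = {
                "name": fields[0],
                "called": int(fields[1]),
                "time": float(fields[2]),
                "avg_time": float(fields[3]),
                "ex_pct": float(fields[4]),
                "acm_pct": float(fields[5]),
                "mflops": float(fields[6]),
                "mmems": float(fields[7]),
            }
        except ValueError:
            continue  # column header or dashed rule, not a data row
        rows.append(row)
    return rows


def hot_routines(rows, threshold=50.0):
    """Names of the costliest routines whose EX % sums to at least threshold."""
    picked = []
    total = 0.0
    for row in sorted(rows, key=lambda r: r["time"], reverse=True):
        picked.append(row["name"])
        total += row["ex_pct"]
        if total >= threshold:
            break
    return picked


rows = parse_rows(REPORT)
print(hot_routines(rows, 50.0))
```

In this sample the routines that dominate the listing all report Mflops at or near zero, which suggests they are doing formatted output rather than floating-point work.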