On an IBM Power4 architecture (such as HPCx) different event sets
have to be selected. When using HPMCOUNT, use the -g option to
specify the event set; when running code instrumented with LIBHPM,
use the HPM_EVENT_SET environment variable.
Luiz DeRose recommends five different sets as particularly useful. These are discussed in the following subsections. Each set consists of a number of raw counters and a set of derived metrics, which are often easier to judge than the raw counters. The -l option of HPMCOUNT provides a complete listing of the raw counters for all sets.
Several of these derived metrics are rates based on either wall-clock time or user time. When investigating the entire program using HPMCOUNT, the wall-clock time will include various overheads, such as the time needed for starting the application, initialisation routines, etc. For the short test jobs normally used in performance analysis in particular, these overheads take up a large portion of the total execution time. Derived metrics based on wall-clock time will therefore be severely distorted in comparison to long production runs, which might lead to misleading conclusions. When instrumenting the application code using LIBHPM, it is easy to exclude the overheads from the measured code segments, so that derived metrics based on wall-clock time produce meaningful results even in short runs.
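The effect of a fixed overhead on a wall-clock rate can be illustrated with a small calculation. The following is a minimal sketch, not part of HPMCOUNT or LIBHPM: the function name wall_clock_rate, the sustained rate of 1e9 operations per second and the 5 second overhead are all hypothetical values chosen for illustration.

```python
def wall_clock_rate(operations: float, kernel_seconds: float,
                    overhead_seconds: float) -> float:
    """Rate in operations per second, measured over the whole run.

    The wall-clock time of the whole program is the time spent in the
    code of interest plus the fixed start-up/initialisation overhead.
    """
    return operations / (kernel_seconds + overhead_seconds)


# Hypothetical numbers: the kernel sustains 1e9 operations per second,
# and start-up plus initialisation always costs 5 seconds.
KERNEL_RATE = 1.0e9
OVERHEAD_SECONDS = 5.0

# Short test job: 5 s of useful work.
short_rate = wall_clock_rate(KERNEL_RATE * 5.0, 5.0, OVERHEAD_SECONDS)

# Long production run: 5000 s of useful work.
long_rate = wall_clock_rate(KERNEL_RATE * 5000.0, 5000.0, OVERHEAD_SECONDS)

print(f"short run: {short_rate / KERNEL_RATE:.1%} of the kernel rate")
print(f"long run:  {long_rate / KERNEL_RATE:.1%} of the kernel rate")
```

With these assumed numbers the short job reports only half of the rate the kernel actually sustains, while the long run is almost unaffected.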
The user time is only available in those sets which feature the
PM_CYC counter. PM_CYC counts the processor cycles
consumed by the application. The user time is calculated by dividing
the number of cycles by the processor frequency.
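
The division described above can be written out as a short sketch. The function name user_time_seconds, the cycle count and the clock frequency of 1.3 GHz are assumptions made for illustration; substitute the PM_CYC value reported for the run and the actual frequency of the processor used.

```python
def user_time_seconds(pm_cyc: int, clock_hz: float) -> float:
    """Convert a PM_CYC cycle count into user time in seconds."""
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    return pm_cyc / clock_hz


# Hypothetical values: a run that consumed 2.6e9 cycles on a processor
# assumed to be clocked at 1.3 GHz.
ASSUMED_CLOCK_HZ = 1.3e9
example_cycles = 2_600_000_000

print(f"user time: {user_time_seconds(example_cycles, ASSUMED_CLOCK_HZ):.3f} s")
```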