cc-metric-collector/collectors/likwid/groups/arm64fx/FP_PIPE.txt
2021-03-25 14:47:10 +01:00

SHORT Utilization of FP pipelines
EVENTSET
PMC0 INST_RETIRED
PMC1 CPU_CYCLES
PMC2 FLA_VAL
PMC3 FLA_VAL_PRD_CNT
PMC4 FLB_VAL
PMC5 FLB_VAL_PRD_CNT
METRICS
Runtime (RDTSC) [s] time
CPI PMC1/PMC0
FP operation pipeline A busy rate [%] (PMC2/PMC1)*100.0
FP pipeline A active element rate [%] (PMC3/(PMC2*16))*100.0
FP operation pipeline B busy rate [%] (PMC4/PMC1)*100.0
FP pipeline B active element rate [%] (PMC5/(PMC4*16))*100.0
LONG
Formulas:
CPI = CPU_CYCLES/INST_RETIRED
FP operation pipeline A busy rate [%] = (FLA_VAL/CPU_CYCLES)*100.0
FP pipeline A active element rate [%] = (FLA_VAL_PRD_CNT/(FLA_VAL*16))*100.0
FP operation pipeline B busy rate [%] = (FLB_VAL/CPU_CYCLES)*100.0
FP pipeline B active element rate [%] = (FLB_VAL_PRD_CNT/(FLB_VAL*16))*100.0
-
FLx_VAL: This event counts the valid cycles of the FLx pipeline.
FLx_VAL_PRD_CNT: This event counts the number of 1's in the predicate bits of
requests in the FLx pipeline, where the count is corrected so that it becomes
16 when all bits are 1.
Each predicate mask therefore has 16 slots, so FLA and FLB each provide 16
slots per valid cycle. FLA is partly used by other instructions, such as SVE
stores.
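
Worked example (illustrative only, not part of the group definition): the
formulas above can be checked by hand with a small Python sketch. The function
name fp_pipe_metrics, its argument names, the sample counter values and the
choice to return 0.0 for a zero denominator are all assumptions made for this
example; only the formulas and the factor of 16 come from the group above.

```python
def fp_pipe_metrics(inst_retired, cpu_cycles,
                    fla_val, fla_val_prd_cnt,
                    flb_val, flb_val_prd_cnt,
                    slots_per_cycle=16):
    """Evaluate the FP_PIPE formulas from raw counter values.

    slots_per_cycle=16 mirrors the constant used in the group formulas.
    """
    def ratio(num, den):
        # Assumption for this sketch: report 0.0 instead of dividing by zero.
        return num / den if den else 0.0

    return {
        "CPI": ratio(cpu_cycles, inst_retired),
        "FP pipeline A busy rate [%]": ratio(fla_val, cpu_cycles) * 100.0,
        "FP pipeline A active element rate [%]":
            ratio(fla_val_prd_cnt, fla_val * slots_per_cycle) * 100.0,
        "FP pipeline B busy rate [%]": ratio(flb_val, cpu_cycles) * 100.0,
        "FP pipeline B active element rate [%]":
            ratio(flb_val_prd_cnt, flb_val * slots_per_cycle) * 100.0,
    }


if __name__ == "__main__":
    # Made-up counter values, chosen only to make the arithmetic easy to follow.
    metrics = fp_pipe_metrics(inst_retired=2000, cpu_cycles=4000,
                              fla_val=1000, fla_val_prd_cnt=8000,
                              flb_val=500, flb_val_prd_cnt=8000)
    for name, value in metrics.items():
        print(f"{name}: {value:.2f}")
```

With these made-up values, pipeline A is valid in a quarter of all cycles and
uses half of its 16 slots on average, while pipeline B is valid less often but
runs with all predicate bits set.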