cc-metric-collector/collectors/likwid/groups/arm64fx/MEM_DP.txt
2021-03-25 14:47:10 +01:00

51 lines
2.7 KiB
Plaintext

SHORT Overview of arithmetic and main memory performance
EVENTSET
PMC0 INST_RETIRED
PMC1 CPU_CYCLES
PMC2 BUS_READ_TOTAL_MEM
PMC3 BUS_WRITE_TOTAL_MEM
PMC4 FP_DP_FIXED_OPS_SPEC
PMC5 FP_DP_SCALE_OPS_SPEC
METRICS
Runtime (RDTSC) [s] time
CPI PMC1/PMC0
DP (FP) [MFLOP/s] 1E-06*(PMC4)/time
DP (FP+SVE128) [MFLOP/s] 1E-06*(((PMC5*128.0)/128.0)+PMC4)/time
DP (FP+SVE256) [MFLOP/s] 1E-06*(((PMC5*256.0)/128.0)+PMC4)/time
DP (FP+SVE512) [MFLOP/s] 1E-06*(((PMC5*512.0)/128.0)+PMC4)/time
Memory read bandwidth [MBytes/s] 1.0E-06*(PMC2)*256.0/time
Memory read data volume [GBytes] 1.0E-09*(PMC2)*256.0
Memory write bandwidth [MBytes/s] 1.0E-06*(PMC3)*256.0/time
Memory write data volume [GBytes] 1.0E-09*(PMC3)*256.0
Memory bandwidth [MBytes/s] 1.0E-06*(PMC2+PMC3)*256.0/time
Memory data volume [GBytes] 1.0E-09*(PMC2+PMC3)*256.0
Operational intensity (FP) PMC4/((PMC2+PMC3)*256.0)
Operational intensity (FP+SVE128) (((PMC5*128.0)/128.0)+PMC4)/((PMC2+PMC3)*256.0)
Operational intensity (FP+SVE256) (((PMC5*256.0)/128.0)+PMC4)/((PMC2+PMC3)*256.0)
Operational intensity (FP+SVE512) (((PMC5*512.0)/128.0)+PMC4)/((PMC2+PMC3)*256.0)
LONG
Formulas:
DP (FP) [MFLOP/s] = 1E-06*FP_DP_FIXED_OPS_SPEC/time
DP (FP+SVE128) [MFLOP/s] = 1.0E-06*(FP_DP_FIXED_OPS_SPEC+((FP_DP_SCALE_OPS_SPEC*128)/128))/time
DP (FP+SVE256) [MFLOP/s] = 1.0E-06*(FP_DP_FIXED_OPS_SPEC+((FP_DP_SCALE_OPS_SPEC*256)/128))/time
DP (FP+SVE512) [MFLOP/s] = 1.0E-06*(FP_DP_FIXED_OPS_SPEC+((FP_DP_SCALE_OPS_SPEC*512)/128))/time
Memory read bandwidth [MBytes/s] = 1.0E-06*(BUS_READ_TOTAL_MEM)*256.0/runtime
Memory read data volume [GBytes] = 1.0E-09*(BUS_READ_TOTAL_MEM)*256.0
Memory write bandwidth [MBytes/s] = 1.0E-06*(BUS_WRITE_TOTAL_MEM)*256.0/runtime
Memory write data volume [GBytes] = 1.0E-09*(BUS_WRITE_TOTAL_MEM)*256.0
Memory bandwidth [MBytes/s] = 1.0E-06*(BUS_READ_TOTAL_MEM+BUS_WRITE_TOTAL_MEM)*256.0/runtime
Memory data volume [GBytes] = 1.0E-09*(BUS_READ_TOTAL_MEM+BUS_WRITE_TOTAL_MEM)*256.0
Operational intensity (FP) = FP_DP_FIXED_OPS_SPEC/((BUS_READ_TOTAL_MEM+BUS_WRITE_TOTAL_MEM)*256.0)
Operational intensity (FP+SVE128) = (FP_DP_FIXED_OPS_SPEC+((FP_DP_SCALE_OPS_SPEC*128)/128)/((BUS_READ_TOTAL_MEM+BUS_WRITE_TOTAL_MEM)*256.0)
Operational intensity (FP+SVE256) = (FP_DP_FIXED_OPS_SPEC+((FP_DP_SCALE_OPS_SPEC*256)/128)/((BUS_READ_TOTAL_MEM+BUS_WRITE_TOTAL_MEM)*256.0)
Operational intensity (FP+SVE512) = (FP_DP_FIXED_OPS_SPEC+((FP_DP_SCALE_OPS_SPEC*512)/128)/((BUS_READ_TOTAL_MEM+BUS_WRITE_TOTAL_MEM)*256.0)
-
Profiling group to measure memory bandwidth and double-precision FP rate for scalar and SVE vector
operations with different widths. The events for the SVE metrics assumes that all vector elements
are active. The cache line size is 256 Byte.