mirror of
https://github.com/ClusterCockpit/cc-metric-collector.git
synced 2024-11-12 21:17:25 +01:00
51 lines
2.7 KiB
Plaintext
SHORT Overview of arithmetic and main memory performance

EVENTSET
PMC0 INST_RETIRED
PMC1 CPU_CYCLES
PMC2 BUS_READ_TOTAL_MEM
PMC3 BUS_WRITE_TOTAL_MEM
PMC4 FP_DP_FIXED_OPS_SPEC
PMC5 FP_DP_SCALE_OPS_SPEC

METRICS
Runtime (RDTSC) [s] time
CPI PMC1/PMC0
DP (FP) [MFLOP/s] 1E-06*(PMC4)/time
DP (FP+SVE128) [MFLOP/s] 1E-06*(((PMC5*128.0)/128.0)+PMC4)/time
DP (FP+SVE256) [MFLOP/s] 1E-06*(((PMC5*256.0)/128.0)+PMC4)/time
DP (FP+SVE512) [MFLOP/s] 1E-06*(((PMC5*512.0)/128.0)+PMC4)/time
Memory read bandwidth [MBytes/s] 1.0E-06*(PMC2)*256.0/time
Memory read data volume [GBytes] 1.0E-09*(PMC2)*256.0
Memory write bandwidth [MBytes/s] 1.0E-06*(PMC3)*256.0/time
Memory write data volume [GBytes] 1.0E-09*(PMC3)*256.0
Memory bandwidth [MBytes/s] 1.0E-06*(PMC2+PMC3)*256.0/time
Memory data volume [GBytes] 1.0E-09*(PMC2+PMC3)*256.0
Operational intensity (FP) PMC4/((PMC2+PMC3)*256.0)
Operational intensity (FP+SVE128) (((PMC5*128.0)/128.0)+PMC4)/((PMC2+PMC3)*256.0)
Operational intensity (FP+SVE256) (((PMC5*256.0)/128.0)+PMC4)/((PMC2+PMC3)*256.0)
Operational intensity (FP+SVE512) (((PMC5*512.0)/128.0)+PMC4)/((PMC2+PMC3)*256.0)

LONG
Formulas:
DP (FP) [MFLOP/s] = 1E-06*FP_DP_FIXED_OPS_SPEC/time
DP (FP+SVE128) [MFLOP/s] = 1.0E-06*(FP_DP_FIXED_OPS_SPEC+((FP_DP_SCALE_OPS_SPEC*128)/128))/time
DP (FP+SVE256) [MFLOP/s] = 1.0E-06*(FP_DP_FIXED_OPS_SPEC+((FP_DP_SCALE_OPS_SPEC*256)/128))/time
DP (FP+SVE512) [MFLOP/s] = 1.0E-06*(FP_DP_FIXED_OPS_SPEC+((FP_DP_SCALE_OPS_SPEC*512)/128))/time
Memory read bandwidth [MBytes/s] = 1.0E-06*(BUS_READ_TOTAL_MEM)*256.0/time
Memory read data volume [GBytes] = 1.0E-09*(BUS_READ_TOTAL_MEM)*256.0
Memory write bandwidth [MBytes/s] = 1.0E-06*(BUS_WRITE_TOTAL_MEM)*256.0/time
Memory write data volume [GBytes] = 1.0E-09*(BUS_WRITE_TOTAL_MEM)*256.0
Memory bandwidth [MBytes/s] = 1.0E-06*(BUS_READ_TOTAL_MEM+BUS_WRITE_TOTAL_MEM)*256.0/time
Memory data volume [GBytes] = 1.0E-09*(BUS_READ_TOTAL_MEM+BUS_WRITE_TOTAL_MEM)*256.0
Operational intensity (FP) = FP_DP_FIXED_OPS_SPEC/((BUS_READ_TOTAL_MEM+BUS_WRITE_TOTAL_MEM)*256.0)
Operational intensity (FP+SVE128) = (FP_DP_FIXED_OPS_SPEC+((FP_DP_SCALE_OPS_SPEC*128)/128))/((BUS_READ_TOTAL_MEM+BUS_WRITE_TOTAL_MEM)*256.0)
Operational intensity (FP+SVE256) = (FP_DP_FIXED_OPS_SPEC+((FP_DP_SCALE_OPS_SPEC*256)/128))/((BUS_READ_TOTAL_MEM+BUS_WRITE_TOTAL_MEM)*256.0)
Operational intensity (FP+SVE512) = (FP_DP_FIXED_OPS_SPEC+((FP_DP_SCALE_OPS_SPEC*512)/128))/((BUS_READ_TOTAL_MEM+BUS_WRITE_TOTAL_MEM)*256.0)
-
Profiling group to measure memory bandwidth and the double-precision FP rate for scalar and SVE
vector operations with different vector widths. The SVE metrics assume that all vector elements
are active. The cache line size is 256 bytes, which is why each memory event count is multiplied
by 256.0.
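
The formulas above can be checked by hand with a short Python sketch. The helper name
mem_dp_metrics, its parameters and the counter values below are invented for illustration and
are not part of LIKWID or cc-metric-collector; only the arithmetic is taken from this group.

```python
CACHE_LINE_BYTES = 256.0  # cache line size stated in this group


def mem_dp_metrics(fixed_ops, scale_ops, bus_reads, bus_writes, time_s, sve_bits=512):
    """Evaluate this group's formulas for one measurement (hypothetical helper)."""
    # FP_DP_SCALE_OPS_SPEC is scaled by (vector length in bits / 128), as in the formulas.
    flops = fixed_ops + scale_ops * sve_bits / 128.0
    # Each bus read/write event is counted as one 256-byte cache line.
    bytes_moved = (bus_reads + bus_writes) * CACHE_LINE_BYTES
    return {
        "dp_mflops": 1.0e-6 * flops / time_s,
        "mem_bw_mbytes_s": 1.0e-6 * bytes_moved / time_s,
        "mem_volume_gbytes": 1.0e-9 * bytes_moved,
        "operational_intensity": flops / bytes_moved,
    }


# Invented counter values, chosen only to keep the arithmetic easy to follow.
example = mem_dp_metrics(fixed_ops=1_000_000, scale_ops=2_000_000,
                         bus_reads=30_000, bus_writes=10_000, time_s=2.0)
```

With these made-up inputs and the SVE512 scaling, the sketch counts 9.0E+06 FLOPs and
10.24E+06 bytes, i.e. 4.5 MFLOP/s, 5.12 MBytes/s and an operational intensity of about 0.88.
Passing sve_bits=128 or sve_bits=256 gives the SVE128 and SVE256 variants.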