mirror of
https://github.com/ClusterCockpit/cc-metric-collector.git
synced 2025-01-15 00:29:09 +01:00
SHORT Overview of arithmetic and main memory performance

EVENTSET
FIXC0 INSTR_RETIRED_ANY
FIXC1 CPU_CLK_UNHALTED_CORE
FIXC2 CPU_CLK_UNHALTED_REF
PWR0 PWR_PKG_ENERGY
PWR3 PWR_DRAM_ENERGY
PMC0 FP_ARITH_INST_RETIRED_128B_PACKED_DOUBLE
PMC1 FP_ARITH_INST_RETIRED_SCALAR_DOUBLE
PMC2 FP_ARITH_INST_RETIRED_256B_PACKED_DOUBLE
PMC3 FP_ARITH_INST_RETIRED_512B_PACKED_DOUBLE
MBOX0C0 CAS_COUNT_RD
MBOX0C1 CAS_COUNT_WR
MBOX1C0 CAS_COUNT_RD
MBOX1C1 CAS_COUNT_WR
MBOX2C0 CAS_COUNT_RD
MBOX2C1 CAS_COUNT_WR
MBOX3C0 CAS_COUNT_RD
MBOX3C1 CAS_COUNT_WR
MBOX4C0 CAS_COUNT_RD
MBOX4C1 CAS_COUNT_WR
MBOX5C0 CAS_COUNT_RD
MBOX5C1 CAS_COUNT_WR

METRICS
Runtime (RDTSC) [s] time
Runtime unhalted [s] FIXC1*inverseClock
Clock [MHz] 1.E-06*(FIXC1/FIXC2)/inverseClock
CPI FIXC1/FIXC0
Energy [J] PWR0
Power [W] PWR0/time
Energy DRAM [J] PWR3
Power DRAM [W] PWR3/time
DP [MFLOP/s] 1.0E-06*(PMC0*2.0+PMC1+PMC2*4.0+PMC3*8.0)/time
AVX DP [MFLOP/s] 1.0E-06*(PMC2*4.0+PMC3*8.0)/time
Packed [MUOPS/s] 1.0E-06*(PMC0+PMC2+PMC3)/time
Scalar [MUOPS/s] 1.0E-06*PMC1/time
Memory read bandwidth [MBytes/s] 1.0E-06*(MBOX0C0+MBOX1C0+MBOX2C0+MBOX3C0+MBOX4C0+MBOX5C0)*64.0/time
Memory read data volume [GBytes] 1.0E-09*(MBOX0C0+MBOX1C0+MBOX2C0+MBOX3C0+MBOX4C0+MBOX5C0)*64.0
Memory write bandwidth [MBytes/s] 1.0E-06*(MBOX0C1+MBOX1C1+MBOX2C1+MBOX3C1+MBOX4C1+MBOX5C1)*64.0/time
Memory write data volume [GBytes] 1.0E-09*(MBOX0C1+MBOX1C1+MBOX2C1+MBOX3C1+MBOX4C1+MBOX5C1)*64.0
Memory bandwidth [MBytes/s] 1.0E-06*(MBOX0C0+MBOX1C0+MBOX2C0+MBOX3C0+MBOX4C0+MBOX5C0+MBOX0C1+MBOX1C1+MBOX2C1+MBOX3C1+MBOX4C1+MBOX5C1)*64.0/time
Memory data volume [GBytes] 1.0E-09*(MBOX0C0+MBOX1C0+MBOX2C0+MBOX3C0+MBOX4C0+MBOX5C0+MBOX0C1+MBOX1C1+MBOX2C1+MBOX3C1+MBOX4C1+MBOX5C1)*64.0
Operational intensity (PMC0*2.0+PMC1+PMC2*4.0+PMC3*8.0)/((MBOX0C0+MBOX1C0+MBOX2C0+MBOX3C0+MBOX4C0+MBOX5C0+MBOX0C1+MBOX1C1+MBOX2C1+MBOX3C1+MBOX4C1+MBOX5C1)*64.0)

LONG
Formulas:
Power [W] = PWR_PKG_ENERGY/runtime
Power DRAM [W] = PWR_DRAM_ENERGY/runtime
DP [MFLOP/s] = 1.0E-06*(FP_ARITH_INST_RETIRED_128B_PACKED_DOUBLE*2+FP_ARITH_INST_RETIRED_SCALAR_DOUBLE+FP_ARITH_INST_RETIRED_256B_PACKED_DOUBLE*4+FP_ARITH_INST_RETIRED_512B_PACKED_DOUBLE*8)/runtime
AVX DP [MFLOP/s] = 1.0E-06*(FP_ARITH_INST_RETIRED_256B_PACKED_DOUBLE*4+FP_ARITH_INST_RETIRED_512B_PACKED_DOUBLE*8)/runtime
Packed [MUOPS/s] = 1.0E-06*(FP_ARITH_INST_RETIRED_128B_PACKED_DOUBLE+FP_ARITH_INST_RETIRED_256B_PACKED_DOUBLE+FP_ARITH_INST_RETIRED_512B_PACKED_DOUBLE)/runtime
Scalar [MUOPS/s] = 1.0E-06*FP_ARITH_INST_RETIRED_SCALAR_DOUBLE/runtime
Memory read bandwidth [MBytes/s] = 1.0E-06*(SUM(CAS_COUNT_RD))*64.0/runtime
Memory read data volume [GBytes] = 1.0E-09*(SUM(CAS_COUNT_RD))*64.0
Memory write bandwidth [MBytes/s] = 1.0E-06*(SUM(CAS_COUNT_WR))*64.0/runtime
Memory write data volume [GBytes] = 1.0E-09*(SUM(CAS_COUNT_WR))*64.0
Memory bandwidth [MBytes/s] = 1.0E-06*(SUM(CAS_COUNT_RD)+SUM(CAS_COUNT_WR))*64.0/runtime
Memory data volume [GBytes] = 1.0E-09*(SUM(CAS_COUNT_RD)+SUM(CAS_COUNT_WR))*64.0
Operational intensity = (FP_ARITH_INST_RETIRED_128B_PACKED_DOUBLE*2+FP_ARITH_INST_RETIRED_SCALAR_DOUBLE+FP_ARITH_INST_RETIRED_256B_PACKED_DOUBLE*4+FP_ARITH_INST_RETIRED_512B_PACKED_DOUBLE*8)/((SUM(CAS_COUNT_RD)+SUM(CAS_COUNT_WR))*64.0)
--
Profiling group to measure memory bandwidth drawn by all cores of a socket.
Since this group is based on Uncore events, it can only be measured on
a per-socket basis. Also outputs total data volume transferred from main memory.
Reports SSE scalar and packed double precision FLOP rates. Also reports on packed
AVX (256-bit) and AVX-512 double precision instructions.
The operational intensity is calculated using the FP values of the cores and the
memory data volume of the whole socket. The actual operational intensity for
multiple CPUs can be found in the statistics table in the Sum column.
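Worked example (not part of the group definition): the METRICS formulas can be
checked by hand with a small Python sketch. The counter values, the 2-second
runtime and the helper names (dp_flops, memory_bytes, derive_metrics) are
hypothetical and chosen only for illustration. The sketch assumes that every
counter has already been read for one socket over the same interval.

```python
# Sketch of the DP FLOP, memory bandwidth and operational intensity formulas.
# All counter values below are made up for illustration.

CACHE_LINE_BYTES = 64.0  # the formulas multiply each CAS event by 64.0 bytes


def dp_flops(c):
    """FLOP count: 128-bit packed = 2, scalar = 1, 256-bit = 4, 512-bit = 8."""
    return c["PMC0"] * 2.0 + c["PMC1"] + c["PMC2"] * 4.0 + c["PMC3"] * 8.0


def memory_bytes(c, channels=6):
    """Bytes moved: sum CAS_COUNT_RD (MBOXnC0) and CAS_COUNT_WR (MBOXnC1)."""
    reads = sum(c[f"MBOX{n}C0"] for n in range(channels))
    writes = sum(c[f"MBOX{n}C1"] for n in range(channels))
    return (reads + writes) * CACHE_LINE_BYTES


def derive_metrics(c, runtime):
    """Return a subset of the group's metrics for one measurement interval."""
    flops = dp_flops(c)
    volume = memory_bytes(c)
    return {
        "DP [MFLOP/s]": 1.0e-06 * flops / runtime,
        "Memory bandwidth [MBytes/s]": 1.0e-06 * volume / runtime,
        "Memory data volume [GBytes]": 1.0e-09 * volume,
        "Operational intensity": flops / volume,
    }


# Hypothetical readings: four FP counters and six memory channels.
counters = {"PMC0": 1_000_000, "PMC1": 2_000_000,
            "PMC2": 3_000_000, "PMC3": 4_000_000}
for n in range(6):
    counters[f"MBOX{n}C0"] = 1_000_000  # CAS_COUNT_RD on channel n
    counters[f"MBOX{n}C1"] = 500_000    # CAS_COUNT_WR on channel n

metrics = derive_metrics(counters, runtime=2.0)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these inputs the FLOP count is 48.0E+06 and the data volume is 576.0E+06
bytes, so the operational intensity is about 0.083 FLOP/Byte. Note that the FP
counters are per core while the MBOX counters are per socket, which is why the
Sum column of the statistics table gives the meaningful value for multiple CPUs.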