mirror of
https://github.com/ClusterCockpit/cc-metric-collector.git
synced 2025-01-28 06:45:16 +01:00
35 lines
1.5 KiB
Plaintext
SHORT Double Precision MFLOP/s

EVENTSET
FIXC0 INSTR_RETIRED_ANY
FIXC1 CPU_CLK_UNHALTED_CORE
FIXC2 CPU_CLK_UNHALTED_REF
PMC0 UOPS_RETIRED_SCALAR_SIMD
PMC1 UOPS_RETIRED_PACKED_SIMD

METRICS
Runtime (RDTSC) [s] time
Runtime unhalted [s] FIXC1*inverseClock
Clock [MHz] 1.E-06*(FIXC1/FIXC2)/inverseClock
CPI FIXC1/FIXC0
DP [MFLOP/s] (SSE assumed) 1.0E-06*((PMC1*2.0)+PMC0)/time
DP [MFLOP/s] (AVX assumed) 1.0E-06*((PMC1*4.0)+PMC0)/time
DP [MFLOP/s] (AVX512 assumed) 1.0E-06*((PMC1*8.0)+PMC0)/time
Packed [MUOPS/s] 1.0E-06*(PMC1)/time
Scalar [MUOPS/s] 1.0E-06*PMC0/time

LONG
Formulas:
DP [MFLOP/s] (SSE assumed) = 1.0E-06*(UOPS_RETIRED_PACKED_SIMD*2+UOPS_RETIRED_SCALAR_SIMD)/runtime
DP [MFLOP/s] (AVX assumed) = 1.0E-06*(UOPS_RETIRED_PACKED_SIMD*4+UOPS_RETIRED_SCALAR_SIMD)/runtime
DP [MFLOP/s] (AVX512 assumed) = 1.0E-06*(UOPS_RETIRED_PACKED_SIMD*8+UOPS_RETIRED_SCALAR_SIMD)/runtime
Packed [MUOPS/s] = 1.0E-06*(UOPS_RETIRED_PACKED_SIMD)/runtime
Scalar [MUOPS/s] = 1.0E-06*UOPS_RETIRED_SCALAR_SIMD/runtime
-
SSE/AVX/AVX512 scalar and packed double-precision FLOP rates. The Xeon Phi (Knights Landing)
provides no way to differentiate between double- and single-precision FLOP/s, so the printed
[MFLOP/s] values assume double-precision code. Moreover, the events cannot distinguish between
SSE, AVX and AVX512 packed SIMD operations, so this group prints one [MFLOP/s] value for each
assumed SIMD width (2, 4 or 8 double-precision elements per packed uop).
WARNING: The events also count integer arithmetic.
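The formulas above can be sketched in Python to show how the three estimates relate for a
single measurement. This is only an illustration: the function name dp_mflops and the counter
values are hypothetical, not LIKWID output or part of any LIKWID API.

```python
def dp_mflops(packed_uops, scalar_uops, runtime_s, lanes):
    """DP [MFLOP/s] estimate, assuming every packed uop processes `lanes` doubles."""
    return 1.0e-06 * (packed_uops * lanes + scalar_uops) / runtime_s


# Hypothetical raw counter readings for one measurement interval.
packed = 2_000_000_000   # PMC1: UOPS_RETIRED_PACKED_SIMD
scalar = 500_000_000     # PMC0: UOPS_RETIRED_SCALAR_SIMD
runtime = 2.0            # seconds (the "time" variable in METRICS)

# One estimate per assumed SIMD width, mirroring the three metric lines.
estimates = {
    name: dp_mflops(packed, scalar, runtime, lanes)
    for name, lanes in (("SSE", 2), ("AVX", 4), ("AVX512", 8))
}

for name, value in estimates.items():
    print(f"DP [MFLOP/s] ({name} assumed): {value:.1f}")
```

If the code uses only packed operations of a single width, only the matching estimate is
meaningful. For mixed code, the SSE and AVX512 values are the lower and upper ends of the range
this group can report, subject to the warning about integer arithmetic.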