mirror of
https://github.com/ClusterCockpit/cc-metric-collector.git
synced 2025-01-15 16:49:05 +01:00
35 lines
1.5 KiB
Plaintext
SHORT Single Precision MFLOP/s

EVENTSET
FIXC0 INSTR_RETIRED_ANY
FIXC1 CPU_CLK_UNHALTED_CORE
FIXC2 CPU_CLK_UNHALTED_REF
PMC0 UOPS_RETIRED_SCALAR_SIMD
PMC1 UOPS_RETIRED_PACKED_SIMD

METRICS
Runtime (RDTSC) [s] time
Runtime unhalted [s] FIXC1*inverseClock
Clock [MHz] 1.E-06*(FIXC1/FIXC2)/inverseClock
CPI FIXC1/FIXC0
SP [MFLOP/s] (SSE assumed) 1.0E-06*(PMC1*4.0+PMC0)/time
SP [MFLOP/s] (AVX assumed) 1.0E-06*(PMC1*8.0+PMC0)/time
SP [MFLOP/s] (AVX512 assumed) 1.0E-06*(PMC1*16.0+PMC0)/time
Packed [MUOPS/s] 1.0E-06*(PMC1)/time
Scalar [MUOPS/s] 1.0E-06*PMC0/time

LONG
Formulas:
SP [MFLOP/s] (SSE assumed) = 1.0E-06*(UOPS_RETIRED_PACKED_SIMD*4+UOPS_RETIRED_SCALAR_SIMD)/runtime
SP [MFLOP/s] (AVX assumed) = 1.0E-06*(UOPS_RETIRED_PACKED_SIMD*8+UOPS_RETIRED_SCALAR_SIMD)/runtime
SP [MFLOP/s] (AVX512 assumed) = 1.0E-06*(UOPS_RETIRED_PACKED_SIMD*16+UOPS_RETIRED_SCALAR_SIMD)/runtime
Packed [MUOPS/s] = 1.0E-06*(UOPS_RETIRED_PACKED_SIMD)/runtime
Scalar [MUOPS/s] = 1.0E-06*UOPS_RETIRED_SCALAR_SIMD/runtime
-
AVX/SSE scalar and packed single-precision FLOP rates. The Xeon Phi (Knights Landing) provides
no way to differentiate between double- and single-precision FLOP/s. This group therefore
assumes that the counted operations come from single-precision code. Moreover, the events cannot
distinguish between SSE, AVX, and AVX512 packed SIMD operations, so the group prints one
MFLOP/s value for each assumed SIMD width (4, 8, or 16 single-precision elements per packed uop).
WARNING: The events also count integer arithmetic, so the printed MFLOP/s values can
overestimate the floating-point rate of code that performs integer SIMD operations.
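
The metric formulas of this group can be checked by hand with a small script. The following is a
minimal Python sketch, not part of LIKWID: the function name derive_metrics, the dictionary layout,
and the sample counter values are hypothetical and chosen only for illustration. It assumes that
inverseClock is the reciprocal of the nominal clock frequency in Hz and that time is the measured
runtime in seconds.

```python
def derive_metrics(counts, time, inverse_clock):
    """Evaluate this group's METRICS formulas for one measurement.

    counts        -- raw counter values keyed by counter name
                     (FIXC0, FIXC1, FIXC2, PMC0, PMC1)
    time          -- measured runtime in seconds
    inverse_clock -- assumed to be 1 / nominal clock frequency in Hz
    """
    instr = counts["FIXC0"]        # INSTR_RETIRED_ANY
    core_cycles = counts["FIXC1"]  # CPU_CLK_UNHALTED_CORE
    ref_cycles = counts["FIXC2"]   # CPU_CLK_UNHALTED_REF
    scalar = counts["PMC0"]        # UOPS_RETIRED_SCALAR_SIMD
    packed = counts["PMC1"]        # UOPS_RETIRED_PACKED_SIMD

    metrics = {
        "Runtime (RDTSC) [s]": time,
        "Runtime unhalted [s]": core_cycles * inverse_clock,
        "Clock [MHz]": 1.0e-06 * (core_cycles / ref_cycles) / inverse_clock,
        "CPI": core_cycles / instr,
        "Packed [MUOPS/s]": 1.0e-06 * packed / time,
        "Scalar [MUOPS/s]": 1.0e-06 * scalar / time,
    }

    # The hardware cannot report the SIMD width, so one value is computed
    # per assumed width: 4 (SSE), 8 (AVX), 16 (AVX512) elements per packed uop.
    for name, width in (("SSE", 4.0), ("AVX", 8.0), ("AVX512", 16.0)):
        key = f"SP [MFLOP/s] ({name} assumed)"
        metrics[key] = 1.0e-06 * (packed * width + scalar) / time

    return metrics


# Hypothetical sample values, not taken from a real measurement.
sample_counts = {
    "FIXC0": 2_000_000_000,
    "FIXC1": 1_300_000_000,
    "FIXC2": 1_300_000_000,
    "PMC0": 100_000_000,
    "PMC1": 50_000_000,
}

for metric, value in derive_metrics(sample_counts, 1.0, 1.0 / 1.3e9).items():
    print(f"{metric}: {value:.2f}")
```

The three SP [MFLOP/s] values differ only in the factor applied to the packed uop count, so the
spread between them grows with the share of packed uops in the measured code.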