SHORT Single Precision MFLOP/s EVENTSET FIXC0 INSTR_RETIRED_ANY FIXC1 CPU_CLK_UNHALTED_CORE FIXC2 CPU_CLK_UNHALTED_REF PMC0 UOPS_RETIRED_SCALAR_SIMD PMC1 UOPS_RETIRED_PACKED_SIMD METRICS Runtime (RDTSC) [s] time Runtime unhalted [s] FIXC1*inverseClock Clock [MHz] 1.E-06*(FIXC1/FIXC2)/inverseClock CPI FIXC1/FIXC0 SP [MFLOP/s] (SSE assumed) 1.0E-06*(PMC1*4.0+PMC0)/time SP [MFLOP/s] (AVX assumed) 1.0E-06*(PMC1*8.0+PMC0)/time SP [MFLOP/s] (AVX512 assumed) 1.0E-06*(PMC1*16.0+PMC0)/time Packed [MUOPS/s] 1.0E-06*(PMC1)/time Scalar [MUOPS/s] 1.0E-06*PMC0/time LONG Formulas: SP [MFLOP/s] (SSE assumed) = 1.0E-06*(UOPS_RETIRED_PACKED_SIMD*4+UOPS_RETIRED_SCALAR_SIMD)/runtime SP [MFLOP/s] (AVX assumed) = 1.0E-06*(UOPS_RETIRED_PACKED_SIMD*8+UOPS_RETIRED_SCALAR_SIMD)/runtime SP [MFLOP/s] (AVX512 assumed) = 1.0E-06*(UOPS_RETIRED_PACKED_SIMD*16+UOPS_RETIRED_SCALAR_SIMD)/runtime Packed [MUOPS/s] = 1.0E-06*(UOPS_RETIRED_PACKED_SIMD)/runtime Scalar [MUOPS/s] = 1.0E-06*UOPS_RETIRED_SCALAR_SIMD/runtime - AVX/SSE scalar and packed single precision FLOP rates. The Xeon Phi (Knights Landing) provides no possibility to differentiate between double and single precision FLOP/s. Therefore, we only assume that the printed MFLOP/s value is for single-precision code. Moreover, there is no way to distinguish between SSE, AVX or AVX512 packed SIMD operations. Therefore, this group prints out the MFLOP/s for different SIMD techniques. WARNING: The events also count for integer arithmetics