mirror of
https://github.com/ClusterCockpit/cc-metric-collector.git
synced 2025-08-01 00:56:26 +02:00
Add likwid collector
17
collectors/likwid/groups/pentiumm/BRANCH.txt
Normal file
@@ -0,0 +1,17 @@
SHORT Branch prediction miss rate/ratio

EVENTSET
PMC0 BR_INST_EXEC
PMC1 BR_MISSP_EXEC

METRICS
Runtime (RDTSC) [s] time
Branch misprediction ratio PMC1/PMC0

LONG
Formulas:
Branch misprediction ratio = BR_MISSP_EXEC / BR_INST_EXEC
-
The rates state how often on average a branch or a mispredicted branch occurred
per instruction retired in total. The branch misprediction ratio directly
expresses what fraction of all branch instructions were mispredicted.
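As an illustration of the BRANCH formula, the following minimal sketch computes the misprediction ratio from two raw counter readings. The function name, the zero-branch fallback, and the sample values are assumptions made for this example; they are not part of the group file or the collector.

```python
def branch_misprediction_ratio(br_inst_exec: int, br_missp_exec: int) -> float:
    """Return BR_MISSP_EXEC / BR_INST_EXEC (PMC1/PMC0 in the group file).

    Returning 0.0 when no branches were executed is an assumption of this
    sketch; it only avoids a division by zero.
    """
    if br_inst_exec == 0:
        return 0.0
    return br_missp_exec / br_inst_exec


# Hypothetical counter readings, chosen only for illustration.
ratio = branch_misprediction_ratio(br_inst_exec=2_000_000, br_missp_exec=50_000)
print(f"Branch misprediction ratio: {ratio:.4f}")
```

A ratio close to 0 means almost all executed branches were predicted correctly.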
22
collectors/likwid/groups/pentiumm/CPI.txt
Normal file
@@ -0,0 +1,22 @@
SHORT Cycles per instruction

EVENTSET
PMC0 UOPS_RETIRED
PMC1 CPU_CLK_UNHALTED

METRICS
Runtime (RDTSC) [s] time
CPI PMC1/PMC0
IPC PMC0/PMC1

LONG
Formulas:
CPI = CPU_CLK_UNHALTED/UOPS_RETIRED
IPC = UOPS_RETIRED/CPU_CLK_UNHALTED
-
This group measures how efficiently the processor works with
regard to instruction throughput. Also important as a standalone
metric is UOPS_RETIRED as it tells you how many uops
you need to execute for a task. An optimization might show very
low CPI values but execute many more instructions for it.
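The caveat in the CPI description can be made concrete with a small sketch. It evaluates the two formulas for two sets of counter values; because the group counts UOPS_RETIRED, both ratios are per micro-op. The `Sample` class and all numbers are hypothetical and serve only to illustrate the point.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One hypothetical measurement of the CPI group's two counters."""
    uops_retired: int      # PMC0
    cpu_clk_unhalted: int  # PMC1

    @property
    def cpi(self) -> float:
        # CPI = CPU_CLK_UNHALTED / UOPS_RETIRED
        return self.cpu_clk_unhalted / self.uops_retired

    @property
    def ipc(self) -> float:
        # IPC = UOPS_RETIRED / CPU_CLK_UNHALTED
        return self.uops_retired / self.cpu_clk_unhalted


# Invented values: the "optimized" variant retires four times as many uops.
baseline = Sample(uops_retired=1_000_000_000, cpu_clk_unhalted=2_000_000_000)
optimized = Sample(uops_retired=4_000_000_000, cpu_clk_unhalted=3_000_000_000)

for name, s in (("baseline", baseline), ("optimized", optimized)):
    print(f"{name}: CPI={s.cpi:.2f} IPC={s.ipc:.2f} cycles={s.cpu_clk_unhalted}")
```

In this made-up case the second variant has the lower CPI but still needs more cycles overall, which is why UOPS_RETIRED should be read alongside CPI.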
20
collectors/likwid/groups/pentiumm/FLOPS_DP.txt
Normal file
@@ -0,0 +1,20 @@
SHORT Double Precision MFLOP/s

EVENTSET
PMC0 EMON_SSE_SSE2_COMP_INST_RETIRED_PACKED_DP
PMC1 EMON_SSE_SSE2_COMP_INST_RETIRED_SCALAR_DP

METRICS
Runtime (RDTSC) [s] time
DP [MFLOP/s] 1.0E-06*(PMC0*2.0+PMC1)/time
Packed [MUOPS/s] 1.0E-06*(PMC0)/time
Scalar [MUOPS/s] 1.0E-06*PMC1/time

LONG
Formulas:
DP [MFLOP/s] = 1.0E-06*(EMON_SSE_SSE2_COMP_INST_RETIRED_PACKED_DP*2.0+EMON_SSE_SSE2_COMP_INST_RETIRED_SCALAR_DP)/time
Packed [MUOPS/s] = 1.0E-06*(EMON_SSE_SSE2_COMP_INST_RETIRED_PACKED_DP)/time
Scalar [MUOPS/s] = 1.0E-06*EMON_SSE_SSE2_COMP_INST_RETIRED_SCALAR_DP/time
-
SSE scalar and packed double precision FLOP rates.
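The three FLOPS_DP metrics can be sketched as plain arithmetic on the two counters and the runtime. The function name, the dictionary keys used as return values, and the sample inputs are assumptions for illustration; the factor 2.0 for packed instructions is taken directly from the METRICS section.

```python
def flops_dp_metrics(packed_dp: int, scalar_dp: int, time_s: float) -> dict:
    """Evaluate the FLOPS_DP formulas for one measurement interval.

    packed_dp -> EMON_SSE_SSE2_COMP_INST_RETIRED_PACKED_DP (PMC0)
    scalar_dp -> EMON_SSE_SSE2_COMP_INST_RETIRED_SCALAR_DP (PMC1)
    time_s    -> runtime in seconds ("time" in the group file)
    """
    if time_s <= 0:
        raise ValueError("time_s must be positive")
    return {
        # Each packed instruction is counted as two floating point operations.
        "DP [MFLOP/s]": 1.0e-06 * (packed_dp * 2.0 + scalar_dp) / time_s,
        "Packed [MUOPS/s]": 1.0e-06 * packed_dp / time_s,
        "Scalar [MUOPS/s]": 1.0e-06 * scalar_dp / time_s,
    }


# Hypothetical counter values for a 2 second interval.
for metric, value in flops_dp_metrics(3_000_000, 4_000_000, 2.0).items():
    print(f"{metric}: {value:.3f}")
```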
18
collectors/likwid/groups/pentiumm/FLOPS_SP.txt
Normal file
@@ -0,0 +1,18 @@
SHORT Single Precision MFLOP/s

EVENTSET
PMC0 EMON_SSE_SSE2_COMP_INST_RETIRED_ALL_SP
PMC1 EMON_SSE_SSE2_COMP_INST_RETIRED_SCALAR_SP

METRICS
Runtime (RDTSC) [s] time
SP [MFLOP/s] 1.0E-06*(PMC0)/time
Scalar [MUOPS/s] 1.0E-06*(PMC1)/time

LONG
Formulas:
SP [MFLOP/s] = 1.0E-06*(EMON_SSE_SSE2_COMP_INST_RETIRED_ALL_SP)/time
Scalar [MUOPS/s] = 1.0E-06*(EMON_SSE_SSE2_COMP_INST_RETIRED_SCALAR_SP)/time
-
SSE scalar and packed single precision FLOP rates.
30
collectors/likwid/groups/pentiumm/L3.txt
Normal file
@@ -0,0 +1,30 @@
SHORT L3 cache bandwidth in MBytes/s

EVENTSET
PMC0 L2_LINES_IN_ALL_ALL
PMC1 L2_LINES_OUT_ALL_ALL

METRICS
Runtime (RDTSC) [s] time
L3 load bandwidth [MBytes/s] 1.0E-06*PMC0*64.0/time
L3 load data volume [GBytes] 1.0E-09*PMC0*64.0
L3 evict bandwidth [MBytes/s] 1.0E-06*PMC1*64.0/time
L3 evict data volume [GBytes] 1.0E-09*PMC1*64.0
L3 bandwidth [MBytes/s] 1.0E-06*(PMC0+PMC1)*64.0/time
L3 data volume [GBytes] 1.0E-09*(PMC0+PMC1)*64.0

LONG
Formulas:
L3 load bandwidth [MBytes/s] = 1.0E-06*L2_LINES_IN_ALL_ALL*64.0/time
L3 load data volume [GBytes] = 1.0E-09*L2_LINES_IN_ALL_ALL*64.0
L3 evict bandwidth [MBytes/s] = 1.0E-06*L2_LINES_OUT_ALL_ALL*64.0/time
L3 evict data volume [GBytes] = 1.0E-09*L2_LINES_OUT_ALL_ALL*64.0
L3 bandwidth [MBytes/s] = 1.0E-06*(L2_LINES_IN_ALL_ALL+L2_LINES_OUT_ALL_ALL)*64.0/time
L3 data volume [GBytes] = 1.0E-09*(L2_LINES_IN_ALL_ALL+L2_LINES_OUT_ALL_ALL)*64.0
-
Profiling group to measure L3 cache bandwidth. The bandwidth is computed from the
number of cache lines allocated in the L2 and the number of modified cache lines
evicted from the L2. The group also outputs the total data volume transferred between
L2 and L3. Note that this bandwidth also includes data transfers due to a write
allocate load on a store miss in L2.
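The L3 formulas all follow one pattern: a cache line count times 64 bytes, scaled to GBytes for volume or divided by the runtime and scaled to MBytes/s for bandwidth. A minimal sketch of that pattern follows; the function name, the constant name, and the sample counts are assumptions, and the 64 byte line size is taken from the formulas in the group file.

```python
CACHE_LINE_BYTES = 64.0  # line size used by the group file's formulas


def l3_metrics(lines_in: int, lines_out: int, time_s: float) -> dict:
    """Evaluate the L3 group formulas.

    lines_in  -> L2_LINES_IN_ALL_ALL  (PMC0, lines allocated in L2)
    lines_out -> L2_LINES_OUT_ALL_ALL (PMC1, modified lines evicted from L2)
    time_s    -> runtime in seconds
    """
    if time_s <= 0:
        raise ValueError("time_s must be positive")

    def bandwidth(lines: int) -> float:
        return 1.0e-06 * lines * CACHE_LINE_BYTES / time_s  # MBytes/s

    def volume(lines: int) -> float:
        return 1.0e-09 * lines * CACHE_LINE_BYTES  # GBytes

    total = lines_in + lines_out
    return {
        "L3 load bandwidth [MBytes/s]": bandwidth(lines_in),
        "L3 load data volume [GBytes]": volume(lines_in),
        "L3 evict bandwidth [MBytes/s]": bandwidth(lines_out),
        "L3 evict data volume [GBytes]": volume(lines_out),
        "L3 bandwidth [MBytes/s]": bandwidth(total),
        "L3 data volume [GBytes]": volume(total),
    }


# Hypothetical counts measured over half a second.
for metric, value in l3_metrics(1_000_000, 500_000, 0.5).items():
    print(f"{metric}: {value:.3f}")
```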