mirror of
https://github.com/ClusterCockpit/cc-metric-collector.git
synced 2024-12-27 23:59:05 +01:00
39 lines
1.6 KiB
Plaintext
39 lines
1.6 KiB
Plaintext
SHORT L3 cache bandwidth in MBytes/s
|
|
|
|
EVENTSET
|
|
PMC0 INST_RETIRED
|
|
PMC1 CPU_CYCLES
|
|
PMC2 L2D_CACHE_REFILL
|
|
PMC3 L2D_CACHE_WB
|
|
PMC4 L2D_CACHE_ALLOCATE
|
|
|
|
|
|
METRICS
|
|
Runtime (RDTSC) [s] time
|
|
Clock [MHz] 1.E-06*PMC1/time
|
|
CPI PMC1/PMC0
|
|
L3 load bandwidth [MBytes/s] 1.0E-06*(PMC2-PMC4)*64.0/time
|
|
L3 load data volume [GBytes] 1.0E-09*(PMC2-PMC4)*64.0
|
|
L3 evict bandwidth [MBytes/s] 1.0E-06*PMC3*64.0/time
|
|
L3 evict data volume [GBytes] 1.0E-09*PMC3*64.0
|
|
L3 bandwidth [MBytes/s] 1.0E-06*(PMC2+PMC3-PMC4)*64.0/time
|
|
L3 data volume [GBytes] 1.0E-09*(PMC2+PMC3-PMC4)*64.0
|
|
|
|
LONG
|
|
Formulas:
|
|
CPI = CPU_CYCLES/INST_RETIRED
|
|
L3 load bandwidth [MBytes/s] = 1.0E-06*(L2D_CACHE_REFILL-L2D_CACHE_ALLOCATE)*64.0/time
|
|
L3 load data volume [GBytes] = 1.0E-09*(L2D_CACHE_REFILL-L2D_CACHE_ALLOCATE)*64.0
|
|
L3 evict bandwidth [MBytes/s] = 1.0E-06*L2D_CACHE_WB*64.0/time
|
|
L3 evict data volume [GBytes] = 1.0E-09*L2D_CACHE_WB*64.0
|
|
L3 bandwidth [MBytes/s] = 1.0E-06*(L2D_CACHE_REFILL+L2D_CACHE_WB-L2D_CACHE_ALLOCATE))*64.0/time
|
|
L3 data volume [GBytes] = 1.0E-09*(L2D_CACHE_REFILL+L2D_CACHE_WB-L2D_CACHE_ALLOCATE))*64.0
|
|
-
|
|
Profiling group to measure L2 <-> L3 cache bandwidth. The bandwidth is computed by the
|
|
number of cache lines loaded from the L3 to the L2 data cache and the writebacks from
|
|
the L2 data cache to the L3 cache. The group also outputs total data volume transfered between
|
|
L3 and L2. For streaming-stores, the cache lines are allocated in L2, consequently there
|
|
is no traffic between L3 and L2 in this case. But the L2D_CACHE_REFILL event counts these
|
|
allocated cache lines, that's why the value of L2D_CACHE_REFILL is reduced
|
|
by L2D_CACHE_ALLOCATE.
|