mirror of
https://github.com/ClusterCockpit/cc-metric-collector.git
synced 2024-12-28 16:19:05 +01:00
41 lines
1.7 KiB
Plaintext
41 lines
1.7 KiB
Plaintext
SHORT L2 cache bandwidth in MBytes/s
|
|
|
|
EVENTSET
|
|
PMC0 INST_RETIRED
|
|
PMC1 CPU_CYCLES
|
|
PMC2 L1D_CACHE_REFILL
|
|
PMC3 L1D_CACHE_WB
|
|
PMC4 L1I_CACHE_REFILL
|
|
|
|
|
|
METRICS
|
|
Runtime (RDTSC) [s] time
|
|
CPI PMC1/PMC0
|
|
L1D<-L2 load bandwidth [MBytes/s] 1.0E-06*(PMC2)*256.0/time
|
|
L1D<-L2 load data volume [GBytes] 1.0E-09*(PMC2)*256.0
|
|
L1D->L2 evict bandwidth [MBytes/s] 1.0E-06*PMC3*256.0/time
|
|
L1D->L2 evict data volume [GBytes] 1.0E-09*PMC3*256.0
|
|
L1I<-L2 load bandwidth [MBytes/s] 1.0E-06*PMC4*256.0/time
|
|
L1I<-L2 load data volume [GBytes] 1.0E-09*PMC4*256.0
|
|
L1<->L2 bandwidth [MBytes/s] 1.0E-06*(PMC2+PMC3+PMC4)*256.0/time
|
|
L1<->L2 data volume [GBytes] 1.0E-09*(PMC2+PMC3+PMC4)*256.0
|
|
|
|
LONG
|
|
Formulas:
|
|
CPI = CPU_CYCLES/INST_RETIRED
|
|
L1D<-L2 load bandwidth [MBytes/s] = 1.0E-06*L1D_CACHE_REFILL*256.0/time
|
|
L1D<-L2 load data volume [GBytes] = 1.0E-09*L1D_CACHE_REFILL*256.0
|
|
L1D->L2 evict bandwidth [MBytes/s] = 1.0E-06*L1D_CACHE_WB*256.0/time
|
|
L1D->L2 evict data volume [GBytes] = 1.0E-09*L1D_CACHE_WB*256.0
|
|
L1I<-L2 load bandwidth [MBytes/s] = 1.0E-06*L1I_CACHE_REFILL*256.0/time
|
|
L1I<-L2 load data volume [GBytes] = 1.0E-09*L1I_CACHE_REFILL*256.0
|
|
L1<->L2 bandwidth [MBytes/s] = 1.0E-06*(L1D_CACHE_REFILL+L1D_CACHE_WB+L1I_CACHE_REFILL)*256.0/time
|
|
L1<->L2 data volume [GBytes] = 1.0E-09*(L1D_CACHE_REFILL+L1D_CACHE_WB+L1I_CACHE_REFILL)*256.0
|
|
-
|
|
Profiling group to measure L2 cache bandwidth. The bandwidth is computed by the
|
|
number of cacheline loaded from the L2 to the L1 data cache and the writebacks from
|
|
the L1 data cache to the L2 cache. The group also outputs total data volume transfered between
|
|
L2 and L1. Note that this bandwidth also includes data transfers due to a write
|
|
allocate load on a store miss in L1 and cachelines transfered in the L1 instruction
|
|
cache.
|