cc-metric-collector/collectors/likwid/groups/arm8_tx2/L3.txt
2021-03-25 14:47:10 +01:00

SHORT L3 cache bandwidth in MBytes/s
EVENTSET
PMC0 INST_RETIRED
PMC1 CPU_CYCLES
PMC2 L2D_CACHE_REFILL
PMC3 L2D_CACHE_WB
PMC4 L2D_CACHE_ALLOCATE
METRICS
Runtime (RDTSC) [s] time
Clock [MHz] 1.E-06*PMC1/time
CPI PMC1/PMC0
L3 load bandwidth [MBytes/s] 1.0E-06*(PMC2-PMC4)*64.0/time
L3 load data volume [GBytes] 1.0E-09*(PMC2-PMC4)*64.0
L3 evict bandwidth [MBytes/s] 1.0E-06*PMC3*64.0/time
L3 evict data volume [GBytes] 1.0E-09*PMC3*64.0
L3 bandwidth [MBytes/s] 1.0E-06*(PMC2+PMC3-PMC4)*64.0/time
L3 data volume [GBytes] 1.0E-09*(PMC2+PMC3-PMC4)*64.0
LONG
Formulas:
CPI = CPU_CYCLES/INST_RETIRED
L3 load bandwidth [MBytes/s] = 1.0E-06*(L2D_CACHE_REFILL-L2D_CACHE_ALLOCATE)*64.0/time
L3 load data volume [GBytes] = 1.0E-09*(L2D_CACHE_REFILL-L2D_CACHE_ALLOCATE)*64.0
L3 evict bandwidth [MBytes/s] = 1.0E-06*L2D_CACHE_WB*64.0/time
L3 evict data volume [GBytes] = 1.0E-09*L2D_CACHE_WB*64.0
L3 bandwidth [MBytes/s] = 1.0E-06*(L2D_CACHE_REFILL+L2D_CACHE_WB-L2D_CACHE_ALLOCATE)*64.0/time
L3 data volume [GBytes] = 1.0E-09*(L2D_CACHE_REFILL+L2D_CACHE_WB-L2D_CACHE_ALLOCATE)*64.0
-
Profiling group to measure L2 <-> L3 cache bandwidth. The bandwidth is computed from the
number of cache lines loaded from the L3 cache into the L2 data cache and the number of
cache lines written back from the L2 data cache to the L3 cache. The group also outputs the
total data volume transferred between L3 and L2. For streaming stores, the cache lines are
allocated in L2, so there is no traffic between L3 and L2 in this case. Because the
L2D_CACHE_REFILL event also counts these allocated cache lines, the value of
L2D_CACHE_REFILL is reduced by L2D_CACHE_ALLOCATE.
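
The formulas above can be checked by hand with a short sketch. This is an illustration
only, not part of LIKWID: the function name l3_metrics, its parameter names, and the
counter values are hypothetical, and the 64-byte cache line size is taken from the
formulas in this group.

```python
# Illustrative sketch only: evaluates the metric formulas of this group
# from raw counter values. All names and sample values are hypothetical.

CACHE_LINE_BYTES = 64.0  # cache line size used by the formulas in this group


def l3_metrics(refill, wb, allocate, cycles, instr, time_s):
    """Return the group's metrics for one measurement interval.

    refill   -- L2D_CACHE_REFILL count
    wb       -- L2D_CACHE_WB count
    allocate -- L2D_CACHE_ALLOCATE count
    cycles   -- CPU_CYCLES count
    instr    -- INST_RETIRED count
    time_s   -- runtime in seconds
    """
    # Refills caused by allocations (streaming stores) are not L3 loads,
    # so they are subtracted before converting lines to bytes.
    load_bytes = (refill - allocate) * CACHE_LINE_BYTES
    evict_bytes = wb * CACHE_LINE_BYTES
    total_bytes = (refill + wb - allocate) * CACHE_LINE_BYTES
    return {
        "Clock [MHz]": 1.0e-06 * cycles / time_s,
        "CPI": cycles / instr,
        "L3 load bandwidth [MBytes/s]": 1.0e-06 * load_bytes / time_s,
        "L3 load data volume [GBytes]": 1.0e-09 * load_bytes,
        "L3 evict bandwidth [MBytes/s]": 1.0e-06 * evict_bytes / time_s,
        "L3 evict data volume [GBytes]": 1.0e-09 * evict_bytes,
        "L3 bandwidth [MBytes/s]": 1.0e-06 * total_bytes / time_s,
        "L3 data volume [GBytes]": 1.0e-09 * total_bytes,
    }


# Made-up counter values for a 2 second interval.
sample = l3_metrics(
    refill=3_000_000,
    wb=1_000_000,
    allocate=1_000_000,
    cycles=4_000_000_000,
    instr=2_000_000_000,
    time_s=2.0,
)

for name, value in sample.items():
    print(f"{name}: {value:.3f}")
```

With these made-up values, 2,000,000 of the 3,000,000 refills count as L3 loads, which
corresponds to a load data volume of about 0.128 GBytes and a load bandwidth of about
64 MBytes/s over the 2 second interval.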