cc-metric-collector/collectors/likwid/groups/broadwellEP/CACHES.txt

136 lines
7.1 KiB
Plaintext
Raw Normal View History

2021-03-25 14:47:10 +01:00
SHORT Cache bandwidth in MBytes/s
EVENTSET
FIXC0 INSTR_RETIRED_ANY
FIXC1 CPU_CLK_UNHALTED_CORE
FIXC2 CPU_CLK_UNHALTED_REF
PMC0 L1D_REPLACEMENT
PMC1 L2_TRANS_L1D_WB
PMC2 L2_LINES_IN_ALL
PMC3 L2_TRANS_L2_WB
CBOX0C1 LLC_VICTIMS_M
CBOX1C1 LLC_VICTIMS_M
CBOX2C1 LLC_VICTIMS_M
CBOX3C1 LLC_VICTIMS_M
CBOX4C1 LLC_VICTIMS_M
CBOX5C1 LLC_VICTIMS_M
CBOX6C1 LLC_VICTIMS_M
CBOX7C1 LLC_VICTIMS_M
CBOX8C1 LLC_VICTIMS_M
CBOX9C1 LLC_VICTIMS_M
CBOX10C1 LLC_VICTIMS_M
CBOX11C1 LLC_VICTIMS_M
CBOX12C1 LLC_VICTIMS_M
CBOX13C1 LLC_VICTIMS_M
CBOX14C1 LLC_VICTIMS_M
CBOX15C1 LLC_VICTIMS_M
CBOX16C1 LLC_VICTIMS_M
CBOX17C1 LLC_VICTIMS_M
CBOX18C1 LLC_VICTIMS_M
CBOX19C1 LLC_VICTIMS_M
CBOX20C1 LLC_VICTIMS_M
CBOX21C1 LLC_VICTIMS_M
CBOX0C0 LLC_LOOKUP_DATA_READ
CBOX1C0 LLC_LOOKUP_DATA_READ
CBOX2C0 LLC_LOOKUP_DATA_READ
CBOX3C0 LLC_LOOKUP_DATA_READ
CBOX4C0 LLC_LOOKUP_DATA_READ
CBOX5C0 LLC_LOOKUP_DATA_READ
CBOX6C0 LLC_LOOKUP_DATA_READ
CBOX7C0 LLC_LOOKUP_DATA_READ
CBOX8C0 LLC_LOOKUP_DATA_READ
CBOX9C0 LLC_LOOKUP_DATA_READ
CBOX10C0 LLC_LOOKUP_DATA_READ
CBOX11C0 LLC_LOOKUP_DATA_READ
CBOX12C0 LLC_LOOKUP_DATA_READ
CBOX13C0 LLC_LOOKUP_DATA_READ
CBOX14C0 LLC_LOOKUP_DATA_READ
CBOX15C0 LLC_LOOKUP_DATA_READ
CBOX16C0 LLC_LOOKUP_DATA_READ
CBOX17C0 LLC_LOOKUP_DATA_READ
CBOX18C0 LLC_LOOKUP_DATA_READ
CBOX19C0 LLC_LOOKUP_DATA_READ
CBOX20C0 LLC_LOOKUP_DATA_READ
CBOX21C0 LLC_LOOKUP_DATA_READ
MBOX0C0 CAS_COUNT_RD
MBOX0C1 CAS_COUNT_WR
MBOX1C0 CAS_COUNT_RD
MBOX1C1 CAS_COUNT_WR
MBOX2C0 CAS_COUNT_RD
MBOX2C1 CAS_COUNT_WR
MBOX3C0 CAS_COUNT_RD
MBOX3C1 CAS_COUNT_WR
MBOX4C0 CAS_COUNT_RD
MBOX4C1 CAS_COUNT_WR
MBOX5C0 CAS_COUNT_RD
MBOX5C1 CAS_COUNT_WR
MBOX6C0 CAS_COUNT_RD
MBOX6C1 CAS_COUNT_WR
MBOX7C0 CAS_COUNT_RD
MBOX7C1 CAS_COUNT_WR
METRICS
Runtime (RDTSC) [s] time
Runtime unhalted [s] FIXC1*inverseClock
Clock [MHz] 1.E-06*(FIXC1/FIXC2)/inverseClock
CPI FIXC1/FIXC0
L2 to L1 load bandwidth [MBytes/s] 1.0E-06*PMC0*64.0/time
L2 to L1 load data volume [GBytes] 1.0E-09*PMC0*64.0
L1 to L2 evict bandwidth [MBytes/s] 1.0E-06*PMC1*64.0/time
L1 to L2 evict data volume [GBytes] 1.0E-09*PMC1*64.0
L1 to/from L2 bandwidth [MBytes/s] 1.0E-06*(PMC0+PMC1)*64.0/time
L1 to/from L2 data volume [GBytes] 1.0E-09*(PMC0+PMC1)*64.0
L3 to L2 load bandwidth [MBytes/s] 1.0E-06*PMC2*64.0/time
L3 to L2 load data volume [GBytes] 1.0E-09*PMC2*64.0
L2 to L3 evict bandwidth [MBytes/s] 1.0E-06*PMC3*64.0/time
L2 to L3 evict data volume [GBytes] 1.0E-09*PMC3*64.0
L2 to/from L3 bandwidth [MBytes/s] 1.0E-06*(PMC2+PMC3)*64.0/time
L2 to/from L3 data volume [GBytes] 1.0E-09*(PMC2+PMC3)*64.0
System to L3 bandwidth [MBytes/s] 1.0E-06*(CBOX0C0+CBOX1C0+CBOX2C0+CBOX3C0+CBOX4C0+CBOX5C0+CBOX6C0+CBOX7C0+CBOX8C0+CBOX9C0+CBOX10C0+CBOX11C0+CBOX12C0+CBOX13C0+CBOX14C0+CBOX15C0+CBOX16C0+CBOX17C0+CBOX18C0+CBOX19C0+CBOX20C0+CBOX21C0)*64.0/time
System to L3 data volume [GBytes] 1.0E-09*(CBOX0C0+CBOX1C0+CBOX2C0+CBOX3C0+CBOX4C0+CBOX5C0+CBOX6C0+CBOX7C0+CBOX8C0+CBOX9C0+CBOX10C0+CBOX11C0+CBOX12C0+CBOX13C0+CBOX14C0+CBOX15C0+CBOX16C0+CBOX17C0+CBOX18C0+CBOX19C0+CBOX20C0+CBOX21C0)*64.0
L3 to system bandwidth [MBytes/s] 1.0E-06*(CBOX0C1+CBOX1C1+CBOX2C1+CBOX3C1+CBOX4C1+CBOX5C1+CBOX6C1+CBOX7C1+CBOX8C1+CBOX9C1+CBOX10C1+CBOX11C1+CBOX12C1+CBOX13C1+CBOX14C1+CBOX15C1+CBOX16C1+CBOX17C1+CBOX18C1+CBOX19C1+CBOX20C1+CBOX21C1)*64/time
L3 to system data volume [GBytes] 1.0E-09*(CBOX0C1+CBOX1C1+CBOX2C1+CBOX3C1+CBOX4C1+CBOX5C1+CBOX6C1+CBOX7C1+CBOX8C1+CBOX9C1+CBOX10C1+CBOX11C1+CBOX12C1+CBOX13C1+CBOX14C1+CBOX15C1+CBOX16C1+CBOX17C1+CBOX18C1+CBOX19C1+CBOX20C1+CBOX21C1)*64
L3 to/from system bandwidth [MBytes/s] 1.0E-06*(CBOX0C0+CBOX1C0+CBOX2C0+CBOX3C0+CBOX4C0+CBOX5C0+CBOX6C0+CBOX7C0+CBOX8C0+CBOX9C0+CBOX10C0+CBOX11C0+CBOX12C0+CBOX13C0+CBOX14C0+CBOX15C0+CBOX16C0+CBOX17C0+CBOX18C0+CBOX19C0+CBOX20C0+CBOX21C0+CBOX0C1+CBOX1C1+CBOX2C1+CBOX3C1+CBOX4C1+CBOX5C1+CBOX6C1+CBOX7C1+CBOX8C1+CBOX9C1+CBOX10C1+CBOX11C1+CBOX12C1+CBOX13C1+CBOX14C1+CBOX15C1+CBOX16C1+CBOX17C1+CBOX18C1+CBOX19C1+CBOX20C1+CBOX21C1)*64.0/time
L3 to/from system data volume [GBytes] 1.0E-09*(CBOX0C0+CBOX1C0+CBOX2C0+CBOX3C0+CBOX4C0+CBOX5C0+CBOX6C0+CBOX7C0+CBOX8C0+CBOX9C0+CBOX10C0+CBOX11C0+CBOX12C0+CBOX13C0+CBOX14C0+CBOX15C0+CBOX16C0+CBOX17C0+CBOX18C0+CBOX19C0+CBOX20C0+CBOX21C0+CBOX0C1+CBOX1C1+CBOX2C1+CBOX3C1+CBOX4C1+CBOX5C1+CBOX6C1+CBOX7C1+CBOX8C1+CBOX9C1+CBOX10C1+CBOX11C1+CBOX12C1+CBOX13C1+CBOX14C1+CBOX15C1+CBOX16C1+CBOX17C1+CBOX18C1+CBOX19C1+CBOX20C1+CBOX21C1)*64.0
Memory read bandwidth [MBytes/s] 1.0E-06*(MBOX0C0+MBOX1C0+MBOX2C0+MBOX3C0+MBOX4C0+MBOX5C0+MBOX6C0+MBOX7C0)*64.0/time
Memory read data volume [GBytes] 1.0E-09*(MBOX0C0+MBOX1C0+MBOX2C0+MBOX3C0+MBOX4C0+MBOX5C0+MBOX6C0+MBOX7C0)*64.0
Memory write bandwidth [MBytes/s] 1.0E-06*(MBOX0C1+MBOX1C1+MBOX2C1+MBOX3C1+MBOX4C1+MBOX5C1+MBOX6C1+MBOX7C1)*64.0/time
Memory write data volume [GBytes] 1.0E-09*(MBOX0C1+MBOX1C1+MBOX2C1+MBOX3C1+MBOX4C1+MBOX5C1+MBOX6C1+MBOX7C1)*64.0
Memory bandwidth [MBytes/s] 1.0E-06*(MBOX0C0+MBOX1C0+MBOX2C0+MBOX3C0+MBOX4C0+MBOX5C0+MBOX6C0+MBOX7C0+MBOX0C1+MBOX1C1+MBOX2C1+MBOX3C1+MBOX4C1+MBOX5C1+MBOX6C1+MBOX7C1)*64.0/time
Memory data volume [GBytes] 1.0E-09*(MBOX0C0+MBOX1C0+MBOX2C0+MBOX3C0+MBOX4C0+MBOX5C0+MBOX6C0+MBOX7C0+MBOX0C1+MBOX1C1+MBOX2C1+MBOX3C1+MBOX4C1+MBOX5C1+MBOX6C1+MBOX7C1)*64.0
LONG
Formulas:
L2 to L1 load bandwidth [MBytes/s] = 1.0E-06*L1D_REPLACEMENT*64/time
L2 to L1 load data volume [GBytes] = 1.0E-09*L1D_REPLACEMENT*64
L1 to L2 evict bandwidth [MBytes/s] = 1.0E-06*L1D_M_EVICT*64/time
L1 to L2 evict data volume [GBytes] = 1.0E-09*L1D_M_EVICT*64
L1 to/from L2 bandwidth [MBytes/s] = 1.0E-06*(L1D_REPLACEMENT+L1D_M_EVICT)*64/time
L1 to/from L2 data volume [GBytes] = 1.0E-09*(L1D_REPLACEMENT+L1D_M_EVICT)*64
L3 to L2 load bandwidth [MBytes/s] = 1.0E-06*L2_LINES_IN_ALL*64/time
L3 to L2 load data volume [GBytes] = 1.0E-09*L2_LINES_IN_ALL*64
L2 to L3 evict bandwidth [MBytes/s] = 1.0E-06*L2_TRANS_L2_WB*64/time
L2 to L3 evict data volume [GBytes] = 1.0E-09*L2_TRANS_L2_WB*64
L2 to/from L3 bandwidth [MBytes/s] = 1.0E-06*(L2_LINES_IN_ALL+L2_TRANS_L2_WB)*64/time
L2 to/from L3 data volume [GBytes] = 1.0E-09*(L2_LINES_IN_ALL+L2_TRANS_L2_WB)*64
System to L3 bandwidth [MBytes/s] = 1.0E-06*(SUM(LLC_LOOKUP_DATA_READ))*64/time
System to L3 data volume [GBytes] = 1.0E-09*(SUM(LLC_LOOKUP_DATA_READ))*64
L3 to system bandwidth [MBytes/s] = 1.0E-06*(SUM(LLC_VICTIMS_M))*64/time
L3 to system data volume [GBytes] = 1.0E-09*(SUM(LLC_VICTIMS_M))*64
L3 to/from system bandwidth [MBytes/s] = 1.0E-06*(SUM(LLC_LOOKUP_DATA_READ)+SUM(LLC_VICTIMS_M))*64/time
L3 to/from system data volume [GBytes] = 1.0E-09*(SUM(LLC_LOOKUP_DATA_READ)+SUM(LLC_VICTIMS_M))*64
Memory read bandwidth [MBytes/s] = 1.0E-06*(SUM(CAS_COUNT_RD))*64.0/time
Memory read data volume [GBytes] = 1.0E-09*(SUM(CAS_COUNT_RD))*64.0
Memory write bandwidth [MBytes/s] = 1.0E-06*(SUM(CAS_COUNT_WR))*64.0/time
Memory write data volume [GBytes] = 1.0E-09*(SUM(CAS_COUNT_WR))*64.0
Memory bandwidth [MBytes/s] = 1.0E-06*(SUM(CAS_COUNT_RD)+SUM(CAS_COUNT_WR))*64.0/time
Memory data volume [GBytes] = 1.0E-09*(SUM(CAS_COUNT_RD)+SUM(CAS_COUNT_WR))*64.0
-
Group to measure cache transfers between L1 and Memory. Please notice that the
L3 to/from system metrics contain any traffic to the system (memory,
Intel QPI, etc.) but don't seem to handle anything because commonly memory read
bandwidth and L3 to L2 bandwidth is higher as the memory to L3 bandwidth.