SHORT Some data from the CBOXes

EVENTSET
FIXC0 INSTR_RETIRED_ANY
FIXC1 CPU_CLK_UNHALTED_CORE
FIXC2 CPU_CLK_UNHALTED_REF
PMC0 L1D_REPLACEMENT
PMC1 L1D_M_EVICT
PMC2 L2_LINES_IN_ALL
PMC3 L2_TRANS_L2_WB
CBOX0C0:STATE=0x3F LLC_LOOKUP_DATA_READ
CBOX1C0:STATE=0x3F LLC_LOOKUP_DATA_READ
CBOX2C0:STATE=0x3F LLC_LOOKUP_DATA_READ
CBOX3C0:STATE=0x3F LLC_LOOKUP_DATA_READ
CBOX4C0:STATE=0x3F LLC_LOOKUP_DATA_READ
CBOX5C0:STATE=0x3F LLC_LOOKUP_DATA_READ
CBOX6C0:STATE=0x3F LLC_LOOKUP_DATA_READ
CBOX7C0:STATE=0x3F LLC_LOOKUP_DATA_READ
CBOX0C1 LLC_VICTIMS_M_STATE
CBOX1C1 LLC_VICTIMS_M_STATE
CBOX2C1 LLC_VICTIMS_M_STATE
CBOX3C1 LLC_VICTIMS_M_STATE
CBOX4C1 LLC_VICTIMS_M_STATE
CBOX5C1 LLC_VICTIMS_M_STATE
CBOX6C1 LLC_VICTIMS_M_STATE
CBOX7C1 LLC_VICTIMS_M_STATE
MBOX0C0 CAS_COUNT_RD
MBOX0C1 CAS_COUNT_WR
MBOX1C0 CAS_COUNT_RD
MBOX1C1 CAS_COUNT_WR
MBOX2C0 CAS_COUNT_RD
MBOX2C1 CAS_COUNT_WR
MBOX3C0 CAS_COUNT_RD
MBOX3C1 CAS_COUNT_WR

METRICS
Runtime (RDTSC) [s] time
Runtime unhalted [s] FIXC1*inverseClock
Clock [MHz] 1.E-06*(FIXC1/FIXC2)/inverseClock
CPI FIXC1/FIXC0
L2 to L1 load bandwidth [MBytes/s] 1.0E-06*PMC0*64.0/time
L2 to L1 load data volume [GBytes] 1.0E-09*PMC0*64.0
L1 to L2 evict bandwidth [MBytes/s] 1.0E-06*PMC1*64.0/time
L1 to L2 evict data volume [GBytes] 1.0E-09*PMC1*64.0
L1 to/from L2 bandwidth [MBytes/s] 1.0E-06*(PMC0+PMC1)*64.0/time
L1 to/from L2 data volume [GBytes] 1.0E-09*(PMC0+PMC1)*64.0
L3 to L2 load bandwidth [MBytes/s] 1.0E-06*PMC2*64.0/time
L3 to L2 load data volume [GBytes] 1.0E-09*PMC2*64.0
L2 to L3 evict bandwidth [MBytes/s] 1.0E-06*PMC3*64.0/time
L2 to L3 evict data volume [GBytes] 1.0E-09*PMC3*64.0
L2 to/from L3 bandwidth [MBytes/s] 1.0E-06*(PMC2+PMC3)*64.0/time
L2 to/from L3 data volume [GBytes] 1.0E-09*(PMC2+PMC3)*64.0
System to L3 bandwidth [MBytes/s] 1.0E-06*(CBOX0C0:STATE=0x3F+CBOX1C0:STATE=0x3F+CBOX2C0:STATE=0x3F+CBOX3C0:STATE=0x3F+CBOX4C0:STATE=0x3F+CBOX5C0:STATE=0x3F+CBOX6C0:STATE=0x3F+CBOX7C0:STATE=0x3F)*64.0/time
System to L3 data volume [GBytes] 1.0E-09*(CBOX0C0:STATE=0x3F+CBOX1C0:STATE=0x3F+CBOX2C0:STATE=0x3F+CBOX3C0:STATE=0x3F+CBOX4C0:STATE=0x3F+CBOX5C0:STATE=0x3F+CBOX6C0:STATE=0x3F+CBOX7C0:STATE=0x3F)*64.0
L3 to system bandwidth [MBytes/s] 1.0E-06*(CBOX0C1+CBOX1C1+CBOX2C1+CBOX3C1+CBOX4C1+CBOX5C1+CBOX6C1+CBOX7C1)*64.0/time
L3 to system data volume [GBytes] 1.0E-09*(CBOX0C1+CBOX1C1+CBOX2C1+CBOX3C1+CBOX4C1+CBOX5C1+CBOX6C1+CBOX7C1)*64.0
L3 to/from system bandwidth [MBytes/s] 1.0E-06*(CBOX0C0:STATE=0x3F+CBOX1C0:STATE=0x3F+CBOX2C0:STATE=0x3F+CBOX3C0:STATE=0x3F+CBOX4C0:STATE=0x3F+CBOX5C0:STATE=0x3F+CBOX6C0:STATE=0x3F+CBOX7C0:STATE=0x3F+CBOX0C1+CBOX1C1+CBOX2C1+CBOX3C1+CBOX4C1+CBOX5C1+CBOX6C1+CBOX7C1)*64.0/time
L3 to/from system data volume [GBytes] 1.0E-09*(CBOX0C0:STATE=0x3F+CBOX1C0:STATE=0x3F+CBOX2C0:STATE=0x3F+CBOX3C0:STATE=0x3F+CBOX4C0:STATE=0x3F+CBOX5C0:STATE=0x3F+CBOX6C0:STATE=0x3F+CBOX7C0:STATE=0x3F+CBOX0C1+CBOX1C1+CBOX2C1+CBOX3C1+CBOX4C1+CBOX5C1+CBOX6C1+CBOX7C1)*64.0
Memory read bandwidth [MBytes/s] 1.0E-06*(MBOX0C0+MBOX1C0+MBOX2C0+MBOX3C0)*64.0/time
Memory read data volume [GBytes] 1.0E-09*(MBOX0C0+MBOX1C0+MBOX2C0+MBOX3C0)*64.0
Memory write bandwidth [MBytes/s] 1.0E-06*(MBOX0C1+MBOX1C1+MBOX2C1+MBOX3C1)*64.0/time
Memory write data volume [GBytes] 1.0E-09*(MBOX0C1+MBOX1C1+MBOX2C1+MBOX3C1)*64.0
Memory bandwidth [MBytes/s] 1.0E-06*(MBOX0C0+MBOX1C0+MBOX2C0+MBOX3C0+MBOX0C1+MBOX1C1+MBOX2C1+MBOX3C1)*64.0/time
Memory data volume [GBytes] 1.0E-09*(MBOX0C0+MBOX1C0+MBOX2C0+MBOX3C0+MBOX0C1+MBOX1C1+MBOX2C1+MBOX3C1)*64.0

LONG
Formulas:
L2 to L1 load bandwidth [MBytes/s] = 1.0E-06*L1D_REPLACEMENT*64/time
L2 to L1 load data volume [GBytes] = 1.0E-09*L1D_REPLACEMENT*64
L1 to L2 evict bandwidth [MBytes/s] = 1.0E-06*L1D_M_EVICT*64/time
L1 to L2 evict data volume [GBytes] = 1.0E-09*L1D_M_EVICT*64
L1 to/from L2 bandwidth [MBytes/s] = 1.0E-06*(L1D_REPLACEMENT+L1D_M_EVICT)*64/time
L1 to/from L2 data volume [GBytes] = 1.0E-09*(L1D_REPLACEMENT+L1D_M_EVICT)*64
L3 to L2 load bandwidth [MBytes/s] = 1.0E-06*L2_LINES_IN_ALL*64/time
L3 to L2 load data volume [GBytes] = 1.0E-09*L2_LINES_IN_ALL*64
L2 to L3 evict bandwidth [MBytes/s] = 1.0E-06*L2_TRANS_L2_WB*64/time
L2 to L3 evict data volume [GBytes] = 1.0E-09*L2_TRANS_L2_WB*64
L2 to/from L3 bandwidth [MBytes/s] = 1.0E-06*(L2_LINES_IN_ALL+L2_TRANS_L2_WB)*64/time
L2 to/from L3 data volume [GBytes] = 1.0E-09*(L2_LINES_IN_ALL+L2_TRANS_L2_WB)*64
System to L3 bandwidth [MBytes/s] = 1.0E-06*(SUM(LLC_LOOKUP_DATA_READ:STATE=0x3F))*64/time
System to L3 data volume [GBytes] = 1.0E-09*(SUM(LLC_LOOKUP_DATA_READ:STATE=0x3F))*64
L3 to system bandwidth [MBytes/s] = 1.0E-06*(SUM(LLC_VICTIMS_M_STATE))*64/time
L3 to system data volume [GBytes] = 1.0E-09*(SUM(LLC_VICTIMS_M_STATE))*64
L3 to/from system bandwidth [MBytes/s] = 1.0E-06*(SUM(LLC_LOOKUP_DATA_READ:STATE=0x3F)+SUM(LLC_VICTIMS_M_STATE))*64/time
L3 to/from system data volume [GBytes] = 1.0E-09*(SUM(LLC_LOOKUP_DATA_READ:STATE=0x3F)+SUM(LLC_VICTIMS_M_STATE))*64
Memory read bandwidth [MBytes/s] = 1.0E-06*(SUM(CAS_COUNT_RD))*64.0/time
Memory read data volume [GBytes] = 1.0E-09*(SUM(CAS_COUNT_RD))*64.0
Memory write bandwidth [MBytes/s] = 1.0E-06*(SUM(CAS_COUNT_WR))*64.0/time
Memory write data volume [GBytes] = 1.0E-09*(SUM(CAS_COUNT_WR))*64.0
Memory bandwidth [MBytes/s] = 1.0E-06*(SUM(CAS_COUNT_RD)+SUM(CAS_COUNT_WR))*64.0/time
Memory data volume [GBytes] = 1.0E-09*(SUM(CAS_COUNT_RD)+SUM(CAS_COUNT_WR))*64.0
-
Group to measure cache transfers between L1 and memory. Note that the
L3 to/from system metrics count any traffic to the system (memory,
Intel QPI, etc.), but they do not seem to capture all of it: the memory read
bandwidth and the L3 to L2 bandwidth are commonly higher than the
System to L3 bandwidth reported by this group.
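
All bandwidth and data volume formulas in this group follow the same pattern: sum the raw counts of the contributing counters, multiply by the 64-byte cache line size, and scale to MBytes/s (dividing by time) or to GBytes. The following is a minimal Python sketch of that arithmetic for the memory metrics only. The function names and all counter values are hypothetical and chosen for illustration; they are not part of LIKWID and not measurements.

```python
CACHE_LINE_BYTES = 64.0  # the 64.0 factor used in the METRICS formulas


def data_volume_gbytes(counts):
    """Sum per-box counter values and convert cache lines to GBytes."""
    return 1.0e-09 * sum(counts) * CACHE_LINE_BYTES


def bandwidth_mbytes_per_s(counts, time_s):
    """Sum per-box counter values and convert to MBytes/s over time_s seconds."""
    return 1.0e-06 * sum(counts) * CACHE_LINE_BYTES / time_s


# Hypothetical raw counts, one entry per memory channel:
# MBOX0C0..MBOX3C0 (CAS_COUNT_RD) and MBOX0C1..MBOX3C1 (CAS_COUNT_WR).
cas_rd = [2_000_000, 2_000_000, 2_000_000, 2_000_000]
cas_wr = [1_000_000, 1_000_000, 1_000_000, 1_000_000]
runtime_s = 2.0  # hypothetical value of `time` (Runtime (RDTSC) [s])

mem_read_bw = bandwidth_mbytes_per_s(cas_rd, runtime_s)
mem_write_bw = bandwidth_mbytes_per_s(cas_wr, runtime_s)
mem_bw = bandwidth_mbytes_per_s(cas_rd + cas_wr, runtime_s)
mem_volume = data_volume_gbytes(cas_rd + cas_wr)

print(f"Memory read bandwidth [MBytes/s]  {mem_read_bw:.2f}")
print(f"Memory write bandwidth [MBytes/s] {mem_write_bw:.2f}")
print(f"Memory bandwidth [MBytes/s]       {mem_bw:.2f}")
print(f"Memory data volume [GBytes]       {mem_volume:.3f}")
```

The same two helpers would apply to the CBOX sums (LLC_LOOKUP_DATA_READ, LLC_VICTIMS_M_STATE) by passing the eight per-box counts instead of the four per-channel counts.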