likwid collector
The likwid collector is probably the most complicated collector. The LIKWID library is included as a static library with direct access mode. Direct access mode is suitable if the daemon is executed by the root user. The static library does not contain the performance groups, so all information needs to be provided in the configuration.
"likwid": {
  "force_overwrite" : false,
  "invalid_to_zero" : false,
  "eventsets": [
    {
      "events" : {
        "COUNTER0": "EVENT0",
        "COUNTER1": "EVENT1"
      },
      "metrics" : [
        {
          "name": "sum_01",
          "calc": "COUNTER0 + COUNTER1",
          "publish": false,
          "unit": "myunit",
          "type": "hwthread"
        }
      ]
    }
  ],
  "globalmetrics" : [
    {
      "name": "global_sum",
      "calc": "sum_01",
      "publish": true,
      "unit": "myunit",
      "type": "hwthread"
    }
  ]
}
The likwid configuration consists of two parts, the eventsets and the globalmetrics:
- Each event set in the eventsets list has two parts, the events and a set of derivable metrics. Each of the events is a counter:event pair in LIKWID's syntax. The metrics are a list of formulas to derive the metric value from the measurements of the events. Each metric has a name, a formula, a type and a publish flag. There is an optional unit field. Counter names can be used like variables in the formulas, so PMC0+PMC1 sums the measurements of the two events configured in the counters PMC0 and PMC1. You can optionally use time for the measurement time and inverseClock for 1.0/baseCpuFrequency. The type tells the LikwidCollector whether it is a metric for each hardware thread (hwthread) or each CPU socket (socket). The publish flag tells the LikwidCollector whether a metric should be sent to the router or is only used internally to compute a global metric.
- The globalmetrics are metrics which require data from multiple event set measurements to be derived. The inputs are the metrics in the event sets. Similar to the metrics in the event sets, the global metrics are defined by a name, a formula, a type and a publish flag. See event set metrics for details. The only difference is that there is no access to the raw event measurements anymore, only to the metrics. Also, time and inverseClock cannot be used anymore. So, the idea is to derive a metric in the eventsets section and reuse it in the globalmetrics part. If you need a metric only for deriving the global metrics, disable forwarding of the event set metric ("publish": false). Be aware that the combination might be misleading because the behavior of a metric changes over time and the multiple measurements might cover different computing phases. As with the metrics in the event sets, you can specify a metric unit with the unit field.
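The relation between events, event set metrics and global metrics can be sketched in Python. This is only an illustration, not the collector's implementation: the helper `evaluate` and all counter values are made up, and the formulas are evaluated with Python's `eval` on a restricted namespace, which is an assumption chosen for brevity.

```python
# Illustrative sketch only: mimics how formulas use counter names as variables.
# `evaluate` and all sample values are hypothetical.

def evaluate(formula, variables):
    """Evaluate an arithmetic formula with the given variable values."""
    # Empty builtins so that only the supplied names are visible to the formula.
    # eval() is used for brevity; do not feed it untrusted input.
    return eval(formula, {"__builtins__": {}}, dict(variables))

# Raw measurements of one event set for one hardware thread (made-up numbers).
counters = {
    "COUNTER0": 1200.0,
    "COUNTER1": 800.0,
    "time": 2.0,                    # measurement time
    "inverseClock": 1.0 / 2.4e9,    # 1.0/baseCpuFrequency
}

# Event set metric: may use counter names, time and inverseClock.
eventset_metrics = {"sum_01": evaluate("COUNTER0 + COUNTER1", counters)}

# Global metric: sees only the event set metrics, not the raw counters.
global_sum = evaluate("sum_01", eventset_metrics)

print(eventset_metrics["sum_01"], global_sum)
```

In this sketch, evaluating a formula such as `COUNTER0` against `eventset_metrics` fails with a `NameError`, which mirrors the rule that global metrics have no access to raw event measurements.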
Additional options:
- force_overwrite: Same as setting LIKWID_FORCE=1. In case counters are already in use, LIKWID overwrites their configuration to do its measurements.
- invalid_to_zero: In some cases, the calculations result in NaN or Inf. With this option, all NaN and Inf values are replaced with 0.0. See the separate section below.
- access_mode: Specify the LIKWID access mode: direct for direct register access as root user or accessdaemon. The access mode perf_event is currently untested.
- accessdaemon_path: Folder of the access daemon likwid-accessD (like /usr/local/sbin)
- liblikwid_path: Location of liblikwid.so including the file name, like /usr/local/lib/liblikwid.so
Available metric scopes
Hardware performance counters are scattered all over the system nowadays. A counter covers a specific part of the system. While there are hardware-thread-specific counters for CPU cycles, instructions and so on, others are specific to a whole CPU socket/package. To address that, the LikwidCollector requires the specification of a type for each metric.
- hwthread: One metric per CPU hardware thread with the tags "type" : "hwthread" and "type-id" : "$hwthread_id"
- socket: One metric per CPU socket/package with the tags "type" : "socket" and "type-id" : "$socket_id"
Note: You cannot specify socket scope for a metric that is measured at hwthread scope, so some expert knowledge or lookup work in the LIKWID wiki is required. Get the scope of each counter from the Architecture pages. As soon as one counter in a metric is socket-specific, the whole metric is socket-specific.
As a guideline:
- All counters FIXCx, PMCy and TMAz have the scope hwthread
- All counter names containing BOX have the scope socket
- All PWRx counters have the scope socket, except "PWR1" : "RAPL_CORE_ENERGY", which has hwthread scope
- All DFCx counters have the scope socket
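The guideline can be written down as a small lookup. The following Python sketch is only an approximation of the rules listed above, and the Architecture pages in the LIKWID wiki remain the authority. The helpers `counter_scope` and `metric_type` are hypothetical and not part of the collector or of LIKWID.

```python
import re

# Sketch of the guideline above; both helpers are hypothetical.

def counter_scope(counter, event=""):
    """Return 'hwthread' or 'socket' for a counter, following the guideline."""
    if counter == "PWR1" and event == "RAPL_CORE_ENERGY":
        return "hwthread"  # the exception named in the guideline
    if "BOX" in counter or counter.startswith(("PWR", "DFC")):
        return "socket"
    if counter.startswith(("FIXC", "PMC", "TMA")):
        return "hwthread"
    raise ValueError(f"unknown counter {counter!r}: look it up in the LIKWID wiki")

def metric_type(formula, events):
    """A metric is socket-specific as soon as one of its counters is."""
    used = [c for c in events if re.search(rf"\b{re.escape(c)}\b", formula)]
    scopes = {counter_scope(c, events[c]) for c in used}
    return "socket" if "socket" in scopes else "hwthread"

events = {
    "PMC0": "RETIRED_INSTRUCTIONS",
    "PMC1": "CPU_CLOCKS_UNHALTED",
    "DFC0": "DRAM_CHANNEL_0",
    "DFC1": "DRAM_CHANNEL_1",
}
print(metric_type("PMC0/PMC1", events))
print(metric_type("0.000001*(DFC0+DFC1)*64.0/time", events))
```

A formula that mixes both kinds of counters, such as `PMC0/DFC0`, is classified as socket in this sketch.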
Help with the configuration
The configuration for the likwid collector is quite complicated. Most users don't use LIKWID with the counter:event notation but rely on the performance groups defined by the LIKWID team for each architecture. To help with the likwid collector configuration, we included a script scripts/likwid_perfgroup_to_cc_config.py that creates the configuration of an eventset from a performance group (using a LIKWID installation in $PATH):
$ likwid-perfctr -i
[...]
short name: ICX
[...]
$ likwid-perfctr -a
[...]
MEM_DP
MEM
FLOPS_SP
CLOCK
[...]
$ scripts/likwid_perfgroup_to_cc_config.py ICX MEM_DP
{
  "events": {
    "FIXC0": "INSTR_RETIRED_ANY",
    "FIXC1": "CPU_CLK_UNHALTED_CORE",
    "..." : "..."
  },
  "metrics" : [
    {
      "calc": "time",
      "name": "Runtime (RDTSC) [s]",
      "publish": true,
      "unit": "seconds",
      "scope": "hwthread"
    },
    {
      "..." : "..."
    }
  ]
}
You can copy this JSON and add it to the eventsets list. If you specify multiple event sets, you can add globally derived metrics in the extra globalmetrics section with the metric names as variables.
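Instead of copying by hand, the script output can also be appended programmatically. The following Python sketch works on in-memory data only and assumes the configuration layout shown in the examples of this document. The helper `add_eventset` is hypothetical, and the `script_output` string is a shortened, made-up stand-in for what the script prints.

```python
import json

def add_eventset(likwid_config, eventset_json):
    """Append an event set (given as JSON text) to the likwid configuration."""
    eventset = json.loads(eventset_json)
    for key in ("events", "metrics"):
        if key not in eventset:
            raise ValueError(f"event set lacks the '{key}' section")
    likwid_config.setdefault("eventsets", []).append(eventset)
    return likwid_config

# Shortened, made-up stand-in for the output of the helper script.
script_output = """
{
  "events": {"FIXC0": "INSTR_RETIRED_ANY", "FIXC1": "CPU_CLK_UNHALTED_CORE"},
  "metrics": [
    {"calc": "time", "name": "Runtime (RDTSC) [s]", "publish": true,
     "unit": "seconds", "type": "hwthread"}
  ]
}
"""

config = {"force_overwrite": False, "invalid_to_zero": False, "eventsets": []}
add_eventset(config, script_output)
print(json.dumps({"likwid": config}, indent=2))
```

Parsing the text with `json.loads` has the side effect of rejecting malformed JSON before it ends up in the configuration.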
Mixed usage between daemon and users
LIKWID checks the file /var/run/likwid.lock before performing any interfering operations. The owner of the file determines who is allowed to access the counters. If the file does not exist, it is created for the current user. So, if you want to temporarily allow counter access to a user (e.g. in a job):
Before (SLURM prolog, ...)
$ chown $JOBUSER /var/run/likwid.lock
After (SLURM epilog, ...)
$ chown $CCUSER /var/run/likwid.lock
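To verify that a prolog or epilog had the intended effect, the current owner of the lock file can be queried. This is a minimal Python sketch for Unix systems. The helper `lock_owner` is hypothetical and only reads the file metadata; it does not talk to LIKWID.

```python
import os
import pwd

def lock_owner(path="/var/run/likwid.lock"):
    """Return the user name owning the lock file, or None if it does not exist."""
    try:
        uid = os.stat(path).st_uid
    except FileNotFoundError:
        return None
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)  # uid without an entry in the user database

print(lock_owner())
```

A result of `None` means that the lock file does not exist yet, so LIKWID would create it for whichever user accesses the counters next.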
invalid_to_zero option
In some cases LIKWID returns 0.0 for some events that are further used in processing and maybe used as divisor in a calculation. After evaluation of a metric, the result might be NaN or +-Inf. These resulting metrics are commonly not created and forwarded to the router because the InfluxDB line protocol does not support these special floating-point values. If you want to have them sent, this option forces these metric values to be 0.0 instead.
One might think this does not happen often, but metrics that are common in performance engineering, like instructions per cycle (IPC) or, more frequently, the actual CPU clock, are derived from events like CPU_CLK_UNHALTED_CORE (Intel), which do not increment in halted state (as the name implies). There are different power management systems in a chip which can cause a hardware thread to enter such a state. Moreover, if no cycles are executed by the core, many other events are not incremented either (like INSTR_RETIRED_ANY for retired instructions, which is part of IPC).
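The effect of the option can be sketched in Python for the IPC case of a halted hardware thread. This is an illustration of the behavior described above, not the collector's code. The helpers `ieee_div` and `finalize` are hypothetical, and `ieee_div` exists only because Python raises an exception on float division by zero instead of returning NaN or Inf.

```python
import math

def ieee_div(a, b):
    """Divide like IEEE 754 floats do (Python raises ZeroDivisionError instead)."""
    if b != 0.0:
        return a / b
    if a == 0.0 or math.isnan(a):
        return math.nan
    # Simplification: the sign of a zero divisor is ignored.
    return math.copysign(math.inf, a)

def finalize(value, invalid_to_zero):
    """Return the value to forward, or None if the metric is not created."""
    if math.isnan(value) or math.isinf(value):
        return 0.0 if invalid_to_zero else None
    return value

# A halted hardware thread: neither instructions nor cycles were counted.
ipc = ieee_div(0.0, 0.0)  # INSTR_RETIRED_ANY / CPU_CLK_UNHALTED_CORE
print(finalize(ipc, invalid_to_zero=False))
print(finalize(ipc, invalid_to_zero=True))
```

With the option disabled the sketch yields `None` (no metric), with the option enabled it yields `0.0`.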
Example configuration
AMD Zen3
"likwid": {
  "force_overwrite" : false,
  "invalid_to_zero" : false,
  "eventsets": [
    {
      "events": {
        "FIXC1": "ACTUAL_CPU_CLOCK",
        "FIXC2": "MAX_CPU_CLOCK",
        "PMC0": "RETIRED_INSTRUCTIONS",
        "PMC1": "CPU_CLOCKS_UNHALTED",
        "PMC2": "RETIRED_SSE_AVX_FLOPS_ALL",
        "PMC3": "MERGE",
        "DFC0": "DRAM_CHANNEL_0",
        "DFC1": "DRAM_CHANNEL_1",
        "DFC2": "DRAM_CHANNEL_2",
        "DFC3": "DRAM_CHANNEL_3"
      },
      "metrics": [
        {
          "name": "ipc",
          "calc": "PMC0/PMC1",
          "type": "hwthread",
          "publish": true
        },
        {
          "name": "flops_any",
          "calc": "0.000001*PMC2/time",
          "unit": "MFlops/s",
          "type": "hwthread",
          "publish": true
        },
        {
          "name": "clock",
          "calc": "0.000001*(FIXC1/FIXC2)/inverseClock",
          "type": "hwthread",
          "unit": "MHz",
          "publish": true
        },
        {
          "name": "mem1",
          "calc": "0.000001*(DFC0+DFC1+DFC2+DFC3)*64.0/time",
          "unit": "Mbyte/s",
          "type": "socket",
          "publish": false
        }
      ]
    },
    {
      "events": {
        "DFC0": "DRAM_CHANNEL_4",
        "DFC1": "DRAM_CHANNEL_5",
        "DFC2": "DRAM_CHANNEL_6",
        "DFC3": "DRAM_CHANNEL_7",
        "PWR0": "RAPL_CORE_ENERGY",
        "PWR1": "RAPL_PKG_ENERGY"
      },
      "metrics": [
        {
          "name": "pwr_core",
          "calc": "PWR0/time",
          "unit": "Watt",
          "type": "socket",
          "publish": true
        },
        {
          "name": "pwr_pkg",
          "calc": "PWR1/time",
          "type": "socket",
          "unit": "Watt",
          "publish": true
        },
        {
          "name": "mem2",
          "calc": "0.000001*(DFC0+DFC1+DFC2+DFC3)*64.0/time",
          "unit": "Mbyte/s",
          "type": "socket",
          "publish": false
        }
      ]
    }
  ],
  "globalmetrics": [
    {
      "name": "mem_bw",
      "calc": "mem1+mem2",
      "type": "socket",
      "unit": "Mbyte/s",
      "publish": true
    }
  ]
}
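The arithmetic behind the global metric mem_bw can be followed with made-up numbers. The Python sketch below only replays the formulas from the example. The counter values, the measurement time and the helper `mem_mbytes_per_s` are assumptions for illustration.

```python
def mem_mbytes_per_s(dfc_counts, time_s):
    """Replays 0.000001*(DFC0+DFC1+DFC2+DFC3)*64.0/time from the example."""
    return 0.000001 * sum(dfc_counts) * 64.0 / time_s

# First event set: DRAM channels 0-3 (made-up counts, measured for 1 second).
mem1 = mem_mbytes_per_s([5e6, 5e6, 5e6, 5e6], time_s=1.0)

# Second event set: DRAM channels 4-7, measured in a different time slice.
mem2 = mem_mbytes_per_s([2.5e6, 2.5e6, 2.5e6, 2.5e6], time_s=1.0)

# Global metric "mem_bw": only mem1 and mem2 are visible here, not the DFC counters.
mem_bw = mem1 + mem2
print(f"mem1={mem1:.1f} mem2={mem2:.1f} mem_bw={mem_bw:.1f} Mbyte/s")
```

Because mem1 and mem2 stem from different measurement intervals, mem_bw mixes two time slices. This is why both are kept internal ("publish": false) and only the sum is published.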
How to get the eventsets and metrics from LIKWID
The likwid collector reads hardware performance counters at hwthread and socket level. The configuration looks quite complicated but it is basically copy and paste from LIKWID's performance groups. The collector went through multiple iterations that tried to use the performance groups directly, but this lacked flexibility. The current way of configuration provides the most flexibility.
The logic is as follows: There are multiple eventsets, each consisting of a list of counters+events and a list of metrics. If you compare a common performance group with the example setting above, there is not much difference:
EVENTSET                        -> "events": {
FIXC1 ACTUAL_CPU_CLOCK          ->   "FIXC1": "ACTUAL_CPU_CLOCK",
FIXC2 MAX_CPU_CLOCK             ->   "FIXC2": "MAX_CPU_CLOCK",
PMC0 RETIRED_INSTRUCTIONS       ->   "PMC0": "RETIRED_INSTRUCTIONS",
PMC1 CPU_CLOCKS_UNHALTED        ->   "PMC1": "CPU_CLOCKS_UNHALTED",
PMC2 RETIRED_SSE_AVX_FLOPS_ALL  ->   "PMC2": "RETIRED_SSE_AVX_FLOPS_ALL",
PMC3 MERGE                      ->   "PMC3": "MERGE"
                                -> }
The metrics follow the same procedure:
METRICS                         -> "metrics": [
IPC PMC0/PMC1                   ->   {
                                ->     "name": "IPC",
                                ->     "calc": "PMC0/PMC1",
                                ->     "type": "hwthread",
                                ->     "publish": true
                                ->   }
                                -> ]
The script scripts/likwid_perfgroup_to_cc_config.py might help you.
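The mapping shown above is mechanical enough to sketch in a few lines of Python. This is not the code of scripts/likwid_perfgroup_to_cc_config.py but a hypothetical stand-in, `group_to_eventset`, which assumes that each METRICS line ends with a formula that contains no spaces. The metric type is passed in by the caller here and still has to be checked per metric.

```python
def group_to_eventset(group_text, metric_type="hwthread"):
    """Convert the EVENTSET and METRICS sections of a performance group text."""
    events, metrics, section = {}, [], None
    for line in group_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line in ("EVENTSET", "METRICS", "LONG") or line.startswith("SHORT"):
            section = line.split()[0]
            continue
        if section == "EVENTSET":
            counter, event = line.split(None, 1)
            events[counter] = event
        elif section == "METRICS":
            # Assumption: the formula is the last token and contains no spaces.
            name, calc = line.rsplit(None, 1)
            metrics.append({"name": name, "calc": calc,
                            "type": metric_type, "publish": True})
    return {"events": events, "metrics": metrics}

# Made-up excerpt in the layout of a performance group.
group = """
SHORT Example group

EVENTSET
FIXC1 ACTUAL_CPU_CLOCK
FIXC2 MAX_CPU_CLOCK
PMC0 RETIRED_INSTRUCTIONS
PMC1 CPU_CLOCKS_UNHALTED

METRICS
IPC PMC0/PMC1

LONG
Instructions per cycle.
"""
print(group_to_eventset(group))
```

Splitting the metric line from the right keeps metric names that contain spaces, such as `Runtime (RDTSC) [s]`, in one piece.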