mirror of
https://github.com/ClusterCockpit/cc-metric-collector.git
synced 2024-11-13 21:47:25 +01:00
b3c27e0af5
* Cleanup: Remove unused code
* Use Golang duration parser for 'interval' and 'duration' in main config
* Update handling of LIKWID headers. Download only if not already present in the system. Fixes #73
* Units with cc-units (#64)
* Add option to normalize units with cc-unit
* Add unit conversion to router
* Add option to change unit prefix in the router
* Add to MetricRouter README
* Add order of operations in router to README
* Use second add_tags/del_tags only if metric gets renamed
* Skip disks in DiskstatCollector that have size=0
* Check readability of sensor files in TempCollector
* Fix for --once option
* Rename `cpu` type to `hwthread` (#69)
* Rename 'cpu' type to 'hwthread' to avoid naming clashes with MetricStore and CC-Webfrontend
* Collectors in parallel (#74)
* Provide info to CollectorManager whether the collector can be executed in parallel with others
* Split serial and parallel collectors. Read in parallel first
* Update NvidiaCollector with new metrics, MIG and NvLink support (#75)
* CC topology module update (#76)
* Rename CPU to hardware thread, write some comments
* Do renaming in other parts
* Remove CpuList and SocketList function from metricCollector. Available in ccTopology
* Option to use MIG UUID as subtype-id in NvidiaCollector
* Option to use MIG slice name as subtype-id in NvidiaCollector
* MetricRouter: Fix JSON in README
* Fix for Github Action to really use the selected version
* Remove Ganglia installation in runonce Action and add Go 1.18
* Fix daemon options in init script
* Add separate go.mod files to use it with deprecated 1.16
* Minor updates for Makefiles
* Fix string comparison
* AMD ROCm SMI collector (#77)
* Add collector for AMD ROCm SMI metrics
* Fix import path
* Fix imports
* Remove Board Number
* Store GPU index explicitly
* Remove board number from description
* Use http instead of ftp to download likwid
* Fix serial number in rocmCollector
* Improved http sink (#78)
* Automatic flush in NatsSink
* Tweak default options of HttpSink
* Shorter critical section and retries for HttpSink
* Fix error handling
* Remove file added by mistake.
* Use http instead of ftp to download likwid
* Fix serial number in rocmCollector
* Fix: When sending metrics failed the batch size could be exceeded
* Improved dropping of metrics that failed to send
* Add memstats and topprocs metric
* Updated to latest modules
* Check that at least one sink is running
* Add drop rate, when send buffer is full
* Allow only one timer at a time
* Use mutex to ensure only one flush timer is running
* Fix for NvidiaCollector when devices are not in MIG mode
* Remove Golang version 1.16 and 1.17 from Action. Latest commits require Golang 1.18
* Use Golang 1.18 in Release action to build RPMs
* Change unit of CpufreqCollector to Hz. That's what the sysfs outputs
* Make wget quiet in Release action to reduce log size

Co-authored-by: Thomas Roehl <thomas.roehl@fau.de>
Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
Co-authored-by: Lou <lou.knauer@gmx.de>
41 lines
819 B
JSON
{
    "cpufreq": {},
    "cpufreq_cpuinfo": {},
    "gpfs": {
        "exclude_filesystem": [
            "test_fs"
        ]
    },
    "ibstat": {},
    "loadavg": {
        "exclude_metrics": [
            "proc_total"
        ]
    },
    "memstat": {},
    "netstat": {
        "include_devices": [
            "enp5s0"
        ],
        "send_derived_values": true
    },
    "numastats": {},
    "nvidia": {},
    "tempstat": {
        "report_max_temperature": true,
        "report_critical_temperature": true,
        "tag_override": {
            "hwmon0": {
                "type": "socket",
                "type-id": "0"
            },
            "hwmon1": {
                "type": "socket",
                "type-id": "1"
            }
        }
    },
    "topprocs": {
        "num_procs": 5
    }
}