* Add cpu_used (all-cpu_idle) to CpustatCollector
* Update cc-metric-collector.init
* Allow selection of timestamp precision in HttpSink
* Add comment about precision requirement for cc-metric-store
* Fix for API changes in gofish@v0.15.0
* Update requirements to latest version
* Read sensors through redfish
* Update golang toolchain to 1.21
* Remove stray error check
* Update main config in configuration.md
* Update Release action to use golang 1.22 stable release, no golang RPMs anymore
* Update runonce action to use golang 1.22 stable release, no golang RPMs anymore
* Update README.md
Use the correct JSON type in configuration
* Update sink's README
* Test whether ipmitool or ipmi-sensors can be executed without errors
* Little fixes to the prometheus sink (#115)
* Add uint64 to float64 cast option
* Add prometheus sink to the list of available sinks
* Add aggregated counters by gpu for nvlink errors
---------
Co-authored-by: Michael Schwarz <schwarz@uni-paderborn.de>
* Ccmessage migration (#119)
* Switch to CCMessage for all files.
---------
Co-authored-by: Holger Obermaier <Holger.Obermaier@kit.edu>
Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
* Switch to ccmessage also for latest additions in nvidiaMetric
* New Message processor (#118)
* New message processor to check whether a message should be dropped or manipulate it in flight
* Create a copy of message before manipulation
---------
Co-authored-by: Holger Obermaier <Holger.Obermaier@kit.edu>
Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
* Update collector's Makefile and go.mod/sum files
* Use message processor in router, all sinks and all receivers
* Add support for credential file (NKEY) to NATS sink and receiver
* Fix JSON keys in message processor configuration
* Update docs for message processor, router and the default router config file
* Add link to expr syntax and fix regex matching docs
* Update sample collectors
* Minor style change in collector manager
* Some helpers for ccTopology
* LIKWID collector: write log owner change only once
* Fix for metrics without units and reduce debugging messages for messageProcessor
* Use shortened hostname for hostname added by router
* Define default port for NATS
* CPUstat collector: only add unit for applicable metrics
* Add precision option to all sinks using Influx's encoder
* Add message processor to all sink documentation
* Add units to documentation of cpustat collector
---------
Co-authored-by: Holger Obermaier <Holger.Obermaier@kit.edu>
Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
Co-authored-by: oscarminus <me@oscarminus.de>
Co-authored-by: Michael Schwarz <schwarz@uni-paderborn.de>
* cpustatMetric.go: Use derived values instead of absolute values
The values in /proc/stat are absolute counters accumulated since the
system was booted. To obtain the CPU utilization, the change in the
counters over time must be computed. Using only the absolute values
means that changes in utilization do not become visible, especially
once the counters have grown large.
* Add new collector for /proc/schedstat
The `schedstat` collector reads data from /proc/schedstat and calculates
a load value per hwthread. This can be useful for detecting bad CPU
pinning on shared nodes, for example.
Co-authored-by: Michael Schwarz <post@michael-schwarz.name>