678 Commits

Author SHA1 Message Date
Thomas Roehl
83d5ad72fd Fix for metrics without units and reduce debugging messages for messageProcessor 2024-12-19 14:33:04 +01:00
Thomas Roehl
2f6f8c846a LIKWID collector: write log owner change only once 2024-12-19 14:29:49 +01:00
Thomas Roehl
8270d93b67 Some helpers for ccTopology 2024-12-19 14:29:00 +01:00
Thomas Roehl
d1e406f765 Minor style change in collector manager 2024-12-19 14:05:32 +01:00
Thomas Roehl
0c95439159 Update sample collectors 2024-12-19 14:04:42 +01:00
Thomas Roehl
276aa58e50 Add link to expr syntax and fix regex matching docs 2024-12-12 05:35:48 +01:00
Thomas Roehl
e91fc6004f Update docs for message processor, router and the default router config file 2024-12-12 05:24:22 +01:00
Thomas Roehl
beeea9e3aa Fix JSON keys in message processor configuration 2024-12-12 05:23:54 +01:00
Thomas Roehl
8fd60afad9 Add support for credential file (NKEY) to NATS sink and receiver 2024-12-12 04:10:51 +01:00
Thomas Roehl
14ca925622 Use message processor in router, all sinks and all receivers 2024-12-11 20:53:22 +01:00
Thomas Roehl
f8075c92ba Update collector's Makefile and go.mod/sum files 2024-12-11 19:10:55 +01:00
Thomas Gruber
6d7604c74f New Message processor (#118)
* Add cpu_used (all-cpu_idle) to CpustatCollector

* Update cc-metric-collector.init

* Allow selection of timestamp precision in HttpSink

* Add comment about precision requirement for cc-metric-store

* Fix for API changes in gofish@v0.15.0

* Update requirements to latest version

* Read sensors through redfish

* Update golang toolchain to 1.21

* Remove stray error check

* Update main config in configuration.md

* Update Release action to use golang 1.22 stable release, no golang RPMs anymore

* Update runonce action to use golang 1.22 stable release, no golang RPMs anymore

* New message processor to check whether a message should be dropped or manipulate it in flight

* Create a copy of message before manipulation

---------

Co-authored-by: Holger Obermaier <Holger.Obermaier@kit.edu>
Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
2024-12-11 19:09:50 +01:00
Thomas Roehl
704d332082 Switch to ccmessage also for latest additions in nvidiaMetric 2024-12-11 19:01:54 +01:00
Thomas Gruber
38e78c7b37
Ccmessage migration (#119)
* Add cpu_used (all-cpu_idle) to CpustatCollector

* Update cc-metric-collector.init

* Allow selection of timestamp precision in HttpSink

* Add comment about precision requirement for cc-metric-store

* Fix for API changes in gofish@v0.15.0

* Update requirements to latest version

* Read sensors through redfish

* Update golang toolchain to 1.21

* Remove stray error check

* Update main config in configuration.md

* Update Release action to use golang 1.22 stable release, no golang RPMs anymore

* Update runonce action to use golang 1.22 stable release, no golang RPMs anymore

* Switch to CCMessage for all files.

---------

Co-authored-by: Holger Obermaier <Holger.Obermaier@kit.edu>
Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
2024-12-11 19:00:29 +01:00
Thomas Roehl
dbe50c5dd0 Merge branch 'develop' of github.com:ClusterCockpit/cc-metric-collector into develop 2024-12-10 19:11:06 +01:00
oscarminus
26ce177b5b
Little fixes to the prometheus sink (#115)
* Add uint64 to float64 cast option

* Add prometheus sink to the list of available sinks

* Add aggregated counters by gpu for nvlink errors

---------

Co-authored-by: Michael Schwarz <schwarz@uni-paderborn.de>
2024-11-22 21:04:44 +01:00
Thomas Gruber
8837ff4474
Merge 'develop' into 'main' (#121)
* Add cpu_used (all-cpu_idle) to CpustatCollector

* Update cc-metric-collector.init

* Allow selection of timestamp precision in HttpSink

* Add comment about precision requirement for cc-metric-store

* Fix for API changes in gofish@v0.15.0

* Update requirements to latest version

* Read sensors through redfish

* Update golang toolchain to 1.21

* Remove stray error check

* Update main config in configuration.md

* Update Release action to use golang 1.22 stable release, no golang RPMs anymore

* Update runonce action to use golang 1.22 stable release, no golang RPMs anymore

* Update README.md

Use right JSON type in configuration

* Update sink's README

* Test whether ipmitool or ipmi-sensors can be executed without errors

---------

Co-authored-by: Holger Obermaier <Holger.Obermaier@kit.edu>
Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
2024-11-20 16:50:12 +01:00
Thomas Gruber
8e8be09ed9
Merge latest commits from develop to main branch (#114)
* Add cpu_used (all-cpu_idle) to CpustatCollector

* Update cc-metric-collector.init

* Allow selection of timestamp precision in HttpSink

* Add comment about precision requirement for cc-metric-store

* Fix for API changes in gofish@v0.15.0

* Update requirements to latest version

* Read sensors through redfish

* Update golang toolchain to 1.21

* Remove stray error check

* Update main config in configuration.md

* Update Release action to use golang 1.22 stable release, no golang RPMs anymore

* Update runonce action to use golang 1.22 stable release, no golang RPMs anymore

* Update README.md

Use right JSON type in configuration

* Update sink's README

* Test whether ipmitool or ipmi-sensors can be executed without errors

---------

Co-authored-by: Holger Obermaier <Holger.Obermaier@kit.edu>
Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
2024-11-20 16:22:39 +01:00
Thomas Gruber
51dda886f1
Update runonce.yml to download golang from official sources 2024-11-14 16:31:51 +01:00
brinkcoder
c96021c7cc
Fix: Create lock file if it does not exist in likwidMetric.go (#120)
Co-authored-by: exterr2f <Robert.Externbrink@rub.de>
2024-11-14 16:20:47 +01:00
Thomas Gruber
8f336c1bb7
Update likwidMetric.md 2024-10-08 13:36:46 +02:00
Thomas Gruber
7d3f67f15b
Update likwidMetric.md 2024-10-07 14:09:09 +02:00
Thomas Roehl
a36f8fe19d Test whether ipmitool or ipmi-sensors can be executed without errors 2024-07-26 16:46:16 +02:00
Thomas Roehl
2efed7c631 Merge branch 'develop' of github.com:ClusterCockpit/cc-metric-collector into develop 2024-07-15 12:42:58 +02:00
Thomas Roehl
2affb4d8a7 Update sink's README 2024-07-15 12:42:51 +02:00
Thomas Gruber
55cb12c9f8
Update README.md
Use right JSON type in configuration
2024-07-15 12:41:07 +02:00
Thomas Gruber
f6c94e32b3
Update README.md for sinks
Wrong JSON format, it is an object, not a list.
2024-07-15 12:38:34 +02:00
Thomas Roehl
b69efdc2a4 Update runonce action to use golang 1.22 stable release, no golang RPMs anymore 2024-06-17 14:28:17 +02:00
Thomas Roehl
caa04da163 Update Release action to use golang 1.22 stable release, no golang RPMs anymore 2024-06-17 14:11:33 +02:00
Thomas Gruber
0ae537fdc9
Update main config in configuration.md 2024-06-17 11:07:51 +02:00
Thomas Gruber
2e7990f87d
Update likwidMetric.md 2024-04-18 13:14:32 +02:00
Thomas Roehl
16c796a2b8 Merge branch 'develop' of github.com:ClusterCockpit/cc-metric-collector into develop 2024-04-10 19:57:54 +02:00
Thomas Roehl
b6c4769db3 Remove stray error check 2024-04-10 19:57:46 +02:00
Holger Obermaier
7bbee70c14 Update golang toolchain to 1.21 2024-03-06 15:14:44 +01:00
Holger Obermaier
902f4349b6 Read sensors through redfish 2024-03-06 14:59:47 +01:00
Holger Obermaier
6aada60d97 Update requirements to latest version 2024-01-22 16:21:14 +01:00
Holger Obermaier
06ca37e705 Fix for API changes in gofish@v0.15.0 2024-01-22 15:46:18 +01:00
Thomas Roehl
9b671ce68f Add comment about precision requirement for cc-metric-store 2023-12-11 16:06:28 +01:00
Thomas Roehl
226e8425cb Allow selection of timestamp precision in HttpSink 2023-12-11 14:57:06 +01:00
Thomas Gruber
a37f6603c8
Update cc-metric-collector.init 2023-12-11 13:47:53 +01:00
Thomas Roehl
78902305e8 Merge branch 'develop' of github.com:ClusterCockpit/cc-metric-collector into develop 2023-12-08 15:11:40 +01:00
Thomas Gruber
f496db4905
Fix job dependency in Release.yml v0.6.7 2023-12-04 12:26:57 +01:00
Thomas Gruber
6ab45dd3ec
Merge develop into main (#109)
* Add cpu_used (all-cpu_idle) to CpustatCollector

* Update to line-protocol/v2

* Update runonce.yml with Golang 1.20

* Update fsnotify in LIKWID Collector

* Use not a pointer to line-protocol.Encoder

* Simplify Makefile

* Use only as many arguments as required

* Allow sum function to handle non float types

* Allow values to be a slice of type float64, float32, int, int64, int32, bool

* Use generic function to simplify code

* Add missing case for type []int32

* Use generic function to compute minimum

* Use generic function to compute maximum

* Use generic function to compute average

* Add error value to sumAnyType

* Use generic function to compute median

* For older versions of go slices is not part of the installation

* Remove old entries from go.sum

* Use simpler sort function

* Compute metrics ib_total and ib_total_pkts

* Add aggregated metrics.
Add missing units

* Update likwidMetric.go

Fixes a potential bug when `fsnotify.NewWatcher()` fails with an error

* Completely avoid memory allocations in infinibandMetric read()

* Fixed initialization: Initialization and measurements should run in the same thread

* Add safe.directory to Release action

* Fix path after installation to /usr/bin after installation

* ioutil.ReadFile is deprecated: As of Go 1.16, this function simply calls os.ReadFile

* Switch to package slices from the golang 1.21 default library

* Read file line by line

* Read file line by line

* Read file line by line

* Use CamelCase

* Use CamelCase

* Fix function getNumaDomain, it always returned 0

* Avoid type conversion by using Atoi
Avoid copying structs by using pointer access
Increase readability with CamelCase variable names

* Add caching

* Cache CpuData

* Cleanup

* Use init function to initialize cache structure to avoid multi threading problems

* Reuse information from /proc/cpuinfo

* Avoid slice cloning. Directly use the cache

* Add DieList

* Add NumaDomainList and SMTList

* Cleanup

* Add comment

* Lookup core ID from /sys/devices/system/cpu, /proc/cpuinfo is not portable

* Lookup all information from /sys/devices/system/cpu, /proc/cpuinfo is not portable

* Correctly handle lists from /sys

* Add Simultaneous Multithreading siblings

* Replace deprecated thread_siblings_list by core_cpus_list

* Reduce number of required slices

* Allow to send total values per core, socket and node

* Send all metrics with same time stamp
calcEventsetMetrics does only computation, counter measurement is done before

* Input parameters should be float64 when evaluating to float64

* Send all metrics with same time stamp
calcGlobalMetrics does only computation, counter measurement is done before

* Remove unused variable gmresults

* Add comments

* Updated go packages

* Add build with golang 1.21

* Switch to checkout action version 4

* Switch to setup-go action version 4

* Add workflow_dispatch to allow manual run of workflow

* Add workflow_dispatch to allow manual run of workflow

* Add release build jobs to runonce.yml

* Switch to golang 1.20 for RHEL based distributions

* Use dnf to download golang

* Remove golang versions before 1.20

* Upgrade Ubuntu focal -> jammy

* Pipe golang tar package directly to tar

* Update golang version

* Fix Ubuntu version number

* Add links to ipmi and redfish receivers

* Fix http server addr format

* github.com/influxdata/line-protocol -> github.com/influxdata/line-protocol/v2/lineprotocol

* Corrected spelling

* Add some comments

* github.com/influxdata/line-protocol -> github.com/influxdata/line-protocol/v2/lineprotocol

* Allow other fields not only field "value"

* Add some basic debugging documentation

* Add some basic debugging documentation

* Use a lock for the flush timer

* Add tags in lexical order as required by AddTag()

* Only access meta data, when it gets used as tag

* Use slice to store lexically ordered key value pairs

* Increase golang version requirement to 1.20.

* Avoid package cmp to allow builds with golang v1.20

* Fix: Error NVML library not found did crash
cc-metric-collector with "SIGSEGV: segmentation violation"

* Add config option idle_timeout

* Add basic authentication support

* Add basic authentication support

* Avoid unnecessary memory allocations

* Add documentation for send_*_total values

* Use generic package maps to clone maps

* Reuse flush timer

* Add Influx client options

* Reuse ccTopology functionality

* Do not store unused topology information

* Add batch_size config

* Cleanup

* Use stype and stype-id for the NIC in NetstatCollector

* Wait for concurrent flush operations to finish

* Be more verbose in error messages

* Reverted previous changes.
Made the code too complex without much advantage

* Use line protocol encoder

* Go pkg update

* Stop flush timer, when immediately flushing

* Fix: Corrected unlock access to batch slice

* Add config option to specify whether to use GZip compression in influx write requests

* Add asynchronous send of encoder metrics

* Use DefaultServeMux instead of github.com/gorilla/mux

* Add config option for HTTP keep-alives

* Be more strict, when parsing json

* Add config option for HTTP request timeout and Retry interval

* Allow more than one background send operation

* Fix %sysusers_create_package args (#108)

%sysusers_create_package requires two arguments. See: https://github.com/systemd/systemd/blob/main/src/rpm/macros.systemd.in#L165

* Add nfsiostat to list of collectors

---------

Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
Co-authored-by: Holger Obermaier <holgerob@gmx.de>
Co-authored-by: Obihörnchen <obihoernchende@gmail.com>
2023-12-04 12:21:26 +01:00
Thomas Gruber
9df1054e32
Update stdoutSink.md 2023-10-10 11:57:13 +02:00
Thomas Gruber
e76eaa86ad
Update influxAsyncSink.md 2023-10-10 11:56:42 +02:00
Thomas Gruber
262f0c6a86
Update influxSink.md 2023-10-10 11:56:02 +02:00
Thomas Gruber
b488ff76b1
Update natsSink.md 2023-10-10 11:54:30 +02:00
Thomas Roehl
e42b41f264 Add safe.directory to Release action v0.6.6 2023-08-29 15:39:47 +02:00
Thomas Gruber
195d0794b0
Merge develop branch into main (#106)
* Add cpu_used (all-cpu_idle) to CpustatCollector

* Update to line-protocol/v2

* Update runonce.yml with Golang 1.20

* Update fsnotify in LIKWID Collector

* Use not a pointer to line-protocol.Encoder

* Simplify Makefile

* Use only as many arguments as required

* Allow sum function to handle non float types

* Allow values to be a slice of type float64, float32, int, int64, int32, bool

* Use generic function to simplify code

* Add missing case for type []int32

* Use generic function to compute minimum

* Use generic function to compute maximum

* Use generic function to compute average

* Add error value to sumAnyType

* Use generic function to compute median

* For older versions of go slices is not part of the installation

* Remove old entries from go.sum

* Use simpler sort function

* Compute metrics ib_total and ib_total_pkts

* Add aggregated metrics.
Add missing units

* Update likwidMetric.go

Fixes a potential bug when `fsnotify.NewWatcher()` fails with an error

* Completely avoid memory allocations in infinibandMetric read()

* Fixed initialization: Initialization and measurements should run in the same thread

---------

Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
2023-08-29 14:12:49 +02:00
Thomas Röhl
e7b77f7721 Add cpu_used (all-cpu_idle) to CpustatCollector 2023-04-05 11:20:09 +02:00