Commit Graph

570 Commits

Author SHA1 Message Date
Thomas Roehl
18c5d0eb34 Add example interval aggregation to MetricRouter config for CI 2022-01-30 15:04:31 +01:00
Thomas Gruber
cf810b1c0c
Add Cache and Aggregator to MetricRouter (#21)
* Add Cache and Aggregator to MetricRouter

* Close done channel in MetricCache
2022-01-30 15:03:21 +01:00
Thomas Gruber
11844d9d5d
Add common topology module for MetricCollectors and MetricRouter (#20) 2022-01-30 14:59:26 +01:00
Thomas Gruber
6abbc5f77e
Fix Github Actions (#18)
* Fix config for Github Actions

* Fix paths

* Add CentOS Latest and AlmaLinux 8.5 to RPM action

* Fix ID

* Reduce min Go version to 1.16 and use time.Unix in gpfsMetric
2022-01-30 14:54:36 +01:00
Thomas Roehl
e4a2927b96 Merge branch 'develop' of github.com:ClusterCockpit/cc-metric-collector into develop 2022-01-30 14:31:19 +01:00
Thomas Roehl
d3f5611541 Add functions to get the fields of a CCMetric and export some more CCMetric functions 2022-01-30 14:30:06 +01:00
Thomas Roehl
4541e50bea Minor fixes in ccLogger 2022-01-30 14:29:25 +01:00
Holger Obermaier
9e99e47d73 Wait for close of done channel, to ensure manager finished. 2022-01-30 12:08:33 +01:00
Holger Obermaier
8df58c051f Lower minimum required golang version to 1.16. 2022-01-29 10:04:31 +01:00
Holger Obermaier
7316de2813 Fix crash caused by:
* not running a collector manager when collector manager config file is missing
* not running a metric router when metric router config file is missing
* not running a sink manager when sink manager config file is missing
2022-01-28 19:49:46 +01:00
Holger Obermaier
d2e02ed36d Fix: Add missing hostname tag 2022-01-28 19:31:27 +01:00
Holger Obermaier
4e408f9490 Add documentation 2022-01-28 15:16:58 +01:00
Holger Obermaier
82f5c1c5d0 Minimum requirement go version 1.17 2022-01-28 09:42:19 +01:00
Holger Obermaier
db5b4e4f65 Add type=node to gpfs metric tags 2022-01-28 09:14:25 +01:00
Holger Obermaier
aea3e2c6b1 Place wait group Add() and Done() near to each other 2022-01-27 20:45:22 +01:00
Holger Obermaier
b9236dcc31 Handle shutdown sequentially 2022-01-27 17:43:00 +01:00
Holger Obermaier
e1d0aacd1e Moved as much work as possible to Init() 2022-01-27 11:08:27 +01:00
Holger Obermaier
7077452a5d Split InfiniBand metric collector, one using
/sys filesystem reads and one using perfquery.
2022-01-26 20:18:47 +01:00
Thomas Roehl
76884c3380 Prefix Nvidia metrics with 'nv_' 2022-01-26 18:45:23 +01:00
Thomas Roehl
86e9b55bc9 Fix for documentation 2022-01-26 18:41:25 +01:00
Thomas Roehl
78834337b0 Fix for documentation 2022-01-26 18:37:59 +01:00
Thomas Roehl
0a383a3789 Update CCLogger 2022-01-26 17:09:20 +01:00
Thomas Roehl
5600cf1f5f Use two separate inputs for metric router to simplify management. Activate --logfile option and close MultiChanTicker explicitly 2022-01-26 17:08:53 +01:00
Thomas Roehl
3fd77e6887 Use non-blocking send at close, use common done function and remove default case 2022-01-26 16:54:51 +01:00
Thomas Roehl
babd7a9af8 Use non-blocking send at close 2022-01-26 16:52:56 +01:00
Holger Obermaier
09b7538479 Avoid labels in collector manager loop 2022-01-26 15:54:49 +01:00
Holger Obermaier
c193b80083 Add documentation 2022-01-26 12:31:04 +01:00
Holger Obermaier
3d073080f8 Add documentation 2022-01-26 12:08:40 +01:00
Holger Obermaier
9bd8a3a90b Add documentation 2022-01-26 11:38:43 +01:00
Thomas Roehl
7f77cad056 Don't wait too long in case of --once 2022-01-25 17:49:15 +01:00
Thomas Roehl
2925ad9f40 Use ccLogger anywhere 2022-01-25 17:43:10 +01:00
Holger Obermaier
b4fde31626 Add documentation 2022-01-25 17:20:20 +01:00
Thomas Roehl
8f9bff7efd Merge branch 'develop' of github.com:ClusterCockpit/cc-metric-collector into develop 2022-01-25 16:41:54 +01:00
Thomas Roehl
bafc6322e6 Change to own Logger 2022-01-25 16:40:02 +01:00
Holger Obermaier
a40d1c954b Fix data type mismatch 2022-01-25 16:33:23 +01:00
Thomas Roehl
99aaece6c2 Activate --once option and return proper exit Code with os.Exit() 2022-01-25 15:46:41 +01:00
Thomas Gruber
200af84c54
Modularize the whole thing (#16)
* Use channels, add a metric router, split up configuration and use extended version of Influx line protocol internally

* Use central timer for collectors and router. Add expressions to router

* Add expression to router config

* Update entry points

* Start with README

* Update README for CCMetric

* Formatting

* Update README.md

* Add README for MultiChanTicker

* Add README for MultiChanTicker

* Update README.md

* Add README to metric router

* Update main README

* Remove SinkEntity type

* Update README for sinks

* Update go files

* Update README for receivers

* Update collectors README

* Update collectors README

* Use separate page per collector

* Fix for tempstat page

* Add docs for customcmd collector

* Add docs for ipmistat collector

* Add docs for topprocs collector

* Update customCmdMetric.md

* Use seconds when calculating LIKWID metrics

* Add IB metrics ib_recv_pkts and ib_xmit_pkts

* Drop domain part of host name

* Updated to latest stable version of likwid

* Define source code dependencies in Makefile

* Add GPFS / IBM Spectrum Scale collector

* Add vet and staticcheck make targets

* Add vet and staticcheck make targets

* Avoid go vet warning:
struct field tag `json:"..., omitempty"` not compatible with reflect.StructTag.Get: suspicious space in struct tag value
struct field tag `json:"...", omitempty` not compatible with reflect.StructTag.Get: key:"value" pairs not separated by spaces

* Add sample collector to README.md

* Add CPU frequency collector

* Avoid staticcheck warning: redundant return statement

* Avoid staticcheck warning: unnecessary assignment to the blank identifier

* Simplified code

* Add CPUFreqCollectorCpuinfo
a metric collector to measure the current frequency of the CPUs
as obtained from /proc/cpuinfo
Only measure on the first hyperthread

* Add collector for NFS clients

* Move publication of metrics into Flush() for NatsSink

* Update GitHub actions

* Refactoring

* Avoid vet warning: Println arg list ends with redundant newline

* Avoid vet warning struct field commands has json tag but is not exported

* Avoid vet warning: return copies lock value.

* Corrected typo

* Refactoring

* Add go sources in internal/...

* Bad separator in Makefile

* Fix Infiniband collector

Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
2022-01-25 15:37:43 +01:00
Holger Obermaier
d903fc6daa Avoid vet warning struct field commands has json tag but is not exported 2022-01-25 11:16:46 +01:00
Holger Obermaier
222862af32 Avoid vet warning struct field commands has json tag but is not exported 2022-01-25 11:15:36 +01:00
Holger Obermaier
9f8d3ddbd3 Avoid vet warning: Println arg list ends with redundant newline 2022-01-25 10:34:02 +01:00
Holger Obermaier
df77c3fd60 Avoid vet warning: Println arg list ends with redundant newline 2022-01-25 10:33:20 +01:00
Holger Obermaier
ae6ffd4974 Refactoring 2022-01-25 09:48:22 +01:00
Holger Obermaier
e095e4f202 Refactoring 2022-01-25 09:47:24 +01:00
Holger Obermaier
3d377760b8 Refactoring 2022-01-24 22:04:05 +01:00
Holger Obermaier
be8c92676a Refactoring 2022-01-24 22:03:13 +01:00
Holger Obermaier
9157fdbab2 Fixed topology detection 2022-01-24 20:23:24 +01:00
Holger Obermaier
2026c3acd9 Fixed topology detection 2022-01-24 20:22:08 +01:00
Holger Obermaier
f0a62152fd Update GitHub actions 2022-01-24 16:02:43 +01:00
Holger Obermaier
7953629940 Update GitHub actions 2022-01-24 15:55:15 +01:00
Holger Obermaier
f84f7de05c Add CPUFreqCollectorCpuinfo
a metric collector to measure the current frequency of the CPUs
as obtained from /proc/cpuinfo
Only measure on the first hyperthread
2022-01-24 13:12:25 +01:00