Commit Graph

115 Commits

Author SHA1 Message Date
Holger Obermaier
4bee75d4b5 Allow more than one background send operation 2023-10-13 15:15:10 +02:00
Holger Obermaier
78fac33a06 Add config option for HTTP request timeout and Retry interval 2023-10-13 15:00:06 +02:00
Holger Obermaier
0b509ca9e4 Be more strict when parsing JSON 2023-10-13 09:53:49 +02:00
Holger Obermaier
b618e81cbb Add asynchronous send of encoder metrics 2023-10-11 14:55:52 +02:00
Holger Obermaier
8837400bf2 Add config option to specify whether to use GZip compression in influx write requests 2023-10-09 16:57:26 +02:00
Holger Obermaier
3be11984f2 Fix: Corrected unlock access to batch slice 2023-10-09 16:48:42 +02:00
Holger Obermaier
dd40c852ca Stop flush timer when flushing immediately 2023-10-09 11:01:01 +02:00
Holger Obermaier
a4d7593af5 Use line protocol encoder 2023-10-09 10:12:14 +02:00
Holger Obermaier
fd1cdc5c07 Reverted previous changes.
Made the code too complex without much advantage
2023-10-06 16:56:30 +02:00
Holger Obermaier
94c88f23df Be more verbose in error messages 2023-10-05 17:22:30 +02:00
Holger Obermaier
9dae829f9d Wait for concurrent flush operations to finish 2023-10-05 16:44:03 +02:00
Holger Obermaier
778bb62602 Cleanup 2023-10-04 16:24:39 +02:00
Holger Obermaier
5aa9603c01 Add batch_size config 2023-10-04 12:37:25 +02:00
Holger Obermaier
9f65365f9d Add Influx client options 2023-09-29 10:36:42 +02:00
Holger Obermaier
1e606a1aa1 Reuse flush timer 2023-09-26 15:04:39 +02:00
Holger Obermaier
7b5a4caf6a Avoid unnecessary memory allocations 2023-09-21 10:19:25 +02:00
Holger Obermaier
a401e4cdd1 Add basic authentication support 2023-09-20 17:41:12 +02:00
Holger Obermaier
75b705aa87 Avoid package cmp to allow builds with golang v1.20 2023-09-19 17:00:16 +02:00
Holger Obermaier
42a9423203 Use slice to store lexically ordered key-value pairs 2023-09-19 14:48:11 +02:00
Holger Obermaier
c472029c2d Add tags in lexical order as required by AddTag() 2023-09-19 13:33:25 +02:00
Holger Obermaier
9e73849081 Use a lock for the flush timer 2023-09-19 12:57:43 +02:00
Holger Obermaier
64ffa3d23e Allow other fields not only field "value" 2023-09-18 16:35:56 +02:00
Holger Obermaier
2d41531b51 Corrected spelling 2023-09-18 14:52:09 +02:00
Thomas Röhl
ef49701f14 Do not use a pointer to line-protocol.Encoder 2023-07-17 18:02:50 +02:00
Thomas Röhl
547e2546c7 Update to line-protocol/v2 2023-07-17 15:20:12 +02:00
Thomas Gruber
f0da07310b Update README.md 2022-11-04 14:53:08 +01:00
Thomas Gruber
0f35469168 Update httpSink.md 2022-11-04 14:52:05 +01:00
Thomas Gruber
be20f956c2 Add latest development to main branch (#89)
* InfiniBandCollector: Scale raw readings from octets to bytes

* Fix clock frequency coming from LikwidCollector and update docs

* Build DEB package for Ubuntu 20.04 for releases

* Fix memstat collector with numa_stats option

* Remove useless prints from MemstatCollector

* Replace ioutils with os and io (#87)

* Use lower case for error strings in RocmSmiCollector

* move maybe-usable-by-other-cc-components to pkg. Fix all files to use the new paths (#88)

* Add collector for monitoring the execution of cc-metric-collector itself (#81)

* Add collector to monitor execution of cc-metric-collector itself

* Register SelfCollector

* Fix import paths for moved packages
2022-10-10 12:23:51 +02:00
Thomas Gruber
b3c27e0af5 Merge latest development changes (#80)
* Cleanup: Remove unused code

* Use Golang duration parser for 'interval' and 'duration' in main config

* Update handling of LIKWID headers. Download only if not already present in the system. Fixes #73

* Units with cc-units (#64)

* Add option to normalize units with cc-unit

* Add unit conversion to router

* Add option to change unit prefix in the router

* Add to MetricRouter README

* Add order of operations in router to README

* Use second add_tags/del_tags only if metric gets renamed

* Skip disks in DiskstatCollector that have size=0

* Check readability of sensor files in TempCollector

* Fix for --once option

* Rename `cpu` type to `hwthread` (#69)

* Rename 'cpu' type to 'hwthread' to avoid naming clashes with MetricStore and CC-Webfrontend

* Collectors in parallel (#74)

* Provide info to CollectorManager whether the collector can be executed in parallel with others

* Split serial and parallel collectors. Read in parallel first

* Update NvidiaCollector with new metrics, MIG and NvLink support (#75)

* CC topology module update (#76)

* Rename CPU to hardware thread, write some comments

* Do renaming in other parts

* Remove CpuList and SocketList function from metricCollector. Available in ccTopology

* Option to use MIG UUID as subtype-id in NvidiaCollector

* Option to use MIG slice name as subtype-id in NvidiaCollector

* MetricRouter: Fix JSON in README

* Fix for Github Action to really use the selected version

* Remove Ganglia installation in runonce Action and add Go 1.18

* Fix daemon options in init script

* Add separate go.mod files to use it with deprecated 1.16

* Minor updates for Makefiles

* fix string comparison

* AMD ROCm SMI collector (#77)

* Add collector for AMD ROCm SMI metrics

* Fix import path

* Fix imports

* Remove Board Number

* store GPU index explicitly

* Remove board number from description

* Use http instead of ftp to download likwid

* Fix serial number in rocmCollector

* Improved http sink (#78)

* automatic flush in NatsSink

* tweak default options of HttpSink

* shorter critical section and retries for HttpSink

* fix error handling

* Remove file added by mistake.

* Use http instead of ftp to download likwid

* Fix serial number in rocmCollector

Co-authored-by: Thomas Roehl <thomas.roehl@fau.de>

* Fix: When sending metrics failed the batch size could be exceeded

* Improved dropping of metrics that failed to send

* Add memstats and topprocs metric

* Updated to latest modules

* Check that at least one sink is running

* Add drop rate, when send buffer is full

* Allow only one timer at a time

* Use mutex to ensure only one flush timer is running

* Fix for NvidiaCollector when devices are not in MIG mode

* Remove Golang versions 1.16 and 1.17 from Action. Latest commits require Golang 1.18

* Use Golang 1.18 in Release action to build RPMs

* Change unit of CpufreqCollector to Hz. That's what the sysfs outputs

* Make wget quiet in Release action to reduce log size

Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
Co-authored-by: Lou <lou.knauer@gmx.de>
2022-07-13 10:09:49 +02:00
Thomas Gruber
8d85bd53f1 Merge latest development changes to main branch (#79)
* Cleanup: Remove unused code

* Use Golang duration parser for 'interval' and 'duration' in main config

* Update handling of LIKWID headers. Download only if not already present in the system. Fixes #73

* Units with cc-units (#64)

* Add option to normalize units with cc-unit

* Add unit conversion to router

* Add option to change unit prefix in the router

* Add to MetricRouter README

* Add order of operations in router to README

* Use second add_tags/del_tags only if metric gets renamed

* Skip disks in DiskstatCollector that have size=0

* Check readability of sensor files in TempCollector

* Fix for --once option

* Rename `cpu` type to `hwthread` (#69)

* Rename 'cpu' type to 'hwthread' to avoid naming clashes with MetricStore and CC-Webfrontend

* Collectors in parallel (#74)

* Provide info to CollectorManager whether the collector can be executed in parallel with others

* Split serial and parallel collectors. Read in parallel first

* Update NvidiaCollector with new metrics, MIG and NvLink support (#75)

* CC topology module update (#76)

* Rename CPU to hardware thread, write some comments

* Do renaming in other parts

* Remove CpuList and SocketList function from metricCollector. Available in ccTopology

* Option to use MIG UUID as subtype-id in NvidiaCollector

* Option to use MIG slice name as subtype-id in NvidiaCollector

* MetricRouter: Fix JSON in README

* Fix for Github Action to really use the selected version

* Remove Ganglia installation in runonce Action and add Go 1.18

* Fix daemon options in init script

* Add separate go.mod files to use it with deprecated 1.16

* Minor updates for Makefiles

* fix string comparison

* AMD ROCm SMI collector (#77)

* Add collector for AMD ROCm SMI metrics

* Fix import path

* Fix imports

* Remove Board Number

* store GPU index explicitly

* Remove board number from description

* Use http instead of ftp to download likwid

* Fix serial number in rocmCollector

* Improved http sink (#78)

* automatic flush in NatsSink

* tweak default options of HttpSink

* shorter critical section and retries for HttpSink

* fix error handling

* Remove file added by mistake.

* Use http instead of ftp to download likwid

* Fix serial number in rocmCollector

Co-authored-by: Thomas Roehl <thomas.roehl@fau.de>

Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
Co-authored-by: Lou <lou.knauer@gmx.de>
2022-06-08 15:25:40 +02:00
Holger Obermaier
c35ac9dba8 Flush if batch size is reached 2022-05-04 11:28:06 +02:00
Thomas Roehl
70a9530aba Set WriteFailedCallback to get some error message 2022-04-04 11:48:54 +02:00
Thomas Roehl
69f7c19659 InfluxAsyncSink: Add custom flush mechanism 2022-04-04 02:56:23 +02:00
Thomas Roehl
28348bd108 InfluxSink: Use batch&flush logic from HttpSink 2022-04-01 18:37:45 +02:00
Thomas Roehl
a3b9d8a90b HttpSink: Use sink name in error outputs 2022-04-01 18:36:54 +02:00
Thomas Roehl
7e43e9171e Use default options. Overwrite if anything is configured differently. Use seconds as precision 2022-04-01 17:26:56 +02:00
Thomas Gruber
57629a2e0a Meta to tags list and map for sinks (#63)
* Change ccMetric->Influx functions

* Use a meta_as_tags string list in config but create a lookup map afterwards

* Add meta as tag logic to sampleSink
2022-03-15 16:16:26 +01:00
Thomas Gruber
1de3dda7be Use old metric name in Ganglia if rename has happened in the router (#60)
* Use old metric name if rename has happened in the router

* Also check for Ganglia renames for the oldname
2022-03-11 13:44:32 +01:00
Thomas Gruber
c9b8fcdaa7 Add config options for retry intervals of InfluxDB clients (#59) 2022-03-11 13:43:03 +01:00
Holger Obermaier
33fec95eac Additional comments 2022-02-28 12:16:48 +01:00
Holger Obermaier
a2f9b23e85 Additional comments 2022-02-28 09:39:59 +01:00
Thomas Gruber
f099a311a0 Add sink for Prometheus monitoring system (#46)
* Add sink for Prometheus monitoring system

* Add prometheus sink to README
2022-02-25 14:33:20 +01:00
Thomas Roehl
fe3a8d59b0 Ping InfluxDB server after connecting to recognize faulty connections 2022-02-25 13:51:52 +01:00
Thomas Roehl
bac1f18b1d Add samples for collectors, sinks and receivers 2022-02-25 13:47:19 +01:00
Thomas Gruber
16c03d2aa2 Use Ganglia configuration (#44)
* Copy all metric configurations from original Ganglia code

* Use metric configurations from Ganglia for some metrics

* Format value string also for known metrics
2022-02-24 18:22:20 +01:00
Holger Obermaier
73981527d3 Refactor: Embed Init() into New() function 2022-02-23 14:56:29 +01:00
Thomas Roehl
24e12ccc57 Update sink README and SampleSink 2022-02-22 16:19:46 +01:00
Thomas Roehl
18a226183c Use new sink instances to allow multiple of same sink type 2022-02-22 16:15:25 +01:00
Thomas Roehl
9cfbe10247 Add uint types to GangliaSink and LibgangliaSink 2022-02-22 15:51:08 +01:00
Holger Obermaier
a97c705f4c Do not create link to libganglia.so.
libganglia.so is now loaded during runtime by dlopen
and no longer required during link time
2022-02-21 20:55:14 +01:00