Commit Graph

20 Commits

Author SHA1 Message Date
Thomas Gruber
8d85bd53f1
Merge latest development changes to main branch (#79)
* Cleanup: Remove unused code

* Use Golang duration parser for 'interval' and 'duration'
 in main config

* Update handling of LIKWID headers. Download only if not already present in the system. Fixes #73

* Units with cc-units (#64)

* Add option to normalize units with cc-unit

* Add unit conversion to router

* Add option to change unit prefix in the router

* Add to MetricRouter README

* Add order of operations in router to README

* Use second add_tags/del_tags only if metric gets renamed

* Skip disks in DiskstatCollector that have size=0

* Check readability of sensor files in TempCollector

* Fix for --once option

* Rename `cpu` type to `hwthread` (#69)

* Rename 'cpu' type to 'hwthread' to avoid naming clashes with MetricStore and CC-Webfrontend

* Collectors in parallel (#74)

* Provide info to CollectorManager whether the collector can be executed in parallel with others

* Split serial and parallel collectors. Read in parallel first

* Update NvidiaCollector with new metrics, MIG and NvLink support (#75)

* CC topology module update (#76)

* Rename CPU to hardware thread, write some comments

* Do renaming in other parts

* Remove CpuList and SocketList function from metricCollector. Available in ccTopology

* Option to use MIG UUID as subtype-id in NvidiaCollector

* Option to use MIG slice name as subtype-id in NvidiaCollector

* MetricRouter: Fix JSON in README

* Fix for Github Action to really use the selected version

* Remove Ganglia installation in runonce Action and add Go 1.18

* Fix daemon options in init script

* Add separate go.mod files to use it with deprecated 1.16

* Minor updates for Makefiles

* fix string comparison

* AMD ROCm SMI collector (#77)

* Add collector for AMD ROCm SMI metrics

* Fix import path

* Fix imports

* Remove Board Number

* store GPU index explicitly

* Remove board number from description

* Use http instead of ftp to download likwid

* Fix serial number in rocmCollector

* Improved http sink (#78)

* automatic flush in NatsSink

* tweak default options of HttpSink

* shorter cirt. section and retries for HttpSink

* fix error handling

* Remove file added by mistake.

* Use http instead of ftp to download likwid

* Fix serial number in rocmCollector

Co-authored-by: Thomas Roehl <thomas.roehl@fau.de>

Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
Co-authored-by: Lou <lou.knauer@gmx.de>
2022-06-08 15:25:40 +02:00
Holger Obermaier
c35ac9dba8 Flush if batch size is reached 2022-05-04 11:28:06 +02:00
Thomas Roehl
28348bd108 InfluxSink: Use batch&flush logic from HttpSink 2022-04-01 18:37:45 +02:00
Thomas Gruber
57629a2e0a
Meta to tags list and map for sinks (#63)
* Change ccMetric->Influx functions

* Use a meta_as_tags string list in config but create a lookup map afterwards

* Add meta as tag logic to sampleSink
2022-03-15 16:16:26 +01:00
Thomas Gruber
c9b8fcdaa7
Add config options for retry intervals of InfluxDB clients (#59) 2022-03-11 13:43:03 +01:00
Thomas Roehl
fe3a8d59b0 Ping InfluxDB server after connecting to recognize faulty connections 2022-02-25 13:51:52 +01:00
Holger Obermaier
73981527d3 Refactor: Embed Init() into New() function 2022-02-23 14:56:29 +01:00
Thomas Roehl
18a226183c Use new sink instances to allow multiple of same sink type 2022-02-22 16:15:25 +01:00
Holger Obermaier
1d299be3ea Add comments 2022-02-09 11:08:50 +01:00
Holger Obermaier
e1a7379c2e Generate influxDB point for data type ccMetric 2022-02-08 09:31:08 +01:00
Holger Obermaier
7b104ebe90 Use cclog.ComponentDebug. Avoid copying point.Fields() 2022-02-07 18:00:17 +01:00
Thomas Gruber
fdb58b0be2
Sink specific configuration maps (#25)
* Use sink-specific configurations to have more flexibility. Adjust sample sink configuration files

* Add documentation

* Add links to individual sink readmes

* Fix link in README

* HTTPS for HttpSink

* If no CPU die id available, use the socket id instead
2022-02-04 18:12:24 +01:00
Thomas Gruber
6ff6cb7219
Change CCMetric's internal data structure (#22)
* package ccmetric rewrite

* Create deep copy in New() to avoid access conflicts

* Renamed TagMap() -> Tags(), MetaMap() -> Meta

Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
2022-02-01 14:54:34 +01:00
Thomas Gruber
200af84c54
Modularize the whole thing (#16)
* Use channels, add a metric router, split up configuration and use extended version of Influx line protocol internally

* Use central timer for collectors and router. Add expressions to router

* Add expression to router config

* Update entry points

* Start with README

* Update README for CCMetric

* Formatting

* Update README.md

* Add README for MultiChanTicker

* Add README for MultiChanTicker

* Update README.md

* Add README to metric router

* Update main README

* Remove SinkEntity type

* Update README for sinks

* Update go files

* Update README for receivers

* Update collectors README

* Update collectors README

* Use seperate page per collector

* Fix for tempstat page

* Add docs for customcmd collector

* Add docs for ipmistat collector

* Add docs for topprocs collector

* Update customCmdMetric.md

* Use seconds when calculating LIKWID metrics

* Add IB metrics ib_recv_pkts and ib_xmit_pkts

* Drop domain part of host name

* Updated to latest stable version of likwid

* Define source code dependencies in Makefile

* Add GPFS / IBM Spectrum Scale collector

* Add vet and staticcheck make targets

* Add vet and staticcheck make targets

* Avoid go vet warning:
struct field tag `json:"..., omitempty"` not compatible with reflect.StructTag.Get: suspicious space in struct tag value
struct field tag `json:"...", omitempty` not compatible with reflect.StructTag.Get: key:"value" pairs not separated by spaces

* Add sample collector to README.md

* Add CPU frequency collector

* Avoid staticcheck warning: redundant return statement

* Avoid staticcheck warning: unnecessary assignment to the blank identifier

* Simplified code

* Add CPUFreqCollectorCpuinfo
a metric collector to measure the current frequency of the CPUs
as obtained from /proc/cpuinfo
Only measure on the first hyperthread

* Add collector for NFS clients

* Move publication of metrics into Flush() for NatsSink

* Update GitHub actions

* Refactoring

* Avoid vet warning: Println arg list ends with redundant newline

* Avoid vet warning struct field commands has json tag but is not exported

* Avoid vet warning: return copies lock value.

* Corrected typo

* Refactoring

* Add go sources in internal/...

* Bad separator in Makefile

* Fix Infiniband collector

Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
2022-01-25 15:37:43 +01:00
Lou Knauer
cdc1811576 Add Flush method to sink interface 2021-10-12 13:43:58 +02:00
Thomas Gruber
d7ef32de18
Merge branch 'main' into alternate_storage 2021-10-04 15:49:46 +02:00
Thomas Roehl
558bbaba59 Change storage format 2021-10-04 15:23:43 +02:00
Thomas Roehl
586c6c12ac Add SSL to InfluxDB sink 2021-06-30 16:56:47 +02:00
Thomas Roehl
1b9cb8955c Hand over full config to Sink and Receiver 2021-05-18 15:16:10 +02:00
Thomas Roehl
f822f00cdc Add sink for InfluxDB (with the original InfluxDB client) 2021-03-26 16:48:09 +01:00