271 Commits

Author SHA1 Message Date
Thomas Roehl
2505b2f20b Add power averager to Nvidia GPU collector 2024-02-22 20:30:34 +01:00
Thomas Roehl
c8a91903f6 Add nfsiostat to list of collectors 2023-11-30 14:43:02 +01:00
Thomas Roehl
b0f0462995 Use stype and stype-id for the NIC in NetstatCollector 2023-10-05 09:12:24 +02:00
Holger Obermaier
0db1cda27f Do not store unused topology information 2023-10-02 11:30:06 +02:00
Holger Obermaier
013ae7ec6d Reuse ccTopology functionality 2023-10-02 10:57:50 +02:00
Holger Obermaier
553fcff468 Add documentation for send_*_total values 2023-09-21 10:44:14 +02:00
Holger Obermaier
0c95db50ad Fix: Error NVML library not found did crash
cc-metric-collector with "SIGSEGV: segmentation violation"
2023-09-20 13:45:38 +02:00
Holger Obermaier
3fdb60d708 Updated go packages 2023-09-12 16:17:24 +02:00
Holger Obermaier
12130361fd Add comments 2023-09-12 11:28:57 +02:00
Holger Obermaier
faad23ed64 Remove unused variable gmresults 2023-09-12 10:48:51 +02:00
Holger Obermaier
674e78b3d0 Send all metrics with same time stamp
calcGlobalMetrics does only computation, counter measurement is done before
2023-09-12 10:45:50 +02:00
Holger Obermaier
302e42d1d0 Input parameters should be float64 when evaluating to float64 2023-09-12 10:35:36 +02:00
Holger Obermaier
1aca1b6caf Send all metrics with same time stamp
calcEventsetMetrics does only computation, counter measurement is done before
2023-09-12 10:18:55 +02:00
Holger Obermaier
1b60935f38 Allow to send total values per core, socket and node 2023-09-11 16:26:15 +02:00
Holger Obermaier
013aa9ec92 ioutil.ReadFile is deprecated: As of Go 1.16, this function simply calls os.ReadFile 2023-09-05 17:41:08 +02:00
Holger Obermaier
fa755ae401 Fixed initialization: Initialization and measurements should run in the same thread 2023-08-25 08:26:05 +02:00
Holger Obermaier
1b97953cdb Completely avoid memory allocations in infinibandMetric read() 2023-08-21 10:09:21 +02:00
Thomas Gruber
fc19b2b9a5
Update likwidMetric.go
Fixes a potential bug when `fsnotify.NewWatcher()` fails with an error
2023-08-18 11:27:47 +02:00
Holger Obermaier
e425b2c38e Add aggregated metrics.
Add missing units
2023-08-18 10:39:43 +02:00
Holger Obermaier
f5d2d27090 Compute metrics ib_total and ib_total_pkts 2023-08-17 16:46:53 +02:00
Holger Obermaier
fb480993ed Simplify Makefile 2023-08-16 15:40:33 +02:00
Thomas Röhl
34bc23fbbd Update fsnotify in LIKWID Collector 2023-07-17 18:01:49 +02:00
Thomas Röhl
e7b77f7721 Add cpu_used (all-cpu_idle) to CpustatCollector 2023-04-05 11:20:09 +02:00
fodinabor
ec570f884c
Use customcmd commands if they did not error. (#101)
* Merge develop and main (#99)

* InfiniBandCollector: Scale raw readings from octets to bytes

* Fix clock frequency coming from LikwidCollector and update docs

* Build DEB package for Ubuntu 20.04 for releases

* Fix memstat collector with numa_stats option

* Remove useless prints from MemstatCollector

* Replace ioutils with os and io (#87)

* Use lower case for error strings in RocmSmiCollector

* move maybe-usable-by-other-cc-components to pkg. Fix all files to use the new paths (#88)

* Add collector for monitoring the execution of cc-metric-collector itself (#81)

* Add collector to monitor execution of cc-metric-collector itself

* Register SelfCollector

* Fix import paths for moved packages

* Check if at least one CPU with frequency information was detected

* Correct type: /proc/stats -> /proc/stat

* Update README.md

* Run ipmitool asynchronously. Improved error handling.

* Corrected some typos

* Add running average power limit (RAPL) metric collector

* Add running average power limit (RAPL) metric collector

* Do not mess up with the original configuration

* * Corrected json config in numastatsMetric.md
* Added some debug output to numastatsMetric.go

* Fixed computing number of physical packages for non-continuous physical package IDs (e.g. on Ampere Altra Q80-30)

* Fix kernel panic for receiver config with missing receiver type

* Add receiver to gather remote IPMI sensor metrics

* Added config option to add ipmi-sensors command line options

* Add documentation for IPMI receiver

* Update to latest version of included go modules

* Add go.mod to App dependency

* Try to use common metric tags across hardware vendors

* Add IPMI metric: current

* remove prefix enumeration like 01-...

* Add IPMI receiver example configuration to receivers.json

* Minimal formatting changes

* Add hostlist package

* Added tests for hostlist Expand()

* Use package hostlist to expand a host list

* Use package hostlist to expand a host list

* Some servers return "ConsumedPowerWatt":65535 instead of "ConsumedPowerWatt":null

* Updated to latest package versions

* Do not allow unknown fields in JSON configuration file

* Add workflow to customize packages to docs

* NFS I/O Stats Collector (#91)

* Initial version

* Delete values for vanished mount points and comments

* Fix for Likwid collector (#95)

* Run LIKWID in separate thread and check metric type

* Change LIKWID collector documentation to use 'type' instead of 'scope'

* Re-initialize LIKWID after one read is missing due to lock toggle

* Register cc-metric-collector at Zenodo (#93)

* Add initial version of Zenodo project file

* Orcid ID added

* Update .zenodo.json

Co-authored-by: Holger Obermaier <holger.obermaier@kit.edu>

* Update ipmiMetric.go

* Use latest LIKWID version for builds

* Update README.md

* Remove development stuff from Makefile

* Add Requires(pre) to RPM SPEC file

* Use curly brackets in packaging make targets

* Fix for LIKWID collector with separate measurement thread and inotify watcher on the LIKWID lock (#97)

Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
Co-authored-by: Holger Obermaier <Holger.Obermaier@kit.edu>

* Update likwid_perfgroup_to_cc_config.py

* Use customcmd commands if they did not error.

---------

Co-authored-by: Thomas Gruber <Thomas.Roehl@googlemail.com>
Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
Co-authored-by: Holger Obermaier <Holger.Obermaier@kit.edu>
2023-02-28 12:02:01 +01:00
Thomas Gruber
b0423b842d
Merge branch 'main' into develop 2022-12-20 13:02:31 +01:00
Thomas Gruber
6c10c9741a
Fix for LIKWID collector with separate measurement thread and inotify watcher on the LIKWID lock (#97) 2022-12-20 12:59:33 +01:00
Thomas Roehl
2bd386dae7 Use latest LIKWID version for builds 2022-12-14 17:43:41 +01:00
Thomas Gruber
162cce0fda
Merge develop branch into main (#96)
* InfiniBandCollector: Scale raw readings from octets to bytes

* Fix clock frequency coming from LikwidCollector and update docs

* Build DEB package for Ubuntu 20.04 for releases

* Fix memstat collector with numa_stats option

* Remove useless prints from MemstatCollector

* Replace ioutils with os and io (#87)

* Use lower case for error strings in RocmSmiCollector

* move maybe-usable-by-other-cc-components to pkg. Fix all files to use the new paths (#88)

* Add collector for monitoring the execution of cc-metric-collector itself (#81)

* Add collector to monitor execution of cc-metric-collector itself

* Register SelfCollector

* Fix import paths for moved packages

* Check if at least one CPU with frequency information was detected

* Correct type: /proc/stats -> /proc/stat

* Update README.md

* Run ipmitool asynchronously. Improved error handling.

* Corrected some typos

* Add running average power limit (RAPL) metric collector

* Add running average power limit (RAPL) metric collector

* Do not mess up with the original configuration

* * Corrected json config in numastatsMetric.md
* Added some debug output to numastatsMetric.go

* Fixed computing number of physical packages for non-continuous physical package IDs (e.g. on Ampere Altra Q80-30)

* Fix kernel panic for receiver config with missing receiver type

* Add receiver to gather remote IPMI sensor metrics

* Added config option to add ipmi-sensors command line options

* Add documentation for IPMI receiver

* Update to latest version of included go modules

* Add go.mod to App dependency

* Try to use common metric tags across hardware vendors

* Add IPMI metric: current

* remove prefix enumeration like 01-...

* Add IPMI receiver example configuration to receivers.json

* Minimal formatting changes

* Add hostlist package

* Added tests for hostlist Expand()

* Use package hostlist to expand a host list

* Use package hostlist to expand a host list

* Some servers return "ConsumedPowerWatt":65535 instead of "ConsumedPowerWatt":null

* Updated to latest package versions

* Do not allow unknown fields in JSON configuration file

* Add workflow to customize packages to docs

* NFS I/O Stats Collector (#91)

* Initial version

* Delete values for vanished mount points and comments

* Fix for Likwid collector (#95)

* Run LIKWID in separate thread and check metric type

* Change LIKWID collector documentation to use 'type' instead of 'scope'

* Re-initialize LIKWID after one read is missing due to lock toggle

* Register cc-metric-collector at Zenodo (#93)

* Add initial version of Zenodo project file

* Orcid ID added

* Update .zenodo.json

Co-authored-by: Holger Obermaier <holger.obermaier@kit.edu>

* Update ipmiMetric.go

Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
Co-authored-by: Holger Obermaier <Holger.Obermaier@kit.edu>
2022-12-14 17:02:39 +01:00
Thomas Gruber
155d1b9acf
Update ipmiMetric.go 2022-12-14 17:00:09 +01:00
Thomas Gruber
c9b9752b6a
Merge branch 'main' into develop 2022-12-14 16:58:12 +01:00
Thomas Gruber
efd4f5feb4
Fix for Likwid collector (#95)
* Run LIKWID in separate thread and check metric type

* Change LIKWID collector documentation to use 'type' instead of 'scope'

* Re-initialize LIKWID after one read is missing due to lock toggle
2022-12-14 16:53:08 +01:00
Thomas Gruber
a1f4dd6a6c
NFS I/O Stats Collector (#91)
* Initial version

* Delete values for vanished mount points and comments
2022-12-14 16:52:53 +01:00
Holger Obermaier
5918f96fd8 Minimal formatting changes 2022-11-24 09:48:44 +01:00
Holger Obermaier
7bb80780e0 Fixed computing number of physical packages for non-continuous physical package IDs (e.g. on Ampere Altra Q80-30) 2022-11-16 14:58:11 +01:00
Holger Obermaier
e66d52bb32 * Corrected json config in numastatsMetric.md
* Added some debug output to numastatsMetric.go
2022-11-16 14:10:25 +01:00
Holger Obermaier
9840d0193d Do not mess up with the original configuration 2022-11-16 09:37:40 +01:00
Holger Obermaier
ce7eef8d30 Add running average power limit (RAPL) metric collector 2022-11-15 17:15:27 +01:00
Holger Obermaier
92e45ca62c Add running average power limit (RAPL) metric collector 2022-11-15 17:09:26 +01:00
Holger Obermaier
fd10a279fc Corrected some typos 2022-11-14 09:35:02 +01:00
Holger Obermaier
9e63d0ea59 Run ipmitool asynchronously. Improved error handling. 2022-11-11 16:16:14 +01:00
Holger Obermaier
deb1bcfa2f Correct type: /proc/stats -> /proc/stat 2022-10-13 15:01:39 +02:00
Holger Obermaier
7a67d5e25f Check if at least one CPU with frequency information was detected 2022-10-13 14:53:55 +02:00
Thomas Gruber
be20f956c2
Add latest development to main branch (#89)
* InfiniBandCollector: Scale raw readings from octets to bytes

* Fix clock frequency coming from LikwidCollector and update docs

* Build DEB package for Ubuntu 20.04 for releases

* Fix memstat collector with numa_stats option

* Remove useless prints from MemstatCollector

* Replace ioutils with os and io (#87)

* Use lower case for error strings in RocmSmiCollector

* move maybe-usable-by-other-cc-components to pkg. Fix all files to use the new paths (#88)

* Add collector for monitoring the execution of cc-metric-collector itself (#81)

* Add collector to monitor execution of cc-metric-collector itself

* Register SelfCollector

* Fix import paths for moved packages
2022-10-10 12:23:51 +02:00
Thomas Gruber
9ae0806aa9
Add collector for monitoring the execution of cc-metric-collector itself (#81)
* Add collector to monitor execution of cc-metric-collector itself

* Register SelfCollector

* Fix import paths for moved packages
2022-10-10 12:18:52 +02:00
Thomas Gruber
4bd71224df
move maybe-usable-by-other-cc-components to pkg. Fix all files to use the new paths (#88) 2022-10-10 11:53:11 +02:00
Thomas Roehl
6bf3bfd10a Use lower case for error strings in RocmSmiCollector 2022-10-09 17:05:49 +02:00
Thomas Gruber
0fbff00996
Replace ioutils with os and io (#87) 2022-10-09 17:03:38 +02:00
Thomas Roehl
8849824ba9 Remove useless prints from MemstatCollector 2022-10-09 02:56:15 +02:00
Thomas Roehl
ed511b7c09 Fix memstat collector with numa_stats option 2022-09-28 15:09:36 +02:00
Thomas Gruber
5b6a2b9018
Merge latest fixes from develop to main (#85)
* InfiniBandCollector: Scale raw readings from octets to bytes

* Fix clock frequency coming from LikwidCollector and update docs
2022-09-12 12:54:40 +02:00