Thomas Gruber
826f364772
CC topology module update ( #76 )
* Rename CPU to hardware thread, write some comments
* Do renaming in other parts
* Remove CpuList and SocketList function from metricCollector. Available in ccTopology
2022-05-13 14:28:07 +02:00
Thomas Gruber
5df550b208
Update NvidiaCollector with new metrics, MIG and NvLink support ( #75 )
2022-05-13 14:11:55 +02:00
Thomas Gruber
5c34805918
Collectors in parallel ( #74 )
* Provide info to CollectorManager whether the collector can be executed in parallel with others
* Split serial and parallel collectors. Read in parallel first
2022-05-13 14:10:39 +02:00
Thomas Gruber
1db5f3b29a
Rename cpu type to hwthread ( #69 )
* Rename 'cpu' type to 'hwthread' to avoid naming clashes with MetricStore and CC-Webfrontend
2022-05-13 14:09:45 +02:00
Thomas Roehl
9886f14d14
Check readability of sensor files in TempCollector
2022-05-13 13:32:54 +02:00
Thomas Roehl
857903be2b
Skip disks in DiskstatCollector that have size=0
2022-05-13 13:31:22 +02:00
Thomas Roehl
8068e59818
Update handling of LIKWID headers. Download only if not already present in the system. Fixes #73
2022-05-13 13:14:47 +02:00
Thomas Roehl
38d4e0a730
Merge branch 'develop' of github.com:ClusterCockpit/cc-metric-collector into develop
2022-05-04 11:54:55 +02:00
Thomas Roehl
54d14519ca
Skip mount points in DiskstatCollector if statfs() call does not work (bind mounts, ...)
2022-05-04 11:54:34 +02:00
Holger Obermaier
fb6f6a4daa
Fix GPFS collector last state handling
2022-05-02 16:57:19 +02:00
Thomas Roehl
017cd58247
Updating page for LikwidCollector
2022-04-05 10:57:09 +02:00
Thomas Roehl
7b098e0b1b
Fix for missing metrics in LikwidCollector if hwthread is inactive
2022-04-04 15:16:11 +02:00
Thomas Roehl
5d25a7bf12
Add units to InfiniBandCollector
2022-04-01 17:14:26 +02:00
Thomas Roehl
83b4343310
Likwid receives signal at first Read, check when re-initializing
2022-04-01 17:10:31 +02:00
Thomas Gruber
2a014b6fba
Read unit of values from /proc/meminfo ( #68 )
2022-03-31 11:56:31 +02:00
Thomas Roehl
50479f9325
Move all LIKWID related stuff to late initialization routine
2022-03-24 18:12:23 +01:00
Thomas Roehl
e0e91844bc
Use late initialization of LIKWID and catch access daemon death. Fixes #70 and fixes #71 .
2022-03-24 17:56:51 +01:00
Thomas Roehl
296225f3a8
Always export all metrics in NfsCollectors
2022-03-24 13:50:35 +01:00
Thomas Roehl
b66fdd1436
Add missing socket->thread_id map for LikwidCollector
2022-03-16 19:04:39 +01:00
Thomas Gruber
c182d295f4
Fix staticcheck warnings ( #66 )
2022-03-15 16:38:20 +01:00
Thomas Gruber
aa1afd745e
Derived metrics ( #65 )
* Add time-based derivatives (e.g. bandwidth) to some collectors
* Add documentation
* Add comments
* Fix: Only compute rates with a valid previous state
* Only compute rates with a valid previous state
* Define const values for net/dev fields
* Set default config values
* Add comments
* Refactor: Consolidate data structures
* Refactor: Consolidate data structures
* Refactor: Avoid struct deep copy
* Refactor: Avoid redundant tag maps
* Refactor: Use int64 type for absolute values
* Update LustreCollector
Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
2022-03-15 16:09:47 +01:00
Holger Obermaier
992b19d354
Move unit tag to meta data tags
2022-03-11 14:47:18 +01:00
Holger Obermaier
0b08ca9ae0
Simplified iota usage
2022-03-11 14:09:22 +01:00
Thomas Gruber
f6dae7c013
Derived metrics ( #57 )
* Add time-based derivatives (e.g. bandwidth) to some collectors
* Add documentation
* Add comments
* Fix: Only compute rates with a valid previous state
* Only compute rates with a valid previous state
* Define const values for net/dev fields
* Set default config values
* Add comments
* Refactor: Consolidate data structures
* Refactor: Consolidate data structures
* Refactor: Avoid struct deep copy
* Refactor: Avoid redundant tag maps
* Refactor: Use int64 type for absolute values
Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
2022-03-11 13:48:18 +01:00
Thomas Gruber
73f22c1041
Refactoring of LikwidCollector and metric units ( #62 )
* Reduce complexity of LikwidCollector and allow metric units
* Add unit to LikwidCollector docu and fix some typos
* Make library path configurable
2022-03-11 13:43:17 +01:00
Thomas Roehl
e7f7e68095
Use GBytes as unit for large memory numbers
2022-03-09 11:05:26 +01:00
Thomas Gruber
f2486abeab
Just download LIKWID to get the headers ( #54 )
* Just download LIKWID to get the headers
* Remove perl-Data-Dumper from BuildRequires, only required by LIKWID build
2022-03-05 17:30:40 +01:00
Thomas Gruber
21864e0ac4
Change default GpfsCollector command to mmpmon ( #53 )
* Set default cmd to 'mmpmon'
* Reuse looked up path
* Cast const to string
2022-03-05 14:42:04 +01:00
Mehmet Soysal
547bc0461f
Beegfs collector ( #50 )
* added beegfs collectors to collectors/README.md
* added beegfs collectors and docs
* added new beegfs collectors to AvailableCollectors list
* Feedback implemented
* changed error type
* changed error to only return
* changed beegfs lookup path
* fixed typo in md files
Co-authored-by: Mehmet Soysal <mehmet.soysal@kit.edu>
2022-03-04 14:35:47 +01:00
Thomas Roehl
f1d2828e1d
Fix error print in LustreCollector
2022-03-04 11:32:10 +01:00
Holger Obermaier
db04c8fbae
Removed infinibandPerfQueryMetric.go. infinibandMetric.go offers the same functionality without requiring root privileges.
2022-03-03 15:52:50 +01:00
Thomas Roehl
60de21c41e
Switch access mode of LikwidCollector in config file
2022-03-03 13:03:58 +01:00
Thomas Roehl
276c00442a
Add option to LustreCollector to call lctl with sudo
2022-03-03 13:02:00 +01:00
Thomas Roehl
092e7f6a71
Add section on how to temporarily disable LIKWID access to page
2022-03-02 13:54:43 +01:00
Holger Obermaier
a5325a6535
GitHub actions ( #51 )
Create new GitHub action which uses unmodified AlmaLinux Docker image
2022-03-01 15:39:26 +01:00
Holger Obermaier
33fec95eac
Additional comments
2022-02-28 12:16:48 +01:00
Holger Obermaier
2c08e53be4
Additional comments
2022-02-28 09:57:26 +01:00
Thomas Roehl
bac1f18b1d
Add samples for collectors, sinks and receivers
2022-02-25 13:47:19 +01:00
Thomas Gruber
c8bca59de4
Numa-aware memstat collector ( #45 )
2022-02-24 18:27:05 +01:00
Thomas Roehl
d542f32baa
Mention likwid config script in LikwidCollector README
2022-02-22 17:46:44 +01:00
Thomas Roehl
66275ecf74
DiskstatCollector: cast part_max_used metric to int
2022-02-22 15:50:49 +01:00
Thomas Roehl
eed9cd227c
Remove doubled import and remove merge artifacts
2022-02-21 14:50:11 +01:00
Thomas Roehl
24a2c9992f
Merge branch 'develop' into main
2022-02-21 14:32:24 +01:00
Thomas Gruber
f683f2e6da
Dynamically load liblikwid ( #40 )
* Check whether LIKWID library is present
* Generalize nan_to_zero option to invalid_to_zero including +Inf, -Inf and NaN
* Remove double error printing and return if measurements do not work
2022-02-21 13:29:33 +01:00
Thomas Gruber
435528fa97
Split diskstat Collector ( #38 )
* Split diskstats (free, total space) and iostats (reads, writes, ...
* Add iostat Collector to CollectorManager
2022-02-21 12:44:26 +01:00
Holger Obermaier
65c3106af2
Remove tags for num cores and packages
2022-02-18 16:59:59 +01:00
Holger Obermaier
635a75c64b
Report maximum and critical temperature
2022-02-18 16:56:41 +01:00
Thomas Roehl
4e8ee59211
Update NetstatCollector to derive bandwidths and use an include list
2022-02-18 02:25:23 +01:00
Thomas Gruber
0152c0dc1e
Update CpustatCollector ( #36 )
* Update cpustat collector
* Update CpustatCollector to use percentages and add 'num_cpus' metric
2022-02-17 15:46:06 +01:00
Holger Obermaier
542520d2c0
Refactoring: Use array of pointers
2022-02-15 15:37:25 +01:00
Holger Obermaier
01faa3b531
Add comments and units to all nvidia metrics
2022-02-15 10:57:32 +01:00
Holger Obermaier
14c9d6f792
Fixed: All nvidia metrics were excluded
2022-02-15 09:47:24 +01:00
Holger Obermaier
fcfb58c31c
Use slice element of m.gpus without slice index
2022-02-15 09:23:57 +01:00
Holger Obermaier
5060497abd
Cleanup
2022-02-14 22:14:06 +01:00
Holger Obermaier
342f09fabf
Cleanup
2022-02-14 11:19:19 +01:00
Holger Obermaier
09b1ea130e
Add error handling. Cleanup.
2022-02-14 10:46:05 +01:00
Holger Obermaier
6b12baff6e
Use sensor name and sensor label as metric name
2022-02-12 10:13:38 +01:00
Thomas Roehl
bd246bdacf
Fix group for netstat collector
2022-02-11 18:18:10 +01:00
Thomas Roehl
23d13b2ceb
Fix group for netstat collector
2022-02-11 18:09:39 +01:00
Holger Obermaier
cfc5279958
Move sensor detection to Init()
2022-02-11 17:17:25 +01:00
Thomas Roehl
b15fdf72b9
Exclude metrics and devices in Init() for NvidiaCollector
2022-02-11 14:20:06 +01:00
Holger Obermaier
82138df48e
Refactor: Replace readOneLine() by ioutil.ReadFile()
2022-02-10 09:28:06 +01:00
Thomas Gruber
1ea63332d3
Update README.md
2022-02-08 13:49:48 +01:00
Thomas Roehl
7e4c35e224
Merge branch 'develop' of github.com:ClusterCockpit/cc-metric-collector into develop
2022-02-08 13:46:48 +01:00
Thomas Roehl
fcc25f7d30
Add collector documentation
2022-02-08 13:46:44 +01:00
Thomas Roehl
cc86fc00a0
Add missing error check in InfiniBandPerfQueryMetric
2022-02-08 13:46:19 +01:00
Thomas Roehl
9e73dcd437
Fix type tag for numastat
2022-02-08 13:40:27 +01:00
Thomas Roehl
006b9f91f6
Excluding NaN values in Likwid metrics from sending
2022-02-08 13:39:58 +01:00
Thomas Gruber
e1cf682989
Add other collectors to README
2022-02-08 13:22:20 +01:00
Holger Obermaier
4e0782d66b
Use FromInfluxMetric() to convert influx to cc metric
2022-02-08 10:58:53 +01:00
Thomas Roehl
a6bec61b1e
LikwidCollector: Filter out NaNs or set them to zero if 'nan_to_zero' option is set
2022-02-07 18:35:08 +01:00
Thomas Roehl
7182b339b9
Respect the publish option in the LikwidCollector
2022-02-07 17:41:35 +01:00
Thomas Roehl
d8ab3b0eb0
Use LookPath in IpmiCollector
2022-02-07 15:44:29 +01:00
Thomas Roehl
b19ae7a4db
Fix initialization of InfinibandCollector
2022-02-07 15:43:57 +01:00
Thomas Gruber
5263a974d1
Split NfsCollector in Nfs3Collector and Nfs4Collector ( #28 )
* Split NfsCollector in Nfs3Collector and Nfs4Collector
* Add documentation
2022-02-07 15:43:01 +01:00
Thomas Roehl
b7ee125942
Merge branch 'develop' of github.com:ClusterCockpit/cc-metric-collector into develop
2022-02-07 13:47:06 +01:00
Holger Obermaier
ead7117cad
Add skip_filesystem configuration
2022-02-07 13:30:42 +01:00
Thomas Roehl
52458ce5a1
Fix for LustreCollector. Check for root user
2022-02-07 13:27:35 +01:00
Holger Obermaier
a534f16685
Add documentation for GPFS metric
2022-02-07 11:37:34 +01:00
Holger Obermaier
25c2ae4910
Avoid int -> int64 conversions
2022-02-07 11:12:03 +01:00
Holger Obermaier
3c10c6b340
Add error handling to Read()
2022-02-07 10:02:38 +01:00
Holger Obermaier
79b25ddbee
Add markdown documentation for metric collector ibstat_perfquery
2022-02-07 09:46:19 +01:00
Holger Obermaier
5ac3af895d
Moved documentation to markdown file
2022-02-07 09:22:59 +01:00
Holger Obermaier
9ab7a6424b
Moved check which metric to skip to Init()
2022-02-04 19:22:42 +01:00
Holger Obermaier
f719f1915c
Add error handling
2022-02-04 16:11:56 +01:00
Holger Obermaier
76b69c59b4
Switched to cclog.ComponentError() for error reporting in Read()
2022-02-04 14:42:53 +01:00
Thomas Roehl
66b9a25a88
Prefix metrics from NetstatCollector with 'net'
2022-02-04 12:39:59 +01:00
Thomas Roehl
db02c89683
Update LustreCollector to use lctl. Sysfs version is commented out
2022-02-03 22:05:16 +01:00
Thomas Gruber
92d4a9c2b9
Split MetricRouter and MetricAggregator ( #24 )
* Split MetricRouter and MetricAggregator
* Missing change in MetricCache
* Add README for MetricAggregator
2022-02-03 16:52:55 +01:00
Holger Obermaier
d5ff5b83ce
Add NUMA metric collector
2022-02-03 16:19:45 +01:00
Holger Obermaier
a016483012
Add NUMA metric collector.
2022-02-03 15:02:13 +01:00
Thomas Roehl
2806b1e7cc
Remove debugging artifacts
2022-02-02 17:14:29 +01:00
Thomas Roehl
e59852be03
Fix LikwidCollector, merge artifact causes problems
2022-02-02 16:55:15 +01:00
Thomas Roehl
6f399d5f08
Add scope guidelines in LikwidCollector page
2022-02-02 16:46:35 +01:00
Thomas Roehl
5bf538bf97
Update LikwidCollector page
2022-02-02 16:40:20 +01:00
Thomas Roehl
ed62e952ce
Use MetricAggregator to calculate metrics in LIKWID collector.
2022-02-02 14:52:07 +01:00
Thomas Roehl
e550226416
Use gval in LikwidCollector
2022-02-01 16:01:31 +01:00
Holger Obermaier
9e99e47d73
Wait for close of done channel, to ensure manager finished.
2022-01-30 12:08:33 +01:00
Holger Obermaier
8df58c051f
Lower minimum required golang version to 1.16.
2022-01-29 10:04:31 +01:00
Holger Obermaier
4e408f9490
Add documentation
2022-01-28 15:16:58 +01:00