Update Hugo integration

Thomas Roehl 2025-04-16 23:54:17 +02:00
parent 8ccbb4f69c
commit a1077b58a8
32 changed files with 361 additions and 18 deletions

@ -1,6 +1,17 @@
<!--
---
title: cc-metric-collector
description: Metric collecting node agent
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/_index.md
---
-->
# cc-metric-collector # cc-metric-collector
A node agent for measuring, processing and forwarding node level metrics. It is part of the [ClusterCockpit ecosystem](./docs/introduction.md). A node agent for measuring, processing and forwarding node level metrics. It is part of the [ClusterCockpit ecosystem](https://clustercockpit.org/docs/overview/).
The metric collector sends (and receives) metrics in the [InfluxDB line protocol](https://docs.influxdata.com/influxdb/cloud/reference/syntax/line-protocol/), which is flexible and separates tags (like index columns in relational databases) from fields (like data columns). The metric collector sends (and receives) metrics in the [InfluxDB line protocol](https://docs.influxdata.com/influxdb/cloud/reference/syntax/line-protocol/), which is flexible and separates tags (like index columns in relational databases) from fields (like data columns).
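To illustrate the separation between tags and fields, here is a minimal Go sketch that renders one measurement as a line-protocol string. The helper `formatLine` and the sample metric are hypothetical, escaping of special characters is omitted, and this is not necessarily how cc-metric-collector serializes its messages.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatLine renders a single measurement in InfluxDB line protocol:
//   <name>,<tag_key>=<tag_value>,... <field_key>=<field_value> <timestamp_ns>
// Hypothetical helper for illustration; escaping of spaces and commas is omitted.
func formatLine(name string, tags map[string]string, value float64, tsNanos int64) string {
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys) // sorted tag keys give deterministic output

	var sb strings.Builder
	sb.WriteString(name)
	for _, k := range keys {
		fmt.Fprintf(&sb, ",%s=%s", k, tags[k])
	}
	// A single field named "value" carries the measured data.
	fmt.Fprintf(&sb, " value=%g %d", value, tsNanos)
	return sb.String()
}

func main() {
	tags := map[string]string{"hostname": "node01", "type": "node"}
	fmt.Println(formatLine("load_one", tags, 1.5, 1700000000000000000))
}
```

The tags end up in the comma-separated part directly after the metric name, while the field follows after the first space.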
@ -35,8 +46,8 @@ The `interval` defines how often the metrics should be read and send to the sink
See the component READMEs for their configuration: See the component READMEs for their configuration:
* [`collectors`](./collectors/README.md) * [`collectors`](./collectors/README.md)
* [`sinks`](./sinks/README.md) * [`sinks`](https://github.com/ClusterCockpit/cc-lib/blob/main/sinks/README.md)
* [`receivers`](./receivers/README.md) * [`receivers`](https://github.com/ClusterCockpit/cc-lib/blob/main/receivers/README.md)
* [`router`](./internal/metricRouter/README.md) * [`router`](./internal/metricRouter/README.md)
# Installation # Installation

@ -1,3 +1,14 @@
<!--
---
title: Metric Collectors
description: Metric collectors for cc-metric-collector
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/_index.md
---
-->
# CCMetric collectors # CCMetric collectors
This folder contains the collectors for the cc-metric-collector. This folder contains the collectors for the cc-metric-collector.
@ -23,7 +34,6 @@ In contrast to the configuration files for sinks and receivers, the collectors c
* [`loadavg`](./loadavgMetric.md) * [`loadavg`](./loadavgMetric.md)
* [`netstat`](./netstatMetric.md) * [`netstat`](./netstatMetric.md)
* [`ibstat`](./infinibandMetric.md) * [`ibstat`](./infinibandMetric.md)
* [`ibstat_perfquery`](./infinibandPerfQueryMetric.md)
* [`tempstat`](./tempMetric.md) * [`tempstat`](./tempMetric.md)
* [`lustrestat`](./lustreMetric.md) * [`lustrestat`](./lustreMetric.md)
* [`likwid`](./likwidMetric.md) * [`likwid`](./likwidMetric.md)
@ -53,7 +63,7 @@ A collector reads data from any source, parses it to metrics and submits these m
* `Name() string`: Return the name of the collector * `Name() string`: Return the name of the collector
* `Init(config json.RawMessage) error`: Initializes the collector using the given collector-specific config in JSON. Check if needed files/commands exist, ... * `Init(config json.RawMessage) error`: Initializes the collector using the given collector-specific config in JSON. Check if needed files/commands exist, ...
* `Initialized() bool`: Check if a collector is successfully initialized * `Initialized() bool`: Check if a collector is successfully initialized
* `Read(duration time.Duration, output chan ccMetric.CCMetric)`: Read, parse and submit data to the `output` channel as [`CCMetric`](../internal/ccMetric/README.md). If the collector has to measure anything for some duration, use the provided function argument `duration`. * `Read(duration time.Duration, output chan ccMessage.CCMessage)`: Read, parse and submit data to the `output` channel as [`CCMessage`](https://github.com/ClusterCockpit/cc-lib/blob/main/ccMessage/README.md). If the collector has to measure anything for some duration, use the provided function argument `duration`.
* `Close()`: Closes down the collector. * `Close()`: Closes down the collector.
It is recommended to call `setup()` in the `Init()` function. It is recommended to call `setup()` in the `Init()` function.
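The method list above can be sketched as a small Go type. This is only an illustration under stated assumptions: `Message` is a hypothetical stand-in for `ccMessage.CCMessage` from cc-lib, and `DummyCollector` with its `value` config key is invented for this example and is not part of the project.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Message is a hypothetical stand-in for ccMessage.CCMessage from cc-lib,
// reduced to the fields this sketch needs.
type Message struct {
	Name  string
	Value float64
	Time  time.Time
}

// DummyCollector is an illustrative collector, not one shipped with the project.
type DummyCollector struct {
	init   bool
	config struct {
		Value float64 `json:"value"` // hypothetical config option
	}
}

// Name returns the name of the collector.
func (c *DummyCollector) Name() string { return "dummy" }

// Init parses the collector-specific JSON config and marks the collector ready.
func (c *DummyCollector) Init(config json.RawMessage) error {
	if len(config) > 0 {
		if err := json.Unmarshal(config, &c.config); err != nil {
			return fmt.Errorf("%s: invalid config: %w", c.Name(), err)
		}
	}
	c.init = true
	return nil
}

// Initialized reports whether Init succeeded.
func (c *DummyCollector) Initialized() bool { return c.init }

// Read submits one message to output. The duration argument is unused
// because nothing has to be measured over an interval here.
func (c *DummyCollector) Read(duration time.Duration, output chan Message) {
	if !c.init {
		return
	}
	output <- Message{Name: "dummy_metric", Value: c.config.Value, Time: time.Now()}
}

// Close shuts the collector down.
func (c *DummyCollector) Close() { c.init = false }

func main() {
	c := &DummyCollector{}
	if err := c.Init(json.RawMessage(`{"value": 42}`)); err != nil {
		panic(err)
	}
	out := make(chan Message, 1)
	c.Read(time.Second, out)
	m := <-out
	fmt.Println(m.Name, m.Value)
	c.Close()
}
```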

@ -1,5 +1,17 @@
<!--
---
title: BeeGFS metadata metric collector
description: Collect metadata clientstats for `BeeGFS on Demand`
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/beegfsmeta.md
---
-->
## `BeeGFS on Demand` collector ## `BeeGFS on Demand` collector
This Collector is to collect BeeGFS on Demand (BeeOND) metadata clientstats. This Collector is to collect `BeeGFS on Demand` (BeeOND) metadata clientstats.
```json ```json
"beegfs_meta": { "beegfs_meta": {
@ -72,4 +84,4 @@ Available Metrics:
* setXA * setXA
* mirror * mirror
The collector adds a `filesystem` tag to all metrics The collector adds a `filesystem` tag to all metrics

@ -1,3 +1,14 @@
<!--
---
title: "BeeGFS on Demand metric collector"
description: Collect performance metrics for BeeGFS filesystems
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/beegfsstorage.md
---
-->
## `BeeGFS on Demand` collector ## `BeeGFS on Demand` collector
This Collector is to collect BeeGFS on Demand (BeeOND) storage stats. This Collector is to collect BeeGFS on Demand (BeeOND) storage stats.
@ -52,4 +63,4 @@ Available Metrics:
* "unlnk" * "unlnk"
The collector adds a `filesystem` tag to all metrics The collector adds a `filesystem` tag to all metrics

@ -1,3 +1,14 @@
<!--
---
title: CPU frequency metric collector through cpuinfo
description: Collect the CPU frequency from `/proc/cpuinfo`
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/cpufreq_cpuinfo.md
---
-->
## `cpufreq_cpuinfo` collector ## `cpufreq_cpuinfo` collector
```json ```json

@ -1,3 +1,14 @@
<!--
---
title: CPU frequency metric collector through sysfs
description: Collect the CPU frequency metrics from `/sys/.../cpu/.../cpufreq`
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/cpufreq.md
---
-->
## `cpufreq` collector ## `cpufreq` collector
```json ```json

@ -1,3 +1,14 @@
<!--
---
title: CPU usage metric collector
description: Collect CPU metrics from `/proc/stat`
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/cpustat.md
---
-->
## `cpustat` collector ## `cpustat` collector
@ -24,4 +35,4 @@ Metrics:
* `cpu_guest` with `unit=Percent` * `cpu_guest` with `unit=Percent`
* `cpu_guest_nice` with `unit=Percent` * `cpu_guest_nice` with `unit=Percent`
* `cpu_used` = `cpu_* - cpu_idle` with `unit=Percent` * `cpu_used` = `cpu_* - cpu_idle` with `unit=Percent`
* `num_cpus` * `num_cpus`

@ -1,3 +1,13 @@
<!--
---
title: CustomCommand metric collector
description: Collect messages from custom command or files
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/customcmd.md
---
-->
## `customcmd` collector ## `customcmd` collector

@ -1,3 +1,13 @@
<!--
---
title: Disk usage statistics metric collector
description: Collect metrics for various filesystems from `/proc/self/mounts`
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/diskstat.md
---
-->
## `diskstat` collector ## `diskstat` collector

@ -1,3 +1,14 @@
<!--
---
title: GPFS collector
description: Collect infos about GPFS filesystems
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/gpfs.md
---
-->
## `gpfs` collector ## `gpfs` collector
```json ```json

@ -1,3 +1,13 @@
<!--
---
title: InfiniBand Metric collector
description: Collect metrics for InfiniBand devices
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/infiniband.md
---
-->
## `ibstat` collector ## `ibstat` collector

@ -1,3 +1,13 @@
<!--
---
title: IOStat Metric collector
description: Collect metrics from `/proc/diskstats`
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/iostat.md
---
-->
## `iostat` collector ## `iostat` collector

@ -1,3 +1,13 @@
<!--
---
title: IPMI Metric collector
description: Collect metrics using ipmitool or ipmi-sensors
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/ipmi.md
---
-->
## `ipmistat` collector ## `ipmistat` collector

@ -1,3 +1,13 @@
<!--
---
title: LIKWID collector
description: Collect hardware performance events and metrics using LIKWID
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/likwid.md
---
-->
## `likwid` collector ## `likwid` collector

@ -1,3 +1,14 @@
<!--
---
title: Load average metric collector
description: Collect metrics from `/proc/loadavg`
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/loadavg.md
---
-->
## `loadavg` collector ## `loadavg` collector

@ -1,3 +1,14 @@
<!--
---
title: Lustre filesystem metric collector
description: Collect metrics for Lustre filesystems
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/lustre.md
---
-->
## `lustrestat` collector ## `lustrestat` collector
@ -43,4 +54,4 @@ Metrics:
* `lustre_statfs_diff` (if `send_diff_values == true`) * `lustre_statfs_diff` (if `send_diff_values == true`)
* `lustre_inode_permission_diff` (if `send_diff_values == true`) * `lustre_inode_permission_diff` (if `send_diff_values == true`)
This collector adds a `device` tag. This collector adds a `device` tag.

@ -1,3 +1,14 @@
<!--
---
title: Memory statistics metric collector
description: Collect metrics from `/proc/meminfo`
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/memstat.md
---
-->
## `memstat` collector ## `memstat` collector

@ -1,3 +1,13 @@
<!--
---
title: Network device metric collector
description: Collect metrics for network devices through procfs
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/netstat.md
---
-->
## `netstat` collector ## `netstat` collector
@ -28,4 +38,4 @@ Metrics:
* `net_pkts_in_bw` (`unit=packets/sec` if `send_derived_values == true`) * `net_pkts_in_bw` (`unit=packets/sec` if `send_derived_values == true`)
* `net_pkts_out_bw` (`unit=packets/sec` if `send_derived_values == true`) * `net_pkts_out_bw` (`unit=packets/sec` if `send_derived_values == true`)
The device name is added as tag `stype=network,stype-id=<device>`. The device name is added as tag `stype=network,stype-id=<device>`.

@ -1,3 +1,14 @@
<!--
---
title: NFS network filesystem (v3) metric collector
description: Collect metrics for NFS network filesystems in version 3
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/nfs3.md
---
-->
## `nfs3stat` collector ## `nfs3stat` collector

@ -1,3 +1,14 @@
<!--
---
title: NFS network filesystem (v4) metric collector
description: Collect metrics for NFS network filesystems in version 4
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/nfs4.md
---
-->
## `nfs4stat` collector ## `nfs4stat` collector

@ -1,3 +1,14 @@
<!--
---
title: NFS network filesystem metrics from procfs
description: Collect NFS network filesystem metrics for mounts from `/proc/self/mountstats`
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/nfsio.md
---
-->
## `nfsiostat` collector ## `nfsiostat` collector
```json ```json

@ -1,3 +1,13 @@
<!--
---
title: NUMAStat collector
description: Collect infos about NUMA domains
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/numastat.md
---
-->
## `numastat` collector ## `numastat` collector

@ -1,3 +1,13 @@
<!--
---
title: "Nvidia NVML metric collector"
description: Collect metrics for Nvidia GPUs using the NVML
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/nvidia.md
---
-->
## `nvidia` collector ## `nvidia` collector
@ -73,4 +83,4 @@ Metrics:
* `nv_nvlink_replay_errors` * `nv_nvlink_replay_errors`
* `nv_nvlink_recovery_errors` * `nv_nvlink_recovery_errors`
Some metrics add the additional sub type tag (`stype`) like the `nv_nvlink_*` metrics set `stype=nvlink,stype-id=<link_number>`. Some metrics add the additional sub type tag (`stype`) like the `nv_nvlink_*` metrics set `stype=nvlink,stype-id=<link_number>`.

@ -1,3 +1,14 @@
<!--
---
title: RAPL metric collector
description: Collect energy data through the RAPL sysfs interface
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/rapl.md
---
-->
## `rapl` collector ## `rapl` collector
This collector reads running average power limit (RAPL) monitoring attributes to compute average power consumption metrics. See <https://www.kernel.org/doc/html/latest/power/powercap/powercap.html#monitoring-attributes>. This collector reads running average power limit (RAPL) monitoring attributes to compute average power consumption metrics. See <https://www.kernel.org/doc/html/latest/power/powercap/powercap.html#monitoring-attributes>.
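As a rough sketch of how an average power value can be derived from the powercap monitoring attributes, the Go snippet below takes two readings of an `energy_uj` counter and divides the energy difference by the elapsed time. The helper `averagePower` is hypothetical and not taken from the collector's source; it assumes at most one counter wrap between readings, corrected with `max_energy_range_uj`.

```go
package main

import (
	"fmt"
	"time"
)

// averagePower returns the mean power in watts between two readings of a
// powercap energy counter (energy_uj, in microjoules). maxRangeUJ is the
// zone's max_energy_range_uj and is used to correct a single counter wrap.
// Hypothetical helper; not taken from the collector's source.
func averagePower(prevUJ, currUJ, maxRangeUJ uint64, elapsed time.Duration) float64 {
	if elapsed <= 0 {
		return 0
	}
	var deltaUJ uint64
	if currUJ >= prevUJ {
		deltaUJ = currUJ - prevUJ
	} else {
		// Assumption: the counter wrapped around exactly once.
		deltaUJ = maxRangeUJ - prevUJ + currUJ
	}
	joules := float64(deltaUJ) / 1e6 // microjoules to joules
	return joules / elapsed.Seconds()
}

func main() {
	// 50 J consumed over 10 s corresponds to 5 W on average.
	fmt.Println(averagePower(1_000_000, 51_000_000, 262_143_328_850, 10*time.Second))
}
```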

@ -1,3 +1,14 @@
<!--
---
title: "ROCm SMI metric collector"
description: Collect metrics for AMD GPUs using the SMI library
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/rocmsmi.md
---
-->
## `rocm_smi` collector ## `rocm_smi` collector

@ -1,3 +1,13 @@
<!--
---
title: SchedStat Metric collector
description: Collect metrics from `/proc/schedstat`
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/schedstat.md
---
-->
## `schedstat` collector ## `schedstat` collector
```json ```json
@ -8,4 +18,4 @@
The `schedstat` collector reads data from /proc/schedstat and calculates a load value, separated by hwthread. This might be useful to detect bad cpu pinning on shared nodes etc. The `schedstat` collector reads data from /proc/schedstat and calculates a load value, separated by hwthread. This might be useful to detect bad cpu pinning on shared nodes etc.
Metric: Metric:
* `cpu_load_core` * `cpu_load_core`

@ -1,3 +1,14 @@
<!--
---
title: Self-monitoring metric collector
description: Collect metrics from the execution of cc-metric-collector itself
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/self.md
---
-->
## `self` collector ## `self` collector
```json ```json

@ -1,3 +1,14 @@
<!--
---
title: Temperature metric collector
description: Collect thermal metrics from `/sys/class/hwmon/*`
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/temp.md
---
-->
## `tempstat` collector ## `tempstat` collector

@ -1,3 +1,15 @@
<!--
---
title: TopProcs collector
description: Collect infos about most CPU-consuming processes
categories: [cc-metric-collector]
tags: ['Admin']
weight: 2
hugo_path: docs/reference/cc-metric-collector/collectors/topprocs.md
---
-->
## `topprocs` collector ## `topprocs` collector

@ -1,3 +1,14 @@
<!--
---
title: Metric Aggregator
description: Subsystem for evaluating expressions on metrics (deprecated)
categories: [cc-metric-collector]
tags: ['Developer']
weight: 1
hugo_path: docs/reference/cc-metric-collector/internal/metricaggregator/_index.md
---
-->
# The MetricAggregator # The MetricAggregator
In some cases, further combination of metrics or raw values is required. For that strings like `foo + 1` with runtime dependent `foo` need to be evaluated. The MetricAggregator relies on the [`gval`](https://github.com/PaesslerAG/gval) Golang package to perform all expression evaluation. The `gval` package provides the basic arithmetic operations but the MetricAggregator defines additional ones. In some cases, further combination of metrics or raw values is required. For that strings like `foo + 1` with runtime dependent `foo` need to be evaluated. The MetricAggregator relies on the [`gval`](https://github.com/PaesslerAG/gval) Golang package to perform all expression evaluation. The `gval` package provides the basic arithmetic operations but the MetricAggregator defines additional ones.
@ -35,4 +46,4 @@ The MetricAggregator provides these functions additional to the `Full` language
## Limitations ## Limitations
- Since the metrics are written in JSON files which do not allow `""` without proper escaping inside of JSON strings, you have to use `''` for strings. - Since the metrics are written in JSON files which do not allow `""` without proper escaping inside of JSON strings, you have to use `''` for strings.
- Since `\` is interpreted by JSON as an escape character, it cannot be used in metrics, yet regular expressions require it. So instead of `\`, use `%`; the MetricAggregator replaces each `%` with `\` after reading the JSON file. - Since `\` is interpreted by JSON as an escape character, it cannot be used in metrics, yet regular expressions require it. So instead of `\`, use `%`; the MetricAggregator replaces each `%` with `\` after reading the JSON file.
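The placeholder substitution from the limitations above can be sketched in a few lines of Go, assuming that `%` stands in for the backslash of a regular expression. The helper `unescapePattern` is hypothetical; the MetricAggregator's own code may be structured differently.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// unescapePattern turns the `%` placeholders used in JSON config strings
// back into backslashes. Hypothetical helper for illustration only.
func unescapePattern(s string) string {
	return strings.ReplaceAll(s, "%", `\`)
}

func main() {
	pattern := unescapePattern("^sub_metric_%d+$") // as written in the JSON file
	re := regexp.MustCompile(pattern)              // compiled as ^sub_metric_\d+$
	fmt.Println(pattern, re.MatchString("sub_metric_12"))
}
```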

@ -1,11 +1,22 @@
<!--
---
title: Message Router
description: Routing component inside cc-metric-collector
categories: [cc-metric-collector]
tags: ['Developer']
weight: 1
hugo_path: docs/reference/cc-metric-collector/internal/metricrouter/_index.md
---
-->
# CC Metric Router # CC Metric Router
The CCMetric router sits in between the collectors and the sinks and can be used to add and remove tags to/from traversing [CCMessages](https://pkg.go.dev/github.com/ClusterCockpit/cc-energy-manager@v0.0.0-20240919152819-92a17f2da4f7/pkg/cc-message. The CCMetric router sits in between the collectors and the sinks and can be used to add and remove tags to/from traversing [CCMessages](https://pkg.go.dev/github.com/ClusterCockpit/cc-lib/ccMessage).
# Configuration # Configuration
**Note**: Use the [message processor configuration](../../pkg/messageProcessor/README.md) with option `process_messages`. **Note**: Use the [message processor configuration](https://github.com/ClusterCockpit/cc-lib/blob/main/messageProcessor/README.md) with option `process_messages`.
```json ```json
{ {
@ -69,7 +80,7 @@ The CCMetric router sits in between the collectors and the sinks and can be used
There are three main options `add_tags`, `delete_tags` and `interval_timestamp`. `add_tags` and `delete_tags` are lists consisting of dicts with `key`, `value` and `if`. The `value` can be omitted in the `delete_tags` part as it only uses the `key` for removal. The `interval_timestamp` setting means that a unique timestamp is applied to all metrics traversing the router during an interval. There are three main options `add_tags`, `delete_tags` and `interval_timestamp`. `add_tags` and `delete_tags` are lists consisting of dicts with `key`, `value` and `if`. The `value` can be omitted in the `delete_tags` part as it only uses the `key` for removal. The `interval_timestamp` setting means that a unique timestamp is applied to all metrics traversing the router during an interval.
**Note**: Use the [message processor configuration](../../pkg/messageProcessor/README.md) (option `process_messages`) instead of `add_tags`, `delete_tags`, `drop_metrics`, `drop_metrics_if`, `rename_metrics`, `normalize_units` and `change_unit_prefix`. These options are deprecated and will be removed in future versions. Until then, they are added to the message processor. **Note**: Use the [message processor configuration](https://github.com/ClusterCockpit/cc-lib/blob/main/messageProcessor/README.md) (option `process_messages`) instead of `add_tags`, `delete_tags`, `drop_metrics`, `drop_metrics_if`, `rename_metrics`, `normalize_units` and `change_unit_prefix`. These options are deprecated and will be removed in future versions. Until then, they are added to the message processor.
# Processing order in the router # Processing order in the router
@ -263,7 +274,7 @@ The above configuration, collects all metric values for metrics evaluating `if`
If you are not interested in the input metrics `sub_metric_%d+` at all, you can add the same condition used here to the `drop_metrics_if` section to drop them. If you are not interested in the input metrics `sub_metric_%d+` at all, you can add the same condition used here to the `drop_metrics_if` section to drop them.
Use cases for `interval_aggregates`: Use cases for `interval_aggregates`:
- Combine multiple metrics of a collector to a new one like the [MemstatCollector](../../collectors/memstatMetric.md) does it for `mem_used`)): - Combine multiple metrics of a collector to a new one like the [MemstatCollector](../../collectors/memstatMetric.md) does it for `mem_used`:
```json ```json
{ {
"name" : "mem_used", "name" : "mem_used",

@ -1,3 +1,14 @@
<!--
---
title: Multi-channel Ticker
description: Timer ticker that sends out the tick to multiple channels
categories: [cc-metric-collector]
tags: ['Developer']
weight: 1
hugo_path: docs/reference/cc-metric-collector/pkg/multichanticker/_index.md
---
-->
# MultiChanTicker # MultiChanTicker
The idea of this ticker is to multiply the output channels. The original Golang `time.Ticker` provides only a single output channel, so the signal can only be received by a single receiver. This ticker allows adding multiple channels, all of which are notified of each time tick. The idea of this ticker is to multiply the output channels. The original Golang `time.Ticker` provides only a single output channel, so the signal can only be received by a single receiver. This ticker allows adding multiple channels, all of which are notified of each time tick.