mirror of
https://github.com/ClusterCockpit/cc-metric-collector.git
synced 2025-10-07 23:04:32 +02:00
add slurm_cgroup Collector
committed by
Thomas Gruber
parent
a45366646e
commit
c5183feafc
48
collectors/slurmCgroupMetric.md
Normal file
@@ -0,0 +1,48 @@
<!--
---
title: Slurm cgroup metric collector
description: Collect per-core memory and CPU usage for SLURM jobs from cgroup v2
categories: [cc-metric-collector]
tags: ['Admin']
weight: 3
hugo_path: docs/reference/cc-metric-collector/collectors/slurm_cgroup.md
---
-->

## `slurm_cgroup` collector

The `slurm_cgroup` collector reads job-specific resource metrics from the cgroup v2 filesystem and reports the memory and CPU usage of running SLURM jobs as **hwthread** metrics.

### Example configuration

```json
  "slurm_cgroup": {
    "cgroup_base": "/sys/fs/cgroup/system.slice/slurmstepd.scope",
    "exclude_metrics": [
      "job_sys_cpu",
      "job_mem_limit"
    ]
  }
```

* The optional `cgroup_base` parameter sets the root path of the SLURM job cgroups. The default is `/sys/fs/cgroup/system.slice/slurmstepd.scope`.
* The `exclude_metrics` array lists metrics that should not be sent to the sink.
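The following Python sketch illustrates how these two options could be interpreted. It is not the collector's actual code: the function names `load_slurm_cgroup_config` and `enabled_metrics` are hypothetical, and the metric names are the ones this document lists under "Reported metrics".

```python
import json

# Default taken from the documentation of `cgroup_base`.
DEFAULT_CGROUP_BASE = "/sys/fs/cgroup/system.slice/slurmstepd.scope"

# Metric names as listed in this document.
ALL_METRICS = [
    "job_mem_used",
    "job_max_mem_used",
    "job_mem_limit",
    "job_user_cpu",
    "job_sys_cpu",
]


def load_slurm_cgroup_config(raw: str) -> dict:
    """Parse the collector's JSON object and apply the documented default (hypothetical helper)."""
    cfg = json.loads(raw) if raw.strip() else {}
    return {
        "cgroup_base": cfg.get("cgroup_base", DEFAULT_CGROUP_BASE),
        "exclude_metrics": set(cfg.get("exclude_metrics", [])),
    }


def enabled_metrics(cfg: dict) -> list:
    """Return the metrics that are not suppressed by `exclude_metrics`."""
    return [name for name in ALL_METRICS if name not in cfg["exclude_metrics"]]
```

With the example configuration above and `cgroup_base` omitted, the sketch falls back to the default path and keeps only `job_mem_used`, `job_max_mem_used` and `job_user_cpu`.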

### Reported metrics

All metrics are reported **per hardware thread**:

* `job_mem_used` (`unit=Bytes`): Current memory usage of the job
* `job_max_mem_used` (`unit=Bytes`): Peak memory usage of the job
* `job_mem_limit` (`unit=Bytes`): Memory limit of the job's cgroup
* `job_user_cpu` (`unit=%`): User CPU utilization in percent
* `job_sys_cpu` (`unit=%`): System CPU utilization in percent
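This document does not say which cgroup files back these metrics. As a sketch only, the Python code below assumes the standard cgroup v2 interface files: `memory.current`, `memory.peak` and `memory.max` for the memory values, and the `user_usec` and `system_usec` counters of `cpu.stat` for the CPU percentages. The helper names are hypothetical, and the normalisation to a single hardware thread is an assumption rather than the collector's confirmed behaviour.

```python
def parse_cpu_stat(text: str) -> dict:
    """Parse the 'key value' lines of a cgroup v2 cpu.stat file into integers."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def parse_memory_value(text: str):
    """Parse memory.current / memory.peak / memory.max.

    memory.max contains the literal string 'max' when no limit is set;
    this sketch returns None in that case.
    """
    value = text.strip()
    return None if value == "max" else int(value)


def cpu_percent(prev: dict, curr: dict, interval_s: float, key: str) -> float:
    """CPU time accumulated under `key` during the interval, as a percentage
    of one hardware thread (assumed normalisation)."""
    delta_usec = curr[key] - prev[key]
    return 100.0 * delta_usec / (interval_s * 1_000_000)
```

The percentages require two samples of `cpu.stat` because the kernel exposes cumulative microsecond counters, not rates.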

Each metric has tags:

* `type=hwthread`
* `type-id=<core_id>`
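To make the tagging concrete, the Python sketch below renders one sample in an InfluxDB-line-protocol-like form with the two tags listed above. The function `format_metric`, the field name `value` and the exact layout are illustrative assumptions; the real messages may carry further tags or meta information that this document does not describe.

```python
def format_metric(name: str, value, hwthread_id: int, timestamp_ns: int) -> str:
    """Render one sample as 'name,tags value=<v> <timestamp>' (illustrative layout)."""
    tags = {"type": "hwthread", "type-id": str(hwthread_id)}
    # Sort the tags so the output is deterministic.
    tag_str = ",".join(f"{key}={val}" for key, val in sorted(tags.items()))
    return f"{name},{tag_str} value={value} {timestamp_ns}"


line = format_metric("job_mem_used", 1048576, 3, 1700000000000000000)
```

Here `line` describes the memory usage attributed to hardware thread 3; the metric name comes from the list of reported metrics.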

### Limitations

* **cgroups v2 required:** This collector only supports systems running with cgroups v2 (unified hierarchy).
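A quick way to check this requirement on a node is to look for the `cgroup.controllers` file, which exists at the root of a cgroup v2 hierarchy. The Python sketch below assumes the usual mount point `/sys/fs/cgroup`; the function name `is_cgroup_v2` is hypothetical and this is not necessarily how the collector itself performs the check.

```python
import os


def is_cgroup_v2(root: str = "/sys/fs/cgroup") -> bool:
    """Return True if `root` looks like the root of a unified (v2) cgroup hierarchy."""
    # cgroup v2 exposes the list of available controllers in this file.
    return os.path.isfile(os.path.join(root, "cgroup.controllers"))
```

On hybrid setups the v2 hierarchy may be mounted elsewhere (for example under `/sys/fs/cgroup/unified`), in which case the check against the default root returns `False`.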