mirror of
https://github.com/ClusterCockpit/cc-metric-collector.git
synced 2024-12-27 23:59:05 +01:00
## `lustre_jobstat` collector
**Note**: This collector is meant to run on the Lustre servers, **not** on the clients.

The Lustre filesystem provides a feature (`job_stats`) that tags processes on the client side with an identifier string (for example, a compute job with its jobid) and makes the file system operation counts for each identifier available on the server side. See the section [How to configure `job_stats`](#how-to-configure-job_stats) for more information.
### Configuration
```json
"lustre_jobstat": {
  "lctl_command": "/path/to/lctl",
  "use_sudo": false,
  "exclude_metrics": [
    "setattr",
    "getattr"
  ],
  "send_abs_values": true,
  "jobid_regex": "^(?P<jobid>[%d%w%.]+)$"
}
```
The `lustre_jobstat` collector runs the `lctl` application with the `get_param` option to read all `mdt.*.job_stats` and `obdfilter.*.job_stats` metrics. These metrics are only readable by root. If password-less sudo is configured, you can set `use_sudo` to `true` in the configuration. Metrics listed in `exclude_metrics` are not sent, which reduces network traffic and storage. With the `send_abs_values` flag, the collector sends absolute values for the configured metrics. The `jobid_regex` can be used to split the Lustre `job_stats` identifier into multiple parts. Because the backslash is the escape character in JSON strings, a sequence like `\d` is not valid there; use `%` instead of `\` (for example, `%d` for `\d`).

Metrics:

- `lustre_job_read_samples` (unit: `requests`)
- `lustre_job_read_min_bytes` (unit: `bytes`)
- `lustre_job_read_max_bytes` (unit: `bytes`)

The collector adds the tags: `type=jobid,typeid=<jobid_from_regex>,stype=device,stypeid=<device_name_from_output>`.

The collector adds the meta information: `unit=<unit>,scope=job`.
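
The `%`-for-`\` convention of `jobid_regex` can be illustrated with a small Go sketch. It is not taken from the collector's source: the helper names `compileJobidRegex` and `extractGroups` are hypothetical, and the sketch only assumes that each `%` is replaced by a backslash before the pattern is compiled and that named groups such as `jobid` are then read from the match.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// compileJobidRegex is a hypothetical helper: it turns the '%' placeholders
// from the JSON configuration back into backslashes and compiles the result.
func compileJobidRegex(pattern string) (*regexp.Regexp, error) {
	return regexp.Compile(strings.ReplaceAll(pattern, "%", `\`))
}

// extractGroups is a hypothetical helper: it returns the named capture
// groups of re for the given identifier, or nil if it does not match.
func extractGroups(re *regexp.Regexp, id string) map[string]string {
	match := re.FindStringSubmatch(id)
	if match == nil {
		return nil
	}
	groups := make(map[string]string)
	for i, name := range re.SubexpNames() {
		if i > 0 && name != "" {
			groups[name] = match[i]
		}
	}
	return groups
}

func main() {
	// Pattern exactly as written in the example configuration.
	re, err := compileJobidRegex("^(?P<jobid>[%d%w%.]+)$")
	if err != nil {
		panic(err)
	}
	// "1234.master" is a made-up identifier for illustration.
	fmt.Println(extractGroups(re, "1234.master")["jobid"])
}
```

With the example pattern, an identifier made up only of digits, word characters and dots matches as a whole and ends up in the `jobid` group; an identifier containing other characters, such as a space, does not match.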
### How to configure `job_stats`