mirror of https://github.com/ClusterCockpit/cc-specifications.git
synced 2025-03-15 03:15:55 +01:00
Continue on lineprotocol ccMessage specs
parent 6205680fb8
commit 917e17d9bf
@@ -16,6 +16,9 @@ Where `<tag set>` and `<field set>` are comma-separated lists of `key=value`
 entries. In a mind-model, think about tags as `indices` in the database for
 faster lookup and the `<field set>` as values.
 
+We are using the tag set to add metadata information and the field for the
+payload.
+
 **Remark**: In the first iteration, we only sent metrics (number values) but we
 extended the specification to messages with different purposes. The below
 text was changed accordingly. The update is backward-compatible, for metrics
@@ -25,10 +28,11 @@ text was changed accordingly. The update is backward-compatible, for metrics
 
 There exist the following line line-protocol message flavors:
 
-* Metric: The field key is `value`
-* Event: The field key is `event`
-* Log message: The field key is `log`
-* Control message: The field key is `control`
+- Metric: The field key is `value`, measurement = metric name
+- Event: The field key is `event`, Events are actionable informations, measurement = event subtype (job, phases, ?? ), Additional tag function=<string>
+- Log message: The field key is `log`. Log messages are purely informational,
+measurement = [ccb, ccms, ccmc, ccem, ccnc], Additional tag loglevel
+- Control message: The field key is `control`, measurement = knob name (rapl, freq, prefetcher, topology, config), Additional tags: method=[get,put]
 
 ## Messaging
 
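The flavor rules in the hunk above can be sketched as a lookup on the field key. This is a minimal illustration, not ClusterCockpit code: the function names, the host name `node01`, and the sample line are hypothetical, and the parser assumes the usual `<measurement>,<tag set> <field set> <timestamp>` layout with no escaped or quoted spaces, commas, or equals signs.

```python
# Field key -> message flavor, as listed in the changed bullet list.
FLAVOR_BY_FIELD_KEY = {
    "value": "metric",
    "event": "event",
    "log": "log",
    "control": "control",
}


def parse_line(line):
    """Naively split '<measurement>,<tag set> <field set> <timestamp>'.

    Assumption: no escaping or quoting is used anywhere in the line.
    """
    head, field_part, timestamp = line.strip().split(" ")
    measurement, *tag_items = head.split(",")
    tags = dict(item.split("=", 1) for item in tag_items)
    fields = dict(item.split("=", 1) for item in field_part.split(","))
    return measurement, tags, fields, int(timestamp)


def classify(fields):
    """Return the flavor of the first known field key, or None if none matches."""
    for key, flavor in FLAVOR_BY_FIELD_KEY.items():
        if key in fields:
            return flavor
    return None


# Hypothetical metric message; all values are kept as strings by this sketch.
measurement, tags, fields, ts = parse_line(
    "flops_any,hostname=node01,type=node,type-id=0 value=42.0 1700000000"
)
print(measurement, classify(fields))
```

A real consumer would use a proper line-protocol decoder instead of `str.split`, since string fields such as log text may contain spaces and commas.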
@@ -45,10 +49,10 @@ subject hierarchy tree is used:
 |
 --- log.[ccb, ccms, ccmc, ccem, ccnc]
 |
---- control
+--- control.[get,put]
 ```
 
-## Metric messages
+## Points generic for all message categories
 
 In ClusterCockpit we limit the flexibility of the InfluxData line-protocol
 slightly. The idea is to keep the format usable by different components.
@@ -58,17 +62,17 @@ Each message is identifiable by the `measurement` (= metric name), the
 
 ### Mandatory tags per message
 
-* `hostname`
-* `type`
-  * `node`
-  * `socket`
-  * `die`
-  * `memoryDomain`
-  * `llc`
-  * `core`
-  * `hwthread`
-  * `accelerator`
-* `type-id` for further specifying the type like CPU socket or HW Thread identifier
+- `hostname`
+- `type`
+  - `node`
+  - `socket`
+  - `die`
+  - `memoryDomain`
+  - `llc`
+  - `core`
+  - `hwthread`
+  - `accelerator`
+- `type-id` for further specifying the type like CPU socket or HW Thread identifier
 
 Although no `type-id` is required if `type=node`, it is recommended to send `type=node,type-id=0`.
 
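The tag rules in this hunk could be checked roughly as follows. This is a sketch under assumptions: it reads `node` through `accelerator` as the allowed values of the `type` tag (which the `type=node` sentence suggests), and the function name `check_mandatory_tags` and the host name are hypothetical.

```python
# Assumed set of allowed values for the `type` tag.
ALLOWED_TYPES = {
    "node", "socket", "die", "memoryDomain",
    "llc", "core", "hwthread", "accelerator",
}


def check_mandatory_tags(tags):
    """Return a list of problems found in a tag dict; empty means acceptable."""
    problems = []
    if "hostname" not in tags:
        problems.append("missing hostname")
    type_ = tags.get("type")
    if type_ is None:
        problems.append("missing type")
    elif type_ not in ALLOWED_TYPES:
        problems.append(f"unknown type {type_!r}")
    elif type_ != "node" and "type-id" not in tags:
        # type-id is only optional for type=node.
        problems.append(f"missing type-id for type {type_!r}")
    return problems


print(check_mandatory_tags({"hostname": "node01", "type": "node"}))
print(check_mandatory_tags({"hostname": "node01", "type": "socket"}))
```

Since sending `type=node,type-id=0` is recommended, a sender could simply always include `type-id` and avoid the special case.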
@@ -93,17 +97,17 @@ While the measurements (metric names) can be chosen freely, there is a basic set
 of measurements which should be present as long as you navigate in the
 ClusterCockpit ecosystem
 
-* `flops_sp`: Single-precision floating point rate in `Flops/s`
-* `flops_dp`: Double-precision floating point rate in `Flops/s`
-* `flops_any`: Combined floating point rate in `Flops/s` (often `(flops_dp * 2) + flops_sp`)
-* `cpu_load`: The 1m load of the system (see `/proc/loadavg`)
-* `mem_used`: The amount of memory used by applications (see `/proc/meminfo`)
-* `ipc`: instructions-per-cycle metric
-* `mem_bw`: Main memory bandwidth (read and write) in `MByte/s`
-* `cpu_power`: Power consumption of the whole CPU package
-* `mem_power`: Power consumption of the memory subsystem
-* `clock`: CPU clock in `MHz`
-* ...
+- `flops_sp`: Single-precision floating point rate in `Flops/s`
+- `flops_dp`: Double-precision floating point rate in `Flops/s`
+- `flops_any`: Combined floating point rate in `Flops/s` (often `(flops_dp * 2) + flops_sp`)
+- `cpu_load`: The 1m load of the system (see `/proc/loadavg`)
+- `mem_used`: The amount of memory used by applications (see `/proc/meminfo`)
+- `ipc`: instructions-per-cycle metric
+- `mem_bw`: Main memory bandwidth (read and write) in `MByte/s`
+- `cpu_power`: Power consumption of the whole CPU package
+- `mem_power`: Power consumption of the memory subsystem
+- `clock`: CPU clock in `MHz`
+- ...
 
 For the whole list, see [job-data schema](../../datastructures/job-data.schema.json)
 
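The `flops_any` entry gives a commonly used formula; a small worked sketch makes the weighting explicit. The helper name and the input rates are made up for illustration, and the list itself says this formula is used "often", not always.

```python
def flops_any(flops_sp, flops_dp):
    """Combine SP and DP rates as (flops_dp * 2) + flops_sp, all in Flops/s."""
    return (flops_dp * 2) + flops_sp


# Hypothetical rates: 1.0e9 SP Flops/s and 2.5e9 DP Flops/s.
print(flops_any(flops_sp=1.0e9, flops_dp=2.5e9))
```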
@@ -121,8 +125,8 @@ TBD
 
 **Identification:**
 
-* `control="X"` field with `"X"` being a string
-* `method` tag is either `GET` or `PUT`
+- `control="X"` field with `"X"` being a string
+- `method` tag is either `GET` or `PUT`
 
 #### Logs
 
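The identification rule for control messages can be sketched as a predicate over already-parsed tags and fields. This is an assumption-laden illustration: the function name is hypothetical, and because the diff spells the method both as `get,put` and as `GET`/`PUT`, the sketch compares case-insensitively rather than picking one spelling.

```python
def is_control_message(tags, fields):
    """True if the message has a string `control` field and a GET/PUT `method` tag."""
    control = fields.get("control")
    method = tags.get("method", "")
    return isinstance(control, str) and method.upper() in {"GET", "PUT"}


# Hypothetical control message for the `freq` knob named in the diff.
tags = {"hostname": "node01", "type": "node", "type-id": "0", "method": "GET"}
fields = {"control": "0"}
print(is_control_message(tags, fields))
```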