mirror of
https://github.com/ClusterCockpit/cc-metric-store.git
synced 2026-03-13 03:57:30 +01:00
Update README
Entire-Checkpoint: dd6b5959d7c6
224
README.md
@@ -5,18 +5,20 @@
|
|||||||
The cc-metric-store provides a simple in-memory time series database for storing
|
The cc-metric-store provides a simple in-memory time series database for storing
|
||||||
metrics of cluster nodes at preconfigured intervals. It is meant to be used as
|
metrics of cluster nodes at preconfigured intervals. It is meant to be used as
|
||||||
part of the [ClusterCockpit suite](https://github.com/ClusterCockpit). As all
|
part of the [ClusterCockpit suite](https://github.com/ClusterCockpit). As all
|
||||||
data is kept in-memory (but written to disk as compressed JSON for long term
|
data is kept in-memory, accessing it is very fast. It also provides topology aware
|
||||||
storage), accessing it is very fast. It also provides topology aware
|
|
||||||
aggregations over time _and_ nodes/sockets/cpus.
|
aggregations over time _and_ nodes/sockets/cpus.
|
||||||
|
|
||||||
There are major limitations: Data only gets written to disk at periodic
|
There are major limitations: Data only gets written to disk at periodic
|
||||||
checkpoints, not as soon as it is received. Also only the fixed configured
|
checkpoints, not immediately as it is received (unless the WAL format is used).
|
||||||
duration is stored and available.
|
Only the configured retention window is kept in memory.
|
||||||
|
Still, metric data is kept as long as running jobs are using it.
|
||||||
|
|
||||||
Go look at the [GitHub
|
The storage engine is provided by the
|
||||||
Issues](https://github.com/ClusterCockpit/cc-metric-store/issues) for a progress
|
[cc-backend](https://github.com/ClusterCockpit/cc-backend) package
|
||||||
overview. The [NATS.io](https://nats.io/) based writing endpoint consumes messages in [this
|
(`cc-backend/pkg/metricstore`). This repository provides the HTTP API wrapper.
|
||||||
format of the InfluxDB line
|
|
||||||
|
The [NATS.io](https://nats.io/) based writing endpoint and the HTTP write
|
||||||
|
endpoint both consume messages in [this format of the InfluxDB line
|
||||||
protocol](https://github.com/ClusterCockpit/cc-specifications/blob/master/metrics/lineprotocol_alternative.md).
|
protocol](https://github.com/ClusterCockpit/cc-specifications/blob/master/metrics/lineprotocol_alternative.md).
|
||||||
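As a rough illustration of the write format, the following Python sketch assembles one line-protocol message. The tag names (`cluster`, `hostname`, `type`, `type-id`) and the single `value` field are assumptions drawn from the linked specification, so check it before relying on them. `format_metric_line` is a hypothetical helper and not part of cc-metric-store.

```python
def format_metric_line(metric, value, timestamp, cluster, hostname,
                       type_=None, type_id=None):
    """Build one InfluxDB-line-protocol message (hypothetical helper).

    Assumed layout: <metric>,<tags> value=<float> <unix seconds>
    """
    tags = [("cluster", cluster), ("hostname", hostname)]
    if type_ is not None:
        tags.append(("type", type_))
        if type_id is not None:
            tags.append(("type-id", str(type_id)))
    tag_str = ",".join(f"{k}={v}" for k, v in tags)
    return f"{metric},{tag_str} value={float(value)} {int(timestamp)}"


line = format_metric_line("flops_any", 42, 1700000000,
                          cluster="cluster1", hostname="host1",
                          type_="cpu", type_id=0)
print(line)
```

The resulting string could be sent as the body of a write request or published to the NATS subject the store subscribes to.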
|
|
||||||
## Building
|
## Building
|
||||||
@@ -24,22 +26,47 @@ protocol](https://github.com/ClusterCockpit/cc-specifications/blob/master/metric
|
|||||||
`cc-metric-store` can be built using the provided `Makefile`.
|
`cc-metric-store` can be built using the provided `Makefile`.
|
||||||
It supports the following targets:
|
It supports the following targets:
|
||||||
|
|
||||||
- `make`: Build the application, copy a example configuration file and generate
|
- `make`: Build the application, copy an example configuration file and generate
|
||||||
checkpoint folders if required.
|
checkpoint folders if required.
|
||||||
- `make clean`: Clean the golang build cache and application binary
|
- `make clean`: Clean the golang build cache and application binary
|
||||||
- `make distclean`: In addition to the clean target also remove the `./var`
|
- `make distclean`: In addition to the clean target also remove the `./var`
|
||||||
folder
|
folder and `config.json`
|
||||||
- `make swagger`: Regenerate the Swagger files from the source comments.
|
- `make swagger`: Regenerate the Swagger files from the source comments.
|
||||||
- `make test`: Run test and basic checks.
|
- `make test`: Run tests and basic checks (`go build`, `go vet`, `go test`).
|
||||||
|
|
||||||
|
## Running
|
||||||
|
|
||||||
|
```sh
|
||||||
|
./cc-metric-store # Uses ./config.json
|
||||||
|
./cc-metric-store -config /path/to/config.json
|
||||||
|
./cc-metric-store -dev # Enable Swagger UI at /swagger/
|
||||||
|
./cc-metric-store -loglevel debug # debug|info|warn (default)|err|crit
|
||||||
|
./cc-metric-store -logdate # Add date and time to log messages
|
||||||
|
./cc-metric-store -version # Show version information and exit
|
||||||
|
./cc-metric-store -gops # Enable gops agent for debugging
|
||||||
|
```
|
||||||
|
|
||||||
## REST API Endpoints
|
## REST API Endpoints
|
||||||
|
|
||||||
The REST API is documented in [swagger.json](./api/swagger.json). You can
|
The REST API is documented in [swagger.json](./api/swagger.json). You can
|
||||||
explore and try the REST API using the integrated [SwaggerUI web
|
explore and try the REST API using the integrated [SwaggerUI web
|
||||||
interface](http://localhost:8082/swagger).
|
interface](http://localhost:8082/swagger/) (requires the `-dev` flag).
|
||||||
|
|
||||||
For more information on the `cc-metric-store` REST API have a look at the
|
For more information on the `cc-metric-store` REST API have a look at the
|
||||||
ClusterCockpit documentation [website](https://clustercockpit.org/docs/reference/cc-metric-store/ccms-rest-api/)
|
ClusterCockpit documentation [website](https://clustercockpit.org/docs/reference/cc-metric-store/ccms-rest-api/).
|
||||||
|
|
||||||
|
All endpoints support both trailing-slash and non-trailing-slash variants:
|
||||||
|
|
||||||
|
| Method | Path | Description |
|
||||||
|
| ------ | ------------------- | -------------------------------------- |
|
||||||
|
| `GET` | `/api/query/` | Query metrics with selectors |
|
||||||
|
| `POST` | `/api/write/` | Write metrics (InfluxDB line protocol) |
|
||||||
|
| `POST` | `/api/free/` | Free buffers up to a timestamp |
|
||||||
|
| `GET` | `/api/debug/` | Dump internal state |
|
||||||
|
| `GET` | `/api/healthcheck/` | Check node health status |
|
||||||
|
|
||||||
|
If `jwt-public-key` is set in `config.json`, all endpoints require JWT
|
||||||
|
authentication using an Ed25519 key (`Authorization: Bearer <token>`).
|
||||||
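A minimal sketch of how a client might attach the bearer token, using only the Python standard library. The base URL assumes the default `addr` from this README, and `build_request` is a hypothetical helper. The request is only prepared here, not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8082"  # assumption: default listen address


def build_request(path, token, payload=None, method="GET"):
    """Prepare (but do not send) an authenticated request."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(BASE_URL + path, data=data, method=method)
    # The store expects the JWT in the Authorization header.
    req.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


req = build_request("/api/debug/", token="<token>")
print(req.get_method(), req.full_url, req.get_header("Authorization"))
```

Passing the prepared object to `urllib.request.urlopen(req)` would perform the call against a running instance.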
|
|
||||||
## Run tests
|
## Run tests
|
||||||
|
|
||||||
@@ -60,11 +87,11 @@ go test -bench=. -race -v ./...
|
|||||||
|
|
||||||
The cc-metric-store works as a time-series database and uses the InfluxDB line
|
The cc-metric-store works as a time-series database and uses the InfluxDB line
|
||||||
protocol as input format. Unlike InfluxDB, the data is indexed by one single
|
protocol as input format. Unlike InfluxDB, the data is indexed by one single
|
||||||
strictly hierarchical tree structure. A selector is build out of the tags in the
|
strictly hierarchical tree structure. A selector is built out of the tags in the
|
||||||
InfluxDB line protocol, and can be used to select a node (not in the sense of a
|
InfluxDB line protocol, and can be used to select a node (not in the sense of a
|
||||||
compute node, can also be a socket, cpu, ...) in that tree. The implementation
|
compute node, can also be a socket, cpu, ...) in that tree. The implementation
|
||||||
calls those nodes `level` to avoid confusion. It is impossible to access data
|
calls those nodes `level` to avoid confusion. It is impossible to access data
|
||||||
only by knowing the _socket_ or _cpu_ tag, all higher up levels have to be
|
only by knowing the _socket_ or _cpu_ tag — all higher up levels have to be
|
||||||
specified as well.
|
specified as well.
|
||||||
|
|
||||||
This is what the hierarchy currently looks like:
|
This is what the hierarchy currently looks like:
|
||||||
@@ -90,18 +117,154 @@ Example selectors:
|
|||||||
|
|
||||||
1. `["cluster1", "host1", "cpu0"]`: Select only the cpu0 of host1 in cluster1
|
1. `["cluster1", "host1", "cpu0"]`: Select only the cpu0 of host1 in cluster1
|
||||||
2. `["cluster1", "host1", ["cpu4", "cpu5", "cpu6", "cpu7"]]`: Select only CPUs 4-7 of host1 in cluster1
|
2. `["cluster1", "host1", ["cpu4", "cpu5", "cpu6", "cpu7"]]`: Select only CPUs 4-7 of host1 in cluster1
|
||||||
3. `["cluster1", "host1"]`: Select the complete node. If querying for a CPU-specific metric such as floats, all CPUs are implied
|
3. `["cluster1", "host1"]`: Select the complete node. If querying for a CPU-specific metric such as flops, all CPUs are implied
|
||||||
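To make the selector semantics concrete, here is a toy model in Python. It is not the store's implementation: the tree is a plain nested dictionary, the names are made up, and `resolve` is a hypothetical function that walks the hierarchy one selector element at a time.

```python
def resolve(tree, selector):
    """Return the paths of all levels matched by a selector.

    Each selector element is either one name or a list of names;
    every level from the root down must be given.
    """
    matches = [((), tree)]
    for element in selector:
        names = element if isinstance(element, list) else [element]
        matches = [
            (path + (name,), node[name])
            for path, node in matches
            for name in names
            if name in node
        ]
    return [path for path, _ in matches]


# Toy hierarchy: cluster -> host -> cpu (names are illustrative only).
tree = {"cluster1": {"host1": {f"cpu{i}": {} for i in range(8)}}}

print(resolve(tree, ["cluster1", "host1", "cpu0"]))
print(resolve(tree, ["cluster1", "host1", ["cpu4", "cpu5"]]))
print(resolve(tree, ["host1"]))  # empty: the cluster level is missing
```

The last call shows why a lower level cannot be addressed on its own: the walk always starts at the root.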
|
|
||||||
## Config file
|
## Config file
|
||||||
|
|
||||||
You find the configuration options on the ClusterCockpit [website](https://clustercockpit.org/docs/reference/cc-metric-store/ccms-configuration/).
|
The config file is a JSON document with four top-level sections.
|
||||||
|
|
||||||
|
### `main`
|
||||||
|
|
||||||
|
```json
|
||||||
|
"main": {
|
||||||
|
"addr": "0.0.0.0:8082",
|
||||||
|
"https-cert-file": "",
|
||||||
|
"https-key-file": "",
|
||||||
|
"jwt-public-key": "<base64-encoded Ed25519 public key>",
|
||||||
|
"user": "",
|
||||||
|
"group": "",
|
||||||
|
"backend-url": ""
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
- `addr`: Address and port to listen on (default: `0.0.0.0:8082`)
|
||||||
|
- `https-cert-file` / `https-key-file`: Paths to TLS certificate/key for HTTPS
|
||||||
|
- `jwt-public-key`: Base64-encoded Ed25519 public key for JWT authentication. If empty, no auth is required.
|
||||||
|
- `user` / `group`: Drop privileges to this user/group after startup
|
||||||
|
- `backend-url`: Optional URL of a cc-backend instance used as node provider
|
||||||
|
|
||||||
|
### `metrics`
|
||||||
|
|
||||||
|
Per-metric configuration. Each key is the metric name:
|
||||||
|
|
||||||
|
```json
|
||||||
|
"metrics": {
|
||||||
|
"cpu_load": { "frequency": 60, "aggregation": null },
|
||||||
|
"flops_any": { "frequency": 60, "aggregation": "sum" },
|
||||||
|
"cpu_user": { "frequency": 60, "aggregation": "avg" }
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
- `frequency`: Sampling interval in seconds
|
||||||
|
- `aggregation`: How to aggregate sub-level data: `"sum"`, `"avg"`, or `null` (no aggregation)
|
||||||
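The effect of the `aggregation` setting can be pictured with a small Python sketch. It assumes all sub-level series are equally long and aligned in time, and it treats `null` as "cannot be combined", which is an assumption about the behaviour rather than something this README states. `aggregate` is a hypothetical helper.

```python
def aggregate(series_per_child, mode):
    """Combine equally long child series point by point.

    mode is "sum", "avg" or None, mirroring the `aggregation` setting.
    """
    if mode is None:
        # Assumed behaviour: without an aggregation, sub-levels are not merged.
        raise ValueError("metric has no aggregation; query the sub-levels directly")
    columns = list(zip(*series_per_child))
    if mode == "sum":
        return [sum(col) for col in columns]
    if mode == "avg":
        return [sum(col) / len(col) for col in columns]
    raise ValueError(f"unknown aggregation: {mode!r}")


cpu0 = [1.0, 2.0, 3.0]
cpu1 = [3.0, 4.0, 5.0]
print(aggregate([cpu0, cpu1], "sum"))  # [4.0, 6.0, 8.0]
print(aggregate([cpu0, cpu1], "avg"))  # [2.0, 3.0, 4.0]
```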
|
|
||||||
|
### `metric-store`
|
||||||
|
|
||||||
|
```json
|
||||||
|
"metric-store": {
|
||||||
|
"checkpoints": {
|
||||||
|
"file-format": "wal",
|
||||||
|
"directory": "./var/checkpoints"
|
||||||
|
},
|
||||||
|
"memory-cap": 100,
|
||||||
|
"retention-in-memory": "24h",
|
||||||
|
"num-workers": 0,
|
||||||
|
"cleanup": {
|
||||||
|
"mode": "archive",
|
||||||
|
"directory": "./var/archive"
|
||||||
|
},
|
||||||
|
"nats-subscriptions": [
|
||||||
|
{ "subscribe-to": "hpc-nats", "cluster-tag": "fritz" }
|
||||||
|
]
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
- `checkpoints.file-format`: Checkpoint format: `"json"` (default, human-readable) or `"wal"` (binary WAL, crash-safe). See [Checkpoint formats](#checkpoint-formats) below.
|
||||||
|
- `checkpoints.directory`: Root directory for checkpoint files (organized as `<dir>/<cluster>/<host>/`)
|
||||||
|
- `memory-cap`: Approximate memory cap in MB for metric buffers
|
||||||
|
- `retention-in-memory`: How long to keep data in memory (e.g. `"48h"`)
|
||||||
|
- `num-workers`: Number of parallel workers for checkpoint/archive I/O (0 = auto, capped at 10)
|
||||||
|
- `cleanup.mode`: What to do with data older than `retention-in-memory`: `"archive"` (write Parquet) or `"delete"`
|
||||||
|
- `cleanup.directory`: Root directory for Parquet archive files (required when `mode` is `"archive"`)
|
||||||
|
- `nats-subscriptions`: List of NATS subjects to subscribe to, with associated cluster tag
|
||||||
|
|
||||||
|
### Checkpoint formats
|
||||||
|
|
||||||
|
The `checkpoints.file-format` field controls how in-memory data is persisted to disk.
|
||||||
|
|
||||||
|
**`"json"` (default)** — human-readable JSON snapshots written periodically. Each
|
||||||
|
snapshot is stored as `<dir>/<cluster>/<host>/<timestamp>.json` and contains the
|
||||||
|
full metric hierarchy. Easy to inspect and recover manually, but larger on disk
|
||||||
|
and slower to write.
|
||||||
|
|
||||||
|
**`"wal"`** — binary Write-Ahead Log format designed for crash safety. Two file
|
||||||
|
types are used per host:
|
||||||
|
|
||||||
|
- `current.wal` — append-only binary log. Every incoming data point is appended
|
||||||
|
immediately (magic `0xCC1DA7A1`, 4-byte CRC32 per record). Truncated trailing
|
||||||
|
records from unclean shutdowns are silently skipped on restart.
|
||||||
|
- `<timestamp>.bin` — binary snapshot written at each checkpoint interval
|
||||||
|
(magic `0xCC5B0001`). Contains the complete hierarchical metric state
|
||||||
|
column-by-column. Written atomically via a `.tmp` rename.
|
||||||
|
|
||||||
|
On startup the most recent `.bin` snapshot is loaded, then any remaining WAL
|
||||||
|
entries are replayed on top. The WAL is rotated (old file deleted, new one
|
||||||
|
started) after each successful snapshot.
|
||||||
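The idea of an append-only log with a per-record CRC32 that tolerates a truncated tail can be sketched in a few lines of Python. The record layout used here (length prefix, payload, checksum, with the magic number once at the start of the file) is invented for illustration and is not the actual on-disk format of `current.wal`.

```python
import io
import struct
import zlib

MAGIC = 0xCC1DA7A1  # value quoted above; its placement in the file is assumed


def append_record(buf, payload):
    """Append <u32 length><payload><u32 crc32(payload)> (hypothetical layout)."""
    buf.write(struct.pack("<I", len(payload)))
    buf.write(payload)
    buf.write(struct.pack("<I", zlib.crc32(payload)))


def replay(data):
    """Return all intact payloads; stop at the first truncated or corrupt record."""
    records = []
    if len(data) < 4 or struct.unpack_from("<I", data, 0)[0] != MAGIC:
        return records
    pos = 4
    while pos + 4 <= len(data):
        (length,) = struct.unpack_from("<I", data, pos)
        end = pos + 4 + length + 4
        if end > len(data):
            break  # truncated tail, e.g. from an unclean shutdown
        payload = data[pos + 4:pos + 4 + length]
        (crc,) = struct.unpack_from("<I", data, pos + 4 + length)
        if crc != zlib.crc32(payload):
            break  # checksum mismatch: do not trust anything after this point
        records.append(payload)
        pos = end
    return records


wal = io.BytesIO()
wal.write(struct.pack("<I", MAGIC))
append_record(wal, b"cpu_load host1 1.5")
append_record(wal, b"cpu_load host1 1.7")
intact = wal.getvalue()
print(replay(intact))       # both records
print(replay(intact[:-3]))  # the cut-off last record is skipped
```

In a restart sequence like the one described above, such a replay would run after the latest snapshot has been loaded.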
|
|
||||||
|
The `"wal"` option is the default and will be the only supported option in the
|
||||||
|
future. The `"json"` checkpoint format is still provided to migrate from
|
||||||
|
previous cc-metric-store versions.
|
||||||
|
|
||||||
|
### Parquet archive
|
||||||
|
|
||||||
|
When `cleanup.mode` is `"archive"`, data that ages out of the in-memory
|
||||||
|
retention window is written to [Apache Parquet](https://parquet.apache.org/)
|
||||||
|
files before being freed. Files are organized as:
|
||||||
|
|
||||||
|
```
|
||||||
|
<cleanup.directory>/
|
||||||
|
<cluster>/
|
||||||
|
<timestamp>.parquet
|
||||||
|
```
|
||||||
|
|
||||||
|
One Parquet file is produced per cluster per cleanup run, consolidating all
|
||||||
|
hosts. Rows use a long (tidy) schema:
|
||||||
|
|
||||||
|
| Column | Type | Description |
|
||||||
|
| ----------- | ------- | ----------------------------------------------------------------------- |
|
||||||
|
| `cluster` | string | Cluster name |
|
||||||
|
| `hostname` | string | Host name |
|
||||||
|
| `metric` | string | Metric name |
|
||||||
|
| `scope` | string | Hardware scope (`node`, `socket`, `core`, `hwthread`, `accelerator`, …) |
|
||||||
|
| `scope_id` | string | Numeric ID within the scope (e.g. `"0"`) |
|
||||||
|
| `timestamp` | int64 | Unix timestamp (seconds) |
|
||||||
|
| `frequency` | int64 | Sampling interval in seconds |
|
||||||
|
| `value` | float32 | Metric value |
|
||||||
|
|
||||||
|
Files are compressed with Zstandard and sorted by `(cluster, hostname, metric,
|
||||||
|
timestamp)` for efficient columnar reads. The `cpu` prefix in the tree is
|
||||||
|
treated as an alias for the `hwthread` scope.
|
||||||
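The long (tidy) layout and the sort order can be illustrated without a Parquet library. The Python sketch below expands two made-up series into one row per sample and sorts them by the key named above. `to_rows` and `sort_key` are hypothetical helpers, and the values are invented.

```python
COLUMNS = ("cluster", "hostname", "metric", "scope", "scope_id",
           "timestamp", "frequency", "value")


def to_rows(cluster, hostname, metric, scope, scope_id, start, frequency, values):
    """Expand one series into one row per sample (long/tidy layout)."""
    return [
        dict(zip(COLUMNS, (cluster, hostname, metric, scope, scope_id,
                           start + i * frequency, frequency, v)))
        for i, v in enumerate(values)
    ]


def sort_key(row):
    """Sort order described above: (cluster, hostname, metric, timestamp)."""
    return (row["cluster"], row["hostname"], row["metric"], row["timestamp"])


rows = (to_rows("fritz", "host2", "flops_any", "socket", "0", 1700000000, 60, [0.5, 0.7])
        + to_rows("fritz", "host1", "flops_any", "hwthread", "3", 1700000000, 60, [1.0, 2.0]))
rows.sort(key=sort_key)
for row in rows:
    print(row)
```

A real reader would load the archive with a Parquet library and would see the same columns, one sample per row.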
|
|
||||||
|
### `nats`
|
||||||
|
|
||||||
|
```json
|
||||||
|
"nats": {
|
||||||
|
"address": "nats://0.0.0.0:4222",
|
||||||
|
"username": "root",
|
||||||
|
"password": "root"
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The NATS connection is optional. If it is not configured, only the HTTP write endpoint is available.
|
||||||
|
|
||||||
|
For more information see the ClusterCockpit documentation [website](https://clustercockpit.org/docs/reference/cc-metric-store/ccms-configuration/).
|
||||||
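Putting the four sections together, a complete document could be assembled and sanity-checked as in this Python sketch. The checks are illustrative only and cover just two rules stated above. They are not the validation that cc-metric-store itself performs, and `check` is a hypothetical helper.

```python
import json

config = {
    "main": {"addr": "0.0.0.0:8082", "jwt-public-key": ""},
    "metrics": {"flops_any": {"frequency": 60, "aggregation": "sum"}},
    "metric-store": {
        "checkpoints": {"file-format": "wal", "directory": "./var/checkpoints"},
        "retention-in-memory": "24h",
        "cleanup": {"mode": "archive", "directory": "./var/archive"},
    },
    "nats": {"address": "nats://0.0.0.0:4222", "username": "root", "password": "root"},
}


def check(cfg):
    """Return a list of problems found by a few illustrative sanity checks."""
    problems = []
    for name, metric in cfg.get("metrics", {}).items():
        if metric.get("aggregation") not in ("sum", "avg", None):
            problems.append(f"{name}: unknown aggregation")
    cleanup = cfg.get("metric-store", {}).get("cleanup", {})
    if cleanup.get("mode") == "archive" and not cleanup.get("directory"):
        problems.append("cleanup.directory is required when mode is archive")
    return problems


print(check(config))
print(json.dumps(config, indent=2))
```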
|
|
||||||
## Test the complete setup (excluding cc-backend itself)
|
## Test the complete setup (excluding cc-backend itself)
|
||||||
|
|
||||||
There are two ways for sending data to the cc-metric-store, both of which are
|
There are two ways for sending data to the cc-metric-store, both of which are
|
||||||
supported by the
|
supported by the
|
||||||
[cc-metric-collector](https://github.com/ClusterCockpit/cc-metric-collector).
|
[cc-metric-collector](https://github.com/ClusterCockpit/cc-metric-collector).
|
||||||
This example uses NATS, the alternative is to use HTTP.
|
This example uses NATS; the alternative is to use HTTP.
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
# Only needed once, downloads the docker image
|
# Only needed once, downloads the docker image
|
||||||
@@ -142,22 +305,25 @@ for testing:
|
|||||||
```sh
|
```sh
|
||||||
JWT="eyJ0eXAiOiJKV1QiLCJhbGciOiJFZERTQSJ9.eyJ1c2VyIjoiYWRtaW4iLCJyb2xlcyI6WyJST0xFX0FETUlOIiwiUk9MRV9BTkFMWVNUIiwiUk9MRV9VU0VSIl19.d-3_3FZTsadPjDEdsWrrQ7nS0edMAR4zjl-eK7rJU3HziNBfI9PDHDIpJVHTNN5E5SlLGLFXctWyKAkwhXL-Dw"
|
JWT="eyJ0eXAiOiJKV1QiLCJhbGciOiJFZERTQSJ9.eyJ1c2VyIjoiYWRtaW4iLCJyb2xlcyI6WyJST0xFX0FETUlOIiwiUk9MRV9BTkFMWVNUIiwiUk9MRV9VU0VSIl19.d-3_3FZTsadPjDEdsWrrQ7nS0edMAR4zjl-eK7rJU3HziNBfI9PDHDIpJVHTNN5E5SlLGLFXctWyKAkwhXL-Dw"
|
||||||
|
|
||||||
# If the collector and store and nats-server have been running for at least 60 seconds on the same host, you may run:
|
# If the collector and store and nats-server have been running for at least 60 seconds on the same host:
|
||||||
curl -H "Authorization: Bearer $JWT" -D - "http://localhost:8080/api/query" -d "{ \"cluster\": \"testcluster\", \"from\": $(expr $(date +%s) - 60), \"to\": $(date +%s), \"queries\": [{
|
curl -H "Authorization: Bearer $JWT" \
|
||||||
\"metric\": \"load_one\",
|
"http://localhost:8082/api/query/" \
|
||||||
\"host\": \"$(hostname)\"
|
-d '{
|
||||||
}] }"
|
"cluster": "testcluster",
|
||||||
|
"from": '"$(expr $(date +%s) - 60)"',
|
||||||
# ...
|
"to": '"$(date +%s)"',
|
||||||
|
"queries": [{ "metric": "cpu_load", "host": "'"$(hostname)"'" }]
|
||||||
|
}'
|
||||||
```
|
```
|
||||||
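The same query body can be built programmatically. This Python sketch mirrors the fields of the curl example (`cluster`, `from`, `to`, `queries`). `build_query` is a hypothetical helper, and the fixed `now` value is only there to make the output reproducible.

```python
import json
import time


def build_query(cluster, host, metric, window_seconds=60, now=None):
    """Build a query body covering the last `window_seconds` seconds."""
    now = int(time.time()) if now is None else int(now)
    return {
        "cluster": cluster,
        "from": now - window_seconds,
        "to": now,
        "queries": [{"metric": metric, "host": host}],
    }


body = json.dumps(build_query("testcluster", "host1", "cpu_load", now=1700000000))
print(body)
```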
|
|
||||||
For debugging there is a debug endpoint to dump the current content to stdout:
|
For debugging, the debug endpoint dumps the current content to stdout:
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
JWT="eyJ0eXAiOiJKV1QiLCJhbGciOiJFZERTQSJ9.eyJ1c2VyIjoiYWRtaW4iLCJyb2xlcyI6WyJST0xFX0FETUlOIiwiUk9MRV9BTkFMWVNUIiwiUk9MRV9VU0VSIl19.d-3_3FZTsadPjDEdsWrrQ7nS0edMAR4zjl-eK7rJU3HziNBfI9PDHDIpJVHTNN5E5SlLGLFXctWyKAkwhXL-Dw"
|
JWT="eyJ0eXAiOiJKV1QiLCJhbGciOiJFZERTQSJ9.eyJ1c2VyIjoiYWRtaW4iLCJyb2xlcyI6WyJST0xFX0FETUlOIiwiUk9MRV9BTkFMWVNUIiwiUk9MRV9VU0VSIl19.d-3_3FZTsadPjDEdsWrrQ7nS0edMAR4zjl-eK7rJU3HziNBfI9PDHDIpJVHTNN5E5SlLGLFXctWyKAkwhXL-Dw"
|
||||||
|
|
||||||
# If the collector and store and nats-server have been running for at least 60 seconds on the same host, you may run:
|
# Dump everything
|
||||||
curl -H "Authorization: Bearer $JWT" -D - "http://localhost:8080/api/debug"
|
curl -H "Authorization: Bearer $JWT" "http://localhost:8082/api/debug/"
|
||||||
|
|
||||||
# ...
|
# Dump a specific selector (colon-separated path)
|
||||||
|
curl -H "Authorization: Bearer $JWT" "http://localhost:8082/api/debug/?selector=testcluster:host1"
|
||||||
```
|
```
|
||||||
|
|||||||