Update config for v1.5.0

2026-03-17 22:17:30 +01:00 · 2026-03-04 14:57:11 +01:00
parent 8c8c40b547
commit bd22bfe5e6
13 changed files with 4091 additions and 6825 deletions
--- a/nhr@fau/README.md
+++ b/nhr@fau/README.md
@@ -9,25 +9,15 @@ You can find an overview about all clusters

 Some systems run with job exclusive nodes, others have node sharing enabled.
 There are CPU systems (Fritz, Meggie, Woody, TinyFat) as well as GPU accelerated
-clusters (Alex, TinyGPU).
+clusters (Alex, Helma, TinyGPU).

 NHR@FAU uses the following stack:

-* `cc-metric-collector` as node agent
-* `cc-metric-store` as temporal metric time series cache. We use one instance
-for all clusters.
+* `cc-metric-collector`
 * `cc-backend`
-* A homegrown python script running on the management nodes for providing job
-meta data from Slurm
-* Builtin sqlite database for job meta and user data (currently 50GB large)
-* Job Archive without retention using compressed data.json files (around 700GB)
+* `cc-slurm-adapter`

-Currently all API use regular HTTP protocol, but we plan to switch to NATS for
-all communication.
-We also push the metric data to an InfluxDB instance for debugging purposes.
-
-The backend and metric store run on the same dedicated Dell server running
-Ubuntu Linux:
+We use the following server with Ubuntu Linux:

 * Two Intel Xeon(R) Platinum 8352Y with 32 cores each
 * 512 GB Main memory capacity