Commit Graph

56 Commits

Author SHA1 Message Date
moebiusband 26982088c3 Consolidate code for external and internal ccms buildQueries function (Entire-Checkpoint: fc3be444ef4c) 2026-03-04 16:43:05 +01:00
moebiusband 67a17b5306 Reduce noise in info log 2026-03-04 15:14:35 +01:00
moebiusband 39635ea123 Cleanup metricstore options (Entire-Checkpoint: 2f9a4e1c2e87) 2026-03-04 10:37:43 +01:00
Aditya Ujeniya 74ab51f409 Patch bufferPool with no limits to pool size 2026-03-03 09:51:04 +01:00
moebiusband 688ad507a2 Merge branch 'optimize-checkpoint-wal' into dev 2026-03-03 06:58:28 +01:00
Christoph Kluge 718ff60221 clarify ccms logs 2026-03-02 16:24:38 +01:00
Aditya Ujeniya a243e17499 Update to shutdown worker for WAL checkpointing mode 2026-03-02 15:27:06 +01:00
moebiusband 1ec41d8389 Review and improve buffer pool implementation. Add unit tests. 2026-02-28 19:34:33 +01:00
moebiusband 888d7fb235 Merge branch 'optimize-checkpoint-wal' of github.com:ClusterCockpit/cc-backend into optimize-checkpoint-wal 2026-02-27 17:40:34 +01:00
moebiusband adebffd251 Replace the old zip archive options for the metricstore node data by parquet files 2026-02-27 17:40:32 +01:00
Aditya Ujeniya 2e5d85c223 Update testcase 2026-02-27 15:09:06 +01:00
Aditya Ujeniya 07b989cb81 Add new bufferPool implementation 2026-02-27 14:44:32 +01:00
moebiusband a418abc7d5 Run go fix 2026-02-27 14:40:26 +01:00
moebiusband a1db8263d7 Document line protocol. Optimize REST writeMetric path 2026-02-27 12:30:27 +01:00
moebiusband 4c3cd8e66a Merge branch 'dev' into optimize-checkpoint-wal 2026-02-27 09:30:32 +01:00
moebiusband 6ecb934967 Switch to CC line-protocol package. Update cc-lib. 2026-02-27 08:55:33 +01:00
moebiusband ca0f9a42c7 Introduce metric store binary checkpoints with write ahead log 2026-02-26 10:08:40 +01:00
moebiusband cc21e0e62c Make json the default checkpoint format 2026-02-25 07:38:19 +01:00
Christoph Kluge bae7ec11b4 migrate changes from cc-backend PR#364 2026-02-20 15:10:02 +01:00
moebiusband 6035b62734 Run go fix 2026-02-17 21:04:17 +01:00
Aditya Ujeniya 1cf2c41bd7 Resize the buffers and put them into the pool 2026-02-16 18:21:45 +01:00
Aditya Ujeniya 2eeefc2720 Add healthCheck support for external CCMS 2026-02-16 16:57:17 +01:00
moebiusband 865cd3db54 Persist faulty nodestate metric lists to db 2026-02-12 08:48:15 +01:00
moebiusband 8d6c6b819b Update and port to cc-lib 2026-02-11 07:06:06 +01:00
moebiusband a8194de492 Add diagnostic output for healthcheck 2026-02-07 06:17:34 +01:00
moebiusband a8d385a1ee Update HealthCheck again. Still WIP 2026-02-06 16:35:02 +01:00
moebiusband 5579b6f40c Adapt unit test to new API 2026-02-06 16:11:10 +01:00
moebiusband 7123a8c1cc Updated HealthCheck implementation WIP 2026-02-06 16:04:01 +01:00
moebiusband f671d8df90 Add counts in healthcheck for logging output 2026-02-06 09:25:09 +01:00
Aditya Ujeniya fcb37b0367 Update to count healthy metrics 2026-02-06 08:45:36 +01:00
moebiusband 0984c1d431 Add debug log with degrade and missing metrics for healthcheck 2026-02-06 07:21:04 +01:00
moebiusband 5d7dd62b72 Update unit test for new HealthCheck update 2026-02-04 12:53:24 +01:00
moebiusband 46fb52d67e Adopt documentation 2026-02-04 12:30:33 +01:00
Aditya Ujeniya 39b8356683 Optimized CCMS healthcheck 2026-02-04 10:24:45 +01:00
moebiusband 42ce598865 Merge branch 'dev' of github.com:ClusterCockpit/cc-backend into dev 2026-02-03 18:35:35 +01:00
moebiusband 0d62a300e7 Intermediate state of node Healthcheck. TODOS: Remove error handling from routine and simplify API call; Use map for hardware level metrics 2026-02-03 18:35:17 +01:00
Aditya Ujeniya 3cf88f757c Update to checkpoint loader in CCMS 2026-02-03 16:25:48 +01:00
moebiusband 248f11f4f8 Change API of Node HealthState 2026-02-03 14:55:12 +01:00
moebiusband 00a41373e8 Add monitoring healthstate support in nodestate API. 2026-02-03 12:23:24 +01:00
Aditya Ujeniya a71341064e Update to MetricStore HealthCheck API 2026-01-30 23:24:16 +01:00
Christoph Kluge 32f0664012 add indicator to nodeView state, cap bubble size in roofline 2026-01-30 14:32:41 +01:00
Christoph Kluge 4deec9a170 no append if ErrNoHostOrMetric fired 2026-01-29 15:18:50 +01:00
Aditya Ujeniya 7101d2bb3b Handle the metric/host not found case differently 2026-01-28 17:47:38 +01:00
moebiusband 0d857b49a2 Disable explicit GC calls 2026-01-28 11:21:27 +01:00
moebiusband eb5aa9ad02 Disable explicit GC calls 2026-01-28 11:21:02 +01:00
moebiusband 9d15a87c88 Take into account the real allocated heap memory in MemoryUsageTracker 2026-01-27 18:23:09 +01:00
moebiusband bbde91a1f9 Move wg increment inside goroutines. Make GC calls less aggressive 2026-01-27 17:25:29 +01:00
moebiusband 55cb2cb6d6 Prevent file not closed on error in avro checkpoint 2026-01-27 17:10:26 +01:00
moebiusband 752e19c276 Pull out metric List build from metricstore Init 2026-01-27 17:06:52 +01:00
moebiusband b307e885ce feat: Add support for multiple external metric stores 2026-01-27 10:02:07 +01:00