From defa8fa994869592467fdf9e6e8b62b12220ea7c Mon Sep 17 00:00:00 2001
From: Jan Eitzinger <jan@moebiusband.org>
Date: Mon, 23 Feb 2026 08:45:49 +0100
Subject: [PATCH] Update ReleaseNotes for v1.5.0

---
 ReleaseNotes.md | 105 ++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 97 insertions(+), 8 deletions(-)

diff --git a/ReleaseNotes.md b/ReleaseNotes.md
index 7ea43620..1178ca8e 100644
--- a/ReleaseNotes.md
+++ b/ReleaseNotes.md
@@ -16,19 +16,25 @@ For release specific notes visit the [ClusterCockpit Documentation](https://clus
 - **Removed `disable-archive` option**: This obsolete configuration option has been removed.
 - **Removed `clusters` config section**: The separate clusters configuration section
   has been removed. Cluster information is now derived from the job archive.
-- **`apiAllowedIPs` is now optional**: If not specified, defaults to secure settings.
+- **`apiAllowedIPs` is now optional**: If not specified, API access is not
+  restricted by IP address.
 
 ### Architecture changes
 
+- **Web framework replaced**: Migrated from `gorilla/mux` to `chi` as the HTTP
+  router. This should be transparent to users but affects how middleware and
+  routes are composed. A proper 404 handler is now in place.
 - **MetricStore moved**: The `metricstore` package has been moved from `internal/`
   to `pkg/` as it is now part of the public API.
 - **MySQL/MariaDB support removed**: Only SQLite is now supported as the database backend.
 - **Archive to Cleanup renaming**: Archive-related functions have been refactored
   and renamed to "Cleanup" for clarity.
+- **`minRunningFor` filter removed**: This undocumented filter has been removed
+  from the API and frontend.
 
 ### Dependency changes
 
-- **cc-lib v2**: Switched to cc-lib version 2 with updated APIs
+- **cc-lib v2.5.1**: Switched to cc-lib version 2 with updated APIs (currently at v2.5.1)
 - **cclib NATS client**: Now using the cclib NATS client implementation
 - Removed obsolete `util.Float` usage from cclib
 
@@ -51,13 +57,30 @@ For release specific notes visit the [ClusterCockpit Documentation](https://clus
 
 - **Node state tracking**: New node table in database with timestamp tracking
 - **Node state filtering**: Filter jobs by node state in systems view
-- **Node metrics improvements**: Better handling of node-level metrics and data
 - **Node list enhancements**: Improved paging, filtering, and continuous scroll support
+- **Node state retention and archiving**: Node state data is now subject to configurable
+  retention policies and can be archived to Parquet format for long-term storage
+- **Faulty node metric tracking**: Faulty node state metric lists are persisted to the database
+
+### Health Monitoring
+
+- **Health status dashboard**: New dedicated "Health" tab in the status details view
+  showing per-node metric health across the cluster
+- **CCMS health check**: Support for querying health status of external
+  cc-metric-store (CCMS) instances via the API
+- **GraphQL health endpoints**: New GraphQL queries and resolvers for health data
+- **Cluster/subcluster filter**: Filter health status view by cluster or subcluster
+
+### Log Viewer
+
+- **Web-based log viewer**: New log viewer page in the admin interface for inspecting
+  backend log output directly from the browser without shell access
+- **Accessible from header**: Quick access link from the navigation header
 
 ### MetricStore Improvements
 
 - **Memory tracking worker**: New worker for CCMS memory usage tracking
-- **Dynamic retention**: Support for cluster/subcluster-specific retention times
+- **Dynamic retention**: Support for job-specific dynamic retention times
 - **Improved compression**: Transparent compression for job archive imports
 - **Parallel processing**: Parallelized Iter function in all archive backends
 
@@ -65,15 +88,32 @@ For release specific notes visit the [ClusterCockpit Documentation](https://clus
 
 - **Job tagger option**: Enable automatic job tagging via configuration flag
 - **Application detection**: Automatic detection of applications (MATLAB, GROMACS, etc.)
-- **Job classifaction**: Automatic detection of pathological jobs
+- **Job classification**: Automatic detection of pathological jobs
 - **omitTagged flag**: Option to exclude tagged jobs from retention/cleanup operations
+- **Admin UI trigger**: Taggers can be run on-demand from the admin web interface
+  without restarting the backend
 
 ### Archive Backends
 
+- **Parquet archive format**: New Parquet file format for job archiving, providing
+  columnar storage with efficient compression for analytical workloads
 - **S3 backend**: Full support for S3-compatible object storage
 - **SQLite backend**: Full support for SQLite backend using blobs
 - **Performance improvements**: Fixed performance bugs in archive backends
 - **Better error handling**: Improved error messages and fallback handling
+- **Zstd compression**: Parquet writers now use zstd, which gives better
+  compression ratios than the previous snappy default
+- **Optimized sort order**: Job and nodestate Parquet files are sorted by
+  cluster, subcluster, and start time for efficient range queries
+
+### Unified Archive Retention and Format Conversion
+
+- **Uniform retention policy**: Job archive retention now supports both JSON and
+  Parquet as target formats under a single, consistent policy configuration
+- **Archive manager tool**: The `tools/archive-manager` utility now supports
+  format conversion between JSON and Parquet job archives
+- **Parquet reader**: Full Parquet archive reader implementation for reading back
+  archived job data
 
 ## New features and improvements
 
@@ -85,6 +125,14 @@ For release specific notes visit the [ClusterCockpit Documentation](https://clus
 - **Filter presets**: Move list filter preset to URL for easy sharing
 - **Job comparison**: Improved job comparison views and plots
 - **Subcluster reactivity**: Job list now reacts to subcluster filter changes
+- **Short jobs quick selection**: New "Short jobs" quick-filter button in job lists
+  replaces the removed undocumented `minRunningFor` filter
+- **Row plot cursor sync**: Cursor position is now synchronized across all metric
+  plots in a job list row for easier cross-metric comparison
+- **Disabled metrics handling**: Improved handling and display of disabled metrics
+  across job view, node view, and list rows
+- **"Not configured" info cards**: Informational cards shown when optional features
+  are not yet configured
 - **Frontend dependencies**: Bumped frontend dependencies to latest versions
 - **Svelte 5 compatibility**: Fixed Svelte state warnings and compatibility issues
 
@@ -95,6 +143,15 @@ For release specific notes visit the [ClusterCockpit Documentation](https://clus
 - **Graceful shutdown**: Fixed shutdown timeout bugs and hanging issues
 - **Configuration defaults**: Sensible defaults for most configuration options
 - **Documentation**: Extensive documentation improvements across packages
+- **Server flag in systemd unit**: Example systemd unit now includes the `-server` flag
+
+### Security
+
+- **LDAP security hardening**: Improved input validation, connection handling, and
+  error reporting in the LDAP authenticator
+- **OIDC security hardening**: Stricter token validation and improved error handling
+  in the OIDC authenticator
+- **Auth schema extensions**: Additional schema fields for improved auth configuration
 
 ### API improvements
 
@@ -102,6 +159,8 @@ For release specific notes visit the [ClusterCockpit Documentation](https://clus
 - **Job exclusivity filter**: New filter for exclusive vs. shared jobs
 - **Improved error messages**: Better error messages and documentation in REST API
 - **GraphQL enhancements**: Improved GraphQL queries and resolvers
+- **Stop job lookup order**: Reversed lookup order in stop job requests for
+  more reliable job matching (cluster+jobId first, then jobId alone)
 
 ### Performance
 
@@ -109,13 +168,17 @@ For release specific notes visit the [ClusterCockpit Documentation](https://clus
 - **Job cache**: Introduced caching table for faster job inserts
 - **Parallel imports**: Archive imports now run in parallel where possible
 - **External tool integration**: Optimized use of external tools (fd) for better performance
+- **Node repository queries**: Reviewed and optimized node repository SQL queries
+- **Buffer pool**: Resized and pooled internal buffers for better memory reuse
 
 ### Developer experience
 
 - **AI agent guidelines**: Added documentation for AI coding agents (AGENTS.md, CLAUDE.md)
 - **Example API payloads**: Added example JSON API payloads for testing
-- **Unit tests**: Added more unit tests for NATS API and other components
-- **Test improvements**: Better test coverage and test data
+- **Unit tests**: Added more unit tests for NATS API, node repository, and other components
+- **Test improvements**: Better test coverage; test DB is now copied before unit tests
+  to avoid state pollution between test runs
+- **Parquet writer tests**: Comprehensive tests for Parquet archive writing and conversion
 
 ## Bug fixes
 
@@ -132,6 +195,16 @@ For release specific notes visit the [ClusterCockpit Documentation](https://clus
 - Fixed polar plot data query decoupling
 - Fixed missing resolution parameter handling
 - Fixed node table initialization fallback
+- Fixed reactivity key placement in nodeList
+- Fixed nodeList resolver data handling and increased nodestate filter cutoff
+- Fixed jobs not always being transferred to the main job table before archiving
+- Fixed AppTagger error handling and logging
+- Fixed log endpoint formatting and correctness
+- Fixed automatic refresh in metric status tab
+- Fixed NULL value handling in `health_state` and `health_metrics` columns
+- Fixed bugs related to `job_cache` IDs being used in the main job table
+- Fixed SyncJobs bug causing start job hooks to be called with wrong (cache) IDs
+- Fixed 404 handler route for sub-routers
 
 ## Configuration changes
 
@@ -167,6 +240,20 @@ For release specific notes visit the [ClusterCockpit Documentation](https://clus
       "interval": "48h",
       "directory": "./var/archive"
     }
+  },
+  "archive": {
+    "retention": {
+      "policy": "delete",
+      "age": "6months",
+      "target-format": "parquet"
+    }
+  },
+  "nodestate": {
+    "retention": {
+      "policy": "archive",
+      "age": "30d",
+      "archive-path": "./var/nodestate-archive"
+    }
   }
 }
 ```
@@ -178,11 +265,13 @@ For release specific notes visit the [ClusterCockpit Documentation](https://clus
 - If using S3 archive backend, configure the new `archive` section options
 - Test the new public dashboard at `/public` route
 - Review cron worker configuration if you need different frequencies
+- If using the archive retention feature, configure the `target-format` option
+  to choose between `json` (default) and `parquet` output formats
+- Consider enabling nodestate retention if you track node states over time
 
 ## Known issues
 
 - Currently energy footprint metrics of type energy are ignored for calculating
   total energy.
-- Resampling for running jobs only works with cc-metric-store
 - With energy footprint metrics of type power the unit is ignored and it is
   assumed the metric has the unit Watt.