Merge pull request #532 from ClusterCockpit/hotfix

Remove static linkage for helper tools
5 changed files with 74 additions and 6 deletions
--- a/.goreleaser.yaml
+++ b/.goreleaser.yaml
@@ -5,6 +5,7 @@ before:
 builds:
  - env:
      - CGO_ENABLED=1
+      - CC=x86_64-linux-musl-gcc
    goos:
      - linux
    goarch:
--- a/ReleaseNotes.md
+++ b/ReleaseNotes.md
@@ -10,7 +10,10 @@ If you are upgrading from v1.5.0 you need to do another DB migration. This
 should not take long. For optimal database performance after the migration it is
 recommended to apply the new `optimize-db` flag, which runs the sqlite `ANALYZE`
 and `VACUUM` commands. Depending on your database size (more then 40GB) the
-`VACUUM` may take up to 2h.
+`VACUUM` may take up to 2h. You can also run the `ANALYZE` command manually.
+While we are confident that the memory issue with the metricstore cleanup move
+policy is fixed, it is still recommended to use delete policy for cleanup.
+This is also the default.

 ## Changes in 1.5.2

@@ -19,6 +22,14 @@ and `VACUUM` commands. Depending on your database size (more then 40GB) the
 - **Memory spike in parquet writer**: Fixed memory spikes when using the
  metricstore move (archive) policy with the parquet writer. The writer now
  processes data in a streaming fashion to avoid accumulating large allocations.
+- **Top list query fixes**: Fixed top list queries in analysis and dashboard
+  views.
+- **Exclude down nodes from HealthCheck**: Down nodes are now excluded from
+  health checks in both the REST and NATS handlers.
+- **Node state priority order**: Node state determination now enforces a
+  priority order. Exception: idle+down results in idle.
+- **Blocking ReceiveNats call**: Fixed a blocking NATS receive call in the
+  metricstore.

 ### Database performance

@@ -33,6 +44,16 @@ and `VACUUM` commands. Depending on your database size (more then 40GB) the
  write load.
 - **Increased default SQLite timeout**: The default SQLite connection timeout
  has been raised to reduce spurious timeout errors under load.
+- **Optimized stats queries**: Improved sortby handling in stats queries, fixed
+  cache key passing, and simplified a stats query condition that caused an
+  expensive unnecessary subquery.
+
+### MetricStore performance
+
+- **Sharded WAL consumer**: The WAL consumer is now sharded for significantly
+  higher write throughput.
+- **NATS contention fix**: Fixed contention in the metricstore NATS ingestion
+  path.

 ### NATS API

@@ -52,6 +73,24 @@ and `VACUUM` commands. Depending on your database size (more then 40GB) the
  operation.
 - **Checkpoint archiving log**: Added an informational log message when the
  metricstore checkpoint archiving process runs.
+- **Auth failure context**: Auth failure log messages now include more context
+  information.
+
+### Behavior changes
+
+- **DB-based metricHealth**: Replaced heuristic-based metric health with
+  DB-based metric health for the node view, providing more accurate health
+  status information.
+- **Removed minRunningFor filter remnants**: Cleaned up remaining `minRunningFor`
+  references from the GraphQL schema and query builder.
+
+### Frontend
+
+- **Streamlined statsSeries**: Unified stats series calculation and rendering
+  across plot components.
+- **Clarified plot titles**: Improved titles in dashboard and health views.
+- **Bumped frontend dependencies**: Updated frontend dependencies to latest
+  versions.

 ### Dependencies

@@ -67,7 +106,7 @@ and `VACUUM` commands. Depending on your database size (more then 40GB) the
  running has to be allowed to execute the journalctl command.
 - The user configuration keys for the ui have changed. Therefore old user
  configuration persisted in the database is not used anymore. It is recommended
-  to configure the metrics shown in the ui-config sestion and remove all records
+  to configure the metrics shown in the ui-config section and remove all records
  in the table after the update.
 - Currently energy footprint metrics of type energy are ignored for calculating
  total energy.
--- a/go.sum
+++ b/go.sum
@@ -4,10 +4,6 @@ github.com/99designs/gqlgen v0.17.88 h1:neMQDgehMwT1vYIOx/w5ZYPUU/iMNAJzRO44I5In
 github.com/99designs/gqlgen v0.17.88/go.mod h1:qeqYFEgOeSKqWedOjogPizimp2iu4E23bdPvl4jTYic=
 github.com/Azure/go-ntlmssp v0.1.0 h1:DjFo6YtWzNqNvQdrwEyr/e4nhU3vRiwenz5QX7sFz+A=
 github.com/Azure/go-ntlmssp v0.1.0/go.mod h1:NYqdhxd/8aAct/s4qSYZEerdPuH1liG2/X9DiVTbhpk=
-github.com/ClusterCockpit/cc-lib/v2 v2.8.2 h1:rCLZk8wz8yq8xBnBEdVKigvA2ngR8dPmHbEFwxxb3jw=
-github.com/ClusterCockpit/cc-lib/v2 v2.8.2/go.mod h1:FwD8vnTIbBM3ngeLNKmCvp9FoSjQZm7xnuaVxEKR23o=
-github.com/ClusterCockpit/cc-lib/v2 v2.9.0 h1:mzUYakcjwb+UP5II4jOvr36rSYct90gXBbtUg+nvm9c=
-github.com/ClusterCockpit/cc-lib/v2 v2.9.0/go.mod h1:FwD8vnTIbBM3ngeLNKmCvp9FoSjQZm7xnuaVxEKR23o=
 github.com/ClusterCockpit/cc-lib/v2 v2.9.1 h1:eplKhXQyGAElBGCEGdmxwj7fLv26Op16uK0KxUePDak=
 github.com/ClusterCockpit/cc-lib/v2 v2.9.1/go.mod h1:FwD8vnTIbBM3ngeLNKmCvp9FoSjQZm7xnuaVxEKR23o=
 github.com/ClusterCockpit/cc-line-protocol/v2 v2.4.0 h1:hIzxgTBWcmCIHtoDKDkSCsKCOCOwUC34sFsbD2wcW0Q=
--- a/internal/graph/schema.resolvers.go
+++ b/internal/graph/schema.resolvers.go
@@ -676,6 +676,11 @@ func (r *queryResolver) JobsStatistics(ctx context.Context, filter []*model.JobF
 			// Use request-scoped cache: multiple aliases with same (filter, groupBy)
 			// but different sortBy/page hit the DB only once.
 			if cache := getStatsGroupCache(ctx); cache != nil {
+				// Ensure the sort field is computed even if not in the GraphQL selection,
+				// because sortAndPageStats will sort by it in memory.
+				if sortBy != nil {
+					reqFields[sortByFieldName(*sortBy)] = true
+				}
 				key := statsCacheKey(filter, groupBy, reqFields)
 				var allStats []*model.JobsStatistics
 				allStats, err = cache.getOrCompute(key, func() ([]*model.JobsStatistics, error) {
--- a/internal/graph/stats_cache.go
+++ b/internal/graph/stats_cache.go
@@ -107,6 +107,33 @@ func sortAndPageStats(allStats []*model.JobsStatistics, sortBy *model.SortByAggr
 	return sorted
 }

+// sortByFieldName maps a SortByAggregate enum to the corresponding reqFields key.
+// This ensures the DB computes the column that sortAndPageStats will sort by.
+func sortByFieldName(sortBy model.SortByAggregate) string {
+	switch sortBy {
+	case model.SortByAggregateTotaljobs:
+		return "totalJobs"
+	case model.SortByAggregateTotalusers:
+		return "totalUsers"
+	case model.SortByAggregateTotalwalltime:
+		return "totalWalltime"
+	case model.SortByAggregateTotalnodes:
+		return "totalNodes"
+	case model.SortByAggregateTotalnodehours:
+		return "totalNodeHours"
+	case model.SortByAggregateTotalcores:
+		return "totalCores"
+	case model.SortByAggregateTotalcorehours:
+		return "totalCoreHours"
+	case model.SortByAggregateTotalaccs:
+		return "totalAccs"
+	case model.SortByAggregateTotalacchours:
+		return "totalAccHours"
+	default:
+		return "totalJobs"
+	}
+}
+
 // statsFieldGetter returns a function that extracts the sortable int field
 // from a JobsStatistics struct for the given sort key.
 func statsFieldGetter(sortBy model.SortByAggregate) func(*model.JobsStatistics) int {
Author	SHA1	Message	Date
Jan Eitzinger	97330ce598	Merge pull request #532 from ClusterCockpit/hotfix Remove static linkage for helper tools	2026-03-20 09:37:34 +01:00
Jan Eitzinger	fb176c5afb	Remove static linkage for helper tools	2026-03-20 09:34:49 +01:00
Jan Eitzinger	d4ee937115	Merge pull request #531 from ClusterCockpit/hotfix Fix goreleaser config. Cleanup.	2026-03-20 09:25:04 +01:00
Jan Eitzinger	999d93efc3	Fix goreleaser config. Cleanup.	2026-03-20 09:19:13 +01:00
Jan Eitzinger	4ce0cfb686	Merge pull request #530 from ClusterCockpit/hotfix Hotfix	2026-03-20 08:41:27 +01:00
Jan Eitzinger	359962d166	Fix typo	2026-03-20 08:23:46 +01:00
Jan Eitzinger	60554896d5	Update ReleaseNote for upcoming release Entire-Checkpoint: 30099a746fc7	2026-03-20 08:21:16 +01:00
Jan Eitzinger	a9f335d910	Merge pull request #529 from ClusterCockpit/hotfix Hotfix	2026-03-20 05:50:18 +01:00
Jan Eitzinger	bf48389aeb	Optimize sortby in stats queries Entire-Checkpoint: 9b5b833472e1	2026-03-20 05:39:22 +01:00
Jan Eitzinger	676025adfe	Merge pull request #528 from ClusterCockpit/hotfix further clarify plot titles	2026-03-19 11:44:12 +01:00
Jan Eitzinger	d4a0ae173f	Merge pull request #525 from ClusterCockpit/hotfix Hotfix	2026-03-18 19:31:05 +01:00
Jan Eitzinger	a7e5ecaf6c	Merge pull request #524 from ClusterCockpit/hotfix Remove tracked .entire/metadata/ files from git	2026-03-18 07:15:36 +01:00
Jan Eitzinger	965e2007fb	Merge pull request #523 from ClusterCockpit/hotfix Hotfix	2026-03-18 07:04:12 +01:00
Jan Eitzinger	6a29faf460	Merge pull request #521 from ClusterCockpit/hotfix Hotfix	2026-03-17 09:23:59 +01:00
Jan Eitzinger	8751ae023d	Merge pull request #520 from ClusterCockpit/hotfix Extend known issues in ReleaseNotes	2026-03-15 07:22:20 +01:00
Jan Eitzinger	128c098865	Merge pull request #519 from ClusterCockpit/hotfix Hotfix	2026-03-13 17:39:04 +01:00