Commit Graph

2681 Commits

Author SHA1 Message Date
Christoph Kluge
ba366d0d72 use inline literals in simple queries, add downgrade optimize 2026-03-13 15:16:19 +01:00
f15f1452cc Inline jobstate literal in query
Entire-Checkpoint: 35f06df74b51
2026-03-13 15:16:07 +01:00
df2a13def2 Merge branch 'hotfix' of github.com:ClusterCockpit/cc-backend into hotfix 2026-03-13 14:34:11 +01:00
d586fe4b43 Optimize usage dashboard: partial indexes, request cache, parallel histograms
- Add migration 14: partial covering indexes WHERE job_state='running'
  for user/project/subcluster groupings (tiny B-tree vs full table)
- Inline literal state value in BuildWhereClause so SQLite matches
  partial indexes instead of parameterized placeholders
- Add per-request statsGroupCache (sync.Once per filter+groupBy key)
  so identical grouped stats queries execute only once per GQL operation
- Parallelize 4 histogram queries in AddHistograms using errgroup
- Consolidate frontend from 6 GQL aliases to 2, sort+slice top-10
  client-side via $derived

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 5b26a6e5ff10
2026-03-13 14:31:37 +01:00
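Note on d586fe4b43: the per-request statsGroupCache described above (one sync.Once per filter+groupBy key) could look roughly like the following. This is a minimal sketch and not the code from the commit; only the name statsGroupCache and the sync.Once idea come from the message, while statsResult, cacheEntry, get, countLoads and the key format are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// statsResult is a hypothetical stand-in for one grouped statistics result.
type statsResult struct {
	rows []string
}

// cacheEntry holds one lazily computed result, guarded by its own sync.Once.
type cacheEntry struct {
	once sync.Once
	res  statsResult
	err  error
}

// statsGroupCache is meant to be created per request and dropped afterwards.
type statsGroupCache struct {
	mu      sync.Mutex
	entries map[string]*cacheEntry
}

func newStatsGroupCache() *statsGroupCache {
	return &statsGroupCache{entries: make(map[string]*cacheEntry)}
}

// get runs load at most once per key, even when called concurrently.
// Later callers for the same key block until the first load finishes
// and then receive the same result.
func (c *statsGroupCache) get(key string, load func() (statsResult, error)) (statsResult, error) {
	c.mu.Lock()
	e, ok := c.entries[key]
	if !ok {
		e = &cacheEntry{}
		c.entries[key] = e
	}
	c.mu.Unlock()

	e.once.Do(func() { e.res, e.err = load() })
	return e.res, e.err
}

// countLoads issues n concurrent lookups for the same (assumed) key and
// reports how many times the expensive load function actually ran.
func countLoads(n int) int {
	cache := newStatsGroupCache()
	var mu sync.Mutex
	calls := 0

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = cache.get("state=running|groupBy=user", func() (statsResult, error) {
				mu.Lock()
				calls++
				mu.Unlock()
				return statsResult{rows: []string{"alice", "bob"}}, nil
			})
		}()
	}
	wg.Wait()
	return calls
}

func main() {
	fmt.Println("loads executed:", countLoads(6))
}
```

The map lock is held only while looking up or creating the entry, so a slow query for one key does not block lookups for other keys.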
Christoph Kluge
bc214f6cea add nullsafes to frontend 2026-03-13 14:20:45 +01:00
cbe46c3524 Merge branch 'hotfix' of github.com:ClusterCockpit/cc-backend into hotfix 2026-03-13 13:17:34 +01:00
0037d969b2 Consolidate UsageDash into single GraphQL query
Merge three separate queries (topJobsQuery, topNodesQuery, topAccsQuery)
into one topStatsQuery with 6 aliased jobsStatistics fields, reducing
3 HTTP round trips to 1 on the status dashboard.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 40d806a3240c
2026-03-13 13:14:29 +01:00
dd3e5427f4 Add covering indexes for status/dashboard queries (migration 13)
Adds composite covering indexes on (cluster, job_state, <group_col>, ...)
for user, project, and subcluster groupings to enable index-only scans
for status views. Drops subsumed 3-column indexes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 3d8def28e96e
2026-03-13 13:12:54 +01:00
Christoph Kluge
e666980184 fix typo 2026-03-13 12:07:43 +01:00
Christoph Kluge
c238f68af6 reduce unnecessary complexity 2026-03-13 12:05:16 +01:00
Christoph Kluge
58c0c79f72 handle single job state queries as simple stringquery
- this will improve index usage for single state queries
2026-03-13 12:03:06 +01:00
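Note on 58c0c79f72: a rough illustration of treating a single job state as a plain string condition. With a literal in the SQL text, the query planner can see the concrete value at prepare time, which is what the later partial-index commit (d586fe4b43) relies on. This is a sketch under assumptions, not the project's query builder: stateCondition and knownStates are hypothetical names, and the whitelist contents are illustrative only. Inlining is only safe here because values are checked against a fixed whitelist first.

```go
package main

import (
	"fmt"
	"strings"
)

// knownStates is an assumed whitelist. Only whitelisted values are ever
// written into the SQL text, so user input cannot inject SQL.
var knownStates = map[string]bool{
	"running":   true,
	"completed": true,
	"failed":    true,
	"cancelled": true,
}

// stateCondition returns a WHERE fragment and its bind arguments.
// A single state is inlined as a literal; several states fall back to
// a parameterized IN list.
func stateCondition(states []string) (string, []any, error) {
	for _, s := range states {
		if !knownStates[s] {
			return "", nil, fmt.Errorf("unknown job state %q", s)
		}
	}

	switch len(states) {
	case 0:
		return "", nil, nil
	case 1:
		return "job_state = '" + states[0] + "'", nil, nil
	default:
		placeholders := strings.Repeat("?, ", len(states)-1) + "?"
		args := make([]any, len(states))
		for i, s := range states {
			args[i] = s
		}
		return "job_state IN (" + placeholders + ")", args, nil
	}
}

func main() {
	sql, args, err := stateCondition([]string{"running"})
	fmt.Println(sql, args, err)

	sql, args, err = stateCondition([]string{"running", "failed"})
	fmt.Println(sql, args, err)
}
```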
Christoph Kluge
c23d7bd5e5 remove non-required sorting params
- caused expensive DB scans without use or need
2026-03-13 11:27:45 +01:00
Christoph Kluge
41114f7eda reorder frontend coded filters to match db indices 2026-03-13 10:48:38 +01:00
Christoph Kluge
a877937a25 add missing downgrade index drop, add optimize after downgrades 2026-03-13 10:11:11 +01:00
39ab12784c Make checkpointInterval an optional config option again.
Also applies small fixes

Entire-Checkpoint: c11d1a65fae4
2026-03-13 09:07:38 +01:00
b214e1755a Add buffered I/O to WAL writes and fix MemoryCap comment
WAL writes now go through bufio.Writer instead of raw syscalls per record,
reducing I/O overhead. Buffers are flushed on rotate, drain, and shutdown.
Fixed misleading MemoryCap comment ("Max bytes" → "Max memory in GB").

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: b38dc35e5334
2026-03-13 09:05:24 +01:00
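Note on b214e1755a: the buffering pattern described above can be sketched as follows. This is an illustration only, assuming a hypothetical walWriter type with Append, Flush and Rotate methods; the real WAL code, record format and file handling are not shown in this log. In-memory buffers stand in for WAL segment files.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
)

// walWriter is a hypothetical wrapper that batches records in memory
// instead of issuing one write call per record.
type walWriter struct {
	buf *bufio.Writer
}

func newWALWriter(dst io.Writer) *walWriter {
	return &walWriter{buf: bufio.NewWriter(dst)}
}

// Append copies one record into the buffer. The underlying writer is
// only touched when the buffer fills up or Flush is called.
func (w *walWriter) Append(record []byte) error {
	_, err := w.buf.Write(record)
	return err
}

// Flush pushes all buffered records to the current destination.
func (w *walWriter) Flush() error {
	return w.buf.Flush()
}

// Rotate flushes pending records to the old destination before
// switching to the next one, so no record ends up in the wrong segment.
func (w *walWriter) Rotate(next io.Writer) error {
	if err := w.buf.Flush(); err != nil {
		return err
	}
	w.buf.Reset(next)
	return nil
}

// demo reports how many bytes are visible in each segment at three
// points: before any flush, after rotation, and after the final flush.
func demo() (before, afterRotate, second int, err error) {
	var seg1, seg2 bytes.Buffer
	w := newWALWriter(&seg1)

	for _, rec := range []string{"rec1\n", "rec2\n"} {
		if err = w.Append([]byte(rec)); err != nil {
			return
		}
	}
	before = seg1.Len()

	if err = w.Rotate(&seg2); err != nil {
		return
	}
	afterRotate = seg1.Len()

	if err = w.Append([]byte("rec3\n")); err != nil {
		return
	}
	if err = w.Flush(); err != nil {
		return
	}
	second = seg2.Len()
	return
}

func main() {
	fmt.Println(demo())
}
```

The trade-off is durability: records still sitting in the buffer are lost on a crash, which is why the commit flushes on rotate, drain, and shutdown.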
a4f9ba6975 Apply correct log level
Entire-Checkpoint: 8288af281b94
2026-03-13 07:58:57 +01:00
8234ad3126 fix: Fix metricstore memory explosion from broken emergency free and batch aborts
- Fix MemoryUsageTracker: remove premature bufferPool.Clear() that prevented
  mem.Alloc from decreasing, replace broken ForceFree loop (100 iterations
  with no GC) with progressive time-based Free at 75%/50%/25% retention,
  add bufferPool.Clear()+GC between steps so memory stats update correctly
- Enable debug.FreeOSMemory() after emergency freeing to return memory to OS
- Add adaptive ticker: 30s checks when memory >80% of cap, normal otherwise
- Reduce default memory check interval from 1h to 5min
- Don't abort entire NATS batch on single write error (out-of-order timestamp),
  log warning and continue processing remaining lines
- Prune empty levels from tree after free() to reduce overhead
- Include buffer struct overhead in sizeInBytes() for more accurate reporting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 7ce28627fc1d
2026-03-13 07:57:35 +01:00
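Note on 8234ad3126: the "don't abort the entire batch on a single write error" item can be illustrated with a small loop that logs and skips a bad sample instead of returning early. This is a toy sketch under assumptions; sample, store, write and writeBatch are hypothetical names and do not reflect the metric store's real types or its NATS handling.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

var errOutOfOrder = errors.New("out-of-order timestamp")

// sample is a hypothetical decoded line from a batch.
type sample struct {
	ts    int64
	value float64
}

// store is a toy stand-in that rejects non-increasing timestamps.
type store struct {
	lastTS int64
	values []float64
}

func (s *store) write(p sample) error {
	if p.ts <= s.lastTS {
		return fmt.Errorf("ts %d not after %d: %w", p.ts, s.lastTS, errOutOfOrder)
	}
	s.lastTS = p.ts
	s.values = append(s.values, p.value)
	return nil
}

// writeBatch attempts every sample. A failing sample is logged as a
// warning and skipped; the remaining samples are still processed.
func writeBatch(s *store, batch []sample) (written, skipped int) {
	for _, p := range batch {
		if err := s.write(p); err != nil {
			log.Printf("warning: skipping sample: %v", err)
			skipped++
			continue
		}
		written++
	}
	return written, skipped
}

func main() {
	s := &store{}
	batch := []sample{{10, 1.0}, {20, 2.0}, {15, 3.0}, {30, 4.0}}
	written, skipped := writeBatch(s, batch)
	fmt.Println("written:", written, "skipped:", skipped)
}
```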
126f65879a Update release information
Entire-Checkpoint: 9f282c3d9570
2026-03-13 06:23:33 +01:00
3aacc669b6 Remove debug timer 2026-03-13 06:18:21 +01:00
96fc44a649 fix: Optimize project stat query 2026-03-13 06:06:38 +01:00
8e86e8720d Make stats query selective. Add stats index. Add paging to user list.
Entire-Checkpoint: d42431eee30d
2026-03-12 20:16:55 +01:00
4555fb8a86 Merge branch 'hotfix' of github.com:ClusterCockpit/cc-backend into hotfix 2026-03-12 20:15:54 +01:00
0e27624d73 Add flag to optimize db. Remove ANALYZE on startup.
Entire-Checkpoint: d49917ff4b10
2026-03-12 20:12:49 +01:00
Christoph Kluge
8563ed5e08 fix: remove indices from migration 9
- optimization migration 11 drops these indices, so rather not create them in the first place
2026-03-12 14:45:45 +01:00
Christoph Kluge
2d07bdf6b5 fix: add missing nullsafe in publicDash 2026-03-12 14:13:45 +01:00
7f069f1ec1 Prepare bugfix release 1.5.1
Entire-Checkpoint: 15cc90a0347a
2026-03-12 06:40:36 +01:00
2506a92cdf Remove entire log 2026-03-12 06:14:11 +01:00
Christoph Kluge
972b14033a add db migration 11, optimizing index count 2026-03-11 16:07:29 +01:00
af78f06ced fix: Reduce complexity for groupBy stats queries
Entire-Checkpoint: fc899a70a751
2026-03-11 15:14:59 +01:00
6e0fe62566 Add new db config options to README 2026-03-11 14:30:41 +01:00
e70310dcbc fix: Segfault when taggers are enabled but rule directories are missing 2026-03-11 11:15:08 +01:00
00d2f97c4c fix: Large heap allocations in sqlite driver. Sanitize sqlite config and make it configurable. Allow cancelling queries. 2026-03-11 11:14:37 +01:00
c8d8f7084a Merge branch 'hotfix' of github.com:ClusterCockpit/cc-backend into hotfix 2026-03-11 07:50:55 +01:00
dc7407d0f0 fix: prevent segfault if enable-job-taggers option is true but tagger config directories are missing
Entire-Checkpoint: 9ec86e3669e1
2026-03-11 07:50:53 +01:00
eba3995610 Add ANALYZE on db startup
Entire-Checkpoint: ea70a955214d
2026-03-11 05:28:52 +01:00
f8831e7040 Fixed merge errors
Entire-Checkpoint: ddd4fa4a7bbb
2026-03-11 05:09:38 +01:00
1cf99206a9 Merge branch 'hotfix' of github.com:ClusterCockpit/cc-backend into hotfix 2026-03-11 05:06:26 +01:00
5d3d77620e fix: Add defer.close for all queries 2026-03-11 05:04:20 +01:00
Christoph Kluge
5c72664162 bump frontend patch versions 2026-03-10 18:15:24 +01:00
Christoph Kluge
f3e796f3f5 add nullsafes to node view 2026-03-10 17:05:50 +01:00
Christoph Kluge
cc38b17472 fix wrong field checked for json validity 2026-03-10 17:02:09 +01:00
282197ebef fix: Round floats in tagger message
Entire-Checkpoint: b68850c6fcff
2026-03-10 06:01:31 +01:00
Christoph Kluge
d2bc046fc6 fix ranged filter GT and LT conditions, reduce energy filter preset 2026-03-09 11:28:30 +01:00
Jan Eitzinger
d0ebba5b4a Merge pull request #513 from ClusterCockpit/dev
Dev
v1.5.0
2026-03-06 10:58:57 +01:00
70fea39d03 Add note on dynamic memory management for restarts 2026-03-06 10:56:23 +01:00
Christoph Kluge
88bd83b07e add nullsafe fallbacks 2026-03-06 10:19:46 +01:00
Christoph Kluge
d74465215d simplify and fix adaptive threshold logic 2026-03-06 10:09:44 +01:00
Jan Eitzinger
0fb9dc0373 Merge pull request #512 from ClusterCockpit/dev
bump frontend dependencies
2026-03-05 12:28:50 +01:00
Christoph Kluge
2c519ab2dc bump frontend dependencies
- fixes CVE-2020-7660 in @rollup/plugin-terser
2026-03-05 12:23:00 +01:00