Commit Graph

11 Commits

Author SHA1 Message Date
6d86690c76 Fix critical issues from follow-up security audit
A second-pass audit surfaced three severe issues missed by the previous
review, each a sibling code path of a bug class that was only partially
fixed before:

- auth: JWT session login (jwtSession.go) registered its authenticator
  even when CROSS_LOGIN_JWT_HS512_KEY was unset, leaving an empty HMAC
  key. golang-jwt verifies any HS256/HS512 signature against an empty
  key, allowing unauthenticated admin token forgery. Init() now refuses
  to register without a key, with a defense-in-depth empty-key guard in
  the keyfunc.

- repository: metric names from GraphQL ([String!]) were interpolated
  raw into json_extract(footprint, "$.<name>") SQL. SQLite parses
  double-quoted strings as literals, enabling SQL injection by any
  authenticated user. Validate metric names against ^[a-zA-Z0-9_]+$ in
  jobsMetricStatisticsHistogram and buildFloatJSONCondition.

- metricstore: cluster/host line-protocol tags flowed unvalidated into
  path.Join(RootDir, cluster, host) for checkpoint/WAL files, allowing
  arbitrary file write outside the checkpoint root via NATS
  (unauthenticated) or POST /api/write. Reject path-traversal sequences
  in DecodeLine before the tags become path components.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: b57246993ec1
2026-06-04 19:07:20 +02:00
Thomas Roehl
c76219651e Fix parsing of metric subtypes (key stype) 2026-05-04 18:10:01 +02:00
bf1a8a174e fix: Shard WAL consumer for higher throughput
Entire-Checkpoint: e583b7b11439
2026-03-18 06:32:14 +01:00
50aed595cf fix: metricstore NATS contention
Entire-Checkpoint: 7e68050cab59
2026-03-18 06:14:15 +01:00
8234ad3126 fix: Fix metricstore memory explosion from broken emergency free and batch aborts
- Fix MemoryUsageTracker: remove premature bufferPool.Clear() that prevented
  mem.Alloc from decreasing, replace broken ForceFree loop (100 iterations
  with no GC) with progressive time-based Free at 75%/50%/25% retention,
  add bufferPool.Clear()+GC between steps so memory stats update correctly
- Enable debug.FreeOSMemory() after emergency freeing to return memory to OS
- Add adaptive ticker: 30s checks when memory >80% of cap, normal otherwise
- Reduce default memory check interval from 1h to 5min
- Don't abort entire NATS batch on single write error (out-of-order timestamp),
  log warning and continue processing remaining lines
- Prune empty levels from tree after free() to reduce overhead
- Include buffer struct overhead in sizeInBytes() for more accurate reporting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Entire-Checkpoint: 7ce28627fc1d
2026-03-13 07:57:35 +01:00
a1db8263d7 Document line protocol. Optimize REST writeMetric path 2026-02-27 12:30:27 +01:00
4c3cd8e66a Merge branch 'dev' into optimize-checkpoint-wal 2026-02-27 09:30:32 +01:00
6ecb934967 Switch to CC line-protocol package. Update cc-lib. 2026-02-27 08:55:33 +01:00
ca0f9a42c7 Introduce metric store binary checkpoints with write ahead log 2026-02-26 10:08:40 +01:00
499b4287f9 Switch to cclib nats client 2026-01-23 10:04:41 +01:00
f41301036b Move metricstore from internal to pkg 2026-01-23 07:49:47 +01:00