Add migration with alternative reduced index set

Entire-Checkpoint: af7afc9a29ff
2026-03-11 05:46:03 +01:00
parent eba3995610
commit 301e590580
7 changed files with 551 additions and 26 deletions

@@ -22,3 +22,26 @@
{"time":"2026-03-11T05:08:52.825155+01:00","level":"INFO","msg":"committed changes to shadow branch","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","shadow_branch":"entire/1cf9920-e3b0c4"}
{"time":"2026-03-11T05:08:52.825161+01:00","level":"INFO","msg":"checkpoint saved","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","strategy":"manual-commit","checkpoint_type":"session","checkpoint_count":2,"modified_files":1,"new_files":0,"deleted_files":0,"shadow_branch":"entire/1cf9920-e3b0c4","branch_created":false}
{"time":"2026-03-11T05:08:52.82522+01:00","level":"INFO","msg":"phase transition","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"session","agent":"claude-code","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","event":"TurnEnd","from":"active","to":"idle"}
{"time":"2026-03-11T05:31:33.173849+01:00","level":"INFO","msg":"turn-start","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"TurnStart","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","session_ref":"/Users/jan/.claude/projects/-Users-jan-prg-CC-cc-backend/42401d2e-7d1c-4c0e-abe6-356cb2d48747.jsonl"}
{"time":"2026-03-11T05:31:33.40187+01:00","level":"INFO","msg":"phase transition","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"session","agent":"claude-code","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","event":"TurnStart","from":"idle","to":"active"}
{"time":"2026-03-11T05:31:49.964342+01:00","level":"INFO","msg":"subagent started","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"SubagentStart","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","tool_use_id":"toolu_01MJqptZJ5SuaZVBxAKZ1svX","transcript":"/Users/jan/.claude/projects/-Users-jan-prg-CC-cc-backend/42401d2e-7d1c-4c0e-abe6-356cb2d48747.jsonl"}
{"time":"2026-03-11T05:31:55.403816+01:00","level":"INFO","msg":"subagent started","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"SubagentStart","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","tool_use_id":"toolu_01FTC59ufvgXt3VSQCNwf6FA","transcript":"/Users/jan/.claude/projects/-Users-jan-prg-CC-cc-backend/42401d2e-7d1c-4c0e-abe6-356cb2d48747.jsonl"}
{"time":"2026-03-11T05:32:17.136258+01:00","level":"INFO","msg":"subagent completed","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"SubagentEnd","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","tool_use_id":"toolu_01FTC59ufvgXt3VSQCNwf6FA","agent_id":"a1ede6e4db46fa8db"}
{"time":"2026-03-11T05:32:17.410446+01:00","level":"INFO","msg":"committed task checkpoint to shadow branch","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","shadow_branch":"entire/eba3995-e3b0c4"}
{"time":"2026-03-11T05:32:17.410453+01:00","level":"INFO","msg":"task checkpoint saved","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","strategy":"manual-commit","checkpoint_type":"task","checkpoint_uuid":"","tool_use_id":"toolu_01FTC59ufvgXt3VSQCNwf6FA","subagent_type":"Explore","modified_files":3,"new_files":0,"deleted_files":0,"shadow_branch":"entire/eba3995-e3b0c4","branch_created":false}
{"time":"2026-03-11T05:32:51.785996+01:00","level":"INFO","msg":"subagent completed","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"SubagentEnd","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","tool_use_id":"toolu_01MJqptZJ5SuaZVBxAKZ1svX","agent_id":"a97d0003ebdc3eccd"}
{"time":"2026-03-11T05:32:52.089909+01:00","level":"INFO","msg":"committed task checkpoint to shadow branch","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","shadow_branch":"entire/eba3995-e3b0c4"}
{"time":"2026-03-11T05:32:52.089916+01:00","level":"INFO","msg":"task checkpoint saved","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","strategy":"manual-commit","checkpoint_type":"task","checkpoint_uuid":"","tool_use_id":"toolu_01MJqptZJ5SuaZVBxAKZ1svX","subagent_type":"Explore","modified_files":3,"new_files":0,"deleted_files":0,"shadow_branch":"entire/eba3995-e3b0c4","branch_created":false}
{"time":"2026-03-11T05:34:41.346156+01:00","level":"INFO","msg":"subagent started","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"SubagentStart","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","tool_use_id":"toolu_01MERmbUXd1oS8d2m7sVmGXr","transcript":"/Users/jan/.claude/projects/-Users-jan-prg-CC-cc-backend/42401d2e-7d1c-4c0e-abe6-356cb2d48747.jsonl"}
{"time":"2026-03-11T05:36:59.651293+01:00","level":"INFO","msg":"subagent completed","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"SubagentEnd","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","tool_use_id":"toolu_01MERmbUXd1oS8d2m7sVmGXr","agent_id":"a4e4f76de8a1612f2"}
{"time":"2026-03-11T05:37:00.005222+01:00","level":"INFO","msg":"committed task checkpoint to shadow branch","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","shadow_branch":"entire/eba3995-e3b0c4"}
{"time":"2026-03-11T05:37:00.005229+01:00","level":"INFO","msg":"task checkpoint saved","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","strategy":"manual-commit","checkpoint_type":"task","checkpoint_uuid":"","tool_use_id":"toolu_01MERmbUXd1oS8d2m7sVmGXr","subagent_type":"Plan","modified_files":3,"new_files":0,"deleted_files":0,"shadow_branch":"entire/eba3995-e3b0c4","branch_created":false}
{"time":"2026-03-11T05:41:52.270916+01:00","level":"INFO","msg":"turn-end","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"TurnEnd","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","session_ref":"/Users/jan/.claude/projects/-Users-jan-prg-CC-cc-backend/42401d2e-7d1c-4c0e-abe6-356cb2d48747.jsonl"}
{"time":"2026-03-11T05:41:52.650259+01:00","level":"INFO","msg":"no files modified during session, skipping checkpoint","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code"}
{"time":"2026-03-11T05:41:52.650334+01:00","level":"INFO","msg":"phase transition","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"session","agent":"claude-code","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","event":"TurnEnd","from":"active","to":"idle"}
{"time":"2026-03-11T05:41:52.728326+01:00","level":"INFO","msg":"session-end","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"SessionEnd","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747"}
{"time":"2026-03-11T05:41:52.72837+01:00","level":"INFO","msg":"phase transition","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"session","agent":"claude-code","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","event":"SessionStop","from":"idle","to":"ended"}
{"time":"2026-03-11T05:41:52.79542+01:00","level":"INFO","msg":"session-start","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"SessionStart","session_id":"c31c699a-f492-48f7-bcf0-35d3ceeac243","session_ref":"/Users/jan/.claude/projects/-Users-jan-prg-CC-cc-backend/c31c699a-f492-48f7-bcf0-35d3ceeac243.jsonl"}
{"time":"2026-03-11T05:43:59.204149+01:00","level":"INFO","msg":"turn-end","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"TurnEnd","session_id":"c31c699a-f492-48f7-bcf0-35d3ceeac243","session_ref":"/Users/jan/.claude/projects/-Users-jan-prg-CC-cc-backend/c31c699a-f492-48f7-bcf0-35d3ceeac243.jsonl"}
{"time":"2026-03-11T05:43:59.565762+01:00","level":"INFO","msg":"committed changes to shadow branch","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","shadow_branch":"entire/eba3995-e3b0c4"}
{"time":"2026-03-11T05:43:59.565768+01:00","level":"INFO","msg":"checkpoint saved","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","strategy":"manual-commit","checkpoint_type":"session","checkpoint_count":1,"modified_files":3,"new_files":3,"deleted_files":0,"shadow_branch":"entire/eba3995-e3b0c4","branch_created":false}

File diff suppressed because one or more lines are too long

@@ -1 +1 @@
There also have to be bugs in jobQuery.go. In particular, the following query triggers the memory leak: SELECT * FROM job WHERE job.job_state IN ("completed", "running", "failed") ORDER BY job.start_time DESC LIMIT 1 OFFSET 10; Dig deeper to find the cause. Also investigate why no existing index is used for this query.
What optimal index selection do you suggest? The job table contains almost 20 million jobs in production, causing timeouts on any filter query that does not use an index.

@@ -1,27 +1,9 @@
All passes. Here's a summary of findings and fixes:
---
## Findings
The plan reduces 79 indexes to 20 by:
### Bug 1: Missing `defer rows.Close()` in `user.go:GetLdapUsernames()` (line 100)
**Fixed.** Every call leaked a database connection and its buffers.
- **Eliminating entire categories**: all `shared`-prefixed (16), all user/project sort variants (11), all standalone single-filter sorts (11)
- **Consolidating**: cluster+jobstate sort variants replaced by `(cluster, sort_col)` which works for any state combo with LIMIT
- **Keeping what matters**: the indexes SQLite actually picks with ANALYZE — `jobs_starttime`, `jobs_cluster_starttime_duration`, `jobs_user_starttime_duration`, `jobs_cluster_jobstate_duration_starttime`, and a few secondary sort indexes
### Bug 2: SQLite never runs `ANALYZE` — wrong index selection
**Fixed.** Added `ANALYZE` to `setupSqlite()` in `dbConnection.go`.
Without statistics in `sqlite_stat1`, SQLite's query planner **guessed** and chose `jobs_jobstate_energy` (a terrible index for this query) instead of scanning `jobs_starttime` in reverse. The result:
| | Before ANALYZE | After ANALYZE |
|---|---|---|
| Index used | `jobs_jobstate_energy` | `jobs_starttime` |
| Sort | **TEMP B-TREE** (materializes ALL rows) | None (index order) |
| Memory | Proportional to total matching rows | Constant |
| I/O | Full scan of all matching rows | Stops at OFFSET+LIMIT |
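A minimal sketch of what `ANALYZE` changes, assuming a toy `job` table in an in-memory database rather than the production schema. It uses Python's standard `sqlite3` module, not the project's Go code. The table layout, the row count, and the helper `has_stats` are illustrative assumptions. It only shows that `sqlite_stat1` does not exist until `ANALYZE` runs, so the planner has no statistics to consult before that point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy schema (assumption): only the columns needed for the illustration.
conn.execute(
    "CREATE TABLE job (id INTEGER PRIMARY KEY, job_state TEXT, start_time INTEGER)"
)
conn.execute("CREATE INDEX jobs_starttime ON job (start_time)")

states = ["completed", "running", "failed"]
conn.executemany(
    "INSERT INTO job (job_state, start_time) VALUES (?, ?)",
    [(states[i % 3], i) for i in range(1000)],
)
conn.commit()


def has_stats(c):
    """Return True if the statistics table sqlite_stat1 exists."""
    row = c.execute(
        "SELECT count(*) FROM sqlite_master WHERE name = 'sqlite_stat1'"
    ).fetchone()
    return row[0] == 1


stats_before = has_stats(conn)  # no statistics table yet
conn.execute("ANALYZE")         # gathers per-index statistics
stats_after = has_stats(conn)

# One row per analyzed index; the first number in `stat` is the index row count.
stat_rows = dict(
    conn.execute("SELECT idx, stat FROM sqlite_stat1 WHERE tbl = 'job'").fetchall()
)
print(stats_before, stats_after, stat_rows)
```

On a table the size described here, the statistics can go stale as data changes, so re-running `ANALYZE` periodically (or `PRAGMA optimize`) may be worth considering.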
### Bug 3: `IN` clause + `ORDER BY` is fundamentally incompatible with composite indexes
Even with the "correct" index `(job_state, start_time)`, SQLite **cannot** merge-sort across 3 separate index range scans for `IN ('completed','running','failed')`. It always falls back to a temp B-tree sort. The only efficient plan is to use the standalone `jobs_starttime` index — which SQLite does automatically **after ANALYZE** because it realizes the 3 states cover virtually all rows, making the WHERE clause nearly a no-op.
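The plan difference described above can be checked with `EXPLAIN QUERY PLAN`. The sketch below is an illustration under assumptions: it uses an empty toy table, Python's standard `sqlite3` module, and `INDEXED BY` to force the standalone index instead of relying on statistics. The index names mirror those mentioned in this summary, but the schema is not the production one.

```python
import sqlite3

QUERY = (
    "SELECT * FROM job {hint} "
    "WHERE job_state IN ('completed', 'running', 'failed') "
    "ORDER BY start_time DESC LIMIT 1 OFFSET 10"
)


def plan(conn, hint=""):
    """Return the detail strings of the query plan for QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY.format(hint=hint)).fetchall()
    return [row[-1] for row in rows]  # last column holds the plan detail text


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job (id INTEGER PRIMARY KEY, job_state TEXT, start_time INTEGER)"
)

# Case 1: only the composite index exists. The IN list yields several
# separate index ranges, so the ORDER BY needs an explicit sort step.
conn.execute("CREATE INDEX jobs_jobstate_starttime ON job (job_state, start_time)")
plan_composite = plan(conn)

# Case 2: force the standalone index. Rows are read in start_time order,
# and the state filter becomes a per-row check with no sort step.
conn.execute("CREATE INDEX jobs_starttime ON job (start_time)")
plan_standalone = plan(conn, hint="INDEXED BY jobs_starttime")

for label, details in (("composite", plan_composite), ("standalone", plan_standalone)):
    print(label, details)
```

The exact wording of the plan lines varies between SQLite versions. The part to look for is whether a `TEMP B-TREE` step appears.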
### Observation: 79 indexes on the `job` table
This is excessive and actively harmful — it confuses the query planner (especially without ANALYZE) and slows writes. The `jobs_jobstate_starttime` index from migration 08 is also missing from the actual DB (only the 3-column `jobs_jobstate_starttime_duration` exists). This is worth investigating separately but is a schema/migration concern, not a code bug.
Key trade-off: ~20% of queries that sort by rare columns (num_hwthreads, num_acc, energy) with a state filter will now do a cheap per-row state check instead of using a 3-column composite. With LIMIT this is negligible.
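To illustrate the consolidation idea, here is a small sketch assuming a hypothetical two-column index `jobs_cluster_starttime` on `(cluster, start_time)` and a toy schema, again in Python's standard `sqlite3` module. The cluster name `alpha` and the helper `plan_for` are made up for the example. It checks that one `(cluster, sort_col)` index serves the ordered query for different state combinations without a sort step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job ("
    "id INTEGER PRIMARY KEY, cluster TEXT, job_state TEXT, start_time INTEGER)"
)
# Hypothetical consolidated index: equality column first, sort column second.
conn.execute("CREATE INDEX jobs_cluster_starttime ON job (cluster, start_time)")


def plan_for(states):
    """Return plan detail strings for a cluster + state filter sorted by start_time."""
    placeholders = ", ".join("?" for _ in states)
    sql = (
        "EXPLAIN QUERY PLAN SELECT * FROM job "
        "WHERE cluster = ? AND job_state IN (" + placeholders + ") "
        "ORDER BY start_time DESC LIMIT 50"
    )
    return [row[-1] for row in conn.execute(sql, ["alpha", *states])]


plan_one_state = plan_for(["running"])
plan_three_states = plan_for(["completed", "running", "failed"])

print(plan_one_state)
print(plan_three_states)
```

In both cases the state filter is evaluated per row while walking the index in `start_time` order, which is the per-row check described in the trade-off above. How cheap that check is in practice depends on how selective the state filter is, so it would be worth measuring on production-sized data.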