mirror of
https://github.com/ClusterCockpit/cc-backend
synced 2026-03-15 12:27:30 +01:00
Add migration with alternative reduced index set
Entire-Checkpoint: af7afc9a29ff
@@ -22,3 +22,26 @@
{"time":"2026-03-11T05:08:52.825155+01:00","level":"INFO","msg":"committed changes to shadow branch","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","shadow_branch":"entire/1cf9920-e3b0c4"}
{"time":"2026-03-11T05:08:52.825161+01:00","level":"INFO","msg":"checkpoint saved","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","strategy":"manual-commit","checkpoint_type":"session","checkpoint_count":2,"modified_files":1,"new_files":0,"deleted_files":0,"shadow_branch":"entire/1cf9920-e3b0c4","branch_created":false}
{"time":"2026-03-11T05:08:52.82522+01:00","level":"INFO","msg":"phase transition","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"session","agent":"claude-code","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","event":"TurnEnd","from":"active","to":"idle"}
{"time":"2026-03-11T05:31:33.173849+01:00","level":"INFO","msg":"turn-start","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"TurnStart","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","session_ref":"/Users/jan/.claude/projects/-Users-jan-prg-CC-cc-backend/42401d2e-7d1c-4c0e-abe6-356cb2d48747.jsonl"}
{"time":"2026-03-11T05:31:33.40187+01:00","level":"INFO","msg":"phase transition","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"session","agent":"claude-code","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","event":"TurnStart","from":"idle","to":"active"}
{"time":"2026-03-11T05:31:49.964342+01:00","level":"INFO","msg":"subagent started","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"SubagentStart","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","tool_use_id":"toolu_01MJqptZJ5SuaZVBxAKZ1svX","transcript":"/Users/jan/.claude/projects/-Users-jan-prg-CC-cc-backend/42401d2e-7d1c-4c0e-abe6-356cb2d48747.jsonl"}
{"time":"2026-03-11T05:31:55.403816+01:00","level":"INFO","msg":"subagent started","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"SubagentStart","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","tool_use_id":"toolu_01FTC59ufvgXt3VSQCNwf6FA","transcript":"/Users/jan/.claude/projects/-Users-jan-prg-CC-cc-backend/42401d2e-7d1c-4c0e-abe6-356cb2d48747.jsonl"}
{"time":"2026-03-11T05:32:17.136258+01:00","level":"INFO","msg":"subagent completed","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"SubagentEnd","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","tool_use_id":"toolu_01FTC59ufvgXt3VSQCNwf6FA","agent_id":"a1ede6e4db46fa8db"}
{"time":"2026-03-11T05:32:17.410446+01:00","level":"INFO","msg":"committed task checkpoint to shadow branch","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","shadow_branch":"entire/eba3995-e3b0c4"}
{"time":"2026-03-11T05:32:17.410453+01:00","level":"INFO","msg":"task checkpoint saved","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","strategy":"manual-commit","checkpoint_type":"task","checkpoint_uuid":"","tool_use_id":"toolu_01FTC59ufvgXt3VSQCNwf6FA","subagent_type":"Explore","modified_files":3,"new_files":0,"deleted_files":0,"shadow_branch":"entire/eba3995-e3b0c4","branch_created":false}
{"time":"2026-03-11T05:32:51.785996+01:00","level":"INFO","msg":"subagent completed","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"SubagentEnd","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","tool_use_id":"toolu_01MJqptZJ5SuaZVBxAKZ1svX","agent_id":"a97d0003ebdc3eccd"}
{"time":"2026-03-11T05:32:52.089909+01:00","level":"INFO","msg":"committed task checkpoint to shadow branch","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","shadow_branch":"entire/eba3995-e3b0c4"}
{"time":"2026-03-11T05:32:52.089916+01:00","level":"INFO","msg":"task checkpoint saved","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","strategy":"manual-commit","checkpoint_type":"task","checkpoint_uuid":"","tool_use_id":"toolu_01MJqptZJ5SuaZVBxAKZ1svX","subagent_type":"Explore","modified_files":3,"new_files":0,"deleted_files":0,"shadow_branch":"entire/eba3995-e3b0c4","branch_created":false}
{"time":"2026-03-11T05:34:41.346156+01:00","level":"INFO","msg":"subagent started","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"SubagentStart","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","tool_use_id":"toolu_01MERmbUXd1oS8d2m7sVmGXr","transcript":"/Users/jan/.claude/projects/-Users-jan-prg-CC-cc-backend/42401d2e-7d1c-4c0e-abe6-356cb2d48747.jsonl"}
{"time":"2026-03-11T05:36:59.651293+01:00","level":"INFO","msg":"subagent completed","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"SubagentEnd","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","tool_use_id":"toolu_01MERmbUXd1oS8d2m7sVmGXr","agent_id":"a4e4f76de8a1612f2"}
{"time":"2026-03-11T05:37:00.005222+01:00","level":"INFO","msg":"committed task checkpoint to shadow branch","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","shadow_branch":"entire/eba3995-e3b0c4"}
{"time":"2026-03-11T05:37:00.005229+01:00","level":"INFO","msg":"task checkpoint saved","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","strategy":"manual-commit","checkpoint_type":"task","checkpoint_uuid":"","tool_use_id":"toolu_01MERmbUXd1oS8d2m7sVmGXr","subagent_type":"Plan","modified_files":3,"new_files":0,"deleted_files":0,"shadow_branch":"entire/eba3995-e3b0c4","branch_created":false}
{"time":"2026-03-11T05:41:52.270916+01:00","level":"INFO","msg":"turn-end","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"TurnEnd","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","session_ref":"/Users/jan/.claude/projects/-Users-jan-prg-CC-cc-backend/42401d2e-7d1c-4c0e-abe6-356cb2d48747.jsonl"}
{"time":"2026-03-11T05:41:52.650259+01:00","level":"INFO","msg":"no files modified during session, skipping checkpoint","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code"}
{"time":"2026-03-11T05:41:52.650334+01:00","level":"INFO","msg":"phase transition","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"session","agent":"claude-code","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","event":"TurnEnd","from":"active","to":"idle"}
{"time":"2026-03-11T05:41:52.728326+01:00","level":"INFO","msg":"session-end","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"SessionEnd","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747"}
{"time":"2026-03-11T05:41:52.72837+01:00","level":"INFO","msg":"phase transition","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"session","agent":"claude-code","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","event":"SessionStop","from":"idle","to":"ended"}
{"time":"2026-03-11T05:41:52.79542+01:00","level":"INFO","msg":"session-start","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"SessionStart","session_id":"c31c699a-f492-48f7-bcf0-35d3ceeac243","session_ref":"/Users/jan/.claude/projects/-Users-jan-prg-CC-cc-backend/c31c699a-f492-48f7-bcf0-35d3ceeac243.jsonl"}
{"time":"2026-03-11T05:43:59.204149+01:00","level":"INFO","msg":"turn-end","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"lifecycle","agent":"claude-code","event":"TurnEnd","session_id":"c31c699a-f492-48f7-bcf0-35d3ceeac243","session_ref":"/Users/jan/.claude/projects/-Users-jan-prg-CC-cc-backend/c31c699a-f492-48f7-bcf0-35d3ceeac243.jsonl"}
{"time":"2026-03-11T05:43:59.565762+01:00","level":"INFO","msg":"committed changes to shadow branch","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","shadow_branch":"entire/eba3995-e3b0c4"}
{"time":"2026-03-11T05:43:59.565768+01:00","level":"INFO","msg":"checkpoint saved","session_id":"42401d2e-7d1c-4c0e-abe6-356cb2d48747","component":"checkpoint","agent":"claude-code","strategy":"manual-commit","checkpoint_type":"session","checkpoint_count":1,"modified_files":3,"new_files":3,"deleted_files":0,"shadow_branch":"entire/eba3995-e3b0c4","branch_created":false}
File diff suppressed because one or more lines are too long
@@ -1 +1 @@
There must also be bugs in jobQuery.go. In particular, the following query triggers the memory leak: SELECT * FROM job WHERE job.job_state IN ("completed", "running", "failed") ORDER BY job.start_time DESC LIMIT 1 OFFSET 10; Dig deeper to find the cause. Also investigate why no existing index is used for this query.
What optimal index selection do you suggest? The job table contains almost 20 million jobs in production, causing timeouts on any filter query that does not use an index.
@@ -1,27 +1,9 @@
All passes. Here's a summary of findings and fixes:
|
||||
|
||||
---
|
||||
|
||||
## Findings
The plan reduces 79 indexes to 20 by:
|
||||
|
||||
### Bug 1: Missing `defer rows.Close()` in `user.go:GetLdapUsernames()` (line 100)
**Fixed.** Every call leaked a database connection and its buffers.
- **Eliminating entire categories**: all `shared`-prefixed (16), all user/project sort variants (11), all standalone single-filter sorts (11)
- **Consolidating**: cluster+jobstate sort variants replaced by `(cluster, sort_col)` which works for any state combo with LIMIT
- **Keeping what matters**: the indexes SQLite actually picks with ANALYZE — `jobs_starttime`, `jobs_cluster_starttime_duration`, `jobs_user_starttime_duration`, `jobs_cluster_jobstate_duration_starttime`, and a few secondary sort indexes
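
As a rough illustration of the consolidation idea in the list above, the following sketch uses Python's built-in `sqlite3` module with a toy `job` table. Python is used only to make the example runnable; cc-backend itself is written in Go. The index name `jobs_cluster_starttime`, the column subset, and the sample data are all hypothetical. It checks whether a single `(cluster, start_time)` index can serve a sorted, limited query while the state filter is applied per row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job ("
    " id INTEGER PRIMARY KEY, cluster TEXT, job_state TEXT, start_time INTEGER)"
)

# Toy data: two clusters, three states, strictly increasing start times.
states = ["completed", "running", "failed"]
conn.executemany(
    "INSERT INTO job (cluster, job_state, start_time) VALUES (?, ?, ?)",
    [(f"cluster{i % 2}", states[i % 3], 1000 + i) for i in range(1000)],
)

# Hypothetical consolidated index: (cluster, sort_col) with sort_col = start_time.
conn.execute("CREATE INDEX jobs_cluster_starttime ON job (cluster, start_time)")
conn.commit()

query = (
    "SELECT id, start_time FROM job"
    " WHERE cluster = ? AND job_state IN ('completed', 'failed')"
    " ORDER BY start_time DESC LIMIT 5"
)

# The fourth column of EXPLAIN QUERY PLAN holds the human-readable step.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query, ("cluster0",))]
result = conn.execute(query, ("cluster0",)).fetchall()

for step in plan:
    print(step)
print(result)
```

On this toy table the plan should walk the index in reverse and filter states row by row, with no separate sort step. Whether the same holds on a 20-million-row production table depends on the real schema and statistics, so the plan should be checked there as well.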
|
||||
|
||||
### Bug 2: SQLite never runs `ANALYZE` — wrong index selection
**Fixed.** Added `ANALYZE` to `setupSqlite()` in `dbConnection.go`.
|
||||
|
||||
Without statistics in `sqlite_stat1`, SQLite's query planner **guessed** and chose `jobs_jobstate_energy` (a terrible index for this query) instead of scanning `jobs_starttime` in reverse. The result:
|
||||
|
||||
| | Before ANALYZE | After ANALYZE |
|---|---|---|
| Index used | `jobs_jobstate_energy` | `jobs_starttime` |
| Sort | **TEMP B-TREE** (materializes ALL rows) | None (index order) |
| Memory | Proportional to total matching rows | Constant |
| I/O | Full scan of all matching rows | Stops at OFFSET+LIMIT |
|
||||
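
The effect of `ANALYZE` on planner statistics can be inspected with a small experiment. The sketch below uses Python's built-in `sqlite3` module as an illustration only (the actual fix lives in Go, in `setupSqlite()`); the toy schema and data are assumptions, and only the two index names are taken from the findings above. It records which indexes have entries in `sqlite_stat1` and prints the query plan before and after `ANALYZE`. On a table this small the chosen plans may well differ from production, so no particular plan is claimed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job ("
    " id INTEGER PRIMARY KEY, job_state TEXT, start_time INTEGER, energy REAL)"
)
states = ["completed", "running", "failed"]
conn.executemany(
    "INSERT INTO job (job_state, start_time, energy) VALUES (?, ?, ?)",
    [(states[i % 3], 1000 + i, float(i % 7)) for i in range(1000)],
)
conn.execute("CREATE INDEX jobs_starttime ON job (start_time)")
conn.execute("CREATE INDEX jobs_jobstate_energy ON job (job_state, energy)")
conn.commit()

QUERY = (
    "SELECT * FROM job"
    " WHERE job_state IN ('completed', 'running', 'failed')"
    " ORDER BY start_time DESC LIMIT 1 OFFSET 10"
)


def analyzed_indexes(conn):
    """Return the index names that have planner statistics in sqlite_stat1."""
    has_table = conn.execute(
        "SELECT count(*) FROM sqlite_master WHERE name = 'sqlite_stat1'"
    ).fetchone()[0]
    if not has_table:  # the table is only created by the first ANALYZE
        return []
    rows = conn.execute(
        "SELECT idx FROM sqlite_stat1 WHERE tbl = 'job' ORDER BY idx"
    )
    return [r[0] for r in rows]


def query_plan(conn):
    """Return the human-readable steps of EXPLAIN QUERY PLAN for QUERY."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]


stats_before = analyzed_indexes(conn)
plan_before = query_plan(conn)
conn.execute("ANALYZE")
stats_after = analyzed_indexes(conn)
plan_after = query_plan(conn)
row = conn.execute(QUERY).fetchone()

print("before:", stats_before, plan_before)
print("after: ", stats_after, plan_after)
```

Comparing `plan_before` and `plan_after` against a copy of the production database is the more meaningful check, since the planner's choice depends on the real row counts and value distribution.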
|
||||
### Bug 3: `IN` clause + `ORDER BY` is fundamentally incompatible with composite indexes
|
||||
|
||||
Even with the "correct" index `(job_state, start_time)`, SQLite **cannot** merge-sort across 3 separate index range scans for `IN ('completed','running','failed')`. It always falls back to a temp B-tree sort. The only efficient plan is to use the standalone `jobs_starttime` index — which SQLite does automatically **after ANALYZE** because it realizes the 3 states cover virtually all rows, making the WHERE clause nearly a no-op.
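
The sort behaviour described here can be reproduced on a toy table. This is a minimal sketch using Python's built-in `sqlite3` module; the schema, the data, and the deliberate absence of a standalone `start_time` index are assumptions made for illustration. It compares the plan for a single-state filter against the plan for the three-state `IN` filter when only a `(job_state, start_time)` index exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job (id INTEGER PRIMARY KEY, job_state TEXT, start_time INTEGER)"
)
states = ["completed", "running", "failed", "cancelled"]
conn.executemany(
    "INSERT INTO job (job_state, start_time) VALUES (?, ?)",
    [(states[i % 4], 1000 + i) for i in range(1000)],
)
# Only the composite index exists here; there is no standalone start_time index.
conn.execute("CREATE INDEX jobs_jobstate_starttime ON job (job_state, start_time)")
conn.commit()


def plan(sql):
    """Return the human-readable steps of EXPLAIN QUERY PLAN for sql."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


tail = " ORDER BY start_time DESC LIMIT 1 OFFSET 10"
single_state = "SELECT id FROM job WHERE job_state = 'completed'" + tail
multi_state = (
    "SELECT id FROM job"
    " WHERE job_state IN ('completed', 'running', 'failed')" + tail
)

single_plan = plan(single_state)
multi_plan = plan(multi_state)
print(single_plan)  # equality on job_state: index order already matches ORDER BY
print(multi_plan)   # IN over several states: a temp B-tree sort step appears
```

With one state the composite index delivers rows already ordered by `start_time`; with several states the plan includes a `USE TEMP B-TREE FOR ORDER BY` step, which is the fallback described above.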
|
||||
|
||||
### Observation: 79 indexes on the `job` table
This is excessive and actively harmful — it confuses the query planner (especially without ANALYZE) and slows writes. The `jobs_jobstate_starttime` index from migration 08 is also missing from the actual DB (only the 3-column `jobs_jobstate_starttime_duration` exists). This is worth investigating separately but is a schema/migration concern, not a code bug.
|
||||
Key trade-off: ~20% of queries that sort by rare columns (num_hwthreads, num_acc, energy) with a state filter will now do a cheap per-row state check instead of using a 3-column composite. With LIMIT this is negligible.