Commit Graph

19 Commits

Author SHA1 Message Date
Joachim Meyer
80619b6154 Limit verbosity a bit (SCHEDD_DEBUG=D_FULLDEBUG). 2022-12-16 09:35:10 +01:00
Joachim Meyer
b530d9034e Restructure. 2022-12-15 16:20:26 +01:00
Joachim Meyer
a3ca962d84 Add Schedd plugin to synch with CC.
This should be much more reliable, albeit being more prone to crash a HTCondor component (the schedd) if there's a bug...
2022-12-15 16:13:45 +01:00
Joachim Meyer
d83f263dba Also stop jobs that ended with shadow exception. 2022-11-29 15:48:05 +01:00
Joachim Meyer
6175affa55 ULOG_JOB_RECONNECT_FAILED is a stop reason. 2022-11-16 13:07:01 +01:00
Joachim Meyer
09334ab4f1 Offset ArrayJobId with submitnode id. 2022-11-09 14:27:53 +01:00
Joachim Meyer
9571f3cda6 Don't check outdated cc job data. 2022-11-09 11:47:59 +01:00
Joachim Meyer
9e96f65977 Handle held / requeued jobs.
Requires cc-backend patch proposed in:
https://github.com/ClusterCockpit/cc-backend/issues/30
(Upstream assumes startTime to be non-unique if they happened in the same 24hrs).
2022-11-09 10:30:04 +01:00
Joachim Meyer
21cdece420 Use value from actually used schema. 2022-11-08 17:45:47 +01:00
Joachim Meyer
253784d94d If no ToE, use eventtime. 2022-11-08 17:40:49 +01:00
Joachim Meyer
35c6ee3b47 Disable event 4. 2022-11-08 17:27:51 +01:00
Joachim Meyer
fe641ca357 Fix file name 2022-11-08 16:47:59 +01:00
Joachim Meyer
308df9907e Start revamping to use htcondor EventLog not slurm 2022-11-04 16:25:48 +01:00
Michael Schwarz
c9aa4095fe Add systemd service and timer to start this script every minute 2022-09-06 15:02:28 +02:00
Michael Schwarz
57593358a2 Ignore tagged jobs 2022-08-30 11:00:25 +02:00
Michael Schwarz
631ed6c8b6 Little bugfix, there might be failed jobs without a step 2022-08-30 11:00:04 +02:00
Michael Schwarz
483bc0da1d Fix some layout issues in Readme.md 2022-08-25 16:09:22 +02:00
Michael Schwarz
54fbc4fa93 Initial commit 2022-08-25 15:38:06 +02:00
oscarminus
84d49f2807 Initial commit 2022-08-25 15:30:56 +02:00