Commit Graph

  • e7f54aa1b7 Don't monitor jobs with +NoMonitoring=true. (main) Joachim Meyer 2024-07-03 14:06:58 -0700
  • 0a70035977 Add MIT license file. Joachim Meyer 2023-07-20 11:16:38 +0200
  • 6cc171d7aa Cleanup unused systemd files. Joachim Meyer 2023-07-20 11:12:24 +0200
  • 8c28b2f287 Add timeout state for 12hr preempted jobs. Joachim Meyer 2023-07-06 09:33:24 +0200
  • e70d377047 Make gpu map script return the normalized node names already. Joachim Meyer 2023-06-29 10:01:19 +0200
  • 7a0c41e378 Make stop script more robust. Joachim Meyer 2023-05-08 09:40:50 +0200
  • 89f6440e0f Add new script to stop jobs that were (for some reason) not stopped in CC. Joachim Meyer 2023-03-22 10:51:00 +0100
  • 4259611495 GPU map needs all 8 leading 0s.. Joachim Meyer 2023-02-08 09:10:25 +0100
  • 063713d18c We have the gpu map for that... :/ Joachim Meyer 2023-02-08 08:21:01 +0100
  • 4c22c4c6eb Need all 8 zeros for cc-metric-collector. Joachim Meyer 2023-02-07 17:01:34 +0100
  • 019bfa5ee9 JobCurrentStartDate & EnteredCurrentStatus differ. Joachim Meyer 2022-12-20 17:15:01 +0100
  • e38e6fa5b9 Don't start on JobStatus chng & stop on R -> IDLE. Joachim Meyer 2022-12-19 11:24:24 +0100
  • 6128b58cbd Fix AssignedGPUs parsing. Joachim Meyer 2022-12-16 13:42:04 +0100
  • 80619b6154 Limit verbosity a bit (SCHEDD_DEBUG=D_FULLDEBUG). Joachim Meyer 2022-12-16 09:35:10 +0100
  • b530d9034e Restructure. Joachim Meyer 2022-12-15 16:20:26 +0100
  • a3ca962d84 Add Schedd plugin to synch with CC. Joachim Meyer 2022-12-15 16:11:42 +0100
  • d83f263dba Also stop jobs that ended with shadow exception. Joachim Meyer 2022-11-29 15:48:05 +0100
  • 6175affa55 ULOG_JOB_RECONNECT_FAILED is a stop reason. Joachim Meyer 2022-11-16 13:07:01 +0100
  • 09334ab4f1 Offset ArrayJobId with submitnode id.. Joachim Meyer 2022-11-09 14:27:53 +0100
  • 9571f3cda6 Don't check outdated cc job data. Joachim Meyer 2022-11-09 11:47:59 +0100
  • 9e96f65977 Handle held / requeued jobs. Joachim Meyer 2022-11-09 10:30:04 +0100
  • 21cdece420 Use value from actually used schema. Joachim Meyer 2022-11-08 17:45:47 +0100
  • 253784d94d If no ToE, use eventtime. Joachim Meyer 2022-11-08 17:40:49 +0100
  • 35c6ee3b47 Disable event 4. Joachim Meyer 2022-11-08 17:27:51 +0100
  • fe641ca357 Fix file name Joachim Meyer 2022-11-08 16:47:59 +0100
  • 308df9907e Start revamping to use htcondor EventLog not slurm Joachim Meyer 2022-11-04 16:25:48 +0100
  • c9aa4095fe Add systemd service and timer to start this script every minute Michael Schwarz 2022-09-06 15:02:28 +0200
  • 57593358a2 Ignore tagged jobs Michael Schwarz 2022-08-30 11:00:25 +0200
  • 631ed6c8b6 Little bugfix, there might be failed jobs without a step Michael Schwarz 2022-08-30 11:00:04 +0200
  • 483bc0da1d Fix some layout issues in Readme.md Michael Schwarz 2022-08-25 16:09:22 +0200
  • 54fbc4fa93 Initial commit Michael Schwarz 2022-08-25 15:38:06 +0200
  • 84d49f2807 Initial commit oscarminus 2022-08-25 15:30:56 +0200