mirror of
https://github.com/ClusterCockpit/cc-specifications.git
synced 2025-07-23 05:11:41 +02:00
Update specs
49
job-archive/README.md
Normal file
@@ -0,0 +1,49 @@
# File based archive specification for HPC jobs

This is a JSON-file-based exchange format for HPC job metadata and performance metric data.

It consists of two parts:
* a SQLite database schema for job metadata and performance statistics
* a JSON file format together with a directory hierarchy specification

An open, portable, and simple file-based specification makes it possible to
exchange job performance data for research and analysis, and provides a robust
way to archive job performance data on disk.

## Directory hierarchy and file specification

The job archive has top-level directories named after the clusters. Every
cluster directory must contain one file named `cluster.json` describing the
cluster. The JSON schema for this file is described here. Within this directory
a three-level directory tree is used to organize job files.
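
The cluster-level rule can be checked with a short script. The following is a minimal sketch in Python, not part of the specification; the function name `clusters_missing_descriptor` is hypothetical, and it assumes that every sub-directory of the archive root is a cluster directory.

```python
from pathlib import Path


def clusters_missing_descriptor(archive_root):
    """Return names of cluster directories that lack a cluster.json file.

    Hypothetical helper: every sub-directory of archive_root is treated
    as a cluster directory.
    """
    root = Path(archive_root)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and not (entry / "cluster.json").is_file()
    )
```

An empty result means every cluster directory carries its `cluster.json`.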

To limit the number of entries in any single directory, the integer job ID is
split across directory levels: the first level is the job ID divided by 1000
(integer division), the second level is the remainder, zero-padded to three
digits.

For this two-level split, the path can be computed as follows (code example in Perl):
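
```perl
# $trunk and $destdir are set elsewhere (presumably the archive root and the cluster directory)
$level1 = int($jobID / 1000);   # everything above the last three digits
$level2 = $jobID % 1000;        # last three digits
$dstPath = sprintf("%s/%s/%d/%03d", $trunk, $destdir, $level1, $level2);
```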

The last directory level is the Unix epoch timestamp in seconds, which keeps
job directories distinct when job IDs overflow and are reused.

Example:

For the job ID 1034871 the directory path is `./1034/871/<timestamp>/`.
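
The same split can be written in Python. This is a sketch rather than a reference implementation: `job_directory` is a hypothetical name, the timestamp is passed in by the caller, and the value used in the example call is arbitrary.

```python
from pathlib import PurePosixPath


def job_directory(job_id, timestamp):
    """Build the relative job directory <level1>/<level2>/<timestamp>.

    job_id: integer job ID
    timestamp: Unix epoch timestamp in seconds (supplied by the caller)
    """
    level1 = job_id // 1000   # integer division, e.g. 1034 for 1034871
    level2 = job_id % 1000    # remainder, e.g. 871 for 1034871
    # The second level is zero-padded to three digits, like %03d in Perl.
    return PurePosixPath(str(level1), f"{level2:03d}", str(timestamp))


print(job_directory(1034871, 1600000000))  # 1034/871/1600000000
```

Zero-padding the second level keeps directory names at a uniform width, so job ID 5 ends up under `0/005/`.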

The job data consists of two files:

* `meta.json`: Contains job meta information and job statistics.
* `data.json`: Contains the complete job data with time series.

The JSON format of these files is available as a JSON schema.
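
Reading a job back from the archive then amounts to parsing these two files. The following is a minimal sketch, assuming both files sit directly in the job directory; `load_job` is a hypothetical helper, and no particular keys inside the files are assumed.

```python
import json
from pathlib import Path


def load_job(job_dir):
    """Parse meta.json and data.json from one job directory.

    Returns a (meta, data) tuple of the decoded JSON documents.
    """
    job_dir = Path(job_dir)
    with open(job_dir / "meta.json", encoding="utf-8") as f:
        meta = json.load(f)
    with open(job_dir / "data.json", encoding="utf-8") as f:
        data = json.load(f)
    return meta, data
```

Because `meta.json` holds only meta information and statistics, a tool that does not need the time series could read that file alone.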

Metric time series data is stored with a fixed time step, which can be set per
metric. If no value is available for a timestamp in a metric time series,
`null` must be entered.
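
To illustrate this rule, the sketch below places irregular samples onto a fixed time grid and fills gaps with `None`, which Python's `json` module serializes as `null`. The function name, its parameters, and the example values are assumptions for illustration; the actual field layout is defined by the JSON schema.

```python
import json


def to_fixed_step(samples, start, timestep, count):
    """Return `count` values on a fixed grid starting at `start`.

    samples: dict mapping Unix timestamps (seconds) to measured values
    timestep: grid spacing in seconds
    Grid points without a sample become None (JSON null).
    """
    return [samples.get(start + i * timestep) for i in range(count)]


# Hypothetical measurements with a 60 s time step; the value at 1060 is missing.
samples = {1000: 0.5, 1120: 0.7}
series = to_fixed_step(samples, start=1000, timestep=60, count=3)
print(json.dumps(series))  # [0.5, null, 0.7]
```

Keeping the `null` entries preserves the position of every value on the grid, so a reader can recover each timestamp from the index and the time step.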