Merge branch 'hotfix' of https://github.com/ClusterCockpit/cc-backend into hotfix
@@ -41,8 +41,9 @@ versions of third party packages.
 
 ## Demo Setup
 
-We provide a shell skript that downloads demo data and automatically builds and starts cc-backend.
-You need `wget`, `go`, `node`, `rollup` and `yarn` in your path to start the demo. The demo will download 32MB of data (223MB on disk).
+We provide a shell script that downloads demo data and automatically builds and
+starts cc-backend. You need `wget`, `go`, `node`, and `npm` in your path to start
+the demo. The demo will download 32MB of data (223MB on disk).
 
 ```sh
 git clone https://github.com/ClusterCockpit/cc-backend.git

@@ -0,0 +1,78 @@
The job archive specifies an exchange format for job meta and performance metric
data. It consists of two parts:
* a [SQLite database schema](https://github.com/ClusterCockpit/cc-backend/wiki/Job-Archive#sqlite-database-schema) for job metadata and performance statistics
* a [JSON file format](https://github.com/ClusterCockpit/cc-backend/wiki/Job-Archive#json-file-format) together with a [directory hierarchy specification](https://github.com/ClusterCockpit/cc-backend/wiki/Job-Archive#directory-hierarchy-specification)

Because the specification is open, portable, and based on simple files, it can
be used both to exchange job performance data for research and analysis
purposes and as a robust way to archive job performance data to disk.

# SQLite database schema
## Introduction

A SQLite 3 database schema is provided to standardize the job metadata in a
portable way. The schema also includes optional columns for job performance
statistics (called a job performance footprint). The database acts as a front
end for filtering and selecting subsets of job IDs, which are the keys to
retrieve the full job performance data from the job performance tree hierarchy.

## Database schema

The schema includes three tables: the job table, a tag table, and a jobtag
table representing the many-to-many relation between jobs and tags. The SQL
schema is specified
[here](https://github.com/ClusterCockpit/cc-specifications/blob/master/schemas/jobs-sqlite.sql).
The various columns, including the JSON data types, are documented
[here](https://github.com/ClusterCockpit/cc-specifications/blob/master/datastructures/job-meta.schema.json).
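
As a minimal sketch of how the database can serve as a front end for selecting
job IDs, the following Go snippet queries a SQLite job archive. The database
file name and the `job` table columns (`job_id`, `cluster`, `duration`) are
assumptions based on the linked schema, and the common `mattn/go-sqlite3`
driver is used; consult the SQL schema for the authoritative layout.

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/mattn/go-sqlite3" // SQLite driver registration
)

func main() {
	// "./jobs.db" is an assumed file name for the archive database.
	db, err := sql.Open("sqlite3", "./jobs.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Select the IDs of long-running jobs on one cluster; the returned
	// job IDs are the keys into the job performance tree hierarchy.
	rows, err := db.Query(
		"SELECT job_id FROM job WHERE cluster = ? AND duration > ?",
		"emmy", 3600)
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	for rows.Next() {
		var jobID int64
		if err := rows.Scan(&jobID); err != nil {
			log.Fatal(err)
		}
		fmt.Println(jobID)
	}
}
```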

# Directory hierarchy specification

## Specification

To keep the number of entries within a single directory manageable, a tree
approach is used that splits the integer job ID. The job ID is split into
chunks of 1000 each. Usually two levels of directories are sufficient, but the
concept works for an arbitrary number of levels.

For a two-level scheme this can be achieved as follows (code example in Perl):
```perl
# Split the job ID into the upper part and the last three digits,
# then build the destination path with the last part zero-padded.
$level1 = int($jobID / 1000);
$level2 = $jobID % 1000;
$dstPath = sprintf("%s/%s/%d/%03d", $trunk, $destdir, $level1, $level2);
```

## Example

For job ID 1034871 the directory path is `./1034/871/`.
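
The same split can be written in Go, the language of `cc-backend`; a minimal
sketch with illustrative function and variable names:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// jobPath splits the integer job ID into two directory levels: the
// upper part (jobID / 1000) and the last three digits, zero-padded
// (jobID % 1000).
func jobPath(root string, jobID int64) string {
	level1 := jobID / 1000
	level2 := jobID % 1000
	return filepath.Join(root, fmt.Sprintf("%d", level1), fmt.Sprintf("%03d", level2))
}

func main() {
	fmt.Println(jobPath(".", 1034871)) // prints "1034/871"
}
```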

# JSON file format
## Overview

Every cluster must be configured in a `cluster.json` file.

The job data consists of two files:
* `meta.json`: Contains job meta information and job statistics.
* `data.json`: Contains the complete job data, including the metric time series.

The JSON format specification is available as a [JSON Schema](https://json-schema.org/)
file. The latest version of the JSON schema is part of the `cc-backend` source
tree. For external reference it is also available in a separate repository.

## Specification `cluster.json`

The JSON schema specification is available
[here](https://github.com/ClusterCockpit/cc-specifications/blob/master/datastructures/cluster.schema.json).

## Specification `meta.json`

The JSON schema specification is available
[here](https://github.com/RRZE-HPC/HPCJobDatabase/blob/master/json-schema/job-meta.schema.json).
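
As a hedged sketch of how `meta.json` can be consumed, the following Go snippet
decodes a few selected fields. The field names (`jobId`, `cluster`, `numNodes`,
`startTime`, `duration`) are assumptions for illustration; consult the linked
schema for the authoritative list.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"os"
)

// JobMeta maps a subset of assumed meta.json fields; see the linked
// schema for the full, authoritative structure.
type JobMeta struct {
	JobID     int64  `json:"jobId"`
	Cluster   string `json:"cluster"`
	NumNodes  int    `json:"numNodes"`
	StartTime int64  `json:"startTime"`
	Duration  int64  `json:"duration"`
}

func main() {
	b, err := os.ReadFile("meta.json")
	if err != nil {
		log.Fatal(err)
	}
	var meta JobMeta
	if err := json.Unmarshal(b, &meta); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("job %d ran on %d nodes of %s\n", meta.JobID, meta.NumNodes, meta.Cluster)
}
```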

## Specification `data.json`

The JSON schema specification is available
[here](https://github.com/RRZE-HPC/HPCJobDatabase/blob/master/json-schema/job-data.schema.json).
Metric time series data is stored with a fixed time step, which is set per
metric. If no value is available for a given time series timestamp, `null` is
entered.
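
Because missing values are encoded as `null`, a decoder should use a nullable
type for the series values. A minimal Go sketch follows; the JSON layout shown
is an assumption for illustration, not the exact `data.json` structure:

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// series holds one metric time series; *float64 lets a JSON null
// decode to nil instead of 0, preserving "no value available".
type series struct {
	Timestep int64      `json:"timestep"`
	Data     []*float64 `json:"data"`
}

func main() {
	raw := []byte(`{"timestep": 60, "data": [42.0, null, 47.5]}`)
	var s series
	if err := json.Unmarshal(raw, &s); err != nil {
		log.Fatal(err)
	}
	for i, v := range s.Data {
		t := int64(i) * s.Timestep // fixed time step per metric
		if v == nil {
			fmt.Printf("t=%d: no value\n", t)
		} else {
			fmt.Printf("t=%d: %g\n", t, *v)
		}
	}
}
```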

@@ -363,6 +363,7 @@ func (fsa *FsArchive) CompressLast(starttime int64) int64 {
 	b, err := os.ReadFile(filename)
 	if err != nil {
 		log.Errorf("fsBackend Compress - %v", err)
+		os.WriteFile(filename, []byte(fmt.Sprintf("%d", starttime)), 0644)
 		return starttime
 	}
 	last, err := strconv.ParseInt(strings.TrimSuffix(string(b), "\n"), 10, 64)