mirror of
https://github.com/ClusterCockpit/cc-backend
synced 2025-12-31 10:56:15 +01:00
Merge branch 'dev' of github.com:ClusterCockpit/cc-backend into dev
215
CLAUDE.md
Normal file
@@ -0,0 +1,215 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with
code in this repository.

## Project Overview

ClusterCockpit is a job-specific performance monitoring framework for HPC
clusters. This is a Golang backend that provides REST and GraphQL APIs, serves a
Svelte-based frontend, and manages job archives and metric data from various
time-series databases.

## Build and Development Commands

### Building

```bash
# Build everything (frontend + backend)
make

# Build only the frontend
make frontend

# Build only the backend (requires frontend to be built first)
# Double quotes let the shell expand $(date ...) and $(git ...) before go build runs
go build -ldflags="-s -X main.date=$(date +%Y-%m-%d:T%H:%M:%S) -X main.version=1.4.4 -X main.commit=$(git rev-parse --short HEAD)" ./cmd/cc-backend
```

### Testing

```bash
# Run all tests
make test

# Run tests with verbose output
go test -v ./...

# Run tests for a specific package
go test ./internal/repository
```

### Code Generation

```bash
# Regenerate GraphQL schema and resolvers (after modifying api/*.graphqls)
make graphql

# Regenerate Swagger/OpenAPI docs (after modifying API comments)
make swagger
```

### Frontend Development

```bash
cd web/frontend

# Install dependencies
npm install

# Build for production
npm run build

# Development mode with watch
npm run dev
```

### Running

```bash
# Initialize database and create admin user
./cc-backend -init-db -add-user demo:admin:demo

# Start server in development mode (enables GraphQL Playground and Swagger UI)
./cc-backend -server -dev -loglevel info

# Start demo with sample data
./startDemo.sh
```

## Architecture

### Backend Structure

The backend follows a layered architecture with clear separation of concerns:

- **cmd/cc-backend**: Entry point, orchestrates initialization of all subsystems
- **internal/repository**: Data access layer using repository pattern
  - Abstracts database operations (SQLite3 only)
  - Implements LRU caching for performance
  - Provides repositories for Job, User, Node, and Tag entities
  - Transaction support for batch operations
- **internal/api**: REST API endpoints (Swagger/OpenAPI documented)
- **internal/graph**: GraphQL API (uses gqlgen)
  - Schema in `api/*.graphqls`
  - Generated code in `internal/graph/generated/`
  - Resolvers in `internal/graph/schema.resolvers.go`
- **internal/auth**: Authentication layer
  - Supports local accounts, LDAP, OIDC, and JWT tokens
  - Implements rate limiting for login attempts
- **internal/metricdata**: Metric data repository abstraction
  - Pluggable backends: cc-metric-store, Prometheus, InfluxDB
  - Each cluster can have a different metric data backend
- **internal/archiver**: Job archiving to file-based archive
- **pkg/archive**: Job archive backend implementations
  - File system backend (default)
  - S3 backend
  - SQLite backend (experimental)
- **pkg/nats**: NATS integration for metric ingestion

### Frontend Structure

- **web/frontend**: Svelte 5 application
  - Uses Rollup for building
  - Components organized by feature (analysis, job, user, etc.)
  - GraphQL client using @urql/svelte
  - Bootstrap 5 + SvelteStrap for UI
  - uPlot for time-series visualization
- **web/templates**: Server-side Go templates

### Key Concepts

**Job Archive**: Completed jobs are stored in a file-based archive following the
[ClusterCockpit job-archive
specification](https://github.com/ClusterCockpit/cc-specifications/tree/master/job-archive).
Each job has a `meta.json` file with metadata, along with metric data files.

**Metric Data Repositories**: Time-series metric data is stored separately from
job metadata. The system supports multiple backends (cc-metric-store is
recommended). Configuration is per-cluster in `config.json`.

**Authentication Flow**:

1. Multiple authenticators can be configured (local, LDAP, OIDC, JWT)
2. Each authenticator's `CanLogin` method is called to determine if it should handle the request
3. The first authenticator that returns true performs the actual `Login`
4. JWT tokens are used for API authentication
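
The dispatch order in the steps above can be sketched in Go roughly as follows. This is a minimal illustration, not the code in `internal/auth`: only the method names `CanLogin` and `Login` come from the text, while the interface signature, the `User` type, the `login` helper, and the toy `staticAuth` authenticator are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// User is a hypothetical stand-in for the real user type.
type User struct {
	Name   string
	Source string
}

// Authenticator is an assumed interface shape; only the method names
// CanLogin and Login are taken from the description above.
type Authenticator interface {
	CanLogin(username string) bool
	Login(username, password string) (*User, error)
}

// ErrNoAuthenticator is returned when no authenticator claims the request.
var ErrNoAuthenticator = errors.New("no authenticator accepted the request")

// login asks each authenticator in order and lets the first one whose
// CanLogin returns true perform the actual Login.
func login(auths []Authenticator, username, password string) (*User, error) {
	for _, a := range auths {
		if a.CanLogin(username) {
			return a.Login(username, password)
		}
	}
	return nil, ErrNoAuthenticator
}

// staticAuth is a toy authenticator backed by an in-memory password map.
type staticAuth struct {
	source string
	users  map[string]string
}

func (s staticAuth) CanLogin(username string) bool {
	_, ok := s.users[username]
	return ok
}

func (s staticAuth) Login(username, password string) (*User, error) {
	if s.users[username] != password {
		return nil, fmt.Errorf("%s: invalid credentials for %q", s.source, username)
	}
	return &User{Name: username, Source: s.source}, nil
}

func main() {
	auths := []Authenticator{
		staticAuth{source: "ldap", users: map[string]string{"alice": "secret"}},
		staticAuth{source: "local", users: map[string]string{"demo": "demo"}},
	}
	u, err := login(auths, "demo", "demo")
	fmt.Println(u.Source, err)
}
```

Because the first authenticator that claims a user also decides the outcome, a failed `Login` in this sketch does not fall through to later authenticators; whether the real implementation behaves the same way is not stated in the text.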

**Database Migrations**: SQL migrations in `internal/repository/migrations/` are
applied automatically on startup. Version tracking in `version` table.

**Scopes**: Metrics can be collected at different scopes:

- Node scope (always available)
- Core scope (for jobs with ≤8 nodes)
- Accelerator scope (for GPU/accelerator metrics)

## Configuration

- **config.json**: Main configuration (clusters, metric repositories, archive settings)
- **.env**: Environment variables (secrets like JWT keys)
  - Copy from `configs/env-template.txt`
  - NEVER commit this file
- **cluster.json**: Cluster topology and metric definitions (loaded from archive or config)

## Database

- Default: SQLite 3 (`./var/job.db`)
- Connection managed by `internal/repository`
- Schema version in `internal/repository/migration.go`

## Code Generation

**GraphQL** (gqlgen):

- Schema: `api/*.graphqls`
- Config: `gqlgen.yml`
- Generated code: `internal/graph/generated/`
- Custom resolvers: `internal/graph/schema.resolvers.go`
- Run `make graphql` after schema changes

**Swagger/OpenAPI**:

- Annotations in `internal/api/*.go`
- Generated docs: `api/docs.go`, `api/swagger.yaml`
- Run `make swagger` after API changes

## Testing Conventions

- Test files use `_test.go` suffix
- Test data in `testdata/` subdirectories
- Repository tests use in-memory SQLite
- API tests use httptest
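
As an illustration of the httptest convention, a handler can be exercised without opening a network port. The `/api/ping/` route, `pingHandler`, and `callPing` are hypothetical names that do not exist in cc-backend; in a real test this logic would sit in a `_test.go` file and report failures through `*testing.T`.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// pingHandler is a made-up handler used only to demonstrate the pattern.
func pingHandler(rw http.ResponseWriter, r *http.Request) {
	rw.Header().Set("Content-Type", "application/json")
	rw.WriteHeader(http.StatusOK)
	io.WriteString(rw, `{"status":"ok"}`)
}

// callPing builds an in-memory request, records the response, and
// returns the status code and body for inspection.
func callPing() (int, string) {
	req := httptest.NewRequest(http.MethodGet, "/api/ping/", nil)
	rec := httptest.NewRecorder()
	pingHandler(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := callPing()
	fmt.Println(code, body)
}
```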

## Common Workflows

### Adding a new GraphQL field

1. Edit schema in `api/*.graphqls`
2. Run `make graphql`
3. Implement resolver in `internal/graph/schema.resolvers.go`

### Adding a new REST endpoint

1. Add handler in `internal/api/*.go`
2. Add route in `internal/api/rest.go`
3. Add Swagger annotations
4. Run `make swagger`

### Adding a new metric data backend

1. Implement `MetricDataRepository` interface in `internal/metricdata/`
2. Register in `metricdata.Init()` switch statement
3. Update config.json schema documentation

### Modifying database schema

1. Create new migration in `internal/repository/migrations/`
2. Increment `repository.Version`
3. Test with fresh database and existing database

## Dependencies

- Go 1.24.0+ (check go.mod for exact version)
- Node.js (for frontend builds)
- SQLite 3 (only supported database)
- Optional: NATS server for metric ingestion
11
README.md
@@ -29,12 +29,11 @@ is also served by the backend using [Svelte](https://svelte.dev/) components.
 Layout and styling are based on [Bootstrap 5](https://getbootstrap.com/) using
 [Bootstrap Icons](https://icons.getbootstrap.com/).

-The backend uses [SQLite 3](https://sqlite.org/) as a relational SQL database by
-default. Optionally it can use a MySQL/MariaDB database server. While there are
-metric data backends for the InfluxDB and Prometheus time series databases, the
-only tested and supported setup is to use cc-metric-store as the metric data
-backend. Documentation on how to integrate ClusterCockpit with other time series
-databases will be added in the future.
+The backend uses [SQLite 3](https://sqlite.org/) as the relational SQL database.
+While there are metric data backends for the InfluxDB and Prometheus time series
+databases, the only tested and supported setup is to use cc-metric-store as the
+metric data backend. Documentation on how to integrate ClusterCockpit with other
+time series databases will be added in the future.

 Completed batch jobs are stored in a file-based job archive according to
 [this specification](https://github.com/ClusterCockpit/cc-specifications/tree/master/job-archive).
@@ -105,9 +105,9 @@ func initEnv() {
 		cclog.Abortf("Could not create default ./var folder with permissions '0o777'. Application initialization failed, exited.\nError: %s\n", err.Error())
 	}

-	err := repository.MigrateDB("sqlite3", "./var/job.db")
+	err := repository.MigrateDB("./var/job.db")
 	if err != nil {
-		cclog.Abortf("Could not initialize default sqlite3 database as './var/job.db'. Application initialization failed, exited.\nError: %s\n", err.Error())
+		cclog.Abortf("Could not initialize default SQLite database as './var/job.db'. Application initialization failed, exited.\nError: %s\n", err.Error())
 	}
 	if err := os.Mkdir("var/job-archive", 0o777); err != nil {
 		cclog.Abortf("Could not create default ./var/job-archive folder with permissions '0o777'. Application initialization failed, exited.\nError: %s\n", err.Error())
@@ -40,7 +40,6 @@ import (
 	"github.com/google/gops/agent"
 	"github.com/joho/godotenv"

-	_ "github.com/go-sql-driver/mysql"
 	_ "github.com/mattn/go-sqlite3"
 )

@@ -120,30 +119,30 @@ func initDatabase() error {

 func handleDatabaseCommands() error {
 	if flagMigrateDB {
-		err := repository.MigrateDB(config.Keys.DBDriver, config.Keys.DB)
+		err := repository.MigrateDB(config.Keys.DB)
 		if err != nil {
 			return fmt.Errorf("migrating database to version %d: %w", repository.Version, err)
 		}
-		cclog.Exitf("MigrateDB Success: Migrated '%s' database at location '%s' to version %d.\n",
-			config.Keys.DBDriver, config.Keys.DB, repository.Version)
+		cclog.Exitf("MigrateDB Success: Migrated SQLite database at '%s' to version %d.\n",
+			config.Keys.DB, repository.Version)
 	}

 	if flagRevertDB {
-		err := repository.RevertDB(config.Keys.DBDriver, config.Keys.DB)
+		err := repository.RevertDB(config.Keys.DB)
 		if err != nil {
 			return fmt.Errorf("reverting database to version %d: %w", repository.Version-1, err)
 		}
-		cclog.Exitf("RevertDB Success: Reverted '%s' database at location '%s' to version %d.\n",
-			config.Keys.DBDriver, config.Keys.DB, repository.Version-1)
+		cclog.Exitf("RevertDB Success: Reverted SQLite database at '%s' to version %d.\n",
+			config.Keys.DB, repository.Version-1)
 	}

 	if flagForceDB {
-		err := repository.ForceDB(config.Keys.DBDriver, config.Keys.DB)
+		err := repository.ForceDB(config.Keys.DB)
 		if err != nil {
 			return fmt.Errorf("forcing database to version %d: %w", repository.Version, err)
 		}
-		cclog.Exitf("ForceDB Success: Forced '%s' database at location '%s' to version %d.\n",
-			config.Keys.DBDriver, config.Keys.DB, repository.Version)
+		cclog.Exitf("ForceDB Success: Forced SQLite database at '%s' to version %d.\n",
+			config.Keys.DB, repository.Version)
 	}

 	return nil
@@ -51,7 +51,8 @@ const (
 type Server struct {
 	router *mux.Router
 	server *http.Server
-	apiHandle *api.RestAPI
+	restAPIHandle *api.RestAPI
+	natsAPIHandle *api.NatsAPI
 }

 func onFailureResponse(rw http.ResponseWriter, r *http.Request, err error) {
@@ -104,7 +105,7 @@ func (s *Server) init() error {

 	authHandle := auth.GetAuthInstance()

-	s.apiHandle = api.New()
+	s.restAPIHandle = api.New()

 	info := map[string]any{}
 	info["hasOpenIDConnect"] = false
@@ -240,13 +241,20 @@ func (s *Server) init() error {

 	// Mount all /monitoring/... and /api/... routes.
 	routerConfig.SetupRoutes(secured, buildInfo)
-	s.apiHandle.MountAPIRoutes(securedapi)
-	s.apiHandle.MountUserAPIRoutes(userapi)
-	s.apiHandle.MountConfigAPIRoutes(configapi)
-	s.apiHandle.MountFrontendAPIRoutes(frontendapi)
+	s.restAPIHandle.MountAPIRoutes(securedapi)
+	s.restAPIHandle.MountUserAPIRoutes(userapi)
+	s.restAPIHandle.MountConfigAPIRoutes(configapi)
+	s.restAPIHandle.MountFrontendAPIRoutes(frontendapi)

+	if config.Keys.APISubjects != nil {
+		s.natsAPIHandle = api.NewNatsAPI()
+		if err := s.natsAPIHandle.StartSubscriptions(); err != nil {
+			return fmt.Errorf("starting NATS subscriptions: %w", err)
+		}
+	}
+
 	if memorystore.InternalCCMSFlag {
-		s.apiHandle.MountMetricStoreAPIRoutes(metricstoreapi)
+		s.restAPIHandle.MountMetricStoreAPIRoutes(metricstoreapi)
 	}

 	if config.Keys.EmbedStaticFiles {
@@ -1,64 +0,0 @@
{
  "addr": "127.0.0.1:8080",
  "short-running-jobs-duration": 300,
  "archive": {
    "kind": "file",
    "path": "./var/job-archive"
  },
  "jwts": {
    "max-age": "2000h"
  },
  "db-driver": "mysql",
  "db": "clustercockpit:demo@tcp(127.0.0.1:3306)/clustercockpit",
  "enable-resampling": {
    "trigger": 30,
    "resolutions": [600, 300, 120, 60]
  },
  "emission-constant": 317,
  "clusters": [
    {
      "name": "fritz",
      "metricDataRepository": {
        "kind": "cc-metric-store",
        "url": "http://localhost:8082",
        "token": ""
      },
      "filterRanges": {
        "numNodes": {
          "from": 1,
          "to": 64
        },
        "duration": {
          "from": 0,
          "to": 86400
        },
        "startTime": {
          "from": "2022-01-01T00:00:00Z",
          "to": null
        }
      }
    },
    {
      "name": "alex",
      "metricDataRepository": {
        "kind": "cc-metric-store",
        "url": "http://localhost:8082",
        "token": ""
      },
      "filterRanges": {
        "numNodes": {
          "from": 1,
          "to": 64
        },
        "duration": {
          "from": 0,
          "to": 86400
        },
        "startTime": {
          "from": "2022-01-01T00:00:00Z",
          "to": null
        }
      }
    }
  ]
}
22
configs/startJobPayload.json
Normal file
@@ -0,0 +1,22 @@
{
  "cluster": "fritz",
  "jobId": 123000,
  "jobState": "running",
  "numAcc": 0,
  "numHwthreads": 72,
  "numNodes": 1,
  "partition": "main",
  "requestedMemory": 128000,
  "resources": [{ "hostname": "f0726" }],
  "startTime": 1649723812,
  "subCluster": "main",
  "submitTime": 1649723812,
  "user": "k106eb10",
  "project": "k106eb",
  "walltime": 86400,
  "metaData": {
    "slurmInfo": "JobId=398759\nJobName=myJob\nUserId=dummyUser\nGroupId=dummyGroup\nAccount=dummyAccount\nQOS=normal Requeue=False Restarts=0 BatchFlag=True\nTimeLimit=1439'\nSubmitTime=2023-02-09T14:10:18\nPartition=singlenode\nNodeList=xx\nNumNodes=xx NumCPUs=72 NumTasks=72 CPUs/Task=1\nNTasksPerNode:Socket:Core=0:None:None\nTRES_req=cpu=72,mem=250000M,node=1,billing=72\nTRES_alloc=cpu=72,node=1,billing=72\nCommand=myCmd\nWorkDir=myDir\nStdErr=\nStdOut=\n",
    "jobScript": "#!/bin/bash -l\n#SBATCH --job-name=dummy_job\n#SBATCH --time=23:59:00\n#SBATCH --partition=singlenode\n#SBATCH --ntasks=72\n#SBATCH --hint=multithread\n#SBATCH --chdir=/home/atuin/k106eb/dummy/\n#SBATCH --export=NONE\nunset SLURM_EXPORT_ENV\n\n#This is a dummy job script\n./mybinary\n",
    "jobName": "ams_pipeline"
  }
}
7
configs/stopJobPayload.json
Normal file
@@ -0,0 +1,7 @@
{
  "cluster": "fritz",
  "jobId": 123000,
  "jobState": "completed",
  "startTime": 1649723812,
  "stopTime": 1649763839
}
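For orientation, the stop payload can be decoded with Go's `encoding/json`. The `stopJobRequest` struct, `parseStopJob`, and the `duration` method are hypothetical and simply mirror the JSON keys shown in `configs/stopJobPayload.json`; they are not the types cc-backend uses. The sanity check on the timestamps is likewise an assumption of this sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stopJobRequest mirrors the keys of the example payload (hypothetical type).
type stopJobRequest struct {
	Cluster   string `json:"cluster"`
	JobID     int64  `json:"jobId"`
	JobState  string `json:"jobState"`
	StartTime int64  `json:"startTime"`
	StopTime  int64  `json:"stopTime"`
}

// duration returns the elapsed time in seconds (Unix timestamps).
func (r stopJobRequest) duration() int64 {
	return r.StopTime - r.StartTime
}

// parseStopJob decodes the payload and rejects a stop time that lies
// before the start time.
func parseStopJob(data []byte) (stopJobRequest, error) {
	var req stopJobRequest
	if err := json.Unmarshal(data, &req); err != nil {
		return req, fmt.Errorf("decoding stop job payload: %w", err)
	}
	if req.StopTime < req.StartTime {
		return req, fmt.Errorf("stopTime %d is before startTime %d", req.StopTime, req.StartTime)
	}
	return req, nil
}

const examplePayload = `{
  "cluster": "fritz",
  "jobId": 123000,
  "jobState": "completed",
  "startTime": 1649723812,
  "stopTime": 1649763839
}`

func main() {
	req, err := parseStopJob([]byte(examplePayload))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Cluster, req.JobID, req.duration())
}
```

With the values from the example file, the job ran for 1649763839 - 1649723812 = 40027 seconds.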
4
go.mod
@@ -11,7 +11,7 @@ tool (

 require (
 	github.com/99designs/gqlgen v0.17.84
-	github.com/ClusterCockpit/cc-lib v1.0.0
+	github.com/ClusterCockpit/cc-lib v1.0.2
 	github.com/Masterminds/squirrel v1.5.4
 	github.com/aws/aws-sdk-go-v2 v1.41.0
 	github.com/aws/aws-sdk-go-v2/config v1.31.20
@@ -21,7 +21,6 @@ require (
 	github.com/expr-lang/expr v1.17.6
 	github.com/go-co-op/gocron/v2 v2.18.2
 	github.com/go-ldap/ldap/v3 v3.4.12
-	github.com/go-sql-driver/mysql v1.9.3
 	github.com/golang-jwt/jwt/v5 v5.3.0
 	github.com/golang-migrate/migrate/v4 v4.19.1
 	github.com/google/gops v0.3.28
@@ -48,7 +47,6 @@ require (
 )

 require (
-	filippo.io/edwards25519 v1.1.0 // indirect
 	github.com/Azure/go-ntlmssp v0.0.0-20221128193559-754e69321358 // indirect
 	github.com/KyleBanks/depth v1.2.1 // indirect
 	github.com/agnivade/levenshtein v1.2.1 // indirect
53
go.sum
@@ -2,18 +2,14 @@ filippo.io/edwards25519 v1.1.0 h1:FNf4tywRC1HmFuKW5xopWpigGjJKiJSV0Cqo0cJWDaA=
|
||||
filippo.io/edwards25519 v1.1.0/go.mod h1:BxyFTGdWcka3PhytdK4V28tE5sGfRvvvRV7EaN4VDT4=
|
||||
github.com/99designs/gqlgen v0.17.84 h1:iVMdiStgUVx/BFkMb0J5GAXlqfqtQ7bqMCYK6v52kQ0=
|
||||
github.com/99designs/gqlgen v0.17.84/go.mod h1:qjoUqzTeiejdo+bwUg8unqSpeYG42XrcrQboGIezmFA=
|
||||
github.com/Azure/go-ansiterm v0.0.0-20230124172434-306776ec8161 h1:L/gRVlceqvL25UVaW/CKtUDjefjrs0SPonmDGUVOYP0=
|
||||
github.com/Azure/go-ansiterm v0.0.0-20230124172434-306776ec8161/go.mod h1:xomTg63KZ2rFqZQzSB4Vz2SUXa1BpHTVz9L5PTmPC4E=
|
||||
github.com/Azure/go-ntlmssp v0.0.0-20221128193559-754e69321358 h1:mFRzDkZVAjdal+s7s0MwaRv9igoPqLRdzOLzw/8Xvq8=
|
||||
github.com/Azure/go-ntlmssp v0.0.0-20221128193559-754e69321358/go.mod h1:chxPXzSsl7ZWRAuOIE23GDNzjWuZquvFlgA8xmpunjU=
|
||||
github.com/ClusterCockpit/cc-lib v1.0.0 h1:/8DFRomt4BpVWKWrsEZ/ru4K8x76QTVnEgdwHc5eSps=
|
||||
github.com/ClusterCockpit/cc-lib v1.0.0/go.mod h1:UGdOvXEnjFqlnPSxtvtFwO6BtXYW6NnXFoud9FtN93k=
|
||||
github.com/ClusterCockpit/cc-lib v1.0.2 h1:ZWn3oZkXgxrr3zSigBdlOOfayZ4Om4xL20DhmritPPg=
|
||||
github.com/ClusterCockpit/cc-lib v1.0.2/go.mod h1:UGdOvXEnjFqlnPSxtvtFwO6BtXYW6NnXFoud9FtN93k=
|
||||
github.com/KyleBanks/depth v1.2.1 h1:5h8fQADFrWtarTdtDudMmGsC7GPbOAu6RVB3ffsVFHc=
|
||||
github.com/KyleBanks/depth v1.2.1/go.mod h1:jzSb9d0L43HxTQfT+oSA1EEp2q+ne2uh6XgeJcm8brE=
|
||||
github.com/Masterminds/squirrel v1.5.4 h1:uUcX/aBc8O7Fg9kaISIUsHXdKuqehiXAMQTYX8afzqM=
|
||||
github.com/Masterminds/squirrel v1.5.4/go.mod h1:NNaOrjSoIDfDA40n7sr2tPNZRfjzjA400rg+riTZj10=
|
||||
github.com/Microsoft/go-winio v0.6.2 h1:F2VQgta7ecxGYO8k3ZZz3RS8fVIXVxONVUPlNERoyfY=
|
||||
github.com/Microsoft/go-winio v0.6.2/go.mod h1:yd8OoFMLzJbo9gZq8j5qaps8bJ9aShtEA8Ipt1oGCvU=
|
||||
github.com/NVIDIA/go-nvml v0.13.0-1 h1:OLX8Jq3dONuPOQPC7rndB6+iDmDakw0XTYgzMxObkEw=
|
||||
github.com/NVIDIA/go-nvml v0.13.0-1/go.mod h1:+KNA7c7gIBH7SKSJ1ntlwkfN80zdx8ovl4hrK3LmPt4=
|
||||
github.com/PuerkitoBio/goquery v1.11.0 h1:jZ7pwMQXIITcUXNH83LLk+txlaEy6NVOfTuP43xxfqw=
|
||||
@@ -70,10 +66,6 @@ github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM=
|
||||
github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw=
|
||||
github.com/cespare/xxhash/v2 v2.3.0 h1:UL815xU9SqsFlibzuggzjXhog7bL6oX9BbNZnL2UFvs=
|
||||
github.com/cespare/xxhash/v2 v2.3.0/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs=
|
||||
github.com/containerd/errdefs v1.0.0 h1:tg5yIfIlQIrxYtu9ajqY42W3lpS19XqdxRQeEwYG8PI=
|
||||
github.com/containerd/errdefs v1.0.0/go.mod h1:+YBYIdtsnF4Iw6nWZhJcqGSg/dwvV7tyJ/kCkyJ2k+M=
|
||||
github.com/containerd/errdefs/pkg v0.3.0 h1:9IKJ06FvyNlexW690DXuQNx2KA2cUJXx151Xdx3ZPPE=
|
||||
github.com/containerd/errdefs/pkg v0.3.0/go.mod h1:NJw6s9HwNuRhnjJhM7pylWwMyAkmCQvQ4GpJHEqRLVk=
|
||||
github.com/coreos/go-oidc/v3 v3.16.0 h1:qRQUCFstKpXwmEjDQTIbyY/5jF00+asXzSkmkoa/mow=
|
||||
github.com/coreos/go-oidc/v3 v3.16.0/go.mod h1:wqPbKFrVnE90vty060SB40FCJ8fTHTxSwyXJqZH+sI8=
|
||||
github.com/cpuguy83/go-md2man/v2 v2.0.7 h1:zbFlGlXEAKlwXpmvle3d8Oe3YnkKIK4xSRTd3sHPnBo=
|
||||
@@ -85,16 +77,6 @@ github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc h1:U9qPSI2PIWSS1
|
||||
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
|
||||
github.com/dgryski/trifles v0.0.0-20230903005119-f50d829f2e54 h1:SG7nF6SRlWhcT7cNTs5R6Hk4V2lcmLz2NsG2VnInyNo=
|
||||
github.com/dgryski/trifles v0.0.0-20230903005119-f50d829f2e54/go.mod h1:if7Fbed8SFyPtHLHbg49SI7NAdJiC5WIA09pe59rfAA=
|
||||
github.com/dhui/dktest v0.4.6 h1:+DPKyScKSEp3VLtbMDHcUq6V5Lm5zfZZVb0Sk7Ahom4=
|
||||
github.com/dhui/dktest v0.4.6/go.mod h1:JHTSYDtKkvFNFHJKqCzVzqXecyv+tKt8EzceOmQOgbU=
|
||||
github.com/distribution/reference v0.6.0 h1:0IXCQ5g4/QMHHkarYzh5l+u8T3t73zM5QvfrDyIgxBk=
|
||||
github.com/distribution/reference v0.6.0/go.mod h1:BbU0aIcezP1/5jX/8MP0YiH4SdvB5Y4f/wlDRiLyi3E=
|
||||
github.com/docker/docker v28.3.3+incompatible h1:Dypm25kh4rmk49v1eiVbsAtpAsYURjYkaKubwuBdxEI=
|
||||
github.com/docker/docker v28.3.3+incompatible/go.mod h1:eEKB0N0r5NX/I1kEveEz05bcu8tLC/8azJZsviup8Sk=
|
||||
github.com/docker/go-connections v0.5.0 h1:USnMq7hx7gwdVZq1L49hLXaFtUdTADjXGp+uj1Br63c=
|
||||
github.com/docker/go-connections v0.5.0/go.mod h1:ov60Kzw0kKElRwhNs9UlUHAE/F9Fe6GLaXnqyDdmEXc=
|
||||
github.com/docker/go-units v0.5.0 h1:69rxXcBk27SvSaaxTtLh/8llcHD8vYHT7WSdRZ/jvr4=
|
||||
github.com/docker/go-units v0.5.0/go.mod h1:fgPhTUdO+D/Jk86RDLlptpiXQzgHJF7gydDDbaIK4Dk=
|
||||
github.com/expr-lang/expr v1.17.6 h1:1h6i8ONk9cexhDmowO/A64VPxHScu7qfSl2k8OlINec=
|
||||
github.com/expr-lang/expr v1.17.6/go.mod h1:8/vRC7+7HBzESEqt5kKpYXxrxkr31SaO8r40VO/1IT4=
|
||||
github.com/felixge/httpsnoop v1.0.4 h1:NFTV2Zj1bL4mc9sqWACXbQFVBBg2W3GPvqp8/ESS2Wg=
|
||||
@@ -113,10 +95,6 @@ github.com/go-jose/go-jose/v4 v4.1.3 h1:CVLmWDhDVRa6Mi/IgCgaopNosCaHz7zrMeF9MlZR
|
||||
github.com/go-jose/go-jose/v4 v4.1.3/go.mod h1:x4oUasVrzR7071A4TnHLGSPpNOm2a21K9Kf04k1rs08=
|
||||
github.com/go-ldap/ldap/v3 v3.4.12 h1:1b81mv7MagXZ7+1r7cLTWmyuTqVqdwbtJSjC0DAp9s4=
|
||||
github.com/go-ldap/ldap/v3 v3.4.12/go.mod h1:+SPAGcTtOfmGsCb3h1RFiq4xpp4N636G75OEace8lNo=
|
||||
github.com/go-logr/logr v1.4.3 h1:CjnDlHq8ikf6E492q6eKboGOC0T8CDaOvkHCIg8idEI=
|
||||
github.com/go-logr/logr v1.4.3/go.mod h1:9T104GzyrTigFIr8wt5mBrctHMim0Nb2HLGrmQ40KvY=
|
||||
github.com/go-logr/stdr v1.2.2 h1:hSWxHoqTgW2S2qGc0LTAI563KZ5YKYRhT3MFKZMbjag=
|
||||
github.com/go-logr/stdr v1.2.2/go.mod h1:mMo/vtBO5dYbehREoey6XUKy/eSumjCCveDpRre4VKE=
|
||||
github.com/go-openapi/jsonpointer v0.22.3 h1:dKMwfV4fmt6Ah90zloTbUKWMD+0he+12XYAsPotrkn8=
|
||||
github.com/go-openapi/jsonpointer v0.22.3/go.mod h1:0lBbqeRsQ5lIanv3LHZBrmRGHLHcQoOXQnf88fHlGWo=
|
||||
github.com/go-openapi/jsonreference v0.21.3 h1:96Dn+MRPa0nYAR8DR1E03SblB5FJvh7W6krPI0Z7qMc=
|
||||
@@ -145,15 +123,12 @@ github.com/go-openapi/testify/enable/yaml/v2 v2.0.2/go.mod h1:kme83333GCtJQHXQ8U
|
||||
github.com/go-openapi/testify/v2 v2.0.2 h1:X999g3jeLcoY8qctY/c/Z8iBHTbwLz7R2WXd6Ub6wls=
|
||||
github.com/go-openapi/testify/v2 v2.0.2/go.mod h1:HCPmvFFnheKK2BuwSA0TbbdxJ3I16pjwMkYkP4Ywn54=
|
||||
github.com/go-sql-driver/mysql v1.4.1/go.mod h1:zAC/RDZ24gD3HViQzih4MyKcchzm+sOG5ZlKdlhCg5w=
|
||||
github.com/go-sql-driver/mysql v1.8.1 h1:LedoTUt/eveggdHS9qUFC1EFSa8bU2+1pZjSRpvNJ1Y=
|
||||
github.com/go-sql-driver/mysql v1.8.1/go.mod h1:wEBSXgmK//2ZFJyE+qWnIsVGmvmEKlqwuVSjsCm7DZg=
|
||||
github.com/go-sql-driver/mysql v1.9.3 h1:U/N249h2WzJ3Ukj8SowVFjdtZKfu9vlLZxjPXV1aweo=
|
||||
github.com/go-sql-driver/mysql v1.9.3/go.mod h1:qn46aNg1333BRMNU69Lq93t8du/dwxI64Gl8i5p1WMU=
|
||||
github.com/go-viper/mapstructure/v2 v2.4.0 h1:EBsztssimR/CONLSZZ04E8qAkxNYq4Qp9LvH92wZUgs=
|
||||
github.com/go-viper/mapstructure/v2 v2.4.0/go.mod h1:oJDH3BJKyqBA2TXFhDsKDGDTlndYOZ6rGS0BRZIxGhM=
|
||||
github.com/goccy/go-yaml v1.19.0 h1:EmkZ9RIsX+Uq4DYFowegAuJo8+xdX3T/2dwNPXbxEYE=
|
||||
github.com/goccy/go-yaml v1.19.0/go.mod h1:XBurs7gK8ATbW4ZPGKgcbrY1Br56PdM69F7LkFRi1kA=
|
||||
github.com/gogo/protobuf v1.3.2 h1:Ov1cvc58UF3b5XjBnZv7+opcTcQFZebYjWzi34vdm4Q=
|
||||
github.com/gogo/protobuf v1.3.2/go.mod h1:P1XiOD3dCwIKUDQYPy72D8LYyHL2YPYrpS2s69NZV8Q=
|
||||
github.com/golang-jwt/jwt/v5 v5.3.0 h1:pv4AsKCKKZuqlgs5sUmn4x8UlGa0kEVt/puTpKx9vvo=
|
||||
github.com/golang-jwt/jwt/v5 v5.3.0/go.mod h1:fxCRLWMO43lRc8nhHWY6LGqRcf+1gQWArsqaEUEa5bE=
|
||||
github.com/golang-migrate/migrate/v4 v4.19.1 h1:OCyb44lFuQfYXYLx1SCxPZQGU7mcaZ7gH9yH4jSFbBA=
|
||||
@@ -241,17 +216,11 @@ github.com/mattn/go-sqlite3 v1.10.0/go.mod h1:FPy6KqzDD04eiIsT53CuJW3U88zkxoIYsO
|
||||
github.com/mattn/go-sqlite3 v1.14.22/go.mod h1:Uh1q+B4BYcTPb+yiD3kU8Ct7aC0hY9fxUwlHK0RXw+Y=
|
||||
github.com/mattn/go-sqlite3 v1.14.32 h1:JD12Ag3oLy1zQA+BNn74xRgaBbdhbNIDYvQUEuuErjs=
|
||||
github.com/mattn/go-sqlite3 v1.14.32/go.mod h1:Uh1q+B4BYcTPb+yiD3kU8Ct7aC0hY9fxUwlHK0RXw+Y=
|
||||
github.com/moby/docker-image-spec v1.3.1 h1:jMKff3w6PgbfSa69GfNg+zN/XLhfXJGnEx3Nl2EsFP0=
|
||||
github.com/moby/docker-image-spec v1.3.1/go.mod h1:eKmb5VW8vQEh/BAr2yvVNvuiJuY6UIocYsFu/DxxRpo=
|
||||
github.com/moby/term v0.5.0 h1:xt8Q1nalod/v7BqbG21f8mQPqH+xAaC9C3N3wfWbVP0=
|
||||
github.com/moby/term v0.5.0/go.mod h1:8FzsFHVUBGZdbDsJw/ot+X+d5HLUbvklYLJ9uGfcI3Y=
|
||||
github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q=
|
||||
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd h1:TRLaZ9cD/w8PVh93nsPXa1VrQ6jlwL5oN8l14QlcNfg=
|
||||
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q=
|
||||
github.com/modern-go/reflect2 v1.0.2 h1:xBagoLtFs94CBntxluKeaWgTMpvLxC4ur3nMaC9Gz0M=
|
||||
github.com/modern-go/reflect2 v1.0.2/go.mod h1:yWuevngMOJpCy52FWWMvUC8ws7m/LJsjYzDa0/r8luk=
|
||||
github.com/morikuni/aec v1.0.0 h1:nP9CBfwrvYnBRgY6qfDQkygYDmYwOilePFkwzv4dU8A=
|
||||
github.com/morikuni/aec v1.0.0/go.mod h1:BbKIizmSmc5MMPqRYbxO4ZU0S0+P200+tUnFx7PXmsc=
|
||||
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 h1:C3w9PqII01/Oq1c1nUAm88MOHcQC9l5mIlSMApZMrHA=
|
||||
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822/go.mod h1:+n7T8mK8HuQTcFwEeznm/DIxMOiR9yIdICNftLE1DvQ=
|
||||
github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f h1:KUppIJq7/+SVif2QVs3tOP0zanoHgBEVAwHxUSIzRqU=
|
||||
@@ -265,13 +234,7 @@ github.com/nats-io/nuid v1.0.1/go.mod h1:19wcPz3Ph3q0Jbyiqsd0kePYG7A95tJPxeL+1OS
|
||||
github.com/niemeyer/pretty v0.0.0-20200227124842-a10e7caefd8e/go.mod h1:zD1mROLANZcx1PVRCS0qkT7pwLkGfwJo4zjcN/Tysno=
|
||||
github.com/oapi-codegen/runtime v1.1.1 h1:EXLHh0DXIJnWhdRPN2w4MXAzFyE4CskzhNLUmtpMYro=
|
||||
github.com/oapi-codegen/runtime v1.1.1/go.mod h1:SK9X900oXmPWilYR5/WKPzt3Kqxn/uS/+lbpREv+eCg=
|
||||
github.com/opencontainers/go-digest v1.0.0 h1:apOUWs51W5PlhuyGyz9FCeeBIOUDA/6nW8Oi/yOhh5U=
|
||||
github.com/opencontainers/go-digest v1.0.0/go.mod h1:0JzlMkj0TRzQZfJkVvzbP0HBR3IKzErnv2BNG4W4MAM=
|
||||
github.com/opencontainers/image-spec v1.1.0 h1:8SG7/vwALn54lVB/0yZ/MMwhFrPYtpEHQb2IpWsCzug=
|
||||
github.com/opencontainers/image-spec v1.1.0/go.mod h1:W4s4sFTMaBeK1BQLXbG4AdM2szdn85PY75RI83NrTrM=
|
||||
github.com/opentracing/opentracing-go v1.1.0/go.mod h1:UkNAQd3GIcIGf0SeVgPpRdFStlNbqXla1AfSYxPUl2o=
|
||||
github.com/pkg/errors v0.9.1 h1:FEBLx1zS214owpjy7qsBeixbURkuhQAwrK5UwLGTwt4=
|
||||
github.com/pkg/errors v0.9.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
|
||||
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
|
||||
github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 h1:Jamvg5psRIccs7FGNTlIRMkT8wgtp5eCXdBlqhYGL6U=
|
||||
github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
|
||||
@@ -323,16 +286,6 @@ github.com/vektah/gqlparser/v2 v2.5.31/go.mod h1:c1I28gSOVNzlfc4WuDlqU7voQnsqI6O
|
||||
github.com/xrash/smetrics v0.0.0-20250705151800-55b8f293f342 h1:FnBeRrxr7OU4VvAzt5X7s6266i6cSVkkFPS0TuXWbIg=
|
||||
github.com/xrash/smetrics v0.0.0-20250705151800-55b8f293f342/go.mod h1:Ohn+xnUBiLI6FVj/9LpzZWtj1/D6lUovWYBkxHVV3aM=
|
||||
github.com/yuin/goldmark v1.4.13/go.mod h1:6yULJ656Px+3vBD8DxQVa3kxgyrAnzto9xy5taEt/CY=
|
||||
go.opentelemetry.io/auto/sdk v1.1.0 h1:cH53jehLUN6UFLY71z+NDOiNJqDdPRaXzTel0sJySYA=
|
||||
go.opentelemetry.io/auto/sdk v1.1.0/go.mod h1:3wSPjt5PWp2RhlCcmmOial7AvC4DQqZb7a7wCow3W8A=
|
||||
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.61.0 h1:F7Jx+6hwnZ41NSFTO5q4LYDtJRXBf2PD0rNBkeB/lus=
|
||||
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.61.0/go.mod h1:UHB22Z8QsdRDrnAtX4PntOl36ajSxcdUMt1sF7Y6E7Q=
|
||||
go.opentelemetry.io/otel v1.37.0 h1:9zhNfelUvx0KBfu/gb+ZgeAfAgtWrfHJZcAqFC228wQ=
|
||||
go.opentelemetry.io/otel v1.37.0/go.mod h1:ehE/umFRLnuLa/vSccNq9oS1ErUlkkK71gMcN34UG8I=
|
||||
go.opentelemetry.io/otel/metric v1.37.0 h1:mvwbQS5m0tbmqML4NqK+e3aDiO02vsf/WgbsdpcPoZE=
|
||||
go.opentelemetry.io/otel/metric v1.37.0/go.mod h1:04wGrZurHYKOc+RKeye86GwKiTb9FKm1WHtO+4EVr2E=
|
||||
go.opentelemetry.io/otel/trace v1.37.0 h1:HLdcFNbRQBE2imdSEgm/kwqmQj1Or1l/7bW6mxVK7z4=
|
||||
go.opentelemetry.io/otel/trace v1.37.0/go.mod h1:TlgrlQ+PtQO5XFerSPUYG0JSgGyryXewPGyayAWSBS0=
|
||||
go.uber.org/goleak v1.3.0 h1:2K3zAYmnTNqV73imy9J1T3WC+gmCePx2hEGkimedGto=
|
||||
go.uber.org/goleak v1.3.0/go.mod h1:CoHD4mav9JJNrW/WLlf7HGZPjdw8EucARQHekz1X6bE=
|
||||
go.yaml.in/yaml/v2 v2.4.3 h1:6gvOSjQoTB3vt1l+CU+tSyi/HOjfOjRLJ4YwYZGwRO0=
@@ -3,7 +3,7 @@ Description=ClusterCockpit Web Server
 Documentation=https://github.com/ClusterCockpit/cc-backend
 Wants=network-online.target
 After=network-online.target
-After=mariadb.service mysql.service
+# Database is file-based SQLite - no service dependency required

 [Service]
 WorkingDirectory=/opt/monitoring/cc-backend
@@ -141,7 +141,7 @@ func setup(t *testing.T) *api.RestAPI {
 	}

 	dbfilepath := filepath.Join(tmpdir, "test.db")
-	err := repository.MigrateDB("sqlite3", dbfilepath)
+	err := repository.MigrateDB(dbfilepath)
 	if err != nil {
 		t.Fatal(err)
 	}
@@ -79,8 +79,11 @@ func (api *RestAPI) MountAPIRoutes(r *mux.Router) {
 	// Slurm node state
 	r.HandleFunc("/nodestate/", api.updateNodeStates).Methods(http.MethodPost, http.MethodPut)
 	// Job Handler
+	if config.Keys.APISubjects == nil {
+		cclog.Info("Enabling REST start/stop job API")
 		r.HandleFunc("/jobs/start_job/", api.startJob).Methods(http.MethodPost, http.MethodPut)
 		r.HandleFunc("/jobs/stop_job/", api.stopJobByRequest).Methods(http.MethodPost, http.MethodPut)
+	}
 	r.HandleFunc("/jobs/", api.getJobs).Methods(http.MethodGet)
 	r.HandleFunc("/jobs/{id}", api.getJobByID).Methods(http.MethodPost)
 	r.HandleFunc("/jobs/{id}", api.getCompleteJobByID).Methods(http.MethodGet)
|
||||
@@ -37,10 +37,10 @@ type ProgramConfig struct {
EmbedStaticFiles bool `json:"embed-static-files"`
StaticFiles string `json:"static-files"`

// 'sqlite3' or 'mysql' (mysql will work for mariadb as well)
// Database driver - only 'sqlite3' is supported
DBDriver string `json:"db-driver"`

// For sqlite3 a filename, for mysql a DSN in this format: https://github.com/go-sql-driver/mysql#dsn-data-source-name (Without query parameters!).
// Path to SQLite database file
DB string `json:"db"`

// Keep all metric data in the metric data repositories,

@@ -41,7 +41,7 @@ var configSchema = `
"type": "string"
},
"db": {
"description": "For sqlite3 a filename, for mysql a DSN in this format: https://github.com/go-sql-driver/mysql#dsn-data-source-name (Without query parameters!).",
"description": "Path to SQLite database file (e.g., './var/job.db')",
"type": "string"
},
"disable-archive": {

@@ -88,14 +88,14 @@ func (r *jobResolver) EnergyFootprint(ctx context.Context, obj *schema.Job) ([]*
res := []*model.EnergyFootprintValue{}
for name, value := range rawEnergyFootprint {
// Suboptimal: Nearly hardcoded metric name expectations
matchCpu := regexp.MustCompile(`cpu|Cpu|CPU`)
matchCPU := regexp.MustCompile(`cpu|Cpu|CPU`)
matchAcc := regexp.MustCompile(`acc|Acc|ACC`)
matchMem := regexp.MustCompile(`mem|Mem|MEM`)
matchCore := regexp.MustCompile(`core|Core|CORE`)

hwType := ""
switch test := name; { // NOtice ';' for var declaration
case matchCpu.MatchString(test):
case matchCPU.MatchString(test):
hwType = "CPU"
case matchAcc.MatchString(test):
hwType = "Accelerator"
@@ -175,9 +175,9 @@ func (r *mutationResolver) AddTagsToJob(ctx context.Context, job string, tagIds
}

tags := []*schema.Tag{}
for _, tagId := range tagIds {
for _, tagID := range tagIds {
// Get ID
tid, err := strconv.ParseInt(tagId, 10, 64)
tid, err := strconv.ParseInt(tagID, 10, 64)
if err != nil {
cclog.Warn("Error while parsing tag id")
return nil, err
@@ -222,9 +222,9 @@ func (r *mutationResolver) RemoveTagsFromJob(ctx context.Context, job string, ta
}

tags := []*schema.Tag{}
for _, tagId := range tagIds {
for _, tagID := range tagIds {
// Get ID
tid, err := strconv.ParseInt(tagId, 10, 64)
tid, err := strconv.ParseInt(tagID, 10, 64)
if err != nil {
cclog.Warn("Error while parsing tag id")
return nil, err
@@ -265,9 +265,9 @@ func (r *mutationResolver) RemoveTagFromList(ctx context.Context, tagIds []strin
}

tags := []int{}
for _, tagId := range tagIds {
for _, tagID := range tagIds {
// Get ID
tid, err := strconv.ParseInt(tagId, 10, 64)
tid, err := strconv.ParseInt(tagID, 10, 64)
if err != nil {
cclog.Warn("Error while parsing tag id for removal")
return nil, err
@@ -317,7 +317,7 @@ func (r *nodeResolver) SchedulerState(ctx context.Context, obj *schema.Node) (sc
if obj.NodeState != "" {
return obj.NodeState, nil
} else {
return "", fmt.Errorf("No SchedulerState (NodeState) on Object")
return "", fmt.Errorf("no SchedulerState (NodeState) on Object")
}
}

@@ -343,6 +343,14 @@ func (r *queryResolver) Tags(ctx context.Context) ([]*schema.Tag, error) {

// GlobalMetrics is the resolver for the globalMetrics field.
func (r *queryResolver) GlobalMetrics(ctx context.Context) ([]*schema.GlobalMetricListItem, error) {
user := repository.GetUserFromContext(ctx)

if user != nil {
if user.HasRole(schema.RoleUser) || user.HasRole(schema.RoleManager) {
return archive.GlobalUserMetricList, nil
}
}

return archive.GlobalMetricList, nil
}

@@ -373,12 +381,12 @@ func (r *queryResolver) AllocatedNodes(ctx context.Context, cluster string) ([]*
// Node is the resolver for the node field.
func (r *queryResolver) Node(ctx context.Context, id string) (*schema.Node, error) {
repo := repository.GetNodeRepository()
numericId, err := strconv.ParseInt(id, 10, 64)
numericID, err := strconv.ParseInt(id, 10, 64)
if err != nil {
cclog.Warn("Error while parsing job id")
return nil, err
}
return repo.GetNodeByID(numericId, false)
return repo.GetNodeByID(numericID, false)
}

// Nodes is the resolver for the nodes field.
@@ -405,8 +413,7 @@ func (r *queryResolver) NodeStates(ctx context.Context, filter []*model.NodeFilt
return nil, herr
}

allCounts := make([]*model.NodeStates, 0)
allCounts = append(stateCounts, healthCounts...)
allCounts := append(stateCounts, healthCounts...)

return allCounts, nil
}
@@ -433,18 +440,18 @@ func (r *queryResolver) NodeStatesTimed(ctx context.Context, filter []*model.Nod
return healthCounts, nil
}

return nil, errors.New("Unknown Node State Query Type")
return nil, errors.New("unknown Node State Query Type")
}

// Job is the resolver for the job field.
func (r *queryResolver) Job(ctx context.Context, id string) (*schema.Job, error) {
numericId, err := strconv.ParseInt(id, 10, 64)
numericID, err := strconv.ParseInt(id, 10, 64)
if err != nil {
cclog.Warn("Error while parsing job id")
return nil, err
}

job, err := r.Repo.FindByID(ctx, numericId)
job, err := r.Repo.FindByID(ctx, numericID)
if err != nil {
cclog.Warn("Error while finding job by id")
return nil, err
@@ -809,7 +816,7 @@ func (r *queryResolver) NodeMetricsList(ctx context.Context, cluster string, sub
nodeRepo := repository.GetNodeRepository()
nodes, stateMap, countNodes, hasNextPage, nerr := nodeRepo.GetNodesForList(ctx, cluster, subCluster, stateFilter, nodeFilter, page)
if nerr != nil {
return nil, errors.New("Could not retrieve node list required for resolving NodeMetricsList")
return nil, errors.New("could not retrieve node list required for resolving NodeMetricsList")
}

if metrics == nil {
@@ -898,9 +905,7 @@ func (r *queryResolver) ClusterMetrics(ctx context.Context, cluster string, metr
collectorUnit[metric] = scopedMetric.Unit
// Collect Initial Data
for _, ser := range scopedMetric.Series {
for _, val := range ser.Data {
collectorData[metric] = append(collectorData[metric], val)
}
collectorData[metric] = append(collectorData[metric], ser.Data...)
}
}
} else {

@@ -107,7 +107,7 @@ func setup(t *testing.T) *repository.JobRepository {
}

dbfilepath := filepath.Join(tmpdir, "test.db")
err := repository.MigrateDB("sqlite3", dbfilepath)
err := repository.MigrateDB(dbfilepath)
if err != nil {
t.Fatal(err)
}

@@ -770,6 +770,7 @@ func (ccms *CCMetricStore) LoadNodeData(
}

mc := archive.GetMetricConfig(cluster, metric)
if mc != nil {
hostdata[metric] = append(hostdata[metric], &schema.JobMetric{
Unit: mc.Unit,
Timestep: mc.Timestep,
@@ -785,6 +786,9 @@ func (ccms *CCMetricStore) LoadNodeData(
},
},
})
} else {
cclog.Warnf("Metric '%s' not configured for cluster '%s': Skipped in LoadNodeData() Return!", metric, cluster)
}
}

if len(errors) != 0 {

@@ -55,6 +55,10 @@ func Connect(driver string, db string) {
var err error
var dbHandle *sqlx.DB

if driver != "sqlite3" {
cclog.Abortf("Unsupported database driver '%s'. Only 'sqlite3' is supported.\n", driver)
}

dbConnOnce.Do(func() {
opts := DatabaseOptions{
URL: db,
@@ -64,8 +68,6 @@ func Connect(driver string, db string) {
ConnectionMaxIdleTime: repoConfig.ConnectionMaxIdleTime,
}

switch driver {
case "sqlite3":
// TODO: Have separate DB handles for Writes and Reads
// Optimize SQLite connection: https://kerkour.com/sqlite-for-servers
connectionURLParams := make(url.Values)
@@ -84,20 +86,14 @@ func Connect(driver string, db string) {
dbHandle, err = sqlx.Open("sqlite3", opts.URL)
}

if err != nil {
cclog.Abortf("DB Connection: Could not connect to SQLite database with sqlx.Open().\nError: %s\n", err.Error())
}

err = setupSqlite(dbHandle.DB)
if err != nil {
cclog.Abortf("Failed sqlite db setup.\nError: %s\n", err.Error())
}
case "mysql":
opts.URL += "?multiStatements=true"
dbHandle, err = sqlx.Open("mysql", opts.URL)
default:
cclog.Abortf("DB Connection: Unsupported database driver '%s'.\n", driver)
}

if err != nil {
cclog.Abortf("DB Connection: Could not connect to '%s' database with sqlx.Open().\nError: %s\n", driver, err.Error())
}

dbHandle.SetMaxOpenConns(opts.MaxOpenConnections)
dbHandle.SetMaxIdleConns(opts.MaxIdleConnections)
@@ -105,7 +101,7 @@ func Connect(driver string, db string) {
dbHandle.SetConnMaxIdleTime(opts.ConnectionMaxIdleTime)

dbConnInstance = &DBConnection{DB: dbHandle, Driver: driver}
err = checkDBVersion(driver, dbHandle.DB)
err = checkDBVersion(dbHandle.DB)
if err != nil {
cclog.Abortf("DB Connection: Failed DB version check.\nError: %s\n", err.Error())
}

@@ -14,8 +14,6 @@
// Initialize the database connection before using any repository:
//
// repository.Connect("sqlite3", "./var/job.db")
// // or for MySQL:
// repository.Connect("mysql", "user:password@tcp(localhost:3306)/dbname")
//
// # Configuration
//
@@ -158,52 +156,22 @@ func scanJob(row interface{ Scan(...any) error }) (*schema.Job, error) {
}

func (r *JobRepository) Optimize() error {
var err error

switch r.driver {
case "sqlite3":
if _, err = r.DB.Exec(`VACUUM`); err != nil {
if _, err := r.DB.Exec(`VACUUM`); err != nil {
return err
}
case "mysql":
cclog.Info("Optimize currently not supported for mysql driver")
}

return nil
}

func (r *JobRepository) Flush() error {
var err error

switch r.driver {
case "sqlite3":
if _, err = r.DB.Exec(`DELETE FROM jobtag`); err != nil {
if _, err := r.DB.Exec(`DELETE FROM jobtag`); err != nil {
return err
}
if _, err = r.DB.Exec(`DELETE FROM tag`); err != nil {
if _, err := r.DB.Exec(`DELETE FROM tag`); err != nil {
return err
}
if _, err = r.DB.Exec(`DELETE FROM job`); err != nil {
if _, err := r.DB.Exec(`DELETE FROM job`); err != nil {
return err
}
case "mysql":
if _, err = r.DB.Exec(`SET FOREIGN_KEY_CHECKS = 0`); err != nil {
return err
}
if _, err = r.DB.Exec(`TRUNCATE TABLE jobtag`); err != nil {
return err
}
if _, err = r.DB.Exec(`TRUNCATE TABLE tag`); err != nil {
return err
}
if _, err = r.DB.Exec(`TRUNCATE TABLE job`); err != nil {
return err
}
if _, err = r.DB.Exec(`SET FOREIGN_KEY_CHECKS = 1`); err != nil {
return err
}
}

return nil
}


@@ -12,7 +12,6 @@ import (

cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/golang-migrate/migrate/v4"
"github.com/golang-migrate/migrate/v4/database/mysql"
"github.com/golang-migrate/migrate/v4/database/sqlite3"
"github.com/golang-migrate/migrate/v4/source/iofs"
)
@@ -22,11 +21,7 @@ const Version uint = 10
//go:embed migrations/*
var migrationFiles embed.FS

func checkDBVersion(backend string, db *sql.DB) error {
var m *migrate.Migrate

switch backend {
case "sqlite3":
func checkDBVersion(db *sql.DB) error {
driver, err := sqlite3.WithInstance(db, &sqlite3.Config{})
if err != nil {
return err
@@ -36,27 +31,10 @@ func checkDBVersion(backend string, db *sql.DB) error {
return err
}

m, err = migrate.NewWithInstance("iofs", d, "sqlite3", driver)
m, err := migrate.NewWithInstance("iofs", d, "sqlite3", driver)
if err != nil {
return err
}
case "mysql":
driver, err := mysql.WithInstance(db, &mysql.Config{})
if err != nil {
return err
}
d, err := iofs.New(migrationFiles, "migrations/mysql")
if err != nil {
return err
}

m, err = migrate.NewWithInstance("iofs", d, "mysql", driver)
if err != nil {
return err
}
default:
cclog.Abortf("Migration: Unsupported database backend '%s'.\n", backend)
}

v, dirty, err := m.Version()
if err != nil {
@@ -80,37 +58,22 @@ func checkDBVersion(backend string, db *sql.DB) error {
return nil
}

func getMigrateInstance(backend string, db string) (m *migrate.Migrate, err error) {
switch backend {
case "sqlite3":
func getMigrateInstance(db string) (m *migrate.Migrate, err error) {
d, err := iofs.New(migrationFiles, "migrations/sqlite3")
if err != nil {
cclog.Fatal(err)
return nil, err
}

m, err = migrate.NewWithSourceInstance("iofs", d, fmt.Sprintf("sqlite3://%s?_foreign_keys=on", db))
if err != nil {
return m, err
}
case "mysql":
d, err := iofs.New(migrationFiles, "migrations/mysql")
if err != nil {
return m, err
}

m, err = migrate.NewWithSourceInstance("iofs", d, fmt.Sprintf("mysql://%s?multiStatements=true", db))
if err != nil {
return m, err
}
default:
cclog.Abortf("Migration: Unsupported database backend '%s'.\n", backend)
return nil, err
}

return m, nil
}

func MigrateDB(backend string, db string) error {
m, err := getMigrateInstance(backend, db)
func MigrateDB(db string) error {
m, err := getMigrateInstance(db)
if err != nil {
return err
}
@@ -144,8 +107,8 @@ func MigrateDB(backend string, db string) error {
return nil
}

func RevertDB(backend string, db string) error {
m, err := getMigrateInstance(backend, db)
func RevertDB(db string) error {
m, err := getMigrateInstance(db)
if err != nil {
return err
}
@@ -162,8 +125,8 @@ func RevertDB(backend string, db string) error {
return nil
}

func ForceDB(backend string, db string) error {
m, err := getMigrateInstance(backend, db)
func ForceDB(db string) error {
m, err := getMigrateInstance(db)
if err != nil {
return err
}

@@ -1,5 +0,0 @@
DROP TABLE IF EXISTS job;
DROP TABLE IF EXISTS tags;
DROP TABLE IF EXISTS jobtag;
DROP TABLE IF EXISTS configuration;
DROP TABLE IF EXISTS user;
@@ -1,66 +0,0 @@
CREATE TABLE IF NOT EXISTS job (
id INTEGER AUTO_INCREMENT PRIMARY KEY ,
job_id BIGINT NOT NULL,
cluster VARCHAR(255) NOT NULL,
subcluster VARCHAR(255) NOT NULL,
start_time BIGINT NOT NULL, -- Unix timestamp

user VARCHAR(255) NOT NULL,
project VARCHAR(255) NOT NULL,
`partition` VARCHAR(255) NOT NULL,
array_job_id BIGINT NOT NULL,
duration INT NOT NULL DEFAULT 0,
walltime INT NOT NULL DEFAULT 0,
job_state VARCHAR(255) NOT NULL
CHECK(job_state IN ('running', 'completed', 'failed', 'cancelled',
'stopped', 'timeout', 'preempted', 'out_of_memory')),
meta_data TEXT, -- JSON
resources TEXT NOT NULL, -- JSON

num_nodes INT NOT NULL,
num_hwthreads INT NOT NULL,
num_acc INT NOT NULL,
smt TINYINT NOT NULL DEFAULT 1 CHECK(smt IN (0, 1 )),
exclusive TINYINT NOT NULL DEFAULT 1 CHECK(exclusive IN (0, 1, 2)),
monitoring_status TINYINT NOT NULL DEFAULT 1 CHECK(monitoring_status IN (0, 1, 2, 3)),

mem_used_max REAL NOT NULL DEFAULT 0.0,
flops_any_avg REAL NOT NULL DEFAULT 0.0,
mem_bw_avg REAL NOT NULL DEFAULT 0.0,
load_avg REAL NOT NULL DEFAULT 0.0,
net_bw_avg REAL NOT NULL DEFAULT 0.0,
net_data_vol_total REAL NOT NULL DEFAULT 0.0,
file_bw_avg REAL NOT NULL DEFAULT 0.0,
file_data_vol_total REAL NOT NULL DEFAULT 0.0,
UNIQUE (job_id, cluster, start_time)
);

CREATE TABLE IF NOT EXISTS tag (
id INTEGER PRIMARY KEY,
tag_type VARCHAR(255) NOT NULL,
tag_name VARCHAR(255) NOT NULL,
UNIQUE (tag_type, tag_name));

CREATE TABLE IF NOT EXISTS jobtag (
job_id INTEGER,
tag_id INTEGER,
PRIMARY KEY (job_id, tag_id),
FOREIGN KEY (job_id) REFERENCES job (id) ON DELETE CASCADE,
FOREIGN KEY (tag_id) REFERENCES tag (id) ON DELETE CASCADE);

CREATE TABLE IF NOT EXISTS user (
username varchar(255) PRIMARY KEY NOT NULL,
password varchar(255) DEFAULT NULL,
ldap tinyint NOT NULL DEFAULT 0, /* col called "ldap" for historic reasons, fills the "AuthSource" */
name varchar(255) DEFAULT NULL,
roles varchar(255) NOT NULL DEFAULT "[]",
email varchar(255) DEFAULT NULL);

CREATE TABLE IF NOT EXISTS configuration (
username varchar(255),
confkey varchar(255),
value varchar(255),
PRIMARY KEY (username, confkey),
FOREIGN KEY (username) REFERENCES user (username) ON DELETE CASCADE ON UPDATE NO ACTION);


@@ -1,8 +0,0 @@
DROP INDEX IF EXISTS job_stats;
DROP INDEX IF EXISTS job_by_user;
DROP INDEX IF EXISTS job_by_starttime;
DROP INDEX IF EXISTS job_by_job_id;
DROP INDEX IF EXISTS job_list;
DROP INDEX IF EXISTS job_list_user;
DROP INDEX IF EXISTS job_list_users;
DROP INDEX IF EXISTS job_list_users_start;
@@ -1,8 +0,0 @@
CREATE INDEX IF NOT EXISTS job_stats ON job (cluster,subcluster,user);
CREATE INDEX IF NOT EXISTS job_by_user ON job (user);
CREATE INDEX IF NOT EXISTS job_by_starttime ON job (start_time);
CREATE INDEX IF NOT EXISTS job_by_job_id ON job (job_id);
CREATE INDEX IF NOT EXISTS job_list ON job (cluster, job_state);
CREATE INDEX IF NOT EXISTS job_list_user ON job (user, cluster, job_state);
CREATE INDEX IF NOT EXISTS job_list_users ON job (user, job_state);
CREATE INDEX IF NOT EXISTS job_list_users_start ON job (start_time, user, job_state);
@@ -1 +0,0 @@
ALTER TABLE user DROP COLUMN projects;
@@ -1 +0,0 @@
ALTER TABLE user ADD COLUMN projects varchar(255) NOT NULL DEFAULT "[]";
@@ -1,5 +0,0 @@
ALTER TABLE job
MODIFY `partition` VARCHAR(255) NOT NULL,
MODIFY array_job_id BIGINT NOT NULL,
MODIFY num_hwthreads INT NOT NULL,
MODIFY num_acc INT NOT NULL;
@@ -1,5 +0,0 @@
ALTER TABLE job
MODIFY `partition` VARCHAR(255),
MODIFY array_job_id BIGINT,
MODIFY num_hwthreads INT,
MODIFY num_acc INT;
@@ -1,2 +0,0 @@
ALTER TABLE tag DROP COLUMN insert_time;
ALTER TABLE jobtag DROP COLUMN insert_time;
@@ -1,2 +0,0 @@
ALTER TABLE tag ADD COLUMN insert_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP;
ALTER TABLE jobtag ADD COLUMN insert_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP;
@@ -1 +0,0 @@
ALTER TABLE configuration MODIFY value VARCHAR(255);
@@ -1 +0,0 @@
ALTER TABLE configuration MODIFY value TEXT;
@@ -1,3 +0,0 @@
SET FOREIGN_KEY_CHECKS = 0;
ALTER TABLE tag MODIFY id INTEGER;
SET FOREIGN_KEY_CHECKS = 1;
@@ -1,3 +0,0 @@
SET FOREIGN_KEY_CHECKS = 0;
ALTER TABLE tag MODIFY id INTEGER AUTO_INCREMENT;
SET FOREIGN_KEY_CHECKS = 1;
@@ -1,83 +0,0 @@
ALTER TABLE job DROP energy;
ALTER TABLE job DROP energy_footprint;
ALTER TABLE job ADD COLUMN flops_any_avg;
ALTER TABLE job ADD COLUMN mem_bw_avg;
ALTER TABLE job ADD COLUMN mem_used_max;
ALTER TABLE job ADD COLUMN load_avg;
ALTER TABLE job ADD COLUMN net_bw_avg;
ALTER TABLE job ADD COLUMN net_data_vol_total;
ALTER TABLE job ADD COLUMN file_bw_avg;
ALTER TABLE job ADD COLUMN file_data_vol_total;

UPDATE job SET flops_any_avg = json_extract(footprint, '$.flops_any_avg');
UPDATE job SET mem_bw_avg = json_extract(footprint, '$.mem_bw_avg');
UPDATE job SET mem_used_max = json_extract(footprint, '$.mem_used_max');
UPDATE job SET load_avg = json_extract(footprint, '$.cpu_load_avg');
UPDATE job SET net_bw_avg = json_extract(footprint, '$.net_bw_avg');
UPDATE job SET net_data_vol_total = json_extract(footprint, '$.net_data_vol_total');
UPDATE job SET file_bw_avg = json_extract(footprint, '$.file_bw_avg');
UPDATE job SET file_data_vol_total = json_extract(footprint, '$.file_data_vol_total');

ALTER TABLE job DROP footprint;
-- Do not use reserved keywords anymore
RENAME TABLE hpc_user TO `user`;
ALTER TABLE job RENAME COLUMN hpc_user TO `user`;
ALTER TABLE job RENAME COLUMN cluster_partition TO `partition`;

DROP INDEX IF EXISTS jobs_cluster;
DROP INDEX IF EXISTS jobs_cluster_user;
DROP INDEX IF EXISTS jobs_cluster_project;
DROP INDEX IF EXISTS jobs_cluster_subcluster;
DROP INDEX IF EXISTS jobs_cluster_starttime;
DROP INDEX IF EXISTS jobs_cluster_duration;
DROP INDEX IF EXISTS jobs_cluster_numnodes;

DROP INDEX IF EXISTS jobs_cluster_partition;
DROP INDEX IF EXISTS jobs_cluster_partition_starttime;
DROP INDEX IF EXISTS jobs_cluster_partition_duration;
DROP INDEX IF EXISTS jobs_cluster_partition_numnodes;

DROP INDEX IF EXISTS jobs_cluster_partition_jobstate;
DROP INDEX IF EXISTS jobs_cluster_partition_jobstate_user;
DROP INDEX IF EXISTS jobs_cluster_partition_jobstate_project;
DROP INDEX IF EXISTS jobs_cluster_partition_jobstate_starttime;
DROP INDEX IF EXISTS jobs_cluster_partition_jobstate_duration;
DROP INDEX IF EXISTS jobs_cluster_partition_jobstate_numnodes;

DROP INDEX IF EXISTS jobs_cluster_jobstate;
DROP INDEX IF EXISTS jobs_cluster_jobstate_user;
DROP INDEX IF EXISTS jobs_cluster_jobstate_project;

DROP INDEX IF EXISTS jobs_cluster_jobstate_starttime;
DROP INDEX IF EXISTS jobs_cluster_jobstate_duration;
DROP INDEX IF EXISTS jobs_cluster_jobstate_numnodes;

DROP INDEX IF EXISTS jobs_user;
DROP INDEX IF EXISTS jobs_user_starttime;
DROP INDEX IF EXISTS jobs_user_duration;
DROP INDEX IF EXISTS jobs_user_numnodes;

DROP INDEX IF EXISTS jobs_project;
DROP INDEX IF EXISTS jobs_project_user;
DROP INDEX IF EXISTS jobs_project_starttime;
DROP INDEX IF EXISTS jobs_project_duration;
DROP INDEX IF EXISTS jobs_project_numnodes;

DROP INDEX IF EXISTS jobs_jobstate;
DROP INDEX IF EXISTS jobs_jobstate_user;
DROP INDEX IF EXISTS jobs_jobstate_project;
DROP INDEX IF EXISTS jobs_jobstate_starttime;
DROP INDEX IF EXISTS jobs_jobstate_duration;
DROP INDEX IF EXISTS jobs_jobstate_numnodes;

DROP INDEX IF EXISTS jobs_arrayjobid_starttime;
DROP INDEX IF EXISTS jobs_cluster_arrayjobid_starttime;

DROP INDEX IF EXISTS jobs_starttime;
DROP INDEX IF EXISTS jobs_duration;
DROP INDEX IF EXISTS jobs_numnodes;

DROP INDEX IF EXISTS jobs_duration_starttime;
DROP INDEX IF EXISTS jobs_numnodes_starttime;
DROP INDEX IF EXISTS jobs_numacc_starttime;
DROP INDEX IF EXISTS jobs_energy_starttime;
@@ -1,123 +0,0 @@
DROP INDEX IF EXISTS job_stats ON job;
DROP INDEX IF EXISTS job_by_user ON job;
DROP INDEX IF EXISTS job_by_starttime ON job;
DROP INDEX IF EXISTS job_by_job_id ON job;
DROP INDEX IF EXISTS job_list ON job;
DROP INDEX IF EXISTS job_list_user ON job;
DROP INDEX IF EXISTS job_list_users ON job;
DROP INDEX IF EXISTS job_list_users_start ON job;

ALTER TABLE job ADD COLUMN energy REAL NOT NULL DEFAULT 0.0;
ALTER TABLE job ADD COLUMN energy_footprint JSON;

ALTER TABLE job ADD COLUMN footprint JSON;
ALTER TABLE tag ADD COLUMN tag_scope TEXT NOT NULL DEFAULT 'global';

-- Do not use reserved keywords anymore
RENAME TABLE `user` TO hpc_user;
ALTER TABLE job RENAME COLUMN `user` TO hpc_user;
ALTER TABLE job RENAME COLUMN `partition` TO cluster_partition;

ALTER TABLE job MODIFY COLUMN cluster VARCHAR(50);
ALTER TABLE job MODIFY COLUMN hpc_user VARCHAR(50);
ALTER TABLE job MODIFY COLUMN subcluster VARCHAR(50);
ALTER TABLE job MODIFY COLUMN project VARCHAR(50);
ALTER TABLE job MODIFY COLUMN cluster_partition VARCHAR(50);
ALTER TABLE job MODIFY COLUMN job_state VARCHAR(25);

UPDATE job SET footprint = '{"flops_any_avg": 0.0}';
UPDATE job SET footprint = json_replace(footprint, '$.flops_any_avg', job.flops_any_avg);
UPDATE job SET footprint = json_insert(footprint, '$.mem_bw_avg', job.mem_bw_avg);
UPDATE job SET footprint = json_insert(footprint, '$.mem_used_max', job.mem_used_max);
UPDATE job SET footprint = json_insert(footprint, '$.cpu_load_avg', job.load_avg);
UPDATE job SET footprint = json_insert(footprint, '$.net_bw_avg', job.net_bw_avg) WHERE job.net_bw_avg != 0;
UPDATE job SET footprint = json_insert(footprint, '$.net_data_vol_total', job.net_data_vol_total) WHERE job.net_data_vol_total != 0;
UPDATE job SET footprint = json_insert(footprint, '$.file_bw_avg', job.file_bw_avg) WHERE job.file_bw_avg != 0;
UPDATE job SET footprint = json_insert(footprint, '$.file_data_vol_total', job.file_data_vol_total) WHERE job.file_data_vol_total != 0;

ALTER TABLE job DROP flops_any_avg;
ALTER TABLE job DROP mem_bw_avg;
ALTER TABLE job DROP mem_used_max;
ALTER TABLE job DROP load_avg;
ALTER TABLE job DROP net_bw_avg;
ALTER TABLE job DROP net_data_vol_total;
ALTER TABLE job DROP file_bw_avg;
ALTER TABLE job DROP file_data_vol_total;

-- Indices for: Single filters, combined filters, sorting, sorting with filters
-- Cluster Filter
CREATE INDEX IF NOT EXISTS jobs_cluster ON job (cluster);
CREATE INDEX IF NOT EXISTS jobs_cluster_user ON job (cluster, hpc_user);
CREATE INDEX IF NOT EXISTS jobs_cluster_project ON job (cluster, project);
CREATE INDEX IF NOT EXISTS jobs_cluster_subcluster ON job (cluster, subcluster);
-- Cluster Filter Sorting
CREATE INDEX IF NOT EXISTS jobs_cluster_starttime ON job (cluster, start_time);
CREATE INDEX IF NOT EXISTS jobs_cluster_duration ON job (cluster, duration);
CREATE INDEX IF NOT EXISTS jobs_cluster_numnodes ON job (cluster, num_nodes);

-- Cluster+Partition Filter
CREATE INDEX IF NOT EXISTS jobs_cluster_partition ON job (cluster, cluster_partition);
-- Cluster+Partition Filter Sorting
CREATE INDEX IF NOT EXISTS jobs_cluster_partition_starttime ON job (cluster, cluster_partition, start_time);
CREATE INDEX IF NOT EXISTS jobs_cluster_partition_duration ON job (cluster, cluster_partition, duration);
CREATE INDEX IF NOT EXISTS jobs_cluster_partition_numnodes ON job (cluster, cluster_partition, num_nodes);

-- Cluster+Partition+Jobstate Filter
CREATE INDEX IF NOT EXISTS jobs_cluster_partition_jobstate ON job (cluster, cluster_partition, job_state);
CREATE INDEX IF NOT EXISTS jobs_cluster_partition_jobstate_user ON job (cluster, cluster_partition, job_state, hpc_user);
CREATE INDEX IF NOT EXISTS jobs_cluster_partition_jobstate_project ON job (cluster, cluster_partition, job_state, project);
-- Cluster+Partition+Jobstate Filter Sorting
CREATE INDEX IF NOT EXISTS jobs_cluster_partition_jobstate_starttime ON job (cluster, cluster_partition, job_state, start_time);
CREATE INDEX IF NOT EXISTS jobs_cluster_partition_jobstate_duration ON job (cluster, cluster_partition, job_state, duration);
CREATE INDEX IF NOT EXISTS jobs_cluster_partition_jobstate_numnodes ON job (cluster, cluster_partition, job_state, num_nodes);

-- Cluster+JobState Filter
CREATE INDEX IF NOT EXISTS jobs_cluster_jobstate ON job (cluster, job_state);
CREATE INDEX IF NOT EXISTS jobs_cluster_jobstate_user ON job (cluster, job_state, hpc_user);
CREATE INDEX IF NOT EXISTS jobs_cluster_jobstate_project ON job (cluster, job_state, project);
-- Cluster+JobState Filter Sorting
CREATE INDEX IF NOT EXISTS jobs_cluster_jobstate_starttime ON job (cluster, job_state, start_time);
CREATE INDEX IF NOT EXISTS jobs_cluster_jobstate_duration ON job (cluster, job_state, duration);
CREATE INDEX IF NOT EXISTS jobs_cluster_jobstate_numnodes ON job (cluster, job_state, num_nodes);

-- User Filter
CREATE INDEX IF NOT EXISTS jobs_user ON job (hpc_user);
-- User Filter Sorting
CREATE INDEX IF NOT EXISTS jobs_user_starttime ON job (hpc_user, start_time);
CREATE INDEX IF NOT EXISTS jobs_user_duration ON job (hpc_user, duration);
CREATE INDEX IF NOT EXISTS jobs_user_numnodes ON job (hpc_user, num_nodes);

-- Project Filter
CREATE INDEX IF NOT EXISTS jobs_project ON job (project);
CREATE INDEX IF NOT EXISTS jobs_project_user ON job (project, hpc_user);
-- Project Filter Sorting
CREATE INDEX IF NOT EXISTS jobs_project_starttime ON job (project, start_time);
CREATE INDEX IF NOT EXISTS jobs_project_duration ON job (project, duration);
CREATE INDEX IF NOT EXISTS jobs_project_numnodes ON job (project, num_nodes);

-- JobState Filter
CREATE INDEX IF NOT EXISTS jobs_jobstate ON job (job_state);
CREATE INDEX IF NOT EXISTS jobs_jobstate_user ON job (job_state, hpc_user);
CREATE INDEX IF NOT EXISTS jobs_jobstate_project ON job (job_state, project);
CREATE INDEX IF NOT EXISTS jobs_jobstate_cluster ON job (job_state, cluster);
-- JobState Filter Sorting
CREATE INDEX IF NOT EXISTS jobs_jobstate_starttime ON job (job_state, start_time);
CREATE INDEX IF NOT EXISTS jobs_jobstate_duration ON job (job_state, duration);
CREATE INDEX IF NOT EXISTS jobs_jobstate_numnodes ON job (job_state, num_nodes);

-- ArrayJob Filter
CREATE INDEX IF NOT EXISTS jobs_arrayjobid_starttime ON job (array_job_id, start_time);
CREATE INDEX IF NOT EXISTS jobs_cluster_arrayjobid_starttime ON job (cluster, array_job_id, start_time);

-- Sorting without active filters
CREATE INDEX IF NOT EXISTS jobs_starttime ON job (start_time);
CREATE INDEX IF NOT EXISTS jobs_duration ON job (duration);
CREATE INDEX IF NOT EXISTS jobs_numnodes ON job (num_nodes);

-- Single filters with default starttime sorting
CREATE INDEX IF NOT EXISTS jobs_duration_starttime ON job (duration, start_time);
CREATE INDEX IF NOT EXISTS jobs_numnodes_starttime ON job (num_nodes, start_time);
CREATE INDEX IF NOT EXISTS jobs_numacc_starttime ON job (num_acc, start_time);
CREATE INDEX IF NOT EXISTS jobs_energy_starttime ON job (energy, start_time);

-- Optimize DB index usage
@@ -130,7 +130,7 @@ func nodeTestSetup(t *testing.T) {
 	}

 	dbfilepath := filepath.Join(tmpdir, "test.db")
-	err := MigrateDB("sqlite3", dbfilepath)
+	err := MigrateDB(dbfilepath)
 	if err != nil {
 		t.Fatal(err)
 	}

@@ -149,7 +149,7 @@ func setup(tb testing.TB) *JobRepository {
 	tb.Helper()
 	cclog.Init("warn", true)
 	dbfile := "testdata/job.db"
-	err := MigrateDB("sqlite3", dbfile)
+	err := MigrateDB(dbfile)
 	noErr(tb, err)
 	Connect("sqlite3", dbfile)
 	return GetJobRepository()

@@ -73,9 +73,6 @@ func (r *JobRepository) buildStatsQuery(
|
||||
col string,
|
||||
) sq.SelectBuilder {
|
||||
var query sq.SelectBuilder
|
||||
castType := r.getCastType()
|
||||
|
||||
// fmt.Sprintf(`CAST(ROUND((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END) / 3600) as %s) as value`, time.Now().Unix(), castType)
|
||||
|
||||
if col != "" {
|
||||
// Scan columns: id, name, totalJobs, totalUsers, totalWalltime, totalNodes, totalNodeHours, totalCores, totalCoreHours, totalAccs, totalAccHours
|
||||
@@ -84,26 +81,26 @@ func (r *JobRepository) buildStatsQuery(
|
||||
"name",
|
||||
"COUNT(job.id) as totalJobs",
|
||||
"COUNT(DISTINCT job.hpc_user) AS totalUsers",
|
||||
fmt.Sprintf(`CAST(ROUND(SUM((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END)) / 3600) as %s) as totalWalltime`, time.Now().Unix(), castType),
|
||||
fmt.Sprintf(`CAST(SUM(job.num_nodes) as %s) as totalNodes`, castType),
|
||||
fmt.Sprintf(`CAST(ROUND(SUM((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END) * job.num_nodes) / 3600) as %s) as totalNodeHours`, time.Now().Unix(), castType),
|
||||
fmt.Sprintf(`CAST(SUM(job.num_hwthreads) as %s) as totalCores`, castType),
|
||||
fmt.Sprintf(`CAST(ROUND(SUM((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END) * job.num_hwthreads) / 3600) as %s) as totalCoreHours`, time.Now().Unix(), castType),
|
||||
fmt.Sprintf(`CAST(SUM(job.num_acc) as %s) as totalAccs`, castType),
|
||||
fmt.Sprintf(`CAST(ROUND(SUM((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END) * job.num_acc) / 3600) as %s) as totalAccHours`, time.Now().Unix(), castType),
|
||||
fmt.Sprintf(`CAST(ROUND(SUM((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END)) / 3600) as int) as totalWalltime`, time.Now().Unix()),
|
||||
fmt.Sprintf(`CAST(SUM(job.num_nodes) as int) as totalNodes`),
|
||||
fmt.Sprintf(`CAST(ROUND(SUM((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END) * job.num_nodes) / 3600) as int) as totalNodeHours`, time.Now().Unix()),
|
||||
fmt.Sprintf(`CAST(SUM(job.num_hwthreads) as int) as totalCores`),
|
||||
fmt.Sprintf(`CAST(ROUND(SUM((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END) * job.num_hwthreads) / 3600) as int) as totalCoreHours`, time.Now().Unix()),
|
||||
fmt.Sprintf(`CAST(SUM(job.num_acc) as int) as totalAccs`),
|
||||
fmt.Sprintf(`CAST(ROUND(SUM((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END) * job.num_acc) / 3600) as int) as totalAccHours`, time.Now().Unix()),
|
||||
).From("job").LeftJoin("hpc_user ON hpc_user.username = job.hpc_user").GroupBy(col)
|
||||
} else {
|
||||
// Scan columns: totalJobs, totalUsers, totalWalltime, totalNodes, totalNodeHours, totalCores, totalCoreHours, totalAccs, totalAccHours
|
||||
query = sq.Select(
|
||||
"COUNT(job.id) as totalJobs",
|
||||
"COUNT(DISTINCT job.hpc_user) AS totalUsers",
|
||||
fmt.Sprintf(`CAST(ROUND(SUM((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END)) / 3600) as %s)`, time.Now().Unix(), castType),
|
||||
fmt.Sprintf(`CAST(SUM(job.num_nodes) as %s)`, castType),
|
||||
fmt.Sprintf(`CAST(ROUND(SUM((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END) * job.num_nodes) / 3600) as %s)`, time.Now().Unix(), castType),
|
||||
fmt.Sprintf(`CAST(SUM(job.num_hwthreads) as %s)`, castType),
|
||||
fmt.Sprintf(`CAST(ROUND(SUM((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END) * job.num_hwthreads) / 3600) as %s)`, time.Now().Unix(), castType),
|
||||
fmt.Sprintf(`CAST(SUM(job.num_acc) as %s)`, castType),
|
||||
fmt.Sprintf(`CAST(ROUND(SUM((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END) * job.num_acc) / 3600) as %s)`, time.Now().Unix(), castType),
|
||||
fmt.Sprintf(`CAST(ROUND(SUM((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END)) / 3600) as int)`, time.Now().Unix()),
|
||||
fmt.Sprintf(`CAST(SUM(job.num_nodes) as int)`),
|
||||
fmt.Sprintf(`CAST(ROUND(SUM((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END) * job.num_nodes) / 3600) as int)`, time.Now().Unix()),
|
||||
fmt.Sprintf(`CAST(SUM(job.num_hwthreads) as int)`),
|
||||
fmt.Sprintf(`CAST(ROUND(SUM((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END) * job.num_hwthreads) / 3600) as int)`, time.Now().Unix()),
|
||||
fmt.Sprintf(`CAST(SUM(job.num_acc) as int)`),
|
||||
fmt.Sprintf(`CAST(ROUND(SUM((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END) * job.num_acc) / 3600) as int)`, time.Now().Unix()),
|
||||
).From("job")
|
||||
}
|
||||
|
||||
@@ -114,21 +111,6 @@ func (r *JobRepository) buildStatsQuery(
|
||||
return query
|
||||
}
|
||||
|
||||
func (r *JobRepository) getCastType() string {
|
||||
var castType string
|
||||
|
||||
switch r.driver {
|
||||
case "sqlite3":
|
||||
castType = "int"
|
||||
case "mysql":
|
||||
castType = "unsigned"
|
||||
default:
|
||||
castType = ""
|
||||
}
|
||||
|
||||
return castType
|
||||
}
|
||||
|
||||
func (r *JobRepository) JobsStatsGrouped(
|
||||
ctx context.Context,
|
||||
filter []*model.JobFilter,
|
||||
@@ -477,10 +459,9 @@ func (r *JobRepository) AddHistograms(
|
||||
targetBinSize = 3600
|
||||
}
|
||||
|
||||
castType := r.getCastType()
|
||||
var err error
|
||||
// Return X-Values always as seconds, will be formatted into minutes and hours in frontend
|
||||
value := fmt.Sprintf(`CAST(ROUND(((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END) / %d) + 1) as %s) as value`, time.Now().Unix(), targetBinSize, castType)
|
||||
value := fmt.Sprintf(`CAST(ROUND(((CASE WHEN job.job_state = "running" THEN %d - job.start_time ELSE job.duration END) / %d) + 1) as int) as value`, time.Now().Unix(), targetBinSize)
|
||||
stat.HistDuration, err = r.jobsDurationStatisticsHistogram(ctx, value, filter, targetBinSize, &targetBinCount)
|
||||
if err != nil {
|
||||
cclog.Warn("Error while loading job statistics histogram: job duration")
|
||||
|
||||
@@ -224,10 +224,10 @@ func (r *JobRepository) CountTags(user *schema.User) (tags []schema.Tag, counts
 	}

 	// Query and Count Jobs with attached Tags
-	q := sq.Select("t.tag_name, t.id, count(jt.tag_id)").
+	q := sq.Select("t.tag_type, t.tag_name, t.id, count(jt.tag_id)").
 		From("tag t").
 		LeftJoin("jobtag jt ON t.id = jt.tag_id").
-		GroupBy("t.tag_name")
+		GroupBy("t.tag_type, t.tag_name")

 	// Build scope list for filtering
 	var scopeBuilder strings.Builder
@@ -260,14 +260,15 @@ func (r *JobRepository) CountTags(user *schema.User) (tags []schema.Tag, counts

 	counts = make(map[string]int)
 	for rows.Next() {
+		var tagType string
 		var tagName string
 		var tagId int
 		var count int
-		if err = rows.Scan(&tagName, &tagId, &count); err != nil {
+		if err = rows.Scan(&tagType, &tagName, &tagId, &count); err != nil {
 			return nil, nil, err
 		}
 		// Use tagId as second Map-Key component to differentiate tags with identical names
-		counts[fmt.Sprint(tagName, tagId)] = count
+		counts[fmt.Sprint(tagType, tagName, tagId)] = count
 	}
 	err = rows.Err()

|
||||
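A note on the composite map key used in the `CountTags` hunk above. The following is a minimal, self-contained sketch of how `fmt.Sprint` joins the three key components, together with a struct key as one possible alternative. The names `stringKey` and `tagKey` are hypothetical and introduced here only for illustration; they are not part of the repository, and the struct key is not what the commit does.

```go
package main

import "fmt"

// tagKey is a hypothetical struct key. Structs whose fields are all
// comparable can be used directly as Go map keys.
type tagKey struct {
	Type string
	Name string
	ID   int
}

// stringKey mirrors the key construction shown in the diff.
// fmt.Sprint only inserts a space between two operands when neither
// of them is a string, so the three parts are concatenated directly.
func stringKey(tagType, tagName string, id int) string {
	return fmt.Sprint(tagType, tagName, id)
}

func main() {
	fmt.Println(stringKey("perf", "slow", 7))

	// Because no separator is inserted, a name ending in a digit can
	// produce the same string as a different name/ID pair.
	fmt.Println(stringKey("perf", "run1", 2) == stringKey("perf", "run", 12))

	// A struct key keeps the components separate.
	counts := map[tagKey]int{}
	counts[tagKey{Type: "perf", Name: "run1", ID: 2}] = 3
	counts[tagKey{Type: "perf", Name: "run", ID: 12}] = 5
	fmt.Println(len(counts))
}
```

Whether such a collision can occur in practice depends on the tag names in use; the sketch only shows the behaviour of the key construction itself.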
@@ -42,7 +42,7 @@ func setupUserTest(t *testing.T) *UserCfgRepo {

 	cclog.Init("info", true)
 	dbfilepath := "testdata/job.db"
-	err := MigrateDB("sqlite3", dbfilepath)
+	err := MigrateDB(dbfilepath)
 	if err != nil {
 		t.Fatal(err)
 	}

@@ -205,13 +205,13 @@ func setupTaglistRoute(i InfoType, r *http.Request) InfoType {
|
||||
"id": tag.ID,
|
||||
"name": tag.Name,
|
||||
"scope": tag.Scope,
|
||||
"count": counts[fmt.Sprint(tag.Name, tag.ID)],
|
||||
"count": counts[fmt.Sprint(tag.Type, tag.Name, tag.ID)],
|
||||
}
|
||||
tagMap[tag.Type] = append(tagMap[tag.Type], tagItem)
|
||||
}
|
||||
} else if userAuthlevel < 4 && userAuthlevel >= 2 { // User+ : Show global and admin scope only if at least 1 tag used, private scope regardless of count
|
||||
for _, tag := range tags {
|
||||
tagCount := counts[fmt.Sprint(tag.Name, tag.ID)]
|
||||
tagCount := counts[fmt.Sprint(tag.Type, tag.Name, tag.ID)]
|
||||
if ((tag.Scope == "global" || tag.Scope == "admin") && tagCount >= 1) || (tag.Scope != "global" && tag.Scope != "admin") {
|
||||
tagItem := map[string]interface{}{
|
||||
"id": tag.ID,
|
||||
|
||||
@@ -15,7 +15,7 @@ func setup(tb testing.TB) *repository.JobRepository {
|
||||
tb.Helper()
|
||||
cclog.Init("warn", true)
|
||||
dbfile := "../repository/testdata/job.db"
|
||||
err := repository.MigrateDB("sqlite3", dbfile)
|
||||
err := repository.MigrateDB(dbfile)
|
||||
noErr(tb, err)
|
||||
repository.Connect("sqlite3", dbfile)
|
||||
return repository.GetJobRepository()
|
||||
|
||||
@@ -6,7 +6,6 @@
|
||||
package archive
|
||||
|
||||
import (
|
||||
"errors"
|
||||
"fmt"
|
||||
|
||||
cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
|
||||
@@ -16,11 +15,14 @@ import (
|
||||
var (
|
||||
Clusters []*schema.Cluster
|
||||
GlobalMetricList []*schema.GlobalMetricListItem
|
||||
GlobalUserMetricList []*schema.GlobalMetricListItem
|
||||
NodeLists map[string]map[string]NodeList
|
||||
)
|
||||
|
||||
func initClusterConfig() error {
|
||||
Clusters = []*schema.Cluster{}
|
||||
GlobalMetricList = []*schema.GlobalMetricListItem{}
|
||||
GlobalUserMetricList = []*schema.GlobalMetricListItem{}
|
||||
NodeLists = map[string]map[string]NodeList{}
|
||||
metricLookup := make(map[string]schema.GlobalMetricListItem)
|
||||
|
||||
@@ -29,38 +31,41 @@ func initClusterConfig() error {
|
||||
cluster, err := ar.LoadClusterCfg(c)
|
||||
if err != nil {
|
||||
cclog.Warnf("Error while loading cluster config for cluster '%v'", c)
|
||||
return err
|
||||
return fmt.Errorf("failed to load cluster config for '%s': %w", c, err)
|
||||
}
|
||||
|
||||
if len(cluster.Name) == 0 ||
|
||||
len(cluster.MetricConfig) == 0 ||
|
||||
len(cluster.SubClusters) == 0 {
|
||||
return errors.New("cluster.name, cluster.metricConfig and cluster.SubClusters should not be empty")
|
||||
if len(cluster.Name) == 0 {
|
||||
return fmt.Errorf("cluster name is empty in config for '%s'", c)
|
||||
}
|
||||
if len(cluster.MetricConfig) == 0 {
|
||||
return fmt.Errorf("cluster '%s' has no metric configurations", cluster.Name)
|
||||
}
|
||||
if len(cluster.SubClusters) == 0 {
|
||||
return fmt.Errorf("cluster '%s' has no subclusters defined", cluster.Name)
|
||||
}
|
||||
|
||||
for _, mc := range cluster.MetricConfig {
|
||||
if len(mc.Name) == 0 {
|
||||
return errors.New("cluster.metricConfig.name should not be empty")
|
||||
return fmt.Errorf("cluster '%s' has a metric config with empty name", cluster.Name)
|
||||
}
|
||||
if mc.Timestep < 1 {
|
||||
return errors.New("cluster.metricConfig.timestep should not be smaller than one")
|
||||
return fmt.Errorf("metric '%s' in cluster '%s' has invalid timestep %d (must be >= 1)", mc.Name, cluster.Name, mc.Timestep)
|
||||
}
|
||||
|
||||
// For backwards compability...
|
||||
// For backwards compatibility...
|
||||
if mc.Scope == "" {
|
||||
mc.Scope = schema.MetricScopeNode
|
||||
}
|
||||
if !mc.Scope.Valid() {
|
||||
return errors.New("cluster.metricConfig.scope must be a valid scope ('node', 'scocket', ...)")
|
||||
return fmt.Errorf("metric '%s' in cluster '%s' has invalid scope '%s' (must be 'node', 'socket', 'core', etc.)", mc.Name, cluster.Name, mc.Scope)
|
||||
}
|
||||
|
||||
ml, ok := metricLookup[mc.Name]
|
||||
if !ok {
|
||||
if _, ok := metricLookup[mc.Name]; !ok {
|
||||
metricLookup[mc.Name] = schema.GlobalMetricListItem{
|
||||
Name: mc.Name, Scope: mc.Scope, Unit: mc.Unit, Footprint: mc.Footprint,
|
||||
Name: mc.Name, Scope: mc.Scope, Restrict: mc.Restrict, Unit: mc.Unit, Footprint: mc.Footprint,
|
||||
}
|
||||
ml = metricLookup[mc.Name]
|
||||
}
|
||||
|
||||
availability := schema.ClusterSupport{Cluster: cluster.Name}
|
||||
scLookup := make(map[string]*schema.SubClusterConfig)
|
||||
|
||||
@@ -90,8 +95,9 @@ func initClusterConfig() error {
|
||||
}
|
||||
|
||||
if cfg, ok := scLookup[sc.Name]; ok {
|
||||
if !cfg.Remove {
|
||||
availability.SubClusters = append(availability.SubClusters, sc.Name)
|
||||
if cfg.Remove {
|
||||
continue
|
||||
}
|
||||
newMetric.Peak = cfg.Peak
|
||||
newMetric.Normal = cfg.Normal
|
||||
newMetric.Caution = cfg.Caution
|
||||
@@ -99,30 +105,25 @@ func initClusterConfig() error {
|
||||
newMetric.Footprint = cfg.Footprint
|
||||
newMetric.Energy = cfg.Energy
|
||||
newMetric.LowerIsBetter = cfg.LowerIsBetter
|
||||
sc.MetricConfig = append(sc.MetricConfig, *newMetric)
|
||||
}
|
||||
|
||||
if newMetric.Footprint != "" {
|
||||
sc.Footprint = append(sc.Footprint, newMetric.Name)
|
||||
ml.Footprint = newMetric.Footprint
|
||||
}
|
||||
if newMetric.Energy != "" {
|
||||
sc.EnergyFootprint = append(sc.EnergyFootprint, newMetric.Name)
|
||||
}
|
||||
}
|
||||
} else {
|
||||
availability.SubClusters = append(availability.SubClusters, sc.Name)
|
||||
sc.MetricConfig = append(sc.MetricConfig, *newMetric)
|
||||
|
||||
if newMetric.Footprint != "" {
|
||||
sc.Footprint = append(sc.Footprint, newMetric.Name)
|
||||
item := metricLookup[mc.Name]
|
||||
item.Footprint = newMetric.Footprint
|
||||
metricLookup[mc.Name] = item
|
||||
}
|
||||
if newMetric.Energy != "" {
|
||||
sc.EnergyFootprint = append(sc.EnergyFootprint, newMetric.Name)
|
||||
}
|
||||
}
|
||||
}
|
||||
ml.Availability = append(metricLookup[mc.Name].Availability, availability)
|
||||
metricLookup[mc.Name] = ml
|
||||
|
||||
item := metricLookup[mc.Name]
|
||||
item.Availability = append(item.Availability, availability)
|
||||
metricLookup[mc.Name] = item
|
||||
}
|
||||
|
||||
Clusters = append(Clusters, cluster)
|
||||
@@ -141,8 +142,11 @@ func initClusterConfig() error {
|
||||
}
|
||||
}
|
||||
|
||||
for _, ml := range metricLookup {
|
||||
GlobalMetricList = append(GlobalMetricList, &ml)
|
||||
for _, metric := range metricLookup {
|
||||
GlobalMetricList = append(GlobalMetricList, &metric)
|
||||
if !metric.Restrict {
|
||||
GlobalUserMetricList = append(GlobalUserMetricList, &metric)
|
||||
}
|
||||
}
|
||||
|
||||
return nil
|
||||
|
||||
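The `initClusterConfig` hunks above replace a local copy (`ml`) with an explicit read-modify-write on `metricLookup`, and then collect pointers to the map values into two lists. A minimal sketch of both patterns follows, assuming simplified stand-in types: `metricItem`, `addAvailability`, and `splitLists` are hypothetical names, not identifiers from the repository. The explicit `m := m` copy is only needed under the loop-variable semantics used before Go 1.22; which semantics apply to the repository depends on its `go.mod`, which is not shown here.

```go
package main

import "fmt"

// metricItem is a simplified, hypothetical stand-in for a global metric entry.
type metricItem struct {
	Name         string
	Restrict     bool
	Availability []string
}

// addAvailability updates a struct stored by value in a map.
// Map values are not addressable, so the entry is copied out,
// modified, and written back.
func addAvailability(lookup map[string]metricItem, name, cluster string) {
	item := lookup[name]
	item.Name = name
	item.Availability = append(item.Availability, cluster)
	lookup[name] = item
}

// splitLists returns pointers to copies of all entries, plus the
// subset that is not restricted.
func splitLists(lookup map[string]metricItem) (all, user []*metricItem) {
	for _, m := range lookup {
		m := m // per-iteration copy; redundant from Go 1.22 on, required before
		all = append(all, &m)
		if !m.Restrict {
			user = append(user, &m)
		}
	}
	return all, user
}

func main() {
	lookup := map[string]metricItem{}
	addAvailability(lookup, "flops_any", "alpha")
	addAvailability(lookup, "flops_any", "beta")
	lookup["cpu_power"] = metricItem{Name: "cpu_power", Restrict: true}

	all, user := splitLists(lookup)
	fmt.Println(len(all), len(user), lookup["flops_any"].Availability)
}
```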
@@ -83,7 +83,7 @@ func Connect() {
|
||||
|
||||
client, err := NewClient(nil)
|
||||
if err != nil {
|
||||
cclog.Errorf("NATS connection failed: %v", err)
|
||||
cclog.Warnf("NATS connection failed: %v", err)
|
||||
return
|
||||
}
|
||||
|
||||
|
||||
20
startDemo.sh
@@ -1,22 +1,18 @@
|
||||
#!/bin/sh
|
||||
|
||||
# rm -rf var
|
||||
|
||||
if [ -d './var' ]; then
|
||||
echo 'Directory ./var already exists! Skipping initialization.'
|
||||
./cc-backend -server -dev
|
||||
./cc-backend -server -dev -loglevel info
|
||||
else
|
||||
make
|
||||
wget https://hpc-mover.rrze.uni-erlangen.de/HPC-Data/0x7b58aefb/eig7ahyo6fo2bais0ephuf2aitohv1ai/job-archive-dev.tar
|
||||
tar xf job-archive-dev.tar
|
||||
rm ./job-archive-dev.tar
|
||||
|
||||
cp ./configs/env-template.txt .env
|
||||
./cc-backend --init
|
||||
cp ./configs/config-demo.json config.json
|
||||
|
||||
./cc-backend -migrate-db
|
||||
wget https://hpc-mover.rrze.uni-erlangen.de/HPC-Data/0x7b58aefb/eig7ahyo6fo2bais0ephuf2aitohv1ai/job-archive-demo.tar
|
||||
tar xf job-archive-demo.tar
|
||||
rm ./job-archive-demo.tar
|
||||
|
||||
./cc-backend -dev -init-db -add-user demo:admin,api:demo
|
||||
|
||||
./cc-backend -server -dev
|
||||
|
||||
./cc-backend -server -dev -loglevel info
|
||||
fi
|
||||
|
||||
|
||||
@@ -9,9 +9,11 @@ import (
|
||||
"encoding/json"
|
||||
"flag"
|
||||
"fmt"
|
||||
"io/fs"
|
||||
"os"
|
||||
"os/exec"
|
||||
"os/signal"
|
||||
"path/filepath"
|
||||
"strconv"
|
||||
"strings"
|
||||
"sync"
|
||||
@@ -39,28 +41,47 @@ func parseDate(in string) int64 {
|
||||
return 0
|
||||
}
|
||||
|
||||
// countJobs counts the total number of jobs in the source archive using external fd command.
|
||||
// It requires the fd binary to be available in PATH.
|
||||
// The srcConfig parameter should be the JSON configuration string containing the archive path.
|
||||
func countJobs(srcConfig string) (int, error) {
|
||||
fdPath, err := exec.LookPath("fd")
|
||||
if err != nil {
|
||||
return 0, fmt.Errorf("fd binary not found in PATH: %w", err)
|
||||
}
|
||||
|
||||
// parseArchivePath extracts the path from the source config JSON.
|
||||
func parseArchivePath(srcConfig string) (string, error) {
|
||||
var config struct {
|
||||
Kind string `json:"kind"`
|
||||
Path string `json:"path"`
|
||||
}
|
||||
if err := json.Unmarshal([]byte(srcConfig), &config); err != nil {
|
||||
return 0, fmt.Errorf("failed to parse source config: %w", err)
|
||||
return "", fmt.Errorf("failed to parse source config: %w", err)
|
||||
}
|
||||
|
||||
if config.Path == "" {
|
||||
return 0, fmt.Errorf("no path found in source config")
|
||||
return "", fmt.Errorf("no path found in source config")
|
||||
}
|
||||
|
||||
fdCmd := exec.Command(fdPath, "meta.json", config.Path)
|
||||
return config.Path, nil
|
||||
}
|
||||
|
||||
// countJobsNative counts jobs using native Go filepath.WalkDir.
|
||||
// This is used as a fallback when fd/fdfind is not available.
|
||||
func countJobsNative(archivePath string) (int, error) {
|
||||
count := 0
|
||||
err := filepath.WalkDir(archivePath, func(path string, d fs.DirEntry, err error) error {
|
||||
if err != nil {
|
||||
return nil // Skip directories we can't access
|
||||
}
|
||||
if !d.IsDir() && d.Name() == "meta.json" {
|
||||
count++
|
||||
}
|
||||
return nil
|
||||
})
|
||||
|
||||
if err != nil {
|
||||
return 0, fmt.Errorf("failed to walk directory: %w", err)
|
||||
}
|
||||
|
||||
return count, nil
|
||||
}
|
||||
|
||||
// countJobsWithFd counts jobs using the external fd command.
|
||||
func countJobsWithFd(fdPath, archivePath string) (int, error) {
|
||||
fdCmd := exec.Command(fdPath, "meta.json", archivePath)
|
||||
wcCmd := exec.Command("wc", "-l")
|
||||
|
||||
pipe, err := fdCmd.StdoutPipe()
|
||||
@@ -91,6 +112,31 @@ func countJobs(srcConfig string) (int, error) {
|
||||
return count, nil
|
||||
}
|
||||
|
||||
// countJobs counts the total number of jobs in the source archive.
|
||||
// It tries to use external fd/fdfind command for speed, falling back to
|
||||
// native Go filepath.WalkDir if neither is available.
|
||||
// The srcConfig parameter should be the JSON configuration string containing the archive path.
|
||||
func countJobs(srcConfig string) (int, error) {
|
||||
archivePath, err := parseArchivePath(srcConfig)
|
||||
if err != nil {
|
||||
return 0, err
|
||||
}
|
||||
|
||||
// Try fd first (common name)
|
||||
if fdPath, err := exec.LookPath("fd"); err == nil {
|
||||
return countJobsWithFd(fdPath, archivePath)
|
||||
}
|
||||
|
||||
// Try fdfind (Debian/Ubuntu package name)
|
||||
if fdPath, err := exec.LookPath("fdfind"); err == nil {
|
||||
return countJobsWithFd(fdPath, archivePath)
|
||||
}
|
||||
|
||||
// Fall back to native Go implementation
|
||||
cclog.Debug("fd/fdfind not found, using native Go file walker")
|
||||
return countJobsNative(archivePath)
|
||||
}
|
||||
|
||||
// formatDuration formats a duration as a human-readable string.
|
||||
func formatDuration(d time.Duration) string {
|
||||
if d < time.Minute {
|
||||
|
||||
@@ -30,7 +30,8 @@
|
||||
Table,
|
||||
Progress,
|
||||
Icon,
|
||||
Button
|
||||
Button,
|
||||
Badge
|
||||
} from "@sveltestrap/sveltestrap";
|
||||
import Roofline from "./generic/plots/Roofline.svelte";
|
||||
import Pie, { colors } from "./generic/plots/Pie.svelte";
|
||||
@@ -85,7 +86,8 @@
|
||||
query: gql`
|
||||
query (
|
||||
$cluster: String!
|
||||
$metrics: [String!]
|
||||
$nmetrics: [String!]
|
||||
$cmetrics: [String!]
|
||||
$from: Time!
|
||||
$to: Time!
|
||||
$clusterFrom: Time!
|
||||
@@ -97,7 +99,7 @@
|
||||
# Node 5 Minute Averages for Roofline
|
||||
nodeMetrics(
|
||||
cluster: $cluster
|
||||
metrics: $metrics
|
||||
metrics: $nmetrics
|
||||
from: $from
|
||||
to: $to
|
||||
) {
|
||||
@@ -106,6 +108,10 @@
|
||||
metrics {
|
||||
name
|
||||
metric {
|
||||
unit {
|
||||
base
|
||||
prefix
|
||||
}
|
||||
series {
|
||||
statistics {
|
||||
avg
|
||||
@@ -114,21 +120,6 @@
|
||||
}
|
||||
}
|
||||
}
|
||||
# Running Job Metric Average for Rooflines
|
||||
jobsMetricStats(filter: $jobFilter, metrics: $metrics) {
|
||||
id
|
||||
jobId
|
||||
duration
|
||||
numNodes
|
||||
numAccelerators
|
||||
subCluster
|
||||
stats {
|
||||
name
|
||||
data {
|
||||
avg
|
||||
}
|
||||
}
|
||||
}
|
||||
# Get Jobs for Per-Node Counts
|
||||
jobs(filter: $jobFilter, order: $sorting, page: $paging) {
|
||||
items {
|
||||
@@ -175,7 +166,7 @@
|
||||
# ClusterMetrics for doubleMetricPlot
|
||||
clusterMetrics(
|
||||
cluster: $cluster
|
||||
metrics: $metrics
|
||||
metrics: $cmetrics
|
||||
from: $clusterFrom
|
||||
to: $to
|
||||
) {
|
||||
@@ -194,7 +185,8 @@
|
||||
`,
|
||||
variables: {
|
||||
cluster: presetCluster,
|
||||
metrics: ["flops_any", "mem_bw"], // Metrics For Cluster Plot and Roofline
|
||||
nmetrics: ["flops_any", "mem_bw", "cpu_power", "acc_power"], // Metrics For Roofline and Stats
|
||||
cmetrics: ["flops_any", "mem_bw"], // Metrics For Cluster Plot
|
||||
from: from.toISOString(),
|
||||
clusterFrom: clusterFrom.toISOString(),
|
||||
to: to.toISOString(),
|
||||
@@ -258,6 +250,11 @@
|
||||
}
|
||||
}
|
||||
|
||||
// Get Idle Infos after Sums
|
||||
if (!rawInfos['idleNodes']) rawInfos['idleNodes'] = rawInfos['totalNodes'] - rawInfos['allocatedNodes'];
|
||||
if (!rawInfos['idleCores']) rawInfos['idleCores'] = rawInfos['totalCores'] - rawInfos['allocatedCores'];
|
||||
if (!rawInfos['idleAccs']) rawInfos['idleAccs'] = rawInfos['totalAccs'] - rawInfos['allocatedAccs'];
|
||||
|
||||
// Keymetrics (Data on Cluster-Scope)
|
||||
let rawFlops = $statusQuery?.data?.nodeMetrics?.reduce((sum, node) =>
|
||||
sum + (node.metrics.find((m) => m.name == 'flops_any')?.metric?.series[0]?.statistics?.avg || 0),
|
||||
@@ -271,6 +268,26 @@
|
||||
) || 0;
|
||||
rawInfos['memBwRate'] = Math.floor((rawMemBw * 100) / 100)
|
||||
|
||||
let rawCpuPwr = $statusQuery?.data?.nodeMetrics?.reduce((sum, node) =>
|
||||
sum + (node.metrics.find((m) => m.name == 'cpu_power')?.metric?.series[0]?.statistics?.avg || 0),
|
||||
0, // Initial Value
|
||||
) || 0;
|
||||
rawInfos['cpuPwr'] = Math.floor((rawCpuPwr * 100) / 100)
|
||||
if (!rawInfos['cpuPwrUnit']) {
|
||||
let rawCpuUnit = $statusQuery?.data?.nodeMetrics[0]?.metrics.find((m) => m.name == 'cpu_power')?.metric?.unit || null
|
||||
rawInfos['cpuPwrUnit'] = rawCpuUnit ? rawCpuUnit.prefix + rawCpuUnit.base : ''
|
||||
}
|
||||
|
||||
let rawGpuPwr = $statusQuery?.data?.nodeMetrics?.reduce((sum, node) =>
|
||||
sum + (node.metrics.find((m) => m.name == 'acc_power')?.metric?.series[0]?.statistics?.avg || 0),
|
||||
0, // Initial Value
|
||||
) || 0;
|
||||
rawInfos['gpuPwr'] = Math.floor((rawGpuPwr * 100) / 100)
|
||||
if (!rawInfos['gpuPwrUnit']) {
|
||||
let rawGpuUnit = $statusQuery?.data?.nodeMetrics[0]?.metrics.find((m) => m.name == 'acc_power')?.metric?.unit || null
|
||||
rawInfos['gpuPwrUnit'] = rawGpuUnit ? rawGpuUnit.prefix + rawGpuUnit.base : ''
|
||||
}
|
||||
|
||||
return rawInfos
|
||||
} else {
|
||||
return {};
|
||||
@@ -338,7 +355,7 @@
|
||||
</script>
|
||||
|
||||
<Card style="height: 98vh;">
|
||||
<CardBody class="align-content-center">
|
||||
<CardBody class="align-content-center p-2">
|
||||
<Row>
|
||||
<Col>
|
||||
<Refresher
|
||||
@@ -354,11 +371,6 @@
|
||||
}}
|
||||
/>
|
||||
</Col>
|
||||
<Col class="d-flex justify-content-end">
|
||||
<Button outline class="mb-1" size="sm" color="light" href="/">
|
||||
<Icon name="x"/>
|
||||
</Button>
|
||||
</Col>
|
||||
</Row>
|
||||
{#if $statusQuery.fetching || $statesTimed.fetching}
|
||||
<Row class="justify-content-center">
|
||||
@@ -368,6 +380,13 @@
|
||||
</Row>
|
||||
|
||||
{:else if $statusQuery.error || $statesTimed.error}
|
||||
<Row class="mb-2">
|
||||
<Col class="d-flex justify-content-end">
|
||||
<Button color="secondary" href="/">
|
||||
<Icon name="x"/>
|
||||
</Button>
|
||||
</Col>
|
||||
</Row>
|
||||
<Row cols={{xs:1, md:2}}>
|
||||
{#if $statusQuery.error}
|
||||
<Col>
|
||||
@@ -385,8 +404,17 @@
|
||||
<Row cols={{xs:1, md:2}}>
|
||||
<Col> <!-- General Cluster Info Card -->
|
||||
<Card class="h-100">
|
||||
<CardHeader class="text-center">
|
||||
<CardHeader>
|
||||
<Row>
|
||||
<Col xs="11" class="text-center">
|
||||
<h2 class="mb-0">Cluster {presetCluster.charAt(0).toUpperCase() + presetCluster.slice(1)}</h2>
|
||||
</Col>
|
||||
<Col xs="1" class="d-flex justify-content-end">
|
||||
<Button color="light" href="/">
|
||||
<Icon name="x"/>
|
||||
</Button>
|
||||
</Col>
|
||||
</Row>
|
||||
</CardHeader>
|
||||
<CardBody>
|
||||
<h4>CPU(s)</h4><p><strong>{[...clusterInfo?.processorTypes].join(', ')}</strong></p>
|
||||
@@ -397,79 +425,99 @@
|
||||
<Col> <!-- Utilization Info Card -->
|
||||
<Card class="h-100">
|
||||
<CardBody>
|
||||
<Table borderless>
|
||||
<tr class="py-2">
|
||||
<td style="font-size:x-large;">{clusterInfo?.runningJobs} Running Jobs</td>
|
||||
<td colspan="2" style="font-size:x-large;">{clusterInfo?.activeUsers} Active Users</td>
|
||||
</tr>
|
||||
<hr class="my-1"/>
|
||||
<tr class="pt-2">
|
||||
<td style="font-size: large;">
|
||||
Flop Rate (<span style="cursor: help;" title="Flops[Any] = (Flops[Double] x 2) + Flops[Single]">Any</span>)
|
||||
</td>
|
||||
<td colspan="2" style="font-size: large;">
|
||||
Memory BW Rate
|
||||
</td>
|
||||
</tr>
|
||||
<tr class="pb-2">
|
||||
<td style="font-size:x-large;">
|
||||
{clusterInfo?.flopRate}
|
||||
{clusterInfo?.flopRateUnit}
|
||||
</td>
|
||||
<td colspan="2" style="font-size:x-large;">
|
||||
{clusterInfo?.memBwRate}
|
||||
{clusterInfo?.memBwRateUnit}
|
||||
</td>
|
||||
</tr>
|
||||
<hr class="my-1"/>
|
||||
<tr class="py-2">
|
||||
<th scope="col">Allocated Nodes</th>
|
||||
<td style="min-width: 100px;"
|
||||
><div class="col">
|
||||
<Progress
|
||||
value={clusterInfo?.allocatedNodes}
|
||||
max={clusterInfo?.totalNodes}
|
||||
/>
|
||||
</div></td
|
||||
>
|
||||
<td
|
||||
>{clusterInfo?.allocatedNodes} / {clusterInfo?.totalNodes}
|
||||
Nodes</td
|
||||
>
|
||||
</tr>
|
||||
<tr class="py-2">
|
||||
<th scope="col">Allocated Cores</th>
|
||||
<td style="min-width: 100px;"
|
||||
><div class="col">
|
||||
<Progress
|
||||
value={clusterInfo?.allocatedCores}
|
||||
max={clusterInfo?.totalCores}
|
||||
/>
|
||||
</div></td
|
||||
>
|
||||
<td
|
||||
>{formatNumber(clusterInfo?.allocatedCores)} / {formatNumber(clusterInfo?.totalCores)}
|
||||
Cores</td
|
||||
>
|
||||
</tr>
|
||||
<Row class="mb-1">
|
||||
<Col xs={4} class="d-inline-flex align-items-center justify-content-center">
|
||||
<Badge color="primary" style="font-size:x-large;margin-right:0.25rem;">
|
||||
{clusterInfo?.runningJobs}
|
||||
</Badge>
|
||||
<div style="font-size:large;">
|
||||
Running Jobs
|
||||
</div>
|
||||
</Col>
|
||||
<Col xs={4} class="d-inline-flex align-items-center justify-content-center">
|
||||
<Badge color="primary" style="font-size:x-large;margin-right:0.25rem;">
|
||||
{clusterInfo?.activeUsers}
|
||||
</Badge>
|
||||
<div style="font-size:large;">
|
||||
Active Users
|
||||
</div>
|
||||
</Col>
|
||||
<Col xs={4} class="d-inline-flex align-items-center justify-content-center">
|
||||
<Badge color="primary" style="font-size:x-large;margin-right:0.25rem;">
|
||||
{clusterInfo?.allocatedNodes}
|
||||
</Badge>
|
||||
<div style="font-size:large;">
|
||||
Active Nodes
|
||||
</div>
|
||||
</Col>
|
||||
</Row>
|
||||
<Row class="mt-1 mb-2">
|
||||
<Col xs={4} class="d-inline-flex align-items-center justify-content-center">
|
||||
<Badge color="secondary" style="font-size:x-large;margin-right:0.25rem;">
|
||||
{clusterInfo?.flopRate} {clusterInfo?.flopRateUnit}
|
||||
</Badge>
|
||||
<div style="font-size:large;">
|
||||
Total Flop Rate
|
||||
</div>
|
||||
</Col>
|
||||
<Col xs={4} class="d-inline-flex align-items-center justify-content-center">
|
||||
<Badge color="secondary" style="font-size:x-large;margin-right:0.25rem;">
|
||||
{clusterInfo?.memBwRate} {clusterInfo?.memBwRateUnit}
|
||||
</Badge>
|
||||
<div style="font-size:large;">
|
||||
Total Memory Bandwidth
|
||||
</div>
|
||||
</Col>
|
||||
{#if clusterInfo?.totalAccs !== 0}
|
||||
<tr class="py-2">
|
||||
<th scope="col">Allocated Accelerators</th>
|
||||
<td style="min-width: 100px;"
|
||||
><div class="col">
|
||||
<Progress
|
||||
value={clusterInfo?.allocatedAccs}
|
||||
max={clusterInfo?.totalAccs}
|
||||
/>
|
||||
</div></td
|
||||
>
|
||||
<td
|
||||
>{clusterInfo?.allocatedAccs} / {clusterInfo?.totalAccs}
|
||||
Accelerators</td
|
||||
>
|
||||
</tr>
|
||||
<Col xs={4} class="d-inline-flex align-items-center justify-content-center">
|
||||
<Badge color="secondary" style="font-size:x-large;margin-right:0.25rem;">
|
||||
{clusterInfo?.gpuPwr} {clusterInfo?.gpuPwrUnit}
|
||||
</Badge>
|
||||
<div style="font-size:large;">
|
||||
Total GPU Power
|
||||
</div>
|
||||
</Col>
|
||||
{:else}
|
||||
<Col xs={4} class="d-inline-flex align-items-center justify-content-center">
|
||||
<Badge color="secondary" style="font-size:x-large;margin-right:0.25rem;">
|
||||
{clusterInfo?.cpuPwr} {clusterInfo?.cpuPwrUnit}
|
||||
</Badge>
|
||||
<div style="font-size:large;">
|
||||
Total CPU Power
|
||||
</div>
|
||||
</Col>
|
||||
{/if}
|
||||
</Row>
|
||||
<Row class="my-1 align-items-baseline">
<Col xs={2} style="font-size:large;">
Active Cores
</Col>
<Col xs={8}>
<Progress multi style="height:2.5rem;font-size:x-large;">
<Progress bar color="success" value={clusterInfo?.allocatedCores}>{formatNumber(clusterInfo?.allocatedCores)}</Progress>
<Progress bar color="light" value={clusterInfo?.idleCores}>{formatNumber(clusterInfo?.idleCores)}</Progress>
</Progress>
</Col>
<Col xs={2} style="font-size:large;">
Idle Cores
</Col>
</Row>
{#if clusterInfo?.totalAccs !== 0}
<Row class="my-1 align-items-baseline">
<Col xs={2} style="font-size:large;">
Active GPU
</Col>
<Col xs={8}>
<Progress multi style="height:2.5rem;font-size:x-large;">
<Progress bar color="success" value={clusterInfo?.allocatedAccs}>{formatNumber(clusterInfo?.allocatedAccs)}</Progress>
<Progress bar color="light" value={clusterInfo?.idleAccs}>{formatNumber(clusterInfo?.idleAccs)}</Progress>
</Progress>
</Col>
<Col xs={2} style="font-size:large;">
Idle GPU
</Col>
</Row>
{/if}
</Table>
</CardBody>
</Card>
</Col>
@@ -495,7 +543,7 @@
useColors={false}
useLegend={false}
allowSizeChange
width={colWidthRoof - 10}
width={colWidthRoof}
height={300}
cluster={presetCluster}
subCluster={clusterInfo?.roofData ? clusterInfo.roofData : null}
@@ -563,8 +611,8 @@
{#key $statesTimed?.data?.nodeStatesTimed}
<Stacked
data={$statesTimed?.data?.nodeStatesTimed}
width={colWidthStacked * 0.95}
xlabel="Time"
width={colWidthStacked}
height={260}
ylabel="Nodes"
yunit = "#Count"
title = "Cluster Status"

@@ -269,7 +269,7 @@
<NodeOverview {cluster} {ccconfig} {selectedMetric} {from} {to} {hostnameFilter} {hoststateFilter}/>
{:else}
<!-- ROW2-2: Node List (Grid Included)-->
<NodeList {cluster} {subCluster} {ccconfig} {selectedMetrics} {selectedResolution} {hostnameFilter} {hoststateFilter} {from} {to} {presetSystemUnits}/>
<NodeList {cluster} {subCluster} {ccconfig} pendingSelectedMetrics={selectedMetrics} {selectedResolution} {hostnameFilter} {hoststateFilter} {from} {to} {presetSystemUnits}/>
{/if}
{/if}

@@ -23,7 +23,7 @@
width = 0,
height = 300,
data = null,
xlabel = "",
xlabel = null,
ylabel = "",
yunit = "",
title = "",

@@ -107,13 +107,18 @@
}
}

function columnsDragOver(event) {
event.preventDefault();
event.dataTransfer.dropEffect = 'move';
}

function columnsDragStart(event, i) {
event.dataTransfer.effectAllowed = "move";
event.dataTransfer.dropEffect = "move";
event.dataTransfer.setData("text/plain", i);
}

function columnsDrag(event, target) {
function columnsDrop(event, target) {
event.dataTransfer.dropEffect = "move";
const start = Number.parseInt(event.dataTransfer.getData("text/plain"));

@@ -182,19 +187,18 @@
{/if}
{#each listedMetrics as metric, index (metric)}
<li
draggable
draggable={true}
class="cc-config-column list-group-item"
class:is-active={columnHovering === index}
ondragover={(event) => {
event.preventDefault()
return false
columnsDragOver(event)
}}
ondragstart={(event) => {
columnsDragStart(event, index)
}}
ondrop={(event) => {
event.preventDefault()
columnsDrag(event, index)
columnsDrop(event, index)
}}
ondragenter={() => (columnHovering = index)}
>

@@ -3,7 +3,7 @@
*/

const power = [1, 1e3, 1e6, 1e9, 1e12, 1e15, 1e18, 1e21]
const prefix = ['', 'K', 'M', 'G', 'T', 'P', 'E']
const prefix = ['', 'k', 'M', 'G', 'T', 'P', 'E']

export function formatNumber(x) {
if ( isNaN(x) || x == null) {

@@ -355,7 +355,7 @@

</script>

<Card style="height: 88vh;">
<Card>
<CardBody class="align-content-center">
<Row>
<Col>
@@ -540,7 +540,7 @@
<Roofline
useColors={true}
allowSizeChange
width={colWidthRoof - 10}
width={colWidthRoof}
height={300}
subCluster={clusterInfo?.roofData ? clusterInfo.roofData : null}
roofData={transformJobsStatsToData($statusQuery?.data?.jobsMetricStats)}
@@ -568,7 +568,8 @@
{#key $statesTimed?.data?.nodeStates}
<Stacked
data={$statesTimed?.data?.nodeStates}
width={colWidthStacked1 * 0.95}
width={colWidthStacked1}
height={330}
xlabel="Time"
ylabel="Nodes"
yunit = "#Count"
@@ -584,7 +585,8 @@
{#key $statesTimed?.data?.healthStates}
<Stacked
data={$statesTimed?.data?.healthStates}
width={colWidthStacked2 * 0.95}
width={colWidthStacked2}
height={330}
xlabel="Time"
ylabel="Nodes"
yunit = "#Count"

@@ -5,7 +5,7 @@
- `cluster String`: The nodes' cluster
- `subCluster String`: The nodes' subCluster [Default: ""]
- `ccconfig Object?`: The ClusterCockpit Config Context [Default: null]
- `selectedMetrics [String]`: The array of selected metrics [Default []]
- `pendingSelectedMetrics [String]`: The array of selected metrics [Default []]
- `selectedResolution Number?`: The selected data resolution [Default: 0]
- `hostnameFilter String?`: The active hostnamefilter [Default: ""]
- `hoststateFilter String?`: The active hoststatefilter [Default: ""]
@@ -27,7 +27,7 @@
cluster,
subCluster = "",
ccconfig = null,
selectedMetrics = [],
pendingSelectedMetrics = [],
selectedResolution = 0,
hostnameFilter = "",
hoststateFilter = "",
@@ -94,6 +94,7 @@

/* State Init */
let nodes = $state([]);
let selectedMetrics = $state(pendingSelectedMetrics);
let page = $state(1);
let itemsPerPage = $state(usePaging ? (ccconfig?.nodeList_nodesPerPage || 10) : 10);
let headerPaddingTop = $state(0);
@@ -110,7 +111,7 @@
stateFilter: hoststateFilter,
nodeFilter: hostnameFilter,
scopes: ["core", "socket", "accelerator"],
metrics: selectedMetrics,
metrics: pendingSelectedMetrics,
from: from.toISOString(),
to: to.toISOString(),
paging: paging,
@@ -140,15 +141,17 @@
$effect(() => {
if ($nodesQuery?.data) {
untrack(() => {
handleNodes($nodesQuery?.data?.nodeMetricsList);
nodes = handleNodes($nodesQuery?.data?.nodeMetricsList);
matchedNodes = $nodesQuery?.data?.totalNodes || 0;
});
selectedMetrics = [...pendingSelectedMetrics]; // Trigger Rerender in NodeListRow Only After Data is Fetched
};
});

$effect(() => {
// Triggers (Except Paging)
from, to
selectedMetrics, selectedResolution
pendingSelectedMetrics, selectedResolution
hostnameFilter, hoststateFilter
// Continous Scroll: Paging if parameters change: Existing entries will not match new selections
// Nodes Array Reset in HandleNodes func
@@ -162,17 +165,16 @@
if (data) {
if (usePaging) {
// console.log('New Paging', $state.snapshot(paging))
nodes = [...data.items].sort((a, b) => a.host.localeCompare(b.host));
return [...data.items].sort((a, b) => a.host.localeCompare(b.host));
} else {
if ($state.snapshot(page) == 1) {
// console.log('Page 1 Reset', [...data.items])
nodes = [...data.items].sort((a, b) => a.host.localeCompare(b.host));
return [...data.items].sort((a, b) => a.host.localeCompare(b.host));
} else {
// console.log('Add Nodes', $state.snapshot(nodes), [...data.items])
nodes = nodes.concat([...data.items])
return nodes.concat([...data.items])
}
}
matchedNodes = data.totalNodes;
};
};

@@ -228,7 +230,7 @@
{/if}
</th>

{#each selectedMetrics as metric (metric)}
{#each pendingSelectedMetrics as metric (metric)}
<th
class="position-sticky top-0 text-center"
scope="col"
@@ -246,18 +248,9 @@
<Card body color="danger">{$nodesQuery.error.message}</Card>
</Col>
</Row>
{:else}
{#each nodes as nodeData (nodeData.host)}
<NodeListRow {nodeData} {cluster} {selectedMetrics}/>
{:else}
{:else if $nodesQuery.fetching || !$nodesQuery.data}
<tr>
<td colspan={selectedMetrics.length + 1}> No nodes found </td>
</tr>
{/each}
{/if}
{#if $nodesQuery.fetching || !$nodesQuery.data}
<tr>
<td colspan={selectedMetrics.length + 1}>
<td colspan={pendingSelectedMetrics.length + 1}>
<div style="text-align:center;">
{#if !usePaging}
<p><b>
@@ -272,6 +265,14 @@
</div>
</td>
</tr>
{:else}
{#each nodes as nodeData (nodeData.host)}
<NodeListRow {nodeData} {cluster} {selectedMetrics}/>
{:else}
<tr>
<td colspan={selectedMetrics.length + 1}> No nodes found </td>
</tr>
{/each}
{/if}
</tbody>
</Table>

@@ -128,6 +128,24 @@
}
return pendingExtendedLegendData;
}

/* Inspect */
// $inspect(selectedMetrics).with((type, selectedMetrics) => {
// console.log(type, 'selectedMetrics', selectedMetrics)
// });

// $inspect(nodeData).with((type, nodeData) => {
// console.log(type, 'nodeData', nodeData)
// });

// $inspect(refinedData).with((type, refinedData) => {
// console.log(type, 'refinedData', refinedData)
// });

// $inspect(dataHealth).with((type, dataHealth) => {
// console.log(type, 'dataHealth', dataHealth)
// });

</script>

<tr>
@@ -148,13 +166,19 @@
hoststate={nodeData?.state? nodeData.state: 'notindb'}/>
{/if}
</td>
{#each refinedData as metricData (metricData.data.name)}
{#each refinedData as metricData, i (metricData?.data?.name || i)}
{#key metricData}
<td>
{#if metricData?.disabled}
<Card body class="mx-3" color="info"
>Metric disabled for subcluster <code
>{metricData.data.name}:{nodeData.subCluster}</code
>{metricData?.data?.name ? metricData.data.name : `Metric Index ${i}`}:{nodeData.subCluster}</code
></Card
>
{:else if !metricData?.data?.name}
<Card body class="mx-3" color="warning"
>Metric without name for subcluster <code
>{`Metric Index ${i}`}:{nodeData.subCluster}</code
></Card
>
{:else if !!metricData.data?.metric.statisticsSeries}