2 Commits

Author           SHA1        Message                                                                     Date
Christoph Kluge  d35d00a18e  Merge branch 'dev' into add_editmeta_byrequest                              2025-07-03 14:02:54 +02:00
Christoph Kluge  8dfaa78a83  feat: add editMetaByRequest, specifies job by parameters instead of dbid   2025-05-13 17:48:02 +02:00

240 changed files with 14902 additions and 29248 deletions

View File

@@ -1,15 +0,0 @@
# To get started with Dependabot version updates, you'll need to specify which
# package ecosystems to update and where the package manifests are located.
# Please see the documentation for all configuration options:
# https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file
version: 2
updates:
- package-ecosystem: "gomod"
directory: "/"
schedule:
interval: "weekly"
- package-ecosystem: "npm"
directory: "/web/frontend"
schedule:
interval: "weekly"

View File

@@ -7,7 +7,7 @@ jobs:
   - name: Install Go
     uses: actions/setup-go@v4
     with:
-      go-version: 1.25.x
+      go-version: 1.24.x
   - name: Checkout code
     uses: actions/checkout@v3
   - name: Build, Vet & Test

13
.gitignore vendored
View File

@@ -1,20 +1,14 @@
 /cc-backend
 /.env
 /config.json
-/uiConfig.json
 /var/job-archive
 /var/machine-state
-/var/*.db-shm
-/var/*.db-wal
+/var/job.db-shm
+/var/job.db-wal
 /var/*.db
 /var/*.txt
-/var/checkpoints*
-migrateTimestamps.pl
-test_ccms_write_api*
 /web/frontend/public/build
 /web/frontend/node_modules
@@ -27,6 +21,3 @@ test_ccms_write_api*
 /.vscode/*
 dist/
 *.db
-.idea
-tools/archive-migration/archive-migration
-tools/archive-manager/archive-manager

View File

@@ -1,26 +0,0 @@
# ClusterCockpit Backend - Agent Guidelines
## Build/Test Commands
- Build: `make` or `go build ./cmd/cc-backend`
- Run all tests: `make test` (runs: `go clean -testcache && go build ./... && go vet ./... && go test ./...`)
- Run a single test: `go test -run TestName ./path/to/package`
- Run all tests in one package: `go test ./path/to/package`
- Frontend build: `cd web/frontend && npm install && npm run build`
- Generate GraphQL: `make graphql` (uses gqlgen)
- Generate Swagger: `make swagger` (uses swaggo/swag)
## Code Style
- **Formatting**: Use `gofumpt` for all Go files (strict requirement)
- **Copyright header**: All files must include copyright header (see existing files)
- **Package docs**: Document packages with comprehensive package-level comments explaining purpose, usage, configuration
- **Imports**: Standard library first, then external packages, then internal packages (grouped with blank lines)
- **Naming**: Use camelCase for private, PascalCase for exported; descriptive names (e.g., `JobRepository`, `handleError`)
- **Error handling**: Return errors, don't panic; use custom error types where appropriate; log with cclog package
- **Logging**: Use `cclog` package (e.g., `cclog.Errorf()`, `cclog.Warnf()`, `cclog.Debugf()`)
- **Testing**: Use standard `testing` package; use `testify/assert` for assertions; name tests `TestFunctionName`
- **Comments**: Document all exported functions/types with godoc-style comments
- **Structs**: Document fields with inline comments, especially for complex configurations
- **HTTP handlers**: Return proper status codes; use `handleError()` helper for consistent error responses
- **JSON**: Use struct tags for JSON marshaling; `DisallowUnknownFields()` for strict decoding
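
Several of these rules combine in practice. A minimal sketch of the error-handling and logging style described above (the handler and helper here are hypothetical, not copied from the repository; only the `cclog` calls follow the package named above):

```go
// Hypothetical sketch: return errors instead of panicking, log via cclog,
// and use a handleError-style helper for consistent HTTP error responses.
package api

import (
	"encoding/json"
	"fmt"
	"net/http"

	cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
)

// handleError logs the cause and writes a uniform JSON error body.
func handleError(err error, statusCode int, rw http.ResponseWriter) {
	cclog.Warnf("REST ERROR: %s", err.Error())
	rw.Header().Set("Content-Type", "application/json")
	rw.WriteHeader(statusCode)
	json.NewEncoder(rw).Encode(map[string]string{"error": err.Error()})
}

func getJobHandler(rw http.ResponseWriter, r *http.Request) {
	id := r.URL.Query().Get("id")
	if id == "" {
		handleError(fmt.Errorf("missing job id"), http.StatusBadRequest, rw)
		return
	}
	// ... look up the job, returning errors rather than panicking ...
}
```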

215
CLAUDE.md
View File

@@ -1,215 +0,0 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with
code in this repository.
## Project Overview
ClusterCockpit is a job-specific performance monitoring framework for HPC
clusters. This is a Golang backend that provides REST and GraphQL APIs, serves a
Svelte-based frontend, and manages job archives and metric data from various
time-series databases.
## Build and Development Commands
### Building
```bash
# Build everything (frontend + backend)
make
# Build only the frontend
make frontend
# Build only the backend (requires frontend to be built first)
go build -ldflags="-s -X main.date=$(date +'%Y-%m-%d:T%H:%M:%S') -X main.version=1.4.4 -X main.commit=$(git rev-parse --short HEAD)" ./cmd/cc-backend
```
### Testing
```bash
# Run all tests
make test
# Run tests with verbose output
go test -v ./...
# Run tests for a specific package
go test ./internal/repository
```
### Code Generation
```bash
# Regenerate GraphQL schema and resolvers (after modifying api/*.graphqls)
make graphql
# Regenerate Swagger/OpenAPI docs (after modifying API comments)
make swagger
```
### Frontend Development
```bash
cd web/frontend
# Install dependencies
npm install
# Build for production
npm run build
# Development mode with watch
npm run dev
```
### Running
```bash
# Initialize database and create admin user
./cc-backend -init-db -add-user demo:admin:demo
# Start server in development mode (enables GraphQL Playground and Swagger UI)
./cc-backend -server -dev -loglevel info
# Start demo with sample data
./startDemo.sh
```
## Architecture
### Backend Structure
The backend follows a layered architecture with clear separation of concerns:
- **cmd/cc-backend**: Entry point, orchestrates initialization of all subsystems
- **internal/repository**: Data access layer using repository pattern
- Abstracts database operations (SQLite3 only)
- Implements LRU caching for performance
- Provides repositories for Job, User, Node, and Tag entities
- Transaction support for batch operations
- **internal/api**: REST API endpoints (Swagger/OpenAPI documented)
- **internal/graph**: GraphQL API (uses gqlgen)
- Schema in `api/*.graphqls`
- Generated code in `internal/graph/generated/`
- Resolvers in `internal/graph/schema.resolvers.go`
- **internal/auth**: Authentication layer
- Supports local accounts, LDAP, OIDC, and JWT tokens
- Implements rate limiting for login attempts
- **internal/metricdata**: Metric data repository abstraction
- Pluggable backends: cc-metric-store, Prometheus, InfluxDB
- Each cluster can have a different metric data backend
- **internal/archiver**: Job archiving to file-based archive
- **pkg/archive**: Job archive backend implementations
- File system backend (default)
- S3 backend
- SQLite backend (experimental)
- **pkg/nats**: NATS integration for metric ingestion
### Frontend Structure
- **web/frontend**: Svelte 5 application
- Uses Rollup for building
- Components organized by feature (analysis, job, user, etc.)
- GraphQL client using @urql/svelte
- Bootstrap 5 + SvelteStrap for UI
- uPlot for time-series visualization
- **web/templates**: Server-side Go templates
### Key Concepts
**Job Archive**: Completed jobs are stored in a file-based archive following the
[ClusterCockpit job-archive
specification](https://github.com/ClusterCockpit/cc-specifications/tree/master/job-archive).
Each job has a `meta.json` file with metadata and metric data files.
**Metric Data Repositories**: Time-series metric data is stored separately from
job metadata. The system supports multiple backends (cc-metric-store is
recommended). Configuration is per-cluster in `config.json`.
**Authentication Flow**:
1. Multiple authenticators can be configured (local, LDAP, OIDC, JWT)
2. Each authenticator's `CanLogin` method is called to determine if it should handle the request
3. The first authenticator that returns true performs the actual `Login`
4. JWT tokens are used for API authentication
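
A minimal sketch of this first-match chain, assuming an `Authenticator` interface with the `CanLogin` and `Login` methods named above (exact signatures are an assumption, not copied from `internal/auth`):

```go
package auth

import (
	"fmt"
	"net/http"

	"github.com/ClusterCockpit/cc-lib/schema"
)

// Assumed shape of an authenticator; the real interface lives in internal/auth.
type Authenticator interface {
	CanLogin(user *schema.User, username string, rw http.ResponseWriter, r *http.Request) (*schema.User, bool)
	Login(user *schema.User, rw http.ResponseWriter, r *http.Request) (*schema.User, error)
}

type Authentication struct {
	authenticators []Authenticator
}

// Each configured authenticator is asked whether it can handle the request;
// the first one that returns true performs the actual Login.
func (a *Authentication) login(rw http.ResponseWriter, r *http.Request, user *schema.User, username string) (*schema.User, error) {
	for _, auth := range a.authenticators {
		if candidate, ok := auth.CanLogin(user, username, rw, r); ok {
			return auth.Login(candidate, rw, r)
		}
	}
	return nil, fmt.Errorf("no authenticator accepted the login request")
}
```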
**Database Migrations**: SQL migrations in `internal/repository/migrations/` are
applied automatically on startup. Version tracking in `version` table.
**Scopes**: Metrics can be collected at different scopes:
- Node scope (always available)
- Core scope (for jobs with ≤8 nodes)
- Accelerator scope (for GPU/accelerator metrics)
## Configuration
- **config.json**: Main configuration (clusters, metric repositories, archive settings)
- **.env**: Environment variables (secrets like JWT keys)
- Copy from `configs/env-template.txt`
- NEVER commit this file
- **cluster.json**: Cluster topology and metric definitions (loaded from archive or config)
## Database
- Default: SQLite 3 (`./var/job.db`)
- Connection managed by `internal/repository`
- Schema version in `internal/repository/migration.go`
## Code Generation
**GraphQL** (gqlgen):
- Schema: `api/*.graphqls`
- Config: `gqlgen.yml`
- Generated code: `internal/graph/generated/`
- Custom resolvers: `internal/graph/schema.resolvers.go`
- Run `make graphql` after schema changes
**Swagger/OpenAPI**:
- Annotations in `internal/api/*.go`
- Generated docs: `api/docs.go`, `api/swagger.yaml`
- Run `make swagger` after API changes
## Testing Conventions
- Test files use `_test.go` suffix
- Test data in `testdata/` subdirectories
- Repository tests use in-memory SQLite
- API tests use httptest
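
Put together, a repository test following these conventions might look like this (helper and method names are hypothetical; the actual tests and fixtures live in `internal/repository` and its `testdata/`):

```go
package repository

import (
	"testing"

	"github.com/stretchr/testify/assert"
)

// Hypothetical test in the conventions above: in-memory SQLite, testify
// assertions, TestFunctionName naming. setupRepo is an assumed helper that
// opens ":memory:" and applies the SQL migrations.
func TestFindJobById(t *testing.T) {
	r := setupRepo(t)

	job, err := r.FindById(5) // assumed method and fixture ID
	assert.NoError(t, err)
	assert.NotNil(t, job)
}
```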
## Common Workflows
### Adding a new GraphQL field
1. Edit schema in `api/*.graphqls`
2. Run `make graphql`
3. Implement resolver in `internal/graph/schema.resolvers.go`
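
Step 3 means filling in the stub that gqlgen regenerates; a resolver for a hypothetical new field has roughly this shape (the field, model types, and repository accessor are assumptions, since gqlgen derives the exact signature from the schema):

```go
package graph

import (
	"context"

	"github.com/ClusterCockpit/cc-backend/internal/graph/model"
	"github.com/ClusterCockpit/cc-backend/internal/repository"
)

type queryResolver struct{} // generated by gqlgen in the real code

// Hypothetical resolver: delegate to the repository layer and return the
// model types that gqlgen generated from the schema.
func (r *queryResolver) NodeStats(ctx context.Context, filter []*model.NodeFilter) ([]*model.NodeStats, error) {
	return repository.GetNodeRepository().CountNodeStates(ctx, filter)
}
```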
### Adding a new REST endpoint
1. Add handler in `internal/api/*.go`
2. Add route in `internal/api/rest.go`
3. Add Swagger annotations
4. Run `make swagger`
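
For step 3, swag parses godoc-style annotations placed directly above the handler. A hypothetical endpoint annotated in that style (route and handler are made up; the annotation keys are standard swaggo):

```go
package api

import "net/http"

type RestApi struct{} // stands in for the real API type in internal/api

// getClusters godoc
// @Summary      List configured clusters
// @Description  Returns the names of all configured clusters.
// @Tags         Cluster queries
// @Produce      json
// @Success      200 {array}  string
// @Failure      401 {object} api.ErrorResponse "Unauthorized"
// @Security     ApiKeyAuth
// @Router       /api/clusters/ [get]
func (api *RestApi) getClusters(rw http.ResponseWriter, r *http.Request) {
	rw.WriteHeader(http.StatusOK) // a real handler would encode JSON here
}
```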
### Adding a new metric data backend
1. Implement `MetricDataRepository` interface in `internal/metricdata/`
2. Register in `metricdata.Init()` switch statement
3. Update config.json schema documentation
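
A sketch of steps 1 and 2, assuming the interface takes a raw JSON config in an `Init` method and that `metricdata.Init()` switches on the configured `kind` (the real interface in `internal/metricdata` has more methods for loading job and node data):

```go
package metricdata

import "encoding/json"

// Hypothetical new backend; field names mirror the per-cluster
// metricDataRepository config block ("kind", "url", "token").
type MyTSDBRepository struct {
	URL   string `json:"url"`
	Token string `json:"token"`
}

func (r *MyTSDBRepository) Init(rawConfig json.RawMessage) error {
	return json.Unmarshal(rawConfig, r)
}

// In metricdata.Init(), each cluster's configured kind selects its backend;
// a new case registers the implementation (switch shape is an assumption):
//
//	case "my-tsdb":
//	    mdr = &MyTSDBRepository{}
```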
### Modifying database schema
1. Create new migration in `internal/repository/migrations/`
2. Increment `repository.Version`
3. Test with fresh database and existing database
## Dependencies
- Go 1.24.0+ (check go.mod for exact version)
- Node.js (for frontend builds)
- SQLite 3 (only supported database)
- Optional: NATS server for metric ingestion

View File

@@ -1,4 +1,6 @@
 TARGET = ./cc-backend
+VAR = ./var
+CFG = config.json .env
 FRONTEND = ./web/frontend
 VERSION = 1.4.4
 GIT_HASH := $(shell git rev-parse --short HEAD || echo 'development')
@@ -40,7 +42,7 @@ SVELTE_SRC = $(wildcard $(FRONTEND)/src/*.svelte) \
 .NOTPARALLEL:
-$(TARGET): $(SVELTE_TARGETS)
+$(TARGET): $(VAR) $(CFG) $(SVELTE_TARGETS)
 	$(info ===> BUILD cc-backend)
 	@go build -ldflags=${LD_FLAGS} ./cmd/cc-backend
@@ -50,12 +52,12 @@ frontend:
 swagger:
 	$(info ===> GENERATE swagger)
-	@go tool github.com/swaggo/swag/cmd/swag init --parseDependency -d ./internal/api -g rest.go -o ./api
+	@go run github.com/swaggo/swag/cmd/swag init -d ./internal/api,./pkg/schema -g rest.go -o ./api
 	@mv ./api/docs.go ./internal/api/docs.go
 graphql:
 	$(info ===> GENERATE graphql)
-	@go tool github.com/99designs/gqlgen
+	@go run github.com/99designs/gqlgen
 clean:
 	$(info ===> CLEAN)
@@ -66,7 +68,7 @@ distclean:
 	@$(MAKE) clean
 	$(info ===> DISTCLEAN)
 	@rm -rf $(FRONTEND)/node_modules
-	@rm -rf ./var
+	@rm -rf $(VAR)
 test:
 	$(info ===> TESTING)
@@ -82,6 +84,14 @@ tags:
 $(VAR):
 	@mkdir -p $(VAR)
+config.json:
+	$(info ===> Initialize config.json file)
+	@cp configs/config.json config.json
+.env:
+	$(info ===> Initialize .env file)
+	@cp configs/env-template.txt .env
 $(SVELTE_TARGETS): $(SVELTE_SRC)
 	$(info ===> BUILD frontend)
 	cd web/frontend && npm install && npm run build

102
README.md
View File

@@ -1,8 +1,5 @@
 # NOTE
-While we do our best to keep the master branch in a usable state, there is no guarantee the master branch works.
-Please do not use it for production!
 Please have a look at the [Release
 Notes](https://github.com/ClusterCockpit/cc-backend/blob/master/ReleaseNotes.md)
 for breaking changes!
@@ -29,11 +26,12 @@ is also served by the backend using [Svelte](https://svelte.dev/) components.
 Layout and styling are based on [Bootstrap 5](https://getbootstrap.com/) using
 [Bootstrap Icons](https://icons.getbootstrap.com/).

-The backend uses [SQLite 3](https://sqlite.org/) as the relational SQL database.
-While there are metric data backends for the InfluxDB and Prometheus time series
-databases, the only tested and supported setup is to use cc-metric-store as the
-metric data backend. Documentation on how to integrate ClusterCockpit with other
-time series databases will be added in the future.
+The backend uses [SQLite 3](https://sqlite.org/) as a relational SQL database by
+default. Optionally it can use a MySQL/MariaDB database server. While there are
+metric data backends for the InfluxDB and Prometheus time series databases, the
+only tested and supported setup is to use cc-metric-store as the metric data
+backend. Documentation on how to integrate ClusterCockpit with other time series
+databases will be added in the future.

 Completed batch jobs are stored in a file-based job archive according to
 [this specification](https://github.com/ClusterCockpit/cc-specifications/tree/master/job-archive).
@@ -71,7 +69,7 @@ You can also try the demo using the latest release binary.
 Create a folder and put the release binary `cc-backend` into this folder.
 Execute the following steps:

-```shell
+``` shell
 ./cc-backend -init
 vim config.json (Add a second cluster entry and name the clusters alex and fritz)
 wget https://hpc-mover.rrze.uni-erlangen.de/HPC-Data/0x7b58aefb/eig7ahyo6fo2bais0ephuf2aitohv1ai/job-archive-demo.tar
@@ -90,11 +88,11 @@ Analysis, Systems and Status views).
 There is a Makefile to automate the build of cc-backend. The Makefile supports
 the following targets:

-- `make`: Initialize `var` directory and build svelte frontend and backend
+* `make`: Initialize `var` directory and build svelte frontend and backend
   binary. Note that there is no proper prerequisite handling. Any change of
   frontend source files will result in a complete rebuild.
-- `make clean`: Clean go build cache and remove binary.
+* `make clean`: Clean go build cache and remove binary.
-- `make test`: Run the tests that are also run in the GitHub workflow setup.
+* `make test`: Run the tests that are also run in the GitHub workflow setup.

 A common workflow for setting up cc-backend from scratch is:
@@ -130,41 +128,41 @@ ln -s <your-existing-job-archive> ./var/job-archive
 ## Project file structure

-- [`api/`](https://github.com/ClusterCockpit/cc-backend/tree/master/api)
+* [`api/`](https://github.com/ClusterCockpit/cc-backend/tree/master/api)
   contains the API schema files for the REST and GraphQL APIs. The REST API is
   documented in the OpenAPI 3.0 format in
   [./api/openapi.yaml](./api/openapi.yaml).
-- [`cmd/cc-backend`](https://github.com/ClusterCockpit/cc-backend/tree/master/cmd/cc-backend)
+* [`cmd/cc-backend`](https://github.com/ClusterCockpit/cc-backend/tree/master/cmd/cc-backend)
   contains `main.go` for the main application.
-- [`configs/`](https://github.com/ClusterCockpit/cc-backend/tree/master/configs)
+* [`configs/`](https://github.com/ClusterCockpit/cc-backend/tree/master/configs)
   contains documentation about configuration and command line options and required
   environment variables. A sample configuration file is provided.
-- [`docs/`](https://github.com/ClusterCockpit/cc-backend/tree/master/docs)
+* [`docs/`](https://github.com/ClusterCockpit/cc-backend/tree/master/docs)
   contains more in-depth documentation.
-- [`init/`](https://github.com/ClusterCockpit/cc-backend/tree/master/init)
+* [`init/`](https://github.com/ClusterCockpit/cc-backend/tree/master/init)
   contains an example of setting up systemd for production use.
-- [`internal/`](https://github.com/ClusterCockpit/cc-backend/tree/master/internal)
+* [`internal/`](https://github.com/ClusterCockpit/cc-backend/tree/master/internal)
   contains library source code that is not intended for use by others.
-- [`pkg/`](https://github.com/ClusterCockpit/cc-backend/tree/master/pkg)
+* [`pkg/`](https://github.com/ClusterCockpit/cc-backend/tree/master/pkg)
   contains Go packages that can be used by other projects.
-- [`tools/`](https://github.com/ClusterCockpit/cc-backend/tree/master/tools)
+* [`tools/`](https://github.com/ClusterCockpit/cc-backend/tree/master/tools)
   Additional command line helper tools.
-  - [`archive-manager`](https://github.com/ClusterCockpit/cc-backend/tree/master/tools/archive-manager)
+  * [`archive-manager`](https://github.com/ClusterCockpit/cc-backend/tree/master/tools/archive-manager)
     Commands for getting infos about an existing job archive.
-  - [`convert-pem-pubkey`](https://github.com/ClusterCockpit/cc-backend/tree/master/tools/convert-pem-pubkey)
+  * [`convert-pem-pubkey`](https://github.com/ClusterCockpit/cc-backend/tree/master/tools/convert-pem-pubkey)
     Tool to convert external pubkey for use in `cc-backend`.
-  - [`gen-keypair`](https://github.com/ClusterCockpit/cc-backend/tree/master/tools/gen-keypair)
+  * [`gen-keypair`](https://github.com/ClusterCockpit/cc-backend/tree/master/tools/gen-keypair)
     contains a small application to generate a compatible JWT keypair. You find
     documentation on how to use it
     [here](https://github.com/ClusterCockpit/cc-backend/blob/master/docs/JWT-Handling.md).
-- [`web/`](https://github.com/ClusterCockpit/cc-backend/tree/master/web)
+* [`web/`](https://github.com/ClusterCockpit/cc-backend/tree/master/web)
   Server-side templates and frontend-related files:
-  - [`frontend`](https://github.com/ClusterCockpit/cc-backend/tree/master/web/frontend)
+  * [`frontend`](https://github.com/ClusterCockpit/cc-backend/tree/master/web/frontend)
    Svelte components and static assets for the frontend UI
-  - [`templates`](https://github.com/ClusterCockpit/cc-backend/tree/master/web/templates)
+  * [`templates`](https://github.com/ClusterCockpit/cc-backend/tree/master/web/templates)
    Server-side Go templates
-- [`gqlgen.yml`](https://github.com/ClusterCockpit/cc-backend/blob/master/gqlgen.yml)
+* [`gqlgen.yml`](https://github.com/ClusterCockpit/cc-backend/blob/master/gqlgen.yml)
   Configures the behaviour and generation of
   [gqlgen](https://github.com/99designs/gqlgen).
-- [`startDemo.sh`](https://github.com/ClusterCockpit/cc-backend/blob/master/startDemo.sh)
+* [`startDemo.sh`](https://github.com/ClusterCockpit/cc-backend/blob/master/startDemo.sh)
   is a shell script that sets up demo data, and builds and starts `cc-backend`.

View File

@@ -4,7 +4,7 @@ scalar Any
 scalar NullableFloat
 scalar MetricScope
 scalar JobState
-scalar SchedulerState
+scalar NodeState
 scalar MonitoringState

 type Node {
@@ -12,26 +12,16 @@ type Node {
   hostname: String!
   cluster: String!
   subCluster: String!
-  jobsRunning: Int!
-  cpusAllocated: Int
-  memoryAllocated: Int
-  gpusAllocated: Int
-  schedulerState: SchedulerState!
-  healthState: MonitoringState!
+  nodeState: NodeState!
+  HealthState: MonitoringState!
   metaData: Any
 }

-type NodeStates {
+type NodeStats {
   state: String!
   count: Int!
 }

-type NodeStatesTimed {
-  state: String!
-  counts: [Int!]!
-  times: [Int!]!
-}
-
 type Job {
   id: ID!
   jobId: Int!
@@ -47,7 +37,7 @@ type Job {
   numAcc: Int!
   energy: Float!
   SMT: Int!
-  shared: String!
+  exclusive: Int!
   partition: String!
   arrayJobId: Int!
   monitoringStatus: Int!
@@ -164,13 +154,6 @@ type JobMetricWithName {
   metric: JobMetric!
 }

-type ClusterMetricWithName {
-  name: String!
-  unit: Unit
-  timestep: Int!
-  data: [NullableFloat!]!
-}
-
 type JobMetric {
   unit: Unit
   timestep: Int!
@@ -253,12 +236,10 @@ enum Aggregate {
   USER
   PROJECT
   CLUSTER
-  SUBCLUSTER
 }

 enum SortByAggregate {
   TOTALWALLTIME
   TOTALJOBS
-  TOTALUSERS
   TOTALNODES
   TOTALNODEHOURS
   TOTALCORES
@@ -269,16 +250,10 @@ enum SortByAggregate {
 type NodeMetrics {
   host: String!
-  state: String!
   subCluster: String!
   metrics: [JobMetricWithName!]!
 }

-type ClusterMetrics {
-  nodeCount: Int!
-  metrics: [ClusterMetricWithName!]!
-}
-
 type NodesResultList {
   items: [NodeMetrics!]!
   offset: Int
@@ -325,11 +300,9 @@ type Query {
   user(username: String!): User
   allocatedNodes(cluster: String!): [Count!]!

-  ## Node Queries New
   node(id: ID!): Node
   nodes(filter: [NodeFilter!], order: OrderByInput): NodeStateResultList!
-  nodeStates(filter: [NodeFilter!]): [NodeStates!]!
-  nodeStatesTimed(filter: [NodeFilter!], type: String!): [NodeStatesTimed!]!
+  nodeStats(filter: [NodeFilter!]): [NodeStats!]!

   job(id: ID!): Job
   jobMetrics(
@@ -384,11 +357,9 @@ type Query {
     from: Time!
     to: Time!
   ): [NodeMetrics!]!

   nodeMetricsList(
     cluster: String!
     subCluster: String!
-    stateFilter: String!
     nodeFilter: String!
     scopes: [MetricScope!]
     metrics: [String!]
@@ -397,13 +368,6 @@ type Query {
     page: PageRequest
     resolution: Int
   ): NodesResultList!
-
-  clusterMetrics(
-    cluster: String!
-    metrics: [String!]
-    from: Time!
-    to: Time!
-  ): ClusterMetrics!
 }

 type Mutation {
@@ -429,10 +393,8 @@ type TimeRangeOutput {
 input NodeFilter {
   hostname: StringInput
   cluster: StringInput
-  subcluster: StringInput
-  schedulerState: SchedulerState
+  nodeState: NodeState
   healthState: MonitoringState
-  timeStart: Int
 }

 input JobFilter {
@@ -457,7 +419,7 @@ input JobFilter {
   startTime: TimeRange
   state: [JobState!]
   metricStats: [MetricStatItem!]
-  shared: String
+  exclusive: Int
   node: StringInput
 }
@@ -535,12 +497,11 @@ type MetricHistoPoint {
 }

 type JobsStatistics {
-  id: ID! # If `groupBy` was used, ID of the user/project/cluster/subcluster
+  id: ID! # If `groupBy` was used, ID of the user/project/cluster
   name: String! # if User-Statistics: Given Name of Account (ID) Owner
-  totalUsers: Int! # if *not* User-Statistics: Number of active users (based on running jobs)
   totalJobs: Int! # Number of jobs
   runningJobs: Int! # Number of running jobs
-  shortJobs: Int! # Number of jobs with a duration of less than config'd ShortRunningJobsDuration
+  shortJobs: Int! # Number of jobs with a duration of less than duration
   totalWalltime: Int! # Sum of the duration of all matched jobs in hours
   totalNodes: Int! # Sum of the nodes of all matched jobs
   totalNodeHours: Int! # Sum of the node hours of all matched jobs

View File

@@ -383,8 +383,71 @@
}
}
},
"/api/jobs/edit_meta/": {
"patch": {
"security": [
{
"ApiKeyAuth": []
}
],
"description": "Edit key value pairs in metadata json of job specified by jobID, StartTime and Cluster\nIf a key already exists its content will be overwritten",
"consumes": [
"application/json"
],
"produces": [
"application/json"
],
"tags": [
"Job add and modify"
],
"summary": "Edit meta-data json by request",
"parameters": [
{
"description": "Specifies job and payload to add or update",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/api.JobMetaRequest"
}
}
],
"responses": {
"200": {
"description": "Updated job resource",
"schema": {
"$ref": "#/definitions/schema.Job"
}
},
"400": {
"description": "Bad Request",
"schema": {
"$ref": "#/definitions/api.ErrorResponse"
}
},
"401": {
"description": "Unauthorized",
"schema": {
"$ref": "#/definitions/api.ErrorResponse"
}
},
"404": {
"description": "Job does not exist",
"schema": {
"$ref": "#/definitions/api.ErrorResponse"
}
},
"500": {
"description": "Internal Server Error",
"schema": {
"$ref": "#/definitions/api.ErrorResponse"
}
}
}
}
},
"/api/jobs/edit_meta/{id}": { "/api/jobs/edit_meta/{id}": {
"post": { "patch": {
"security": [ "security": [
{ {
"ApiKeyAuth": [] "ApiKeyAuth": []
@@ -1237,6 +1300,37 @@
}
}
},
"api.JobMetaRequest": {
"type": "object",
"required": [
"jobId"
],
"properties": {
"cluster": {
"description": "Cluster of job",
"type": "string",
"example": "fritz"
},
"jobId": {
"description": "Cluster Job ID of job",
"type": "integer",
"example": 123000
},
"payload": {
"description": "Content to Add to Job Meta_Data",
"allOf": [
{
"$ref": "#/definitions/api.EditMetaRequest"
}
]
},
"startTime": {
"description": "Start Time of job as epoch",
"type": "integer",
"example": 1649723812
}
}
},
"api.JobMetricWithName": { "api.JobMetricWithName": {
"type": "object", "type": "object",
"properties": { "properties": {
@@ -1254,27 +1348,9 @@
"api.Node": { "api.Node": {
"type": "object", "type": "object",
"properties": { "properties": {
"cpusAllocated": {
"type": "integer"
},
"cpusTotal": {
"type": "integer"
},
"gpusAllocated": {
"type": "integer"
},
"gpusTotal": {
"type": "integer"
},
"hostname": { "hostname": {
"type": "string" "type": "string"
}, },
"memoryAllocated": {
"type": "integer"
},
"memoryTotal": {
"type": "integer"
},
"states": { "states": {
"type": "array", "type": "array",
"items": { "items": {
@@ -1390,15 +1466,19 @@
"energyFootprint": { "energyFootprint": {
"type": "object", "type": "object",
"additionalProperties": { "additionalProperties": {
"type": "number", "type": "number"
"format": "float64"
} }
}, },
"exclusive": {
"type": "integer",
"maximum": 2,
"minimum": 0,
"example": 1
},
"footprint": { "footprint": {
"type": "object", "type": "object",
"additionalProperties": { "additionalProperties": {
"type": "number", "type": "number"
"format": "float64"
} }
}, },
"id": { "id": {
@@ -1410,18 +1490,12 @@
 },
 "jobState": {
 "enum": [
-"boot_fail",
-"cancelled",
-"completed",
-"deadline",
-"failed",
-"node_fail",
-"out_of_memory",
-"pending",
-"preempted",
-"running",
-"suspended",
-"timeout"
+"completed",
+"failed",
+"cancelled",
+"stopped",
+"timeout",
+"out_of_memory"
 ],
 "allOf": [
 {
@@ -1477,14 +1551,6 @@
"$ref": "#/definitions/schema.Resource" "$ref": "#/definitions/schema.Resource"
} }
}, },
"shared": {
"type": "string",
"enum": [
"none",
"single_user",
"multi_user"
]
},
"smt": { "smt": {
"type": "integer", "type": "integer",
"example": 4 "example": 4
@@ -1503,10 +1569,6 @@
"type": "string", "type": "string",
"example": "main" "example": "main"
}, },
"submitTime": {
"type": "integer",
"example": 1649723812
},
"tags": { "tags": {
"type": "array", "type": "array",
"items": { "items": {
@@ -1572,32 +1634,24 @@
"schema.JobState": { "schema.JobState": {
"type": "string", "type": "string",
"enum": [ "enum": [
"boot_fail",
"cancelled",
"completed",
"deadline",
"failed",
"node_fail",
"out_of_memory",
"pending",
"preempted",
"running", "running",
"suspended", "completed",
"timeout" "failed",
"cancelled",
"stopped",
"timeout",
"preempted",
"out_of_memory"
], ],
"x-enum-varnames": [ "x-enum-varnames": [
"JobStateBootFail",
"JobStateCancelled",
"JobStateCompleted",
"JobStateDeadline",
"JobStateFailed",
"JobStateNodeFail",
"JobStateOutOfMemory",
"JobStatePending",
"JobStatePreempted",
"JobStateRunning", "JobStateRunning",
"JobStateSuspended", "JobStateCompleted",
"JobStateTimeout" "JobStateFailed",
"JobStateCancelled",
"JobStateStopped",
"JobStateTimeout",
"JobStatePreempted",
"JobStateOutOfMemory"
] ]
}, },
"schema.JobStatistics": { "schema.JobStatistics": {
@@ -1796,8 +1850,7 @@
"additionalProperties": { "additionalProperties": {
"type": "array", "type": "array",
"items": { "items": {
"type": "number", "type": "number"
"format": "float64"
} }
} }
} }

View File

@@ -102,6 +102,27 @@ definitions:
description: Page id returned
type: integer
type: object
api.JobMetaRequest:
properties:
cluster:
description: Cluster of job
example: fritz
type: string
jobId:
description: Cluster Job ID of job
example: 123000
type: integer
payload:
allOf:
- $ref: '#/definitions/api.EditMetaRequest'
description: Content to Add to Job Meta_Data
startTime:
description: Start Time of job as epoch
example: 1649723812
type: integer
required:
- jobId
type: object
api.JobMetricWithName:
properties:
metric:
@@ -113,20 +134,8 @@ definitions:
type: object
api.Node:
properties:
cpusAllocated:
type: integer
cpusTotal:
type: integer
gpusAllocated:
type: integer
gpusTotal:
type: integer
hostname:
type: string
memoryAllocated:
type: integer
memoryTotal:
type: integer
states:
items:
type: string
@@ -204,12 +213,15 @@ definitions:
 type: number
 energyFootprint:
 additionalProperties:
-format: float64
 type: number
 type: object
exclusive:
example: 1
maximum: 2
minimum: 0
type: integer
 footprint:
 additionalProperties:
-format: float64
 type: number
 type: object
 id:
@@ -221,18 +233,12 @@ definitions:
 allOf:
 - $ref: '#/definitions/schema.JobState'
 enum:
-- boot_fail
-- cancelled
-- completed
-- deadline
-- failed
-- node_fail
-- out_of_memory
-- pending
-- preempted
-- running
-- suspended
-- timeout
+- completed
+- failed
+- cancelled
+- stopped
+- timeout
+- out_of_memory
 example: completed
 metaData:
 additionalProperties:
@@ -270,12 +276,6 @@ definitions:
items:
$ref: '#/definitions/schema.Resource'
type: array
shared:
enum:
- none
- single_user
- multi_user
type: string
smt:
example: 4
type: integer
@@ -289,9 +289,6 @@ definitions:
subCluster:
example: main
type: string
submitTime:
example: 1649723812
type: integer
tags:
items:
$ref: '#/definitions/schema.Tag'
@@ -335,32 +332,24 @@ definitions:
 type: object
 schema.JobState:
 enum:
-- boot_fail
-- cancelled
-- completed
-- deadline
-- failed
-- node_fail
-- out_of_memory
-- pending
-- preempted
-- running
-- suspended
-- timeout
+- running
+- completed
+- failed
+- cancelled
+- stopped
+- timeout
+- preempted
+- out_of_memory
 type: string
 x-enum-varnames:
-- JobStateBootFail
-- JobStateCancelled
-- JobStateCompleted
-- JobStateDeadline
-- JobStateFailed
-- JobStateNodeFail
-- JobStateOutOfMemory
-- JobStatePending
-- JobStatePreempted
-- JobStateRunning
-- JobStateSuspended
-- JobStateTimeout
+- JobStateRunning
+- JobStateCompleted
+- JobStateFailed
+- JobStateCancelled
+- JobStateStopped
+- JobStateTimeout
+- JobStatePreempted
+- JobStateOutOfMemory
 schema.JobStatistics:
 description: Specification for job metric statistics.
 properties:
@@ -497,7 +486,6 @@ definitions:
 percentiles:
 additionalProperties:
 items:
-format: float64
 type: number
 type: array
 type: object
@@ -985,8 +973,50 @@ paths:
summary: Remove a job from the sql database
tags:
- Job remove
/api/jobs/edit_meta/:
patch:
consumes:
- application/json
description: |-
Edit key value pairs in metadata json of job specified by jobID, StartTime and Cluster
If a key already exists its content will be overwritten
parameters:
- description: Specifies job and payload to add or update
in: body
name: request
required: true
schema:
$ref: '#/definitions/api.JobMetaRequest'
produces:
- application/json
responses:
"200":
description: Updated job resource
schema:
$ref: '#/definitions/schema.Job'
"400":
description: Bad Request
schema:
$ref: '#/definitions/api.ErrorResponse'
"401":
description: Unauthorized
schema:
$ref: '#/definitions/api.ErrorResponse'
"404":
description: Job does not exist
schema:
$ref: '#/definitions/api.ErrorResponse'
"500":
description: Internal Server Error
schema:
$ref: '#/definitions/api.ErrorResponse'
security:
- ApiKeyAuth: []
summary: Edit meta-data json by request
tags:
- Job add and modify
 /api/jobs/edit_meta/{id}:
-post:
+patch:
 consumes:
 - application/json
 description: |-
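
The net effect of this change: a client can now address a job by `jobId`, `cluster`, and `startTime` instead of the database id, and both routes use PATCH. A minimal client sketch for the new route (the `payload` key/value shape is an assumption, since `api.EditMetaRequest` is not spelled out in this diff):

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

func main() {
	// Fields follow the api.JobMetaRequest definition above; the payload
	// shape is assumed, as api.EditMetaRequest is not shown in this excerpt.
	body := []byte(`{
		"jobId": 123000,
		"cluster": "fritz",
		"startTime": 1649723812,
		"payload": {"key": "jobScript", "value": "#!/bin/bash ..."}
	}`)

	req, err := http.NewRequest(http.MethodPatch,
		"http://localhost:8080/api/jobs/edit_meta/", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer <JWT>") // ApiKeyAuth per the spec

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status) // 200 returns the updated job resource
}
```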

View File

@@ -2,9 +2,6 @@
 // All rights reserved. This file is part of cc-backend.
 // Use of this source code is governed by a MIT-style
 // license that can be found in the LICENSE file.
-// Package main provides the entry point for the ClusterCockpit backend server.
-// This file defines all command-line flags and their default values.
 package main

 import "flag"
@@ -33,6 +30,6 @@ func cliInit() {
 	flag.StringVar(&flagDelUser, "del-user", "", "Remove a existing user. Argument format: <username>")
 	flag.StringVar(&flagGenJWT, "jwt", "", "Generate and print a JWT for the user specified by its `username`")
 	flag.StringVar(&flagImportJob, "import-job", "", "Import a job. Argument format: `<path-to-meta.json>:<path-to-data.json>,...`")
-	flag.StringVar(&flagLogLevel, "loglevel", "warn", "Sets the logging level: `[debug, info , warn (default), err, crit]`")
+	flag.StringVar(&flagLogLevel, "loglevel", "warn", "Sets the logging level: `[debug, info (default), warn, err, crit]`")
 	flag.Parse()
 }

View File

@@ -2,19 +2,12 @@
 // All rights reserved. This file is part of cc-backend.
 // Use of this source code is governed by a MIT-style
 // license that can be found in the LICENSE file.
-// Package main provides the entry point for the ClusterCockpit backend server.
-// This file contains bootstrap logic for initializing the environment,
-// creating default configuration files, and setting up the database.
 package main

 import (
-	"encoding/json"
 	"os"

-	"github.com/ClusterCockpit/cc-backend/internal/config"
 	"github.com/ClusterCockpit/cc-backend/internal/repository"
-	"github.com/ClusterCockpit/cc-backend/pkg/archive"
 	cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
 	"github.com/ClusterCockpit/cc-lib/util"
 )
@@ -31,60 +24,50 @@ SESSION_KEY="67d829bf61dc5f87a73fd814e2c9f629"
 const configString = `
 {
-    "main": {
-        "addr": "127.0.0.1:8080",
-        "short-running-jobs-duration": 300,
-        "resampling": {
-            "minimumPoints": 600,
-            "trigger": 180,
-            "resolutions": [
-                240,
-                60
-            ]
-        },
-        "apiAllowedIPs": [
-            "*"
-        ],
-        "emission-constant": 317
-    },
-    "cron": {
-        "commit-job-worker": "2m",
-        "duration-worker": "5m",
-        "footprint-worker": "10m"
-    },
-    "archive": {
-        "kind": "file",
-        "path": "./var/job-archive"
-    },
-    "auth": {
-        "jwts": {
-            "max-age": "2000h"
-        }
-    },
-    "clusters": [
-        {
-            "name": "name",
-            "metricDataRepository": {
-                "kind": "cc-metric-store",
-                "url": "http://localhost:8082",
-                "token": ""
-            },
-            "filterRanges": {
-                "numNodes": {
-                    "from": 1,
-                    "to": 64
-                },
-                "duration": {
-                    "from": 0,
-                    "to": 86400
-                },
-                "startTime": {
-                    "from": "2023-01-01T00:00:00Z",
-                    "to": null
-                }
-            }
-        }
-    ]
+    "addr": "127.0.0.1:8080",
+    "archive": {
+        "kind": "file",
+        "path": "./var/job-archive"
+    },
+    "jwts": {
+        "max-age": "2000h"
+    },
+    "apiAllowedIPs": [
+        "*"
+    ],
+    "enable-resampling": {
+        "trigger": 30,
+        "resolutions": [
+            600,
+            300,
+            120,
+            60
+        ]
+    },
+    "clusters": [
+        {
+            "name": "name",
+            "metricDataRepository": {
+                "kind": "cc-metric-store",
+                "url": "http://localhost:8082",
+                "token": ""
+            },
+            "filterRanges": {
+                "numNodes": {
+                    "from": 1,
+                    "to": 64
+                },
+                "duration": {
+                    "from": 0,
+                    "to": 86400
+                },
+                "startTime": {
+                    "from": "2023-01-01T00:00:00Z",
+                    "to": null
+                }
+            }
+        }
+    ]
 }
 `
@@ -105,15 +88,8 @@ func initEnv() {
cclog.Abortf("Could not create default ./var folder with permissions '0o777'. Application initialization failed, exited.\nError: %s\n", err.Error()) cclog.Abortf("Could not create default ./var folder with permissions '0o777'. Application initialization failed, exited.\nError: %s\n", err.Error())
} }
err := repository.MigrateDB("./var/job.db") err := repository.MigrateDB("sqlite3", "./var/job.db")
if err != nil { if err != nil {
cclog.Abortf("Could not initialize default SQLite database as './var/job.db'. Application initialization failed, exited.\nError: %s\n", err.Error()) cclog.Abortf("Could not initialize default sqlite3 database as './var/job.db'. Application initialization failed, exited.\nError: %s\n", err.Error())
}
if err := os.Mkdir("var/job-archive", 0o777); err != nil {
cclog.Abortf("Could not create default ./var/job-archive folder with permissions '0o777'. Application initialization failed, exited.\nError: %s\n", err.Error())
}
archiveCfg := "{\"kind\": \"file\",\"path\": \"./var/job-archive\"}"
if err := archive.Init(json.RawMessage(archiveCfg), config.Keys.DisableArchive); err != nil {
cclog.Abortf("Could not initialize job-archive, exited.\nError: %s\n", err.Error())
} }
} }

View File

@@ -2,15 +2,9 @@
 // All rights reserved. This file is part of cc-backend.
 // Use of this source code is governed by a MIT-style
 // license that can be found in the LICENSE file.
-// Package main provides the entry point for the ClusterCockpit backend server.
-// It orchestrates initialization of all subsystems including configuration,
-// database, authentication, and the HTTP server.
 package main

 import (
-	"context"
-	"encoding/json"
 	"fmt"
 	"os"
 	"os/signal"
@@ -18,28 +12,24 @@ import (
 	"strings"
 	"sync"
 	"syscall"
-	"time"

 	"github.com/ClusterCockpit/cc-backend/internal/archiver"
 	"github.com/ClusterCockpit/cc-backend/internal/auth"
 	"github.com/ClusterCockpit/cc-backend/internal/config"
 	"github.com/ClusterCockpit/cc-backend/internal/importer"
-	"github.com/ClusterCockpit/cc-backend/internal/memorystore"
 	"github.com/ClusterCockpit/cc-backend/internal/metricdata"
 	"github.com/ClusterCockpit/cc-backend/internal/repository"
 	"github.com/ClusterCockpit/cc-backend/internal/tagger"
-	"github.com/ClusterCockpit/cc-backend/internal/taskmanager"
+	"github.com/ClusterCockpit/cc-backend/internal/taskManager"
 	"github.com/ClusterCockpit/cc-backend/pkg/archive"
-	"github.com/ClusterCockpit/cc-backend/pkg/nats"
-	"github.com/ClusterCockpit/cc-backend/web"
-	ccconf "github.com/ClusterCockpit/cc-lib/ccConfig"
+	"github.com/ClusterCockpit/cc-backend/pkg/runtimeEnv"
 	cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
-	"github.com/ClusterCockpit/cc-lib/runtimeEnv"
 	"github.com/ClusterCockpit/cc-lib/schema"
 	"github.com/ClusterCockpit/cc-lib/util"
 	"github.com/google/gops/agent"
 	"github.com/joho/godotenv"

+	_ "github.com/go-sql-driver/mysql"
 	_ "github.com/mattn/go-sqlite3"
 )
@@ -52,376 +42,28 @@ const logoString = `
|_|
`
// Environment variable names
const (
envGOGC = "GOGC"
)
// Default configurations
const (
defaultArchiveConfig = `{"kind":"file","path":"./var/job-archive"}`
)
 var (
 	date    string
 	commit  string
 	version string
 )

-func printVersion() {
+func main() {
fmt.Print(logoString)
fmt.Printf("Version:\t%s\n", version)
fmt.Printf("Git hash:\t%s\n", commit)
fmt.Printf("Build time:\t%s\n", date)
fmt.Printf("SQL db version:\t%d\n", repository.Version)
fmt.Printf("Job archive version:\t%d\n", archive.Version)
}
func initGops() error {
if !flagGops {
return nil
}
if err := agent.Listen(agent.Options{}); err != nil {
return fmt.Errorf("starting gops agent: %w", err)
}
return nil
}
func loadEnvironment() error {
if err := godotenv.Load(); err != nil {
return fmt.Errorf("loading .env file: %w", err)
}
return nil
}
func initConfiguration() error {
ccconf.Init(flagConfigFile)
cfg := ccconf.GetPackageConfig("main")
if cfg == nil {
return fmt.Errorf("main configuration must be present")
}
clustercfg := ccconf.GetPackageConfig("clusters")
if clustercfg == nil {
return fmt.Errorf("cluster configuration must be present")
}
config.Init(cfg, clustercfg)
return nil
}
func initDatabase() error {
repository.Connect(config.Keys.DBDriver, config.Keys.DB)
return nil
}
func handleDatabaseCommands() error {
if flagMigrateDB {
err := repository.MigrateDB(config.Keys.DB)
if err != nil {
return fmt.Errorf("migrating database to version %d: %w", repository.Version, err)
}
cclog.Exitf("MigrateDB Success: Migrated SQLite database at '%s' to version %d.\n",
config.Keys.DB, repository.Version)
}
if flagRevertDB {
err := repository.RevertDB(config.Keys.DB)
if err != nil {
return fmt.Errorf("reverting database to version %d: %w", repository.Version-1, err)
}
cclog.Exitf("RevertDB Success: Reverted SQLite database at '%s' to version %d.\n",
config.Keys.DB, repository.Version-1)
}
if flagForceDB {
err := repository.ForceDB(config.Keys.DB)
if err != nil {
return fmt.Errorf("forcing database to version %d: %w", repository.Version, err)
}
cclog.Exitf("ForceDB Success: Forced SQLite database at '%s' to version %d.\n",
config.Keys.DB, repository.Version)
}
return nil
}
func handleUserCommands() error {
if config.Keys.DisableAuthentication && (flagNewUser != "" || flagDelUser != "") {
return fmt.Errorf("--add-user and --del-user can only be used if authentication is enabled")
}
if !config.Keys.DisableAuthentication {
if cfg := ccconf.GetPackageConfig("auth"); cfg != nil {
auth.Init(&cfg)
} else {
cclog.Warn("Authentication disabled due to missing configuration")
auth.Init(nil)
}
// Check for default security keys
checkDefaultSecurityKeys()
if flagNewUser != "" {
if err := addUser(flagNewUser); err != nil {
return err
}
}
if flagDelUser != "" {
if err := delUser(flagDelUser); err != nil {
return err
}
}
authHandle := auth.GetAuthInstance()
if flagSyncLDAP {
if err := syncLDAP(authHandle); err != nil {
return err
}
}
if flagGenJWT != "" {
if err := generateJWT(authHandle, flagGenJWT); err != nil {
return err
}
}
}
return nil
}
// checkDefaultSecurityKeys warns if default JWT keys are detected
func checkDefaultSecurityKeys() {
// Default JWT public key from init.go
defaultJWTPublic := "kzfYrYy+TzpanWZHJ5qSdMj5uKUWgq74BWhQG6copP0="
if os.Getenv("JWT_PUBLIC_KEY") == defaultJWTPublic {
cclog.Warn("Using default JWT keys - not recommended for production environments")
}
}
func addUser(userSpec string) error {
parts := strings.SplitN(userSpec, ":", 3)
if len(parts) != 3 || len(parts[0]) == 0 {
return fmt.Errorf("invalid user format, want: <username>:[admin,support,manager,api,user]:<password>, have: %s", userSpec)
}
ur := repository.GetUserRepository()
if err := ur.AddUser(&schema.User{
Username: parts[0],
Projects: make([]string, 0),
Password: parts[2],
Roles: strings.Split(parts[1], ","),
}); err != nil {
return fmt.Errorf("adding user '%s' with roles '%s': %w", parts[0], parts[1], err)
}
cclog.Infof("Add User: Added new user '%s' with roles '%s'", parts[0], parts[1])
return nil
}
func delUser(username string) error {
ur := repository.GetUserRepository()
if err := ur.DelUser(username); err != nil {
return fmt.Errorf("deleting user '%s': %w", username, err)
}
cclog.Infof("Delete User: Deleted user '%s' from DB", username)
return nil
}
func syncLDAP(authHandle *auth.Authentication) error {
if authHandle.LdapAuth == nil {
return fmt.Errorf("LDAP authentication is not configured")
}
if err := authHandle.LdapAuth.Sync(); err != nil {
return fmt.Errorf("synchronizing LDAP: %w", err)
}
cclog.Print("Sync LDAP: LDAP synchronization successfull.")
return nil
}
func generateJWT(authHandle *auth.Authentication, username string) error {
ur := repository.GetUserRepository()
user, err := ur.GetUser(username)
if err != nil {
return fmt.Errorf("getting user '%s': %w", username, err)
}
if !user.HasRole(schema.RoleApi) {
cclog.Warnf("JWT: User '%s' does not have the role 'api'. REST API endpoints will return error!\n", user.Username)
}
jwt, err := authHandle.JwtAuth.ProvideJWT(user)
if err != nil {
return fmt.Errorf("generating JWT for user '%s': %w", user.Username, err)
}
cclog.Printf("JWT: Successfully generated JWT for user '%s': %s\n", user.Username, jwt)
return nil
}
func initSubsystems() error {
// Initialize nats client
natsConfig := ccconf.GetPackageConfig("nats")
if err := nats.Init(natsConfig); err != nil {
cclog.Warnf("initializing (optional) nats client: %s", err.Error())
}
nats.Connect()
// Initialize job archive
archiveCfg := ccconf.GetPackageConfig("archive")
if archiveCfg == nil {
archiveCfg = json.RawMessage(defaultArchiveConfig)
}
if err := archive.Init(archiveCfg, config.Keys.DisableArchive); err != nil {
return fmt.Errorf("initializing archive: %w", err)
}
// Initialize metricdata
// if err := metricdata.Init(); err != nil {
// return fmt.Errorf("initializing metricdata repository: %w", err)
// }
// Initialize upstream metricdata repositories for pull worker
if err := metricdata.InitUpstreamRepos(); err != nil {
return fmt.Errorf("initializing upstream metricdata repositories: %w", err)
}
// Handle database re-initialization
if flagReinitDB {
if err := importer.InitDB(); err != nil {
return fmt.Errorf("re-initializing repository DB: %w", err)
}
cclog.Print("Init DB: Successfully re-initialized repository DB.")
}
// Handle job import
if flagImportJob != "" {
if err := importer.HandleImportFlag(flagImportJob); err != nil {
return fmt.Errorf("importing job: %w", err)
}
cclog.Infof("Import Job: Imported Job '%s' into DB", flagImportJob)
}
// Initialize taggers
if config.Keys.EnableJobTaggers {
tagger.Init()
}
// Apply tags if requested
if flagApplyTags {
if err := tagger.RunTaggers(); err != nil {
return fmt.Errorf("running job taggers: %w", err)
}
}
return nil
}
func runServer(ctx context.Context) error {
var wg sync.WaitGroup
// Initialize metric store if configuration is provided
mscfg := ccconf.GetPackageConfig("metric-store")
if mscfg != nil {
memorystore.Init(mscfg, &wg)
} else {
cclog.Debug("Metric store configuration not found, skipping memorystore initialization")
}
// Start archiver and task manager
archiver.Start(repository.GetJobRepository(), ctx)
taskmanager.Start(ccconf.GetPackageConfig("cron"), ccconf.GetPackageConfig("archive"))
// Initialize web UI
cfg := ccconf.GetPackageConfig("ui")
if err := web.Init(cfg); err != nil {
return fmt.Errorf("initializing web UI: %w", err)
}
// Initialize HTTP server
srv, err := NewServer(version, commit, date)
if err != nil {
return fmt.Errorf("creating server: %w", err)
}
// Channel to collect errors from server
errChan := make(chan error, 1)
// Start HTTP server
wg.Add(1)
go func() {
defer wg.Done()
if err := srv.Start(ctx); err != nil {
errChan <- err
}
}()
// Handle shutdown signals
wg.Add(1)
sigs := make(chan os.Signal, 1)
signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)
go func() {
defer wg.Done()
select {
case <-sigs:
cclog.Info("Shutdown signal received")
case <-ctx.Done():
}
runtimeEnv.SystemdNotifiy(false, "Shutting down ...")
srv.Shutdown(ctx)
util.FsWatcherShutdown()
taskmanager.Shutdown()
}()
// Set GC percent if not configured
if os.Getenv(envGOGC) == "" {
debug.SetGCPercent(25)
}
runtimeEnv.SystemdNotifiy(true, "running")
// Wait for completion or error
go func() {
wg.Wait()
close(errChan)
}()
// Check for server startup errors
select {
case err := <-errChan:
if err != nil {
return err
}
case <-time.After(100 * time.Millisecond):
// Server started successfully, wait for completion
if err := <-errChan; err != nil {
return err
}
}
cclog.Print("Graceful shutdown completed!")
return nil
}
-func run() error {
 	cliInit()

 	if flagVersion {
-		printVersion()
-		return nil
+		fmt.Print(logoString)
+		fmt.Printf("Version:\t%s\n", version)
+		fmt.Printf("Git hash:\t%s\n", commit)
+		fmt.Printf("Build time:\t%s\n", date)
+		fmt.Printf("SQL db version:\t%d\n", repository.Version)
+		fmt.Printf("Job archive version:\t%d\n", archive.Version)
+		os.Exit(0)
 	}

-	// Initialize logger
 	cclog.Init(flagLogLevel, flagLogDateTime)

-	// Handle init flag
+	// If init flag set, run tasks here before any file dependencies cause errors
 	if flagInit {
 		initEnv()
 		cclog.Exit("Successfully setup environment!\n" +
@@ -429,63 +71,193 @@ func run() error {
"Add your job-archive at ./var/job-archive.") "Add your job-archive at ./var/job-archive.")
} }
-	// Initialize gops agent
-	if err := initGops(); err != nil {
-		return err
-	}
-
-	// Initialize subsystems in dependency order:
-	// 1. Load environment variables from .env file (contains sensitive configuration)
-	// 2. Load configuration from config.json (may reference environment variables)
-	// 3. Handle database migration commands if requested
-	// 4. Initialize database connection (requires config for connection string)
-	// 5. Handle user commands if requested (requires database and authentication config)
-	// 6. Initialize subsystems like archive and metrics (require config and database)
-
-	// Load environment and configuration
-	if err := loadEnvironment(); err != nil {
-		return err
-	}
-	if err := initConfiguration(); err != nil {
-		return err
-	}
-
-	// Handle database migration (migrate, revert, force)
-	if err := handleDatabaseCommands(); err != nil {
-		return err
-	}
-
-	// Initialize database
-	if err := initDatabase(); err != nil {
-		return err
-	}
-
-	// Handle user commands (add, delete, sync, JWT)
-	if err := handleUserCommands(); err != nil {
-		return err
-	}
-
-	// Initialize subsystems (archive, metrics, taggers)
-	if err := initSubsystems(); err != nil {
-		return err
+	// See https://github.com/google/gops (Runtime overhead is almost zero)
+	if flagGops {
+		if err := agent.Listen(agent.Options{}); err != nil {
+			cclog.Abortf("Could not start gops agent with 'gops/agent.Listen(agent.Options{})'. Application startup failed, exited.\nError: %s\n", err.Error())
+		}
+	}
+
+	err := godotenv.Load()
+	if err != nil {
+		cclog.Abortf("Could not parse existing .env file at location './.env'. Application startup failed, exited.\nError: %s\n", err.Error())
+	}
+
+	// Initialize sub-modules and handle command line flags.
+	// The order here is important!
+	config.Init(flagConfigFile)
+
+	// As a special case for `db`, allow using an environment variable instead of the value
+	// stored in the config. This can be done for people having security concerns about storing
+	// the password for their mysql database in config.json.
+	if strings.HasPrefix(config.Keys.DB, "env:") {
+		envvar := strings.TrimPrefix(config.Keys.DB, "env:")
+		config.Keys.DB = os.Getenv(envvar)
+	}
+
+	if flagMigrateDB {
+		err := repository.MigrateDB(config.Keys.DBDriver, config.Keys.DB)
+		if err != nil {
+			cclog.Abortf("MigrateDB Failed: Could not migrate '%s' database at location '%s' to version %d.\nError: %s\n", config.Keys.DBDriver, config.Keys.DB, repository.Version, err.Error())
+		}
+		cclog.Exitf("MigrateDB Success: Migrated '%s' database at location '%s' to version %d.\n", config.Keys.DBDriver, config.Keys.DB, repository.Version)
+	}
+
+	if flagRevertDB {
+		err := repository.RevertDB(config.Keys.DBDriver, config.Keys.DB)
+		if err != nil {
+			cclog.Abortf("RevertDB Failed: Could not revert '%s' database at location '%s' to version %d.\nError: %s\n", config.Keys.DBDriver, config.Keys.DB, (repository.Version - 1), err.Error())
+		}
+		cclog.Exitf("RevertDB Success: Reverted '%s' database at location '%s' to version %d.\n", config.Keys.DBDriver, config.Keys.DB, (repository.Version - 1))
+	}
+
+	if flagForceDB {
+		err := repository.ForceDB(config.Keys.DBDriver, config.Keys.DB)
+		if err != nil {
+			cclog.Abortf("ForceDB Failed: Could not force '%s' database at location '%s' to version %d.\nError: %s\n", config.Keys.DBDriver, config.Keys.DB, repository.Version, err.Error())
+		}
+		cclog.Exitf("ForceDB Success: Forced '%s' database at location '%s' to version %d.\n", config.Keys.DBDriver, config.Keys.DB, repository.Version)
+	}
+
+	repository.Connect(config.Keys.DBDriver, config.Keys.DB)
+
+	if !config.Keys.DisableAuthentication {
auth.Init()
if flagNewUser != "" {
parts := strings.SplitN(flagNewUser, ":", 3)
if len(parts) != 3 || len(parts[0]) == 0 {
cclog.Abortf("Add User: Could not parse supplied argument format: No changes.\n"+
"Want: <username>:[admin,support,manager,api,user]:<password>\n"+
"Have: %s\n", flagNewUser)
}
ur := repository.GetUserRepository()
if err := ur.AddUser(&schema.User{
Username: parts[0], Projects: make([]string, 0), Password: parts[2], Roles: strings.Split(parts[1], ","),
}); err != nil {
cclog.Abortf("Add User: Could not add new user authentication for '%s' and roles '%s'.\nError: %s\n", parts[0], parts[1], err.Error())
} else {
cclog.Printf("Add User: Added new user '%s' with roles '%s'.\n", parts[0], parts[1])
}
}
if flagDelUser != "" {
ur := repository.GetUserRepository()
if err := ur.DelUser(flagDelUser); err != nil {
cclog.Abortf("Delete User: Could not delete user '%s' from DB.\nError: %s\n", flagDelUser, err.Error())
} else {
cclog.Printf("Delete User: Deleted user '%s' from DB.\n", flagDelUser)
}
}
authHandle := auth.GetAuthInstance()
if flagSyncLDAP {
if authHandle.LdapAuth == nil {
cclog.Abort("Sync LDAP: LDAP authentication is not configured, could not synchronize. No changes, exited.")
}
if err := authHandle.LdapAuth.Sync(); err != nil {
cclog.Abortf("Sync LDAP: Could not synchronize, failed with error.\nError: %s\n", err.Error())
}
cclog.Print("Sync LDAP: LDAP synchronization successfull.")
}
if flagGenJWT != "" {
ur := repository.GetUserRepository()
user, err := ur.GetUser(flagGenJWT)
if err != nil {
cclog.Abortf("JWT: Could not get supplied user '%s' from DB. No changes, exited.\nError: %s\n", flagGenJWT, err.Error())
}
if !user.HasRole(schema.RoleApi) {
cclog.Warnf("JWT: User '%s' does not have the role 'api'. REST API endpoints will return error!\n", user.Username)
}
jwt, err := authHandle.JwtAuth.ProvideJWT(user)
if err != nil {
cclog.Abortf("JWT: User '%s' found in DB, but failed to provide JWT.\nError: %s\n", user.Username, err.Error())
}
cclog.Printf("JWT: Successfully generated JWT for user '%s': %s\n", user.Username, jwt)
}
} else if flagNewUser != "" || flagDelUser != "" {
cclog.Abort("Error: Arguments '--add-user' and '--del-user' can only be used if authentication is enabled. No changes, exited.")
}
if err := archive.Init(config.Keys.Archive, config.Keys.DisableArchive); err != nil {
cclog.Abortf("Init: Failed to initialize archive.\nError: %s\n", err.Error())
}
if err := metricdata.Init(); err != nil {
cclog.Abortf("Init: Failed to initialize metricdata repository.\nError %s\n", err.Error())
}
if flagReinitDB {
if err := importer.InitDB(); err != nil {
cclog.Abortf("Init DB: Failed to re-initialize repository DB.\nError: %s\n", err.Error())
} else {
cclog.Print("Init DB: Sucessfully re-initialized repository DB.")
}
}
if flagImportJob != "" {
if err := importer.HandleImportFlag(flagImportJob); err != nil {
cclog.Abortf("Import Job: Job import failed.\nError: %s\n", err.Error())
} else {
cclog.Printf("Import Job: Imported Job '%s' into DB.\n", flagImportJob)
}
}
if config.Keys.EnableJobTaggers {
tagger.Init()
}
if flagApplyTags {
if err := tagger.RunTaggers(); err != nil {
cclog.Abortf("Running job taggers.\nError: %s\n", err.Error())
}
} }
// Exit if start server is not requested
if !flagServer { if !flagServer {
cclog.Exit("No errors, server flag not set. Exiting cc-backend.") cclog.Exit("No errors, server flag not set. Exiting cc-backend.")
} }
// Run server with context archiver.Start(repository.GetJobRepository())
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
return runServer(ctx) taskManager.Start()
} serverInit()
func main() { var wg sync.WaitGroup
if err := run(); err != nil {
cclog.Error(err.Error()) wg.Add(1)
os.Exit(1) go func() {
defer wg.Done()
serverStart()
}()
wg.Add(1)
sigs := make(chan os.Signal, 1)
signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)
go func() {
defer wg.Done()
<-sigs
runtimeEnv.SystemdNotifiy(false, "Shutting down ...")
serverShutdown()
util.FsWatcherShutdown()
taskManager.Shutdown()
}()
if os.Getenv("GOGC") == "" {
debug.SetGCPercent(25)
} }
runtimeEnv.SystemdNotifiy(true, "running")
wg.Wait()
cclog.Print("Graceful shutdown completed!")
} }
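A note on the `env:` special case retained above: a config value such as "env:DB_PASSWORD" is an indirection into the process environment, so the database password never needs to be written into config.json. A minimal, self-contained sketch of the same idea — the helper name resolveEnvValue is ours, not cc-backend's:

package main

import (
	"os"
	"strings"
)

// resolveEnvValue expands a config value of the form "env:NAME" to the
// content of the environment variable NAME; any other value is returned
// unchanged. This mirrors the HasPrefix/TrimPrefix logic in the code above.
func resolveEnvValue(v string) string {
	if name, ok := strings.CutPrefix(v, "env:"); ok {
		return os.Getenv(name)
	}
	return v
}

With this, resolveEnvValue("env:DB_PASSWORD") reads DB_PASSWORD from the environment, while a plain DSN string passes through untouched.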

View File

@@ -2,9 +2,6 @@
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
-// Package main provides the entry point for the ClusterCockpit backend server.
-// This file contains HTTP server setup, routing configuration, and
-// authentication middleware integration.
package main

import (
@@ -29,32 +26,21 @@ import (
"github.com/ClusterCockpit/cc-backend/internal/config" "github.com/ClusterCockpit/cc-backend/internal/config"
"github.com/ClusterCockpit/cc-backend/internal/graph" "github.com/ClusterCockpit/cc-backend/internal/graph"
"github.com/ClusterCockpit/cc-backend/internal/graph/generated" "github.com/ClusterCockpit/cc-backend/internal/graph/generated"
"github.com/ClusterCockpit/cc-backend/internal/memorystore"
"github.com/ClusterCockpit/cc-backend/internal/routerConfig" "github.com/ClusterCockpit/cc-backend/internal/routerConfig"
"github.com/ClusterCockpit/cc-backend/pkg/nats" "github.com/ClusterCockpit/cc-backend/pkg/runtimeEnv"
"github.com/ClusterCockpit/cc-backend/web" "github.com/ClusterCockpit/cc-backend/web"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger" cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/runtimeEnv"
"github.com/gorilla/handlers" "github.com/gorilla/handlers"
"github.com/gorilla/mux" "github.com/gorilla/mux"
httpSwagger "github.com/swaggo/http-swagger" httpSwagger "github.com/swaggo/http-swagger"
) )
var buildInfo web.Build var (
router *mux.Router
// Environment variable names server *http.Server
const ( apiHandle *api.RestApi
envDebug = "DEBUG"
) )
// Server encapsulates the HTTP server state and dependencies
type Server struct {
router *mux.Router
server *http.Server
restAPIHandle *api.RestAPI
natsAPIHandle *api.NatsAPI
}
func onFailureResponse(rw http.ResponseWriter, r *http.Request, err error) {
	rw.Header().Add("Content-Type", "application/json")
	rw.WriteHeader(http.StatusUnauthorized)
@@ -64,31 +50,25 @@ func onFailureResponse(rw http.ResponseWriter, r *http.Request, err error) {
	})
}
-// NewServer creates and initializes a new Server instance
-func NewServer(version, commit, buildDate string) (*Server, error) {
-	buildInfo = web.Build{Version: version, Hash: commit, Buildtime: buildDate}
-
-	s := &Server{
-		router: mux.NewRouter(),
-	}
-
-	if err := s.init(); err != nil {
-		return nil, err
-	}
-
-	return s, nil
-}
-
-func (s *Server) init() error {
+func serverInit() {
	// Setup the http.Handler/Router used by the server
	graph.Init()
	resolver := graph.GetResolverInstance()
	graphQLServer := handler.New(
		generated.NewExecutableSchema(generated.Config{Resolvers: resolver}))

+	// graphQLServer.AddTransport(transport.SSE{})
	graphQLServer.AddTransport(transport.POST{})
+	// graphQLServer.AddTransport(transport.Websocket{
+	// 	KeepAlivePingInterval: 10 * time.Second,
+	// 	Upgrader: websocket.Upgrader{
+	// 		CheckOrigin: func(r *http.Request) bool {
+	// 			return true
+	// 		},
+	// 	},
+	// })

-	if os.Getenv(envDebug) != "1" {
+	if os.Getenv("DEBUG") != "1" {
		// Having this handler means that an error message is returned via GraphQL instead of the connection simply being closed.
		// The problem with this is that then, no more stacktrace is printed to stderr.
		graphQLServer.SetRecoverFunc(func(ctx context.Context, err any) error {
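The recover-handler comment above describes a real trade-off: with a RecoverFunc installed, gqlgen turns a panic into a GraphQL error instead of closing the connection, but the stack trace no longer reaches stderr. A small sketch (not the project's code) of a handler that keeps both behaviours:

package main

import (
	"context"
	"errors"
	"log"
	"runtime/debug"
)

// recoverToError logs the panic value together with the goroutine's stack
// trace, then returns a generic error for gqlgen to send to the client.
func recoverToError(ctx context.Context, err any) error {
	log.Printf("GraphQL panic: %v\n%s", err, debug.Stack())
	return errors.New("internal server error")
}

Installed via graphQLServer.SetRecoverFunc(recoverToError), this keeps the logged stack trace while still returning a structured error.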
@@ -105,56 +85,72 @@ func (s *Server) init() error {
	authHandle := auth.GetAuthInstance()

-	s.restAPIHandle = api.New()
+	apiHandle = api.New()
+
+	router = mux.NewRouter()
+	buildInfo := web.Build{Version: version, Hash: commit, Buildtime: date}

	info := map[string]any{}
	info["hasOpenIDConnect"] = false

-	if auth.Keys.OpenIDConfig != nil {
+	if config.Keys.OpenIDConfig != nil {
		openIDConnect := auth.NewOIDC(authHandle)
-		openIDConnect.RegisterEndpoints(s.router)
+		openIDConnect.RegisterEndpoints(router)
		info["hasOpenIDConnect"] = true
	}

-	s.router.HandleFunc("/login", func(rw http.ResponseWriter, r *http.Request) {
+	router.HandleFunc("/login", func(rw http.ResponseWriter, r *http.Request) {
		rw.Header().Add("Content-Type", "text/html; charset=utf-8")
		cclog.Debugf("##%v##", info)
		web.RenderTemplate(rw, "login.tmpl", &web.Page{Title: "Login", Build: buildInfo, Infos: info})
	}).Methods(http.MethodGet)

-	s.router.HandleFunc("/imprint", func(rw http.ResponseWriter, r *http.Request) {
+	router.HandleFunc("/imprint", func(rw http.ResponseWriter, r *http.Request) {
		rw.Header().Add("Content-Type", "text/html; charset=utf-8")
		web.RenderTemplate(rw, "imprint.tmpl", &web.Page{Title: "Imprint", Build: buildInfo})
	})

-	s.router.HandleFunc("/privacy", func(rw http.ResponseWriter, r *http.Request) {
+	router.HandleFunc("/privacy", func(rw http.ResponseWriter, r *http.Request) {
		rw.Header().Add("Content-Type", "text/html; charset=utf-8")
		web.RenderTemplate(rw, "privacy.tmpl", &web.Page{Title: "Privacy", Build: buildInfo})
	})

-	secured := s.router.PathPrefix("/").Subrouter()
-	securedapi := s.router.PathPrefix("/api").Subrouter()
-	userapi := s.router.PathPrefix("/userapi").Subrouter()
-	configapi := s.router.PathPrefix("/config").Subrouter()
-	frontendapi := s.router.PathPrefix("/frontend").Subrouter()
-	metricstoreapi := s.router.PathPrefix("/metricstore").Subrouter()
+	secured := router.PathPrefix("/").Subrouter()
+	securedapi := router.PathPrefix("/api").Subrouter()
+	userapi := router.PathPrefix("/userapi").Subrouter()
+	configapi := router.PathPrefix("/config").Subrouter()
+	frontendapi := router.PathPrefix("/frontend").Subrouter()

	if !config.Keys.DisableAuthentication {
-		// Create login failure handler (used by both /login and /jwt-login)
-		loginFailureHandler := func(rw http.ResponseWriter, r *http.Request, err error) {
-			rw.Header().Add("Content-Type", "text/html; charset=utf-8")
-			rw.WriteHeader(http.StatusUnauthorized)
-			web.RenderTemplate(rw, "login.tmpl", &web.Page{
-				Title:   "Login failed - ClusterCockpit",
-				MsgType: "alert-warning",
-				Message: err.Error(),
-				Build:   buildInfo,
-				Infos:   info,
-			})
-		}
-
-		s.router.Handle("/login", authHandle.Login(loginFailureHandler)).Methods(http.MethodPost)
-		s.router.Handle("/jwt-login", authHandle.Login(loginFailureHandler))
+		router.Handle("/login", authHandle.Login(
+			// On success: Handled within Login()
+			// On failure:
+			func(rw http.ResponseWriter, r *http.Request, err error) {
+				rw.Header().Add("Content-Type", "text/html; charset=utf-8")
+				rw.WriteHeader(http.StatusUnauthorized)
+				web.RenderTemplate(rw, "login.tmpl", &web.Page{
+					Title:   "Login failed - ClusterCockpit",
+					MsgType: "alert-warning",
+					Message: err.Error(),
+					Build:   buildInfo,
+					Infos:   info,
+				})
+			})).Methods(http.MethodPost)
+
+		router.Handle("/jwt-login", authHandle.Login(
+			// On success: Handled within Login()
+			// On failure:
+			func(rw http.ResponseWriter, r *http.Request, err error) {
+				rw.Header().Add("Content-Type", "text/html; charset=utf-8")
+				rw.WriteHeader(http.StatusUnauthorized)
+				web.RenderTemplate(rw, "login.tmpl", &web.Page{
+					Title:   "Login failed - ClusterCockpit",
+					MsgType: "alert-warning",
+					Message: err.Error(),
+					Build:   buildInfo,
+					Infos:   info,
+				})
+			}))

-		s.router.Handle("/logout", authHandle.Logout(
+		router.Handle("/logout", authHandle.Logout(
			http.HandlerFunc(func(rw http.ResponseWriter, r *http.Request) {
				rw.Header().Add("Content-Type", "text/html; charset=utf-8")
				rw.WriteHeader(http.StatusOK)
@@ -187,7 +183,7 @@ func (s *Server) init() error {
	})

	securedapi.Use(func(next http.Handler) http.Handler {
-		return authHandle.AuthAPI(
+		return authHandle.AuthApi(
			// On success;
			next,
			// On failure: JSON Response
@@ -195,15 +191,7 @@ func (s *Server) init() error {
	})

	userapi.Use(func(next http.Handler) http.Handler {
-		return authHandle.AuthUserAPI(
-			// On success;
-			next,
-			// On failure: JSON Response
-			onFailureResponse)
-	})
-
-	metricstoreapi.Use(func(next http.Handler) http.Handler {
-		return authHandle.AuthMetricStoreAPI(
+		return authHandle.AuthUserApi(
			// On success;
			next,
			// On failure: JSON Response
@@ -211,7 +199,7 @@ func (s *Server) init() error {
	})

	configapi.Use(func(next http.Handler) http.Handler {
-		return authHandle.AuthConfigAPI(
+		return authHandle.AuthConfigApi(
			// On success;
			next,
			// On failure: JSON Response
@@ -219,7 +207,7 @@ func (s *Server) init() error {
	})

	frontendapi.Use(func(next http.Handler) http.Handler {
-		return authHandle.AuthFrontendAPI(
+		return authHandle.AuthFrontendApi(
			// On success;
			next,
			// On failure: JSON Response
@@ -228,8 +216,8 @@ func (s *Server) init() error {
	}

	if flagDev {
-		s.router.Handle("/playground", playground.Handler("GraphQL playground", "/query"))
-		s.router.PathPrefix("/swagger/").Handler(httpSwagger.Handler(
+		router.Handle("/playground", playground.Handler("GraphQL playground", "/query"))
+		router.PathPrefix("/swagger/").Handler(httpSwagger.Handler(
			httpSwagger.URL("http://" + config.Keys.Addr + "/swagger/doc.json"))).Methods(http.MethodGet)
	}

	secured.Handle("/query", graphQLServer)
@@ -241,51 +229,34 @@ func (s *Server) init() error {
	// Mount all /monitoring/... and /api/... routes.
	routerConfig.SetupRoutes(secured, buildInfo)
-	s.restAPIHandle.MountAPIRoutes(securedapi)
-	s.restAPIHandle.MountUserAPIRoutes(userapi)
-	s.restAPIHandle.MountConfigAPIRoutes(configapi)
-	s.restAPIHandle.MountFrontendAPIRoutes(frontendapi)
-
-	if config.Keys.APISubjects != nil {
-		s.natsAPIHandle = api.NewNatsAPI()
-		if err := s.natsAPIHandle.StartSubscriptions(); err != nil {
-			return fmt.Errorf("starting NATS subscriptions: %w", err)
-		}
-	}
-
-	s.restAPIHandle.MountMetricStoreAPIRoutes(metricstoreapi)
+	apiHandle.MountApiRoutes(securedapi)
+	apiHandle.MountUserApiRoutes(userapi)
+	apiHandle.MountConfigApiRoutes(configapi)
+	apiHandle.MountFrontendApiRoutes(frontendapi)

	if config.Keys.EmbedStaticFiles {
		if i, err := os.Stat("./var/img"); err == nil {
			if i.IsDir() {
				cclog.Info("Use local directory for static images")
-				s.router.PathPrefix("/img/").Handler(http.StripPrefix("/img/", http.FileServer(http.Dir("./var/img"))))
+				router.PathPrefix("/img/").Handler(http.StripPrefix("/img/", http.FileServer(http.Dir("./var/img"))))
			}
		}
-		s.router.PathPrefix("/").Handler(http.StripPrefix("/", web.ServeFiles()))
+		router.PathPrefix("/").Handler(web.ServeFiles())
	} else {
-		s.router.PathPrefix("/").Handler(http.FileServer(http.Dir(config.Keys.StaticFiles)))
+		router.PathPrefix("/").Handler(http.FileServer(http.Dir(config.Keys.StaticFiles)))
	}

-	s.router.Use(handlers.CompressHandler)
-	s.router.Use(handlers.RecoveryHandler(handlers.PrintRecoveryStack(true)))
-	s.router.Use(handlers.CORS(
+	router.Use(handlers.CompressHandler)
+	router.Use(handlers.RecoveryHandler(handlers.PrintRecoveryStack(true)))
+	router.Use(handlers.CORS(
		handlers.AllowCredentials(),
		handlers.AllowedHeaders([]string{"X-Requested-With", "Content-Type", "Authorization", "Origin"}),
		handlers.AllowedMethods([]string{"GET", "POST", "HEAD", "OPTIONS"}),
		handlers.AllowedOrigins([]string{"*"})))
-
-	return nil
}
-// Server timeout defaults (in seconds)
-const (
-	defaultReadTimeout  = 20
-	defaultWriteTimeout = 20
-)
-
-func (s *Server) Start(ctx context.Context) error {
-	handler := handlers.CustomLoggingHandler(io.Discard, s.router, func(_ io.Writer, params handlers.LogFormatterParams) {
+func serverStart() {
+	handler := handlers.CustomLoggingHandler(io.Discard, router, func(_ io.Writer, params handlers.LogFormatterParams) {
		if strings.HasPrefix(params.Request.RequestURI, "/api/") {
			cclog.Debugf("%s %s (%d, %.02fkb, %dms)",
				params.Request.Method, params.URL.RequestURI(),
@@ -299,13 +270,9 @@ func (s *Server) Start(ctx context.Context) error {
		}
	})

-	// Use configurable timeouts with defaults
-	readTimeout := time.Duration(defaultReadTimeout) * time.Second
-	writeTimeout := time.Duration(defaultWriteTimeout) * time.Second
-
-	s.server = &http.Server{
-		ReadTimeout:  readTimeout,
-		WriteTimeout: writeTimeout,
+	server = &http.Server{
+		ReadTimeout:  20 * time.Second,
+		WriteTimeout: 20 * time.Second,
		Handler:      handler,
		Addr:         config.Keys.Addr,
	}
@@ -313,20 +280,20 @@ func (s *Server) Start(ctx context.Context) error {
	// Start http or https server
	listener, err := net.Listen("tcp", config.Keys.Addr)
	if err != nil {
-		return fmt.Errorf("starting listener on '%s': %w", config.Keys.Addr, err)
+		cclog.Abortf("Server Start: Starting http listener on '%s' failed.\nError: %s\n", config.Keys.Addr, err.Error())
	}

-	if !strings.HasSuffix(config.Keys.Addr, ":80") && config.Keys.RedirectHTTPTo != "" {
+	if !strings.HasSuffix(config.Keys.Addr, ":80") && config.Keys.RedirectHttpTo != "" {
		go func() {
-			http.ListenAndServe(":80", http.RedirectHandler(config.Keys.RedirectHTTPTo, http.StatusMovedPermanently))
+			http.ListenAndServe(":80", http.RedirectHandler(config.Keys.RedirectHttpTo, http.StatusMovedPermanently))
		}()
	}

-	if config.Keys.HTTPSCertFile != "" && config.Keys.HTTPSKeyFile != "" {
+	if config.Keys.HttpsCertFile != "" && config.Keys.HttpsKeyFile != "" {
		cert, err := tls.LoadX509KeyPair(
-			config.Keys.HTTPSCertFile, config.Keys.HTTPSKeyFile)
+			config.Keys.HttpsCertFile, config.Keys.HttpsKeyFile)
		if err != nil {
-			return fmt.Errorf("loading X509 keypair (check 'https-cert-file' and 'https-key-file' in config.json): %w", err)
+			cclog.Abortf("Server Start: Loading X509 keypair failed. Check options 'https-cert-file' and 'https-key-file' in 'config.json'.\nError: %s\n", err.Error())
		}
		listener = tls.NewListener(listener, &tls.Config{
			Certificates: []tls.Certificate{cert},
@@ -337,54 +304,27 @@ func (s *Server) Start(ctx context.Context) error {
			MinVersion:               tls.VersionTLS12,
			PreferServerCipherSuites: true,
		})
-		cclog.Infof("HTTPS server listening at %s...", config.Keys.Addr)
+		cclog.Printf("HTTPS server listening at %s...\n", config.Keys.Addr)
	} else {
-		cclog.Infof("HTTP server listening at %s...", config.Keys.Addr)
+		cclog.Printf("HTTP server listening at %s...\n", config.Keys.Addr)
	}

	//
	// Because this program will want to bind to a privileged port (like 80), the listener must
	// be established first, then the user can be changed, and after that,
	// the actual http server can be started.
	if err := runtimeEnv.DropPrivileges(config.Keys.Group, config.Keys.User); err != nil {
-		return fmt.Errorf("dropping privileges: %w", err)
+		cclog.Abortf("Server Start: Error while preparing server start.\nError: %s\n", err.Error())
	}

-	// Handle context cancellation for graceful shutdown
-	go func() {
-		<-ctx.Done()
-		shutdownCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
-		defer cancel()
-		if err := s.server.Shutdown(shutdownCtx); err != nil {
-			cclog.Errorf("Server shutdown error: %v", err)
-		}
-	}()
-
-	if err = s.server.Serve(listener); err != nil && err != http.ErrServerClosed {
-		return fmt.Errorf("server failed: %w", err)
-	}
-
-	return nil
-}
+	if err = server.Serve(listener); err != nil && err != http.ErrServerClosed {
+		cclog.Abortf("Server Start: Starting server failed.\nError: %s\n", err.Error())
+	}
+}
-func (s *Server) Shutdown(ctx context.Context) {
-	// Create a shutdown context with timeout
-	shutdownCtx, cancel := context.WithTimeout(ctx, 30*time.Second)
-	defer cancel()
-
-	nc := nats.GetClient()
-	if nc != nil {
-		nc.Close()
-	}
-
+func serverShutdown() {
	// First shut down the server gracefully (waiting for all ongoing requests)
-	if err := s.server.Shutdown(shutdownCtx); err != nil {
-		cclog.Errorf("Server shutdown error: %v", err)
-	}
-
-	// Archive all the metric store data
-	memorystore.Shutdown()
-
-	// Shutdown archiver with 10 second timeout for fast shutdown
-	if err := archiver.Shutdown(10 * time.Second); err != nil {
-		cclog.Warnf("Archiver shutdown: %v", err)
-	}
+	server.Shutdown(context.Background())
+
+	// Then, wait for any async archivings still pending...
+	archiver.WaitForArchiving()
}
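Both variants above rely on http.Server.Shutdown, which blocks until in-flight requests complete. A minimal, self-contained sketch (helper name ours) of the timeout-bounded form used on the removed side:

package main

import (
	"context"
	"log"
	"net/http"
	"time"
)

// shutdownWithTimeout gives in-flight requests up to d to finish; after
// that, Shutdown returns the context's error instead of waiting further,
// so a hung connection cannot stall process exit indefinitely.
func shutdownWithTimeout(srv *http.Server, d time.Duration) {
	ctx, cancel := context.WithTimeout(context.Background(), d)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		log.Printf("shutdown incomplete after %v: %v", d, err)
	}
}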

View File

@@ -1,42 +1,34 @@
{
-  "main": {
-    "addr": "127.0.0.1:8080",
-    "short-running-jobs-duration": 300,
-    "resampling": {
-      "minimumPoints": 600,
-      "trigger": 180,
-      "resolutions": [
-        240,
-        60
-      ]
-    },
-    "apiAllowedIPs": [
-      "*"
-    ],
-    "emission-constant": 317
-  },
-  "cron": {
-    "commit-job-worker": "2m",
-    "duration-worker": "5m",
-    "footprint-worker": "10m"
-  },
+  "addr": "127.0.0.1:8080",
+  "short-running-jobs-duration": 300,
  "archive": {
    "kind": "file",
    "path": "./var/job-archive"
  },
-  "auth": {
-    "jwts": {
-      "max-age": "2000h"
-    }
-  },
-  "nats": {
-    "address": "nats://0.0.0.0:4222",
-    "username": "root",
-    "password": "root"
-  },
+  "jwts": {
+    "max-age": "2000h"
+  },
+  "enable-resampling": {
+    "trigger": 30,
+    "resolutions": [
+      600,
+      300,
+      120,
+      60
+    ]
+  },
+  "apiAllowedIPs": [
+    "*"
+  ],
+  "emission-constant": 317,
  "clusters": [
    {
      "name": "fritz",
+      "metricDataRepository": {
+        "kind": "cc-metric-store",
+        "url": "http://localhost:8082",
+        "token": ""
+      },
      "filterRanges": {
        "numNodes": {
          "from": 1,
@@ -54,6 +46,11 @@
    },
    {
      "name": "alex",
+      "metricDataRepository": {
+        "kind": "cc-metric-store",
+        "url": "http://localhost:8082",
+        "token": ""
+      },
      "filterRanges": {
        "numNodes": {
          "from": 1,
@@ -69,28 +66,5 @@
        }
      }
    }
-  ],
-  "metric-store": {
-    "checkpoints": {
-      "file-format": "avro",
-      "interval": "1h",
-      "directory": "./var/checkpoints",
-      "restore": "48h"
-    },
-    "archive": {
-      "interval": "1h",
-      "directory": "./var/archive"
-    },
-    "retention-in-memory": "48h",
-    "subscriptions": [
-      {
-        "subscribe-to": "hpc-nats",
-        "cluster-tag": "fritz"
-      },
-      {
-        "subscribe-to": "hpc-nats",
-        "cluster-tag": "alex"
-      }
-    ]
-  }
+  ]
}
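Configs like the one above are easiest to validate with strict JSON decoding. A minimal sketch — not cc-backend's actual loader; the struct fields are illustrative, and a real struct must cover every key present in the file, because DisallowUnknownFields rejects extras:

package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// partialConfig shows only two keys for brevity; extend it to match the
// full file before enabling strict decoding.
type partialConfig struct {
	Addr    string `json:"addr"`
	Archive struct {
		Kind string `json:"kind"`
		Path string `json:"path"`
	} `json:"archive"`
}

func loadConfig(path string) (*partialConfig, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	dec := json.NewDecoder(f)
	dec.DisallowUnknownFields() // unknown or misspelled keys fail loudly

	var cfg partialConfig
	if err := dec.Decode(&cfg); err != nil {
		return nil, fmt.Errorf("parsing %s: %w", path, err)
	}
	return &cfg, nil
}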

View File

@@ -0,0 +1,69 @@
{
"addr": "127.0.0.1:8080",
"short-running-jobs-duration": 300,
"archive": {
"kind": "file",
"path": "./var/job-archive"
},
"jwts": {
"max-age": "2000h"
},
"db-driver": "mysql",
"db": "clustercockpit:demo@tcp(127.0.0.1:3306)/clustercockpit",
"enable-resampling": {
"trigger": 30,
"resolutions": [
600,
300,
120,
60
]
},
"emission-constant": 317,
"clusters": [
{
"name": "fritz",
"metricDataRepository": {
"kind": "cc-metric-store",
"url": "http://localhost:8082",
"token": ""
},
"filterRanges": {
"numNodes": {
"from": 1,
"to": 64
},
"duration": {
"from": 0,
"to": 86400
},
"startTime": {
"from": "2022-01-01T00:00:00Z",
"to": null
}
}
},
{
"name": "alex",
"metricDataRepository": {
"kind": "cc-metric-store",
"url": "http://localhost:8082",
"token": ""
},
"filterRanges": {
"numNodes": {
"from": 1,
"to": 64
},
"duration": {
"from": 0,
"to": 86400
},
"startTime": {
"from": "2022-01-01T00:00:00Z",
"to": null
}
}
}
]
}
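The "db-driver" and "db" keys in this sample map directly onto database/sql. A minimal sketch using the sample DSN (demo credentials copied from the file above; not production code):

package main

import (
	"database/sql"
	"log"
	"time"

	_ "github.com/go-sql-driver/mysql" // driver selected by "db-driver": "mysql"
)

func openDB() *sql.DB {
	// DSN copied from the "db" key of the sample config.
	db, err := sql.Open("mysql", "clustercockpit:demo@tcp(127.0.0.1:3306)/clustercockpit")
	if err != nil {
		log.Fatal(err)
	}
	db.SetConnMaxLifetime(3 * time.Minute)
	if err := db.Ping(); err != nil {
		log.Fatal(err) // fails here if MySQL is not reachable
	}
	return db
}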

View File

@@ -1,34 +1,32 @@
{
-  "main": {
-    "addr": "0.0.0.0:443",
-    "https-cert-file": "/etc/letsencrypt/live/url/fullchain.pem",
-    "https-key-file": "/etc/letsencrypt/live/url/privkey.pem",
-    "user": "clustercockpit",
-    "group": "clustercockpit",
-    "validate": false,
-    "apiAllowedIPs": ["*"],
-    "short-running-jobs-duration": 300,
-    "resampling": {
-      "minimumPoints": 600,
-      "trigger": 180,
-      "resolutions": [
-        240,
-        60
-      ]
-    }
-  },
-  "cron": {
-    "commit-job-worker": "2m",
-    "duration-worker": "5m",
-    "footprint-worker": "10m"
-  },
+  "addr": "0.0.0.0:443",
+  "ldap": {
+    "url": "ldaps://test",
+    "user_base": "ou=people,ou=hpc,dc=test,dc=de",
+    "search_dn": "cn=hpcmonitoring,ou=roadm,ou=profile,ou=hpc,dc=test,dc=de",
+    "user_bind": "uid={username},ou=people,ou=hpc,dc=test,dc=de",
+    "user_filter": "(&(objectclass=posixAccount))"
+  },
+  "https-cert-file": "/etc/letsencrypt/live/url/fullchain.pem",
+  "https-key-file": "/etc/letsencrypt/live/url/privkey.pem",
+  "user": "clustercockpit",
+  "group": "clustercockpit",
  "archive": {
    "kind": "file",
    "path": "./var/job-archive"
  },
+  "validate": false,
+  "apiAllowedIPs": [
+    "*"
+  ],
  "clusters": [
    {
      "name": "test",
+      "metricDataRepository": {
+        "kind": "cc-metric-store",
+        "url": "http://localhost:8082",
+        "token": "eyJhbGciOiJF-E-pQBQ"
+      },
      "filterRanges": {
        "numNodes": {
          "from": 1,
@@ -44,6 +42,21 @@
        }
      }
    }
-  ]
+  ],
+  "jwts": {
+    "cookieName": "",
+    "validateUser": false,
+    "max-age": "2000h",
+    "trustedIssuer": ""
+  },
+  "enable-resampling": {
+    "trigger": 30,
+    "resolutions": [
+      600,
+      300,
+      120,
+      60
+    ]
+  },
+  "short-running-jobs-duration": 300
}
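The "user_bind" value above is a template: {username} is substituted before the LDAP bind is attempted. A minimal sketch of that substitution (the function name is ours; cc-backend's real LDAP code lives elsewhere):

package main

import (
	"fmt"
	"strings"
)

// bindDN expands the {username} placeholder of a user_bind template.
func bindDN(userBind, username string) string {
	return strings.Replace(userBind, "{username}", username, 1)
}

func main() {
	fmt.Println(bindDN("uid={username},ou=people,ou=hpc,dc=test,dc=de", "alice"))
	// Output: uid=alice,ou=people,ou=hpc,dc=test,dc=de
}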

View File

@@ -0,0 +1,12 @@
{
"clusters": [
{
"name": "fritz",
"default_metrics": "cpu_load, flops_any, core_power, lustre_open, mem_used, mem_bw, net_bytes_in"
},
{
"name": "alex",
"default_metrics": "flops_any, mem_bw, mem_used, vectorization_ratio"
}
]
}
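Each "default_metrics" entry above is a single comma-separated string rather than a JSON array, so any consumer has to split and trim it. A tiny sketch (helper name ours):

package main

import (
	"fmt"
	"strings"
)

// splitMetrics turns "cpu_load, flops_any" into ["cpu_load" "flops_any"].
func splitMetrics(s string) []string {
	parts := strings.Split(s, ",")
	for i, p := range parts {
		parts[i] = strings.TrimSpace(p)
	}
	return parts
}

func main() {
	fmt.Println(splitMetrics("flops_any, mem_bw, mem_used, vectorization_ratio"))
}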

View File

@@ -1,22 +0,0 @@
{
"cluster": "fritz",
"jobId": 123000,
"jobState": "running",
"numAcc": 0,
"numHwthreads": 72,
"numNodes": 1,
"partition": "main",
"requestedMemory": 128000,
"resources": [{ "hostname": "f0726" }],
"startTime": 1649723812,
"subCluster": "main",
"submitTime": 1649723812,
"user": "k106eb10",
"project": "k106eb",
"walltime": 86400,
"metaData": {
"slurmInfo": "JobId=398759\nJobName=myJob\nUserId=dummyUser\nGroupId=dummyGroup\nAccount=dummyAccount\nQOS=normal Requeue=False Restarts=0 BatchFlag=True\nTimeLimit=1439'\nSubmitTime=2023-02-09T14:10:18\nPartition=singlenode\nNodeList=xx\nNumNodes=xx NumCPUs=72 NumTasks=72 CPUs/Task=1\nNTasksPerNode:Socket:Core=0:None:None\nTRES_req=cpu=72,mem=250000M,node=1,billing=72\nTRES_alloc=cpu=72,node=1,billing=72\nCommand=myCmd\nWorkDir=myDir\nStdErr=\nStdOut=\n",
"jobScript": "#!/bin/bash -l\n#SBATCH --job-name=dummy_job\n#SBATCH --time=23:59:00\n#SBATCH --partition=singlenode\n#SBATCH --ntasks=72\n#SBATCH --hint=multithread\n#SBATCH --chdir=/home/atuin/k106eb/dummy/\n#SBATCH --export=NONE\nunset SLURM_EXPORT_ENV\n\n#This is a dummy job script\n./mybinary\n",
"jobName": "ams_pipeline"
}
}

View File

@@ -1,7 +0,0 @@
{
"cluster": "fritz",
"jobId": 123000,
"jobState": "completed",
"startTime": 1649723812,
"stopTime": 1649763839
}

View File

@@ -1,45 +0,0 @@
{
"jobList": {
"usePaging": false,
"showFootprint":false
},
"jobView": {
"showPolarPlot": true,
"showFootprint": true,
"showRoofline": true,
"showStatTable": true
},
"metricConfig": {
"jobListMetrics": ["mem_bw", "flops_dp"],
"jobViewPlotMetrics": ["mem_bw", "flops_dp"],
"jobViewTableMetrics": ["mem_bw", "flops_dp"],
"clusters": [
{
"name": "test",
"subClusters": [
{
"name": "one",
"jobListMetrics": ["mem_used", "flops_sp"]
}
]
}
]
},
"nodeList": {
"usePaging": true
},
"plotConfiguration": {
"plotsPerRow": 3,
"colorBackground": true,
"lineWidth": 3,
"colorScheme": [
"#00bfff",
"#0000ff",
"#ff00ff",
"#ff0000",
"#ff8000",
"#ffff00",
"#80ff00"
]
}
}

123
go.mod
View File

@@ -1,126 +1,91 @@
module github.com/ClusterCockpit/cc-backend

-go 1.24.0
+go 1.23.5

toolchain go1.24.1

-tool (
-	github.com/99designs/gqlgen
-	github.com/swaggo/swag/cmd/swag
-)
-
require (
-	github.com/99designs/gqlgen v0.17.84
-	github.com/ClusterCockpit/cc-lib v1.0.2
+	github.com/99designs/gqlgen v0.17.66
+	github.com/ClusterCockpit/cc-lib v0.3.0
	github.com/Masterminds/squirrel v1.5.4
-	github.com/aws/aws-sdk-go-v2 v1.41.0
-	github.com/aws/aws-sdk-go-v2/config v1.31.20
-	github.com/aws/aws-sdk-go-v2/credentials v1.18.24
-	github.com/aws/aws-sdk-go-v2/service/s3 v1.90.2
-	github.com/coreos/go-oidc/v3 v3.16.0
-	github.com/expr-lang/expr v1.17.6
-	github.com/go-co-op/gocron/v2 v2.18.2
-	github.com/go-ldap/ldap/v3 v3.4.12
-	github.com/golang-jwt/jwt/v5 v5.3.0
-	github.com/golang-migrate/migrate/v4 v4.19.1
+	github.com/coreos/go-oidc/v3 v3.12.0
+	github.com/expr-lang/expr v1.17.3
+	github.com/go-co-op/gocron/v2 v2.16.0
+	github.com/go-ldap/ldap/v3 v3.4.10
+	github.com/go-sql-driver/mysql v1.9.0
+	github.com/golang-jwt/jwt/v5 v5.2.2
+	github.com/golang-migrate/migrate/v4 v4.18.2
	github.com/google/gops v0.3.28
	github.com/gorilla/handlers v1.5.2
	github.com/gorilla/mux v1.8.1
	github.com/gorilla/sessions v1.4.0
-	github.com/influxdata/line-protocol/v2 v2.2.1
	github.com/jmoiron/sqlx v1.4.0
	github.com/joho/godotenv v1.5.1
-	github.com/linkedin/goavro/v2 v2.14.1
-	github.com/mattn/go-sqlite3 v1.14.32
-	github.com/nats-io/nats.go v1.47.0
-	github.com/prometheus/client_golang v1.23.2
-	github.com/prometheus/common v0.67.4
+	github.com/mattn/go-sqlite3 v1.14.24
+	github.com/prometheus/client_golang v1.22.0
+	github.com/prometheus/common v0.63.0
	github.com/qustavo/sqlhooks/v2 v2.1.0
	github.com/santhosh-tekuri/jsonschema/v5 v5.3.1
-	github.com/stretchr/testify v1.11.1
	github.com/swaggo/http-swagger v1.3.4
-	github.com/swaggo/swag v1.16.6
-	github.com/vektah/gqlparser/v2 v2.5.31
-	golang.org/x/crypto v0.45.0
-	golang.org/x/oauth2 v0.32.0
-	golang.org/x/time v0.14.0
+	github.com/swaggo/swag v1.16.4
+	github.com/vektah/gqlparser/v2 v2.5.22
+	golang.org/x/crypto v0.37.0
+	golang.org/x/oauth2 v0.27.0
+	golang.org/x/time v0.5.0
)

require (
-	filippo.io/edwards25519 v1.1.0 // indirect
	github.com/Azure/go-ntlmssp v0.0.0-20221128193559-754e69321358 // indirect
	github.com/KyleBanks/depth v1.2.1 // indirect
	github.com/agnivade/levenshtein v1.2.1 // indirect
-	github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.3 // indirect
-	github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.13 // indirect
-	github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.13 // indirect
-	github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.13 // indirect
-	github.com/aws/aws-sdk-go-v2/internal/ini v1.8.4 // indirect
-	github.com/aws/aws-sdk-go-v2/internal/v4a v1.4.13 // indirect
-	github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.3 // indirect
-	github.com/aws/aws-sdk-go-v2/service/internal/checksum v1.9.4 // indirect
-	github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.13 // indirect
-	github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.19.13 // indirect
-	github.com/aws/aws-sdk-go-v2/service/sso v1.30.3 // indirect
-	github.com/aws/aws-sdk-go-v2/service/ssooidc v1.35.7 // indirect
-	github.com/aws/aws-sdk-go-v2/service/sts v1.40.2 // indirect
-	github.com/aws/smithy-go v1.24.0 // indirect
	github.com/beorn7/perks v1.0.1 // indirect
	github.com/cespare/xxhash/v2 v2.3.0 // indirect
-	github.com/cpuguy83/go-md2man/v2 v2.0.7 // indirect
-	github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
+	github.com/cpuguy83/go-md2man/v2 v2.0.6 // indirect
	github.com/felixge/httpsnoop v1.0.4 // indirect
	github.com/fsnotify/fsnotify v1.9.0 // indirect
-	github.com/go-asn1-ber/asn1-ber v1.5.8-0.20250403174932-29230038a667 // indirect
-	github.com/go-jose/go-jose/v4 v4.1.3 // indirect
-	github.com/go-openapi/jsonpointer v0.22.3 // indirect
-	github.com/go-openapi/jsonreference v0.21.3 // indirect
-	github.com/go-openapi/spec v0.22.1 // indirect
-	github.com/go-openapi/swag/conv v0.25.4 // indirect
-	github.com/go-openapi/swag/jsonname v0.25.4 // indirect
-	github.com/go-openapi/swag/jsonutils v0.25.4 // indirect
-	github.com/go-openapi/swag/loading v0.25.4 // indirect
-	github.com/go-openapi/swag/stringutils v0.25.4 // indirect
-	github.com/go-openapi/swag/typeutils v0.25.4 // indirect
-	github.com/go-openapi/swag/yamlutils v0.25.4 // indirect
-	github.com/go-viper/mapstructure/v2 v2.4.0 // indirect
-	github.com/goccy/go-yaml v1.19.0 // indirect
-	github.com/golang/snappy v0.0.4 // indirect
+	github.com/go-asn1-ber/asn1-ber v1.5.7 // indirect
+	github.com/go-jose/go-jose/v4 v4.0.5 // indirect
+	github.com/go-openapi/jsonpointer v0.21.0 // indirect
+	github.com/go-openapi/jsonreference v0.21.0 // indirect
+	github.com/go-openapi/spec v0.21.0 // indirect
+	github.com/go-openapi/swag v0.23.0 // indirect
+	github.com/go-viper/mapstructure/v2 v2.2.1 // indirect
	github.com/google/uuid v1.6.0 // indirect
	github.com/gorilla/securecookie v1.1.2 // indirect
	github.com/gorilla/websocket v1.5.3 // indirect
-	github.com/hashicorp/errwrap v1.1.0 // indirect
-	github.com/hashicorp/go-multierror v1.1.1 // indirect
	github.com/hashicorp/golang-lru/v2 v2.0.7 // indirect
	github.com/jonboulle/clockwork v0.5.0 // indirect
+	github.com/josharian/intern v1.0.0 // indirect
	github.com/jpillora/backoff v1.0.0 // indirect
	github.com/json-iterator/go v1.1.12 // indirect
-	github.com/klauspost/compress v1.18.1 // indirect
	github.com/lann/builder v0.0.0-20180802200727-47ae307949d0 // indirect
	github.com/lann/ps v0.0.0-20150810152359-62de8c46ede0 // indirect
+	github.com/mailru/easyjson v0.9.0 // indirect
	github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
	github.com/modern-go/reflect2 v1.0.2 // indirect
	github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
	github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f // indirect
-	github.com/nats-io/nkeys v0.4.11 // indirect
-	github.com/nats-io/nuid v1.0.1 // indirect
-	github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect
	github.com/prometheus/client_model v0.6.2 // indirect
	github.com/prometheus/procfs v0.16.1 // indirect
	github.com/robfig/cron/v3 v3.0.1 // indirect
	github.com/russross/blackfriday/v2 v2.1.0 // indirect
	github.com/sosodev/duration v1.3.1 // indirect
-	github.com/stretchr/objx v0.5.2 // indirect
	github.com/swaggo/files v1.0.1 // indirect
-	github.com/urfave/cli/v2 v2.27.7 // indirect
-	github.com/urfave/cli/v3 v3.6.1 // indirect
-	github.com/xrash/smetrics v0.0.0-20250705151800-55b8f293f342 // indirect
-	go.yaml.in/yaml/v2 v2.4.3 // indirect
-	go.yaml.in/yaml/v3 v3.0.4 // indirect
-	golang.org/x/mod v0.30.0 // indirect
-	golang.org/x/net v0.47.0 // indirect
-	golang.org/x/sync v0.18.0 // indirect
-	golang.org/x/sys v0.38.0 // indirect
-	golang.org/x/text v0.31.0 // indirect
-	golang.org/x/tools v0.39.0 // indirect
-	google.golang.org/protobuf v1.36.10 // indirect
-	sigs.k8s.io/yaml v1.6.0 // indirect
+	github.com/urfave/cli/v2 v2.27.5 // indirect
+	github.com/xrash/smetrics v0.0.0-20240521201337-686a1a2994c1 // indirect
+	go.uber.org/atomic v1.11.0 // indirect
+	golang.org/x/exp v0.0.0-20250408133849-7e4ce0ab07d0 // indirect
+	golang.org/x/mod v0.24.0 // indirect
+	golang.org/x/net v0.39.0 // indirect
+	golang.org/x/sync v0.13.0 // indirect
+	golang.org/x/sys v0.32.0 // indirect
+	golang.org/x/text v0.24.0 // indirect
+	golang.org/x/tools v0.32.0 // indirect
+	google.golang.org/protobuf v1.36.6 // indirect
+	gopkg.in/yaml.v2 v2.4.0 // indirect
	gopkg.in/yaml.v3 v3.0.1 // indirect
-	sigs.k8s.io/yaml v1.6.0 // indirect
+	sigs.k8s.io/yaml v1.4.0 // indirect
)

380
go.sum
View File

@@ -1,143 +1,94 @@
[go.sum: ~380 changed checksum lines mirroring the go.mod changes above — paired h1:/go.mod hash entries for every added, removed, or re-pinned module. The full hash listing adds nothing beyond go.mod and is omitted.]
github.com/google/go-cmp v0.7.0 h1:wk8382ETsv4JYUZwIsn6YpYiWiBsYLSJiTsyBybVuN8= github.com/google/go-cmp v0.7.0 h1:wk8382ETsv4JYUZwIsn6YpYiWiBsYLSJiTsyBybVuN8=
github.com/google/go-cmp v0.7.0/go.mod h1:pXiqmnSA92OHEEa9HXL2W4E7lf9JzCmGVUdgjX3N/iU= github.com/google/go-cmp v0.7.0/go.mod h1:pXiqmnSA92OHEEa9HXL2W4E7lf9JzCmGVUdgjX3N/iU=
github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg= github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
@@ -151,27 +102,24 @@ github.com/gorilla/handlers v1.5.2 h1:cLTUSsNkgcwhgRqvCNmdbRWG0A3N4F+M2nWKdScwyE
github.com/gorilla/handlers v1.5.2/go.mod h1:dX+xVpaxdSw+q0Qek8SSsl3dfMk3jNddUkMzo0GtH0w= github.com/gorilla/handlers v1.5.2/go.mod h1:dX+xVpaxdSw+q0Qek8SSsl3dfMk3jNddUkMzo0GtH0w=
github.com/gorilla/mux v1.8.1 h1:TuBL49tXwgrFYWhqrNgrUNEY92u81SPhu7sTdzQEiWY= github.com/gorilla/mux v1.8.1 h1:TuBL49tXwgrFYWhqrNgrUNEY92u81SPhu7sTdzQEiWY=
github.com/gorilla/mux v1.8.1/go.mod h1:AKf9I4AEqPTmMytcMc0KkNouC66V3BtZ4qD5fmWSiMQ= github.com/gorilla/mux v1.8.1/go.mod h1:AKf9I4AEqPTmMytcMc0KkNouC66V3BtZ4qD5fmWSiMQ=
github.com/gorilla/securecookie v1.1.1/go.mod h1:ra0sb63/xPlUeL+yeDciTfxMRAA+MP+HVt/4epWDjd4=
github.com/gorilla/securecookie v1.1.2 h1:YCIWL56dvtr73r6715mJs5ZvhtnY73hBvEF8kXD8ePA= github.com/gorilla/securecookie v1.1.2 h1:YCIWL56dvtr73r6715mJs5ZvhtnY73hBvEF8kXD8ePA=
github.com/gorilla/securecookie v1.1.2/go.mod h1:NfCASbcHqRSY+3a8tlWJwsQap2VX5pwzwo4h3eOamfo= github.com/gorilla/securecookie v1.1.2/go.mod h1:NfCASbcHqRSY+3a8tlWJwsQap2VX5pwzwo4h3eOamfo=
github.com/gorilla/sessions v1.2.1/go.mod h1:dk2InVEVJ0sfLlnXv9EAgkf6ecYs/i80K/zI+bUmuGM=
github.com/gorilla/sessions v1.4.0 h1:kpIYOp/oi6MG/p5PgxApU8srsSw9tuFbt46Lt7auzqQ= github.com/gorilla/sessions v1.4.0 h1:kpIYOp/oi6MG/p5PgxApU8srsSw9tuFbt46Lt7auzqQ=
github.com/gorilla/sessions v1.4.0/go.mod h1:FLWm50oby91+hl7p/wRxDth9bWSuk0qVL2emc7lT5ik= github.com/gorilla/sessions v1.4.0/go.mod h1:FLWm50oby91+hl7p/wRxDth9bWSuk0qVL2emc7lT5ik=
github.com/gorilla/websocket v1.5.3 h1:saDtZ6Pbx/0u+bgYQ3q96pZgCzfhKXGPqt7kZ72aNNg= github.com/gorilla/websocket v1.5.3 h1:saDtZ6Pbx/0u+bgYQ3q96pZgCzfhKXGPqt7kZ72aNNg=
github.com/gorilla/websocket v1.5.3/go.mod h1:YR8l580nyteQvAITg2hZ9XVh4b55+EU/adAjf1fMHhE= github.com/gorilla/websocket v1.5.3/go.mod h1:YR8l580nyteQvAITg2hZ9XVh4b55+EU/adAjf1fMHhE=
github.com/hashicorp/errwrap v1.0.0/go.mod h1:YH+1FKiLXxHSkmPseP+kNlulaMuP3n2brvKWEqk/Jc4=
github.com/hashicorp/errwrap v1.1.0 h1:OxrOeh75EUXMY8TBjag2fzXGZ40LB6IKw45YeGUDY2I=
github.com/hashicorp/errwrap v1.1.0/go.mod h1:YH+1FKiLXxHSkmPseP+kNlulaMuP3n2brvKWEqk/Jc4=
github.com/hashicorp/go-multierror v1.1.1 h1:H5DkEtf6CXdFp0N0Em5UCwQpXMWke8IA0+lD48awMYo=
github.com/hashicorp/go-multierror v1.1.1/go.mod h1:iw975J/qwKPdAO1clOe2L8331t/9/fmwbPZ6JB6eMoM=
github.com/hashicorp/go-uuid v1.0.2/go.mod h1:6SBZvOh/SIDV7/2o3Jml5SYk/TvGqwFJ/bN7x4byOro=
github.com/hashicorp/go-uuid v1.0.3 h1:2gKiV6YVmrJ1i2CKKa9obLvRieoRGviZFL26PcT/Co8= github.com/hashicorp/go-uuid v1.0.3 h1:2gKiV6YVmrJ1i2CKKa9obLvRieoRGviZFL26PcT/Co8=
github.com/hashicorp/go-uuid v1.0.3/go.mod h1:6SBZvOh/SIDV7/2o3Jml5SYk/TvGqwFJ/bN7x4byOro= github.com/hashicorp/go-uuid v1.0.3/go.mod h1:6SBZvOh/SIDV7/2o3Jml5SYk/TvGqwFJ/bN7x4byOro=
github.com/hashicorp/golang-lru/v2 v2.0.7 h1:a+bsQ5rvGLjzHuww6tVxozPZFVghXaHOwFs4luLUK2k= github.com/hashicorp/golang-lru/v2 v2.0.7 h1:a+bsQ5rvGLjzHuww6tVxozPZFVghXaHOwFs4luLUK2k=
github.com/hashicorp/golang-lru/v2 v2.0.7/go.mod h1:QeFd9opnmA6QUJc5vARoKUSoFhyfM2/ZepoAG6RGpeM= github.com/hashicorp/golang-lru/v2 v2.0.7/go.mod h1:QeFd9opnmA6QUJc5vARoKUSoFhyfM2/ZepoAG6RGpeM=
github.com/influxdata/influxdb-client-go/v2 v2.14.0 h1:AjbBfJuq+QoaXNcrova8smSjwJdUHnwvfjMF71M1iI4=
github.com/influxdata/influxdb-client-go/v2 v2.14.0/go.mod h1:Ahpm3QXKMJslpXl3IftVLVezreAUtBOTZssDrjZEFHI=
github.com/influxdata/line-protocol v0.0.0-20210922203350-b1ad95c89adf h1:7JTmneyiNEwVBOHSjoMxiWAqB992atOeepeFYegn5RU=
github.com/influxdata/line-protocol v0.0.0-20210922203350-b1ad95c89adf/go.mod h1:xaLFMmpvUxqXtVkUJfg9QmT88cDaCJ3ZKgdZ78oO8Qo=
github.com/influxdata/line-protocol-corpus v0.0.0-20210519164801-ca6fa5da0184/go.mod h1:03nmhxzZ7Xk2pdG+lmMd7mHDfeVOYFyhOgwO61qWU98=
github.com/influxdata/line-protocol-corpus v0.0.0-20210922080147-aa28ccfb8937 h1:MHJNQ+p99hFATQm6ORoLmpUCF7ovjwEFshs/NHzAbig=
github.com/influxdata/line-protocol-corpus v0.0.0-20210922080147-aa28ccfb8937/go.mod h1:BKR9c0uHSmRgM/se9JhFHtTT7JTO67X23MtKMHtZcpo=
github.com/influxdata/line-protocol/v2 v2.0.0-20210312151457-c52fdecb625a/go.mod h1:6+9Xt5Sq1rWx+glMgxhcg2c0DUaehK+5TDcPZ76GypY=
github.com/influxdata/line-protocol/v2 v2.1.0/go.mod h1:QKw43hdUBg3GTk2iC3iyCxksNj7PX9aUSeYOYE/ceHY=
github.com/influxdata/line-protocol/v2 v2.2.1 h1:EAPkqJ9Km4uAxtMRgUubJyqAr6zgWM0dznKMLRauQRE=
github.com/influxdata/line-protocol/v2 v2.2.1/go.mod h1:DmB3Cnh+3oxmG6LOBIxce4oaL4CPj3OmMPgvauXh+tM=
github.com/jcmturner/aescts/v2 v2.0.0 h1:9YKLH6ey7H4eDBXW8khjYslgyqG2xZikXP0EQFKrle8= github.com/jcmturner/aescts/v2 v2.0.0 h1:9YKLH6ey7H4eDBXW8khjYslgyqG2xZikXP0EQFKrle8=
github.com/jcmturner/aescts/v2 v2.0.0/go.mod h1:AiaICIRyfYg35RUkr8yESTqvSy7csK90qZ5xfvvsoNs= github.com/jcmturner/aescts/v2 v2.0.0/go.mod h1:AiaICIRyfYg35RUkr8yESTqvSy7csK90qZ5xfvvsoNs=
github.com/jcmturner/dnsutils/v2 v2.0.0 h1:lltnkeZGL0wILNvrNiVCR6Ro5PGU/SeBvVO/8c/iPbo= github.com/jcmturner/dnsutils/v2 v2.0.0 h1:lltnkeZGL0wILNvrNiVCR6Ro5PGU/SeBvVO/8c/iPbo=
@@ -190,17 +138,14 @@ github.com/joho/godotenv v1.5.1 h1:7eLL/+HRGLY0ldzfGMeQkb7vMd0as4CfYvUVzLqw0N0=
github.com/joho/godotenv v1.5.1/go.mod h1:f4LDr5Voq0i2e/R5DDNOoa2zzDfwtkZa6DnEwAbqwq4= github.com/joho/godotenv v1.5.1/go.mod h1:f4LDr5Voq0i2e/R5DDNOoa2zzDfwtkZa6DnEwAbqwq4=
github.com/jonboulle/clockwork v0.5.0 h1:Hyh9A8u51kptdkR+cqRpT1EebBwTn1oK9YfGYbdFz6I= github.com/jonboulle/clockwork v0.5.0 h1:Hyh9A8u51kptdkR+cqRpT1EebBwTn1oK9YfGYbdFz6I=
github.com/jonboulle/clockwork v0.5.0/go.mod h1:3mZlmanh0g2NDKO5TWZVJAfofYk64M7XN3SzBPjZF60= github.com/jonboulle/clockwork v0.5.0/go.mod h1:3mZlmanh0g2NDKO5TWZVJAfofYk64M7XN3SzBPjZF60=
github.com/josharian/intern v1.0.0 h1:vlS4z54oSdjm0bgjRigI+G1HpF+tI+9rE5LLzOg8HmY=
github.com/josharian/intern v1.0.0/go.mod h1:5DoeVV0s6jJacbCEi61lwdGj/aVlrQvzHFFd8Hwg//Y=
github.com/jpillora/backoff v1.0.0 h1:uvFg412JmmHBHw7iwprIxkPMI+sGQ4kzOWsMeHnm2EA= github.com/jpillora/backoff v1.0.0 h1:uvFg412JmmHBHw7iwprIxkPMI+sGQ4kzOWsMeHnm2EA=
github.com/jpillora/backoff v1.0.0/go.mod h1:J/6gKK9jxlEcS3zixgDgUAsiuZ7yrSoa/FX5e0EB2j4= github.com/jpillora/backoff v1.0.0/go.mod h1:J/6gKK9jxlEcS3zixgDgUAsiuZ7yrSoa/FX5e0EB2j4=
github.com/json-iterator/go v1.1.12 h1:PV8peI4a0ysnczrg+LtxykD8LfKY9ML6u2jnxaEnrnM= github.com/json-iterator/go v1.1.12 h1:PV8peI4a0ysnczrg+LtxykD8LfKY9ML6u2jnxaEnrnM=
github.com/json-iterator/go v1.1.12/go.mod h1:e30LSqwooZae/UwlEbR2852Gd8hjQvJoHmT4TnhNGBo= github.com/json-iterator/go v1.1.12/go.mod h1:e30LSqwooZae/UwlEbR2852Gd8hjQvJoHmT4TnhNGBo=
github.com/klauspost/compress v1.18.1 h1:bcSGx7UbpBqMChDtsF28Lw6v/G94LPrrbMbdC3JH2co=
github.com/klauspost/compress v1.18.1/go.mod h1:ZQFFVG+MdnR0P+l6wpXgIL4NTtwiKIdBnrBd8Nrxr+0=
github.com/kr/pretty v0.2.1/go.mod h1:ipq/a2n7PKx3OHsz4KJII5eveXtPO4qwEXGdVfWzfnI=
github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE= github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE=
github.com/kr/pretty v0.3.1/go.mod h1:hoEshYVHaxMs3cyo3Yncou5ZscifuDolrwPKZanG3xk= github.com/kr/pretty v0.3.1/go.mod h1:hoEshYVHaxMs3cyo3Yncou5ZscifuDolrwPKZanG3xk=
github.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ=
github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI=
github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY= github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE= github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
github.com/lann/builder v0.0.0-20180802200727-47ae307949d0 h1:SOEGU9fKiNWd/HOJuq6+3iTQz8KNCLtVX6idSoTLdUw= github.com/lann/builder v0.0.0-20180802200727-47ae307949d0 h1:SOEGU9fKiNWd/HOJuq6+3iTQz8KNCLtVX6idSoTLdUw=
@@ -210,48 +155,50 @@ github.com/lann/ps v0.0.0-20150810152359-62de8c46ede0/go.mod h1:vmVJ0l/dxyfGW6Fm
github.com/lib/pq v1.2.0/go.mod h1:5WUZQaWbwv1U+lTReE5YruASi9Al49XbQIvNi/34Woo= github.com/lib/pq v1.2.0/go.mod h1:5WUZQaWbwv1U+lTReE5YruASi9Al49XbQIvNi/34Woo=
github.com/lib/pq v1.10.9 h1:YXG7RB+JIjhP29X+OtkiDnYaXQwpS4JEWq7dtCCRUEw= github.com/lib/pq v1.10.9 h1:YXG7RB+JIjhP29X+OtkiDnYaXQwpS4JEWq7dtCCRUEw=
github.com/lib/pq v1.10.9/go.mod h1:AlVN5x4E4T544tWzH6hKfbfQvm3HdbOxrmggDNAPY9o= github.com/lib/pq v1.10.9/go.mod h1:AlVN5x4E4T544tWzH6hKfbfQvm3HdbOxrmggDNAPY9o=
github.com/linkedin/goavro/v2 v2.14.1 h1:/8VjDpd38PRsy02JS0jflAu7JZPfJcGTwqWgMkFS2iI= github.com/mailru/easyjson v0.9.0 h1:PrnmzHw7262yW8sTBwxi1PdJA3Iw/EKBa8psRf7d9a4=
github.com/linkedin/goavro/v2 v2.14.1/go.mod h1:KXx+erlq+RPlGSPmLF7xGo6SAbh8sCQ53x064+ioxhk= github.com/mailru/easyjson v0.9.0/go.mod h1:1+xMtQp2MRNVL/V1bOzuP3aP8VNwRW55fQUto+XFtTU=
github.com/mattn/go-sqlite3 v1.10.0/go.mod h1:FPy6KqzDD04eiIsT53CuJW3U88zkxoIYsOqkbpncsNc= github.com/mattn/go-sqlite3 v1.10.0/go.mod h1:FPy6KqzDD04eiIsT53CuJW3U88zkxoIYsOqkbpncsNc=
github.com/mattn/go-sqlite3 v1.14.22/go.mod h1:Uh1q+B4BYcTPb+yiD3kU8Ct7aC0hY9fxUwlHK0RXw+Y= github.com/mattn/go-sqlite3 v1.14.22/go.mod h1:Uh1q+B4BYcTPb+yiD3kU8Ct7aC0hY9fxUwlHK0RXw+Y=
github.com/mattn/go-sqlite3 v1.14.32 h1:JD12Ag3oLy1zQA+BNn74xRgaBbdhbNIDYvQUEuuErjs= github.com/mattn/go-sqlite3 v1.14.24 h1:tpSp2G2KyMnnQu99ngJ47EIkWVmliIizyZBfPrBWDRM=
github.com/mattn/go-sqlite3 v1.14.32/go.mod h1:Uh1q+B4BYcTPb+yiD3kU8Ct7aC0hY9fxUwlHK0RXw+Y= github.com/mattn/go-sqlite3 v1.14.24/go.mod h1:Uh1q+B4BYcTPb+yiD3kU8Ct7aC0hY9fxUwlHK0RXw+Y=
github.com/moby/docker-image-spec v1.3.1 h1:jMKff3w6PgbfSa69GfNg+zN/XLhfXJGnEx3Nl2EsFP0=
github.com/moby/docker-image-spec v1.3.1/go.mod h1:eKmb5VW8vQEh/BAr2yvVNvuiJuY6UIocYsFu/DxxRpo=
github.com/moby/term v0.5.0 h1:xt8Q1nalod/v7BqbG21f8mQPqH+xAaC9C3N3wfWbVP0=
github.com/moby/term v0.5.0/go.mod h1:8FzsFHVUBGZdbDsJw/ot+X+d5HLUbvklYLJ9uGfcI3Y=
github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q= github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q=
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd h1:TRLaZ9cD/w8PVh93nsPXa1VrQ6jlwL5oN8l14QlcNfg= github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd h1:TRLaZ9cD/w8PVh93nsPXa1VrQ6jlwL5oN8l14QlcNfg=
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q= github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q=
github.com/modern-go/reflect2 v1.0.2 h1:xBagoLtFs94CBntxluKeaWgTMpvLxC4ur3nMaC9Gz0M= github.com/modern-go/reflect2 v1.0.2 h1:xBagoLtFs94CBntxluKeaWgTMpvLxC4ur3nMaC9Gz0M=
github.com/modern-go/reflect2 v1.0.2/go.mod h1:yWuevngMOJpCy52FWWMvUC8ws7m/LJsjYzDa0/r8luk= github.com/modern-go/reflect2 v1.0.2/go.mod h1:yWuevngMOJpCy52FWWMvUC8ws7m/LJsjYzDa0/r8luk=
github.com/morikuni/aec v1.0.0 h1:nP9CBfwrvYnBRgY6qfDQkygYDmYwOilePFkwzv4dU8A=
github.com/morikuni/aec v1.0.0/go.mod h1:BbKIizmSmc5MMPqRYbxO4ZU0S0+P200+tUnFx7PXmsc=
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 h1:C3w9PqII01/Oq1c1nUAm88MOHcQC9l5mIlSMApZMrHA= github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 h1:C3w9PqII01/Oq1c1nUAm88MOHcQC9l5mIlSMApZMrHA=
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822/go.mod h1:+n7T8mK8HuQTcFwEeznm/DIxMOiR9yIdICNftLE1DvQ= github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822/go.mod h1:+n7T8mK8HuQTcFwEeznm/DIxMOiR9yIdICNftLE1DvQ=
github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f h1:KUppIJq7/+SVif2QVs3tOP0zanoHgBEVAwHxUSIzRqU= github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f h1:KUppIJq7/+SVif2QVs3tOP0zanoHgBEVAwHxUSIzRqU=
github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f/go.mod h1:qRWi+5nqEBWmkhHvq77mSJWrCKwh8bxhgT7d/eI7P4U= github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f/go.mod h1:qRWi+5nqEBWmkhHvq77mSJWrCKwh8bxhgT7d/eI7P4U=
github.com/nats-io/nats.go v1.47.0 h1:YQdADw6J/UfGUd2Oy6tn4Hq6YHxCaJrVKayxxFqYrgM= github.com/opencontainers/go-digest v1.0.0 h1:apOUWs51W5PlhuyGyz9FCeeBIOUDA/6nW8Oi/yOhh5U=
github.com/nats-io/nats.go v1.47.0/go.mod h1:iRWIPokVIFbVijxuMQq4y9ttaBTMe0SFdlZfMDd+33g= github.com/opencontainers/go-digest v1.0.0/go.mod h1:0JzlMkj0TRzQZfJkVvzbP0HBR3IKzErnv2BNG4W4MAM=
github.com/nats-io/nkeys v0.4.11 h1:q44qGV008kYd9W1b1nEBkNzvnWxtRSQ7A8BoqRrcfa0= github.com/opencontainers/image-spec v1.1.0 h1:8SG7/vwALn54lVB/0yZ/MMwhFrPYtpEHQb2IpWsCzug=
github.com/nats-io/nkeys v0.4.11/go.mod h1:szDimtgmfOi9n25JpfIdGw12tZFYXqhGxjhVxsatHVE= github.com/opencontainers/image-spec v1.1.0/go.mod h1:W4s4sFTMaBeK1BQLXbG4AdM2szdn85PY75RI83NrTrM=
github.com/nats-io/nuid v1.0.1 h1:5iA8DT8V7q8WK2EScv2padNa/rTESc1KdnPw4TC2paw=
github.com/nats-io/nuid v1.0.1/go.mod h1:19wcPz3Ph3q0Jbyiqsd0kePYG7A95tJPxeL+1OSON2c=
github.com/niemeyer/pretty v0.0.0-20200227124842-a10e7caefd8e/go.mod h1:zD1mROLANZcx1PVRCS0qkT7pwLkGfwJo4zjcN/Tysno=
github.com/oapi-codegen/runtime v1.1.1 h1:EXLHh0DXIJnWhdRPN2w4MXAzFyE4CskzhNLUmtpMYro=
github.com/oapi-codegen/runtime v1.1.1/go.mod h1:SK9X900oXmPWilYR5/WKPzt3Kqxn/uS/+lbpREv+eCg=
github.com/opentracing/opentracing-go v1.1.0/go.mod h1:UkNAQd3GIcIGf0SeVgPpRdFStlNbqXla1AfSYxPUl2o= github.com/opentracing/opentracing-go v1.1.0/go.mod h1:UkNAQd3GIcIGf0SeVgPpRdFStlNbqXla1AfSYxPUl2o=
github.com/pkg/errors v0.9.1 h1:FEBLx1zS214owpjy7qsBeixbURkuhQAwrK5UwLGTwt4=
github.com/pkg/errors v0.9.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 h1:Jamvg5psRIccs7FGNTlIRMkT8wgtp5eCXdBlqhYGL6U= github.com/prometheus/client_golang v1.22.0 h1:rb93p9lokFEsctTys46VnV1kLCDpVZ0a/Y92Vm0Zc6Q=
github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= github.com/prometheus/client_golang v1.22.0/go.mod h1:R7ljNsLXhuQXYZYtw6GAE9AZg8Y7vEW5scdCXrWRXC0=
github.com/prometheus/client_golang v1.23.2 h1:Je96obch5RDVy3FDMndoUsjAhG5Edi49h0RJWRi/o0o=
github.com/prometheus/client_golang v1.23.2/go.mod h1:Tb1a6LWHB3/SPIzCoaDXI4I8UHKeFTEQ1YCr+0Gyqmg=
github.com/prometheus/client_model v0.6.2 h1:oBsgwpGs7iVziMvrGhE53c/GrLUsZdHnqNwqPLxwZyk= github.com/prometheus/client_model v0.6.2 h1:oBsgwpGs7iVziMvrGhE53c/GrLUsZdHnqNwqPLxwZyk=
github.com/prometheus/client_model v0.6.2/go.mod h1:y3m2F6Gdpfy6Ut/GBsUqTWZqCUvMVzSfMLjcu6wAwpE= github.com/prometheus/client_model v0.6.2/go.mod h1:y3m2F6Gdpfy6Ut/GBsUqTWZqCUvMVzSfMLjcu6wAwpE=
github.com/prometheus/common v0.67.4 h1:yR3NqWO1/UyO1w2PhUvXlGQs/PtFmoveVO0KZ4+Lvsc= github.com/prometheus/common v0.63.0 h1:YR/EIY1o3mEFP/kZCD7iDMnLPlGyuU2Gb3HIcXnA98k=
github.com/prometheus/common v0.67.4/go.mod h1:gP0fq6YjjNCLssJCQp0yk4M8W6ikLURwkdd/YKtTbyI= github.com/prometheus/common v0.63.0/go.mod h1:VVFF/fBIoToEnWRVkYoXEkq3R3paCoxG9PXP74SnV18=
github.com/prometheus/procfs v0.16.1 h1:hZ15bTNuirocR6u0JZ6BAHHmwS1p8B4P6MRqxtzMyRg= github.com/prometheus/procfs v0.16.1 h1:hZ15bTNuirocR6u0JZ6BAHHmwS1p8B4P6MRqxtzMyRg=
github.com/prometheus/procfs v0.16.1/go.mod h1:teAbpZRB1iIAJYREa1LsoWUXykVXA1KlTmWl8x/U+Is= github.com/prometheus/procfs v0.16.1/go.mod h1:teAbpZRB1iIAJYREa1LsoWUXykVXA1KlTmWl8x/U+Is=
github.com/qustavo/sqlhooks/v2 v2.1.0 h1:54yBemHnGHp/7xgT+pxwmIlMSDNYKx5JW5dfRAiCZi0= github.com/qustavo/sqlhooks/v2 v2.1.0 h1:54yBemHnGHp/7xgT+pxwmIlMSDNYKx5JW5dfRAiCZi0=
github.com/qustavo/sqlhooks/v2 v2.1.0/go.mod h1:aMREyKo7fOKTwiLuWPsaHRXEmtqG4yREztO0idF83AU= github.com/qustavo/sqlhooks/v2 v2.1.0/go.mod h1:aMREyKo7fOKTwiLuWPsaHRXEmtqG4yREztO0idF83AU=
github.com/robfig/cron/v3 v3.0.1 h1:WdRxkvbJztn8LMz/QEvLN5sBU+xKpSqwwUO1Pjr4qDs= github.com/robfig/cron/v3 v3.0.1 h1:WdRxkvbJztn8LMz/QEvLN5sBU+xKpSqwwUO1Pjr4qDs=
github.com/robfig/cron/v3 v3.0.1/go.mod h1:eQICP3HwyT7UooqI/z+Ov+PtYAWygg1TEWWzGIFLtro= github.com/robfig/cron/v3 v3.0.1/go.mod h1:eQICP3HwyT7UooqI/z+Ov+PtYAWygg1TEWWzGIFLtro=
github.com/rogpeppe/go-internal v1.14.1 h1:UQB4HGPB6osV0SQTLymcB4TgvyWu6ZyliaW0tI/otEQ= github.com/rogpeppe/go-internal v1.12.0 h1:exVL4IDcn6na9z1rAb56Vxr+CgyK3nn3O+epU5NdKM8=
github.com/rogpeppe/go-internal v1.14.1/go.mod h1:MaRKkUm5W0goXpeCfT7UZI6fk/L7L7so1lCWt35ZSgc= github.com/rogpeppe/go-internal v1.12.0/go.mod h1:E+RYuTGaKKdloAfM02xzb0FW3Paa99yedzYV+kq4uf4=
github.com/russross/blackfriday/v2 v2.1.0 h1:JIOH55/0cWyOuilr9/qlrm0BSXldqnqwMsf35Ld67mk= github.com/russross/blackfriday/v2 v2.1.0 h1:JIOH55/0cWyOuilr9/qlrm0BSXldqnqwMsf35Ld67mk=
github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM= github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/santhosh-tekuri/jsonschema/v5 v5.3.1 h1:lZUw3E0/J3roVtGQ+SCrUrg3ON6NgVqpn3+iol9aGu4= github.com/santhosh-tekuri/jsonschema/v5 v5.3.1 h1:lZUw3E0/J3roVtGQ+SCrUrg3ON6NgVqpn3+iol9aGu4=
@@ -262,93 +209,136 @@ github.com/sosodev/duration v1.3.1 h1:qtHBDMQ6lvMQsL15g4aopM4HEfOaYuhWBw3NPTtlqq
github.com/sosodev/duration v1.3.1/go.mod h1:RQIBBX0+fMLc/D9+Jb/fwvVmo0eZvDDEERAikUR6SDg= github.com/sosodev/duration v1.3.1/go.mod h1:RQIBBX0+fMLc/D9+Jb/fwvVmo0eZvDDEERAikUR6SDg=
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME= github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/objx v0.4.0/go.mod h1:YvHI0jy2hoMjB+UWwv71VJQ9isScKT/TqJzVSSt89Yw= github.com/stretchr/objx v0.4.0/go.mod h1:YvHI0jy2hoMjB+UWwv71VJQ9isScKT/TqJzVSSt89Yw=
github.com/stretchr/objx v0.5.2 h1:xuMeJ0Sdp5ZMRXx/aWO6RZxdr3beISkG5/G/aIRr3pY= github.com/stretchr/objx v0.5.0/go.mod h1:Yh+to48EsGEfYuaHDzXPcE3xhTkx73EhmCGUpEOglKo=
github.com/stretchr/objx v0.5.2/go.mod h1:FRsXN1f5AsAjCGJKqEizvkpNtU+EGNCLh3NxZ/8L+MA=
github.com/stretchr/testify v1.2.2/go.mod h1:a8OnRcib4nhh0OaRAV+Yts87kKdq0PP7pXfy6kDkUVs= github.com/stretchr/testify v1.2.2/go.mod h1:a8OnRcib4nhh0OaRAV+Yts87kKdq0PP7pXfy6kDkUVs=
github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI= github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI=
github.com/stretchr/testify v1.4.0/go.mod h1:j7eGeouHqKxXV5pUuKE4zz7dFj8WfuZ+81PSLYec5m4= github.com/stretchr/testify v1.4.0/go.mod h1:j7eGeouHqKxXV5pUuKE4zz7dFj8WfuZ+81PSLYec5m4=
github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg= github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.7.5/go.mod h1:yNjHg4UonilssWZ8iaSj1OCr/vHnekPRkoO+kdMU+MU= github.com/stretchr/testify v1.8.0/go.mod h1:yNjHg4UonilssWZ8iaSj1OCr/vHnekPRkoO+kdMU+MU=
github.com/stretchr/testify v1.11.1 h1:7s2iGBzp5EwR7/aIZr8ao5+dra3wiQyKjjFuvgVKu7U= github.com/stretchr/testify v1.8.1/go.mod h1:w2LPCIKwWwSfY2zedu0+kehJoqGctiVI29o6fzry7u4=
github.com/stretchr/testify v1.11.1/go.mod h1:wZwfW3scLgRK+23gO65QZefKpKQRnfz6sD981Nm4B6U= github.com/stretchr/testify v1.10.0 h1:Xv5erBjTwe/5IxqUQTdXv5kgmIvbHo3QQyRwhJsOfJA=
github.com/stretchr/testify v1.10.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
github.com/swaggo/files v1.0.1 h1:J1bVJ4XHZNq0I46UU90611i9/YzdrF7x92oX1ig5IdE= github.com/swaggo/files v1.0.1 h1:J1bVJ4XHZNq0I46UU90611i9/YzdrF7x92oX1ig5IdE=
github.com/swaggo/files v1.0.1/go.mod h1:0qXmMNH6sXNf+73t65aKeB+ApmgxdnkQzVTAj2uaMUg= github.com/swaggo/files v1.0.1/go.mod h1:0qXmMNH6sXNf+73t65aKeB+ApmgxdnkQzVTAj2uaMUg=
github.com/swaggo/http-swagger v1.3.4 h1:q7t/XLx0n15H1Q9/tk3Y9L4n210XzJF5WtnDX64a5ww= github.com/swaggo/http-swagger v1.3.4 h1:q7t/XLx0n15H1Q9/tk3Y9L4n210XzJF5WtnDX64a5ww=
github.com/swaggo/http-swagger v1.3.4/go.mod h1:9dAh0unqMBAlbp1uE2Uc2mQTxNMU/ha4UbucIg1MFkQ= github.com/swaggo/http-swagger v1.3.4/go.mod h1:9dAh0unqMBAlbp1uE2Uc2mQTxNMU/ha4UbucIg1MFkQ=
github.com/swaggo/swag v1.16.6 h1:qBNcx53ZaX+M5dxVyTrgQ0PJ/ACK+NzhwcbieTt+9yI= github.com/swaggo/swag v1.16.4 h1:clWJtd9LStiG3VeijiCfOVODP6VpHtKdQy9ELFG3s1A=
github.com/swaggo/swag v1.16.6/go.mod h1:ngP2etMK5a0P3QBizic5MEwpRmluJZPHjXcMoj4Xesg= github.com/swaggo/swag v1.16.4/go.mod h1:VBsHJRsDvfYvqoiMKnsdwhNV9LEMHgEDZcyVYX0sxPg=
github.com/urfave/cli/v2 v2.27.7 h1:bH59vdhbjLv3LAvIu6gd0usJHgoTTPhCFib8qqOwXYU= github.com/urfave/cli/v2 v2.27.5 h1:WoHEJLdsXr6dDWoJgMq/CboDmyY/8HMMH1fTECbih+w=
github.com/urfave/cli/v2 v2.27.7/go.mod h1:CyNAG/xg+iAOg0N4MPGZqVmv2rCoP267496AOXUZjA4= github.com/urfave/cli/v2 v2.27.5/go.mod h1:3Sevf16NykTbInEnD0yKkjDAeZDS0A6bzhBH5hrMvTQ=
github.com/urfave/cli/v3 v3.6.1 h1:j8Qq8NyUawj/7rTYdBGrxcH7A/j7/G8Q5LhWEW4G3Mo= github.com/vektah/gqlparser/v2 v2.5.22 h1:yaaeJ0fu+nv1vUMW0Hl+aS1eiv1vMfapBNjpffAda1I=
github.com/urfave/cli/v3 v3.6.1/go.mod h1:ysVLtOEmg2tOy6PknnYVhDoouyC/6N42TMeoMzskhso= github.com/vektah/gqlparser/v2 v2.5.22/go.mod h1:xMl+ta8a5M1Yo1A1Iwt/k7gSpscwSnHZdw7tfhEGfTM=
github.com/vektah/gqlparser/v2 v2.5.31 h1:YhWGA1mfTjID7qJhd1+Vxhpk5HTgydrGU9IgkWBTJ7k= github.com/xrash/smetrics v0.0.0-20240521201337-686a1a2994c1 h1:gEOO8jv9F4OT7lGCjxCBTO/36wtF6j2nSip77qHd4x4=
github.com/vektah/gqlparser/v2 v2.5.31/go.mod h1:c1I28gSOVNzlfc4WuDlqU7voQnsqI6OG2amkBAFmgts= github.com/xrash/smetrics v0.0.0-20240521201337-686a1a2994c1/go.mod h1:Ohn+xnUBiLI6FVj/9LpzZWtj1/D6lUovWYBkxHVV3aM=
github.com/xrash/smetrics v0.0.0-20250705151800-55b8f293f342 h1:FnBeRrxr7OU4VvAzt5X7s6266i6cSVkkFPS0TuXWbIg=
github.com/xrash/smetrics v0.0.0-20250705151800-55b8f293f342/go.mod h1:Ohn+xnUBiLI6FVj/9LpzZWtj1/D6lUovWYBkxHVV3aM=
github.com/yuin/goldmark v1.4.13/go.mod h1:6yULJ656Px+3vBD8DxQVa3kxgyrAnzto9xy5taEt/CY= github.com/yuin/goldmark v1.4.13/go.mod h1:6yULJ656Px+3vBD8DxQVa3kxgyrAnzto9xy5taEt/CY=
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.54.0 h1:TT4fX+nBOA/+LUkobKGW1ydGcn+G3vRw9+g5HwCphpk=
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.54.0/go.mod h1:L7UH0GbB0p47T4Rri3uHjbpCFYrVrwc1I25QhNPiGK8=
go.opentelemetry.io/otel v1.29.0 h1:PdomN/Al4q/lN6iBJEN3AwPvUiHPMlt93c8bqTG5Llw=
go.opentelemetry.io/otel v1.29.0/go.mod h1:N/WtXPs1CNCUEx+Agz5uouwCba+i+bJGFicT8SR4NP8=
go.opentelemetry.io/otel/metric v1.29.0 h1:vPf/HFWTNkPu1aYeIsc98l4ktOQaL6LeSoeV2g+8YLc=
go.opentelemetry.io/otel/metric v1.29.0/go.mod h1:auu/QWieFVWx+DmQOUMgj0F8LHWdgalxXqvp7BII/W8=
go.opentelemetry.io/otel/trace v1.29.0 h1:J/8ZNK4XgR7a21DZUAsbF8pZ5Jcw1VhACmnYt39JTi4=
go.opentelemetry.io/otel/trace v1.29.0/go.mod h1:eHl3w0sp3paPkYstJOmAimxhiFXPg+MMTlEh3nsQgWQ=
go.uber.org/atomic v1.11.0 h1:ZvwS0R+56ePWxUNi+Atn9dWONBPp/AUETXlHW0DxSjE=
go.uber.org/atomic v1.11.0/go.mod h1:LUxbIzbOniOlMKjJjyPfpl4v+PKK2cNJn91OQbhoJI0=
go.uber.org/goleak v1.3.0 h1:2K3zAYmnTNqV73imy9J1T3WC+gmCePx2hEGkimedGto= go.uber.org/goleak v1.3.0 h1:2K3zAYmnTNqV73imy9J1T3WC+gmCePx2hEGkimedGto=
go.uber.org/goleak v1.3.0/go.mod h1:CoHD4mav9JJNrW/WLlf7HGZPjdw8EucARQHekz1X6bE= go.uber.org/goleak v1.3.0/go.mod h1:CoHD4mav9JJNrW/WLlf7HGZPjdw8EucARQHekz1X6bE=
go.yaml.in/yaml/v2 v2.4.3 h1:6gvOSjQoTB3vt1l+CU+tSyi/HOjfOjRLJ4YwYZGwRO0=
go.yaml.in/yaml/v2 v2.4.3/go.mod h1:zSxWcmIDjOzPXpjlTTbAsKokqkDNAVtZO0WOMiT90s8=
go.yaml.in/yaml/v3 v3.0.4 h1:tfq32ie2Jv2UxXFdLJdh3jXuOzWiL1fo0bu/FbuKpbc=
go.yaml.in/yaml/v3 v3.0.4/go.mod h1:DhzuOOF2ATzADvBadXxruRBLzYTpT36CKvDb3+aBEFg=
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w= golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20210921155107-089bfa567519/go.mod h1:GvvjBRRGRdwPK5ydBHafDWAxML/pGHZbMvKqRZ5+Abc= golang.org/x/crypto v0.0.0-20210921155107-089bfa567519/go.mod h1:GvvjBRRGRdwPK5ydBHafDWAxML/pGHZbMvKqRZ5+Abc=
golang.org/x/crypto v0.45.0 h1:jMBrvKuj23MTlT0bQEOBcAE0mjg8mK9RXFhRH6nyF3Q= golang.org/x/crypto v0.6.0/go.mod h1:OFC/31mSvZgRz0V1QTNCzfAI1aIRzbiufJtkMIlEp58=
golang.org/x/crypto v0.45.0/go.mod h1:XTGrrkGJve7CYK7J8PEww4aY7gM3qMCElcJQ8n8JdX4= golang.org/x/crypto v0.13.0/go.mod h1:y6Z2r+Rw4iayiXXAIxJIDAJ1zMW4yaTpebo8fPOliYc=
golang.org/x/exp v0.0.0-20250620022241-b7579e27df2b h1:M2rDM6z3Fhozi9O7NWsxAkg/yqS/lQJ6PmkyIV3YP+o= golang.org/x/crypto v0.19.0/go.mod h1:Iy9bg/ha4yyC70EfRS8jz+B6ybOBKMaSxLj6P6oBDfU=
golang.org/x/exp v0.0.0-20250620022241-b7579e27df2b/go.mod h1:3//PLf8L/X+8b4vuAfHzxeRUl04Adcb341+IGKfnqS8= golang.org/x/crypto v0.23.0/go.mod h1:CKFgDieR+mRhux2Lsu27y0fO304Db0wZe70UKqHu0v8=
golang.org/x/crypto v0.31.0/go.mod h1:kDsLvtWBEx7MV9tJOj9bnXsPbxwJQ6csT/x4KIN4Ssk=
golang.org/x/crypto v0.37.0 h1:kJNSjF/Xp7kU0iB2Z+9viTPMW4EqqsrywMXLJOOsXSE=
golang.org/x/crypto v0.37.0/go.mod h1:vg+k43peMZ0pUMhYmVAWysMK35e6ioLh3wB8ZCAfbVc=
golang.org/x/exp v0.0.0-20250408133849-7e4ce0ab07d0 h1:R84qjqJb5nVJMxqWYb3np9L5ZsaDtB+a39EqjV0JSUM=
golang.org/x/exp v0.0.0-20250408133849-7e4ce0ab07d0/go.mod h1:S9Xr4PYopiDyqSyp5NjCrhFrqg6A5zA2E/iPHPhqnS8=
golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4/go.mod h1:jJ57K6gSWd91VN4djpZkiMVwK6gcyfeH4XE8wZrZaV4= golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4/go.mod h1:jJ57K6gSWd91VN4djpZkiMVwK6gcyfeH4XE8wZrZaV4=
golang.org/x/mod v0.30.0 h1:fDEXFVZ/fmCKProc/yAXXUijritrDzahmwwefnjoPFk= golang.org/x/mod v0.8.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs=
golang.org/x/mod v0.30.0/go.mod h1:lAsf5O2EvJeSFMiBxXDki7sCgAxEUcZHXoXMKT4GJKc= golang.org/x/mod v0.12.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs=
golang.org/x/mod v0.15.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c=
golang.org/x/mod v0.17.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c=
golang.org/x/mod v0.24.0 h1:ZfthKaKaT4NrhGVZHO1/WDTwGES4De8KtWO0SIbNJMU=
golang.org/x/mod v0.24.0/go.mod h1:IXM97Txy2VM4PJ3gI61r1YEk/gAj6zAHN3AdZt6S9Ww=
golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s= golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20200114155413-6afb5195e5aa/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20210226172049-e18ecbb05110/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg= golang.org/x/net v0.0.0-20210226172049-e18ecbb05110/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg=
golang.org/x/net v0.0.0-20220722155237-a158d28d115b/go.mod h1:XRhObCWvk6IyKnWLug+ECip1KBveYUHfp+8e9klMJ9c= golang.org/x/net v0.0.0-20220722155237-a158d28d115b/go.mod h1:XRhObCWvk6IyKnWLug+ECip1KBveYUHfp+8e9klMJ9c=
golang.org/x/net v0.6.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs=
golang.org/x/net v0.7.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs= golang.org/x/net v0.7.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs=
golang.org/x/net v0.47.0 h1:Mx+4dIFzqraBXUugkia1OOvlD6LemFo1ALMHjrXDOhY= golang.org/x/net v0.10.0/go.mod h1:0qNGK6F8kojg2nk9dLZ2mShWaEBan6FAoqfSigmmuDg=
golang.org/x/net v0.47.0/go.mod h1:/jNxtkgq5yWUGYkaZGqo27cfGZ1c5Nen03aYrrKpVRU= golang.org/x/net v0.15.0/go.mod h1:idbUs1IY1+zTqbi8yxTbhexhEEk5ur9LInksu6HrEpk=
golang.org/x/oauth2 v0.32.0 h1:jsCblLleRMDrxMN29H3z/k1KliIvpLgCkE6R8FXXNgY= golang.org/x/net v0.21.0/go.mod h1:bIjVDfnllIU7BJ2DNgfnXvpSvtn8VRwhlsaeUTyUS44=
golang.org/x/oauth2 v0.32.0/go.mod h1:lzm5WQJQwKZ3nwavOZ3IS5Aulzxi68dUSgRHujetwEA= golang.org/x/net v0.25.0/go.mod h1:JkAGAh7GEvH74S6FOH42FLoXpXbE/aqXSrIQjXgsiwM=
golang.org/x/net v0.33.0/go.mod h1:HXLR5J+9DxmrqMwG9qjGCxZ+zKXxBru04zlTvWlWuN4=
golang.org/x/net v0.39.0 h1:ZCu7HMWDxpXpaiKdhzIfaltL9Lp31x/3fCP11bc6/fY=
golang.org/x/net v0.39.0/go.mod h1:X7NRbYVEA+ewNkCNyJ513WmMdQ3BineSwVtN2zD/d+E=
golang.org/x/oauth2 v0.27.0 h1:da9Vo7/tDv5RH/7nZDz1eMGS/q1Vv1N/7FCrBhI9I3M=
golang.org/x/oauth2 v0.27.0/go.mod h1:onh5ek6nERTohokkhCD/y2cV4Do3fxFHFuAejCkRWT8=
golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM= golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.18.0 h1:kr88TuHDroi+UVf+0hZnirlk8o8T+4MrK6mr60WkH/I= golang.org/x/sync v0.1.0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.18.0/go.mod h1:9KTHXmSnoGruLpwFjVSX0lNNA75CykiMECbovNTZqGI= golang.org/x/sync v0.3.0/go.mod h1:FU7BRWz2tNW+3quACPkgCx/L+uEAv1htQ0V83Z9Rj+Y=
golang.org/x/sync v0.6.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=
golang.org/x/sync v0.7.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=
golang.org/x/sync v0.10.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=
golang.org/x/sync v0.13.0 h1:AauUjRAJ9OSnvULf/ARrrVywoJDy0YS2AwQ98I37610=
golang.org/x/sync v0.13.0/go.mod h1:1dzgHSNfp02xaA81J2MS99Qcpr2w7fw1gpm99rleRqA=
golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210615035016-665e8c7367d1/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.0.0-20210615035016-665e8c7367d1/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220520151302-bc2c85ada10a/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.0.0-20220520151302-bc2c85ada10a/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.38.0 h1:3yZWxaJjBmCWXqhN1qh02AkOnCQ1poK6oF+a7xWL6Gc= golang.org/x/sys v0.8.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.38.0/go.mod h1:OgkHotnGiDImocRcuBABYBEXf8A9a87e/uXjp9XT3ks= golang.org/x/sys v0.12.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.17.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/sys v0.20.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/sys v0.28.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/sys v0.32.0 h1:s77OFDvIQeibCmezSnk/q6iAfkdiQaJi4VzroCFrN20=
golang.org/x/sys v0.32.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
golang.org/x/telemetry v0.0.0-20240228155512-f48c80bd79b2/go.mod h1:TeRTkGYfJXctD9OcfyVLyj2J3IxLnKwHJR8f4D8a3YE=
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo= golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8= golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8=
golang.org/x/term v0.5.0/go.mod h1:jMB1sMXY+tzblOD4FWmEbocvup2/aLOaQEp7JmGp78k= golang.org/x/term v0.5.0/go.mod h1:jMB1sMXY+tzblOD4FWmEbocvup2/aLOaQEp7JmGp78k=
golang.org/x/term v0.8.0/go.mod h1:xPskH00ivmX89bAKVGSKKtLOWNx2+17Eiy94tnKShWo=
golang.org/x/term v0.12.0/go.mod h1:owVbMEjm3cBLCHdkQu9b1opXd4ETQWc3BhuQGKgXgvU=
golang.org/x/term v0.17.0/go.mod h1:lLRBjIVuehSbZlaOtGMbcMncT+aqLLLmKrsjNrUguwk=
golang.org/x/term v0.20.0/go.mod h1:8UkIAJTvZgivsXaD6/pH6U9ecQzZ45awqEOzuCvwpFY=
golang.org/x/term v0.27.0/go.mod h1:iMsnZpn0cago0GOrHO2+Y7u7JPn5AylBrcoWkElMTSM=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ= golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ= golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ= golang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ=
golang.org/x/text v0.7.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8= golang.org/x/text v0.7.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=
golang.org/x/text v0.31.0 h1:aC8ghyu4JhP8VojJ2lEHBnochRno1sgL6nEi9WGFGMM= golang.org/x/text v0.9.0/go.mod h1:e1OnstbJyHTd6l/uOt8jFFHp6TRDWZR/bV3emEE/zU8=
golang.org/x/text v0.31.0/go.mod h1:tKRAlv61yKIjGGHX/4tP1LTbc13YSec1pxVEWXzfoeM= golang.org/x/text v0.13.0/go.mod h1:TvPlkZtksWOMsz7fbANvkp4WM8x/WCo/om8BMLbz+aE=
golang.org/x/time v0.14.0 h1:MRx4UaLrDotUKUdCIqzPC48t1Y9hANFKIRpNx+Te8PI= golang.org/x/text v0.14.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU=
golang.org/x/time v0.14.0/go.mod h1:eL/Oa2bBBK0TkX57Fyni+NgnyQQN4LitPmob2Hjnqw4= golang.org/x/text v0.15.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU=
golang.org/x/text v0.21.0/go.mod h1:4IBbMaMmOPCJ8SecivzSH54+73PCFmPWxNTLm+vZkEQ=
golang.org/x/text v0.24.0 h1:dd5Bzh4yt5KYA8f9CJHCP4FB4D51c2c6JvN37xJJkJ0=
golang.org/x/text v0.24.0/go.mod h1:L8rBsPeo2pSS+xqN0d5u2ikmjtmoJbDBT1b7nHvFCdU=
golang.org/x/time v0.5.0 h1:o7cqy6amK/52YcAKIPlM3a+Fpj35zvRj2TP+e1xFSfk=
golang.org/x/time v0.5.0/go.mod h1:3BpzKBy/shNhVucY/MWOyx10tF3SFh9QdLuxbVysPQM=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo= golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.1.12/go.mod h1:hNGJHUnrk76NpqgfD5Aqm5Crs+Hm0VOH/i9J2+nxYbc= golang.org/x/tools v0.1.12/go.mod h1:hNGJHUnrk76NpqgfD5Aqm5Crs+Hm0VOH/i9J2+nxYbc=
golang.org/x/tools v0.39.0 h1:ik4ho21kwuQln40uelmciQPp9SipgNDdrafrYA4TmQQ= golang.org/x/tools v0.6.0/go.mod h1:Xwgl3UAJ/d3gWutnCtw505GrjyAbvKui8lOU390QaIU=
golang.org/x/tools v0.39.0/go.mod h1:JnefbkDPyD8UU2kI5fuf8ZX4/yUeh9W877ZeBONxUqQ= golang.org/x/tools v0.13.0/go.mod h1:HvlwmtVNQAhOuCjW7xxvovg8wbNq7LwfXh/k7wXUl58=
golang.org/x/tools v0.21.1-0.20240508182429-e35e4ccd0d2d/go.mod h1:aiJjzUbINMkxbQROHiO6hDPo2LHcIPhhQsa9DLh0yGk=
golang.org/x/tools v0.32.0 h1:Q7N1vhpkQv7ybVzLFtTjvQya2ewbwNDZzUgfXGqtMWU=
golang.org/x/tools v0.32.0/go.mod h1:ZxrU41P/wAbZD8EDa6dDCa6XfpkhJ7HFMjHJXfBDu8s=
golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0= google.golang.org/protobuf v1.36.6 h1:z1NpPI8ku2WgiWnf+t9wTPsn6eP1L7ksHUlkfLvd9xY=
google.golang.org/protobuf v1.36.10 h1:AYd7cD/uASjIL6Q9LiTjz8JLcrh/88q5UObnmY3aOOE= google.golang.org/protobuf v1.36.6/go.mod h1:jduwjTPXsFjZGTmRluh+L6NjiWu7pchiJ2/5YcXBHnY=
google.golang.org/protobuf v1.36.10/go.mod h1:HTf+CrKn2C3g5S8VImy6tdcUvCska2kB7j23XfzDpco=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20200227125254-8fa46927fb4f/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk= gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q= gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q=
gopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI= gopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.4.0 h1:D8xgwECY7CYvx+Y2n4sBz93Jn9JRvxdiyyo8CTfuKaY=
gopkg.in/yaml.v2 v2.4.0/go.mod h1:RDklbk79AGWmwhnvt/jBztapEOGDOx6ZbXqjP6csGnQ=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gopkg.in/yaml.v3 v3.0.0-20200615113413-eeeca48fe776/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA= gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
sigs.k8s.io/yaml v1.6.0 h1:G8fkbMSAFqgEFgh4b1wmtzDnioxFCUgTZhlbj5P9QYs= sigs.k8s.io/yaml v1.4.0 h1:Mk1wCc2gy/F0THH0TAp1QYyJNzRm2KCLy3o5ASXVI5E=
sigs.k8s.io/yaml v1.6.0/go.mod h1:796bPqUfzR/0jLAl6XjHl3Ck7MiyVv8dbTdyT3/pMf4= sigs.k8s.io/yaml v1.4.0/go.mod h1:Ejl7/uTz7PSA4eKMyQCUTnhZYNmLIl+5c2lQPGR2BPY=

View File

@@ -32,7 +32,6 @@ resolver:
autobind: autobind:
- "github.com/99designs/gqlgen/graphql/introspection" - "github.com/99designs/gqlgen/graphql/introspection"
- "github.com/ClusterCockpit/cc-backend/internal/graph/model" - "github.com/ClusterCockpit/cc-backend/internal/graph/model"
- "github.com/ClusterCockpit/cc-backend/internal/config"
# This section declares type mapping between the GraphQL and go type systems # This section declares type mapping between the GraphQL and go type systems
# #
@@ -63,11 +62,11 @@ models:
fields: fields:
partitions: partitions:
resolver: true resolver: true
# Node: Node:
# model: "github.com/ClusterCockpit/cc-lib/schema.Node" model: "github.com/ClusterCockpit/cc-lib/schema.Node"
# fields: fields:
# metaData: metaData:
# resolver: true resolver: true
NullableFloat: { model: "github.com/ClusterCockpit/cc-lib/schema.Float" } NullableFloat: { model: "github.com/ClusterCockpit/cc-lib/schema.Float" }
MetricScope: { model: "github.com/ClusterCockpit/cc-lib/schema.MetricScope" } MetricScope: { model: "github.com/ClusterCockpit/cc-lib/schema.MetricScope" }
MetricValue: { model: "github.com/ClusterCockpit/cc-lib/schema.MetricValue" } MetricValue: { model: "github.com/ClusterCockpit/cc-lib/schema.MetricValue" }
@@ -80,11 +79,12 @@ models:
Tag: { model: "github.com/ClusterCockpit/cc-lib/schema.Tag" } Tag: { model: "github.com/ClusterCockpit/cc-lib/schema.Tag" }
Resource: { model: "github.com/ClusterCockpit/cc-lib/schema.Resource" } Resource: { model: "github.com/ClusterCockpit/cc-lib/schema.Resource" }
JobState: { model: "github.com/ClusterCockpit/cc-lib/schema.JobState" } JobState: { model: "github.com/ClusterCockpit/cc-lib/schema.JobState" }
Node: { model: "github.com/ClusterCockpit/cc-lib/schema.Node" }
SchedulerState: MonitoringState:
{ model: "github.com/ClusterCockpit/cc-lib/schema.SchedulerState" } { model: "github.com/ClusterCockpit/cc-lib/schema.NodeState" }
HealthState: HealthState:
{ model: "github.com/ClusterCockpit/cc-lib/schema.MonitoringState" } { model: "github.com/ClusterCockpit/cc-lib/schema.MonitoringState" }
TimeRange: { model: "github.com/ClusterCockpit/cc-lib/schema.TimeRange" }
IntRange: { model: "github.com/ClusterCockpit/cc-lib/schema.IntRange" }
JobMetric: { model: "github.com/ClusterCockpit/cc-lib/schema.JobMetric" } JobMetric: { model: "github.com/ClusterCockpit/cc-lib/schema.JobMetric" }
Series: { model: "github.com/ClusterCockpit/cc-lib/schema.Series" } Series: { model: "github.com/ClusterCockpit/cc-lib/schema.Series" }
MetricStatistics: MetricStatistics:

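For context on the "resolver: true" flags in this mapping: instead of reading the bound struct field directly, gqlgen emits a per-field resolver interface that the backend must implement. A minimal sketch of such an implementation, with names that are illustrative rather than taken from this change:

package graph

import (
	"context"

	"github.com/ClusterCockpit/cc-lib/schema"
)

// clusterResolver is a hypothetical receiver; gqlgen wires it in through the
// generated ResolverRoot interface.
type clusterResolver struct {
	partitionsByCluster map[string][]string
}

// Partitions implements the field resolver that "partitions: resolver: true"
// asks gqlgen to generate for the Cluster type.
func (r *clusterResolver) Partitions(ctx context.Context, obj *schema.Cluster) ([]string, error) {
	return r.partitionsByCluster[obj.Name], nil
}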
View File

@@ -3,7 +3,7 @@ Description=ClusterCockpit Web Server
Documentation=https://github.com/ClusterCockpit/cc-backend Documentation=https://github.com/ClusterCockpit/cc-backend
Wants=network-online.target Wants=network-online.target
After=network-online.target After=network-online.target
# Database is file-based SQLite - no service dependency required After=mariadb.service mysql.service
[Service] [Service]
WorkingDirectory=/opt/monitoring/cc-backend WorkingDirectory=/opt/monitoring/cc-backend

View File

@@ -17,18 +17,16 @@ import (
"strings" "strings"
"testing" "testing"
"time" "time"
"sync"
"github.com/ClusterCockpit/cc-backend/internal/api" "github.com/ClusterCockpit/cc-backend/internal/api"
"github.com/ClusterCockpit/cc-backend/internal/archiver" "github.com/ClusterCockpit/cc-backend/internal/archiver"
"github.com/ClusterCockpit/cc-backend/internal/auth" "github.com/ClusterCockpit/cc-backend/internal/auth"
"github.com/ClusterCockpit/cc-backend/internal/config" "github.com/ClusterCockpit/cc-backend/internal/config"
"github.com/ClusterCockpit/cc-backend/internal/graph" "github.com/ClusterCockpit/cc-backend/internal/graph"
"github.com/ClusterCockpit/cc-backend/internal/memorystore" "github.com/ClusterCockpit/cc-backend/internal/metricDataDispatcher"
"github.com/ClusterCockpit/cc-backend/internal/metricdata"
"github.com/ClusterCockpit/cc-backend/internal/repository" "github.com/ClusterCockpit/cc-backend/internal/repository"
"github.com/ClusterCockpit/cc-backend/pkg/archive" "github.com/ClusterCockpit/cc-backend/pkg/archive"
ccconf "github.com/ClusterCockpit/cc-lib/ccConfig"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger" cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/schema" "github.com/ClusterCockpit/cc-lib/schema"
"github.com/gorilla/mux" "github.com/gorilla/mux"
@@ -36,27 +34,24 @@ import (
_ "github.com/mattn/go-sqlite3" _ "github.com/mattn/go-sqlite3"
) )
func setup(t *testing.T) *api.RestAPI { func setup(t *testing.T) *api.RestApi {
const testconfig = `{ const testconfig = `{
"main": {
"addr": "0.0.0.0:8080", "addr": "0.0.0.0:8080",
"validate": false, "validate": false,
"apiAllowedIPs": [
"*"
]
},
"archive": { "archive": {
"kind": "file", "kind": "file",
"path": "./var/job-archive" "path": "./var/job-archive"
}, },
"auth": { "jwts": {
"jwts": { "max-age": "2m"
"max-age": "2m" },
} "apiAllowedIPs": [
}, "*"
],
"clusters": [ "clusters": [
{ {
"name": "testcluster", "name": "testcluster",
"metricDataRepository": {"kind": "test", "url": "bla:8081"},
"filterRanges": { "filterRanges": {
"numNodes": { "from": 1, "to": 64 }, "numNodes": { "from": 1, "to": 64 },
"duration": { "from": 0, "to": 86400 }, "duration": { "from": 0, "to": 86400 },
@@ -65,7 +60,7 @@ func setup(t *testing.T) *api.RestAPI {
} }
] ]
}` }`
const testclusterJSON = `{ const testclusterJson = `{
"name": "testcluster", "name": "testcluster",
"subClusters": [ "subClusters": [
{ {
@@ -124,45 +119,34 @@ func setup(t *testing.T) *api.RestAPI {
cclog.Init("info", true) cclog.Init("info", true)
tmpdir := t.TempDir() tmpdir := t.TempDir()
jobarchive := filepath.Join(tmpdir, "job-archive") jobarchive := filepath.Join(tmpdir, "job-archive")
if err := os.Mkdir(jobarchive, 0o777); err != nil { if err := os.Mkdir(jobarchive, 0777); err != nil {
t.Fatal(err) t.Fatal(err)
} }
if err := os.WriteFile(filepath.Join(jobarchive, "version.txt"), fmt.Appendf(nil, "%d", 3), 0o666); err != nil { if err := os.WriteFile(filepath.Join(jobarchive, "version.txt"), fmt.Appendf(nil, "%d", 2), 0666); err != nil {
t.Fatal(err) t.Fatal(err)
} }
if err := os.Mkdir(filepath.Join(jobarchive, "testcluster"), 0o777); err != nil { if err := os.Mkdir(filepath.Join(jobarchive, "testcluster"), 0777); err != nil {
t.Fatal(err) t.Fatal(err)
} }
if err := os.WriteFile(filepath.Join(jobarchive, "testcluster", "cluster.json"), []byte(testclusterJSON), 0o666); err != nil { if err := os.WriteFile(filepath.Join(jobarchive, "testcluster", "cluster.json"), []byte(testclusterJson), 0666); err != nil {
t.Fatal(err) t.Fatal(err)
} }
dbfilepath := filepath.Join(tmpdir, "test.db") dbfilepath := filepath.Join(tmpdir, "test.db")
err := repository.MigrateDB(dbfilepath) err := repository.MigrateDB("sqlite3", dbfilepath)
if err != nil { if err != nil {
t.Fatal(err) t.Fatal(err)
} }
cfgFilePath := filepath.Join(tmpdir, "config.json") cfgFilePath := filepath.Join(tmpdir, "config.json")
if err := os.WriteFile(cfgFilePath, []byte(testconfig), 0o666); err != nil { if err := os.WriteFile(cfgFilePath, []byte(testconfig), 0666); err != nil {
t.Fatal(err) t.Fatal(err)
} }
ccconf.Init(cfgFilePath) config.Init(cfgFilePath)
// Load and check main configuration
if cfg := ccconf.GetPackageConfig("main"); cfg != nil {
if clustercfg := ccconf.GetPackageConfig("clusters"); clustercfg != nil {
config.Init(cfg, clustercfg)
} else {
cclog.Abort("Cluster configuration must be present")
}
} else {
cclog.Abort("Main configuration must be present")
}
archiveCfg := fmt.Sprintf("{\"kind\": \"file\",\"path\": \"%s\"}", jobarchive) archiveCfg := fmt.Sprintf("{\"kind\": \"file\",\"path\": \"%s\"}", jobarchive)
repository.Connect("sqlite3", dbfilepath) repository.Connect("sqlite3", dbfilepath)
@@ -171,51 +155,53 @@ func setup(t *testing.T) *api.RestAPI {
t.Fatal(err) t.Fatal(err)
} }
// Initialize memorystore (optional - will return nil if not configured) if err := metricdata.Init(); err != nil {
// For this test, we don't initialize it to test the nil handling t.Fatal(err)
mscfg := ccconf.GetPackageConfig("metric-store")
if mscfg != nil {
var wg sync.WaitGroup
memorystore.Init(mscfg, &wg)
}
archiver.Start(repository.GetJobRepository(), context.Background())
if cfg := ccconf.GetPackageConfig("auth"); cfg != nil {
auth.Init(&cfg)
} else {
cclog.Warn("Authentication disabled due to missing configuration")
auth.Init(nil)
} }
archiver.Start(repository.GetJobRepository())
auth.Init()
graph.Init() graph.Init()
return api.New() return api.New()
} }
func cleanup() { func cleanup() {
// Gracefully shut down the archiver with a timeout // TODO: Clear all caches, reset all modules, etc...
if err := archiver.Shutdown(5 * time.Second); err != nil {
cclog.Warnf("Archiver shutdown timeout in tests: %v", err)
}
// Shutdown memorystore if it was initialized
memorystore.Shutdown()
} }
/* /*
* This function starts a job, stops it, and tests the REST API. * This function starts a job, stops it, and then reads its data from the job-archive.
* Do not run sub-tests in parallel! Tests should not be run in parallel at all, because * Do not run sub-tests in parallel! Tests should not be run in parallel at all, because
* at least `setup` modifies global state. * at least `setup` modifies global state.
*/ */
func TestRestApi(t *testing.T) { func TestRestApi(t *testing.T) {
restapi := setup(t) restapi := setup(t)
t.Cleanup(cleanup) t.Cleanup(cleanup)
testData := schema.JobData{
"load_one": map[schema.MetricScope]*schema.JobMetric{
schema.MetricScopeNode: {
Unit: schema.Unit{Base: "load"},
Timestep: 60,
Series: []schema.Series{
{
Hostname: "host123",
Statistics: schema.MetricStatistics{Min: 0.1, Avg: 0.2, Max: 0.3},
Data: []schema.Float{0.1, 0.1, 0.1, 0.2, 0.2, 0.2, 0.3, 0.3, 0.3},
},
},
},
},
}
metricdata.TestLoadDataCallback = func(job *schema.Job, metrics []string, scopes []schema.MetricScope, ctx context.Context, resolution int) (schema.JobData, error) {
return testData, nil
}
r := mux.NewRouter() r := mux.NewRouter()
r.PathPrefix("/api").Subrouter() r.PathPrefix("/api").Subrouter()
r.StrictSlash(true) r.StrictSlash(true)
restapi.MountAPIRoutes(r) restapi.MountApiRoutes(r)
var TestJobId int64 = 123 var TestJobId int64 = 123
TestClusterName := "testcluster" TestClusterName := "testcluster"
@@ -232,7 +218,7 @@ func TestRestApi(t *testing.T) {
"numNodes": 1, "numNodes": 1,
"numHwthreads": 8, "numHwthreads": 8,
"numAcc": 0, "numAcc": 0,
"shared": "none", "exclusive": 1,
"monitoringStatus": 1, "monitoringStatus": 1,
"smt": 1, "smt": 1,
"resources": [ "resources": [
@@ -265,12 +251,18 @@ func TestRestApi(t *testing.T) {
if response.StatusCode != http.StatusCreated { if response.StatusCode != http.StatusCreated {
t.Fatal(response.Status, recorder.Body.String()) t.Fatal(response.Status, recorder.Body.String())
} }
// resolver := graph.GetResolverInstance()
restapi.JobRepository.SyncJobs() restapi.JobRepository.SyncJobs()
job, err := restapi.JobRepository.Find(&TestJobId, &TestClusterName, &TestStartTime) job, err := restapi.JobRepository.Find(&TestJobId, &TestClusterName, &TestStartTime)
if err != nil { if err != nil {
t.Fatal(err) t.Fatal(err)
} }
// job.Tags, err = resolver.Job().Tags(ctx, job)
// if err != nil {
// t.Fatal(err)
// }
if job.JobID != 123 || if job.JobID != 123 ||
job.User != "testuser" || job.User != "testuser" ||
job.Project != "testproj" || job.Project != "testproj" ||
@@ -278,16 +270,21 @@ func TestRestApi(t *testing.T) {
job.SubCluster != "sc1" || job.SubCluster != "sc1" ||
job.Partition != "default" || job.Partition != "default" ||
job.Walltime != 3600 || job.Walltime != 3600 ||
job.ArrayJobID != 0 || job.ArrayJobId != 0 ||
job.NumNodes != 1 || job.NumNodes != 1 ||
job.NumHWThreads != 8 || job.NumHWThreads != 8 ||
job.NumAcc != 0 || job.NumAcc != 0 ||
job.Exclusive != 1 ||
job.MonitoringStatus != 1 || job.MonitoringStatus != 1 ||
job.SMT != 1 || job.SMT != 1 ||
!reflect.DeepEqual(job.Resources, []*schema.Resource{{Hostname: "host123", HWThreads: []int{0, 1, 2, 3, 4, 5, 6, 7}}}) || !reflect.DeepEqual(job.Resources, []*schema.Resource{{Hostname: "host123", HWThreads: []int{0, 1, 2, 3, 4, 5, 6, 7}}}) ||
job.StartTime != 123456789 { job.StartTime != 123456789 {
t.Fatalf("unexpected job properties: %#v", job) t.Fatalf("unexpected job properties: %#v", job)
} }
// if len(job.Tags) != 1 || job.Tags[0].Type != "testTagType" || job.Tags[0].Name != "testTagName" || job.Tags[0].Scope != "testuser" {
// t.Fatalf("unexpected tags: %#v", job.Tags)
// }
}); !ok { }); !ok {
return return
} }
@@ -301,6 +298,7 @@ func TestRestApi(t *testing.T) {
"stopTime": 123457789 "stopTime": 123457789
}` }`
var stoppedJob *schema.Job
if ok := t.Run("StopJob", func(t *testing.T) { if ok := t.Run("StopJob", func(t *testing.T) {
req := httptest.NewRequest(http.MethodPost, "/jobs/stop_job/", bytes.NewBuffer([]byte(stopJobBody))) req := httptest.NewRequest(http.MethodPost, "/jobs/stop_job/", bytes.NewBuffer([]byte(stopJobBody)))
recorder := httptest.NewRecorder() recorder := httptest.NewRecorder()
@@ -313,7 +311,7 @@ func TestRestApi(t *testing.T) {
t.Fatal(response.Status, recorder.Body.String()) t.Fatal(response.Status, recorder.Body.String())
} }
// Archiving happens asynchronously and will be completed in cleanup archiver.WaitForArchiving()
job, err := restapi.JobRepository.Find(&TestJobId, &TestClusterName, &TestStartTime) job, err := restapi.JobRepository.Find(&TestJobId, &TestClusterName, &TestStartTime)
if err != nil { if err != nil {
t.Fatal(err) t.Fatal(err)
@@ -336,12 +334,21 @@ func TestRestApi(t *testing.T) {
t.Fatalf("unexpected job.metaData: %#v", job.MetaData) t.Fatalf("unexpected job.metaData: %#v", job.MetaData)
} }
stoppedJob = job
}); !ok { }); !ok {
return return
} }
// Note: We skip the CheckArchive test because without memorystore initialized, t.Run("CheckArchive", func(t *testing.T) {
// archiving will fail gracefully. This test now focuses on the REST API itself. data, err := metricDataDispatcher.LoadData(stoppedJob, []string{"load_one"}, []schema.MetricScope{schema.MetricScopeNode}, context.Background(), 60)
if err != nil {
t.Fatal(err)
}
if !reflect.DeepEqual(data, testData) {
t.Fatal("unexpected data fetched from archive")
}
})
t.Run("CheckDoubleStart", func(t *testing.T) { t.Run("CheckDoubleStart", func(t *testing.T) {
// Starting a job with the same jobId and cluster should only be allowed if the startTimes are far apart! // Starting a job with the same jobId and cluster should only be allowed if the startTimes are far apart!
@@ -367,7 +374,7 @@ func TestRestApi(t *testing.T) {
"partition": "default", "partition": "default",
"walltime": 3600, "walltime": 3600,
"numNodes": 1, "numNodes": 1,
"shared": "none", "exclusive": 1,
"monitoringStatus": 1, "monitoringStatus": 1,
"smt": 1, "smt": 1,
"resources": [ "resources": [
@@ -417,7 +424,7 @@ func TestRestApi(t *testing.T) {
t.Fatal(response.Status, recorder.Body.String()) t.Fatal(response.Status, recorder.Body.String())
} }
// Archiving happens asynchronously and will be completed in cleanup archiver.WaitForArchiving()
jobid, cluster := int64(12345), "testcluster" jobid, cluster := int64(12345), "testcluster"
job, err := restapi.JobRepository.Find(&jobid, &cluster, nil) job, err := restapi.JobRepository.Find(&jobid, &cluster, nil)
if err != nil { if err != nil {

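The cleanup path above replaces the old blocking archiver.WaitForArchiving() with a timeout-guarded archiver.Shutdown(5 * time.Second). A minimal sketch of that pattern, assuming the archive worker closes a package-level channel once its queue is drained (all names are illustrative, not the actual implementation):

package archiver

import (
	"fmt"
	"time"
)

// drained is assumed to be closed by the archiving worker once every pending
// job has been written out.
var drained = make(chan struct{})

// Shutdown blocks until archiving finishes or the timeout elapses.
func Shutdown(timeout time.Duration) error {
	select {
	case <-drained:
		return nil
	case <-time.After(timeout):
		return fmt.Errorf("archiver: shutdown timed out after %v", timeout)
	}
}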
View File

@@ -2,7 +2,6 @@
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package api

import (
@@ -16,8 +15,8 @@ import (
"github.com/ClusterCockpit/cc-lib/schema"
)

-// GetClustersAPIResponse model
-type GetClustersAPIResponse struct {
+// GetClustersApiResponse model
+type GetClustersApiResponse struct {
Clusters []*schema.Cluster `json:"clusters"` // Array of clusters
}
@@ -34,7 +33,7 @@ type GetClustersAPIResponse struct {
// @failure 500 {object} api.ErrorResponse "Internal Server Error"
// @security ApiKeyAuth
// @router /api/clusters/ [get]
-func (api *RestAPI) getClusters(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) getClusters(rw http.ResponseWriter, r *http.Request) {
if user := repository.GetUserFromContext(r.Context()); user != nil &&
!user.HasRole(schema.RoleApi) {
@@ -60,7 +59,7 @@ func (api *RestAPI) getClusters(rw http.ResponseWriter, r *http.Request) {
clusters = archive.Clusters
}

-payload := GetClustersAPIResponse{
+payload := GetClustersApiResponse{
Clusters: clusters,
}

View File

@@ -390,8 +390,71 @@ const docTemplate = `{
}
}
},
+"/api/jobs/edit_meta/": {
+"patch": {
+"security": [
+{
+"ApiKeyAuth": []
+}
+],
+"description": "Edit key value pairs in metadata json of job specified by jobID, StartTime and Cluster\nIf a key already exists its content will be overwritten",
+"consumes": [
+"application/json"
+],
+"produces": [
+"application/json"
+],
+"tags": [
+"Job add and modify"
+],
+"summary": "Edit meta-data json by request",
+"parameters": [
+{
+"description": "Specifies job and payload to add or update",
+"name": "request",
+"in": "body",
+"required": true,
+"schema": {
+"$ref": "#/definitions/api.JobMetaRequest"
+}
+}
+],
+"responses": {
+"200": {
+"description": "Updated job resource",
+"schema": {
+"$ref": "#/definitions/schema.Job"
+}
+},
+"400": {
+"description": "Bad Request",
+"schema": {
+"$ref": "#/definitions/api.ErrorResponse"
+}
+},
+"401": {
+"description": "Unauthorized",
+"schema": {
+"$ref": "#/definitions/api.ErrorResponse"
+}
+},
+"404": {
+"description": "Job does not exist",
+"schema": {
+"$ref": "#/definitions/api.ErrorResponse"
+}
+},
+"500": {
+"description": "Internal Server Error",
+"schema": {
+"$ref": "#/definitions/api.ErrorResponse"
+}
+}
+}
+}
+},
"/api/jobs/edit_meta/{id}": {
-"post": {
+"patch": {
"security": [
{
"ApiKeyAuth": []
@@ -1244,6 +1307,37 @@ const docTemplate = `{
}
}
},
+"api.JobMetaRequest": {
+"type": "object",
+"required": [
+"jobId"
+],
+"properties": {
+"cluster": {
+"description": "Cluster of job",
+"type": "string",
+"example": "fritz"
+},
+"jobId": {
+"description": "Cluster Job ID of job",
+"type": "integer",
+"example": 123000
+},
+"payload": {
+"description": "Content to Add to Job Meta_Data",
+"allOf": [
+{
+"$ref": "#/definitions/api.EditMetaRequest"
+}
+]
+},
+"startTime": {
+"description": "Start Time of job as epoch",
+"type": "integer",
+"example": 1649723812
+}
+}
+},
"api.JobMetricWithName": { "api.JobMetricWithName": {
"type": "object", "type": "object",
"properties": { "properties": {
@@ -1261,27 +1355,9 @@ const docTemplate = `{
"api.Node": { "api.Node": {
"type": "object", "type": "object",
"properties": { "properties": {
"cpusAllocated": {
"type": "integer"
},
"cpusTotal": {
"type": "integer"
},
"gpusAllocated": {
"type": "integer"
},
"gpusTotal": {
"type": "integer"
},
"hostname": { "hostname": {
"type": "string" "type": "string"
}, },
"memoryAllocated": {
"type": "integer"
},
"memoryTotal": {
"type": "integer"
},
"states": { "states": {
"type": "array", "type": "array",
"items": { "items": {
@@ -1397,15 +1473,19 @@ const docTemplate = `{
"energyFootprint": { "energyFootprint": {
"type": "object", "type": "object",
"additionalProperties": { "additionalProperties": {
"type": "number", "type": "number"
"format": "float64"
} }
}, },
"exclusive": {
"type": "integer",
"maximum": 2,
"minimum": 0,
"example": 1
},
"footprint": { "footprint": {
"type": "object", "type": "object",
"additionalProperties": { "additionalProperties": {
"type": "number", "type": "number"
"format": "float64"
} }
}, },
"id": { "id": {
@@ -1417,18 +1497,12 @@ const docTemplate = `{
}, },
"jobState": { "jobState": {
"enum": [ "enum": [
"boot_fail",
"cancelled",
"completed", "completed",
"deadline",
"failed", "failed",
"node_fail", "cancelled",
"out_of_memory", "stopped",
"pending", "timeout",
"preempted", "out_of_memory"
"running",
"suspended",
"timeout"
], ],
"allOf": [ "allOf": [
{ {
@@ -1484,14 +1558,6 @@ const docTemplate = `{
"$ref": "#/definitions/schema.Resource" "$ref": "#/definitions/schema.Resource"
} }
}, },
"shared": {
"type": "string",
"enum": [
"none",
"single_user",
"multi_user"
]
},
"smt": { "smt": {
"type": "integer", "type": "integer",
"example": 4 "example": 4
@@ -1510,10 +1576,6 @@ const docTemplate = `{
"type": "string", "type": "string",
"example": "main" "example": "main"
}, },
"submitTime": {
"type": "integer",
"example": 1649723812
},
"tags": { "tags": {
"type": "array", "type": "array",
"items": { "items": {
@@ -1579,32 +1641,24 @@ const docTemplate = `{
"schema.JobState": { "schema.JobState": {
"type": "string", "type": "string",
"enum": [ "enum": [
"boot_fail",
"cancelled",
"completed",
"deadline",
"failed",
"node_fail",
"out_of_memory",
"pending",
"preempted",
"running", "running",
"suspended", "completed",
"timeout" "failed",
"cancelled",
"stopped",
"timeout",
"preempted",
"out_of_memory"
], ],
"x-enum-varnames": [ "x-enum-varnames": [
"JobStateBootFail",
"JobStateCancelled",
"JobStateCompleted",
"JobStateDeadline",
"JobStateFailed",
"JobStateNodeFail",
"JobStateOutOfMemory",
"JobStatePending",
"JobStatePreempted",
"JobStateRunning", "JobStateRunning",
"JobStateSuspended", "JobStateCompleted",
"JobStateTimeout" "JobStateFailed",
"JobStateCancelled",
"JobStateStopped",
"JobStateTimeout",
"JobStatePreempted",
"JobStateOutOfMemory"
] ]
}, },
"schema.JobStatistics": { "schema.JobStatistics": {
@@ -1803,8 +1857,7 @@ const docTemplate = `{
"additionalProperties": { "additionalProperties": {
"type": "array", "type": "array",
"items": { "items": {
"type": "number", "type": "number"
"format": "float64"
} }
} }
} }
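For orientation, a request body that satisfies the api.JobMetaRequest definition added above could look like the following sketch; the concrete values are illustrative only and are not part of the generated documentation:

// Hypothetical payload for PATCH /api/jobs/edit_meta/ (example values, not a real job).
const exampleJobMetaRequest = `{
    "jobId": 123000,
    "cluster": "fritz",
    "startTime": 1649723812,
    "payload": { "key": "jobScript", "value": "bash script" }
}`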

View File

@@ -2,7 +2,6 @@
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package api

import (
@@ -18,11 +17,10 @@ import (
"time"

"github.com/ClusterCockpit/cc-backend/internal/archiver"
-"github.com/ClusterCockpit/cc-backend/internal/config"
"github.com/ClusterCockpit/cc-backend/internal/graph"
"github.com/ClusterCockpit/cc-backend/internal/graph/model"
"github.com/ClusterCockpit/cc-backend/internal/importer"
-"github.com/ClusterCockpit/cc-backend/internal/metricdispatcher"
+"github.com/ClusterCockpit/cc-backend/internal/metricDataDispatcher"
"github.com/ClusterCockpit/cc-backend/internal/repository"
"github.com/ClusterCockpit/cc-backend/pkg/archive"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
@@ -30,58 +28,61 @@ import (
"github.com/gorilla/mux"
)

-const (
-// secondsPerDay is the number of seconds in 24 hours.
-// Used for duplicate job detection within a day window.
-secondsPerDay = 86400
-)
-
-// StopJobAPIRequest model
-type StopJobAPIRequest struct {
-JobID *int64 `json:"jobId" example:"123000"`
+// StopJobApiRequest model
+type StopJobApiRequest struct {
+JobId *int64 `json:"jobId" example:"123000"`
Cluster *string `json:"cluster" example:"fritz"`
StartTime *int64 `json:"startTime" example:"1649723812"`
State schema.JobState `json:"jobState" validate:"required" example:"completed"`
StopTime int64 `json:"stopTime" validate:"required" example:"1649763839"`
}

-// DeleteJobAPIRequest model
-type DeleteJobAPIRequest struct {
-JobID *int64 `json:"jobId" validate:"required" example:"123000"` // Cluster Job ID of job
+// DeleteJobApiRequest model
+type DeleteJobApiRequest struct {
+JobId *int64 `json:"jobId" validate:"required" example:"123000"` // Cluster Job ID of job
Cluster *string `json:"cluster" example:"fritz"` // Cluster of job
StartTime *int64 `json:"startTime" example:"1649723812"` // Start Time of job as epoch
}

-// GetJobsAPIResponse model
-type GetJobsAPIResponse struct {
+// GetJobsApiResponse model
+type GetJobsApiResponse struct {
Jobs []*schema.Job `json:"jobs"` // Array of jobs
Items int `json:"items"` // Number of jobs returned
Page int `json:"page"` // Page id returned
}

-// APITag model
-type APITag struct {
+// ApiTag model
+type ApiTag struct {
// Tag Type
Type string `json:"type" example:"Debug"`
Name string `json:"name" example:"Testjob"` // Tag Name
Scope string `json:"scope" example:"global"` // Tag Scope for Frontend Display
}

+// ApiMeta model
type EditMetaRequest struct {
Key string `json:"key" example:"jobScript"`
Value string `json:"value" example:"bash script"`
}

-type TagJobAPIRequest []*APITag
-
-type GetJobAPIRequest []string
-
-type GetJobAPIResponse struct {
+// JobMetaRequest model
+type JobMetaRequest struct {
+JobId *int64 `json:"jobId" validate:"required" example:"123000"` // Cluster Job ID of job
+Cluster *string `json:"cluster" example:"fritz"` // Cluster of job
+StartTime *int64 `json:"startTime" example:"1649723812"` // Start Time of job as epoch
+Payload EditMetaRequest `json:"payload"` // Content to Add to Job Meta_Data
+}
+
+type TagJobApiRequest []*ApiTag
+
+type GetJobApiRequest []string
+
+type GetJobApiResponse struct {
Meta *schema.Job
Data []*JobMetricWithName
}

-type GetCompleteJobAPIResponse struct {
+type GetCompleteJobApiResponse struct {
Meta *schema.Job
Data schema.JobData
}
@@ -111,7 +112,7 @@ type JobMetricWithName struct {
// @failure 500 {object} api.ErrorResponse "Internal Server Error"
// @security ApiKeyAuth
// @router /api/jobs/ [get]
-func (api *RestAPI) getJobs(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) getJobs(rw http.ResponseWriter, r *http.Request) {
withMetadata := false
filter := &model.JobFilter{}
page := &model.PageRequest{ItemsPerPage: 25, Page: 1}
@@ -119,8 +120,6 @@ func (api *RestAPI) getJobs(rw http.ResponseWriter, r *http.Request) {
for key, vals := range r.URL.Query() {
switch key {
-case "project":
-filter.Project = &model.StringInput{Eq: &vals[0]}
case "state":
for _, s := range vals {
state := schema.JobState(s)
@@ -133,7 +132,7 @@ func (api *RestAPI) getJobs(rw http.ResponseWriter, r *http.Request) {
}
case "cluster":
filter.Cluster = &model.StringInput{Eq: &vals[0]}
-case "start-time": // ?startTime=1753707480-1754053139
+case "start-time":
st := strings.Split(vals[0], "-")
if len(st) != 2 {
handleError(fmt.Errorf("invalid query parameter value: startTime"),
@@ -151,7 +150,7 @@ func (api *RestAPI) getJobs(rw http.ResponseWriter, r *http.Request) {
return
}
ufrom, uto := time.Unix(from, 0), time.Unix(to, 0)
-filter.StartTime = &config.TimeRange{From: &ufrom, To: &uto}
+filter.StartTime = &schema.TimeRange{From: &ufrom, To: &uto}
case "page":
x, err := strconv.Atoi(vals[0])
if err != nil {
@@ -212,7 +211,7 @@ func (api *RestAPI) getJobs(rw http.ResponseWriter, r *http.Request) {
bw := bufio.NewWriter(rw)
defer bw.Flush()

-payload := GetJobsAPIResponse{
+payload := GetJobsApiResponse{
Jobs: results,
Items: page.ItemsPerPage,
Page: page.Page,
@@ -224,7 +223,7 @@ func (api *RestAPI) getJobs(rw http.ResponseWriter, r *http.Request) {
}
}

-// getCompleteJobByID godoc
+// getCompleteJobById godoc
// @summary Get job meta and optional all metric data
// @tags Job query
// @description Job to get is specified by database ID
@@ -241,7 +240,7 @@ func (api *RestAPI) getJobs(rw http.ResponseWriter, r *http.Request) {
// @failure 500 {object} api.ErrorResponse "Internal Server Error"
// @security ApiKeyAuth
// @router /api/jobs/{id} [get]
-func (api *RestAPI) getCompleteJobByID(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) getCompleteJobById(rw http.ResponseWriter, r *http.Request) {
// Fetch job from db
id, ok := mux.Vars(r)["id"]
var job *schema.Job
@@ -253,7 +252,7 @@ func (api *RestAPI) getCompleteJobByID(rw http.ResponseWriter, r *http.Request)
return
}

-job, err = api.JobRepository.FindByID(r.Context(), id) // Get Job from Repo by ID
+job, err = api.JobRepository.FindById(r.Context(), id) // Get Job from Repo by ID
} else {
handleError(fmt.Errorf("the parameter 'id' is required"), http.StatusBadRequest, rw)
return
@@ -293,7 +292,7 @@ func (api *RestAPI) getCompleteJobByID(rw http.ResponseWriter, r *http.Request)
}

if r.URL.Query().Get("all-metrics") == "true" {
-data, err = metricdispatcher.LoadData(job, nil, scopes, r.Context(), resolution)
+data, err = metricDataDispatcher.LoadData(job, nil, scopes, r.Context(), resolution)
if err != nil {
cclog.Warnf("REST: error while loading all-metrics job data for JobID %d on %s", job.JobID, job.Cluster)
return
@@ -305,7 +304,7 @@ func (api *RestAPI) getCompleteJobByID(rw http.ResponseWriter, r *http.Request)
bw := bufio.NewWriter(rw)
defer bw.Flush()

-payload := GetCompleteJobAPIResponse{
+payload := GetCompleteJobApiResponse{
Meta: job,
Data: data,
}
@@ -316,7 +315,7 @@ func (api *RestAPI) getCompleteJobByID(rw http.ResponseWriter, r *http.Request)
}
}

-// getJobByID godoc
+// getJobById godoc
// @summary Get job meta and configurable metric data
// @tags Job query
// @description Job to get is specified by database ID
@@ -334,7 +333,7 @@ func (api *RestAPI) getCompleteJobByID(rw http.ResponseWriter, r *http.Request)
// @failure 500 {object} api.ErrorResponse "Internal Server Error"
// @security ApiKeyAuth
// @router /api/jobs/{id} [post]
-func (api *RestAPI) getJobByID(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) getJobById(rw http.ResponseWriter, r *http.Request) {
// Fetch job from db
id, ok := mux.Vars(r)["id"]
var job *schema.Job
@@ -346,7 +345,7 @@ func (api *RestAPI) getJobByID(rw http.ResponseWriter, r *http.Request) {
return
}

-job, err = api.JobRepository.FindByID(r.Context(), id)
+job, err = api.JobRepository.FindById(r.Context(), id)
} else {
handleError(errors.New("the parameter 'id' is required"), http.StatusBadRequest, rw)
return
@@ -368,9 +367,9 @@ func (api *RestAPI) getJobByID(rw http.ResponseWriter, r *http.Request) {
return
}

-var metrics GetJobAPIRequest
+var metrics GetJobApiRequest
if err = decode(r.Body, &metrics); err != nil {
-handleError(fmt.Errorf("decoding request failed: %w", err), http.StatusBadRequest, rw)
+http.Error(rw, err.Error(), http.StatusBadRequest)
return
}
@@ -389,7 +388,7 @@ func (api *RestAPI) getJobByID(rw http.ResponseWriter, r *http.Request) {
resolution = max(resolution, mc.Timestep)
}

-data, err := metricdispatcher.LoadData(job, metrics, scopes, r.Context(), resolution)
+data, err := metricDataDispatcher.LoadData(job, metrics, scopes, r.Context(), resolution)
if err != nil {
cclog.Warnf("REST: error while loading job data for JobID %d on %s", job.JobID, job.Cluster)
return
@@ -411,7 +410,7 @@ func (api *RestAPI) getJobByID(rw http.ResponseWriter, r *http.Request) {
bw := bufio.NewWriter(rw)
defer bw.Flush()

-payload := GetJobAPIResponse{
+payload := GetJobApiResponse{
Meta: job,
Data: res,
}
@@ -423,50 +422,96 @@ func (api *RestAPI) getJobByID(rw http.ResponseWriter, r *http.Request) {
}

// editMeta godoc
-// @summary Edit meta-data json
+// @summary Edit meta-data json of job identified by database id
// @tags Job add and modify
-// @description Edit key value pairs in job metadata json
+// @description Edit key value pairs in job metadata json of job specified by database id
// @description If a key already exists its content will be overwritten
// @accept json
// @produce json
// @param id path int true "Job Database ID"
-// @param request body api.EditMetaRequest true "Kay value pair to add"
+// @param request body api.EditMetaRequest true "Metadata Key value pair to add or update"
// @success 200 {object} schema.Job "Updated job resource"
// @failure 400 {object} api.ErrorResponse "Bad Request"
// @failure 401 {object} api.ErrorResponse "Unauthorized"
// @failure 404 {object} api.ErrorResponse "Job does not exist"
// @failure 500 {object} api.ErrorResponse "Internal Server Error"
// @security ApiKeyAuth
-// @router /api/jobs/edit_meta/{id} [post]
-func (api *RestAPI) editMeta(rw http.ResponseWriter, r *http.Request) {
+// @router /api/jobs/edit_meta/{id} [patch]
+func (api *RestApi) editMeta(rw http.ResponseWriter, r *http.Request) {
id, err := strconv.ParseInt(mux.Vars(r)["id"], 10, 64)
if err != nil {
-handleError(fmt.Errorf("parsing job ID failed: %w", err), http.StatusBadRequest, rw)
+http.Error(rw, err.Error(), http.StatusBadRequest)
return
}

-job, err := api.JobRepository.FindByID(r.Context(), id)
+job, err := api.JobRepository.FindById(r.Context(), id)
if err != nil {
-handleError(fmt.Errorf("finding job failed: %w", err), http.StatusNotFound, rw)
+http.Error(rw, err.Error(), http.StatusNotFound)
return
}

var req EditMetaRequest
if err := decode(r.Body, &req); err != nil {
-handleError(fmt.Errorf("decoding request failed: %w", err), http.StatusBadRequest, rw)
+http.Error(rw, err.Error(), http.StatusBadRequest)
return
}

if err := api.JobRepository.UpdateMetadata(job, req.Key, req.Value); err != nil {
-handleError(fmt.Errorf("updating metadata failed: %w", err), http.StatusInternalServerError, rw)
+http.Error(rw, err.Error(), http.StatusInternalServerError)
return
}

rw.Header().Add("Content-Type", "application/json")
rw.WriteHeader(http.StatusOK)
-if err := json.NewEncoder(rw).Encode(job); err != nil {
-cclog.Errorf("Failed to encode job response: %v", err)
-}
+json.NewEncoder(rw).Encode(job)
+}
+
+// editMetaByRequest godoc
+// @summary Edit meta-data json of job identified by request
+// @tags Job add and modify
+// @description Edit key value pairs in metadata json of job specified by jobID, StartTime and Cluster
+// @description If a key already exists its content will be overwritten
+// @accept json
+// @produce json
+// @param request body api.JobMetaRequest true "Specifies job and payload to add or update"
+// @success 200 {object} schema.Job "Updated job resource"
+// @failure 400 {object} api.ErrorResponse "Bad Request"
+// @failure 401 {object} api.ErrorResponse "Unauthorized"
+// @failure 404 {object} api.ErrorResponse "Job does not exist"
+// @failure 500 {object} api.ErrorResponse "Internal Server Error"
+// @security ApiKeyAuth
+// @router /api/jobs/edit_meta/ [patch]
+func (api *RestApi) editMetaByRequest(rw http.ResponseWriter, r *http.Request) {
+// Parse request body
+req := JobMetaRequest{}
+if err := decode(r.Body, &req); err != nil {
+handleError(fmt.Errorf("parsing request body failed: %w", err), http.StatusBadRequest, rw)
+return
+}
+
+// Fetch job (that will have its meta_data edited) from db
+var job *schema.Job
+var err error
+if req.JobId == nil {
+handleError(errors.New("the field 'jobId' is required"), http.StatusBadRequest, rw)
+return
+}
+// log.Printf("loading db job for editMetaByRequest... : JobMetaRequest=%v", req)
+job, err = api.JobRepository.Find(req.JobId, req.Cluster, req.StartTime)
+if err != nil {
+handleError(fmt.Errorf("finding job failed: %w", err), http.StatusUnprocessableEntity, rw)
+return
+}
+
+if err := api.JobRepository.UpdateMetadata(job, req.Payload.Key, req.Payload.Value); err != nil {
+http.Error(rw, err.Error(), http.StatusInternalServerError)
+return
+}
+
+rw.Header().Add("Content-Type", "application/json")
+rw.WriteHeader(http.StatusOK)
+json.NewEncoder(rw).Encode(job)
}
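Putting the new handler together, a client could call it roughly like this; a minimal sketch, assuming a local server, a placeholder JWT, and illustrative job identifiers:

package main

import (
	"fmt"
	"net/http"
	"strings"
)

func main() {
	// Identify the job by jobId/cluster/startTime instead of the database id.
	body := `{"jobId":123000,"cluster":"fritz","startTime":1649723812,"payload":{"key":"jobScript","value":"bash script"}}`
	req, err := http.NewRequest(http.MethodPatch,
		"http://localhost:8080/api/jobs/edit_meta/", strings.NewReader(body))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Authorization", "Bearer <JWT>") // placeholder token
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status) // 200 OK returns the updated job as JSON
}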
// tagJob godoc
@@ -486,40 +531,40 @@ func (api *RestAPI) editMeta(rw http.ResponseWriter, r *http.Request) {
// @failure 500 {object} api.ErrorResponse "Internal Server Error"
// @security ApiKeyAuth
// @router /api/jobs/tag_job/{id} [post]
-func (api *RestAPI) tagJob(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) tagJob(rw http.ResponseWriter, r *http.Request) {
id, err := strconv.ParseInt(mux.Vars(r)["id"], 10, 64)
if err != nil {
-handleError(fmt.Errorf("parsing job ID failed: %w", err), http.StatusBadRequest, rw)
+http.Error(rw, err.Error(), http.StatusBadRequest)
return
}

-job, err := api.JobRepository.FindByID(r.Context(), id)
+job, err := api.JobRepository.FindById(r.Context(), id)
if err != nil {
-handleError(fmt.Errorf("finding job failed: %w", err), http.StatusNotFound, rw)
+http.Error(rw, err.Error(), http.StatusNotFound)
return
}

job.Tags, err = api.JobRepository.GetTags(repository.GetUserFromContext(r.Context()), job.ID)
if err != nil {
-handleError(fmt.Errorf("getting tags failed: %w", err), http.StatusInternalServerError, rw)
+http.Error(rw, err.Error(), http.StatusInternalServerError)
return
}

-var req TagJobAPIRequest
+var req TagJobApiRequest
if err := decode(r.Body, &req); err != nil {
-handleError(fmt.Errorf("decoding request failed: %w", err), http.StatusBadRequest, rw)
+http.Error(rw, err.Error(), http.StatusBadRequest)
return
}

for _, tag := range req {
-tagID, err := api.JobRepository.AddTagOrCreate(repository.GetUserFromContext(r.Context()), *job.ID, tag.Type, tag.Name, tag.Scope)
+tagId, err := api.JobRepository.AddTagOrCreate(repository.GetUserFromContext(r.Context()), *job.ID, tag.Type, tag.Name, tag.Scope)
if err != nil {
-handleError(fmt.Errorf("adding tag failed: %w", err), http.StatusInternalServerError, rw)
+http.Error(rw, err.Error(), http.StatusInternalServerError)
return
}

job.Tags = append(job.Tags, &schema.Tag{
-ID: tagID,
+ID: tagId,
Type: tag.Type,
Name: tag.Name,
Scope: tag.Scope,
@@ -528,9 +573,7 @@ func (api *RestAPI) tagJob(rw http.ResponseWriter, r *http.Request) {
rw.Header().Add("Content-Type", "application/json")
rw.WriteHeader(http.StatusOK)
-if err := json.NewEncoder(rw).Encode(job); err != nil {
-cclog.Errorf("Failed to encode job response: %v", err)
-}
+json.NewEncoder(rw).Encode(job)
}

// removeTagJob godoc
@@ -550,28 +593,28 @@ func (api *RestAPI) tagJob(rw http.ResponseWriter, r *http.Request) {
// @failure 500 {object} api.ErrorResponse "Internal Server Error"
// @security ApiKeyAuth
// @router /jobs/tag_job/{id} [delete]
-func (api *RestAPI) removeTagJob(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) removeTagJob(rw http.ResponseWriter, r *http.Request) {
id, err := strconv.ParseInt(mux.Vars(r)["id"], 10, 64)
if err != nil {
-handleError(fmt.Errorf("parsing job ID failed: %w", err), http.StatusBadRequest, rw)
+http.Error(rw, err.Error(), http.StatusBadRequest)
return
}

-job, err := api.JobRepository.FindByID(r.Context(), id)
+job, err := api.JobRepository.FindById(r.Context(), id)
if err != nil {
-handleError(fmt.Errorf("finding job failed: %w", err), http.StatusNotFound, rw)
+http.Error(rw, err.Error(), http.StatusNotFound)
return
}

job.Tags, err = api.JobRepository.GetTags(repository.GetUserFromContext(r.Context()), job.ID)
if err != nil {
-handleError(fmt.Errorf("getting tags failed: %w", err), http.StatusInternalServerError, rw)
+http.Error(rw, err.Error(), http.StatusInternalServerError)
return
}

-var req TagJobAPIRequest
+var req TagJobApiRequest
if err := decode(r.Body, &req); err != nil {
-handleError(fmt.Errorf("decoding request failed: %w", err), http.StatusBadRequest, rw)
+http.Error(rw, err.Error(), http.StatusBadRequest)
return
}
@@ -584,7 +627,7 @@ func (api *RestAPI) removeTagJob(rw http.ResponseWriter, r *http.Request) {
remainingTags, err := api.JobRepository.RemoveJobTagByRequest(repository.GetUserFromContext(r.Context()), *job.ID, rtag.Type, rtag.Name, rtag.Scope)
if err != nil {
-handleError(fmt.Errorf("removing tag failed: %w", err), http.StatusInternalServerError, rw)
+http.Error(rw, err.Error(), http.StatusInternalServerError)
return
}
@@ -593,9 +636,7 @@ func (api *RestAPI) removeTagJob(rw http.ResponseWriter, r *http.Request) {
rw.Header().Add("Content-Type", "application/json")
rw.WriteHeader(http.StatusOK)
-if err := json.NewEncoder(rw).Encode(job); err != nil {
-cclog.Errorf("Failed to encode job response: %v", err)
-}
+json.NewEncoder(rw).Encode(job)
}

// removeTags godoc
@@ -614,10 +655,10 @@ func (api *RestAPI) removeTagJob(rw http.ResponseWriter, r *http.Request) {
// @failure 500 {object} api.ErrorResponse "Internal Server Error"
// @security ApiKeyAuth
// @router /tags/ [delete]
-func (api *RestAPI) removeTags(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) removeTags(rw http.ResponseWriter, r *http.Request) {
-var req TagJobAPIRequest
+var req TagJobApiRequest
if err := decode(r.Body, &req); err != nil {
-handleError(fmt.Errorf("decoding request failed: %w", err), http.StatusBadRequest, rw)
+http.Error(rw, err.Error(), http.StatusBadRequest)
return
}
@@ -632,10 +673,11 @@ func (api *RestAPI) removeTags(rw http.ResponseWriter, r *http.Request) {
err := api.JobRepository.RemoveTagByRequest(rtag.Type, rtag.Name, rtag.Scope)
if err != nil {
-handleError(fmt.Errorf("removing tag failed: %w", err), http.StatusInternalServerError, rw)
+http.Error(rw, err.Error(), http.StatusInternalServerError)
return
-}
-currentCount++
+} else {
+currentCount++
+}
}

rw.WriteHeader(http.StatusOK)
@@ -658,9 +700,9 @@ func (api *RestAPI) removeTags(rw http.ResponseWriter, r *http.Request) {
// @failure 500 {object} api.ErrorResponse "Internal Server Error"
// @security ApiKeyAuth
// @router /api/jobs/start_job/ [post]
-func (api *RestAPI) startJob(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) startJob(rw http.ResponseWriter, r *http.Request) {
req := schema.Job{
-Shared: "none",
+Exclusive: 1,
MonitoringStatus: schema.MonitoringStatusRunningOrArchiving,
}
if err := decode(r.Body, &req); err != nil {
@@ -668,7 +710,7 @@ func (api *RestAPI) startJob(rw http.ResponseWriter, r *http.Request) {
return
}

-cclog.Debugf("REST: %s", req.GoString())
+cclog.Printf("REST: %s\n", req.GoString())
req.State = schema.JobStateRunning

if err := importer.SanityChecks(&req); err != nil {
@@ -686,11 +728,9 @@ func (api *RestAPI) startJob(rw http.ResponseWriter, r *http.Request) {
if err != nil && err != sql.ErrNoRows {
handleError(fmt.Errorf("checking for duplicate failed: %w", err), http.StatusInternalServerError, rw)
return
-}
-
-if err == nil {
+} else if err == nil {
for _, job := range jobs {
-// Check if jobs are within the same day (prevent duplicates)
-if (req.StartTime - job.StartTime) < secondsPerDay {
+if (req.StartTime - job.StartTime) < 86400 {
handleError(fmt.Errorf("a job with that jobId, cluster and startTime already exists: dbid: %d, jobid: %d", job.ID, job.JobID), http.StatusUnprocessableEntity, rw)
return
}
@@ -707,19 +747,18 @@ func (api *RestAPI) startJob(rw http.ResponseWriter, r *http.Request) {
for _, tag := range req.Tags {
if _, err := api.JobRepository.AddTagOrCreate(repository.GetUserFromContext(r.Context()), id, tag.Type, tag.Name, tag.Scope); err != nil {
+http.Error(rw, err.Error(), http.StatusInternalServerError)
handleError(fmt.Errorf("adding tag to new job %d failed: %w", id, err), http.StatusInternalServerError, rw)
return
}
}

-cclog.Infof("new job (id: %d): cluster=%s, jobId=%d, user=%s, startTime=%d", id, req.Cluster, req.JobID, req.User, req.StartTime)
+cclog.Printf("new job (id: %d): cluster=%s, jobId=%d, user=%s, startTime=%d", id, req.Cluster, req.JobID, req.User, req.StartTime)
rw.Header().Add("Content-Type", "application/json")
rw.WriteHeader(http.StatusCreated)
-if err := json.NewEncoder(rw).Encode(DefaultAPIResponse{
+json.NewEncoder(rw).Encode(DefaultApiResponse{
Message: "success",
-}); err != nil {
-cclog.Errorf("Failed to encode response: %v", err)
-}
+})
}
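Both variants of the duplicate check above implement the same rule: a start request is rejected if a job with the same jobId and cluster started less than one day earlier, whether that day is spelled as the secondsPerDay constant or the 86400 literal. A standalone sketch of the predicate:

// isLikelyDuplicate reports whether a new start request is too close to an
// existing job record; the caller has already matched jobId and cluster.
func isLikelyDuplicate(reqStartTime, existingStartTime int64) bool {
	return reqStartTime-existingStartTime < 86400 // within one day
}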
// stopJobByRequest godoc
@@ -738,9 +777,9 @@ func (api *RestAPI) startJob(rw http.ResponseWriter, r *http.Request) {
// @failure 500 {object} api.ErrorResponse "Internal Server Error"
// @security ApiKeyAuth
// @router /api/jobs/stop_job/ [post]
-func (api *RestAPI) stopJobByRequest(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) stopJobByRequest(rw http.ResponseWriter, r *http.Request) {
// Parse request body
-req := StopJobAPIRequest{}
+req := StopJobApiRequest{}
if err := decode(r.Body, &req); err != nil {
handleError(fmt.Errorf("parsing request body failed: %w", err), http.StatusBadRequest, rw)
return
@@ -749,28 +788,26 @@ func (api *RestAPI) stopJobByRequest(rw http.ResponseWriter, r *http.Request) {
// Fetch job (that will be stopped) from db
var job *schema.Job
var err error
-if req.JobID == nil {
+if req.JobId == nil {
handleError(errors.New("the field 'jobId' is required"), http.StatusBadRequest, rw)
return
}

// cclog.Printf("loading db job for stopJobByRequest... : stopJobApiRequest=%v", req)
-job, err = api.JobRepository.Find(req.JobID, req.Cluster, req.StartTime)
+job, err = api.JobRepository.Find(req.JobId, req.Cluster, req.StartTime)
if err != nil {
-// Try cached jobs if not found in main repository
-cachedJob, cachedErr := api.JobRepository.FindCached(req.JobID, req.Cluster, req.StartTime)
-if cachedErr != nil {
-// Combine both errors for better debugging
-handleError(fmt.Errorf("finding job failed: %w (cached lookup also failed: %v)", err, cachedErr), http.StatusNotFound, rw)
-return
-}
-job = cachedJob
+job, err = api.JobRepository.FindCached(req.JobId, req.Cluster, req.StartTime)
+// FIXME: Previous error is hidden
+if err != nil {
+handleError(fmt.Errorf("finding job failed: %w", err), http.StatusUnprocessableEntity, rw)
+return
+}
}

api.checkAndHandleStopJob(rw, job, req)
}

-// deleteJobByID godoc
+// deleteJobById godoc
// @summary Remove a job from the sql database
// @tags Job remove
// @description Job to remove is specified by database ID. This will not remove the job from the job archive.
@@ -785,7 +822,7 @@ func (api *RestAPI) stopJobByRequest(rw http.ResponseWriter, r *http.Request) {
// @failure 500 {object} api.ErrorResponse "Internal Server Error"
// @security ApiKeyAuth
// @router /api/jobs/delete_job/{id} [delete]
-func (api *RestAPI) deleteJobByID(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) deleteJobById(rw http.ResponseWriter, r *http.Request) {
// Fetch job (that will be stopped) from db
id, ok := mux.Vars(r)["id"]
var err error
@@ -796,7 +833,7 @@ func (api *RestAPI) deleteJobByID(rw http.ResponseWriter, r *http.Request) {
return
}

-err = api.JobRepository.DeleteJobByID(id)
+err = api.JobRepository.DeleteJobById(id)
} else {
handleError(errors.New("the parameter 'id' is required"), http.StatusBadRequest, rw)
return
@@ -807,11 +844,9 @@ func (api *RestAPI) deleteJobByID(rw http.ResponseWriter, r *http.Request) {
}
rw.Header().Add("Content-Type", "application/json")
rw.WriteHeader(http.StatusOK)
-if err := json.NewEncoder(rw).Encode(DefaultAPIResponse{
+json.NewEncoder(rw).Encode(DefaultApiResponse{
Message: fmt.Sprintf("Successfully deleted job %s", id),
-}); err != nil {
-cclog.Errorf("Failed to encode response: %v", err)
-}
+})
}

// deleteJobByRequest godoc
@@ -830,9 +865,9 @@ func (api *RestAPI) deleteJobByID(rw http.ResponseWriter, r *http.Request) {
// @failure 500 {object} api.ErrorResponse "Internal Server Error"
// @security ApiKeyAuth
// @router /api/jobs/delete_job/ [delete]
-func (api *RestAPI) deleteJobByRequest(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) deleteJobByRequest(rw http.ResponseWriter, r *http.Request) {
// Parse request body
-req := DeleteJobAPIRequest{}
+req := DeleteJobApiRequest{}
if err := decode(r.Body, &req); err != nil {
handleError(fmt.Errorf("parsing request body failed: %w", err), http.StatusBadRequest, rw)
return
@@ -841,18 +876,18 @@ func (api *RestAPI) deleteJobByRequest(rw http.ResponseWriter, r *http.Request)
// Fetch job (that will be deleted) from db
var job *schema.Job
var err error
-if req.JobID == nil {
+if req.JobId == nil {
handleError(errors.New("the field 'jobId' is required"), http.StatusBadRequest, rw)
return
}

-job, err = api.JobRepository.Find(req.JobID, req.Cluster, req.StartTime)
+job, err = api.JobRepository.Find(req.JobId, req.Cluster, req.StartTime)
if err != nil {
handleError(fmt.Errorf("finding job failed: %w", err), http.StatusUnprocessableEntity, rw)
return
}

-err = api.JobRepository.DeleteJobByID(*job.ID)
+err = api.JobRepository.DeleteJobById(*job.ID)
if err != nil {
handleError(fmt.Errorf("deleting job failed: %w", err), http.StatusUnprocessableEntity, rw)
return
@@ -860,11 +895,9 @@ func (api *RestAPI) deleteJobByRequest(rw http.ResponseWriter, r *http.Request)
rw.Header().Add("Content-Type", "application/json")
rw.WriteHeader(http.StatusOK)
-if err := json.NewEncoder(rw).Encode(DefaultAPIResponse{
+json.NewEncoder(rw).Encode(DefaultApiResponse{
Message: fmt.Sprintf("Successfully deleted job %d", job.ID),
-}); err != nil {
-cclog.Errorf("Failed to encode response: %v", err)
-}
+})
}

// deleteJobBefore godoc
@@ -882,8 +915,7 @@ func (api *RestAPI) deleteJobByRequest(rw http.ResponseWriter, r *http.Request)
// @failure 500 {object} api.ErrorResponse "Internal Server Error"
// @security ApiKeyAuth
// @router /api/jobs/delete_job_before/{ts} [delete]
-// @param omit-tagged query bool false "Omit jobs with tags from deletion"
-func (api *RestAPI) deleteJobBefore(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) deleteJobBefore(rw http.ResponseWriter, r *http.Request) {
var cnt int
// Fetch job (that will be stopped) from db
id, ok := mux.Vars(r)["ts"]
@@ -895,17 +927,7 @@ func (api *RestAPI) deleteJobBefore(rw http.ResponseWriter, r *http.Request) {
return
}

-// Check for omit-tagged query parameter
-omitTagged := false
-if omitTaggedStr := r.URL.Query().Get("omit-tagged"); omitTaggedStr != "" {
-omitTagged, e = strconv.ParseBool(omitTaggedStr)
-if e != nil {
-handleError(fmt.Errorf("boolean expected for omit-tagged parameter: %w", e), http.StatusBadRequest, rw)
-return
-}
-}
-
-cnt, err = api.JobRepository.DeleteJobsBefore(ts, omitTagged)
+cnt, err = api.JobRepository.DeleteJobsBefore(ts)
} else {
handleError(errors.New("the parameter 'ts' is required"), http.StatusBadRequest, rw)
return
@@ -917,21 +939,19 @@ func (api *RestAPI) deleteJobBefore(rw http.ResponseWriter, r *http.Request) {
rw.Header().Add("Content-Type", "application/json")
rw.WriteHeader(http.StatusOK)
-if err := json.NewEncoder(rw).Encode(DefaultAPIResponse{
+json.NewEncoder(rw).Encode(DefaultApiResponse{
Message: fmt.Sprintf("Successfully deleted %d jobs", cnt),
-}); err != nil {
-cclog.Errorf("Failed to encode response: %v", err)
-}
+})
}

-func (api *RestAPI) checkAndHandleStopJob(rw http.ResponseWriter, job *schema.Job, req StopJobAPIRequest) {
+func (api *RestApi) checkAndHandleStopJob(rw http.ResponseWriter, job *schema.Job, req StopJobApiRequest) {
// Sanity checks
if job.State != schema.JobStateRunning {
handleError(fmt.Errorf("jobId %d (id %d) on %s : job has already been stopped (state is: %s)", job.JobID, job.ID, job.Cluster, job.State), http.StatusUnprocessableEntity, rw)
return
}

-if job.StartTime > req.StopTime {
+if job == nil || job.StartTime > req.StopTime {
handleError(fmt.Errorf("jobId %d (id %d) on %s : stopTime %d must be larger/equal than startTime %d", job.JobID, job.ID, job.Cluster, req.StopTime, job.StartTime), http.StatusBadRequest, rw)
return
}
@@ -947,25 +967,23 @@ func (api *RestAPI) checkAndHandleStopJob(rw http.ResponseWriter, job *schema.Jo
job.Duration = int32(req.StopTime - job.StartTime)
job.State = req.State
api.JobRepository.Mutex.Lock()
-defer api.JobRepository.Mutex.Unlock()
if err := api.JobRepository.Stop(*job.ID, job.Duration, job.State, job.MonitoringStatus); err != nil {
if err := api.JobRepository.StopCached(*job.ID, job.Duration, job.State, job.MonitoringStatus); err != nil {
+api.JobRepository.Mutex.Unlock()
handleError(fmt.Errorf("jobId %d (id %d) on %s : marking job as '%s' (duration: %d) in DB failed: %w", job.JobID, job.ID, job.Cluster, job.State, job.Duration, err), http.StatusInternalServerError, rw)
return
}
}
+api.JobRepository.Mutex.Unlock()

-cclog.Infof("archiving job... (dbid: %d): cluster=%s, jobId=%d, user=%s, startTime=%d, duration=%d, state=%s", job.ID, job.Cluster, job.JobID, job.User, job.StartTime, job.Duration, job.State)
+cclog.Printf("archiving job... (dbid: %d): cluster=%s, jobId=%d, user=%s, startTime=%d, duration=%d, state=%s", job.ID, job.Cluster, job.JobID, job.User, job.StartTime, job.Duration, job.State)

// Send a response (with status OK). This means that errors that happen from here on forward
// can *NOT* be communicated to the client. If reading from a MetricDataRepository or
// writing to the filesystem fails, the client will not know.
rw.Header().Add("Content-Type", "application/json")
rw.WriteHeader(http.StatusOK)
-if err := json.NewEncoder(rw).Encode(job); err != nil {
-cclog.Errorf("Failed to encode job response: %v", err)
-}
+json.NewEncoder(rw).Encode(job)

// Monitoring is disabled...
if job.MonitoringStatus == schema.MonitoringStatusDisabled {
@@ -976,14 +994,14 @@ func (api *RestAPI) checkAndHandleStopJob(rw http.ResponseWriter, job *schema.Jo
archiver.TriggerArchiving(job)
}
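The two sides differ in how the repository mutex is released: the deferred Unlock holds the lock until the handler returns, while the explicit Unlock releases it before TriggerArchiving starts the slow archiving path. A generic sketch of the explicit pattern, with placeholder function names rather than cc-backend APIs:

import "sync"

// stopAndArchive persists the state change under the lock, then releases it
// before starting long-running background work so other requests are not blocked.
func stopAndArchive(mu *sync.Mutex, markStopped func() error, archive func()) error {
	mu.Lock()
	if err := markStopped(); err != nil {
		mu.Unlock()
		return err
	}
	mu.Unlock()
	go archive() // must not run while holding the repository lock
	return nil
}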
-func (api *RestAPI) getJobMetrics(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) getJobMetrics(rw http.ResponseWriter, r *http.Request) {
id := mux.Vars(r)["id"]
metrics := r.URL.Query()["metric"]
var scopes []schema.MetricScope
for _, scope := range r.URL.Query()["scope"] {
var s schema.MetricScope
if err := s.UnmarshalGQL(scope); err != nil {
-handleError(fmt.Errorf("unmarshaling scope failed: %w", err), http.StatusBadRequest, rw)
+http.Error(rw, err.Error(), http.StatusBadRequest)
return
}
scopes = append(scopes, s)
@@ -992,7 +1010,7 @@ func (api *RestAPI) getJobMetrics(rw http.ResponseWriter, r *http.Request) {
rw.Header().Add("Content-Type", "application/json")
rw.WriteHeader(http.StatusOK)

-type Response struct {
+type Respone struct {
Data *struct {
JobMetrics []*model.JobMetricWithName `json:"jobMetrics"`
} `json:"data"`
@@ -1004,21 +1022,17 @@ func (api *RestAPI) getJobMetrics(rw http.ResponseWriter, r *http.Request) {
resolver := graph.GetResolverInstance()
data, err := resolver.Query().JobMetrics(r.Context(), id, metrics, scopes, nil)
if err != nil {
-if err := json.NewEncoder(rw).Encode(Response{
+json.NewEncoder(rw).Encode(Respone{
Error: &struct {
-Message string `json:"message"`
+Message string "json:\"message\""
}{Message: err.Error()},
-}); err != nil {
-cclog.Errorf("Failed to encode error response: %v", err)
-}
+})
return
}

-if err := json.NewEncoder(rw).Encode(Response{
+json.NewEncoder(rw).Encode(Respone{
Data: &struct {
-JobMetrics []*model.JobMetricWithName `json:"jobMetrics"`
+JobMetrics []*model.JobMetricWithName "json:\"jobMetrics\""
}{JobMetrics: data},
-}); err != nil {
-cclog.Errorf("Failed to encode response: %v", err)
-}
+})
}
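On success this handler emits a GraphQL-style envelope around the resolver output; a response body could look roughly like the following sketch. The metric name and nesting are illustrative, and the exact shape follows model.JobMetricWithName:

// Hypothetical response body from getJobMetrics (shape only, values made up).
const exampleMetricsEnvelope = `{
    "data": {
        "jobMetrics": [
            { "name": "flops_any", "scope": "node", "metric": {} }
        ]
    }
}`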

View File

@@ -1,170 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package api
import (
"bufio"
"encoding/json"
"errors"
"fmt"
"io"
"net/http"
"strconv"
"strings"
"github.com/ClusterCockpit/cc-backend/internal/memorystore"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/influxdata/line-protocol/v2/lineprotocol"
)
// handleFree godoc
// @summary
// @tags free
// @description This endpoint allows the users to free the Buffers from the
// metric store. This endpoint offers the users to remove then systematically
// and also allows then to prune the data under node, if they do not want to
// remove the whole node.
// @produce json
// @param to query string false "up to timestamp"
// @success 200 {string} string "ok"
// @failure 400 {object} api.ErrorResponse "Bad Request"
// @failure 401 {object} api.ErrorResponse "Unauthorized"
// @failure 403 {object} api.ErrorResponse "Forbidden"
// @failure 500 {object} api.ErrorResponse "Internal Server Error"
// @security ApiKeyAuth
// @router /free/ [post]
func freeMetrics(rw http.ResponseWriter, r *http.Request) {
rawTo := r.URL.Query().Get("to")
if rawTo == "" {
handleError(errors.New("'to' is a required query parameter"), http.StatusBadRequest, rw)
return
}
to, err := strconv.ParseInt(rawTo, 10, 64)
if err != nil {
handleError(err, http.StatusInternalServerError, rw)
return
}
bodyDec := json.NewDecoder(r.Body)
var selectors [][]string
err = bodyDec.Decode(&selectors)
if err != nil {
http.Error(rw, err.Error(), http.StatusBadRequest)
return
}
ms := memorystore.GetMemoryStore()
n := 0
for _, sel := range selectors {
bn, err := ms.Free(sel, to)
if err != nil {
handleError(err, http.StatusInternalServerError, rw)
return
}
n += bn
}
rw.WriteHeader(http.StatusOK)
fmt.Fprintf(rw, "buffers freed: %d\n", n)
}
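A call sketch for this endpoint: the body is a JSON array of selectors, each a string path such as [cluster, host], and 'to' bounds the timestamps to free. The host names, port, and mount path below are illustrative:

package main

import (
	"fmt"
	"net/http"
	"strings"
)

func main() {
	selectors := `[["fritz","f0101"],["fritz","f0102"]]`
	resp, err := http.Post("http://localhost:8080/free/?to=1700000000",
		"application/json", strings.NewReader(selectors))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status) // on success the body reports "buffers freed: N"
}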
// handleWrite godoc
// @summary Receive metrics in InfluxDB line-protocol
// @tags write
// @description Write data to the in-memory store in the InfluxDB line-protocol using [this format](https://github.com/ClusterCockpit/cc-specifications/blob/master/metrics/lineprotocol_alternative.md)
// @accept plain
// @produce json
// @param cluster query string false "If the lines in the body do not have a cluster tag, use this value instead."
// @success 200 {string} string "ok"
// @failure 400 {object} api.ErrorResponse "Bad Request"
// @failure 401 {object} api.ErrorResponse "Unauthorized"
// @failure 403 {object} api.ErrorResponse "Forbidden"
// @failure 500 {object} api.ErrorResponse "Internal Server Error"
// @security ApiKeyAuth
// @router /write/ [post]
func writeMetrics(rw http.ResponseWriter, r *http.Request) {
bytes, err := io.ReadAll(r.Body)
rw.Header().Add("Content-Type", "application/json")
if err != nil {
handleError(err, http.StatusInternalServerError, rw)
return
}
ms := memorystore.GetMemoryStore()
dec := lineprotocol.NewDecoderWithBytes(bytes)
if err := memorystore.DecodeLine(dec, ms, r.URL.Query().Get("cluster")); err != nil {
cclog.Errorf("/api/write error: %s", err.Error())
handleError(err, http.StatusBadRequest, rw)
return
}
rw.WriteHeader(http.StatusOK)
}
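A matching write sketch; the metric name, tags, and sample value are made up, and the exact tag conventions come from the lineprotocol_alternative.md document linked above:

package main

import (
	"fmt"
	"net/http"
	"strings"
)

func main() {
	// One illustrative sample in InfluxDB line protocol:
	// <metric>,hostname=<host>,type=<scope> value=<float> <unix-ts>
	line := "cpu_load,hostname=f0101,type=node value=0.97 1700000000\n"
	resp, err := http.Post("http://localhost:8080/write/?cluster=fritz",
		"text/plain", strings.NewReader(line))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println(resp.Status)
}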
// handleDebug godoc
// @summary Debug endpoint
// @tags debug
// @description This endpoint allows the users to print the content of
// nodes/clusters/metrics to review the state of the data.
// @produce json
// @param selector query string false "Selector"
// @success 200 {string} string "Debug dump"
// @failure 400 {object} api.ErrorResponse "Bad Request"
// @failure 401 {object} api.ErrorResponse "Unauthorized"
// @failure 403 {object} api.ErrorResponse "Forbidden"
// @failure 500 {object} api.ErrorResponse "Internal Server Error"
// @security ApiKeyAuth
// @router /debug/ [post]
func debugMetrics(rw http.ResponseWriter, r *http.Request) {
raw := r.URL.Query().Get("selector")
rw.Header().Add("Content-Type", "application/json")
selector := []string{}
if len(raw) != 0 {
selector = strings.Split(raw, ":")
}
ms := memorystore.GetMemoryStore()
if err := ms.DebugDump(bufio.NewWriter(rw), selector); err != nil {
handleError(err, http.StatusBadRequest, rw)
return
}
}
// handleHealthCheck godoc
// @summary HealthCheck endpoint
// @tags healthcheck
// @description This endpoint allows the users to check if a node is healthy
// @produce json
// @param selector query string false "Selector"
// @success 200 {string} string "Debug dump"
// @failure 400 {object} api.ErrorResponse "Bad Request"
// @failure 401 {object} api.ErrorResponse "Unauthorized"
// @failure 403 {object} api.ErrorResponse "Forbidden"
// @failure 500 {object} api.ErrorResponse "Internal Server Error"
// @security ApiKeyAuth
// @router /healthcheck/ [get]
func metricsHealth(rw http.ResponseWriter, r *http.Request) {
rawCluster := r.URL.Query().Get("cluster")
rawNode := r.URL.Query().Get("node")
if rawCluster == "" || rawNode == "" {
handleError(errors.New("'cluster' and 'node' are required query parameter"), http.StatusBadRequest, rw)
return
}
rw.Header().Add("Content-Type", "application/json")
selector := []string{rawCluster, rawNode}
ms := memorystore.GetMemoryStore()
if err := ms.HealthCheck(bufio.NewWriter(rw), selector); err != nil {
handleError(err, http.StatusBadRequest, rw)
return
}
}
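Both inspection endpoints take their target from query parameters: the debug dump splits a colon-separated selector, while the health check addresses exactly one cluster/node pair. A combined client sketch with illustrative cluster and node names:

package main

import (
	"fmt"
	"net/http"
	"strings"
)

func main() {
	// Debug dump: the selector is split on ":", e.g. cluster "fritz", node "f0101".
	dbg, err := http.Post("http://localhost:8080/debug/?selector=fritz:f0101",
		"application/json", strings.NewReader(""))
	if err != nil {
		panic(err)
	}
	dbg.Body.Close()

	// Health check: cluster and node are required query parameters.
	hc, err := http.Get("http://localhost:8080/healthcheck/?cluster=fritz&node=f0101")
	if err != nil {
		panic(err)
	}
	defer hc.Body.Close()
	fmt.Println(dbg.Status, hc.Status)
}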

View File

@@ -1,231 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package api

import (
	"bytes"
	"database/sql"
	"encoding/json"
	"sync"
	"time"

	"github.com/ClusterCockpit/cc-backend/internal/archiver"
	"github.com/ClusterCockpit/cc-backend/internal/config"
	"github.com/ClusterCockpit/cc-backend/internal/importer"
	"github.com/ClusterCockpit/cc-backend/internal/repository"
	"github.com/ClusterCockpit/cc-backend/pkg/nats"
	cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
	"github.com/ClusterCockpit/cc-lib/schema"
)

// NatsAPI provides NATS subscription-based handlers for Job and Node operations.
// It mirrors the functionality of the REST API but uses NATS messaging.
type NatsAPI struct {
	JobRepository *repository.JobRepository
	// RepositoryMutex protects job creation operations from race conditions
	// when checking for duplicate jobs during startJob calls.
	RepositoryMutex sync.Mutex
}

// NewNatsAPI creates a new NatsAPI instance with default dependencies.
func NewNatsAPI() *NatsAPI {
	return &NatsAPI{
		JobRepository: repository.GetJobRepository(),
	}
}

// StartSubscriptions registers all NATS subscriptions for Job and Node APIs.
// Returns an error if the NATS client is not available or subscription fails.
func (api *NatsAPI) StartSubscriptions() error {
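	// The subjects come from the APISubjects config section. A hypothetical
	// JSON sketch (the Go field names SubjectJobStart/SubjectJobStop/
	// SubjectNodeState are taken from this file; the JSON keys and subject
	// values are assumptions):
	//
	//	"apiSubjects": {
	//	    "subjectJobStart": "cc.job.start",
	//	    "subjectJobStop":  "cc.job.stop",
	//	    "subjectNodeState": "cc.node.state"
	//	}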
	client := nats.GetClient()
	if client == nil {
		cclog.Warn("NATS client not available, skipping API subscriptions")
		return nil
	}

	if config.Keys.APISubjects != nil {
		s := config.Keys.APISubjects
		if err := client.Subscribe(s.SubjectJobStart, api.handleStartJob); err != nil {
			return err
		}
		if err := client.Subscribe(s.SubjectJobStop, api.handleStopJob); err != nil {
			return err
		}
		if err := client.Subscribe(s.SubjectNodeState, api.handleNodeState); err != nil {
			return err
		}
		cclog.Info("NATS API subscriptions started")
	}
	return nil
}

// handleStartJob processes job start messages received via NATS.
// Expected JSON payload follows the schema.Job structure.
func (api *NatsAPI) handleStartJob(subject string, data []byte) {
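	// Example message (a sketch; the payload mirrors schema.Job, so the
	// exact required fields are defined in cc-lib, and the subject name is
	// configuration-dependent):
	//
	//	nats pub cc.job.start '{"jobId": 123, "cluster": "fritz",
	//	  "user": "hpcuser1", "numNodes": 1, "startTime": 1725000000,
	//	  "resources": [{"hostname": "node001"}]}'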
	req := schema.Job{
		Shared:           "none",
		MonitoringStatus: schema.MonitoringStatusRunningOrArchiving,
	}
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&req); err != nil {
		cclog.Errorf("NATS %s: parsing request failed: %v", subject, err)
		return
	}

	cclog.Debugf("NATS %s: %s", subject, req.GoString())
	req.State = schema.JobStateRunning

	if err := importer.SanityChecks(&req); err != nil {
		cclog.Errorf("NATS %s: sanity check failed: %v", subject, err)
		return
	}

	var unlockOnce sync.Once
	api.RepositoryMutex.Lock()
	defer unlockOnce.Do(api.RepositoryMutex.Unlock)

	jobs, err := api.JobRepository.FindAll(&req.JobID, &req.Cluster, nil)
	if err != nil && err != sql.ErrNoRows {
		cclog.Errorf("NATS %s: checking for duplicate failed: %v", subject, err)
		return
	}
	if err == nil {
		for _, job := range jobs {
			if (req.StartTime - job.StartTime) < secondsPerDay {
				cclog.Errorf("NATS %s: job with jobId %d, cluster %s already exists (dbid: %d)",
					subject, req.JobID, req.Cluster, job.ID)
				return
			}
		}
	}

	id, err := api.JobRepository.Start(&req)
	if err != nil {
		cclog.Errorf("NATS %s: insert into database failed: %v", subject, err)
		return
	}
	unlockOnce.Do(api.RepositoryMutex.Unlock)

	for _, tag := range req.Tags {
		if _, err := api.JobRepository.AddTagOrCreate(nil, id, tag.Type, tag.Name, tag.Scope); err != nil {
			cclog.Errorf("NATS %s: adding tag to new job %d failed: %v", subject, id, err)
			return
		}
	}

	cclog.Infof("NATS: new job (id: %d): cluster=%s, jobId=%d, user=%s, startTime=%d",
		id, req.Cluster, req.JobID, req.User, req.StartTime)
}

// handleStopJob processes job stop messages received via NATS.
// Expected JSON payload follows the StopJobAPIRequest structure.
func (api *NatsAPI) handleStopJob(subject string, data []byte) {
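	// Example message (a sketch; the field names assume the JSON tags of
	// StopJobAPIRequest, and the subject name is configuration-dependent):
	//
	//	nats pub cc.job.stop '{"jobId": 123, "cluster": "fritz",
	//	  "startTime": 1725000000, "stopTime": 1725003600, "jobState": "completed"}'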
	var req StopJobAPIRequest
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&req); err != nil {
		cclog.Errorf("NATS %s: parsing request failed: %v", subject, err)
		return
	}

	if req.JobID == nil {
		cclog.Errorf("NATS %s: the field 'jobId' is required", subject)
		return
	}

	job, err := api.JobRepository.Find(req.JobID, req.Cluster, req.StartTime)
	if err != nil {
		cachedJob, cachedErr := api.JobRepository.FindCached(req.JobID, req.Cluster, req.StartTime)
		if cachedErr != nil {
			cclog.Errorf("NATS %s: finding job failed: %v (cached lookup also failed: %v)",
				subject, err, cachedErr)
			return
		}
		job = cachedJob
	}

	if job.State != schema.JobStateRunning {
		cclog.Errorf("NATS %s: jobId %d (id %d) on %s: job has already been stopped (state is: %s)",
			subject, job.JobID, job.ID, job.Cluster, job.State)
		return
	}

	if job.StartTime > req.StopTime {
		cclog.Errorf("NATS %s: jobId %d (id %d) on %s: stopTime %d must be >= startTime %d",
			subject, job.JobID, job.ID, job.Cluster, req.StopTime, job.StartTime)
		return
	}

	if req.State != "" && !req.State.Valid() {
		cclog.Errorf("NATS %s: jobId %d (id %d) on %s: invalid job state: %#v",
			subject, job.JobID, job.ID, job.Cluster, req.State)
		return
	} else if req.State == "" {
		req.State = schema.JobStateCompleted
	}

	job.Duration = int32(req.StopTime - job.StartTime)
	job.State = req.State

	api.JobRepository.Mutex.Lock()
	defer api.JobRepository.Mutex.Unlock()
	if err := api.JobRepository.Stop(*job.ID, job.Duration, job.State, job.MonitoringStatus); err != nil {
		if err := api.JobRepository.StopCached(*job.ID, job.Duration, job.State, job.MonitoringStatus); err != nil {
			cclog.Errorf("NATS %s: jobId %d (id %d) on %s: marking job as '%s' failed: %v",
				subject, job.JobID, job.ID, job.Cluster, job.State, err)
			return
		}
	}

	cclog.Infof("NATS: archiving job (dbid: %d): cluster=%s, jobId=%d, user=%s, startTime=%d, duration=%d, state=%s",
		job.ID, job.Cluster, job.JobID, job.User, job.StartTime, job.Duration, job.State)

	if job.MonitoringStatus == schema.MonitoringStatusDisabled {
		return
	}

	archiver.TriggerArchiving(job)
}

// handleNodeState processes node state update messages received via NATS.
// Expected JSON payload follows the UpdateNodeStatesRequest structure.
func (api *NatsAPI) handleNodeState(subject string, data []byte) {
	var req UpdateNodeStatesRequest
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&req); err != nil {
		cclog.Errorf("NATS %s: parsing request failed: %v", subject, err)
		return
	}

	repo := repository.GetNodeRepository()
	for _, node := range req.Nodes {
		state := determineState(node.States)
		nodeState := schema.NodeStateDB{
			TimeStamp:       time.Now().Unix(),
			NodeState:       state,
			CpusAllocated:   node.CpusAllocated,
			MemoryAllocated: node.MemoryAllocated,
			GpusAllocated:   node.GpusAllocated,
			HealthState:     schema.MonitoringStateFull,
			JobsRunning:     node.JobsRunning,
		}
		repo.UpdateNodeState(node.Hostname, req.Cluster, &nodeState)
	}
	cclog.Debugf("NATS %s: updated %d node states for cluster %s", subject, len(req.Nodes), req.Cluster)
}

View File

@@ -2,26 +2,36 @@
 // All rights reserved. This file is part of cc-backend.
 // Use of this source code is governed by a MIT-style
 // license that can be found in the LICENSE file.
 package api

 import (
 	"fmt"
 	"net/http"
 	"strings"
-	"time"

 	"github.com/ClusterCockpit/cc-backend/internal/repository"
 	"github.com/ClusterCockpit/cc-lib/schema"
 )

+type Node struct {
+	Name            string   `json:"hostname"`
+	States          []string `json:"states"`
+	CpusAllocated   int      `json:"cpusAllocated"`
+	CpusTotal       int      `json:"cpusTotal"`
+	MemoryAllocated int      `json:"memoryAllocated"`
+	MemoryTotal     int      `json:"memoryTotal"`
+	GpusAllocated   int      `json:"gpusAllocated"`
+	GpusTotal       int      `json:"gpusTotal"`
+}
+
+// updateNodeStatesRequest model
 type UpdateNodeStatesRequest struct {
-	Nodes   []schema.NodePayload `json:"nodes"`
+	Nodes   []Node `json:"nodes"`
 	Cluster string `json:"cluster" example:"fritz"`
 }

 // this routine assumes that only one of them exists per node
-func determineState(states []string) schema.SchedulerState {
+func determineState(states []string) schema.NodeState {
 	for _, state := range states {
 		switch strings.ToLower(state) {
 		case "allocated":
@@ -54,27 +64,17 @@ func determineState(states []string) schema.SchedulerState {
 // @failure 500 {object} api.ErrorResponse "Internal Server Error"
 // @security ApiKeyAuth
 // @router /api/nodestats/ [post]
-func (api *RestAPI) updateNodeStates(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) updateNodeStates(rw http.ResponseWriter, r *http.Request) {
 	// Parse request body
 	req := UpdateNodeStatesRequest{}
 	if err := decode(r.Body, &req); err != nil {
-		handleError(fmt.Errorf("parsing request body failed: %w", err),
-			http.StatusBadRequest, rw)
+		handleError(fmt.Errorf("parsing request body failed: %w", err), http.StatusBadRequest, rw)
 		return
 	}

 	repo := repository.GetNodeRepository()
 	for _, node := range req.Nodes {
 		state := determineState(node.States)
-		nodeState := schema.NodeStateDB{
-			TimeStamp:       time.Now().Unix(),
-			NodeState:       state,
-			CpusAllocated:   node.CpusAllocated,
-			MemoryAllocated: node.MemoryAllocated,
-			GpusAllocated:   node.GpusAllocated,
-			HealthState:     schema.MonitoringStateFull,
-			JobsRunning:     node.JobsRunning,
-		}
-		repo.UpdateNodeState(node.Hostname, req.Cluster, &nodeState)
+		repo.UpdateNodeState(node.Name, req.Cluster, &state)
 	}
 }

View File

@@ -2,11 +2,6 @@
 // All rights reserved. This file is part of cc-backend.
 // Use of this source code is governed by a MIT-style
 // license that can be found in the LICENSE file.
-
-// Package api provides the REST API layer for ClusterCockpit.
-// It handles HTTP requests for job management, user administration,
-// cluster queries, node state updates, and metrics storage operations.
-// The API supports both JWT token authentication and session-based authentication.
 package api

 import (
@@ -16,7 +11,6 @@ import (
 	"net/http"
 	"os"
 	"path/filepath"
-	"strings"
 	"sync"

 	"github.com/ClusterCockpit/cc-backend/internal/auth"
@@ -45,31 +39,22 @@
 // @in header
 // @name X-Auth-Token

-const (
-	noticeFilePath  = "./var/notice.txt"
-	noticeFilePerms = 0o644
-)
-
-type RestAPI struct {
+type RestApi struct {
 	JobRepository   *repository.JobRepository
 	Authentication  *auth.Authentication
 	MachineStateDir string
-	// RepositoryMutex protects job creation operations from race conditions
-	// when checking for duplicate jobs during startJob API calls.
-	// It prevents concurrent job starts with the same jobId/cluster/startTime
-	// from creating duplicate entries in the database.
 	RepositoryMutex sync.Mutex
 }

-func New() *RestAPI {
-	return &RestAPI{
+func New() *RestApi {
+	return &RestApi{
 		JobRepository:   repository.GetJobRepository(),
 		MachineStateDir: config.Keys.MachineStateDir,
 		Authentication:  auth.GetAuthInstance(),
 	}
 }

-func (api *RestAPI) MountAPIRoutes(r *mux.Router) {
+func (api *RestApi) MountApiRoutes(r *mux.Router) {
 	r.StrictSlash(true)
 	// REST API Uses TokenAuth
 	// User List
@@ -79,20 +64,19 @@ func (api *RestAPI) MountAPIRoutes(r *mux.Router) {
 	// Slurm node state
 	r.HandleFunc("/nodestate/", api.updateNodeStates).Methods(http.MethodPost, http.MethodPut)
 	// Job Handler
-	if config.Keys.APISubjects == nil {
-		cclog.Info("Enabling REST start/stop job API")
-		r.HandleFunc("/jobs/start_job/", api.startJob).Methods(http.MethodPost, http.MethodPut)
-		r.HandleFunc("/jobs/stop_job/", api.stopJobByRequest).Methods(http.MethodPost, http.MethodPut)
-	}
+	r.HandleFunc("/jobs/start_job/", api.startJob).Methods(http.MethodPost, http.MethodPut)
+	r.HandleFunc("/jobs/stop_job/", api.stopJobByRequest).Methods(http.MethodPost, http.MethodPut)
+	// r.HandleFunc("/jobs/import/", api.importJob).Methods(http.MethodPost, http.MethodPut)
 	r.HandleFunc("/jobs/", api.getJobs).Methods(http.MethodGet)
-	r.HandleFunc("/jobs/{id}", api.getJobByID).Methods(http.MethodPost)
-	r.HandleFunc("/jobs/{id}", api.getCompleteJobByID).Methods(http.MethodGet)
+	r.HandleFunc("/jobs/{id}", api.getJobById).Methods(http.MethodPost)
+	r.HandleFunc("/jobs/{id}", api.getCompleteJobById).Methods(http.MethodGet)
 	r.HandleFunc("/jobs/tag_job/{id}", api.tagJob).Methods(http.MethodPost, http.MethodPatch)
 	r.HandleFunc("/jobs/tag_job/{id}", api.removeTagJob).Methods(http.MethodDelete)
-	r.HandleFunc("/jobs/edit_meta/{id}", api.editMeta).Methods(http.MethodPost, http.MethodPatch)
+	r.HandleFunc("/jobs/edit_meta/", api.editMetaByRequest).Methods(http.MethodPatch)
+	r.HandleFunc("/jobs/edit_meta/{id}", api.editMeta).Methods(http.MethodPatch)
 	r.HandleFunc("/jobs/metrics/{id}", api.getJobMetrics).Methods(http.MethodGet)
 	r.HandleFunc("/jobs/delete_job/", api.deleteJobByRequest).Methods(http.MethodDelete)
-	r.HandleFunc("/jobs/delete_job/{id}", api.deleteJobByID).Methods(http.MethodDelete)
+	r.HandleFunc("/jobs/delete_job/{id}", api.deleteJobById).Methods(http.MethodDelete)
 	r.HandleFunc("/jobs/delete_job_before/{ts}", api.deleteJobBefore).Methods(http.MethodDelete)

 	r.HandleFunc("/tags/", api.removeTags).Methods(http.MethodDelete)
@@ -103,30 +87,16 @@ func (api *RestAPI) MountAPIRoutes(r *mux.Router) {
 	}
 }

-func (api *RestAPI) MountUserAPIRoutes(r *mux.Router) {
+func (api *RestApi) MountUserApiRoutes(r *mux.Router) {
 	r.StrictSlash(true)
-	// REST API Uses TokenAuth
+	// USER REST API Uses TokenAuth
 	r.HandleFunc("/jobs/", api.getJobs).Methods(http.MethodGet)
-	r.HandleFunc("/jobs/{id}", api.getJobByID).Methods(http.MethodPost)
-	r.HandleFunc("/jobs/{id}", api.getCompleteJobByID).Methods(http.MethodGet)
+	r.HandleFunc("/jobs/{id}", api.getJobById).Methods(http.MethodPost)
+	r.HandleFunc("/jobs/{id}", api.getCompleteJobById).Methods(http.MethodGet)
 	r.HandleFunc("/jobs/metrics/{id}", api.getJobMetrics).Methods(http.MethodGet)
 }

-func (api *RestAPI) MountMetricStoreAPIRoutes(r *mux.Router) {
-	// REST API Uses TokenAuth
-	// Note: StrictSlash handles trailing slash variations automatically
-	r.HandleFunc("/api/free", freeMetrics).Methods(http.MethodPost)
-	r.HandleFunc("/api/write", writeMetrics).Methods(http.MethodPost)
-	r.HandleFunc("/api/debug", debugMetrics).Methods(http.MethodGet)
-	r.HandleFunc("/api/healthcheck", metricsHealth).Methods(http.MethodGet)
-	// Same endpoints but with trailing slash
-	r.HandleFunc("/api/free/", freeMetrics).Methods(http.MethodPost)
-	r.HandleFunc("/api/write/", writeMetrics).Methods(http.MethodPost)
-	r.HandleFunc("/api/debug/", debugMetrics).Methods(http.MethodGet)
-	r.HandleFunc("/api/healthcheck/", metricsHealth).Methods(http.MethodGet)
-}
-
-func (api *RestAPI) MountConfigAPIRoutes(r *mux.Router) {
+func (api *RestApi) MountConfigApiRoutes(r *mux.Router) {
 	r.StrictSlash(true)
 	// Settings Frontend Uses SessionAuth
 	if api.Authentication != nil {
@@ -139,7 +109,7 @@ func (api *RestAPI) MountConfigAPIRoutes(r *mux.Router) {
 	}
 }

-func (api *RestAPI) MountFrontendAPIRoutes(r *mux.Router) {
+func (api *RestApi) MountFrontendApiRoutes(r *mux.Router) {
 	r.StrictSlash(true)
 	// Settings Frontend Uses SessionAuth
 	if api.Authentication != nil {
@@ -155,8 +125,8 @@ type ErrorResponse struct {
 	Error  string `json:"error"` // Error Message
 }

-// DefaultAPIResponse model
-type DefaultAPIResponse struct {
+// DefaultApiResponse model
+type DefaultApiResponse struct {
 	Message string `json:"msg"`
 }

@@ -164,12 +134,10 @@ func handleError(err error, statusCode int, rw http.ResponseWriter) {
 	cclog.Warnf("REST ERROR : %s", err.Error())
 	rw.Header().Add("Content-Type", "application/json")
 	rw.WriteHeader(statusCode)
-	if err := json.NewEncoder(rw).Encode(ErrorResponse{
+	json.NewEncoder(rw).Encode(ErrorResponse{
 		Status: http.StatusText(statusCode),
 		Error:  err.Error(),
-	}); err != nil {
-		cclog.Errorf("Failed to encode error response: %v", err)
-	}
+	})
 }

 func decode(r io.Reader, val any) error {
@@ -178,68 +146,69 @@ func decode(r io.Reader, val any) error {
 	return dec.Decode(val)
 }

-func (api *RestAPI) editNotice(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) editNotice(rw http.ResponseWriter, r *http.Request) {
 	// SecuredCheck() only worked with TokenAuth: Removed
 	if user := repository.GetUserFromContext(r.Context()); !user.HasRole(schema.RoleAdmin) {
-		handleError(fmt.Errorf("only admins are allowed to update the notice.txt file"), http.StatusForbidden, rw)
+		http.Error(rw, "Only admins are allowed to update the notice.txt file", http.StatusForbidden)
 		return
 	}

 	// Get Value
 	newContent := r.FormValue("new-content")

-	// Validate content length to prevent DoS
-	if len(newContent) > 10000 {
-		handleError(fmt.Errorf("notice content exceeds maximum length of 10000 characters"), http.StatusBadRequest, rw)
-		return
-	}
-
 	// Check File
-	noticeExists := util.CheckFileExists(noticeFilePath)
+	noticeExists := util.CheckFileExists("./var/notice.txt")
 	if !noticeExists {
-		ntxt, err := os.Create(noticeFilePath)
+		ntxt, err := os.Create("./var/notice.txt")
 		if err != nil {
-			handleError(fmt.Errorf("creating notice file failed: %w", err), http.StatusInternalServerError, rw)
+			cclog.Errorf("Creating ./var/notice.txt failed: %s", err.Error())
+			http.Error(rw, err.Error(), http.StatusUnprocessableEntity)
 			return
 		}
 		ntxt.Close()
 	}

-	if err := os.WriteFile(noticeFilePath, []byte(newContent), noticeFilePerms); err != nil {
-		handleError(fmt.Errorf("writing to notice file failed: %w", err), http.StatusInternalServerError, rw)
-		return
-	}
-
-	rw.Header().Set("Content-Type", "text/plain")
-	rw.WriteHeader(http.StatusOK)
 	if newContent != "" {
-		rw.Write([]byte("Update Notice Content Success"))
+		if err := os.WriteFile("./var/notice.txt", []byte(newContent), 0o666); err != nil {
+			cclog.Errorf("Writing to ./var/notice.txt failed: %s", err.Error())
+			http.Error(rw, err.Error(), http.StatusUnprocessableEntity)
+			return
+		} else {
+			rw.Write([]byte("Update Notice Content Success"))
+		}
 	} else {
-		rw.Write([]byte("Empty Notice Content Success"))
+		if err := os.WriteFile("./var/notice.txt", []byte(""), 0o666); err != nil {
+			cclog.Errorf("Writing to ./var/notice.txt failed: %s", err.Error())
+			http.Error(rw, err.Error(), http.StatusUnprocessableEntity)
+			return
+		} else {
+			rw.Write([]byte("Empty Notice Content Success"))
+		}
 	}
 }

-func (api *RestAPI) getJWT(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) getJWT(rw http.ResponseWriter, r *http.Request) {
 	rw.Header().Set("Content-Type", "text/plain")
 	username := r.FormValue("username")
 	me := repository.GetUserFromContext(r.Context())
 	if !me.HasRole(schema.RoleAdmin) {
 		if username != me.Username {
-			handleError(fmt.Errorf("only admins are allowed to sign JWTs not for themselves"), http.StatusForbidden, rw)
+			http.Error(rw, "Only admins are allowed to sign JWTs not for themselves",
				http.StatusForbidden)
 			return
 		}
 	}

 	user, err := repository.GetUserRepository().GetUser(username)
 	if err != nil {
-		handleError(fmt.Errorf("getting user failed: %w", err), http.StatusNotFound, rw)
+		http.Error(rw, err.Error(), http.StatusUnprocessableEntity)
 		return
 	}

 	jwt, err := api.Authentication.JwtAuth.ProvideJWT(user)
 	if err != nil {
-		handleError(fmt.Errorf("providing JWT failed: %w", err), http.StatusInternalServerError, rw)
+		http.Error(rw, err.Error(), http.StatusUnprocessableEntity)
 		return
 	}

@@ -247,103 +216,75 @@ func (api *RestAPI) getJWT(rw http.ResponseWriter, r *http.Request) {
 	rw.Write([]byte(jwt))
 }

-func (api *RestAPI) getRoles(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) getRoles(rw http.ResponseWriter, r *http.Request) {
 	// SecuredCheck() only worked with TokenAuth: Removed
 	user := repository.GetUserFromContext(r.Context())
 	if !user.HasRole(schema.RoleAdmin) {
-		handleError(fmt.Errorf("only admins are allowed to fetch a list of roles"), http.StatusForbidden, rw)
+		http.Error(rw, "only admins are allowed to fetch a list of roles", http.StatusForbidden)
 		return
 	}

 	roles, err := schema.GetValidRoles(user)
 	if err != nil {
-		handleError(fmt.Errorf("getting valid roles failed: %w", err), http.StatusInternalServerError, rw)
+		http.Error(rw, err.Error(), http.StatusInternalServerError)
 		return
 	}

-	rw.Header().Set("Content-Type", "application/json")
-	if err := json.NewEncoder(rw).Encode(roles); err != nil {
-		cclog.Errorf("Failed to encode roles response: %v", err)
-	}
+	json.NewEncoder(rw).Encode(roles)
 }

-func (api *RestAPI) updateConfiguration(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) updateConfiguration(rw http.ResponseWriter, r *http.Request) {
 	rw.Header().Set("Content-Type", "text/plain")
 	key, value := r.FormValue("key"), r.FormValue("value")

 	if err := repository.GetUserCfgRepo().UpdateConfig(key, value, repository.GetUserFromContext(r.Context())); err != nil {
-		handleError(fmt.Errorf("updating configuration failed: %w", err), http.StatusInternalServerError, rw)
+		http.Error(rw, err.Error(), http.StatusUnprocessableEntity)
 		return
 	}

-	rw.WriteHeader(http.StatusOK)
 	rw.Write([]byte("success"))
 }

-func (api *RestAPI) putMachineState(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) putMachineState(rw http.ResponseWriter, r *http.Request) {
 	if api.MachineStateDir == "" {
-		handleError(fmt.Errorf("machine state not enabled"), http.StatusNotFound, rw)
+		http.Error(rw, "REST > machine state not enabled", http.StatusNotFound)
 		return
 	}

 	vars := mux.Vars(r)
 	cluster := vars["cluster"]
 	host := vars["host"]
-
-	// Validate cluster and host to prevent path traversal attacks
-	if strings.Contains(cluster, "..") || strings.Contains(cluster, "/") || strings.Contains(cluster, "\\") {
-		handleError(fmt.Errorf("invalid cluster name"), http.StatusBadRequest, rw)
-		return
-	}
-	if strings.Contains(host, "..") || strings.Contains(host, "/") || strings.Contains(host, "\\") {
-		handleError(fmt.Errorf("invalid host name"), http.StatusBadRequest, rw)
-		return
-	}
-
 	dir := filepath.Join(api.MachineStateDir, cluster)
-	if err := os.MkdirAll(dir, 0o755); err != nil {
-		handleError(fmt.Errorf("creating directory failed: %w", err), http.StatusInternalServerError, rw)
+	if err := os.MkdirAll(dir, 0755); err != nil {
+		http.Error(rw, err.Error(), http.StatusInternalServerError)
 		return
 	}

 	filename := filepath.Join(dir, fmt.Sprintf("%s.json", host))
 	f, err := os.Create(filename)
 	if err != nil {
-		handleError(fmt.Errorf("creating file failed: %w", err), http.StatusInternalServerError, rw)
+		http.Error(rw, err.Error(), http.StatusInternalServerError)
 		return
 	}
 	defer f.Close()

 	if _, err := io.Copy(f, r.Body); err != nil {
-		handleError(fmt.Errorf("writing file failed: %w", err), http.StatusInternalServerError, rw)
+		http.Error(rw, err.Error(), http.StatusInternalServerError)
 		return
 	}

 	rw.WriteHeader(http.StatusCreated)
 }

-func (api *RestAPI) getMachineState(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) getMachineState(rw http.ResponseWriter, r *http.Request) {
 	if api.MachineStateDir == "" {
-		handleError(fmt.Errorf("machine state not enabled"), http.StatusNotFound, rw)
+		http.Error(rw, "REST > machine state not enabled", http.StatusNotFound)
 		return
 	}

 	vars := mux.Vars(r)
-	cluster := vars["cluster"]
-	host := vars["host"]
-
-	// Validate cluster and host to prevent path traversal attacks
-	if strings.Contains(cluster, "..") || strings.Contains(cluster, "/") || strings.Contains(cluster, "\\") {
-		handleError(fmt.Errorf("invalid cluster name"), http.StatusBadRequest, rw)
-		return
-	}
-	if strings.Contains(host, "..") || strings.Contains(host, "/") || strings.Contains(host, "\\") {
-		handleError(fmt.Errorf("invalid host name"), http.StatusBadRequest, rw)
-		return
-	}
-
-	filename := filepath.Join(api.MachineStateDir, cluster, fmt.Sprintf("%s.json", host))
+	filename := filepath.Join(api.MachineStateDir, vars["cluster"], fmt.Sprintf("%s.json", vars["host"]))

 	// Sets the content-type and 'Last-Modified' Header and so on automatically
 	http.ServeFile(rw, r, filename)

View File

@@ -2,7 +2,6 @@
 // All rights reserved. This file is part of cc-backend.
 // Use of this source code is governed by a MIT-style
 // license that can be found in the LICENSE file.
-
 package api

 import (
@@ -11,12 +10,11 @@ import (
 	"net/http"

 	"github.com/ClusterCockpit/cc-backend/internal/repository"
-	cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
 	"github.com/ClusterCockpit/cc-lib/schema"
 	"github.com/gorilla/mux"
 )

-type APIReturnedUser struct {
+type ApiReturnedUser struct {
 	Username string   `json:"username"`
 	Name     string   `json:"name"`
 	Roles    []string `json:"roles"`
@@ -38,46 +36,28 @@ type APIReturnedUser struct {
 // @failure 500 {string} string "Internal Server Error"
 // @security ApiKeyAuth
 // @router /api/users/ [get]
-func (api *RestAPI) getUsers(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) getUsers(rw http.ResponseWriter, r *http.Request) {
 	// SecuredCheck() only worked with TokenAuth: Removed
 	if user := repository.GetUserFromContext(r.Context()); !user.HasRole(schema.RoleAdmin) {
-		handleError(fmt.Errorf("only admins are allowed to fetch a list of users"), http.StatusForbidden, rw)
+		http.Error(rw, "Only admins are allowed to fetch a list of users", http.StatusForbidden)
 		return
 	}

 	users, err := repository.GetUserRepository().ListUsers(r.URL.Query().Get("not-just-user") == "true")
 	if err != nil {
-		handleError(fmt.Errorf("listing users failed: %w", err), http.StatusInternalServerError, rw)
+		http.Error(rw, err.Error(), http.StatusInternalServerError)
 		return
 	}

-	rw.Header().Set("Content-Type", "application/json")
-	if err := json.NewEncoder(rw).Encode(users); err != nil {
-		cclog.Errorf("Failed to encode users response: %v", err)
-	}
+	json.NewEncoder(rw).Encode(users)
 }

-// updateUser godoc
-// @summary Update user roles and projects
-// @tags User
-// @description Allows admins to add/remove roles and projects for a user
-// @produce plain
-// @param id path string true "Username"
-// @param add-role formData string false "Role to add"
-// @param remove-role formData string false "Role to remove"
-// @param add-project formData string false "Project to add"
-// @param remove-project formData string false "Project to remove"
-// @success 200 {string} string "Success message"
-// @failure 403 {object} api.ErrorResponse "Forbidden"
-// @failure 422 {object} api.ErrorResponse "Unprocessable Entity"
-// @security ApiKeyAuth
-// @router /api/user/{id} [post]
-func (api *RestAPI) updateUser(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) updateUser(rw http.ResponseWriter, r *http.Request) {
 	// SecuredCheck() only worked with TokenAuth: Removed
 	if user := repository.GetUserFromContext(r.Context()); !user.HasRole(schema.RoleAdmin) {
-		handleError(fmt.Errorf("only admins are allowed to update a user"), http.StatusForbidden, rw)
+		http.Error(rw, "Only admins are allowed to update a user", http.StatusForbidden)
 		return
 	}
@@ -87,70 +67,43 @@ func (api *RestAPI) updateUser(rw http.ResponseWriter, r *http.Request) {
 	newproj := r.FormValue("add-project")
 	delproj := r.FormValue("remove-project")

-	rw.Header().Set("Content-Type", "application/json")
-
-	// Handle role updates
+	// TODO: Handle anything but roles...
 	if newrole != "" {
 		if err := repository.GetUserRepository().AddRole(r.Context(), mux.Vars(r)["id"], newrole); err != nil {
-			handleError(fmt.Errorf("adding role failed: %w", err), http.StatusUnprocessableEntity, rw)
+			http.Error(rw, err.Error(), http.StatusUnprocessableEntity)
 			return
 		}
-		if err := json.NewEncoder(rw).Encode(DefaultAPIResponse{Message: "Add Role Success"}); err != nil {
-			cclog.Errorf("Failed to encode response: %v", err)
-		}
+		rw.Write([]byte("Add Role Success"))
 	} else if delrole != "" {
 		if err := repository.GetUserRepository().RemoveRole(r.Context(), mux.Vars(r)["id"], delrole); err != nil {
-			handleError(fmt.Errorf("removing role failed: %w", err), http.StatusUnprocessableEntity, rw)
+			http.Error(rw, err.Error(), http.StatusUnprocessableEntity)
 			return
 		}
-		if err := json.NewEncoder(rw).Encode(DefaultAPIResponse{Message: "Remove Role Success"}); err != nil {
-			cclog.Errorf("Failed to encode response: %v", err)
-		}
+		rw.Write([]byte("Remove Role Success"))
 	} else if newproj != "" {
 		if err := repository.GetUserRepository().AddProject(r.Context(), mux.Vars(r)["id"], newproj); err != nil {
-			handleError(fmt.Errorf("adding project failed: %w", err), http.StatusUnprocessableEntity, rw)
+			http.Error(rw, err.Error(), http.StatusUnprocessableEntity)
 			return
 		}
-		if err := json.NewEncoder(rw).Encode(DefaultAPIResponse{Message: "Add Project Success"}); err != nil {
-			cclog.Errorf("Failed to encode response: %v", err)
-		}
+		rw.Write([]byte("Add Project Success"))
 	} else if delproj != "" {
 		if err := repository.GetUserRepository().RemoveProject(r.Context(), mux.Vars(r)["id"], delproj); err != nil {
-			handleError(fmt.Errorf("removing project failed: %w", err), http.StatusUnprocessableEntity, rw)
+			http.Error(rw, err.Error(), http.StatusUnprocessableEntity)
 			return
 		}
-		if err := json.NewEncoder(rw).Encode(DefaultAPIResponse{Message: "Remove Project Success"}); err != nil {
-			cclog.Errorf("Failed to encode response: %v", err)
-		}
+		rw.Write([]byte("Remove Project Success"))
 	} else {
-		handleError(fmt.Errorf("no operation specified: must provide add-role, remove-role, add-project, or remove-project"), http.StatusBadRequest, rw)
+		http.Error(rw, "Not Add or Del [role|project]?", http.StatusInternalServerError)
 	}
 }

-// createUser godoc
-// @summary Create a new user
-// @tags User
-// @description Creates a new user with specified credentials and role
-// @produce plain
-// @param username formData string true "Username"
-// @param password formData string false "Password (not required for API users)"
-// @param role formData string true "User role"
-// @param name formData string false "Full name"
-// @param email formData string false "Email address"
-// @param project formData string false "Project (required for managers)"
-// @success 200 {string} string "Success message"
-// @failure 400 {object} api.ErrorResponse "Bad Request"
-// @failure 403 {object} api.ErrorResponse "Forbidden"
-// @failure 422 {object} api.ErrorResponse "Unprocessable Entity"
-// @security ApiKeyAuth
-// @router /api/users/ [post]
-func (api *RestAPI) createUser(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) createUser(rw http.ResponseWriter, r *http.Request) {
 	// SecuredCheck() only worked with TokenAuth: Removed
 	rw.Header().Set("Content-Type", "text/plain")
 	me := repository.GetUserFromContext(r.Context())
 	if !me.HasRole(schema.RoleAdmin) {
-		handleError(fmt.Errorf("only admins are allowed to create new users"), http.StatusForbidden, rw)
+		http.Error(rw, "Only admins are allowed to create new users", http.StatusForbidden)
 		return
 	}
@@ -158,22 +111,18 @@ func (api *RestAPI) createUser(rw http.ResponseWriter, r *http.Request) {
 		r.FormValue("password"), r.FormValue("role"), r.FormValue("name"),
 		r.FormValue("email"), r.FormValue("project")

-	// Validate username length
-	if len(username) == 0 || len(username) > 100 {
-		handleError(fmt.Errorf("username must be between 1 and 100 characters"), http.StatusBadRequest, rw)
-		return
-	}
-
 	if len(password) == 0 && role != schema.GetRoleString(schema.RoleApi) {
-		handleError(fmt.Errorf("only API users are allowed to have a blank password (login will be impossible)"), http.StatusBadRequest, rw)
+		http.Error(rw, "Only API users are allowed to have a blank password (login will be impossible)", http.StatusBadRequest)
 		return
 	}

 	if len(project) != 0 && role != schema.GetRoleString(schema.RoleManager) {
-		handleError(fmt.Errorf("only managers require a project (can be changed later)"), http.StatusBadRequest, rw)
+		http.Error(rw, "only managers require a project (can be changed later)",
			http.StatusBadRequest)
 		return
 	} else if len(project) == 0 && role == schema.GetRoleString(schema.RoleManager) {
-		handleError(fmt.Errorf("managers require a project to manage (can be changed later)"), http.StatusBadRequest, rw)
+		http.Error(rw, "managers require a project to manage (can be changed later)",
			http.StatusBadRequest)
 		return
 	}
@@ -185,35 +134,24 @@ func (api *RestAPI) createUser(rw http.ResponseWriter, r *http.Request) {
 		Projects: []string{project},
 		Roles:    []string{role},
 	}); err != nil {
-		handleError(fmt.Errorf("adding user failed: %w", err), http.StatusUnprocessableEntity, rw)
+		http.Error(rw, err.Error(), http.StatusUnprocessableEntity)
 		return
 	}

 	fmt.Fprintf(rw, "User %v successfully created!\n", username)
 }

-// deleteUser godoc
-// @summary Delete a user
-// @tags User
-// @description Deletes a user from the system
-// @produce plain
-// @param username formData string true "Username to delete"
-// @success 200 {string} string "Success"
-// @failure 403 {object} api.ErrorResponse "Forbidden"
-// @failure 422 {object} api.ErrorResponse "Unprocessable Entity"
-// @security ApiKeyAuth
-// @router /api/users/ [delete]
-func (api *RestAPI) deleteUser(rw http.ResponseWriter, r *http.Request) {
+func (api *RestApi) deleteUser(rw http.ResponseWriter, r *http.Request) {
 	// SecuredCheck() only worked with TokenAuth: Removed
 	if user := repository.GetUserFromContext(r.Context()); !user.HasRole(schema.RoleAdmin) {
-		handleError(fmt.Errorf("only admins are allowed to delete a user"), http.StatusForbidden, rw)
+		http.Error(rw, "Only admins are allowed to delete a user", http.StatusForbidden)
 		return
 	}

 	username := r.FormValue("username")
 	if err := repository.GetUserRepository().DelUser(username); err != nil {
-		handleError(fmt.Errorf("deleting user failed: %w", err), http.StatusUnprocessableEntity, rw)
+		http.Error(rw, err.Error(), http.StatusUnprocessableEntity)
 		return
 	}

View File

@@ -1,190 +0,0 @@
# Archiver Package
The `archiver` package provides asynchronous job archiving functionality for ClusterCockpit. When jobs complete, their metric data is archived from the metric store to a persistent archive backend (filesystem, S3, SQLite, etc.).
## Architecture
### Producer-Consumer Pattern
```
┌──────────────┐  TriggerArchiving()  ┌────────────────┐
│ API Handler  │ ───────────────────▶ │ archiveChannel │
│  (Job Stop)  │                      │ (buffer: 128)  │
└──────────────┘                      └───────┬────────┘
                                              │
                                              ▼
                                   ┌──────────────────────┐
                                   │  archivingWorker()   │
                                   │     (goroutine)      │
                                   └──────────┬───────────┘
                                              │
                                              ▼
                                   1. Fetch job metadata
                                   2. Load metric data
                                   3. Calculate statistics
                                   4. Archive to backend
                                   5. Update database
                                   6. Call hooks
```
### Components
- **archiveChannel**: Buffered channel (128 jobs) for async communication
- **archivePending**: WaitGroup tracking in-flight archiving operations
- **archivingWorker**: Background goroutine processing archiving requests
- **shutdownCtx**: Context for graceful cancellation during shutdown
## Usage
### Initialization
```go
// Start archiver with context for shutdown control
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
archiver.Start(jobRepository, ctx)
```
### Archiving a Job
```go
// Called automatically when a job completes
archiver.TriggerArchiving(job)
```
The function returns immediately. Actual archiving happens in the background.
### Graceful Shutdown
```go
// Shutdown with 10 second timeout
if err := archiver.Shutdown(10 * time.Second); err != nil {
    log.Printf("Archiver shutdown timeout: %v", err)
}
```
**Shutdown process:**
1. Closes channel (rejects new jobs)
2. Waits for pending jobs (up to timeout)
3. Cancels context if timeout exceeded
4. Waits for worker to exit cleanly
## Configuration
### Channel Buffer Size
The archiving channel has a buffer of 128 jobs. If more than 128 jobs are queued simultaneously, `TriggerArchiving()` will block until space is available.
To adjust:
```go
// In archiveWorker.go Start() function
archiveChannel = make(chan *schema.Job, 256) // Increase buffer
```
### Scope Selection
Archive data scopes are automatically selected based on job size:
- **Node scope**: Always included
- **Core scope**: Included for jobs with ≤8 nodes (reduces data volume for large jobs)
- **Accelerator scope**: Included if job used accelerators (`NumAcc > 0`)
To adjust the node threshold:
```go
// In archiver.go ArchiveJob() function
if job.NumNodes <= 16 { // Change from 8 to 16
    scopes = append(scopes, schema.MetricScopeCore)
}
```
### Resolution
Data is archived at the highest available resolution (typically 60s intervals). To change:
```go
// In archiver.go ArchiveJob() function
jobData, err := metricdispatcher.LoadData(job, allMetrics, scopes, ctx, 300)
// 0 = highest resolution
// 300 = 5-minute resolution
```
## Error Handling
### Automatic Retry
The archiver does **not** automatically retry failed archiving operations. If archiving fails:
1. Error is logged
2. Job is marked as `MonitoringStatusArchivingFailed` in database
3. Worker continues processing other jobs
### Manual Retry
To re-archive failed jobs, query for jobs with `MonitoringStatusArchivingFailed` and call `TriggerArchiving()` again.
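A minimal sketch of such a retry pass (the repository helper `FindJobsByMonitoringStatus` is hypothetical; use whatever query API your repository layer provides):

```go
// Re-queue every job whose archiving previously failed.
failed, err := jobRepo.FindJobsByMonitoringStatus(schema.MonitoringStatusArchivingFailed) // hypothetical helper
if err != nil {
    cclog.Errorf("querying failed jobs: %v", err)
    return
}
for _, job := range failed {
    archiver.TriggerArchiving(job) // blocks only while the channel buffer is full
}
```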
## Performance Considerations
### Single Worker Thread
The archiver uses a single worker goroutine. For high-throughput systems:
- Large channel buffer (128) prevents blocking
- Archiving is typically I/O bound (writing to storage)
- Single worker prevents overwhelming storage backend
### Shutdown Timeout
Recommended timeout values:
- **Development**: 5-10 seconds
- **Production**: 10-30 seconds
- **High-load**: 30-60 seconds
Choose based on:
- Average archiving time per job
- Storage backend latency
- Acceptable shutdown delay
## Monitoring
### Logging
The archiver logs:
- **Info**: Startup, shutdown, successful completions
- **Debug**: Individual job archiving times
- **Error**: Archiving failures with job ID and reason
- **Warn**: Shutdown timeout exceeded
### Metrics
Monitor these signals for archiver health:
- Jobs with `MonitoringStatusArchivingFailed`
- Time from job stop to successful archive
- Shutdown timeout occurrences
## Thread Safety
All exported functions are safe for concurrent use:
- `Start()` - Safe to call once
- `TriggerArchiving()` - Safe from multiple goroutines
- `Shutdown()` - Safe to call once
- `WaitForArchiving()` - Deprecated, but safe
Internal state is protected by:
- Channel synchronization (`archiveChannel`)
- WaitGroup for pending count (`archivePending`)
- Context for cancellation (`shutdownCtx`)
## Files
- **archiveWorker.go**: Worker lifecycle, channel management, shutdown logic
- **archiver.go**: Core archiving logic, metric loading, statistics calculation
## Dependencies
- `internal/repository`: Database operations for job metadata
- `internal/metricdispatcher`: Loading metric data from various backends
- `pkg/archive`: Archive backend abstraction (filesystem, S3, SQLite)
- `cc-lib/schema`: Job and metric data structures

View File

@@ -2,54 +2,10 @@
 // All rights reserved. This file is part of cc-backend.
 // Use of this source code is governed by a MIT-style
 // license that can be found in the LICENSE file.
-
-// Package archiver provides asynchronous job archiving functionality for ClusterCockpit.
-//
-// The archiver runs a background worker goroutine that processes job archiving requests
-// from a buffered channel. When jobs complete, their metric data is archived from the
-// metric store to the configured archive backend (filesystem, S3, etc.).
-//
-// # Architecture
-//
-// The archiver uses a producer-consumer pattern:
-//   - Producer: TriggerArchiving() sends jobs to archiveChannel
-//   - Consumer: archivingWorker() processes jobs from the channel
-//   - Coordination: sync.WaitGroup tracks pending archive operations
-//
-// # Lifecycle
-//
-// 1. Start(repo, ctx) - Initialize worker with context for cancellation
-// 2. TriggerArchiving(job) - Queue job for archiving (called when job stops)
-// 3. archivingWorker() - Background goroutine processes jobs
-// 4. Shutdown(timeout) - Graceful shutdown with timeout
-//
-// # Graceful Shutdown
-//
-// The archiver supports graceful shutdown with configurable timeout:
-//   - Closes channel to reject new jobs
-//   - Waits for pending jobs to complete (up to timeout)
-//   - Cancels context if timeout exceeded
-//   - Ensures worker goroutine exits cleanly
-//
-// # Example Usage
-//
-//	// Initialize archiver
-//	ctx, cancel := context.WithCancel(context.Background())
-//	defer cancel()
-//	archiver.Start(jobRepository, ctx)
-//
-//	// Trigger archiving when job completes
-//	archiver.TriggerArchiving(job)
-//
-//	// Graceful shutdown with 10 second timeout
-//	if err := archiver.Shutdown(10 * time.Second); err != nil {
-//		log.Printf("Archiver shutdown timeout: %v", err)
-//	}
 package archiver

 import (
 	"context"
-	"fmt"
 	"sync"
 	"time"
@@ -63,82 +19,38 @@ var (
 	archivePending sync.WaitGroup
 	archiveChannel chan *schema.Job
 	jobRepo        *repository.JobRepository
-	shutdownCtx    context.Context
-	shutdownCancel context.CancelFunc
-	workerDone     chan struct{}
 )

-// Start initializes the archiver and starts the background worker goroutine.
-//
-// The archiver processes job archiving requests asynchronously via a buffered channel.
-// Jobs are sent to the channel using TriggerArchiving() and processed by the worker.
-//
-// Parameters:
-//   - r: JobRepository instance for database operations
-//   - ctx: Context for cancellation (shutdown signal propagation)
-//
-// The worker goroutine will run until:
-//   - ctx is cancelled (via parent shutdown)
-//   - archiveChannel is closed (via Shutdown())
-//
-// Must be called before TriggerArchiving(). Safe to call only once.
-func Start(r *repository.JobRepository, ctx context.Context) {
-	shutdownCtx, shutdownCancel = context.WithCancel(ctx)
+func Start(r *repository.JobRepository) {
 	archiveChannel = make(chan *schema.Job, 128)
-	workerDone = make(chan struct{})
 	jobRepo = r

 	go archivingWorker()
 }

-// archivingWorker is the background goroutine that processes job archiving requests.
-//
-// The worker loop:
-//  1. Blocks waiting for jobs on archiveChannel or shutdown signal
-//  2. Fetches job metadata from repository
-//  3. Archives job data to configured backend (calls ArchiveJob)
-//  4. Updates job footprint and energy metrics in database
-//  5. Marks job as successfully archived
-//  6. Calls job stop hooks
-//
-// The worker exits when:
-//   - shutdownCtx is cancelled (timeout during shutdown)
-//   - archiveChannel is closed (normal shutdown)
-//
-// Errors during archiving are logged and the job is marked as failed,
-// but the worker continues processing other jobs.
+// Archiving worker thread
 func archivingWorker() {
-	defer close(workerDone)
 	for {
 		select {
-		case <-shutdownCtx.Done():
-			cclog.Info("Archive worker received shutdown signal")
-			return
 		case job, ok := <-archiveChannel:
 			if !ok {
-				cclog.Info("Archive channel closed, worker exiting")
-				return
+				break
 			}
 			start := time.Now()

 			// not using meta data, called to load JobMeta into Cache?
 			// will fail if job meta not in repository
 			if _, err := jobRepo.FetchMetadata(job); err != nil {
 				cclog.Errorf("archiving job (dbid: %d) failed at check metadata step: %s", job.ID, err.Error())
 				jobRepo.UpdateMonitoringStatus(*job.ID, schema.MonitoringStatusArchivingFailed)
-				archivePending.Done()
 				continue
 			}

 			// ArchiveJob will fetch all the data from a MetricDataRepository and push into configured archive backend
-			// Use shutdown context to allow cancellation
-			jobMeta, err := ArchiveJob(job, shutdownCtx)
+			// TODO: Maybe use context with cancel/timeout here
+			jobMeta, err := ArchiveJob(job, context.Background())
 			if err != nil {
 				cclog.Errorf("archiving job (dbid: %d) failed at archiving job step: %s", job.ID, err.Error())
 				jobRepo.UpdateMonitoringStatus(*job.ID, schema.MonitoringStatusArchivingFailed)
-				archivePending.Done()
 				continue
 			}
@@ -146,44 +58,30 @@ func archivingWorker() {
 			if stmt, err = jobRepo.UpdateFootprint(stmt, jobMeta); err != nil {
 				cclog.Errorf("archiving job (dbid: %d) failed at update Footprint step: %s", job.ID, err.Error())
-				archivePending.Done()
 				continue
 			}
 			if stmt, err = jobRepo.UpdateEnergy(stmt, jobMeta); err != nil {
 				cclog.Errorf("archiving job (dbid: %d) failed at update Energy step: %s", job.ID, err.Error())
-				archivePending.Done()
 				continue
 			}
 			// Update the jobs database entry one last time:
 			stmt = jobRepo.MarkArchived(stmt, schema.MonitoringStatusArchivingSuccessful)
 			if err := jobRepo.Execute(stmt); err != nil {
 				cclog.Errorf("archiving job (dbid: %d) failed at db execute: %s", job.ID, err.Error())
-				archivePending.Done()
 				continue
 			}
 			cclog.Debugf("archiving job %d took %s", job.JobID, time.Since(start))
-			cclog.Infof("archiving job (dbid: %d) successful", job.ID)
+			cclog.Printf("archiving job (dbid: %d) successful", job.ID)

 			repository.CallJobStopHooks(job)
 			archivePending.Done()
+		default:
+			continue
 		}
 	}
 }

-// TriggerArchiving queues a job for asynchronous archiving.
-//
-// This function should be called when a job completes (stops) to archive its
-// metric data from the metric store to the configured archive backend.
-//
-// The function:
-//  1. Increments the pending job counter (WaitGroup)
-//  2. Sends the job to the archiving channel (buffered, capacity 128)
-//  3. Returns immediately (non-blocking unless channel is full)
-//
-// The actual archiving is performed asynchronously by the worker goroutine.
-// Upon completion, the worker will decrement the pending counter.
-//
-// Panics if Start() has not been called first.
+// Trigger async archiving
 func TriggerArchiving(job *schema.Job) {
 	if archiveChannel == nil {
 		cclog.Fatal("Cannot archive without archiving channel. Did you Start the archiver?")
@@ -193,58 +91,8 @@ func TriggerArchiving(job *schema.Job) {
 	archiveChannel <- job
 }

-// Shutdown performs a graceful shutdown of the archiver with a configurable timeout.
-//
-// The shutdown process:
-//  1. Closes archiveChannel - no new jobs will be accepted
-//  2. Waits for pending jobs to complete (up to timeout duration)
-//  3. If timeout is exceeded:
-//     - Cancels shutdownCtx to interrupt ongoing ArchiveJob operations
-//     - Returns error indicating timeout
-//  4. Waits for worker goroutine to exit cleanly
-//
-// Parameters:
-//   - timeout: Maximum duration to wait for pending jobs to complete
-//     (recommended: 10-30 seconds for production)
-//
-// Returns:
-//   - nil if all jobs completed within timeout
-//   - error if timeout was exceeded (some jobs may not have been archived)
-//
-// Jobs that don't complete within the timeout will be marked as failed.
-// The function always ensures the worker goroutine exits before returning.
-//
-// Example:
-//
-//	if err := archiver.Shutdown(10 * time.Second); err != nil {
-//		log.Printf("Some jobs did not complete: %v", err)
-//	}
-func Shutdown(timeout time.Duration) error {
-	cclog.Info("Initiating archiver shutdown...")
-	// Close channel to signal no more jobs will be accepted
-	close(archiveChannel)
-
-	// Create a channel to signal when all jobs are done
-	done := make(chan struct{})
-	go func() {
-		archivePending.Wait()
-		close(done)
-	}()
-
-	// Wait for jobs to complete or timeout
-	select {
-	case <-done:
-		cclog.Info("All archive jobs completed successfully")
-		// Wait for worker to exit
-		<-workerDone
-		return nil
-	case <-time.After(timeout):
-		cclog.Warn("Archiver shutdown timeout exceeded, cancelling remaining operations")
-		// Cancel any ongoing operations
-		shutdownCancel()
-		// Wait for worker to exit
-		<-workerDone
-		return fmt.Errorf("archiver shutdown timeout after %v", timeout)
-	}
+// Wait for background thread to finish pending archiving operations
+func WaitForArchiving() {
+	// close channel and wait for worker to process remaining jobs
+	archivePending.Wait()
 }

View File

@@ -2,7 +2,6 @@
// All rights reserved. This file is part of cc-backend. // All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style // Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file. // license that can be found in the LICENSE file.
package archiver package archiver
import ( import (
@@ -10,38 +9,13 @@ import (
"math" "math"
"github.com/ClusterCockpit/cc-backend/internal/config" "github.com/ClusterCockpit/cc-backend/internal/config"
"github.com/ClusterCockpit/cc-backend/internal/metricdispatcher" "github.com/ClusterCockpit/cc-backend/internal/metricDataDispatcher"
"github.com/ClusterCockpit/cc-backend/pkg/archive" "github.com/ClusterCockpit/cc-backend/pkg/archive"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger" cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/schema" "github.com/ClusterCockpit/cc-lib/schema"
) )
// ArchiveJob archives a completed job's metric data to the configured archive backend. // Writes a running job to the job-archive
//
// This function performs the following operations:
// 1. Loads all metric data for the job from the metric data repository
// 2. Calculates job-level statistics (avg, min, max) for each metric
// 3. Stores the job metadata and metric data to the archive backend
//
// Metric data is retrieved at the highest available resolution (typically 60s)
// for the following scopes:
// - Node scope (always)
// - Core scope (for jobs with ≤8 nodes, to reduce data volume)
// - Accelerator scope (if job used accelerators)
//
// The function respects context cancellation. If ctx is cancelled (e.g., during
// shutdown timeout), the operation will be interrupted and return an error.
//
// Parameters:
// - job: The job to archive (must be a completed job)
// - ctx: Context for cancellation and timeout control
//
// Returns:
// - *schema.Job with populated Statistics field
// - error if data loading or archiving fails
//
// If config.Keys.DisableArchive is true, only job statistics are calculated
// and returned (no data is written to archive backend).
func ArchiveJob(job *schema.Job, ctx context.Context) (*schema.Job, error) { func ArchiveJob(job *schema.Job, ctx context.Context) (*schema.Job, error) {
allMetrics := make([]string, 0) allMetrics := make([]string, 0)
metricConfigs := archive.GetCluster(job.Cluster).MetricConfig metricConfigs := archive.GetCluster(job.Cluster).MetricConfig
@@ -60,7 +34,7 @@ func ArchiveJob(job *schema.Job, ctx context.Context) (*schema.Job, error) {
scopes = append(scopes, schema.MetricScopeAccelerator) scopes = append(scopes, schema.MetricScopeAccelerator)
} }
jobData, err := metricdispatcher.LoadData(job, allMetrics, scopes, ctx, 0) // resolution value 0 retrieves highest res (60s) jobData, err := metricDataDispatcher.LoadData(job, allMetrics, scopes, ctx, 0) // resolution value 0 retrieves highest res (60s)
if err != nil { if err != nil {
cclog.Error("Error wile loading job data for archiving") cclog.Error("Error wile loading job data for archiving")
return nil, err return nil, err
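A hedged caller sketch showing how the context cancellation described above can bound a single archive run (the wrapper function and the five-minute deadline are invented for illustration):

// archiveWithDeadline is a hypothetical helper, not part of this diff.
func archiveWithDeadline(job *schema.Job) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
	defer cancel()

	jobWithStats, err := ArchiveJob(job, ctx)
	if err != nil {
		// err wraps context.DeadlineExceeded if the deadline fires mid-load.
		cclog.Errorf("archiving job %d failed: %v", job.JobID, err)
		return
	}
	_ = jobWithStats.Statistics // avg/min/max per metric, per the doc comment above
}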

View File

@@ -2,22 +2,19 @@
// All rights reserved. This file is part of cc-backend. // All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style // Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file. // license that can be found in the LICENSE file.
// Package auth implements various authentication methods
package auth package auth
import ( import (
"bytes"
"context" "context"
"crypto/rand" "crypto/rand"
"database/sql" "database/sql"
"encoding/base64" "encoding/base64"
"encoding/json"
"errors" "errors"
"fmt" "fmt"
"net" "net"
"net/http" "net/http"
"os" "os"
"strings"
"sync" "sync"
"time" "time"
@@ -31,19 +28,8 @@ import (
"github.com/gorilla/sessions" "github.com/gorilla/sessions"
) )
// Authenticator is the interface for all authentication methods.
// Each authenticator determines if it can handle a login request (CanLogin)
// and performs the actual authentication (Login).
type Authenticator interface { type Authenticator interface {
// CanLogin determines if this authenticator can handle the login request.
// It returns the user object if available and a boolean indicating if this
// authenticator should attempt the login. This method should not perform
// expensive operations or actual authentication.
CanLogin(user *schema.User, username string, rw http.ResponseWriter, r *http.Request) (*schema.User, bool) CanLogin(user *schema.User, username string, rw http.ResponseWriter, r *http.Request) (*schema.User, bool)
// Login performs the actual authentication for the user.
// It returns the authenticated user or an error if authentication fails.
// The user parameter may be nil if the user doesn't exist in the database yet.
Login(user *schema.User, rw http.ResponseWriter, r *http.Request) (*schema.User, error) Login(user *schema.User, rw http.ResponseWriter, r *http.Request) (*schema.User, error)
} }
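A deliberately trivial sketch of an implementation of this interface (illustrative only; the real implementations are the LDAP, JWT, and local authenticators in this package):

// staticAuthenticator accepts exactly one hard-coded username (hypothetical).
type staticAuthenticator struct{ username string }

// CanLogin performs only a cheap comparison, no I/O, per the contract above.
func (sa *staticAuthenticator) CanLogin(user *schema.User, username string,
	rw http.ResponseWriter, r *http.Request) (*schema.User, bool) {
	return user, username == sa.username
}

// Login receives a possibly-nil user (not yet known to the database).
func (sa *staticAuthenticator) Login(user *schema.User,
	rw http.ResponseWriter, r *http.Request) (*schema.User, error) {
	if user == nil {
		return nil, errors.New("unknown user")
	}
	return user, nil
}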
@@ -52,70 +38,19 @@ var (
authInstance *Authentication authInstance *Authentication
) )
// rateLimiterEntry tracks a rate limiter and its last use time for cleanup
type rateLimiterEntry struct {
limiter *rate.Limiter
lastUsed time.Time
}
var ipUserLimiters sync.Map var ipUserLimiters sync.Map
// getIPUserLimiter returns a rate limiter for the given IP and username combination.
// Rate limiters are created on demand and track 5 attempts per 15 minutes.
func getIPUserLimiter(ip, username string) *rate.Limiter { func getIPUserLimiter(ip, username string) *rate.Limiter {
key := ip + ":" + username key := ip + ":" + username
now := time.Now() limiter, ok := ipUserLimiters.Load(key)
if !ok {
if entry, ok := ipUserLimiters.Load(key); ok { newLimiter := rate.NewLimiter(rate.Every(time.Hour/10), 10)
rle := entry.(*rateLimiterEntry) ipUserLimiters.Store(key, newLimiter)
rle.lastUsed = now return newLimiter
return rle.limiter
} }
return limiter.(*rate.Limiter)
// More aggressive rate limiting: 5 attempts per 15 minutes
newLimiter := rate.NewLimiter(rate.Every(15*time.Minute/5), 5)
ipUserLimiters.Store(key, &rateLimiterEntry{
limiter: newLimiter,
lastUsed: now,
})
return newLimiter
} }
// cleanupOldRateLimiters removes rate limiters that haven't been used recently
func cleanupOldRateLimiters(olderThan time.Time) {
ipUserLimiters.Range(func(key, value any) bool {
entry := value.(*rateLimiterEntry)
if entry.lastUsed.Before(olderThan) {
ipUserLimiters.Delete(key)
cclog.Debugf("Cleaned up rate limiter for %v", key)
}
return true
})
}
// startRateLimiterCleanup starts a background goroutine to clean up old rate limiters
func startRateLimiterCleanup() {
go func() {
ticker := time.NewTicker(1 * time.Hour)
defer ticker.Stop()
for range ticker.C {
// Clean up limiters not used in the last 24 hours
cleanupOldRateLimiters(time.Now().Add(-24 * time.Hour))
}
}()
}
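For orientation: rate.Every(15*time.Minute/5) refills one token every three minutes with a burst of five, so a sixth rapid attempt is rejected. A hypothetical guard using the limiter in a login handler might look like this (the helper name and response text are assumptions):

func loginGuard(rw http.ResponseWriter, r *http.Request, username string) bool {
	ip := r.RemoteAddr
	if host, _, err := net.SplitHostPort(ip); err == nil {
		ip = host // strip port (and brackets for IPv6)
	}
	if !getIPUserLimiter(ip, username).Allow() {
		http.Error(rw, "too many login attempts", http.StatusTooManyRequests)
		return false
	}
	return true
}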
// AuthConfig contains configuration for all authentication methods
type AuthConfig struct {
LdapConfig *LdapConfig `json:"ldap"`
JwtConfig *JWTAuthConfig `json:"jwts"`
OpenIDConfig *OpenIDConfig `json:"oidc"`
}
// Keys holds the global authentication configuration
var Keys AuthConfig
// Authentication manages all authentication methods and session handling
type Authentication struct { type Authentication struct {
sessionStore *sessions.CookieStore sessionStore *sessions.CookieStore
LdapAuth *LdapAuthenticator LdapAuth *LdapAuthenticator
@@ -139,31 +74,10 @@ func (auth *Authentication) AuthViaSession(
return nil, nil return nil, nil
} }
// Validate session data with proper type checking // TODO: Check if session keys exist
username, ok := session.Values["username"].(string) username, _ := session.Values["username"].(string)
if !ok || username == "" { projects, _ := session.Values["projects"].([]string)
cclog.Warn("Invalid session: missing or invalid username") roles, _ := session.Values["roles"].([]string)
// Invalidate the corrupted session
session.Options.MaxAge = -1
_ = auth.sessionStore.Save(r, rw, session)
return nil, errors.New("invalid session data")
}
projects, ok := session.Values["projects"].([]string)
if !ok {
cclog.Warn("Invalid session: projects not found or invalid type, using empty list")
projects = []string{}
}
roles, ok := session.Values["roles"].([]string)
if !ok || len(roles) == 0 {
cclog.Warn("Invalid session: missing or invalid roles")
// Invalidate the corrupted session
session.Options.MaxAge = -1
_ = auth.sessionStore.Save(r, rw, session)
return nil, errors.New("invalid session data")
}
return &schema.User{ return &schema.User{
Username: username, Username: username,
Projects: projects, Projects: projects,
@@ -173,12 +87,9 @@ func (auth *Authentication) AuthViaSession(
}, nil }, nil
} }
func Init(authCfg *json.RawMessage) { func Init() {
initOnce.Do(func() { initOnce.Do(func() {
authInstance = &Authentication{} authInstance = &Authentication{}
// Start background cleanup of rate limiters
startRateLimiterCleanup()
sessKey := os.Getenv("SESSION_KEY") sessKey := os.Getenv("SESSION_KEY")
if sessKey == "" { if sessKey == "" {
@@ -200,18 +111,7 @@ func Init(authCfg *json.RawMessage) {
authInstance.SessionMaxAge = d authInstance.SessionMaxAge = d
} }
if authCfg == nil { if config.Keys.LdapConfig != nil {
return
}
config.Validate(configSchema, *authCfg)
dec := json.NewDecoder(bytes.NewReader(*authCfg))
dec.DisallowUnknownFields()
if err := dec.Decode(&Keys); err != nil {
cclog.Errorf("error while decoding ldap config: %v", err)
}
if Keys.LdapConfig != nil {
ldapAuth := &LdapAuthenticator{} ldapAuth := &LdapAuthenticator{}
if err := ldapAuth.Init(); err != nil { if err := ldapAuth.Init(); err != nil {
cclog.Warn("Error while initializing authentication -> ldapAuth init failed") cclog.Warn("Error while initializing authentication -> ldapAuth init failed")
@@ -223,7 +123,7 @@ func Init(authCfg *json.RawMessage) {
cclog.Info("Missing LDAP configuration: No LDAP support!") cclog.Info("Missing LDAP configuration: No LDAP support!")
} }
if Keys.JwtConfig != nil { if config.Keys.JwtConfig != nil {
authInstance.JwtAuth = &JWTAuthenticator{} authInstance.JwtAuth = &JWTAuthenticator{}
if err := authInstance.JwtAuth.Init(); err != nil { if err := authInstance.JwtAuth.Init(); err != nil {
cclog.Fatal("Error while initializing authentication -> jwtAuth init failed") cclog.Fatal("Error while initializing authentication -> jwtAuth init failed")
@@ -262,36 +162,38 @@ func GetAuthInstance() *Authentication {
return authInstance return authInstance
} }
// handleUserSync syncs or updates a user in the database based on configuration. func handleTokenUser(tokenUser *schema.User) {
// This is used for both JWT and OIDC authentication when syncUserOnLogin or updateUserOnLogin is enabled.
func handleUserSync(user *schema.User, syncUserOnLogin, updateUserOnLogin bool) {
r := repository.GetUserRepository() r := repository.GetUserRepository()
dbUser, err := r.GetUser(user.Username) dbUser, err := r.GetUser(tokenUser.Username)
if err != nil && err != sql.ErrNoRows { if err != nil && err != sql.ErrNoRows {
cclog.Errorf("Error while loading user '%s': %v", user.Username, err) cclog.Errorf("Error while loading user '%s': %v", tokenUser.Username, err)
return } else if err == sql.ErrNoRows && config.Keys.JwtConfig.SyncUserOnLogin { // Adds New User
} if err := r.AddUser(tokenUser); err != nil {
cclog.Errorf("Error while adding user '%s' to DB: %v", tokenUser.Username, err)
if err == sql.ErrNoRows && syncUserOnLogin { // Add new user
if err := r.AddUser(user); err != nil {
cclog.Errorf("Error while adding user '%s' to DB: %v", user.Username, err)
} }
} else if err == nil && updateUserOnLogin { // Update existing user } else if err == nil && config.Keys.JwtConfig.UpdateUserOnLogin { // Update Existing User
if err := r.UpdateUser(dbUser, user); err != nil { if err := r.UpdateUser(dbUser, tokenUser); err != nil {
cclog.Errorf("Error while updating user '%s' in DB: %v", dbUser.Username, err) cclog.Errorf("Error while updating user '%s' to DB: %v", dbUser.Username, err)
} }
} }
} }
// handleTokenUser syncs JWT token user with database
func handleTokenUser(tokenUser *schema.User) {
handleUserSync(tokenUser, Keys.JwtConfig.SyncUserOnLogin, Keys.JwtConfig.UpdateUserOnLogin)
}
// handleOIDCUser syncs OIDC user with database
func handleOIDCUser(OIDCUser *schema.User) { func handleOIDCUser(OIDCUser *schema.User) {
handleUserSync(OIDCUser, Keys.OpenIDConfig.SyncUserOnLogin, Keys.OpenIDConfig.UpdateUserOnLogin) r := repository.GetUserRepository()
dbUser, err := r.GetUser(OIDCUser.Username)
if err != nil && err != sql.ErrNoRows {
cclog.Errorf("Error while loading user '%s': %v", OIDCUser.Username, err)
} else if err == sql.ErrNoRows && config.Keys.OpenIDConfig.SyncUserOnLogin { // Adds New User
if err := r.AddUser(OIDCUser); err != nil {
cclog.Errorf("Error while adding user '%s' to DB: %v", OIDCUser.Username, err)
}
} else if err == nil && config.Keys.OpenIDConfig.UpdateUserOnLogin { // Update Existing User
if err := r.UpdateUser(dbUser, OIDCUser); err != nil {
cclog.Errorf("Error while updating user '%s' to DB: %v", dbUser.Username, err)
}
}
} }
func (auth *Authentication) SaveSession(rw http.ResponseWriter, r *http.Request, user *schema.User) error { func (auth *Authentication) SaveSession(rw http.ResponseWriter, r *http.Request, user *schema.User) error {
@@ -305,8 +207,7 @@ func (auth *Authentication) SaveSession(rw http.ResponseWriter, r *http.Request,
if auth.SessionMaxAge != 0 { if auth.SessionMaxAge != 0 {
session.Options.MaxAge = int(auth.SessionMaxAge.Seconds()) session.Options.MaxAge = int(auth.SessionMaxAge.Seconds())
} }
if config.Keys.HTTPSCertFile == "" && config.Keys.HTTPSKeyFile == "" { if config.Keys.HttpsCertFile == "" && config.Keys.HttpsKeyFile == "" {
cclog.Warn("HTTPS not configured - session cookies will not have Secure flag set (insecure for production)")
session.Options.Secure = false session.Options.Secure = false
} }
session.Options.SameSite = http.SameSiteStrictMode session.Options.SameSite = http.SameSiteStrictMode
@@ -416,7 +317,7 @@ func (auth *Authentication) Auth(
}) })
} }
func (auth *Authentication) AuthAPI( func (auth *Authentication) AuthApi(
onsuccess http.Handler, onsuccess http.Handler,
onfailure func(rw http.ResponseWriter, r *http.Request, authErr error), onfailure func(rw http.ResponseWriter, r *http.Request, authErr error),
) http.Handler { ) http.Handler {
@@ -459,7 +360,7 @@ func (auth *Authentication) AuthAPI(
}) })
} }
func (auth *Authentication) AuthUserAPI( func (auth *Authentication) AuthUserApi(
onsuccess http.Handler, onsuccess http.Handler,
onfailure func(rw http.ResponseWriter, r *http.Request, authErr error), onfailure func(rw http.ResponseWriter, r *http.Request, authErr error),
) http.Handler { ) http.Handler {
@@ -480,7 +381,7 @@ func (auth *Authentication) AuthUserAPI(
return return
} }
case len(user.Roles) >= 2: case len(user.Roles) >= 2:
if user.HasRole(schema.RoleApi) && user.HasAnyRole([]schema.Role{schema.RoleUser, schema.RoleManager, schema.RoleSupport, schema.RoleAdmin}) { if user.HasRole(schema.RoleApi) && user.HasAnyRole([]schema.Role{schema.RoleUser, schema.RoleManager, schema.RoleAdmin}) {
ctx := context.WithValue(r.Context(), repository.ContextUserKey, user) ctx := context.WithValue(r.Context(), repository.ContextUserKey, user)
onsuccess.ServeHTTP(rw, r.WithContext(ctx)) onsuccess.ServeHTTP(rw, r.WithContext(ctx))
return return
@@ -495,43 +396,7 @@ func (auth *Authentication) AuthUserAPI(
}) })
} }
func (auth *Authentication) AuthMetricStoreAPI( func (auth *Authentication) AuthConfigApi(
onsuccess http.Handler,
onfailure func(rw http.ResponseWriter, r *http.Request, authErr error),
) http.Handler {
return http.HandlerFunc(func(rw http.ResponseWriter, r *http.Request) {
user, err := auth.JwtAuth.AuthViaJWT(rw, r)
if err != nil {
cclog.Infof("auth metricstore api -> authentication failed: %s", err.Error())
onfailure(rw, r, err)
return
}
if user != nil {
switch {
case len(user.Roles) == 1:
if user.HasRole(schema.RoleApi) {
ctx := context.WithValue(r.Context(), repository.ContextUserKey, user)
onsuccess.ServeHTTP(rw, r.WithContext(ctx))
return
}
case len(user.Roles) >= 2:
if user.HasRole(schema.RoleApi) && user.HasAnyRole([]schema.Role{schema.RoleUser, schema.RoleManager, schema.RoleAdmin}) {
ctx := context.WithValue(r.Context(), repository.ContextUserKey, user)
onsuccess.ServeHTTP(rw, r.WithContext(ctx))
return
}
default:
cclog.Info("auth metricstore api -> authentication failed: missing role")
onfailure(rw, r, errors.New("unauthorized"))
}
}
cclog.Info("auth metricstore api -> authentication failed: no auth")
onfailure(rw, r, errors.New("unauthorized"))
})
}
func (auth *Authentication) AuthConfigAPI(
onsuccess http.Handler, onsuccess http.Handler,
onfailure func(rw http.ResponseWriter, r *http.Request, authErr error), onfailure func(rw http.ResponseWriter, r *http.Request, authErr error),
) http.Handler { ) http.Handler {
@@ -552,7 +417,7 @@ func (auth *Authentication) AuthConfigAPI(
}) })
} }
func (auth *Authentication) AuthFrontendAPI( func (auth *Authentication) AuthFrontendApi(
onsuccess http.Handler, onsuccess http.Handler,
onfailure func(rw http.ResponseWriter, r *http.Request, authErr error), onfailure func(rw http.ResponseWriter, r *http.Request, authErr error),
) http.Handler { ) http.Handler {
@@ -608,24 +473,21 @@ func securedCheck(user *schema.User, r *http.Request) error {
IPAddress = r.RemoteAddr IPAddress = r.RemoteAddr
} }
// Handle both IPv4 and IPv6 addresses properly // Cannot Handle ipv6! (e.g. localhost -> [::1])
// For IPv6, this will strip the port and brackets if strings.Contains(IPAddress, ":") {
// For IPv4, this will strip the port IPAddress = strings.Split(IPAddress, ":")[0]
if host, _, err := net.SplitHostPort(IPAddress); err == nil {
IPAddress = host
} }
// If SplitHostPort fails, IPAddress is already just a host (no port)
// If nothing declared in config: deny all request to this api endpoint // If nothing declared in config: deny all request to this api endpoint
if len(config.Keys.APIAllowedIPs) == 0 { if len(config.Keys.ApiAllowedIPs) == 0 {
return fmt.Errorf("missing configuration key ApiAllowedIPs") return fmt.Errorf("missing configuration key ApiAllowedIPs")
} }
// If wildcard declared in config: Continue // If wildcard declared in config: Continue
if config.Keys.APIAllowedIPs[0] == "*" { if config.Keys.ApiAllowedIPs[0] == "*" {
return nil return nil
} }
// check if IP is allowed // check if IP is allowed
if !util.Contains(config.Keys.APIAllowedIPs, IPAddress) { if !util.Contains(config.Keys.ApiAllowedIPs, IPAddress) {
return fmt.Errorf("unknown ip: %v", IPAddress) return fmt.Errorf("unknown ip: %v", IPAddress)
} }

View File

@@ -1,176 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package auth
import (
"net"
"testing"
"time"
)
// TestGetIPUserLimiter tests the rate limiter creation and retrieval
func TestGetIPUserLimiter(t *testing.T) {
ip := "192.168.1.1"
username := "testuser"
// Get limiter for the first time
limiter1 := getIPUserLimiter(ip, username)
if limiter1 == nil {
t.Fatal("Expected limiter to be created")
}
// Get the same limiter again
limiter2 := getIPUserLimiter(ip, username)
if limiter1 != limiter2 {
t.Error("Expected to get the same limiter instance")
}
// Get a different limiter for different user
limiter3 := getIPUserLimiter(ip, "otheruser")
if limiter1 == limiter3 {
t.Error("Expected different limiter for different user")
}
// Get a different limiter for different IP
limiter4 := getIPUserLimiter("192.168.1.2", username)
if limiter1 == limiter4 {
t.Error("Expected different limiter for different IP")
}
}
// TestRateLimiterBehavior tests that rate limiting works correctly
func TestRateLimiterBehavior(t *testing.T) {
ip := "10.0.0.1"
username := "ratelimituser"
limiter := getIPUserLimiter(ip, username)
// Should allow first 5 attempts
for i := 0; i < 5; i++ {
if !limiter.Allow() {
t.Errorf("Request %d should be allowed within rate limit", i+1)
}
}
// 6th attempt should be blocked
if limiter.Allow() {
t.Error("Request 6 should be blocked by rate limiter")
}
}
// TestCleanupOldRateLimiters tests the cleanup function
func TestCleanupOldRateLimiters(t *testing.T) {
// Clear all existing limiters first to avoid interference from other tests
cleanupOldRateLimiters(time.Now().Add(24 * time.Hour))
// Create some new rate limiters
limiter1 := getIPUserLimiter("1.1.1.1", "user1")
limiter2 := getIPUserLimiter("2.2.2.2", "user2")
if limiter1 == nil || limiter2 == nil {
t.Fatal("Failed to create test limiters")
}
// Cleanup limiters older than 1 second from now (should keep both)
time.Sleep(10 * time.Millisecond) // Small delay to ensure timestamp difference
cleanupOldRateLimiters(time.Now().Add(-1 * time.Second))
// Verify they still exist (should get same instance)
if getIPUserLimiter("1.1.1.1", "user1") != limiter1 {
t.Error("Limiter 1 was incorrectly cleaned up")
}
if getIPUserLimiter("2.2.2.2", "user2") != limiter2 {
t.Error("Limiter 2 was incorrectly cleaned up")
}
// Cleanup limiters older than 1 hour from now (should remove both)
cleanupOldRateLimiters(time.Now().Add(2 * time.Hour))
// Getting them again should create new instances
newLimiter1 := getIPUserLimiter("1.1.1.1", "user1")
if newLimiter1 == limiter1 {
t.Error("Old limiter should have been cleaned up")
}
}
// TestIPv4Extraction tests extracting IPv4 addresses
func TestIPv4Extraction(t *testing.T) {
tests := []struct {
name string
input string
expected string
}{
{"IPv4 with port", "192.168.1.1:8080", "192.168.1.1"},
{"IPv4 without port", "192.168.1.1", "192.168.1.1"},
{"Localhost with port", "127.0.0.1:3000", "127.0.0.1"},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := tt.input
if host, _, err := net.SplitHostPort(result); err == nil {
result = host
}
if result != tt.expected {
t.Errorf("Expected %s, got %s", tt.expected, result)
}
})
}
}
// TestIPv6Extraction tests extracting IPv6 addresses
func TestIPv6Extraction(t *testing.T) {
tests := []struct {
name string
input string
expected string
}{
{"IPv6 with port", "[2001:db8::1]:8080", "2001:db8::1"},
{"IPv6 localhost with port", "[::1]:3000", "::1"},
{"IPv6 without port", "2001:db8::1", "2001:db8::1"},
{"IPv6 localhost", "::1", "::1"},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := tt.input
if host, _, err := net.SplitHostPort(result); err == nil {
result = host
}
if result != tt.expected {
t.Errorf("Expected %s, got %s", tt.expected, result)
}
})
}
}
// TestIPExtractionEdgeCases tests edge cases for IP extraction
func TestIPExtractionEdgeCases(t *testing.T) {
tests := []struct {
name string
input string
expected string
}{
{"Hostname without port", "example.com", "example.com"},
{"Empty string", "", ""},
{"Just port", ":8080", ""},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := tt.input
if host, _, err := net.SplitHostPort(result); err == nil {
result = host
}
if result != tt.expected {
t.Errorf("Expected %s, got %s", tt.expected, result)
}
})
}
}

View File

@@ -2,7 +2,6 @@
// All rights reserved. This file is part of cc-backend. // All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style // Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file. // license that can be found in the LICENSE file.
package auth package auth
import ( import (
@@ -14,33 +13,13 @@ import (
"strings" "strings"
"time" "time"
"github.com/ClusterCockpit/cc-backend/internal/config"
"github.com/ClusterCockpit/cc-backend/internal/repository"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger" cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/schema" "github.com/ClusterCockpit/cc-lib/schema"
"github.com/golang-jwt/jwt/v5" "github.com/golang-jwt/jwt/v5"
) )
type JWTAuthConfig struct {
// Specifies for how long a JWT token shall be valid
// as a string parsable by time.ParseDuration().
MaxAge string `json:"max-age"`
// Specifies which cookie should be checked for a JWT token (if no authorization header is present)
CookieName string `json:"cookieName"`
// Deny login for users not in database (but defined in JWT).
// Ignore user roles defined in JWTs ('roles' claim), get them from db.
ValidateUser bool `json:"validateUser"`
// Specifies which issuer should be accepted when validating external JWTs ('iss' claim)
TrustedIssuer string `json:"trustedIssuer"`
// Should a non-existent user be added to the DB based on the information in the token
SyncUserOnLogin bool `json:"syncUserOnLogin"`
// Should an existing user be updated in the DB based on the information in the token
UpdateUserOnLogin bool `json:"updateUserOnLogin"`
}
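A hedged example of a "jwts" config fragment matching these fields (all values invented):

const exampleJWTConfig = `{
	"max-age": "2h",
	"cookieName": "cc-jwt",
	"validateUser": true,
	"trustedIssuer": "https://idp.example.org",
	"syncUserOnLogin": false,
	"updateUserOnLogin": false
}`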
type JWTAuthenticator struct { type JWTAuthenticator struct {
publicKey ed25519.PublicKey publicKey ed25519.PublicKey
privateKey ed25519.PrivateKey privateKey ed25519.PrivateKey
@@ -83,7 +62,7 @@ func (ja *JWTAuthenticator) AuthViaJWT(
return nil, nil return nil, nil
} }
token, err := jwt.Parse(rawtoken, func(t *jwt.Token) (any, error) { token, err := jwt.Parse(rawtoken, func(t *jwt.Token) (interface{}, error) {
if t.Method != jwt.SigningMethodEdDSA { if t.Method != jwt.SigningMethodEdDSA {
return nil, errors.New("only Ed25519/EdDSA supported") return nil, errors.New("only Ed25519/EdDSA supported")
} }
@@ -101,24 +80,41 @@ func (ja *JWTAuthenticator) AuthViaJWT(
// Token is valid, extract payload // Token is valid, extract payload
claims := token.Claims.(jwt.MapClaims) claims := token.Claims.(jwt.MapClaims)
sub, _ := claims["sub"].(string)
// Use shared helper to get user from JWT claims
var user *schema.User var roles []string
user, err = getUserFromJWT(claims, Keys.JwtConfig.ValidateUser, schema.AuthToken, -1)
if err != nil { // Validate user + roles from JWT against database?
return nil, err if config.Keys.JwtConfig.ValidateUser {
ur := repository.GetUserRepository()
user, err := ur.GetUser(sub)
// Deny any logins for unknown usernames
if err != nil {
cclog.Warn("Could not find user from JWT in internal database.")
return nil, errors.New("unknown user")
}
// Take user roles from database instead of trusting the JWT
roles = user.Roles
} else {
// Extract roles from JWT (if present)
if rawroles, ok := claims["roles"].([]interface{}); ok {
for _, rr := range rawroles {
if r, ok := rr.(string); ok {
roles = append(roles, r)
}
}
}
} }
// If not validating user, we only get roles from JWT (no projects for this auth method) return &schema.User{
if !Keys.JwtConfig.ValidateUser { Username: sub,
user.Roles = extractRolesFromClaims(claims, false) Roles: roles,
user.Projects = nil // Standard JWT auth doesn't include projects AuthType: schema.AuthToken,
} AuthSource: -1,
}, nil
return user, nil
} }
// ProvideJWT generates a new JWT that can be used for authentication // Generate a new JWT that can be used for authentication
func (ja *JWTAuthenticator) ProvideJWT(user *schema.User) (string, error) { func (ja *JWTAuthenticator) ProvideJWT(user *schema.User) (string, error) {
if ja.privateKey == nil { if ja.privateKey == nil {
return "", errors.New("environment variable 'JWT_PRIVATE_KEY' not set") return "", errors.New("environment variable 'JWT_PRIVATE_KEY' not set")
@@ -130,8 +126,8 @@ func (ja *JWTAuthenticator) ProvideJWT(user *schema.User) (string, error) {
"roles": user.Roles, "roles": user.Roles,
"iat": now.Unix(), "iat": now.Unix(),
} }
if Keys.JwtConfig.MaxAge != "" { if config.Keys.JwtConfig.MaxAge != "" {
d, err := time.ParseDuration(Keys.JwtConfig.MaxAge) d, err := time.ParseDuration(config.Keys.JwtConfig.MaxAge)
if err != nil { if err != nil {
return "", errors.New("cannot parse max-age config key") return "", errors.New("cannot parse max-age config key")
} }

View File

@@ -2,16 +2,19 @@
// All rights reserved. This file is part of cc-backend. // All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style // Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file. // license that can be found in the LICENSE file.
package auth package auth
import ( import (
"crypto/ed25519" "crypto/ed25519"
"database/sql"
"encoding/base64" "encoding/base64"
"errors" "errors"
"fmt"
"net/http" "net/http"
"os" "os"
"github.com/ClusterCockpit/cc-backend/internal/config"
"github.com/ClusterCockpit/cc-backend/internal/repository"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger" cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/schema" "github.com/ClusterCockpit/cc-lib/schema"
"github.com/golang-jwt/jwt/v5" "github.com/golang-jwt/jwt/v5"
@@ -60,16 +63,17 @@ func (ja *JWTCookieSessionAuthenticator) Init() error {
return errors.New("environment variable 'CROSS_LOGIN_JWT_PUBLIC_KEY' not set (cross login token based authentication will not work)") return errors.New("environment variable 'CROSS_LOGIN_JWT_PUBLIC_KEY' not set (cross login token based authentication will not work)")
} }
jc := config.Keys.JwtConfig
// Warn if other necessary settings are not configured // Warn if other necessary settings are not configured
if Keys.JwtConfig != nil { if jc != nil {
if Keys.JwtConfig.CookieName == "" { if jc.CookieName == "" {
cclog.Info("cookieName for JWTs not configured (cross login via JWT cookie will fail)") cclog.Info("cookieName for JWTs not configured (cross login via JWT cookie will fail)")
return errors.New("cookieName for JWTs not configured (cross login via JWT cookie will fail)") return errors.New("cookieName for JWTs not configured (cross login via JWT cookie will fail)")
} }
if !Keys.JwtConfig.ValidateUser { if !jc.ValidateUser {
cclog.Info("forceJWTValidationViaDatabase not set to true: CC will accept users and roles defined in JWTs regardless of its own database!") cclog.Info("forceJWTValidationViaDatabase not set to true: CC will accept users and roles defined in JWTs regardless of its own database!")
} }
if Keys.JwtConfig.TrustedIssuer == "" { if jc.TrustedIssuer == "" {
cclog.Info("trustedExternalIssuer for JWTs not configured (cross login via JWT cookie will fail)") cclog.Info("trustedExternalIssuer for JWTs not configured (cross login via JWT cookie will fail)")
return errors.New("trustedExternalIssuer for JWTs not configured (cross login via JWT cookie will fail)") return errors.New("trustedExternalIssuer for JWTs not configured (cross login via JWT cookie will fail)")
} }
@@ -88,7 +92,7 @@ func (ja *JWTCookieSessionAuthenticator) CanLogin(
rw http.ResponseWriter, rw http.ResponseWriter,
r *http.Request, r *http.Request,
) (*schema.User, bool) { ) (*schema.User, bool) {
jc := Keys.JwtConfig jc := config.Keys.JwtConfig
cookieName := "" cookieName := ""
if jc.CookieName != "" { if jc.CookieName != "" {
cookieName = jc.CookieName cookieName = jc.CookieName
@@ -111,7 +115,7 @@ func (ja *JWTCookieSessionAuthenticator) Login(
rw http.ResponseWriter, rw http.ResponseWriter,
r *http.Request, r *http.Request,
) (*schema.User, error) { ) (*schema.User, error) {
jc := Keys.JwtConfig jc := config.Keys.JwtConfig
jwtCookie, err := r.Cookie(jc.CookieName) jwtCookie, err := r.Cookie(jc.CookieName)
var rawtoken string var rawtoken string
@@ -119,7 +123,7 @@ func (ja *JWTCookieSessionAuthenticator) Login(
rawtoken = jwtCookie.Value rawtoken = jwtCookie.Value
} }
token, err := jwt.Parse(rawtoken, func(t *jwt.Token) (any, error) { token, err := jwt.Parse(rawtoken, func(t *jwt.Token) (interface{}, error) {
if t.Method != jwt.SigningMethodEdDSA { if t.Method != jwt.SigningMethodEdDSA {
return nil, errors.New("only Ed25519/EdDSA supported") return nil, errors.New("only Ed25519/EdDSA supported")
} }
@@ -146,16 +150,57 @@ func (ja *JWTCookieSessionAuthenticator) Login(
} }
claims := token.Claims.(jwt.MapClaims) claims := token.Claims.(jwt.MapClaims)
sub, _ := claims["sub"].(string)
// Use shared helper to get user from JWT claims
user, err = getUserFromJWT(claims, jc.ValidateUser, schema.AuthSession, schema.AuthViaToken) var roles []string
if err != nil { projects := make([]string, 0)
return nil, err
} if jc.ValidateUser {
var err error
// Sync or update user if configured user, err = repository.GetUserRepository().GetUser(sub)
if !jc.ValidateUser && (jc.SyncUserOnLogin || jc.UpdateUserOnLogin) { if err != nil && err != sql.ErrNoRows {
handleTokenUser(user) cclog.Errorf("Error while loading user '%v'", sub)
}
// Deny any logins for unknown usernames
if user == nil {
cclog.Warn("Could not find user from JWT in internal database.")
return nil, errors.New("unknown user")
}
} else {
var name string
if wrap, ok := claims["name"].(map[string]interface{}); ok {
if vals, ok := wrap["values"].([]interface{}); ok {
if len(vals) != 0 {
name = fmt.Sprintf("%v", vals[0])
for i := 1; i < len(vals); i++ {
name += fmt.Sprintf(" %v", vals[i])
}
}
}
}
// Extract roles from JWT (if present)
if rawroles, ok := claims["roles"].([]interface{}); ok {
for _, rr := range rawroles {
if r, ok := rr.(string); ok {
roles = append(roles, r)
}
}
}
user = &schema.User{
Username: sub,
Name: name,
Roles: roles,
Projects: projects,
AuthType: schema.AuthSession,
AuthSource: schema.AuthViaToken,
}
if jc.SyncUserOnLogin || jc.UpdateUserOnLogin {
handleTokenUser(user)
}
} }
// (Ask browser to) Delete JWT cookie // (Ask browser to) Delete JWT cookie

View File

@@ -1,136 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package auth
import (
"database/sql"
"errors"
"fmt"
"github.com/ClusterCockpit/cc-backend/internal/repository"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/schema"
"github.com/golang-jwt/jwt/v5"
)
// extractStringFromClaims extracts a string value from JWT claims
func extractStringFromClaims(claims jwt.MapClaims, key string) string {
if val, ok := claims[key].(string); ok {
return val
}
return ""
}
// extractRolesFromClaims extracts roles from JWT claims
// If validateRoles is true, only valid roles are returned
func extractRolesFromClaims(claims jwt.MapClaims, validateRoles bool) []string {
var roles []string
if rawroles, ok := claims["roles"].([]any); ok {
for _, rr := range rawroles {
if r, ok := rr.(string); ok {
if validateRoles {
if schema.IsValidRole(r) {
roles = append(roles, r)
}
} else {
roles = append(roles, r)
}
}
}
}
return roles
}
// extractProjectsFromClaims extracts projects from JWT claims
func extractProjectsFromClaims(claims jwt.MapClaims) []string {
projects := make([]string, 0)
if rawprojs, ok := claims["projects"].([]any); ok {
for _, pp := range rawprojs {
if p, ok := pp.(string); ok {
projects = append(projects, p)
}
}
} else if rawprojs, ok := claims["projects"]; ok {
if projSlice, ok := rawprojs.([]string); ok {
projects = append(projects, projSlice...)
}
}
return projects
}
// extractNameFromClaims extracts name from JWT claims
// Handles both simple string and complex nested structure
func extractNameFromClaims(claims jwt.MapClaims) string {
// Try simple string first
if name, ok := claims["name"].(string); ok {
return name
}
// Try nested structure: {name: {values: [...]}}
if wrap, ok := claims["name"].(map[string]any); ok {
if vals, ok := wrap["values"].([]any); ok {
if len(vals) == 0 {
return ""
}
name := fmt.Sprintf("%v", vals[0])
for i := 1; i < len(vals); i++ {
name += fmt.Sprintf(" %v", vals[i])
}
return name
}
}
return ""
}
// getUserFromJWT creates or retrieves a user based on JWT claims
// If validateUser is true, the user must exist in the database
// Otherwise, a new user object is created from claims
// authSource should be a schema.AuthSource constant (like schema.AuthViaToken)
func getUserFromJWT(claims jwt.MapClaims, validateUser bool, authType schema.AuthType, authSource schema.AuthSource) (*schema.User, error) {
sub := extractStringFromClaims(claims, "sub")
if sub == "" {
return nil, errors.New("missing 'sub' claim in JWT")
}
if validateUser {
// Validate user against database
ur := repository.GetUserRepository()
user, err := ur.GetUser(sub)
if err != nil && err != sql.ErrNoRows {
cclog.Errorf("Error while loading user '%v': %v", sub, err)
return nil, fmt.Errorf("database error: %w", err)
}
// Deny any logins for unknown usernames
if user == nil || err == sql.ErrNoRows {
cclog.Warn("Could not find user from JWT in internal database.")
return nil, errors.New("unknown user")
}
// Return database user (with database roles)
return user, nil
}
// Create user from JWT claims
name := extractNameFromClaims(claims)
roles := extractRolesFromClaims(claims, true) // Validate roles
projects := extractProjectsFromClaims(claims)
return &schema.User{
Username: sub,
Name: name,
Roles: roles,
Projects: projects,
AuthType: authType,
AuthSource: authSource,
}, nil
}
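A usage sketch for the helper above with invented claim values, taking the no-validation path:

claims := jwt.MapClaims{
	"sub":      "jdoe",
	"name":     "Jane Doe",
	"roles":    []any{"user"},
	"projects": []any{"proj-17"},
}
user, err := getUserFromJWT(claims, false, schema.AuthToken, -1)
if err != nil {
	cclog.Errorf("extracting user from JWT failed: %v", err)
}
_ = user // Roles are kept only if they pass schema.IsValidRole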

View File

@@ -1,281 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package auth
import (
"testing"
"github.com/ClusterCockpit/cc-lib/schema"
"github.com/golang-jwt/jwt/v5"
)
// TestExtractStringFromClaims tests extracting string values from JWT claims
func TestExtractStringFromClaims(t *testing.T) {
claims := jwt.MapClaims{
"sub": "testuser",
"email": "test@example.com",
"age": 25, // not a string
}
tests := []struct {
name string
key string
expected string
}{
{"Existing string", "sub", "testuser"},
{"Another string", "email", "test@example.com"},
{"Non-existent key", "missing", ""},
{"Non-string value", "age", ""},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := extractStringFromClaims(claims, tt.key)
if result != tt.expected {
t.Errorf("Expected %s, got %s", tt.expected, result)
}
})
}
}
// TestExtractRolesFromClaims tests role extraction and validation
func TestExtractRolesFromClaims(t *testing.T) {
tests := []struct {
name string
claims jwt.MapClaims
validateRoles bool
expected []string
}{
{
name: "Valid roles without validation",
claims: jwt.MapClaims{
"roles": []any{"admin", "user", "invalid_role"},
},
validateRoles: false,
expected: []string{"admin", "user", "invalid_role"},
},
{
name: "Valid roles with validation",
claims: jwt.MapClaims{
"roles": []any{"admin", "user", "api"},
},
validateRoles: true,
expected: []string{"admin", "user", "api"},
},
{
name: "Invalid roles with validation",
claims: jwt.MapClaims{
"roles": []any{"invalid_role", "fake_role"},
},
validateRoles: true,
expected: []string{}, // Should filter out invalid roles
},
{
name: "No roles claim",
claims: jwt.MapClaims{},
validateRoles: false,
expected: []string{},
},
{
name: "Non-array roles",
claims: jwt.MapClaims{
"roles": "admin",
},
validateRoles: false,
expected: []string{},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := extractRolesFromClaims(tt.claims, tt.validateRoles)
if len(result) != len(tt.expected) {
t.Errorf("Expected %d roles, got %d", len(tt.expected), len(result))
return
}
for i, role := range result {
if i >= len(tt.expected) || role != tt.expected[i] {
t.Errorf("Expected role %s at position %d, got %s", tt.expected[i], i, role)
}
}
})
}
}
// TestExtractProjectsFromClaims tests project extraction from claims
func TestExtractProjectsFromClaims(t *testing.T) {
tests := []struct {
name string
claims jwt.MapClaims
expected []string
}{
{
name: "Projects as array of interfaces",
claims: jwt.MapClaims{
"projects": []any{"project1", "project2", "project3"},
},
expected: []string{"project1", "project2", "project3"},
},
{
name: "Projects as string array",
claims: jwt.MapClaims{
"projects": []string{"projectA", "projectB"},
},
expected: []string{"projectA", "projectB"},
},
{
name: "No projects claim",
claims: jwt.MapClaims{},
expected: []string{},
},
{
name: "Mixed types in projects array",
claims: jwt.MapClaims{
"projects": []any{"project1", 123, "project2"},
},
expected: []string{"project1", "project2"}, // Should skip non-strings
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := extractProjectsFromClaims(tt.claims)
if len(result) != len(tt.expected) {
t.Errorf("Expected %d projects, got %d", len(tt.expected), len(result))
return
}
for i, project := range result {
if i >= len(tt.expected) || project != tt.expected[i] {
t.Errorf("Expected project %s at position %d, got %s", tt.expected[i], i, project)
}
}
})
}
}
// TestExtractNameFromClaims tests name extraction from various formats
func TestExtractNameFromClaims(t *testing.T) {
tests := []struct {
name string
claims jwt.MapClaims
expected string
}{
{
name: "Simple string name",
claims: jwt.MapClaims{
"name": "John Doe",
},
expected: "John Doe",
},
{
name: "Nested name structure",
claims: jwt.MapClaims{
"name": map[string]any{
"values": []any{"John", "Doe"},
},
},
expected: "John Doe",
},
{
name: "Nested name with single value",
claims: jwt.MapClaims{
"name": map[string]any{
"values": []any{"Alice"},
},
},
expected: "Alice",
},
{
name: "No name claim",
claims: jwt.MapClaims{},
expected: "",
},
{
name: "Empty nested values",
claims: jwt.MapClaims{
"name": map[string]any{
"values": []any{},
},
},
expected: "",
},
{
name: "Nested with non-string values",
claims: jwt.MapClaims{
"name": map[string]any{
"values": []any{123, "Smith"},
},
},
expected: "123 Smith", // Should convert to string
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result := extractNameFromClaims(tt.claims)
if result != tt.expected {
t.Errorf("Expected '%s', got '%s'", tt.expected, result)
}
})
}
}
// TestGetUserFromJWT_NoValidation tests getUserFromJWT without database validation
func TestGetUserFromJWT_NoValidation(t *testing.T) {
claims := jwt.MapClaims{
"sub": "testuser",
"name": "Test User",
"roles": []any{"user", "admin"},
"projects": []any{"project1", "project2"},
}
user, err := getUserFromJWT(claims, false, schema.AuthToken, -1)
if err != nil {
t.Fatalf("Unexpected error: %v", err)
}
if user.Username != "testuser" {
t.Errorf("Expected username 'testuser', got '%s'", user.Username)
}
if user.Name != "Test User" {
t.Errorf("Expected name 'Test User', got '%s'", user.Name)
}
if len(user.Roles) != 2 {
t.Errorf("Expected 2 roles, got %d", len(user.Roles))
}
if len(user.Projects) != 2 {
t.Errorf("Expected 2 projects, got %d", len(user.Projects))
}
if user.AuthType != schema.AuthToken {
t.Errorf("Expected AuthType %v, got %v", schema.AuthToken, user.AuthType)
}
}
// TestGetUserFromJWT_MissingSub tests error when sub claim is missing
func TestGetUserFromJWT_MissingSub(t *testing.T) {
claims := jwt.MapClaims{
"name": "Test User",
}
_, err := getUserFromJWT(claims, false, schema.AuthToken, -1)
if err == nil {
t.Error("Expected error for missing sub claim")
}
if err.Error() != "missing 'sub' claim in JWT" {
t.Errorf("Expected specific error message, got: %v", err)
}
}

View File

@@ -2,10 +2,10 @@
// All rights reserved. This file is part of cc-backend. // All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style // Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file. // license that can be found in the LICENSE file.
package auth package auth
import ( import (
"database/sql"
"encoding/base64" "encoding/base64"
"errors" "errors"
"fmt" "fmt"
@@ -13,6 +13,8 @@ import (
"os" "os"
"strings" "strings"
"github.com/ClusterCockpit/cc-backend/internal/config"
"github.com/ClusterCockpit/cc-backend/internal/repository"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger" cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/schema" "github.com/ClusterCockpit/cc-lib/schema"
"github.com/golang-jwt/jwt/v5" "github.com/golang-jwt/jwt/v5"
@@ -58,7 +60,7 @@ func (ja *JWTSessionAuthenticator) Login(
rawtoken = r.URL.Query().Get("login-token") rawtoken = r.URL.Query().Get("login-token")
} }
token, err := jwt.Parse(rawtoken, func(t *jwt.Token) (any, error) { token, err := jwt.Parse(rawtoken, func(t *jwt.Token) (interface{}, error) {
if t.Method == jwt.SigningMethodHS256 || t.Method == jwt.SigningMethodHS512 { if t.Method == jwt.SigningMethodHS256 || t.Method == jwt.SigningMethodHS512 {
return ja.loginTokenKey, nil return ja.loginTokenKey, nil
} }
@@ -75,16 +77,70 @@ func (ja *JWTSessionAuthenticator) Login(
} }
claims := token.Claims.(jwt.MapClaims) claims := token.Claims.(jwt.MapClaims)
sub, _ := claims["sub"].(string)
// Use shared helper to get user from JWT claims
user, err = getUserFromJWT(claims, Keys.JwtConfig.ValidateUser, schema.AuthSession, schema.AuthViaToken) var roles []string
if err != nil { projects := make([]string, 0)
return nil, err
} if config.Keys.JwtConfig.ValidateUser {
var err error
// Sync or update user if configured user, err = repository.GetUserRepository().GetUser(sub)
if !Keys.JwtConfig.ValidateUser && (Keys.JwtConfig.SyncUserOnLogin || Keys.JwtConfig.UpdateUserOnLogin) { if err != nil && err != sql.ErrNoRows {
handleTokenUser(user) cclog.Errorf("Error while loading user '%v'", sub)
}
// Deny any logins for unknown usernames
if user == nil {
cclog.Warn("Could not find user from JWT in internal database.")
return nil, errors.New("unknown user")
}
} else {
var name string
if wrap, ok := claims["name"].(map[string]interface{}); ok {
if vals, ok := wrap["values"].([]interface{}); ok {
if len(vals) != 0 {
name = fmt.Sprintf("%v", vals[0])
for i := 1; i < len(vals); i++ {
name += fmt.Sprintf(" %v", vals[i])
}
}
}
}
// Extract roles from JWT (if present)
if rawroles, ok := claims["roles"].([]interface{}); ok {
for _, rr := range rawroles {
if r, ok := rr.(string); ok {
if schema.IsValidRole(r) {
roles = append(roles, r)
}
}
}
}
if rawprojs, ok := claims["projects"].([]interface{}); ok {
for _, pp := range rawprojs {
if p, ok := pp.(string); ok {
projects = append(projects, p)
}
}
} else if rawprojs, ok := claims["projects"]; ok {
projects = append(projects, rawprojs.([]string)...)
}
user = &schema.User{
Username: sub,
Name: name,
Roles: roles,
Projects: projects,
AuthType: schema.AuthSession,
AuthSource: schema.AuthViaToken,
}
if config.Keys.JwtConfig.SyncUserOnLogin || config.Keys.JwtConfig.UpdateUserOnLogin {
handleTokenUser(user)
}
} }
return user, nil return user, nil

View File

@@ -2,7 +2,6 @@
// All rights reserved. This file is part of cc-backend. // All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style // Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file. // license that can be found in the LICENSE file.
package auth package auth
import ( import (
@@ -12,26 +11,13 @@ import (
"os" "os"
"strings" "strings"
"github.com/ClusterCockpit/cc-backend/internal/config"
"github.com/ClusterCockpit/cc-backend/internal/repository" "github.com/ClusterCockpit/cc-backend/internal/repository"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger" cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/schema" "github.com/ClusterCockpit/cc-lib/schema"
"github.com/go-ldap/ldap/v3" "github.com/go-ldap/ldap/v3"
) )
type LdapConfig struct {
URL string `json:"url"`
UserBase string `json:"user_base"`
SearchDN string `json:"search_dn"`
UserBind string `json:"user_bind"`
UserFilter string `json:"user_filter"`
UserAttr string `json:"username_attr"`
SyncInterval string `json:"sync_interval"` // Parsed using time.ParseDuration.
SyncDelOldUsers bool `json:"sync_del_old_users"`
// Should a non-existent user be added to the DB if the user exists in the LDAP directory
SyncUserOnLogin bool `json:"syncUserOnLogin"`
}
type LdapAuthenticator struct { type LdapAuthenticator struct {
syncPassword string syncPassword string
UserAttr string UserAttr string
@@ -45,8 +31,10 @@ func (la *LdapAuthenticator) Init() error {
cclog.Warn("environment variable 'LDAP_ADMIN_PASSWORD' not set (ldap sync will not work)") cclog.Warn("environment variable 'LDAP_ADMIN_PASSWORD' not set (ldap sync will not work)")
} }
if Keys.LdapConfig.UserAttr != "" { lc := config.Keys.LdapConfig
la.UserAttr = Keys.LdapConfig.UserAttr
if lc.UserAttr != "" {
la.UserAttr = lc.UserAttr
} else { } else {
la.UserAttr = "gecos" la.UserAttr = "gecos"
} }
@@ -60,7 +48,7 @@ func (la *LdapAuthenticator) CanLogin(
rw http.ResponseWriter, rw http.ResponseWriter,
r *http.Request, r *http.Request,
) (*schema.User, bool) { ) (*schema.User, bool) {
lc := Keys.LdapConfig lc := config.Keys.LdapConfig
if user != nil { if user != nil {
if user.AuthSource == schema.AuthViaLDAP { if user.AuthSource == schema.AuthViaLDAP {
@@ -71,7 +59,6 @@ func (la *LdapAuthenticator) CanLogin(
l, err := la.getLdapConnection(true) l, err := la.getLdapConnection(true)
if err != nil { if err != nil {
cclog.Error("LDAP connection error") cclog.Error("LDAP connection error")
return nil, false
} }
defer l.Close() defer l.Close()
@@ -132,7 +119,7 @@ func (la *LdapAuthenticator) Login(
} }
defer l.Close() defer l.Close()
userDn := strings.ReplaceAll(Keys.LdapConfig.UserBind, "{username}", user.Username) userDn := strings.Replace(config.Keys.LdapConfig.UserBind, "{username}", user.Username, -1)
if err := l.Bind(userDn, r.FormValue("password")); err != nil { if err := l.Bind(userDn, r.FormValue("password")); err != nil {
cclog.Errorf("AUTH/LDAP > Authentication for user %s failed: %v", cclog.Errorf("AUTH/LDAP > Authentication for user %s failed: %v",
user.Username, err) user.Username, err)
@@ -143,11 +130,11 @@ func (la *LdapAuthenticator) Login(
} }
func (la *LdapAuthenticator) Sync() error { func (la *LdapAuthenticator) Sync() error {
const InDB int = 1 const IN_DB int = 1
const InLdap int = 2 const IN_LDAP int = 2
const InBoth int = 3 const IN_BOTH int = 3
ur := repository.GetUserRepository() ur := repository.GetUserRepository()
lc := Keys.LdapConfig lc := config.Keys.LdapConfig
users := map[string]int{} users := map[string]int{}
usernames, err := ur.GetLdapUsernames() usernames, err := ur.GetLdapUsernames()
@@ -156,7 +143,7 @@ func (la *LdapAuthenticator) Sync() error {
} }
for _, username := range usernames { for _, username := range usernames {
users[username] = InDB users[username] = IN_DB
} }
l, err := la.getLdapConnection(true) l, err := la.getLdapConnection(true)
@@ -185,18 +172,18 @@ func (la *LdapAuthenticator) Sync() error {
_, ok := users[username] _, ok := users[username]
if !ok { if !ok {
users[username] = InLdap users[username] = IN_LDAP
newnames[username] = entry.GetAttributeValue(la.UserAttr) newnames[username] = entry.GetAttributeValue(la.UserAttr)
} else { } else {
users[username] = InBoth users[username] = IN_BOTH
} }
} }
for username, where := range users { for username, where := range users {
if where == InDB && lc.SyncDelOldUsers { if where == IN_DB && lc.SyncDelOldUsers {
ur.DelUser(username) ur.DelUser(username)
cclog.Debugf("sync: remove %v (does not show up in LDAP anymore)", username) cclog.Debugf("sync: remove %v (does not show up in LDAP anymore)", username)
} else if where == InLdap { } else if where == IN_LDAP {
name := newnames[username] name := newnames[username]
var roles []string var roles []string
@@ -223,8 +210,8 @@ func (la *LdapAuthenticator) Sync() error {
} }
func (la *LdapAuthenticator) getLdapConnection(admin bool) (*ldap.Conn, error) { func (la *LdapAuthenticator) getLdapConnection(admin bool) (*ldap.Conn, error) {
lc := Keys.LdapConfig lc := config.Keys.LdapConfig
conn, err := ldap.DialURL(lc.URL) conn, err := ldap.DialURL(lc.Url)
if err != nil { if err != nil {
cclog.Warn("LDAP URL dial failed") cclog.Warn("LDAP URL dial failed")
return nil, err return nil, err

View File

@@ -2,7 +2,6 @@
// All rights reserved. This file is part of cc-backend. // All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style // Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file. // license that can be found in the LICENSE file.
package auth package auth
import ( import (

View File

@@ -2,7 +2,6 @@
// All rights reserved. This file is part of cc-backend. // All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style // Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file. // license that can be found in the LICENSE file.
package auth package auth
import ( import (
@@ -14,6 +13,7 @@ import (
"os" "os"
"time" "time"
"github.com/ClusterCockpit/cc-backend/internal/config"
"github.com/ClusterCockpit/cc-backend/internal/repository" "github.com/ClusterCockpit/cc-backend/internal/repository"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger" cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/schema" "github.com/ClusterCockpit/cc-lib/schema"
@@ -22,12 +22,6 @@ import (
"golang.org/x/oauth2" "golang.org/x/oauth2"
) )
type OpenIDConfig struct {
Provider string `json:"provider"`
SyncUserOnLogin bool `json:"syncUserOnLogin"`
UpdateUserOnLogin bool `json:"updateUserOnLogin"`
}
type OIDC struct { type OIDC struct {
client *oauth2.Config client *oauth2.Config
provider *oidc.Provider provider *oidc.Provider
@@ -54,13 +48,8 @@ func setCallbackCookie(w http.ResponseWriter, r *http.Request, name, value strin
http.SetCookie(w, c) http.SetCookie(w, c)
} }
// NewOIDC creates a new OIDC authenticator with the configured provider
func NewOIDC(a *Authentication) *OIDC { func NewOIDC(a *Authentication) *OIDC {
// Use context with timeout for provider initialization provider, err := oidc.NewProvider(context.Background(), config.Keys.OpenIDConfig.Provider)
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
provider, err := oidc.NewProvider(ctx, Keys.OpenIDConfig.Provider)
if err != nil { if err != nil {
cclog.Fatal(err) cclog.Fatal(err)
} }
@@ -116,18 +105,13 @@ func (oa *OIDC) OAuth2Callback(rw http.ResponseWriter, r *http.Request) {
http.Error(rw, "Code not found", http.StatusBadRequest) http.Error(rw, "Code not found", http.StatusBadRequest)
return return
} }
// Exchange authorization code for token with timeout token, err := oa.client.Exchange(context.Background(), code, oauth2.VerifierOption(codeVerifier))
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
token, err := oa.client.Exchange(ctx, code, oauth2.VerifierOption(codeVerifier))
if err != nil { if err != nil {
http.Error(rw, "Failed to exchange token: "+err.Error(), http.StatusInternalServerError) http.Error(rw, "Failed to exchange token: "+err.Error(), http.StatusInternalServerError)
return return
} }
// Get user info from OIDC provider with same timeout userInfo, err := oa.provider.UserInfo(context.Background(), oauth2.StaticTokenSource(token))
userInfo, err := oa.provider.UserInfo(ctx, oauth2.StaticTokenSource(token))
if err != nil { if err != nil {
http.Error(rw, "Failed to get userinfo: "+err.Error(), http.StatusInternalServerError) http.Error(rw, "Failed to get userinfo: "+err.Error(), http.StatusInternalServerError)
return return
@@ -184,14 +168,14 @@ func (oa *OIDC) OAuth2Callback(rw http.ResponseWriter, r *http.Request) {
AuthSource: schema.AuthViaOIDC, AuthSource: schema.AuthViaOIDC,
} }
if Keys.OpenIDConfig.SyncUserOnLogin || Keys.OpenIDConfig.UpdateUserOnLogin { if config.Keys.OpenIDConfig.SyncUserOnLogin || config.Keys.OpenIDConfig.UpdateUserOnLogin {
handleOIDCUser(user) handleOIDCUser(user)
} }
oa.authentication.SaveSession(rw, r, user) oa.authentication.SaveSession(rw, r, user)
cclog.Infof("login successfull: user: %#v (roles: %v, projects: %v)", user.Username, user.Roles, user.Projects) cclog.Infof("login successfull: user: %#v (roles: %v, projects: %v)", user.Username, user.Roles, user.Projects)
userCtx := context.WithValue(r.Context(), repository.ContextUserKey, user) ctx := context.WithValue(r.Context(), repository.ContextUserKey, user)
http.RedirectHandler("/", http.StatusTemporaryRedirect).ServeHTTP(rw, r.WithContext(userCtx)) http.RedirectHandler("/", http.StatusTemporaryRedirect).ServeHTTP(rw, r.WithContext(ctx))
} }
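A hedged wiring sketch for these two handlers (the route paths and the router and auth variables are assumptions, not taken from this diff):

oa := NewOIDC(auth)
router.HandleFunc("/oidc-login", oa.OAuth2Login)
router.HandleFunc("/oidc-callback", oa.OAuth2Callback)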
func (oa *OIDC) OAuth2Login(rw http.ResponseWriter, r *http.Request) { func (oa *OIDC) OAuth2Login(rw http.ResponseWriter, r *http.Request) {

View File

@@ -1,96 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package auth
var configSchema = `
{
"jwts": {
"description": "For JWT token authentication.",
"type": "object",
"properties": {
"max-age": {
"description": "Configure how long a token is valid. As string parsable by time.ParseDuration()",
"type": "string"
},
"cookieName": {
"description": "Cookie that should be checked for a JWT token.",
"type": "string"
},
"validateUser": {
"description": "Deny login for users not in database (but defined in JWT). Overwrite roles in JWT with database roles.",
"type": "boolean"
},
"trustedIssuer": {
"description": "Issuer that should be accepted when validating external JWTs ",
"type": "string"
},
"syncUserOnLogin": {
"description": "Add non-existent user to DB at login attempt with values provided in JWT.",
"type": "boolean"
}
},
"required": ["max-age"]
},
"oidc": {
"provider": {
"description": "",
"type": "string"
},
"syncUserOnLogin": {
"description": "",
"type": "boolean"
},
"updateUserOnLogin": {
"description": "",
"type": "boolean"
},
"required": ["provider"]
},
"ldap": {
"description": "For LDAP Authentication and user synchronisation.",
"type": "object",
"properties": {
"url": {
"description": "URL of LDAP directory server.",
"type": "string"
},
"user_base": {
"description": "Base DN of user tree root.",
"type": "string"
},
"search_dn": {
"description": "DN for authenticating LDAP admin account with general read rights.",
"type": "string"
},
"user_bind": {
"description": "Expression used to authenticate users via LDAP bind. Must contain uid={username}.",
"type": "string"
},
"user_filter": {
"description": "Filter to extract users for syncing.",
"type": "string"
},
"username_attr": {
"description": "Attribute with full username. Default: gecos",
"type": "string"
},
"sync_interval": {
"description": "Interval used for syncing local user table with LDAP directory. Parsed using time.ParseDuration.",
"type": "string"
},
"sync_del_old_users": {
"description": "Delete obsolete users in database.",
"type": "boolean"
},
"syncUserOnLogin": {
"description": "Add non-existent user to DB at login attempt if user exists in Ldap directory",
"type": "boolean"
}
},
"required": ["url", "user_base", "search_dn", "user_bind", "user_filter"]
},
"required": ["jwts"]
}`
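For reference, a hedged config fragment that would satisfy this schema's required fields (every value invented):

const exampleAuthConfig = `{
	"jwts": { "max-age": "2h" },
	"ldap": {
		"url": "ldaps://ldap.example.org",
		"user_base": "ou=people,dc=example,dc=org",
		"search_dn": "cn=reader,dc=example,dc=org",
		"user_bind": "uid={username},ou=people,dc=example,dc=org",
		"user_filter": "(objectClass=posixAccount)"
	}
}`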

View File

@@ -2,159 +2,71 @@
// All rights reserved. This file is part of cc-backend. // All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style // Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file. // license that can be found in the LICENSE file.
// Package config implements the program configuration data structures, validation and parsing
package config package config
import ( import (
"bytes" "bytes"
"encoding/json" "encoding/json"
"time" "os"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger" cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/resampler" "github.com/ClusterCockpit/cc-lib/schema"
) )
type ProgramConfig struct { var Keys schema.ProgramConfig = schema.ProgramConfig{
// Address where the http (or https) server will listen on (for example: 'localhost:80').
Addr string `json:"addr"`
// Addresses from which secured admin API endpoints can be reached, can be wildcard "*"
APIAllowedIPs []string `json:"apiAllowedIPs"`
APISubjects *NATSConfig `json:"apiSubjects"`
// Drop root permissions once .env was read and the port was taken.
User string `json:"user"`
Group string `json:"group"`
// Disable authentication (for everything: API, Web-UI, ...)
DisableAuthentication bool `json:"disable-authentication"`
// If `embed-static-files` is true (default), the frontend files are directly
// embeded into the go binary and expected to be in web/frontend. Only if
// it is false the files in `static-files` are served instead.
EmbedStaticFiles bool `json:"embed-static-files"`
StaticFiles string `json:"static-files"`
// Database driver - only 'sqlite3' is supported
DBDriver string `json:"db-driver"`
// Path to SQLite database file
DB string `json:"db"`
// Keep all metric data in the metric data repositories,
// do not write to the job-archive.
DisableArchive bool `json:"disable-archive"`
EnableJobTaggers bool `json:"enable-job-taggers"`
// Validate json input against schema
Validate bool `json:"validate"`
// If 0 or empty, the session does not expire!
SessionMaxAge string `json:"session-max-age"`
// If both those options are not empty, use HTTPS using those certificates.
HTTPSCertFile string `json:"https-cert-file"`
HTTPSKeyFile string `json:"https-key-file"`
// If not the empty string and `addr` does not end in ":80",
// redirect every request incoming at port 80 to that url.
RedirectHTTPTo string `json:"redirect-http-to"`
// Where to store MachineState files
MachineStateDir string `json:"machine-state-dir"`
// If not zero, automatically mark jobs as stopped running X seconds longer than their walltime.
StopJobsExceedingWalltime int `json:"stop-jobs-exceeding-walltime"`
// Defines time X in seconds in which jobs are considered to be "short" and will be filtered in specific views.
ShortRunningJobsDuration int `json:"short-running-jobs-duration"`
// Energy Mix CO2 Emission Constant [g/kWh]
// If entered, displays estimated CO2 emission for job based on jobs totalEnergy
EmissionConstant int `json:"emission-constant"`
// If exists, will enable dynamic zoom in frontend metric plots using the configured values
EnableResampling *ResampleConfig `json:"resampling"`
// Global upstream metric repository configuration for metric pull workers
UpstreamMetricRepository *json.RawMessage `json:"upstreamMetricRepository,omitempty"`
}
type ResampleConfig struct {
// Minimum number of points to trigger resampling of data
MinimumPoints int `json:"minimumPoints"`
// Array of resampling target resolutions, in seconds; Example: [600,300,60]
Resolutions []int `json:"resolutions"`
// Trigger next zoom level at less than this many visible datapoints
Trigger int `json:"trigger"`
}
type NATSConfig struct {
SubjectJobStart string `json:"subjectJobStart"`
SubjectJobStop string `json:"subjectJobStop"`
SubjectNodeState string `json:"subjectNodeState"`
}
type IntRange struct {
From int `json:"from"`
To int `json:"to"`
}
type TimeRange struct {
From *time.Time `json:"from"`
To *time.Time `json:"to"`
Range string `json:"range,omitempty"`
}
type FilterRanges struct {
Duration *IntRange `json:"duration"`
NumNodes *IntRange `json:"numNodes"`
StartTime *TimeRange `json:"startTime"`
}
type ClusterConfig struct {
Name string `json:"name"`
FilterRanges *FilterRanges `json:"filterRanges"`
}
var Clusters []*ClusterConfig
var Keys ProgramConfig = ProgramConfig{
Addr: "localhost:8080", Addr: "localhost:8080",
DisableAuthentication: false, DisableAuthentication: false,
EmbedStaticFiles: true, EmbedStaticFiles: true,
DBDriver: "sqlite3", DBDriver: "sqlite3",
DB: "./var/job.db", DB: "./var/job.db",
Archive: json.RawMessage(`{"kind": "file", "path": "./var/job-archive"}`),
DisableArchive: false, DisableArchive: false,
Validate: false, Validate: false,
SessionMaxAge: "168h", SessionMaxAge: "168h",
StopJobsExceedingWalltime: 0, StopJobsExceedingWalltime: 0,
ShortRunningJobsDuration: 5 * 60, ShortRunningJobsDuration: 5 * 60,
UiDefaults: map[string]interface{}{
"analysis_view_histogramMetrics": []string{"flops_any", "mem_bw", "mem_used"},
"analysis_view_scatterPlotMetrics": [][]string{{"flops_any", "mem_bw"}, {"flops_any", "cpu_load"}, {"cpu_load", "mem_bw"}},
"job_view_nodestats_selectedMetrics": []string{"flops_any", "mem_bw", "mem_used"},
"job_view_selectedMetrics": []string{"flops_any", "mem_bw", "mem_used"},
"job_view_showFootprint": true,
"job_list_usePaging": false,
"plot_general_colorBackground": true,
"plot_general_colorscheme": []string{"#00bfff", "#0000ff", "#ff00ff", "#ff0000", "#ff8000", "#ffff00", "#80ff00"},
"plot_general_lineWidth": 3,
"plot_list_jobsPerPage": 50,
"plot_list_selectedMetrics": []string{"cpu_load", "mem_used", "flops_any", "mem_bw"},
"plot_view_plotsPerRow": 3,
"plot_view_showPolarplot": true,
"plot_view_showRoofline": true,
"plot_view_showStatTable": true,
"system_view_selectedMetric": "cpu_load",
"analysis_view_selectedTopEntity": "user",
"analysis_view_selectedTopCategory": "totalWalltime",
"status_view_selectedTopUserCategory": "totalJobs",
"status_view_selectedTopProjectCategory": "totalJobs",
},
} }
func Init(mainConfig json.RawMessage, clusterConfig json.RawMessage) { func Init(flagConfigFile string) {
Validate(configSchema, mainConfig) raw, err := os.ReadFile(flagConfigFile)
dec := json.NewDecoder(bytes.NewReader(mainConfig)) if err != nil {
dec.DisallowUnknownFields() if !os.IsNotExist(err) {
if err := dec.Decode(&Keys); err != nil { cclog.Abortf("Config Init: Could not read config file '%s'.\nError: %s\n", flagConfigFile, err.Error())
cclog.Abortf("Config Init: Could not decode config file '%s'.\nError: %s\n", mainConfig, err.Error()) }
} } else {
if err := schema.Validate(schema.Config, bytes.NewReader(raw)); err != nil {
cclog.Abortf("Config Init: Could not validate config file '%s'.\nError: %s\n", flagConfigFile, err.Error())
}
dec := json.NewDecoder(bytes.NewReader(raw))
dec.DisallowUnknownFields()
if err := dec.Decode(&Keys); err != nil {
cclog.Abortf("Config Init: Could not decode config file '%s'.\nError: %s\n", flagConfigFile, err.Error())
}
Validate(clustersSchema, clusterConfig) if Keys.Clusters == nil || len(Keys.Clusters) < 1 {
dec = json.NewDecoder(bytes.NewReader(clusterConfig)) cclog.Abort("Config Init: At least one cluster required in config. Exited with error.")
dec.DisallowUnknownFields() }
if err := dec.Decode(&Clusters); err != nil {
cclog.Abortf("Config Init: Could not decode config file '%s'.\nError: %s\n", mainConfig, err.Error())
}
if len(Clusters) < 1 {
cclog.Abort("Config Init: At least one cluster required in config. Exited with error.")
}
if Keys.EnableResampling != nil && Keys.EnableResampling.MinimumPoints > 0 {
resampler.SetMinimumRequiredPoints(Keys.EnableResampling.MinimumPoints)
} }
} }
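
With the new signature, callers load and split the configuration via cc-lib's ccConfig before invoking Init; a minimal sketch mirroring the pattern used in the package tests below:

```go
package main

import (
	"github.com/ClusterCockpit/cc-backend/internal/config"
	ccconf "github.com/ClusterCockpit/cc-lib/ccConfig"
	cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
)

func main() {
	// Read the combined JSON file and register its named sections.
	ccconf.Init("./config.json")

	// Pass the raw "main" and "clusters" sections on to config.Init.
	if cfg := ccconf.GetPackageConfig("main"); cfg != nil {
		if clustercfg := ccconf.GetPackageConfig("clusters"); clustercfg != nil {
			config.Init(cfg, clustercfg)
		} else {
			cclog.Abort("Cluster configuration must be present")
		}
	} else {
		cclog.Abort("Main configuration must be present")
	}
}
```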

View File

@@ -2,29 +2,15 @@
// All rights reserved. This file is part of cc-backend. // All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style // Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file. // license that can be found in the LICENSE file.
package config package config
import ( import (
"testing" "testing"
ccconf "github.com/ClusterCockpit/cc-lib/ccConfig"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
) )
func TestInit(t *testing.T) { func TestInit(t *testing.T) {
fp := "../../configs/config.json" fp := "../../configs/config.json"
ccconf.Init(fp) Init(fp)
if cfg := ccconf.GetPackageConfig("main"); cfg != nil {
if clustercfg := ccconf.GetPackageConfig("clusters"); clustercfg != nil {
Init(cfg, clustercfg)
} else {
cclog.Abort("Cluster configuration must be present")
}
} else {
cclog.Abort("Main configuration must be present")
}
if Keys.Addr != "0.0.0.0:443" { if Keys.Addr != "0.0.0.0:443" {
t.Errorf("wrong addr\ngot: %s \nwant: 0.0.0.0:443", Keys.Addr) t.Errorf("wrong addr\ngot: %s \nwant: 0.0.0.0:443", Keys.Addr)
} }
@@ -32,17 +18,7 @@ func TestInit(t *testing.T) {
func TestInitMinimal(t *testing.T) { func TestInitMinimal(t *testing.T) {
fp := "../../configs/config-demo.json" fp := "../../configs/config-demo.json"
ccconf.Init(fp) Init(fp)
if cfg := ccconf.GetPackageConfig("main"); cfg != nil {
if clustercfg := ccconf.GetPackageConfig("clusters"); clustercfg != nil {
Init(cfg, clustercfg)
} else {
cclog.Abort("Cluster configuration must be present")
}
} else {
cclog.Abort("Main configuration must be present")
}
if Keys.Addr != "127.0.0.1:8080" { if Keys.Addr != "127.0.0.1:8080" {
t.Errorf("wrong addr\ngot: %s \nwant: 127.0.0.1:8080", Keys.Addr) t.Errorf("wrong addr\ngot: %s \nwant: 127.0.0.1:8080", Keys.Addr)
} }

View File

@@ -2,7 +2,6 @@
// All rights reserved. This file is part of cc-backend. // All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style // Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file. // license that can be found in the LICENSE file.
package config package config
import ( import (
@@ -11,8 +10,6 @@ import (
"strings" "strings"
) )
// DEPRECATED: SUPERSEDED BY NEW USER CONFIG - userConfig.go / web.go
type DefaultMetricsCluster struct { type DefaultMetricsCluster struct {
Name string `json:"name"` Name string `json:"name"`
DefaultMetrics string `json:"default_metrics"` DefaultMetrics string `json:"default_metrics"`

View File

@@ -1,222 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package config
var configSchema = `
{
"type": "object",
"properties": {
"addr": {
"description": "Address where the http (or https) server will listen on (for example: 'localhost:80').",
"type": "string"
},
"apiAllowedIPs": {
"description": "Addresses from which secured API endpoints can be reached",
"type": "array",
"items": {
"type": "string"
}
},
"user": {
"description": "Drop root permissions once .env was read and the port was taken. Only applicable if using privileged port.",
"type": "string"
},
"group": {
"description": "Drop root permissions once .env was read and the port was taken. Only applicable if using privileged port.",
"type": "string"
},
"disable-authentication": {
"description": "Disable authentication (for everything: API, Web-UI, ...).",
"type": "boolean"
},
"embed-static-files": {
"description": "If all files in web/frontend/public should be served from within the binary itself (they are embedded) or not.",
"type": "boolean"
},
"static-files": {
"description": "Folder where static assets can be found, if embed-static-files is false.",
"type": "string"
},
"db": {
"description": "Path to SQLite database file (e.g., './var/job.db')",
"type": "string"
},
"disable-archive": {
"description": "Keep all metric data in the metric data repositories, do not write to the job-archive.",
"type": "boolean"
},
"enable-job-taggers": {
"description": "Turn on automatic application and jobclass taggers",
"type": "boolean"
},
"validate": {
"description": "Validate all input json documents against json schema.",
"type": "boolean"
},
"session-max-age": {
"description": "Specifies for how long a session shall be valid as a string parsable by time.ParseDuration(). If 0 or empty, the session/token does not expire!",
"type": "string"
},
"https-cert-file": {
"description": "Filepath to SSL certificate. If also https-key-file is set use HTTPS using those certificates.",
"type": "string"
},
"https-key-file": {
"description": "Filepath to SSL key file. If also https-cert-file is set use HTTPS using those certificates.",
"type": "string"
},
"redirect-http-to": {
"description": "If not the empty string and addr does not end in :80, redirect every request incoming at port 80 to that url.",
"type": "string"
},
"stop-jobs-exceeding-walltime": {
"description": "If not zero, automatically mark jobs as stopped running X seconds longer than their walltime. Only applies if walltime is set for job.",
"type": "integer"
},
"short-running-jobs-duration": {
"description": "Do not show running jobs shorter than X seconds.",
"type": "integer"
},
"emission-constant": {
"description": ".",
"type": "integer"
},
"cron-frequency": {
"description": "Frequency of cron job workers.",
"type": "object",
"properties": {
"duration-worker": {
"description": "Duration Update Worker [Defaults to '5m']",
"type": "string"
},
"footprint-worker": {
"description": "Metric-Footprint Update Worker [Defaults to '10m']",
"type": "string"
}
}
},
"enable-resampling": {
"description": "Enable dynamic zoom in frontend metric plots.",
"type": "object",
"properties": {
"minimumPoints": {
"description": "Minimum points to trigger resampling of time-series data.",
"type": "integer"
},
"trigger": {
"description": "Trigger next zoom level at less than this many visible datapoints.",
"type": "integer"
},
"resolutions": {
"description": "Array of resampling target resolutions, in seconds.",
"type": "array",
"items": {
"type": "integer"
}
}
},
"required": ["trigger", "resolutions"]
},
"upstreamMetricRepository": {
"description": "Global upstream metric repository configuration for metric pull workers",
"type": "object",
"properties": {
"kind": {
"type": "string",
"enum": ["influxdb", "prometheus", "cc-metric-store", "cc-metric-store-internal", "test"]
},
"url": {
"type": "string"
},
"token": {
"type": "string"
}
},
"required": ["kind"]
}
},
"required": ["apiAllowedIPs"]
}`
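
A minimal document satisfying this schema (only apiAllowedIPs is required; the other values shown are illustrative):

```json
{
  "addr": "localhost:8080",
  "apiAllowedIPs": ["*"],
  "short-running-jobs-duration": 300,
  "enable-resampling": {
    "trigger": 30,
    "resolutions": [600, 300, 60]
  }
}
```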
var clustersSchema = `
{
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {
"description": "The name of the cluster.",
"type": "string"
},
"metricDataRepository": {
"description": "Type of the metric data repository for this cluster",
"type": "object",
"properties": {
"kind": {
"type": "string",
"enum": ["influxdb", "prometheus", "cc-metric-store", "cc-metric-store-internal", "test"]
},
"url": {
"type": "string"
},
"token": {
"type": "string"
}
},
"required": ["kind"]
},
"filterRanges": {
"description": "This option controls the slider ranges for the UI controls of numNodes, duration, and startTime.",
"type": "object",
"properties": {
"numNodes": {
"description": "UI slider range for number of nodes",
"type": "object",
"properties": {
"from": {
"type": "integer"
},
"to": {
"type": "integer"
}
},
"required": ["from", "to"]
},
"duration": {
"description": "UI slider range for duration",
"type": "object",
"properties": {
"from": {
"type": "integer"
},
"to": {
"type": "integer"
}
},
"required": ["from", "to"]
},
"startTime": {
"description": "UI slider range for start time",
"type": "object",
"properties": {
"from": {
"type": "string",
"format": "date-time"
},
"to": {
"type": "null"
}
},
"required": ["from", "to"]
}
},
"required": ["numNodes", "duration", "startTime"]
}
},
"required": ["name", "filterRanges"],
"minItems": 1
}
}`
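
An illustrative cluster list matching this schema (values adapted from the test configuration further below; the metric repository URL is a placeholder):

```json
[
  {
    "name": "testcluster",
    "metricDataRepository": { "kind": "cc-metric-store", "url": "http://localhost:8082" },
    "filterRanges": {
      "numNodes": { "from": 1, "to": 64 },
      "duration": { "from": 0, "to": 86400 },
      "startTime": { "from": "2022-01-01T00:00:00Z", "to": null }
    }
  }
]
```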

View File

@@ -1,29 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package config
import (
"encoding/json"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/santhosh-tekuri/jsonschema/v5"
)
func Validate(schema string, instance json.RawMessage) {
sch, err := jsonschema.CompileString("schema.json", schema)
if err != nil {
cclog.Fatalf("%#v", err)
}
var v any
if err := json.Unmarshal([]byte(instance), &v); err != nil {
cclog.Fatal(err)
}
if err = sch.Validate(v); err != nil {
cclog.Fatalf("%#v", err)
}
}
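
The config package's Init (shown above) invokes this helper once per raw section:

```go
// Abort via cclog if either raw JSON section violates its schema.
Validate(configSchema, mainConfig)
Validate(clustersSchema, clusterConfig)
```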

File diff suppressed because it is too large

View File

@@ -2,5 +2,4 @@
// All rights reserved. This file is part of cc-backend. // All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style // Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file. // license that can be found in the LICENSE file.
package model package model

View File

@@ -3,28 +3,14 @@
package model package model
import ( import (
"bytes"
"fmt" "fmt"
"io" "io"
"strconv" "strconv"
"time" "time"
"github.com/ClusterCockpit/cc-backend/internal/config"
"github.com/ClusterCockpit/cc-lib/schema" "github.com/ClusterCockpit/cc-lib/schema"
) )
type ClusterMetricWithName struct {
Name string `json:"name"`
Unit *schema.Unit `json:"unit,omitempty"`
Timestep int `json:"timestep"`
Data []schema.Float `json:"data"`
}
type ClusterMetrics struct {
NodeCount int `json:"nodeCount"`
Metrics []*ClusterMetricWithName `json:"metrics"`
}
type Count struct { type Count struct {
Name string `json:"name"` Name string `json:"name"`
Count int `json:"count"` Count int `json:"count"`
@@ -72,16 +58,16 @@ type JobFilter struct {
JobName *StringInput `json:"jobName,omitempty"` JobName *StringInput `json:"jobName,omitempty"`
Cluster *StringInput `json:"cluster,omitempty"` Cluster *StringInput `json:"cluster,omitempty"`
Partition *StringInput `json:"partition,omitempty"` Partition *StringInput `json:"partition,omitempty"`
Duration *config.IntRange `json:"duration,omitempty"` Duration *schema.IntRange `json:"duration,omitempty"`
Energy *FloatRange `json:"energy,omitempty"` Energy *FloatRange `json:"energy,omitempty"`
MinRunningFor *int `json:"minRunningFor,omitempty"` MinRunningFor *int `json:"minRunningFor,omitempty"`
NumNodes *config.IntRange `json:"numNodes,omitempty"` NumNodes *schema.IntRange `json:"numNodes,omitempty"`
NumAccelerators *config.IntRange `json:"numAccelerators,omitempty"` NumAccelerators *schema.IntRange `json:"numAccelerators,omitempty"`
NumHWThreads *config.IntRange `json:"numHWThreads,omitempty"` NumHWThreads *schema.IntRange `json:"numHWThreads,omitempty"`
StartTime *config.TimeRange `json:"startTime,omitempty"` StartTime *schema.TimeRange `json:"startTime,omitempty"`
State []schema.JobState `json:"state,omitempty"` State []schema.JobState `json:"state,omitempty"`
MetricStats []*MetricStatItem `json:"metricStats,omitempty"` MetricStats []*MetricStatItem `json:"metricStats,omitempty"`
Shared *string `json:"shared,omitempty"` Exclusive *int `json:"exclusive,omitempty"`
Node *StringInput `json:"node,omitempty"` Node *StringInput `json:"node,omitempty"`
} }
@@ -126,7 +112,6 @@ type JobStats struct {
type JobsStatistics struct { type JobsStatistics struct {
ID string `json:"id"` ID string `json:"id"`
Name string `json:"name"` Name string `json:"name"`
TotalUsers int `json:"totalUsers"`
TotalJobs int `json:"totalJobs"` TotalJobs int `json:"totalJobs"`
RunningJobs int `json:"runningJobs"` RunningJobs int `json:"runningJobs"`
ShortJobs int `json:"shortJobs"` ShortJobs int `json:"shortJobs"`
@@ -183,17 +168,14 @@ type NamedStatsWithScope struct {
} }
type NodeFilter struct { type NodeFilter struct {
Hostname *StringInput `json:"hostname,omitempty"` Hostname *StringInput `json:"hostname,omitempty"`
Cluster *StringInput `json:"cluster,omitempty"` Cluster *StringInput `json:"cluster,omitempty"`
Subcluster *StringInput `json:"subcluster,omitempty"` NodeState *string `json:"nodeState,omitempty"`
SchedulerState *schema.SchedulerState `json:"schedulerState,omitempty"` HealthState *schema.NodeState `json:"healthState,omitempty"`
HealthState *string `json:"healthState,omitempty"`
TimeStart *int `json:"timeStart,omitempty"`
} }
type NodeMetrics struct { type NodeMetrics struct {
Host string `json:"host"` Host string `json:"host"`
State string `json:"state"`
SubCluster string `json:"subCluster"` SubCluster string `json:"subCluster"`
Metrics []*JobMetricWithName `json:"metrics"` Metrics []*JobMetricWithName `json:"metrics"`
} }
@@ -203,17 +185,11 @@ type NodeStateResultList struct {
Count *int `json:"count,omitempty"` Count *int `json:"count,omitempty"`
} }
type NodeStates struct { type NodeStats struct {
State string `json:"state"` State string `json:"state"`
Count int `json:"count"` Count int `json:"count"`
} }
type NodeStatesTimed struct {
State string `json:"state"`
Counts []int `json:"counts"`
Times []int `json:"times"`
}
type NodesResultList struct { type NodesResultList struct {
Items []*NodeMetrics `json:"items"` Items []*NodeMetrics `json:"items"`
Offset *int `json:"offset,omitempty"` Offset *int `json:"offset,omitempty"`
@@ -270,22 +246,20 @@ type User struct {
type Aggregate string type Aggregate string
const ( const (
AggregateUser Aggregate = "USER" AggregateUser Aggregate = "USER"
AggregateProject Aggregate = "PROJECT" AggregateProject Aggregate = "PROJECT"
AggregateCluster Aggregate = "CLUSTER" AggregateCluster Aggregate = "CLUSTER"
AggregateSubcluster Aggregate = "SUBCLUSTER"
) )
var AllAggregate = []Aggregate{ var AllAggregate = []Aggregate{
AggregateUser, AggregateUser,
AggregateProject, AggregateProject,
AggregateCluster, AggregateCluster,
AggregateSubcluster,
} }
func (e Aggregate) IsValid() bool { func (e Aggregate) IsValid() bool {
switch e { switch e {
case AggregateUser, AggregateProject, AggregateCluster, AggregateSubcluster: case AggregateUser, AggregateProject, AggregateCluster:
return true return true
} }
return false return false
@@ -312,26 +286,11 @@ func (e Aggregate) MarshalGQL(w io.Writer) {
fmt.Fprint(w, strconv.Quote(e.String())) fmt.Fprint(w, strconv.Quote(e.String()))
} }
func (e *Aggregate) UnmarshalJSON(b []byte) error {
s, err := strconv.Unquote(string(b))
if err != nil {
return err
}
return e.UnmarshalGQL(s)
}
func (e Aggregate) MarshalJSON() ([]byte, error) {
var buf bytes.Buffer
e.MarshalGQL(&buf)
return buf.Bytes(), nil
}
type SortByAggregate string type SortByAggregate string
const ( const (
SortByAggregateTotalwalltime SortByAggregate = "TOTALWALLTIME" SortByAggregateTotalwalltime SortByAggregate = "TOTALWALLTIME"
SortByAggregateTotaljobs SortByAggregate = "TOTALJOBS" SortByAggregateTotaljobs SortByAggregate = "TOTALJOBS"
SortByAggregateTotalusers SortByAggregate = "TOTALUSERS"
SortByAggregateTotalnodes SortByAggregate = "TOTALNODES" SortByAggregateTotalnodes SortByAggregate = "TOTALNODES"
SortByAggregateTotalnodehours SortByAggregate = "TOTALNODEHOURS" SortByAggregateTotalnodehours SortByAggregate = "TOTALNODEHOURS"
SortByAggregateTotalcores SortByAggregate = "TOTALCORES" SortByAggregateTotalcores SortByAggregate = "TOTALCORES"
@@ -343,7 +302,6 @@ const (
var AllSortByAggregate = []SortByAggregate{ var AllSortByAggregate = []SortByAggregate{
SortByAggregateTotalwalltime, SortByAggregateTotalwalltime,
SortByAggregateTotaljobs, SortByAggregateTotaljobs,
SortByAggregateTotalusers,
SortByAggregateTotalnodes, SortByAggregateTotalnodes,
SortByAggregateTotalnodehours, SortByAggregateTotalnodehours,
SortByAggregateTotalcores, SortByAggregateTotalcores,
@@ -354,7 +312,7 @@ var AllSortByAggregate = []SortByAggregate{
func (e SortByAggregate) IsValid() bool { func (e SortByAggregate) IsValid() bool {
switch e { switch e {
case SortByAggregateTotalwalltime, SortByAggregateTotaljobs, SortByAggregateTotalusers, SortByAggregateTotalnodes, SortByAggregateTotalnodehours, SortByAggregateTotalcores, SortByAggregateTotalcorehours, SortByAggregateTotalaccs, SortByAggregateTotalacchours: case SortByAggregateTotalwalltime, SortByAggregateTotaljobs, SortByAggregateTotalnodes, SortByAggregateTotalnodehours, SortByAggregateTotalcores, SortByAggregateTotalcorehours, SortByAggregateTotalaccs, SortByAggregateTotalacchours:
return true return true
} }
return false return false
@@ -381,20 +339,6 @@ func (e SortByAggregate) MarshalGQL(w io.Writer) {
fmt.Fprint(w, strconv.Quote(e.String())) fmt.Fprint(w, strconv.Quote(e.String()))
} }
func (e *SortByAggregate) UnmarshalJSON(b []byte) error {
s, err := strconv.Unquote(string(b))
if err != nil {
return err
}
return e.UnmarshalGQL(s)
}
func (e SortByAggregate) MarshalJSON() ([]byte, error) {
var buf bytes.Buffer
e.MarshalGQL(&buf)
return buf.Bytes(), nil
}
type SortDirectionEnum string type SortDirectionEnum string
const ( const (
@@ -435,17 +379,3 @@ func (e *SortDirectionEnum) UnmarshalGQL(v any) error {
func (e SortDirectionEnum) MarshalGQL(w io.Writer) { func (e SortDirectionEnum) MarshalGQL(w io.Writer) {
fmt.Fprint(w, strconv.Quote(e.String())) fmt.Fprint(w, strconv.Quote(e.String()))
} }
func (e *SortDirectionEnum) UnmarshalJSON(b []byte) error {
s, err := strconv.Unquote(string(b))
if err != nil {
return err
}
return e.UnmarshalGQL(s)
}
func (e SortDirectionEnum) MarshalJSON() ([]byte, error) {
var buf bytes.Buffer
e.MarshalGQL(&buf)
return buf.Bytes(), nil
}

View File

@@ -1,15 +1,13 @@
package graph package graph
// This file will be automatically regenerated based on the schema, any resolver // This file will be automatically regenerated based on the schema, any resolver implementations
// implementations
// will be copied through when generating and any unknown code will be moved to the end. // will be copied through when generating and any unknown code will be moved to the end.
// Code generated by github.com/99designs/gqlgen version v0.17.84 // Code generated by github.com/99designs/gqlgen version v0.17.66
import ( import (
"context" "context"
"errors" "errors"
"fmt" "fmt"
"math"
"regexp" "regexp"
"slices" "slices"
"strconv" "strconv"
@@ -19,7 +17,7 @@ import (
"github.com/ClusterCockpit/cc-backend/internal/config" "github.com/ClusterCockpit/cc-backend/internal/config"
"github.com/ClusterCockpit/cc-backend/internal/graph/generated" "github.com/ClusterCockpit/cc-backend/internal/graph/generated"
"github.com/ClusterCockpit/cc-backend/internal/graph/model" "github.com/ClusterCockpit/cc-backend/internal/graph/model"
"github.com/ClusterCockpit/cc-backend/internal/metricdispatcher" "github.com/ClusterCockpit/cc-backend/internal/metricDataDispatcher"
"github.com/ClusterCockpit/cc-backend/internal/repository" "github.com/ClusterCockpit/cc-backend/internal/repository"
"github.com/ClusterCockpit/cc-backend/pkg/archive" "github.com/ClusterCockpit/cc-backend/pkg/archive"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger" cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
@@ -45,7 +43,7 @@ func (r *jobResolver) Tags(ctx context.Context, obj *schema.Job) ([]*schema.Tag,
// ConcurrentJobs is the resolver for the concurrentJobs field. // ConcurrentJobs is the resolver for the concurrentJobs field.
func (r *jobResolver) ConcurrentJobs(ctx context.Context, obj *schema.Job) (*model.JobLinkResultList, error) { func (r *jobResolver) ConcurrentJobs(ctx context.Context, obj *schema.Job) (*model.JobLinkResultList, error) {
// FIXME: Make the hardcoded duration configurable // FIXME: Make the hardcoded duration configurable
if obj.Shared != "none" && obj.Duration > 600 { if obj.Exclusive != 1 && obj.Duration > 600 {
return r.Repo.FindConcurrentJobs(ctx, obj) return r.Repo.FindConcurrentJobs(ctx, obj)
} }
@@ -88,14 +86,14 @@ func (r *jobResolver) EnergyFootprint(ctx context.Context, obj *schema.Job) ([]*
res := []*model.EnergyFootprintValue{} res := []*model.EnergyFootprintValue{}
for name, value := range rawEnergyFootprint { for name, value := range rawEnergyFootprint {
// Suboptimal: Nearly hardcoded metric name expectations // Suboptimal: Nearly hardcoded metric name expectations
matchCPU := regexp.MustCompile(`cpu|Cpu|CPU`) matchCpu := regexp.MustCompile(`cpu|Cpu|CPU`)
matchAcc := regexp.MustCompile(`acc|Acc|ACC`) matchAcc := regexp.MustCompile(`acc|Acc|ACC`)
matchMem := regexp.MustCompile(`mem|Mem|MEM`) matchMem := regexp.MustCompile(`mem|Mem|MEM`)
matchCore := regexp.MustCompile(`core|Core|CORE`) matchCore := regexp.MustCompile(`core|Core|CORE`)
hwType := "" hwType := ""
switch test := name; { // Notice ';' for var declaration switch test := name; { // Notice ';' for var declaration
case matchCPU.MatchString(test): case matchCpu.MatchString(test):
hwType = "CPU" hwType = "CPU"
case matchAcc.MatchString(test): case matchAcc.MatchString(test):
hwType = "Accelerator" hwType = "Accelerator"
@@ -175,9 +173,9 @@ func (r *mutationResolver) AddTagsToJob(ctx context.Context, job string, tagIds
} }
tags := []*schema.Tag{} tags := []*schema.Tag{}
for _, tagID := range tagIds { for _, tagId := range tagIds {
// Get ID // Get ID
tid, err := strconv.ParseInt(tagID, 10, 64) tid, err := strconv.ParseInt(tagId, 10, 64)
if err != nil { if err != nil {
cclog.Warn("Error while parsing tag id") cclog.Warn("Error while parsing tag id")
return nil, err return nil, err
@@ -222,9 +220,9 @@ func (r *mutationResolver) RemoveTagsFromJob(ctx context.Context, job string, ta
} }
tags := []*schema.Tag{} tags := []*schema.Tag{}
for _, tagID := range tagIds { for _, tagId := range tagIds {
// Get ID // Get ID
tid, err := strconv.ParseInt(tagID, 10, 64) tid, err := strconv.ParseInt(tagId, 10, 64)
if err != nil { if err != nil {
cclog.Warn("Error while parsing tag id") cclog.Warn("Error while parsing tag id")
return nil, err return nil, err
@@ -265,9 +263,9 @@ func (r *mutationResolver) RemoveTagFromList(ctx context.Context, tagIds []strin
} }
tags := []int{} tags := []int{}
for _, tagID := range tagIds { for _, tagId := range tagIds {
// Get ID // Get ID
tid, err := strconv.ParseInt(tagID, 10, 64) tid, err := strconv.ParseInt(tagId, 10, 64)
if err != nil { if err != nil {
cclog.Warn("Error while parsing tag id for removal") cclog.Warn("Error while parsing tag id for removal")
return nil, err return nil, err
@@ -307,23 +305,14 @@ func (r *mutationResolver) UpdateConfiguration(ctx context.Context, name string,
return nil, nil return nil, nil
} }
// ID is the resolver for the id field. // NodeState is the resolver for the nodeState field.
func (r *nodeResolver) ID(ctx context.Context, obj *schema.Node) (string, error) { func (r *nodeResolver) NodeState(ctx context.Context, obj *schema.Node) (string, error) {
panic(fmt.Errorf("not implemented: ID - id")) panic(fmt.Errorf("not implemented: NodeState - nodeState"))
} }
// SchedulerState is the resolver for the schedulerState field. // HealthState is the resolver for the HealthState field.
func (r *nodeResolver) SchedulerState(ctx context.Context, obj *schema.Node) (schema.SchedulerState, error) { func (r *nodeResolver) HealthState(ctx context.Context, obj *schema.Node) (schema.NodeState, error) {
if obj.NodeState != "" { panic(fmt.Errorf("not implemented: HealthState - HealthState"))
return obj.NodeState, nil
} else {
return "", fmt.Errorf("no SchedulerState (NodeState) on Object")
}
}
// HealthState is the resolver for the healthState field.
func (r *nodeResolver) HealthState(ctx context.Context, obj *schema.Node) (string, error) {
panic(fmt.Errorf("not implemented: HealthState - healthState"))
} }
// MetaData is the resolver for the metaData field. // MetaData is the resolver for the metaData field.
@@ -343,14 +332,6 @@ func (r *queryResolver) Tags(ctx context.Context) ([]*schema.Tag, error) {
// GlobalMetrics is the resolver for the globalMetrics field. // GlobalMetrics is the resolver for the globalMetrics field.
func (r *queryResolver) GlobalMetrics(ctx context.Context) ([]*schema.GlobalMetricListItem, error) { func (r *queryResolver) GlobalMetrics(ctx context.Context) ([]*schema.GlobalMetricListItem, error) {
user := repository.GetUserFromContext(ctx)
if user != nil {
if user.HasRole(schema.RoleUser) || user.HasRole(schema.RoleManager) {
return archive.GlobalUserMetricList, nil
}
}
return archive.GlobalMetricList, nil return archive.GlobalMetricList, nil
} }
@@ -381,77 +362,36 @@ func (r *queryResolver) AllocatedNodes(ctx context.Context, cluster string) ([]*
// Node is the resolver for the node field. // Node is the resolver for the node field.
func (r *queryResolver) Node(ctx context.Context, id string) (*schema.Node, error) { func (r *queryResolver) Node(ctx context.Context, id string) (*schema.Node, error) {
repo := repository.GetNodeRepository() repo := repository.GetNodeRepository()
numericID, err := strconv.ParseInt(id, 10, 64) numericId, err := strconv.ParseInt(id, 10, 64)
if err != nil { if err != nil {
cclog.Warn("Error while parsing job id") cclog.Warn("Error while parsing job id")
return nil, err return nil, err
} }
return repo.GetNodeByID(numericID, false) return repo.GetNode(numericId, false)
} }
// Nodes is the resolver for the nodes field. // Nodes is the resolver for the nodes field.
func (r *queryResolver) Nodes(ctx context.Context, filter []*model.NodeFilter, order *model.OrderByInput) (*model.NodeStateResultList, error) { func (r *queryResolver) Nodes(ctx context.Context, filter []*model.NodeFilter, order *model.OrderByInput) (*model.NodeStateResultList, error) {
repo := repository.GetNodeRepository() repo := repository.GetNodeRepository()
nodes, err := repo.QueryNodes(ctx, filter, nil, order) // Ignore Paging, Order Unused nodes, err := repo.QueryNodes(ctx, filter, order)
count := len(nodes) count := len(nodes)
return &model.NodeStateResultList{Items: nodes, Count: &count}, err return &model.NodeStateResultList{Items: nodes, Count: &count}, err
} }
// NodeStates is the resolver for the nodeStates field. // NodeStats is the resolver for the nodeStats field.
func (r *queryResolver) NodeStates(ctx context.Context, filter []*model.NodeFilter) ([]*model.NodeStates, error) { func (r *queryResolver) NodeStats(ctx context.Context, filter []*model.NodeFilter) ([]*model.NodeStats, error) {
repo := repository.GetNodeRepository() panic(fmt.Errorf("not implemented: NodeStats - nodeStats"))
stateCounts, serr := repo.CountStates(ctx, filter, "node_state")
if serr != nil {
cclog.Warnf("Error while counting nodeStates: %s", serr.Error())
return nil, serr
}
healthCounts, herr := repo.CountStates(ctx, filter, "health_state")
if herr != nil {
cclog.Warnf("Error while counting healthStates: %s", herr.Error())
return nil, herr
}
allCounts := append(stateCounts, healthCounts...)
return allCounts, nil
}
// NodeStatesTimed is the resolver for the nodeStatesTimed field.
func (r *queryResolver) NodeStatesTimed(ctx context.Context, filter []*model.NodeFilter, typeArg string) ([]*model.NodeStatesTimed, error) {
repo := repository.GetNodeRepository()
if typeArg == "node" {
stateCounts, serr := repo.CountStatesTimed(ctx, filter, "node_state")
if serr != nil {
cclog.Warnf("Error while counting nodeStates in time: %s", serr.Error())
return nil, serr
}
return stateCounts, nil
}
if typeArg == "health" {
healthCounts, herr := repo.CountStatesTimed(ctx, filter, "health_state")
if herr != nil {
cclog.Warnf("Error while counting healthStates in time: %s", herr.Error())
return nil, herr
}
return healthCounts, nil
}
return nil, errors.New("unknown Node State Query Type")
} }
// Job is the resolver for the job field. // Job is the resolver for the job field.
func (r *queryResolver) Job(ctx context.Context, id string) (*schema.Job, error) { func (r *queryResolver) Job(ctx context.Context, id string) (*schema.Job, error) {
numericID, err := strconv.ParseInt(id, 10, 64) numericId, err := strconv.ParseInt(id, 10, 64)
if err != nil { if err != nil {
cclog.Warn("Error while parsing job id") cclog.Warn("Error while parsing job id")
return nil, err return nil, err
} }
job, err := r.Repo.FindByID(ctx, numericID) job, err := r.Repo.FindById(ctx, numericId)
if err != nil { if err != nil {
cclog.Warn("Error while finding job by id") cclog.Warn("Error while finding job by id")
return nil, err return nil, err
@@ -484,7 +424,7 @@ func (r *queryResolver) JobMetrics(ctx context.Context, id string, metrics []str
return nil, err return nil, err
} }
data, err := metricdispatcher.LoadData(job, metrics, scopes, ctx, *resolution) data, err := metricDataDispatcher.LoadData(job, metrics, scopes, ctx, *resolution)
if err != nil { if err != nil {
cclog.Warn("Error while loading job data") cclog.Warn("Error while loading job data")
return nil, err return nil, err
@@ -512,7 +452,7 @@ func (r *queryResolver) JobStats(ctx context.Context, id string, metrics []strin
return nil, err return nil, err
} }
data, err := metricdispatcher.LoadJobStats(job, metrics, ctx) data, err := metricDataDispatcher.LoadJobStats(job, metrics, ctx)
if err != nil { if err != nil {
cclog.Warnf("Error while loading jobStats data for job id %s", id) cclog.Warnf("Error while loading jobStats data for job id %s", id)
return nil, err return nil, err
@@ -537,7 +477,7 @@ func (r *queryResolver) ScopedJobStats(ctx context.Context, id string, metrics [
return nil, err return nil, err
} }
data, err := metricdispatcher.LoadScopedJobStats(job, metrics, scopes, ctx) data, err := metricDataDispatcher.LoadScopedJobStats(job, metrics, scopes, ctx)
if err != nil { if err != nil {
cclog.Warnf("Error while loading scopedJobStats data for job id %s", id) cclog.Warnf("Error while loading scopedJobStats data for job id %s", id)
return nil, err return nil, err
@@ -618,7 +558,7 @@ func (r *queryResolver) JobsStatistics(ctx context.Context, filter []*model.JobF
defaultDurationBins := "1h" defaultDurationBins := "1h"
defaultMetricBins := 10 defaultMetricBins := 10
if requireField(ctx, "totalJobs") || requireField(ctx, "totalUsers") || requireField(ctx, "totalWalltime") || requireField(ctx, "totalNodes") || requireField(ctx, "totalCores") || if requireField(ctx, "totalJobs") || requireField(ctx, "totalWalltime") || requireField(ctx, "totalNodes") || requireField(ctx, "totalCores") ||
requireField(ctx, "totalAccs") || requireField(ctx, "totalNodeHours") || requireField(ctx, "totalCoreHours") || requireField(ctx, "totalAccHours") { requireField(ctx, "totalAccs") || requireField(ctx, "totalNodeHours") || requireField(ctx, "totalCoreHours") || requireField(ctx, "totalAccHours") {
if groupBy == nil { if groupBy == nil {
stats, err = r.Repo.JobsStats(ctx, filter) stats, err = r.Repo.JobsStats(ctx, filter)
@@ -702,7 +642,7 @@ func (r *queryResolver) JobsMetricStats(ctx context.Context, filter []*model.Job
res := []*model.JobStats{} res := []*model.JobStats{}
for _, job := range jobs { for _, job := range jobs {
data, err := metricdispatcher.LoadJobStats(job, metrics, ctx) data, err := metricDataDispatcher.LoadJobStats(job, metrics, ctx)
if err != nil { if err != nil {
cclog.Warnf("Error while loading comparison jobStats data for job id %d", job.JobID) cclog.Warnf("Error while loading comparison jobStats data for job id %d", job.JobID)
continue continue
@@ -759,20 +699,16 @@ func (r *queryResolver) NodeMetrics(ctx context.Context, cluster string, nodes [
} }
} }
data, err := metricdispatcher.LoadNodeData(cluster, metrics, nodes, scopes, from, to, ctx) data, err := metricDataDispatcher.LoadNodeData(cluster, metrics, nodes, scopes, from, to, ctx)
if err != nil { if err != nil {
cclog.Warn("error while loading node data") cclog.Warn("error while loading node data")
return nil, err return nil, err
} }
nodeRepo := repository.GetNodeRepository()
stateMap, _ := nodeRepo.MapNodes(cluster)
nodeMetrics := make([]*model.NodeMetrics, 0, len(data)) nodeMetrics := make([]*model.NodeMetrics, 0, len(data))
for hostname, metrics := range data { for hostname, metrics := range data {
host := &model.NodeMetrics{ host := &model.NodeMetrics{
Host: hostname, Host: hostname,
State: stateMap[hostname],
Metrics: make([]*model.JobMetricWithName, 0, len(metrics)*len(scopes)), Metrics: make([]*model.JobMetricWithName, 0, len(metrics)*len(scopes)),
} }
host.SubCluster, err = archive.GetSubClusterByNode(cluster, hostname) host.SubCluster, err = archive.GetSubClusterByNode(cluster, hostname)
@@ -797,7 +733,7 @@ func (r *queryResolver) NodeMetrics(ctx context.Context, cluster string, nodes [
} }
// NodeMetricsList is the resolver for the nodeMetricsList field. // NodeMetricsList is the resolver for the nodeMetricsList field.
func (r *queryResolver) NodeMetricsList(ctx context.Context, cluster string, subCluster string, stateFilter string, nodeFilter string, scopes []schema.MetricScope, metrics []string, from time.Time, to time.Time, page *model.PageRequest, resolution *int) (*model.NodesResultList, error) { func (r *queryResolver) NodeMetricsList(ctx context.Context, cluster string, subCluster string, nodeFilter string, scopes []schema.MetricScope, metrics []string, from time.Time, to time.Time, page *model.PageRequest, resolution *int) (*model.NodesResultList, error) {
if resolution == nil { // Load from Config if resolution == nil { // Load from Config
if config.Keys.EnableResampling != nil { if config.Keys.EnableResampling != nil {
defaultRes := slices.Max(config.Keys.EnableResampling.Resolutions) defaultRes := slices.Max(config.Keys.EnableResampling.Resolutions)
@@ -813,21 +749,15 @@ func (r *queryResolver) NodeMetricsList(ctx context.Context, cluster string, sub
return nil, errors.New("you need to be administrator or support staff for this query") return nil, errors.New("you need to be administrator or support staff for this query")
} }
nodeRepo := repository.GetNodeRepository()
nodes, stateMap, countNodes, hasNextPage, nerr := nodeRepo.GetNodesForList(ctx, cluster, subCluster, stateFilter, nodeFilter, page)
if nerr != nil {
return nil, errors.New("could not retrieve node list required for resolving NodeMetricsList")
}
if metrics == nil { if metrics == nil {
for _, mc := range archive.GetCluster(cluster).MetricConfig { for _, mc := range archive.GetCluster(cluster).MetricConfig {
metrics = append(metrics, mc.Name) metrics = append(metrics, mc.Name)
} }
} }
data, err := metricdispatcher.LoadNodeListData(cluster, subCluster, nodes, metrics, scopes, *resolution, from, to, ctx) data, totalNodes, hasNextPage, err := metricDataDispatcher.LoadNodeListData(cluster, subCluster, nodeFilter, metrics, scopes, *resolution, from, to, page, ctx)
if err != nil { if err != nil {
cclog.Warn("error while loading node data (Resolver.NodeMetricsList") cclog.Warn("error while loading node data")
return nil, err return nil, err
} }
@@ -835,7 +765,6 @@ func (r *queryResolver) NodeMetricsList(ctx context.Context, cluster string, sub
for hostname, metrics := range data { for hostname, metrics := range data {
host := &model.NodeMetrics{ host := &model.NodeMetrics{
Host: hostname, Host: hostname,
State: stateMap[hostname],
Metrics: make([]*model.JobMetricWithName, 0, len(metrics)*len(scopes)), Metrics: make([]*model.JobMetricWithName, 0, len(metrics)*len(scopes)),
} }
host.SubCluster, err = archive.GetSubClusterByNode(cluster, hostname) host.SubCluster, err = archive.GetSubClusterByNode(cluster, hostname)
@@ -858,90 +787,13 @@ func (r *queryResolver) NodeMetricsList(ctx context.Context, cluster string, sub
nodeMetricsListResult := &model.NodesResultList{ nodeMetricsListResult := &model.NodesResultList{
Items: nodeMetricsList, Items: nodeMetricsList,
TotalNodes: &countNodes, TotalNodes: &totalNodes,
HasNextPage: &hasNextPage, HasNextPage: &hasNextPage,
} }
return nodeMetricsListResult, nil return nodeMetricsListResult, nil
} }
// ClusterMetrics is the resolver for the clusterMetrics field.
func (r *queryResolver) ClusterMetrics(ctx context.Context, cluster string, metrics []string, from time.Time, to time.Time) (*model.ClusterMetrics, error) {
user := repository.GetUserFromContext(ctx)
if user != nil && !user.HasAnyRole([]schema.Role{schema.RoleAdmin, schema.RoleSupport}) {
return nil, errors.New("you need to be administrator or support staff for this query")
}
if metrics == nil {
for _, mc := range archive.GetCluster(cluster).MetricConfig {
metrics = append(metrics, mc.Name)
}
}
// 'nodes' == nil -> Defaults to all nodes of cluster for existing query workflow
scopes := []schema.MetricScope{"node"}
data, err := metricdispatcher.LoadNodeData(cluster, metrics, nil, scopes, from, to, ctx)
if err != nil {
cclog.Warn("error while loading node data")
return nil, err
}
clusterMetricData := make([]*model.ClusterMetricWithName, 0)
clusterMetrics := model.ClusterMetrics{NodeCount: 0, Metrics: clusterMetricData}
collectorTimestep := make(map[string]int)
collectorUnit := make(map[string]schema.Unit)
collectorData := make(map[string][]schema.Float)
for _, metrics := range data {
clusterMetrics.NodeCount += 1
for metric, scopedMetrics := range metrics {
_, ok := collectorData[metric]
if !ok {
collectorData[metric] = make([]schema.Float, 0)
for _, scopedMetric := range scopedMetrics {
// Collect Info
collectorTimestep[metric] = scopedMetric.Timestep
collectorUnit[metric] = scopedMetric.Unit
// Collect Initial Data
for _, ser := range scopedMetric.Series {
collectorData[metric] = append(collectorData[metric], ser.Data...)
}
}
} else {
// Sum up values by index
for _, scopedMetric := range scopedMetrics {
// For This Purpose (Cluster_Wide-Sum of Node Metrics) OK
for _, ser := range scopedMetric.Series {
for i, val := range ser.Data {
collectorData[metric][i] += val
}
}
}
}
}
}
for metricName, data := range collectorData {
cu := collectorUnit[metricName]
roundedData := make([]schema.Float, 0)
for _, val := range data {
roundedData = append(roundedData, schema.Float((math.Round(float64(val)*100.0) / 100.0)))
}
cm := model.ClusterMetricWithName{
Name: metricName,
Unit: &cu,
Timestep: collectorTimestep[metricName],
Data: roundedData,
}
clusterMetrics.Metrics = append(clusterMetrics.Metrics, &cm)
}
return &clusterMetrics, nil
}
// NumberOfNodes is the resolver for the numberOfNodes field. // NumberOfNodes is the resolver for the numberOfNodes field.
func (r *subClusterResolver) NumberOfNodes(ctx context.Context, obj *schema.SubCluster) (int, error) { func (r *subClusterResolver) NumberOfNodes(ctx context.Context, obj *schema.SubCluster) (int, error) {
nodeList, err := archive.ParseNodeList(obj.Nodes) nodeList, err := archive.ParseNodeList(obj.Nodes)
@@ -972,12 +824,10 @@ func (r *Resolver) Query() generated.QueryResolver { return &queryResolver{r} }
// SubCluster returns generated.SubClusterResolver implementation. // SubCluster returns generated.SubClusterResolver implementation.
func (r *Resolver) SubCluster() generated.SubClusterResolver { return &subClusterResolver{r} } func (r *Resolver) SubCluster() generated.SubClusterResolver { return &subClusterResolver{r} }
type ( type clusterResolver struct{ *Resolver }
clusterResolver struct{ *Resolver } type jobResolver struct{ *Resolver }
jobResolver struct{ *Resolver } type metricValueResolver struct{ *Resolver }
metricValueResolver struct{ *Resolver } type mutationResolver struct{ *Resolver }
mutationResolver struct{ *Resolver } type nodeResolver struct{ *Resolver }
nodeResolver struct{ *Resolver } type queryResolver struct{ *Resolver }
queryResolver struct{ *Resolver } type subClusterResolver struct{ *Resolver }
subClusterResolver struct{ *Resolver }
)

View File

@@ -2,18 +2,16 @@
// All rights reserved. This file is part of cc-backend. // All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style // Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file. // license that can be found in the LICENSE file.
package graph package graph
import ( import (
"context" "context"
"fmt" "fmt"
"math" "math"
"slices"
"github.com/99designs/gqlgen/graphql" "github.com/99designs/gqlgen/graphql"
"github.com/ClusterCockpit/cc-backend/internal/graph/model" "github.com/ClusterCockpit/cc-backend/internal/graph/model"
"github.com/ClusterCockpit/cc-backend/internal/metricdispatcher" "github.com/ClusterCockpit/cc-backend/internal/metricDataDispatcher"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger" cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/schema" "github.com/ClusterCockpit/cc-lib/schema"
) )
@@ -55,7 +53,7 @@ func (r *queryResolver) rooflineHeatmap(
// resolution = max(resolution, mc.Timestep) // resolution = max(resolution, mc.Timestep)
// } // }
jobdata, err := metricdispatcher.LoadData(job, []string{"flops_any", "mem_bw"}, []schema.MetricScope{schema.MetricScopeNode}, ctx, 0) jobdata, err := metricDataDispatcher.LoadData(job, []string{"flops_any", "mem_bw"}, []schema.MetricScope{schema.MetricScopeNode}, ctx, 0)
if err != nil { if err != nil {
cclog.Errorf("Error while loading roofline metrics for job %d", job.ID) cclog.Errorf("Error while loading roofline metrics for job %d", job.ID)
return nil, err return nil, err
@@ -128,7 +126,7 @@ func (r *queryResolver) jobsFootprints(ctx context.Context, filter []*model.JobF
continue continue
} }
if err := metricdispatcher.LoadAverages(job, metrics, avgs, ctx); err != nil { if err := metricDataDispatcher.LoadAverages(job, metrics, avgs, ctx); err != nil {
cclog.Error("Error while loading averages for footprint") cclog.Error("Error while loading averages for footprint")
return nil, err return nil, err
} }
@@ -187,5 +185,11 @@ func (r *queryResolver) jobsFootprints(ctx context.Context, filter []*model.JobF
func requireField(ctx context.Context, name string) bool { func requireField(ctx context.Context, name string) bool {
fields := graphql.CollectAllFields(ctx) fields := graphql.CollectAllFields(ctx)
return slices.Contains(fields, name) for _, f := range fields {
if f == name {
return true
}
}
return false
} }

View File

@@ -1,132 +0,0 @@
# Importer Package
The `importer` package provides functionality for importing job data into the ClusterCockpit database from archived job files.
## Overview
This package supports two primary import workflows:
1. **Bulk Database Initialization** - Reinitialize the entire job database from archived jobs
2. **Individual Job Import** - Import specific jobs from metadata/data file pairs
Both workflows enrich job metadata by calculating performance footprints and energy consumption metrics before persisting to the database.
## Main Entry Points
### InitDB()
Reinitializes the job database from all archived jobs.
```go
if err := importer.InitDB(); err != nil {
log.Fatal(err)
}
```
This function:
- Flushes existing job, tag, and jobtag tables
- Iterates through all jobs in the configured archive
- Enriches each job with calculated metrics
- Inserts jobs into the database in batched transactions (100 jobs per batch)
- Continues on individual job failures, logging errors
**Use Case**: Initial database setup or complete database rebuild from archive.
### HandleImportFlag(flag string)
Imports jobs from specified file pairs.
```go
// Format: "<meta.json>:<data.json>[,<meta2.json>:<data2.json>,...]"
flag := "/path/to/meta.json:/path/to/data.json"
if err := importer.HandleImportFlag(flag); err != nil {
log.Fatal(err)
}
```
This function:
- Parses the comma-separated file pairs
- Validates metadata and job data against schemas (if validation enabled)
- Enriches each job with footprints and energy metrics
- Imports jobs into both the archive and database
- Fails fast on the first error
**Use Case**: Importing specific jobs from external sources or manual job additions.
## Job Enrichment
Both import workflows use `enrichJobMetadata()` to calculate:
### Performance Footprints
Performance footprints are calculated from metric averages based on the subcluster configuration:
```go
job.Footprint["mem_used_avg"] = 45.2 // GB
job.Footprint["cpu_load_avg"] = 0.87 // percentage
```
### Energy Metrics
Energy consumption is calculated from power metrics using the formula:
```
Energy (kWh) = (Power (W) × Duration (s) / 3600) / 1000
```
For each energy metric:
```go
job.EnergyFootprint["acc_power"] = 12.5 // kWh
job.Energy = 150.2 // Total energy in kWh
```
**Note**: Energy calculations for metrics with unit "energy" (Joules) are not yet implemented.
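
A standalone sketch of the same arithmetic (it mirrors the computation in the importer diff further below; `math` is from the standard library):

```go
// metricEnergyKWh converts an all-node average power reading to energy in kWh,
// rounded to two digits as stored in the database.
func metricEnergyKWh(avgPowerW float64, numNodes int, durationS int64) float64 {
	rawEnergy := (avgPowerW * float64(numNodes)) * (float64(durationS) / 3600.0) / 1000.0
	return math.Round(rawEnergy*100.0) / 100.0
}
```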
## Data Validation
### SanityChecks(job *schema.Job)
Validates job metadata before database insertion:
- Cluster exists in configuration
- Subcluster is valid (assigns if needed)
- Job state is valid
- Resources and user fields are populated
- Node counts and hardware thread counts are positive
- Resource count matches declared node count
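
Both import paths run these checks before any insert; a minimal call sketch:

```go
// Reject the decoded job early if any of the checks above fail.
if err := importer.SanityChecks(&job); err != nil {
	return err
}
```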
## Normalization Utilities
The package includes utilities for normalizing metric values to appropriate SI prefixes:
### Normalize(avg float64, prefix string)
Adjusts values and SI prefixes for readability:
```go
factor, newPrefix := importer.Normalize(2048.0, "M")
// Converts 2048 MB → ~2.0 GB
// Returns: factor for conversion, "G"
```
This is useful for automatically scaling metrics (e.g., memory, storage) to human-readable units.
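
Applying the returned factor (a sketch, assuming the factor multiplies the raw value as described above):

```go
factor, prefix := importer.Normalize(2048.0, "M")
fmt.Printf("%.1f %sB\n", 2048.0*factor, prefix) // prints roughly "2.0 GB"
```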
## Dependencies
- `github.com/ClusterCockpit/cc-backend/internal/repository` - Database operations
- `github.com/ClusterCockpit/cc-backend/pkg/archive` - Job archive access
- `github.com/ClusterCockpit/cc-lib/schema` - Job schema definitions
- `github.com/ClusterCockpit/cc-lib/ccLogger` - Logging
- `github.com/ClusterCockpit/cc-lib/ccUnits` - SI unit handling
## Error Handling
- **InitDB**: Continues processing on individual job failures, logs errors, returns summary
- **HandleImportFlag**: Fails fast on first error, returns immediately
- Both functions log detailed error context for debugging
## Performance
- **Transaction Batching**: InitDB processes jobs in batches of 100 for optimal database performance
- **Tag Caching**: Tag IDs are cached during import to minimize database queries
- **Progress Reporting**: InitDB prints progress updates during bulk operations

View File

@@ -8,6 +8,7 @@ import (
"bytes" "bytes"
"encoding/json" "encoding/json"
"fmt" "fmt"
"math"
"os" "os"
"strings" "strings"
@@ -18,22 +19,7 @@ import (
"github.com/ClusterCockpit/cc-lib/schema" "github.com/ClusterCockpit/cc-lib/schema"
) )
// HandleImportFlag imports jobs from file pairs specified in a comma-separated flag string. // Import all jobs specified as `<path-to-meta.json>:<path-to-data.json>,...`
//
// The flag format is: "<path-to-meta.json>:<path-to-data.json>[,<path-to-meta2.json>:<path-to-data2.json>,...]"
//
// For each job pair, this function:
// 1. Reads and validates the metadata JSON file (schema.Job)
// 2. Reads and validates the job data JSON file (schema.JobData)
// 3. Enriches the job with calculated footprints and energy metrics
// 4. Validates the job using SanityChecks()
// 5. Imports the job into the archive
// 6. Inserts the job into the database with associated tags
//
// Schema validation is performed if config.Keys.Validate is true.
//
// Returns an error if file reading, validation, enrichment, or database operations fail.
// The function stops processing on the first error encountered.
func HandleImportFlag(flag string) error { func HandleImportFlag(flag string) error {
r := repository.GetJobRepository() r := repository.GetJobRepository()
@@ -57,7 +43,7 @@ func HandleImportFlag(flag string) error {
dec := json.NewDecoder(bytes.NewReader(raw)) dec := json.NewDecoder(bytes.NewReader(raw))
dec.DisallowUnknownFields() dec.DisallowUnknownFields()
job := schema.Job{ job := schema.Job{
Shared: "none", Exclusive: 1,
MonitoringStatus: schema.MonitoringStatusRunningOrArchiving, MonitoringStatus: schema.MonitoringStatusRunningOrArchiving,
} }
if err = dec.Decode(&job); err != nil { if err = dec.Decode(&job); err != nil {
@@ -86,8 +72,75 @@ func HandleImportFlag(flag string) error {
job.MonitoringStatus = schema.MonitoringStatusArchivingSuccessful job.MonitoringStatus = schema.MonitoringStatusArchivingSuccessful
if err = enrichJobMetadata(&job); err != nil { sc, err := archive.GetSubCluster(job.Cluster, job.SubCluster)
cclog.Errorf("Error enriching job metadata: %v", err) if err != nil {
cclog.Errorf("cannot get subcluster: %s", err.Error())
return err
}
job.Footprint = make(map[string]float64)
for _, fp := range sc.Footprint {
statType := "avg"
if i, err := archive.MetricIndex(sc.MetricConfig, fp); err == nil {
statType = sc.MetricConfig[i].Footprint
}
name := fmt.Sprintf("%s_%s", fp, statType)
job.Footprint[name] = repository.LoadJobStat(&job, fp, statType)
}
job.RawFootprint, err = json.Marshal(job.Footprint)
if err != nil {
cclog.Warn("Error while marshaling job footprint")
return err
}
job.EnergyFootprint = make(map[string]float64)
// Total Job Energy Outside Loop
totalEnergy := 0.0
for _, fp := range sc.EnergyFootprint {
// Always Init Metric Energy Inside Loop
metricEnergy := 0.0
if i, err := archive.MetricIndex(sc.MetricConfig, fp); err == nil {
// Note: For DB data, calculate and save as kWh
if sc.MetricConfig[i].Energy == "energy" { // this metric has energy as unit (Joules)
cclog.Warnf("Update EnergyFootprint for Job %d and Metric %s on cluster %s: Set to 'energy' in cluster.json: Not implemented, will return 0.0", job.JobID, job.Cluster, fp)
// FIXME: Needs sum as stats type
} else if sc.MetricConfig[i].Energy == "power" { // this metric has power as unit (Watt)
// Energy: Power (in Watts) * Time (in Seconds)
// Unit: (W * (s / 3600)) / 1000 = kWh
// Round 2 Digits: round(Energy * 100) / 100
// Here: (All-Node Metric Average * Number of Nodes) * (Job Duration in Seconds / 3600) / 1000
// Note: Shared Jobs handled correctly since "Node Average" is based on partial resources, while "numNodes" factor is 1
rawEnergy := ((repository.LoadJobStat(&job, fp, "avg") * float64(job.NumNodes)) * (float64(job.Duration) / 3600.0)) / 1000.0
metricEnergy = math.Round(rawEnergy*100.0) / 100.0
}
} else {
cclog.Warnf("Error while collecting energy metric %s for job, DB ID '%v', return '0.0'", fp, job.ID)
}
job.EnergyFootprint[fp] = metricEnergy
totalEnergy += metricEnergy
}
job.Energy = (math.Round(totalEnergy*100.0) / 100.0)
if job.RawEnergyFootprint, err = json.Marshal(job.EnergyFootprint); err != nil {
cclog.Warnf("Error while marshaling energy footprint for job INTO BYTES, DB ID '%v'", job.ID)
return err
}
job.RawResources, err = json.Marshal(job.Resources)
if err != nil {
cclog.Warn("Error while marshaling job resources")
return err
}
job.RawMetaData, err = json.Marshal(job.MetaData)
if err != nil {
cclog.Warn("Error while marshaling job metadata")
return err return err
} }
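As a hedged usage sketch of the flag format documented above (the file paths are illustrative), a direct call would look like:

if err := importer.HandleImportFlag("./job1-meta.json:./job1-data.json,./job2-meta.json:./job2-data.json"); err != nil {
	cclog.Errorf("import failed: %v", err)
}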

View File

@@ -16,12 +16,9 @@ import (
"github.com/ClusterCockpit/cc-backend/internal/importer" "github.com/ClusterCockpit/cc-backend/internal/importer"
"github.com/ClusterCockpit/cc-backend/internal/repository" "github.com/ClusterCockpit/cc-backend/internal/repository"
"github.com/ClusterCockpit/cc-backend/pkg/archive" "github.com/ClusterCockpit/cc-backend/pkg/archive"
ccconf "github.com/ClusterCockpit/cc-lib/ccConfig"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger" cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
) )
// copyFile copies a file from source path to destination path.
// Used by tests to set up test fixtures.
func copyFile(s string, d string) error { func copyFile(s string, d string) error {
r, err := os.Open(s) r, err := os.Open(s)
if err != nil { if err != nil {
@@ -37,29 +34,24 @@ func copyFile(s string, d string) error {
return nil return nil
} }
// setup initializes a test environment for importer tests.
//
// Creates a temporary directory with:
// - A test job archive with cluster configuration
// - A SQLite database initialized with schema
// - Configuration files loaded
//
// Returns a JobRepository instance for test assertions.
func setup(t *testing.T) *repository.JobRepository { func setup(t *testing.T) *repository.JobRepository {
const testconfig = `{ const testconfig = `{
"main": {
"addr": "0.0.0.0:8080", "addr": "0.0.0.0:8080",
"validate": false, "validate": false,
"apiAllowedIPs": [
"*"
]},
"archive": { "archive": {
"kind": "file", "kind": "file",
"path": "./var/job-archive" "path": "./var/job-archive"
}, },
"jwts": {
"max-age": "2m"
},
"apiAllowedIPs": [
"*"
],
"clusters": [ "clusters": [
{ {
"name": "testcluster", "name": "testcluster",
"metricDataRepository": {"kind": "test", "url": "bla:8081"},
"filterRanges": { "filterRanges": {
"numNodes": { "from": 1, "to": 64 }, "numNodes": { "from": 1, "to": 64 },
"duration": { "from": 0, "to": 86400 }, "duration": { "from": 0, "to": 86400 },
@@ -68,6 +60,7 @@ func setup(t *testing.T) *repository.JobRepository {
}, },
{ {
"name": "fritz", "name": "fritz",
"metricDataRepository": {"kind": "test", "url": "bla:8081"},
"filterRanges": { "filterRanges": {
"numNodes": { "from": 1, "to": 944 }, "numNodes": { "from": 1, "to": 944 },
"duration": { "from": 0, "to": 86400 }, "duration": { "from": 0, "to": 86400 },
@@ -76,6 +69,7 @@ func setup(t *testing.T) *repository.JobRepository {
}, },
{ {
"name": "taurus", "name": "taurus",
"metricDataRepository": {"kind": "test", "url": "bla:8081"},
"filterRanges": { "filterRanges": {
"numNodes": { "from": 1, "to": 4000 }, "numNodes": { "from": 1, "to": 4000 },
"duration": { "from": 0, "to": 604800 }, "duration": { "from": 0, "to": 604800 },
@@ -88,14 +82,14 @@ func setup(t *testing.T) *repository.JobRepository {
tmpdir := t.TempDir() tmpdir := t.TempDir()
jobarchive := filepath.Join(tmpdir, "job-archive") jobarchive := filepath.Join(tmpdir, "job-archive")
if err := os.Mkdir(jobarchive, 0o777); err != nil { if err := os.Mkdir(jobarchive, 0777); err != nil {
t.Fatal(err) t.Fatal(err)
} }
if err := os.WriteFile(filepath.Join(jobarchive, "version.txt"), fmt.Appendf(nil, "%d", 3), 0o666); err != nil { if err := os.WriteFile(filepath.Join(jobarchive, "version.txt"), []byte(fmt.Sprintf("%d", 2)), 0666); err != nil {
t.Fatal(err) t.Fatal(err)
} }
fritzArchive := filepath.Join(tmpdir, "job-archive", "fritz") fritzArchive := filepath.Join(tmpdir, "job-archive", "fritz")
if err := os.Mkdir(fritzArchive, 0o777); err != nil { if err := os.Mkdir(fritzArchive, 0777); err != nil {
t.Fatal(err) t.Fatal(err)
} }
if err := copyFile(filepath.Join("testdata", "cluster-fritz.json"), if err := copyFile(filepath.Join("testdata", "cluster-fritz.json"),
@@ -104,29 +98,17 @@ func setup(t *testing.T) *repository.JobRepository {
} }
dbfilepath := filepath.Join(tmpdir, "test.db") dbfilepath := filepath.Join(tmpdir, "test.db")
err := repository.MigrateDB(dbfilepath) err := repository.MigrateDB("sqlite3", dbfilepath)
if err != nil { if err != nil {
t.Fatal(err) t.Fatal(err)
} }
cfgFilePath := filepath.Join(tmpdir, "config.json") cfgFilePath := filepath.Join(tmpdir, "config.json")
if err := os.WriteFile(cfgFilePath, []byte(testconfig), 0o666); err != nil { if err := os.WriteFile(cfgFilePath, []byte(testconfig), 0666); err != nil {
t.Fatal(err) t.Fatal(err)
} }
ccconf.Init(cfgFilePath) config.Init(cfgFilePath)
// Load and check main configuration
if cfg := ccconf.GetPackageConfig("main"); cfg != nil {
if clustercfg := ccconf.GetPackageConfig("clusters"); clustercfg != nil {
config.Init(cfg, clustercfg)
} else {
t.Fatal("Cluster configuration must be present")
}
} else {
t.Fatal("Main configuration must be present")
}
archiveCfg := fmt.Sprintf("{\"kind\": \"file\",\"path\": \"%s\"}", jobarchive) archiveCfg := fmt.Sprintf("{\"kind\": \"file\",\"path\": \"%s\"}", jobarchive)
if err := archive.Init(json.RawMessage(archiveCfg), config.Keys.DisableArchive); err != nil { if err := archive.Init(json.RawMessage(archiveCfg), config.Keys.DisableArchive); err != nil {
@@ -137,7 +119,6 @@ func setup(t *testing.T) *repository.JobRepository {
return repository.GetJobRepository() return repository.GetJobRepository()
} }
// Result represents the expected test result for job import verification.
type Result struct { type Result struct {
JobId int64 JobId int64
Cluster string Cluster string
@@ -145,8 +126,6 @@ type Result struct {
Duration int32 Duration int32
} }
// readResult reads the expected test result from a golden file.
// Golden files contain the expected job attributes after import.
func readResult(t *testing.T, testname string) Result { func readResult(t *testing.T, testname string) Result {
var r Result var r Result
@@ -164,13 +143,6 @@ func readResult(t *testing.T, testname string) Result {
return r return r
} }
// TestHandleImportFlag tests the HandleImportFlag function with various job import scenarios.
//
// The test uses golden files in testdata/ to verify that jobs are correctly:
// - Parsed from metadata and data JSON files
// - Enriched with footprints and energy metrics
// - Inserted into the database
// - Retrievable with correct attributes
func TestHandleImportFlag(t *testing.T) { func TestHandleImportFlag(t *testing.T) {
r := setup(t) r := setup(t)

View File

@@ -2,15 +2,6 @@
// All rights reserved. This file is part of cc-backend. // All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style // Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file. // license that can be found in the LICENSE file.
// Package importer provides functionality for importing job data into the ClusterCockpit database.
//
// The package supports two primary use cases:
// 1. Bulk database initialization from archived jobs via InitDB()
// 2. Individual job import from file pairs via HandleImportFlag()
//
// Both operations enrich job metadata by calculating footprints and energy metrics
// before persisting to the database.
package importer package importer
import ( import (
@@ -31,21 +22,8 @@ const (
setTagQuery = "INSERT INTO jobtag (job_id, tag_id) VALUES (?, ?)" setTagQuery = "INSERT INTO jobtag (job_id, tag_id) VALUES (?, ?)"
) )
// InitDB reinitializes the job database from archived job data. // Delete the tables "job", "tag" and "jobtag" from the database and
// // repopulate them using the jobs found in `archive`.
// This function performs the following operations:
// 1. Flushes existing job, tag, and jobtag tables
// 2. Iterates through all jobs in the archive
// 3. Enriches each job with calculated footprints and energy metrics
// 4. Inserts jobs and tags into the database in batched transactions
//
// Jobs are processed in batches of 100 for optimal performance. The function
// continues processing even if individual jobs fail, logging errors and
// returning a summary at the end.
//
// Returns an error if database initialization, transaction management, or
// critical operations fail. Individual job failures are logged but do not
// stop the overall import process.
func InitDB() error { func InitDB() error {
r := repository.GetJobRepository() r := repository.GetJobRepository()
if err := r.Flush(); err != nil { if err := r.Flush(); err != nil {
@@ -62,7 +40,7 @@ func InitDB() error {
} }
tags := make(map[string]int64) tags := make(map[string]int64)
// Not using cclog.Print because we want the line to end with `\r` and // Not using log.Print because we want the line to end with `\r` and
// this function is only ever called when a special command line flag // this function is only ever called when a special command line flag
// is passed anyways. // is passed anyways.
fmt.Printf("%d jobs inserted...\r", 0) fmt.Printf("%d jobs inserted...\r", 0)
@@ -74,32 +52,85 @@ func InitDB() error {
for jobContainer := range ar.Iter(false) { for jobContainer := range ar.Iter(false) {
jobMeta := jobContainer.Meta jobMeta := jobContainer.Meta
if jobMeta == nil {
cclog.Warn("skipping job with nil metadata")
errorOccured++
continue
}
// Bundle 100 inserts into one transaction for better performance // Bundle 100 inserts into one transaction for better performance
if i%100 == 0 { if i%100 == 0 {
if i > 0 { r.TransactionCommit(t)
if err := t.Commit(); err != nil {
cclog.Errorf("transaction commit error: %v", err)
return err
}
// Start a new transaction for the next batch
t, err = r.TransactionInit()
if err != nil {
cclog.Errorf("transaction init error: %v", err)
return err
}
}
fmt.Printf("%d jobs inserted...\r", i) fmt.Printf("%d jobs inserted...\r", i)
} }
jobMeta.MonitoringStatus = schema.MonitoringStatusArchivingSuccessful jobMeta.MonitoringStatus = schema.MonitoringStatusArchivingSuccessful
if err := enrichJobMetadata(jobMeta); err != nil { sc, err := archive.GetSubCluster(jobMeta.Cluster, jobMeta.SubCluster)
if err != nil {
cclog.Errorf("cannot get subcluster: %s", err.Error())
return err
}
jobMeta.Footprint = make(map[string]float64)
for _, fp := range sc.Footprint {
statType := "avg"
if i, err := archive.MetricIndex(sc.MetricConfig, fp); err == nil {
statType = sc.MetricConfig[i].Footprint
}
name := fmt.Sprintf("%s_%s", fp, statType)
jobMeta.Footprint[name] = repository.LoadJobStat(jobMeta, fp, statType)
}
jobMeta.RawFootprint, err = json.Marshal(jobMeta.Footprint)
if err != nil {
cclog.Warn("Error while marshaling job footprint")
return err
}
jobMeta.EnergyFootprint = make(map[string]float64)
// Total Job Energy Outside Loop
totalEnergy := 0.0
for _, fp := range sc.EnergyFootprint {
// Always Init Metric Energy Inside Loop
metricEnergy := 0.0
if i, err := archive.MetricIndex(sc.MetricConfig, fp); err == nil {
// Note: For DB data, calculate and save as kWh
if sc.MetricConfig[i].Energy == "energy" { // this metric has energy as unit (Joules)
cclog.Warnf("Update EnergyFootprint for Job %d and Metric %s on cluster %s: Set to 'energy' in cluster.json: Not implemented, will return 0.0", jobMeta.JobID, jobMeta.Cluster, fp)
// FIXME: Needs sum as stats type
} else if sc.MetricConfig[i].Energy == "power" { // this metric has power as unit (Watt)
// Energy: Power (in Watts) * Time (in Seconds)
// Unit: (W * (s / 3600)) / 1000 = kWh
// Round 2 Digits: round(Energy * 100) / 100
// Here: (All-Node Metric Average * Number of Nodes) * (Job Duration in Seconds / 3600) / 1000
// Note: Shared Jobs handled correctly since "Node Average" is based on partial resources, while "numNodes" factor is 1
rawEnergy := ((repository.LoadJobStat(jobMeta, fp, "avg") * float64(jobMeta.NumNodes)) * (float64(jobMeta.Duration) / 3600.0)) / 1000.0
metricEnergy = math.Round(rawEnergy*100.0) / 100.0
}
} else {
cclog.Warnf("Error while collecting energy metric %s for job, DB ID '%v', return '0.0'", fp, jobMeta.ID)
}
jobMeta.EnergyFootprint[fp] = metricEnergy
totalEnergy += metricEnergy
}
jobMeta.Energy = (math.Round(totalEnergy*100.0) / 100.0)
if jobMeta.RawEnergyFootprint, err = json.Marshal(jobMeta.EnergyFootprint); err != nil {
cclog.Warnf("Error while marshaling energy footprint for job INTO BYTES, DB ID '%v'", jobMeta.ID)
return err
}
jobMeta.RawResources, err = json.Marshal(jobMeta.Resources)
if err != nil {
cclog.Errorf("repository initDB(): %v", err)
errorOccured++
continue
}
jobMeta.RawMetaData, err = json.Marshal(jobMeta.MetaData)
if err != nil {
cclog.Errorf("repository initDB(): %v", err) cclog.Errorf("repository initDB(): %v", err)
errorOccured++ errorOccured++
continue continue
@@ -111,23 +142,19 @@ func InitDB() error {
continue continue
} }
id, jobErr := r.TransactionAddNamed(t, id, err := r.TransactionAddNamed(t,
repository.NamedJobInsert, jobMeta) repository.NamedJobInsert, jobMeta)
if jobErr != nil { if err != nil {
cclog.Errorf("repository initDB(): %v", jobErr) cclog.Errorf("repository initDB(): %v", err)
errorOccured++ errorOccured++
continue continue
} }
// Job successfully inserted, increment counter
i += 1
for _, tag := range jobMeta.Tags { for _, tag := range jobMeta.Tags {
tagstr := tag.Name + ":" + tag.Type tagstr := tag.Name + ":" + tag.Type
tagID, ok := tags[tagstr] tagId, ok := tags[tagstr]
if !ok { if !ok {
var err error tagId, err = r.TransactionAdd(t,
tagID, err = r.TransactionAdd(t,
addTagQuery, addTagQuery,
tag.Name, tag.Type) tag.Name, tag.Type)
if err != nil { if err != nil {
@@ -135,12 +162,16 @@ func InitDB() error {
errorOccured++ errorOccured++
continue continue
} }
tags[tagstr] = tagID tags[tagstr] = tagId
} }
r.TransactionAdd(t, r.TransactionAdd(t,
setTagQuery, setTagQuery,
id, tagID) id, tagId)
}
if err == nil {
i += 1
} }
} }
@@ -149,114 +180,11 @@ func InitDB() error {
} }
r.TransactionEnd(t) r.TransactionEnd(t)
cclog.Infof("A total of %d jobs have been registered in %.3f seconds.", i, time.Since(starttime).Seconds()) cclog.Printf("A total of %d jobs have been registered in %.3f seconds.\n", i, time.Since(starttime).Seconds())
return nil return nil
} }
// enrichJobMetadata calculates and populates job footprints, energy metrics, and serialized fields. // This function also sets the subcluster if necessary!
//
// This function performs the following enrichment operations:
// 1. Calculates job footprint metrics based on the subcluster configuration
// 2. Computes energy footprint and total energy consumption in kWh
// 3. Marshals footprints, resources, and metadata into JSON for database storage
//
// The function expects the job's MonitoringStatus and SubCluster to be already set.
// Energy calculations convert power metrics (Watts) to energy (kWh) using the formula:
//
// Energy (kWh) = (Power (W) * Duration (s) / 3600) / 1000
//
// Returns an error if subcluster retrieval, metric indexing, or JSON marshaling fails.
func enrichJobMetadata(job *schema.Job) error {
sc, err := archive.GetSubCluster(job.Cluster, job.SubCluster)
if err != nil {
cclog.Errorf("cannot get subcluster: %s", err.Error())
return err
}
job.Footprint = make(map[string]float64)
for _, fp := range sc.Footprint {
statType := "avg"
if i, err := archive.MetricIndex(sc.MetricConfig, fp); err == nil {
statType = sc.MetricConfig[i].Footprint
}
name := fmt.Sprintf("%s_%s", fp, statType)
job.Footprint[name] = repository.LoadJobStat(job, fp, statType)
}
job.RawFootprint, err = json.Marshal(job.Footprint)
if err != nil {
cclog.Warn("Error while marshaling job footprint")
return err
}
job.EnergyFootprint = make(map[string]float64)
// Total Job Energy Outside Loop
totalEnergy := 0.0
for _, fp := range sc.EnergyFootprint {
// Always Init Metric Energy Inside Loop
metricEnergy := 0.0
if i, err := archive.MetricIndex(sc.MetricConfig, fp); err == nil {
// Note: For DB data, calculate and save as kWh
switch sc.MetricConfig[i].Energy {
case "energy": // this metric has energy as unit (Joules)
cclog.Warnf("Update EnergyFootprint for Job %d and Metric %s on cluster %s: Set to 'energy' in cluster.json: Not implemented, will return 0.0", job.JobID, job.Cluster, fp)
// FIXME: Needs sum as stats type
case "power": // this metric has power as unit (Watt)
// Energy: Power (in Watts) * Time (in Seconds)
// Unit: (W * (s / 3600)) / 1000 = kWh
// Round 2 Digits: round(Energy * 100) / 100
// Here: (All-Node Metric Average * Number of Nodes) * (Job Duration in Seconds / 3600) / 1000
// Note: Shared Jobs handled correctly since "Node Average" is based on partial resources, while "numNodes" factor is 1
rawEnergy := ((repository.LoadJobStat(job, fp, "avg") * float64(job.NumNodes)) * (float64(job.Duration) / 3600.0)) / 1000.0
metricEnergy = math.Round(rawEnergy*100.0) / 100.0
}
} else {
cclog.Warnf("Error while collecting energy metric %s for job, DB ID '%v', return '0.0'", fp, job.ID)
}
job.EnergyFootprint[fp] = metricEnergy
totalEnergy += metricEnergy
}
job.Energy = (math.Round(totalEnergy*100.0) / 100.0)
if job.RawEnergyFootprint, err = json.Marshal(job.EnergyFootprint); err != nil {
cclog.Warnf("Error while marshaling energy footprint for job INTO BYTES, DB ID '%v'", job.ID)
return err
}
job.RawResources, err = json.Marshal(job.Resources)
if err != nil {
cclog.Warn("Error while marshaling job resources")
return err
}
job.RawMetaData, err = json.Marshal(job.MetaData)
if err != nil {
cclog.Warn("Error while marshaling job metadata")
return err
}
return nil
}
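// A worked instance of the kWh formula above, with made-up numbers
// (4 nodes, 7200 s duration, 300 W all-node average power):
//
//	raw := ((300.0 * 4.0) * (7200.0 / 3600.0)) / 1000.0 // (W * nodes) * h / 1000 = 2.4
//	kWh := math.Round(raw*100.0) / 100.0                // rounds to 2.4 kWh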
// SanityChecks validates job metadata and ensures cluster/subcluster configuration is valid.
//
// This function performs the following validations:
// 1. Verifies the cluster exists in the archive configuration
// 2. Assigns and validates the subcluster (may modify job.SubCluster)
// 3. Validates job state is a recognized value
// 4. Ensures resources and user fields are populated
// 5. Validates node counts and hardware thread counts are positive
// 6. Verifies the number of resources matches the declared node count
//
// The function may modify the job's SubCluster field if it needs to be assigned.
//
// Returns an error if any validation check fails.
func SanityChecks(job *schema.Job) error { func SanityChecks(job *schema.Job) error {
if c := archive.GetCluster(job.Cluster); c == nil { if c := archive.GetCluster(job.Cluster); c == nil {
return fmt.Errorf("no such cluster: %v", job.Cluster) return fmt.Errorf("no such cluster: %v", job.Cluster)
@@ -281,14 +209,6 @@ func SanityChecks(job *schema.Job) error {
return nil return nil
} }
// checkJobData normalizes metric units in job data based on average values.
//
// NOTE: This function is currently unused and contains incomplete implementation.
// It was intended to normalize byte and file-related metrics to appropriate SI prefixes,
// but the normalization logic is commented out. Consider removing or completing this
// function based on project requirements.
//
// TODO: Either implement the metric normalization or remove this dead code.
func checkJobData(d *schema.JobData) error { func checkJobData(d *schema.JobData) error {
for _, scopes := range *d { for _, scopes := range *d {
// var newUnit schema.Unit // var newUnit schema.Unit

View File

@@ -10,24 +10,10 @@ import (
ccunits "github.com/ClusterCockpit/cc-lib/ccUnits" ccunits "github.com/ClusterCockpit/cc-lib/ccUnits"
) )
// getNormalizationFactor calculates the scaling factor needed to normalize a value
// to a more readable range (typically between 1.0 and 1000.0).
//
// For values greater than 1000, the function scales down by factors of 1000 (returns negative exponent).
// For values less than 1.0, the function scales up by factors of 1000 (returns positive exponent).
//
// Returns:
// - factor: The multiplicative factor to apply (10^(count*scale))
// - exponent: The power of 10 representing the adjustment (multiple of 3 for SI prefixes)
func getNormalizationFactor(v float64) (float64, int) { func getNormalizationFactor(v float64) (float64, int) {
count := 0 count := 0
scale := -3 scale := -3
// Prevent infinite loop for zero or negative values
if v <= 0.0 {
return 1.0, 0
}
if v > 1000.0 { if v > 1000.0 {
for v > 1000.0 { for v > 1000.0 {
v *= 1e-3 v *= 1e-3
@@ -43,22 +29,9 @@ func getNormalizationFactor(v float64) (float64, int) {
return math.Pow10(count * scale), count * scale return math.Pow10(count * scale), count * scale
} }
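// For example, the first value from the package tests below needs three
// divisions by 1000 before it lands in [1.0, 1000.0):
//
//	f, e := getNormalizationFactor(2890031237.0)
//	// count == 3, scale == -3, so f == 1e-9 and e == -9;
//	// 2890031237 * f ≈ 2.89, which suits the "G" prefix.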
// getExponent calculates the SI prefix exponent from a numeric prefix value.
//
// For example:
// - Input: 1000.0 (kilo) returns 3
// - Input: 1000000.0 (mega) returns 6
// - Input: 1000000000.0 (giga) returns 9
//
// Returns the exponent representing the power of 10 for the SI prefix.
func getExponent(p float64) int { func getExponent(p float64) int {
count := 0 count := 0
// Prevent infinite loop for infinity or NaN values
if math.IsInf(p, 0) || math.IsNaN(p) || p <= 0.0 {
return 0
}
for p > 1.0 { for p > 1.0 {
p = p / 1000.0 p = p / 1000.0
count++ count++
@@ -67,42 +40,12 @@ func getExponent(p float64) int {
return count * 3 return count * 3
} }
// newPrefixFromFactor computes a new SI unit prefix after applying a normalization factor.
//
// Given an original prefix and an exponent adjustment, this function calculates
// the resulting SI prefix. For example, if normalizing from bytes (no prefix) by
// a factor of 10^9, the result would be the "G" (giga) prefix.
//
// Parameters:
// - op: The original SI prefix value
// - e: The exponent adjustment to apply
//
// Returns the new SI prefix after adjustment.
func newPrefixFromFactor(op ccunits.Prefix, e int) ccunits.Prefix { func newPrefixFromFactor(op ccunits.Prefix, e int) ccunits.Prefix {
f := float64(op) f := float64(op)
exp := math.Pow10(getExponent(f) - e) exp := math.Pow10(getExponent(f) - e)
return ccunits.Prefix(exp) return ccunits.Prefix(exp)
} }
// Normalize adjusts a metric value and its SI unit prefix to a more readable range.
//
// This function is useful for automatically scaling metrics to appropriate units.
// For example, normalizing 2048 MiB might result in ~2.0 GiB.
//
// The function analyzes the average value and determines if a different SI prefix
// would make the number more human-readable (typically keeping values between 1 and 1000).
//
// Parameters:
// - avg: The metric value to normalize
// - p: The current SI prefix as a string (e.g., "K", "M", "G")
//
// Returns:
// - factor: The multiplicative factor to apply to convert the value
// - newPrefix: The new SI prefix string to use
//
// Example:
//
// factor, newPrefix := Normalize(2048.0, "M") // returns factor for MB->GB conversion, "G"
func Normalize(avg float64, p string) (float64, string) { func Normalize(avg float64, p string) (float64, string) {
f, e := getNormalizationFactor(avg) f, e := getNormalizationFactor(avg)

View File

@@ -11,8 +11,6 @@ import (
ccunits "github.com/ClusterCockpit/cc-lib/ccUnits" ccunits "github.com/ClusterCockpit/cc-lib/ccUnits"
) )
// TestNormalizeFactor tests the normalization of large byte values to gigabyte prefix.
// Verifies that values in the billions are correctly scaled to the "G" (giga) prefix.
func TestNormalizeFactor(t *testing.T) { func TestNormalizeFactor(t *testing.T) {
// var us string // var us string
s := []float64{2890031237, 23998994567, 389734042344, 390349424345} s := []float64{2890031237, 23998994567, 389734042344, 390349424345}
@@ -40,8 +38,6 @@ func TestNormalizeFactor(t *testing.T) {
} }
} }
// TestNormalizeKeep tests that values already in an appropriate range maintain their prefix.
// Verifies that when values don't require rescaling, the original "G" prefix is preserved.
func TestNormalizeKeep(t *testing.T) { func TestNormalizeKeep(t *testing.T) {
s := []float64{3.0, 24.0, 390.0, 391.0} s := []float64{3.0, 24.0, 390.0, 391.0}

View File

@@ -1 +1 @@
{"jobId":398955,"user":"k106eb10","project":"k106eb","cluster":"fritz","subCluster":"main","partition":"singlenode","arrayJobId":0,"numNodes":1,"numHwthreads":72,"numAcc":0,"shared":"none","monitoringStatus":1,"smt":0,"jobState":"completed","duration":260,"walltime":86340,"resources":[{"hostname":"f0720"}],"metaData":{"jobName":"ams_pipeline","jobScript":"#!/bin/bash -l\n#SBATCH --job-name=ams_pipeline\n#SBATCH --time=23:59:00\n#SBATCH --partition=singlenode\n#SBATCH --ntasks=72\n#SBATCH --hint=multithread\n#SBATCH --chdir=/home/atuin/k106eb/k106eb10/ACE/Ni-Al/DFT/VASP_PBE_500_0.125_0.1_NM/AlNi/binaries/bulk/base-hcp/occ-shaken/hcp16.occ.4.shake.0/cfg/NiAl3NiAl11\n#SBATCH --export=NONE\nunset SLURM_EXPORT_ENV\nuss=$(whoami)\nfind /dev/shm/ -user $uss -type f -mmin +30 -delete\ncd \"/home/atuin/k106eb/k106eb10/ACE/Ni-Al/DFT/VASP_PBE_500_0.125_0.1_NM/AlNi/binaries/bulk/base-hcp/occ-shaken/hcp16.occ.4.shake.0/cfg/NiAl3NiAl11\"\nams_pipeline pipeline.json \u003e \"/home/atuin/k106eb/k106eb10/ACE/Ni-Al/DFT/VASP_PBE_500_0.125_0.1_NM/AlNi/binaries/bulk/base-hcp/occ-shaken/hcp16.occ.4.shake.0/cfg/NiAl3NiAl11/ams_pipeline_job.sh.out\" 2\u003e \"/home/atuin/k106eb/k106eb10/ACE/Ni-Al/DFT/VASP_PBE_500_0.125_0.1_NM/AlNi/binaries/bulk/base-hcp/occ-shaken/hcp16.occ.4.shake.0/cfg/NiAl3NiAl11/ams_pipeline_job.sh.err\"\n","slurmInfo":"\nJobId=398955 JobName=ams_pipeline\n UserId=k106eb10(210387) GroupId=80111\n Account=k106eb QOS=normal \n Requeue=False Restarts=0 BatchFlag=True \n TimeLimit=1439\n SubmitTime=2023-02-09T14:11:22\n Partition=singlenode \n NodeList=f0720\n NumNodes=1 NumCPUs=72 NumTasks=72 CPUs/Task=1\n NTasksPerNode:Socket:Core=0:None:None\n TRES_req=cpu=72,mem=250000M,node=1,billing=72\n TRES_alloc=cpu=72,node=1,billing=72\n Command=/home/atuin/k106eb/k106eb10/ACE/Ni-Al/DFT/VASP_PBE_500_0.125_0.1_NM/AlNi/binaries/bulk/base-hcp/occ-shaken/hcp16.occ.4.shake.0/cfg/NiAl3NiAl11/ams_pipeline_job.sh\n WorkDir=/home/atuin/k106eb/k106eb10/ACE/Ni-Al/DFT/VASP_PBE_500_0.125_0.1_NM/AlNi/binaries/bulk/base-hcp/occ-shaken/hcp16.occ.4.shake.0/cfg/NiAl3NiAl11\n StdErr=\n 
StdOut=ams_pipeline.o%j\n"},"startTime":1675956725,"statistics":{"clock":{"unit":{"base":"Hz","prefix":"M"},"avg":2335.254,"min":800.418,"max":2734.922},"cpu_load":{"unit":{"base":""},"avg":52.72,"min":34.46,"max":71.91},"cpu_power":{"unit":{"base":"W"},"avg":407.767,"min":93.932,"max":497.636},"cpu_user":{"unit":{"base":""},"avg":63.678,"min":19.872,"max":96.633},"flops_any":{"unit":{"base":"F/s","prefix":"G"},"avg":635.672,"min":0,"max":1332.874},"flops_dp":{"unit":{"base":"F/s","prefix":"G"},"avg":261.006,"min":0,"max":382.294},"flops_sp":{"unit":{"base":"F/s","prefix":"G"},"avg":113.659,"min":0,"max":568.286},"ib_recv":{"unit":{"base":"B/s"},"avg":27981.111,"min":69.4,"max":48084.589},"ib_recv_pkts":{"unit":{"base":"packets/s"},"avg":398.939,"min":0.5,"max":693.817},"ib_xmit":{"unit":{"base":"B/s"},"avg":188.513,"min":39.597,"max":724.568},"ib_xmit_pkts":{"unit":{"base":"packets/s"},"avg":0.867,"min":0.2,"max":2.933},"ipc":{"unit":{"base":"IPC"},"avg":0.944,"min":0.564,"max":1.291},"mem_bw":{"unit":{"base":"B/s","prefix":"G"},"avg":79.565,"min":0.021,"max":116.02},"mem_power":{"unit":{"base":"W"},"avg":24.692,"min":7.883,"max":31.318},"mem_used":{"unit":{"base":"B","prefix":"G"},"avg":22.566,"min":8.225,"max":27.613},"nfs4_read":{"unit":{"base":"B/s","prefix":"M"},"avg":647,"min":0,"max":1946},"nfs4_total":{"unit":{"base":"B/s","prefix":"M"},"avg":6181.6,"min":1270,"max":11411},"nfs4_write":{"unit":{"base":"B/s","prefix":"M"},"avg":22.4,"min":11,"max":29},"vectorization_ratio":{"unit":{"base":"%"},"avg":77.351,"min":0,"max":98.837}}} {"jobId":398955,"user":"k106eb10","project":"k106eb","cluster":"fritz","subCluster":"main","partition":"singlenode","arrayJobId":0,"numNodes":1,"numHwthreads":72,"numAcc":0,"exclusive":1,"monitoringStatus":1,"smt":0,"jobState":"completed","duration":260,"walltime":86340,"resources":[{"hostname":"f0720"}],"metaData":{"jobName":"ams_pipeline","jobScript":"#!/bin/bash -l\n#SBATCH --job-name=ams_pipeline\n#SBATCH --time=23:59:00\n#SBATCH --partition=singlenode\n#SBATCH --ntasks=72\n#SBATCH --hint=multithread\n#SBATCH --chdir=/home/atuin/k106eb/k106eb10/ACE/Ni-Al/DFT/VASP_PBE_500_0.125_0.1_NM/AlNi/binaries/bulk/base-hcp/occ-shaken/hcp16.occ.4.shake.0/cfg/NiAl3NiAl11\n#SBATCH --export=NONE\nunset SLURM_EXPORT_ENV\nuss=$(whoami)\nfind /dev/shm/ -user $uss -type f -mmin +30 -delete\ncd \"/home/atuin/k106eb/k106eb10/ACE/Ni-Al/DFT/VASP_PBE_500_0.125_0.1_NM/AlNi/binaries/bulk/base-hcp/occ-shaken/hcp16.occ.4.shake.0/cfg/NiAl3NiAl11\"\nams_pipeline pipeline.json \u003e \"/home/atuin/k106eb/k106eb10/ACE/Ni-Al/DFT/VASP_PBE_500_0.125_0.1_NM/AlNi/binaries/bulk/base-hcp/occ-shaken/hcp16.occ.4.shake.0/cfg/NiAl3NiAl11/ams_pipeline_job.sh.out\" 2\u003e \"/home/atuin/k106eb/k106eb10/ACE/Ni-Al/DFT/VASP_PBE_500_0.125_0.1_NM/AlNi/binaries/bulk/base-hcp/occ-shaken/hcp16.occ.4.shake.0/cfg/NiAl3NiAl11/ams_pipeline_job.sh.err\"\n","slurmInfo":"\nJobId=398955 JobName=ams_pipeline\n UserId=k106eb10(210387) GroupId=80111\n Account=k106eb QOS=normal \n Requeue=False Restarts=0 BatchFlag=True \n TimeLimit=1439\n SubmitTime=2023-02-09T14:11:22\n Partition=singlenode \n NodeList=f0720\n NumNodes=1 NumCPUs=72 NumTasks=72 CPUs/Task=1\n NTasksPerNode:Socket:Core=0:None:None\n TRES_req=cpu=72,mem=250000M,node=1,billing=72\n TRES_alloc=cpu=72,node=1,billing=72\n Command=/home/atuin/k106eb/k106eb10/ACE/Ni-Al/DFT/VASP_PBE_500_0.125_0.1_NM/AlNi/binaries/bulk/base-hcp/occ-shaken/hcp16.occ.4.shake.0/cfg/NiAl3NiAl11/ams_pipeline_job.sh\n 
WorkDir=/home/atuin/k106eb/k106eb10/ACE/Ni-Al/DFT/VASP_PBE_500_0.125_0.1_NM/AlNi/binaries/bulk/base-hcp/occ-shaken/hcp16.occ.4.shake.0/cfg/NiAl3NiAl11\n StdErr=\n StdOut=ams_pipeline.o%j\n"},"startTime":1675956725,"statistics":{"clock":{"unit":{"base":"Hz","prefix":"M"},"avg":2335.254,"min":800.418,"max":2734.922},"cpu_load":{"unit":{"base":""},"avg":52.72,"min":34.46,"max":71.91},"cpu_power":{"unit":{"base":"W"},"avg":407.767,"min":93.932,"max":497.636},"cpu_user":{"unit":{"base":""},"avg":63.678,"min":19.872,"max":96.633},"flops_any":{"unit":{"base":"F/s","prefix":"G"},"avg":635.672,"min":0,"max":1332.874},"flops_dp":{"unit":{"base":"F/s","prefix":"G"},"avg":261.006,"min":0,"max":382.294},"flops_sp":{"unit":{"base":"F/s","prefix":"G"},"avg":113.659,"min":0,"max":568.286},"ib_recv":{"unit":{"base":"B/s"},"avg":27981.111,"min":69.4,"max":48084.589},"ib_recv_pkts":{"unit":{"base":"packets/s"},"avg":398.939,"min":0.5,"max":693.817},"ib_xmit":{"unit":{"base":"B/s"},"avg":188.513,"min":39.597,"max":724.568},"ib_xmit_pkts":{"unit":{"base":"packets/s"},"avg":0.867,"min":0.2,"max":2.933},"ipc":{"unit":{"base":"IPC"},"avg":0.944,"min":0.564,"max":1.291},"mem_bw":{"unit":{"base":"B/s","prefix":"G"},"avg":79.565,"min":0.021,"max":116.02},"mem_power":{"unit":{"base":"W"},"avg":24.692,"min":7.883,"max":31.318},"mem_used":{"unit":{"base":"B","prefix":"G"},"avg":22.566,"min":8.225,"max":27.613},"nfs4_read":{"unit":{"base":"B/s","prefix":"M"},"avg":647,"min":0,"max":1946},"nfs4_total":{"unit":{"base":"B/s","prefix":"M"},"avg":6181.6,"min":1270,"max":11411},"nfs4_write":{"unit":{"base":"B/s","prefix":"M"},"avg":22.4,"min":11,"max":29},"vectorization_ratio":{"unit":{"base":"%"},"avg":77.351,"min":0,"max":98.837}}}

View File

@@ -1 +1 @@
{"jobId":398764,"user":"k106eb10","project":"k106eb","cluster":"fritz","subCluster":"main","numNodes":1,"shared":"none","jobState":"completed","duration":177,"resources":[{"hostname":"f0649"}],"startTime":1675954353,"statistics":{"clock":{"unit":{"base":"Hz","prefix":"M"},"avg":1336.519,"min":801.564,"max":2348.215},"cpu_load":{"unit":{"base":""},"avg":31.64,"min":17.36,"max":45.54},"cpu_power":{"unit":{"base":"W"},"avg":150.018,"min":93.672,"max":261.592},"cpu_user":{"unit":{"base":""},"avg":28.518,"min":0.09,"max":57.343},"flops_any":{"unit":{"base":"F/s","prefix":"G"},"avg":45.012,"min":0,"max":135.037},"flops_dp":{"unit":{"base":"F/s","prefix":"G"},"avg":22.496,"min":0,"max":67.488},"flops_sp":{"unit":{"base":"F/s","prefix":"G"},"avg":0.02,"min":0,"max":0.061},"ib_recv":{"unit":{"base":"B/s"},"avg":14442.82,"min":219.998,"max":42581.368},"ib_recv_pkts":{"unit":{"base":"packets/s"},"avg":201.532,"min":1.25,"max":601.345},"ib_xmit":{"unit":{"base":"B/s"},"avg":282.098,"min":56.2,"max":569.363},"ib_xmit_pkts":{"unit":{"base":"packets/s"},"avg":1.228,"min":0.433,"max":2},"ipc":{"unit":{"base":"IPC"},"avg":0.77,"min":0.564,"max":0.906},"mem_bw":{"unit":{"base":"B/s","prefix":"G"},"avg":4.872,"min":0.025,"max":14.552},"mem_power":{"unit":{"base":"W"},"avg":7.725,"min":6.286,"max":10.556},"mem_used":{"unit":{"base":"B","prefix":"G"},"avg":6.162,"min":6.103,"max":6.226},"nfs4_read":{"unit":{"base":"B/s","prefix":"M"},"avg":1045.333,"min":311,"max":1525},"nfs4_total":{"unit":{"base":"B/s","prefix":"M"},"avg":6430,"min":2796,"max":11518},"nfs4_write":{"unit":{"base":"B/s","prefix":"M"},"avg":24.333,"min":0,"max":38},"vectorization_ratio":{"unit":{"base":"%"},"avg":25.528,"min":0,"max":76.585}}} {"jobId":398764,"user":"k106eb10","project":"k106eb","cluster":"fritz","subCluster":"main","numNodes":1,"exclusive":1,"jobState":"completed","duration":177,"resources":[{"hostname":"f0649"}],"startTime":1675954353,"statistics":{"clock":{"unit":{"base":"Hz","prefix":"M"},"avg":1336.519,"min":801.564,"max":2348.215},"cpu_load":{"unit":{"base":""},"avg":31.64,"min":17.36,"max":45.54},"cpu_power":{"unit":{"base":"W"},"avg":150.018,"min":93.672,"max":261.592},"cpu_user":{"unit":{"base":""},"avg":28.518,"min":0.09,"max":57.343},"flops_any":{"unit":{"base":"F/s","prefix":"G"},"avg":45.012,"min":0,"max":135.037},"flops_dp":{"unit":{"base":"F/s","prefix":"G"},"avg":22.496,"min":0,"max":67.488},"flops_sp":{"unit":{"base":"F/s","prefix":"G"},"avg":0.02,"min":0,"max":0.061},"ib_recv":{"unit":{"base":"B/s"},"avg":14442.82,"min":219.998,"max":42581.368},"ib_recv_pkts":{"unit":{"base":"packets/s"},"avg":201.532,"min":1.25,"max":601.345},"ib_xmit":{"unit":{"base":"B/s"},"avg":282.098,"min":56.2,"max":569.363},"ib_xmit_pkts":{"unit":{"base":"packets/s"},"avg":1.228,"min":0.433,"max":2},"ipc":{"unit":{"base":"IPC"},"avg":0.77,"min":0.564,"max":0.906},"mem_bw":{"unit":{"base":"B/s","prefix":"G"},"avg":4.872,"min":0.025,"max":14.552},"mem_power":{"unit":{"base":"W"},"avg":7.725,"min":6.286,"max":10.556},"mem_used":{"unit":{"base":"B","prefix":"G"},"avg":6.162,"min":6.103,"max":6.226},"nfs4_read":{"unit":{"base":"B/s","prefix":"M"},"avg":1045.333,"min":311,"max":1525},"nfs4_total":{"unit":{"base":"B/s","prefix":"M"},"avg":6430,"min":2796,"max":11518},"nfs4_write":{"unit":{"base":"B/s","prefix":"M"},"avg":24.333,"min":0,"max":38},"vectorization_ratio":{"unit":{"base":"%"},"avg":25.528,"min":0,"max":76.585}}}

View File

@@ -1,232 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package memorystore
import (
"errors"
"fmt"
"math"
"github.com/ClusterCockpit/cc-lib/schema"
"github.com/ClusterCockpit/cc-lib/util"
)
var (
ErrInvalidTimeRange = errors.New("[METRICSTORE]> invalid time range: 'from' must be before 'to'")
ErrEmptyCluster = errors.New("[METRICSTORE]> cluster name cannot be empty")
)
type APIMetricData struct {
Error *string `json:"error,omitempty"`
Data schema.FloatArray `json:"data,omitempty"`
From int64 `json:"from"`
To int64 `json:"to"`
Resolution int64 `json:"resolution"`
Avg schema.Float `json:"avg"`
Min schema.Float `json:"min"`
Max schema.Float `json:"max"`
}
type APIQueryRequest struct {
Cluster string `json:"cluster"`
Queries []APIQuery `json:"queries"`
ForAllNodes []string `json:"for-all-nodes"`
From int64 `json:"from"`
To int64 `json:"to"`
WithStats bool `json:"with-stats"`
WithData bool `json:"with-data"`
WithPadding bool `json:"with-padding"`
}
type APIQueryResponse struct {
Queries []APIQuery `json:"queries,omitempty"`
Results [][]APIMetricData `json:"results"`
}
type APIQuery struct {
Type *string `json:"type,omitempty"`
SubType *string `json:"subtype,omitempty"`
Metric string `json:"metric"`
Hostname string `json:"host"`
Resolution int64 `json:"resolution"`
TypeIds []string `json:"type-ids,omitempty"`
SubTypeIds []string `json:"subtype-ids,omitempty"`
ScaleFactor schema.Float `json:"scale-by,omitempty"`
Aggregate bool `json:"aggreg"`
}
// TODO: Optimize this, just like the stats endpoint!
func (data *APIMetricData) AddStats() {
n := 0
sum, min, max := 0.0, math.MaxFloat64, -math.MaxFloat64
for _, x := range data.Data {
if x.IsNaN() {
continue
}
n += 1
sum += float64(x)
min = math.Min(min, float64(x))
max = math.Max(max, float64(x))
}
if n > 0 {
avg := sum / float64(n)
data.Avg = schema.Float(avg)
data.Min = schema.Float(min)
data.Max = schema.Float(max)
} else {
data.Avg, data.Min, data.Max = schema.NaN, schema.NaN, schema.NaN
}
}
func (data *APIMetricData) ScaleBy(f schema.Float) {
if f == 0 || f == 1 {
return
}
data.Avg *= f
data.Min *= f
data.Max *= f
for i := 0; i < len(data.Data); i++ {
data.Data[i] *= f
}
}
func (data *APIMetricData) PadDataWithNull(ms *MemoryStore, from, to int64, metric string) {
minfo, ok := ms.Metrics[metric]
if !ok {
return
}
if (data.From / minfo.Frequency) > (from / minfo.Frequency) {
padfront := int((data.From / minfo.Frequency) - (from / minfo.Frequency))
ndata := make([]schema.Float, 0, padfront+len(data.Data))
for range padfront {
ndata = append(ndata, schema.NaN)
}
for j := 0; j < len(data.Data); j++ {
ndata = append(ndata, data.Data[j])
}
data.Data = ndata
}
}
func FetchData(req APIQueryRequest) (*APIQueryResponse, error) {
if req.From > req.To {
return nil, ErrInvalidTimeRange
}
if req.Cluster == "" && req.ForAllNodes != nil {
return nil, ErrEmptyCluster
}
req.WithData = true
ms := GetMemoryStore()
if ms == nil {
return nil, fmt.Errorf("memorystore not initialized")
}
response := APIQueryResponse{
Results: make([][]APIMetricData, 0, len(req.Queries)),
}
if req.ForAllNodes != nil {
nodes := ms.ListChildren([]string{req.Cluster})
for _, node := range nodes {
for _, metric := range req.ForAllNodes {
q := APIQuery{
Metric: metric,
Hostname: node,
}
req.Queries = append(req.Queries, q)
response.Queries = append(response.Queries, q)
}
}
}
for _, query := range req.Queries {
sels := make([]util.Selector, 0, 1)
if query.Aggregate || query.Type == nil {
sel := util.Selector{{String: req.Cluster}, {String: query.Hostname}}
if query.Type != nil {
if len(query.TypeIds) == 1 {
sel = append(sel, util.SelectorElement{String: *query.Type + query.TypeIds[0]})
} else {
ids := make([]string, len(query.TypeIds))
for i, id := range query.TypeIds {
ids[i] = *query.Type + id
}
sel = append(sel, util.SelectorElement{Group: ids})
}
if query.SubType != nil {
if len(query.SubTypeIds) == 1 {
sel = append(sel, util.SelectorElement{String: *query.SubType + query.SubTypeIds[0]})
} else {
ids := make([]string, len(query.SubTypeIds))
for i, id := range query.SubTypeIds {
ids[i] = *query.SubType + id
}
sel = append(sel, util.SelectorElement{Group: ids})
}
}
}
sels = append(sels, sel)
} else {
for _, typeID := range query.TypeIds {
if query.SubType != nil {
for _, subTypeID := range query.SubTypeIds {
sels = append(sels, util.Selector{
{String: req.Cluster},
{String: query.Hostname},
{String: *query.Type + typeID},
{String: *query.SubType + subTypeID},
})
}
} else {
sels = append(sels, util.Selector{
{String: req.Cluster},
{String: query.Hostname},
{String: *query.Type + typeID},
})
}
}
}
// log.Printf("query: %#v\n", query)
// log.Printf("sels: %#v\n", sels)
var err error
res := make([]APIMetricData, 0, len(sels))
for _, sel := range sels {
data := APIMetricData{}
data.Data, data.From, data.To, data.Resolution, err = ms.Read(sel, query.Metric, req.From, req.To, query.Resolution)
if err != nil {
msg := err.Error()
data.Error = &msg
res = append(res, data)
continue
}
if req.WithStats {
data.AddStats()
}
if query.ScaleFactor != 0 {
data.ScaleBy(query.ScaleFactor)
}
if req.WithPadding {
data.PadDataWithNull(ms, req.From, req.To, query.Metric)
}
if !req.WithData {
data.Data = nil
}
res = append(res, data)
}
response.Results = append(response.Results, res)
}
return &response, nil
}
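// A hedged usage sketch of the query API above; the cluster and host names
// reuse the fritz sample job from the golden files and are illustrative only:
//
//	req := APIQueryRequest{
//		Cluster:   "fritz",
//		From:      1675956725,
//		To:        1675956985,
//		WithStats: true,
//		WithData:  true,
//		Queries:   []APIQuery{{Metric: "flops_any", Hostname: "f0720", Resolution: 60}},
//	}
//	resp, err := FetchData(req)
//	// resp.Results[0][0].Avg/Min/Max are filled because WithStats is set;
//	// a request with from > to fails fast with ErrInvalidTimeRange.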

View File

@@ -1,196 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package memorystore
import (
"archive/zip"
"bufio"
"context"
"errors"
"fmt"
"io"
"os"
"path/filepath"
"sync"
"sync/atomic"
"time"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
)
func Archiving(wg *sync.WaitGroup, ctx context.Context) {
go func() {
defer wg.Done()
d, err := time.ParseDuration(Keys.Archive.Interval)
if err != nil {
cclog.Fatalf("[METRICSTORE]> error parsing archive interval duration: %v\n", err)
}
if d <= 0 {
return
}
ticker := time.NewTicker(d)
defer ticker.Stop()
for {
select {
case <-ctx.Done():
return
case <-ticker.C:
t := time.Now().Add(-d)
cclog.Infof("[METRICSTORE]> start archiving checkpoints (older than %s)...", t.Format(time.RFC3339))
n, err := ArchiveCheckpoints(Keys.Checkpoints.RootDir,
Keys.Archive.RootDir, t.Unix(), Keys.Archive.DeleteInstead)
if err != nil {
cclog.Errorf("[METRICSTORE]> archiving failed: %s", err.Error())
} else {
cclog.Infof("[METRICSTORE]> done: %d files zipped and moved to archive", n)
}
}
}
}()
}
var ErrNoNewArchiveData error = errors.New("all data already archived")
// ZIP all checkpoint files older than `from` together and write them to the `archiveDir`,
// deleting them from the `checkpointsDir`.
func ArchiveCheckpoints(checkpointsDir, archiveDir string, from int64, deleteInstead bool) (int, error) {
entries1, err := os.ReadDir(checkpointsDir)
if err != nil {
return 0, err
}
type workItem struct {
cdir, adir string
cluster, host string
}
var wg sync.WaitGroup
n, errs := int32(0), int32(0)
work := make(chan workItem, Keys.NumWorkers)
wg.Add(Keys.NumWorkers)
for worker := 0; worker < Keys.NumWorkers; worker++ {
go func() {
defer wg.Done()
for workItem := range work {
m, err := archiveCheckpoints(workItem.cdir, workItem.adir, from, deleteInstead)
if err != nil {
cclog.Errorf("error while archiving %s/%s: %s", workItem.cluster, workItem.host, err.Error())
atomic.AddInt32(&errs, 1)
}
atomic.AddInt32(&n, int32(m))
}
}()
}
for _, de1 := range entries1 {
entries2, e := os.ReadDir(filepath.Join(checkpointsDir, de1.Name()))
if e != nil {
err = e
}
for _, de2 := range entries2 {
cdir := filepath.Join(checkpointsDir, de1.Name(), de2.Name())
adir := filepath.Join(archiveDir, de1.Name(), de2.Name())
work <- workItem{
adir: adir, cdir: cdir,
cluster: de1.Name(), host: de2.Name(),
}
}
}
close(work)
wg.Wait()
if err != nil {
return int(n), err
}
if errs > 0 {
return int(n), fmt.Errorf("%d errors happened while archiving (%d successes)", errs, n)
}
return int(n), nil
}
// Helper function for `ArchiveCheckpoints`.
func archiveCheckpoints(dir string, archiveDir string, from int64, deleteInstead bool) (int, error) {
entries, err := os.ReadDir(dir)
if err != nil {
return 0, err
}
extension := Keys.Checkpoints.FileFormat
files, err := findFiles(entries, from, extension, false)
if err != nil {
return 0, err
}
if deleteInstead {
n := 0
for _, checkpoint := range files {
filename := filepath.Join(dir, checkpoint)
if err = os.Remove(filename); err != nil {
return n, err
}
n += 1
}
return n, nil
}
filename := filepath.Join(archiveDir, fmt.Sprintf("%d.zip", from))
f, err := os.OpenFile(filename, os.O_CREATE|os.O_WRONLY, CheckpointFilePerms)
if err != nil && os.IsNotExist(err) {
err = os.MkdirAll(archiveDir, CheckpointDirPerms)
if err == nil {
f, err = os.OpenFile(filename, os.O_CREATE|os.O_WRONLY, CheckpointFilePerms)
}
}
if err != nil {
return 0, err
}
defer f.Close()
bw := bufio.NewWriter(f)
defer bw.Flush()
zw := zip.NewWriter(bw)
defer zw.Close()
n := 0
for _, checkpoint := range files {
// Use closure to ensure file is closed immediately after use,
// avoiding file descriptor leak from defer in loop
err := func() error {
filename := filepath.Join(dir, checkpoint)
r, err := os.Open(filename)
if err != nil {
return err
}
defer r.Close()
w, err := zw.Create(checkpoint)
if err != nil {
return err
}
if _, err = io.Copy(w, r); err != nil {
return err
}
if err = os.Remove(filename); err != nil {
return err
}
return nil
}()
if err != nil {
return n, err
}
n += 1
}
return n, nil
}
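// A hedged sketch of driving ArchiveCheckpoints directly, outside the ticker
// loop in Archiving; the directories and one-hour cutoff are illustrative:
//
//	cutoff := time.Now().Add(-1 * time.Hour).Unix()
//	n, err := ArchiveCheckpoints("./var/checkpoints", "./var/archive", cutoff, false)
//	if err != nil {
//		cclog.Errorf("archiving failed: %s", err.Error())
//	} else {
//		cclog.Infof("%d files zipped and moved to archive", n)
//	}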

View File

@@ -1,477 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package memorystore
import (
"bufio"
"encoding/json"
"errors"
"fmt"
"os"
"path"
"sort"
"strconv"
"strings"
"sync"
"sync/atomic"
"time"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/schema"
"github.com/linkedin/goavro/v2"
)
var NumAvroWorkers int = DefaultAvroWorkers
var startUp bool = true
func (as *AvroStore) ToCheckpoint(dir string, dumpAll bool) (int, error) {
levels := make([]*AvroLevel, 0)
selectors := make([][]string, 0)
as.root.lock.RLock()
// Cluster
for sel1, l1 := range as.root.children {
l1.lock.RLock()
// Node
for sel2, l2 := range l1.children {
l2.lock.RLock()
// Frequency
for sel3, l3 := range l2.children {
levels = append(levels, l3)
selectors = append(selectors, []string{sel1, sel2, sel3})
}
l2.lock.RUnlock()
}
l1.lock.RUnlock()
}
as.root.lock.RUnlock()
type workItem struct {
level *AvroLevel
dir string
selector []string
}
n, errs := int32(0), int32(0)
var wg sync.WaitGroup
wg.Add(NumAvroWorkers)
work := make(chan workItem, NumAvroWorkers*2)
for range NumAvroWorkers {
go func() {
defer wg.Done()
for workItem := range work {
from := getTimestamp(workItem.dir)
if err := workItem.level.toCheckpoint(workItem.dir, from, dumpAll); err != nil {
if err == ErrNoNewArchiveData {
continue
}
cclog.Errorf("error while checkpointing %#v: %s", workItem.selector, err.Error())
atomic.AddInt32(&errs, 1)
} else {
atomic.AddInt32(&n, 1)
}
}
}()
}
for i := range len(levels) {
dir := path.Join(dir, path.Join(selectors[i]...))
work <- workItem{
level: levels[i],
dir: dir,
selector: selectors[i],
}
}
close(work)
wg.Wait()
if errs > 0 {
return int(n), fmt.Errorf("%d errors happend while creating avro checkpoints (%d successes)", errs, n)
}
startUp = false
return int(n), nil
}
// getTimestamp returns the timestamp from the directory name
func getTimestamp(dir string) int64 {
// Extract the resolution and timestamp from the directory name
// The existing avro file will be in epoch timestamp format
// iterate over all the files in the directory and find the maximum timestamp
// and return it
resolution := path.Base(dir)
dir = path.Dir(dir)
files, err := os.ReadDir(dir)
if err != nil {
return 0
}
var maxTS int64 = 0
if len(files) == 0 {
return 0
}
for _, file := range files {
if file.IsDir() {
continue
}
name := file.Name()
if len(name) < 5 || !strings.HasSuffix(name, ".avro") || !strings.HasPrefix(name, resolution+"_") {
continue
}
ts, err := strconv.ParseInt(name[strings.Index(name, "_")+1:len(name)-5], 10, 64)
if err != nil {
fmt.Printf("error while parsing timestamp: %s\n", err.Error())
continue
}
if ts > maxTS {
maxTS = ts
}
}
interval, _ := time.ParseDuration(Keys.Checkpoints.Interval)
updateTime := time.Unix(maxTS, 0).Add(interval).Add(time.Duration(CheckpointBufferMinutes-1) * time.Minute).Unix()
if startUp {
return 0
}
if updateTime < time.Now().Unix() {
return 0
}
return maxTS
}
func (l *AvroLevel) toCheckpoint(dir string, from int64, dumpAll bool) error {
l.lock.Lock()
defer l.lock.Unlock()
// fmt.Printf("Checkpointing directory: %s\n", dir)
// filepath contains the resolution
intRes, _ := strconv.Atoi(path.Base(dir))
// find smallest overall timestamp in l.data map and delete it from l.data
minTS := int64(1<<63 - 1)
for ts, dat := range l.data {
if ts < minTS && len(dat) != 0 {
minTS = ts
}
}
if from == 0 && minTS != int64(1<<63-1) {
from = minTS
}
if from == 0 {
return ErrNoNewArchiveData
}
var schema string
var codec *goavro.Codec
recordList := make([]map[string]any, 0)
var f *os.File
filePath := dir + fmt.Sprintf("_%d.avro", from)
var err error
fp_, err_ := os.Stat(filePath)
if errors.Is(err_, os.ErrNotExist) {
err = os.MkdirAll(path.Dir(dir), 0o755)
if err != nil {
return fmt.Errorf("failed to create directory: %v", err)
}
} else if fp_.Size() != 0 {
f, err = os.Open(filePath)
if err != nil {
return fmt.Errorf("failed to open existing avro file: %v", err)
}
br := bufio.NewReader(f)
reader, err := goavro.NewOCFReader(br)
if err != nil {
return fmt.Errorf("failed to create OCF reader: %v", err)
}
codec = reader.Codec()
schema = codec.Schema()
f.Close()
}
timeRef := time.Now().Add(time.Duration(-CheckpointBufferMinutes+1) * time.Minute).Unix()
if dumpAll {
timeRef = time.Now().Unix()
}
// Empty values
if len(l.data) == 0 {
// we checkpoint avro files every 60 seconds
repeat := 60 / intRes
for range repeat {
recordList = append(recordList, make(map[string]any))
}
}
readFlag := true
for ts := range l.data {
flag := false
if ts < timeRef {
data := l.data[ts]
schemaGen, err := generateSchema(data)
if err != nil {
return err
}
flag, schema, err = compareSchema(schema, schemaGen)
if err != nil {
return fmt.Errorf("failed to compare read and generated schema: %v", err)
}
if flag && readFlag && !errors.Is(err_, os.ErrNotExist) {
f.Close()
f, err = os.Open(filePath)
if err != nil {
return fmt.Errorf("failed to open Avro file: %v", err)
}
br := bufio.NewReader(f)
ocfReader, err := goavro.NewOCFReader(br)
if err != nil {
return fmt.Errorf("failed to create OCF reader while changing schema: %v", err)
}
for ocfReader.Scan() {
record, err := ocfReader.Read()
if err != nil {
return fmt.Errorf("failed to read record: %v", err)
}
recordList = append(recordList, record.(map[string]any))
}
f.Close()
err = os.Remove(filePath)
if err != nil {
return fmt.Errorf("failed to delete file: %v", err)
}
readFlag = false
}
codec, err = goavro.NewCodec(schema)
if err != nil {
return fmt.Errorf("failed to create codec after merged schema: %v", err)
}
recordList = append(recordList, generateRecord(data))
delete(l.data, ts)
}
}
if len(recordList) == 0 {
return ErrNoNewArchiveData
}
f, err = os.OpenFile(filePath, os.O_CREATE|os.O_APPEND|os.O_RDWR, 0o644)
if err != nil {
return fmt.Errorf("failed to append new avro file: %v", err)
}
// fmt.Printf("Codec : %#v\n", codec)
writer, err := goavro.NewOCFWriter(goavro.OCFConfig{
W: f,
Codec: codec,
CompressionName: goavro.CompressionDeflateLabel,
})
if err != nil {
return fmt.Errorf("failed to create OCF writer: %v", err)
}
// Append the new record
if err := writer.Append(recordList); err != nil {
return fmt.Errorf("failed to append record: %v", err)
}
f.Close()
return nil
}
func compareSchema(schemaRead, schemaGen string) (bool, string, error) {
var genSchema, readSchema AvroSchema
if schemaRead == "" {
return false, schemaGen, nil
}
// Unmarshal the schema strings into AvroSchema structs
if err := json.Unmarshal([]byte(schemaGen), &genSchema); err != nil {
return false, "", fmt.Errorf("failed to parse generated schema: %v", err)
}
if err := json.Unmarshal([]byte(schemaRead), &readSchema); err != nil {
return false, "", fmt.Errorf("failed to parse read schema: %v", err)
}
sort.Slice(genSchema.Fields, func(i, j int) bool {
return genSchema.Fields[i].Name < genSchema.Fields[j].Name
})
sort.Slice(readSchema.Fields, func(i, j int) bool {
return readSchema.Fields[i].Name < readSchema.Fields[j].Name
})
// Check if schemas are identical
schemasEqual := true
if len(genSchema.Fields) <= len(readSchema.Fields) {
for i := range genSchema.Fields {
if genSchema.Fields[i].Name != readSchema.Fields[i].Name {
schemasEqual = false
break
}
}
// If schemas are identical, return the read schema
if schemasEqual {
return false, schemaRead, nil
}
}
// Create a map to hold unique fields from both schemas
fieldMap := make(map[string]AvroField)
// Add fields from the read schema
for _, field := range readSchema.Fields {
fieldMap[field.Name] = field
}
// Add or update fields from the generated schema
for _, field := range genSchema.Fields {
fieldMap[field.Name] = field
}
// Create a union schema by collecting fields from the map
var mergedFields []AvroField
for _, field := range fieldMap {
mergedFields = append(mergedFields, field)
}
// Sort fields by name for consistency
sort.Slice(mergedFields, func(i, j int) bool {
return mergedFields[i].Name < mergedFields[j].Name
})
// Create the merged schema
mergedSchema := AvroSchema{
Type: "record",
Name: genSchema.Name,
Fields: mergedFields,
}
// Check if schemas are identical
schemasEqual = len(mergedSchema.Fields) == len(readSchema.Fields)
if schemasEqual {
for i := range mergedSchema.Fields {
if mergedSchema.Fields[i].Name != readSchema.Fields[i].Name {
schemasEqual = false
break
}
}
if schemasEqual {
return false, schemaRead, nil
}
}
// Marshal the merged schema back to JSON
mergedSchemaJSON, err := json.Marshal(mergedSchema)
if err != nil {
return false, "", fmt.Errorf("failed to marshal merged schema: %v", err)
}
return true, string(mergedSchemaJSON), nil
}
func generateSchema(data map[string]schema.Float) (string, error) {
// Define the Avro schema structure
schema := map[string]any{
"type": "record",
"name": "DataRecord",
"fields": []map[string]any{},
}
fieldTracker := make(map[string]struct{})
for key := range data {
if _, exists := fieldTracker[key]; !exists {
key = correctKey(key)
field := map[string]any{
"name": key,
"type": "double",
"default": -1.0,
}
schema["fields"] = append(schema["fields"].([]map[string]any), field)
fieldTracker[key] = struct{}{}
}
}
schemaString, err := json.Marshal(schema)
if err != nil {
return "", fmt.Errorf("failed to marshal schema: %v", err)
}
return string(schemaString), nil
}
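// For a single-metric map the generated schema comes out as below; field
// names pass through correctKey, and json.Marshal sorts the object keys:
//
//	s, _ := generateSchema(map[string]schema.Float{"mem_bw": 79.5})
//	// {"fields":[{"default":-1,"name":"mem_0x5F_bw","type":"double"}],"name":"DataRecord","type":"record"}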
func generateRecord(data map[string]schema.Float) map[string]any {
record := make(map[string]any)
// Iterate through each map in data
for key, value := range data {
key = correctKey(key)
// Set the value in the record
// avro only accepts basic types
record[key] = value.Double()
}
return record
}
func correctKey(key string) string {
key = strings.ReplaceAll(key, "_", "_0x5F_")
key = strings.ReplaceAll(key, ":", "_0x3A_")
key = strings.ReplaceAll(key, ".", "_0x2E_")
return key
}
func ReplaceKey(key string) string {
key = strings.ReplaceAll(key, "_0x2E_", ".")
key = strings.ReplaceAll(key, "_0x3A_", ":")
key = strings.ReplaceAll(key, "_0x5F_", "_")
return key
}
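// correctKey escapes "_" first so the escape markers themselves are never
// re-escaped, and ReplaceKey decodes it last, making the two exact inverses:
//
//	k := correctKey("mem_bw:socket.0") // "mem_0x5F_bw_0x3A_socket_0x2E_0"
//	_ = ReplaceKey(k)                  // "mem_bw:socket.0"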

View File

@@ -1,84 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package memorystore
import (
"context"
"slices"
"strconv"
"sync"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
)
func DataStaging(wg *sync.WaitGroup, ctx context.Context) {
// AvroPool is a pool of Avro writers.
go func() {
if Keys.Checkpoints.FileFormat == "json" {
wg.Done() // Mark this goroutine as done
return // Exit the goroutine
}
defer wg.Done()
var avroLevel *AvroLevel
oldSelector := make([]string, 0)
for {
select {
case <-ctx.Done():
return
case val := <-LineProtocolMessages:
// Fetch the frequency of the metric from the global configuration
freq, err := GetMetricFrequency(val.MetricName)
if err != nil {
cclog.Errorf("Error fetching metric frequency: %s\n", err)
continue
}
metricName := ""
for _, selectorName := range val.Selector {
metricName += selectorName + SelectorDelimiter
}
metricName += val.MetricName
// Create a new selector for the Avro level
// The selector is a slice of strings that represents the path to the
// Avro level. It is created by appending the cluster, node, and metric
// name to the selector.
var selector []string
selector = append(selector, val.Cluster, val.Node, strconv.FormatInt(freq, 10))
if !stringSlicesEqual(oldSelector, selector) {
// Get the Avro level for the metric
avroLevel = avroStore.root.findAvroLevelOrCreate(selector)
// findAvroLevelOrCreate should never return nil; guard defensively
// and skip this message instead of dereferencing a nil level below.
if avroLevel == nil {
cclog.Errorf("Error creating or finding the level with cluster : %s, node : %s, metric : %s\n", val.Cluster, val.Node, val.MetricName)
continue
}
oldSelector = slices.Clone(selector)
}
avroLevel.addMetric(metricName, val.Value, val.Timestamp, int(freq))
}
}
}()
}
func stringSlicesEqual(a, b []string) bool {
if len(a) != len(b) {
return false
}
for i := range a {
if a[i] != b[i] {
return false
}
}
return true
}
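// Illustrative wiring (not from the original source; the shutdown sequence is
// an assumption): DataStaging is started once and consumes messages from
// LineProtocolMessages until the context is cancelled.
//
//	var wg sync.WaitGroup
//	ctx, cancel := context.WithCancel(context.Background())
//	wg.Add(1)
//	DataStaging(&wg, ctx)
//	LineProtocolMessages <- &AvroStruct{
//		MetricName: "flops_any", Cluster: "emmy", Node: "host123",
//		Selector:   []string{"cpu0"}, Value: 42.0, Timestamp: 1700000000,
//	}
//	cancel()
//	wg.Wait()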

View File

@@ -1,167 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package memorystore
import (
"sync"
"github.com/ClusterCockpit/cc-lib/schema"
)
var (
LineProtocolMessages = make(chan *AvroStruct)
// SelectorDelimiter separates hierarchical selector components in metric names for Avro encoding
SelectorDelimiter = "_SEL_"
)
var CheckpointBufferMinutes = DefaultCheckpointBufferMin
type AvroStruct struct {
MetricName string
Cluster string
Node string
Selector []string
Value schema.Float
Timestamp int64
}
type AvroStore struct {
root AvroLevel
}
var avroStore AvroStore
type AvroLevel struct {
children map[string]*AvroLevel
data map[int64]map[string]schema.Float
lock sync.RWMutex
}
type AvroField struct {
Name string `json:"name"`
Type any `json:"type"`
Default any `json:"default,omitempty"`
}
type AvroSchema struct {
Type string `json:"type"`
Name string `json:"name"`
Fields []AvroField `json:"fields"`
}
func (l *AvroLevel) findAvroLevelOrCreate(selector []string) *AvroLevel {
if len(selector) == 0 {
return l
}
// Allow concurrent reads:
l.lock.RLock()
var child *AvroLevel
var ok bool
if l.children == nil {
// Children map needs to be created...
l.lock.RUnlock()
} else {
child, ok = l.children[selector[0]]
l.lock.RUnlock()
if ok {
return child.findAvroLevelOrCreate(selector[1:])
}
}
// The level does not exist, take write lock for unique access:
l.lock.Lock()
// While this thread waited for the write lock, another thread
// could have created the child node.
if l.children != nil {
child, ok = l.children[selector[0]]
if ok {
l.lock.Unlock()
return child.findAvroLevelOrCreate(selector[1:])
}
}
child = &AvroLevel{
data: make(map[int64]map[string]schema.Float, 0),
children: nil,
}
if l.children != nil {
l.children[selector[0]] = child
} else {
l.children = map[string]*AvroLevel{selector[0]: child}
}
l.lock.Unlock()
return child.findAvroLevelOrCreate(selector[1:])
}
func (l *AvroLevel) addMetric(metricName string, value schema.Float, timestamp int64, Freq int) {
l.lock.Lock()
defer l.lock.Unlock()
KeyCounter := int(CheckpointBufferMinutes * 60 / Freq)
// Create keys in advance for the given amount of time
if len(l.data) != KeyCounter {
if len(l.data) == 0 {
for i := range KeyCounter {
l.data[timestamp+int64(i*Freq)] = make(map[string]schema.Float, 0)
}
} else {
// Get the last timestamp
var lastTS int64
for ts := range l.data {
if ts > lastTS {
lastTS = ts
}
}
// Extend the window by one slot: create the key that follows the last timestamp
l.data[lastTS+int64(Freq)] = make(map[string]schema.Float, 0)
}
}
closestTS := int64(0)
minDiff := int64(Freq) + 1 // Start with diff just outside the valid range
found := false
// Iterate over the pre-created timestamps and choose the closest one
// within plus/minus one frequency interval of the incoming timestamp.
for ts, dat := range l.data {
// Check if timestamp is within range
diff := timestamp - ts
if diff < -int64(Freq) || diff > int64(Freq) {
continue
}
// Metric already present at this timestamp — skip
if _, ok := dat[metricName]; ok {
continue
}
// Check if this is the closest timestamp so far
if Abs(diff) < minDiff {
minDiff = Abs(diff)
closestTS = ts
found = true
}
}
if found {
l.data[closestTS][metricName] = value
}
}
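// Worked example (illustrative, not from the original source): with Freq = 60
// and pre-created slots at t = 0, 60, 120, ..., a sample arriving at
// timestamp 65 lands in slot 60 (diff 5) rather than slot 120 (diff 55). A
// second sample for the same metric at timestamp 70 skips the occupied slot
// 60 and lands in slot 120, because |70-120| = 50 is still within one
// frequency interval.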
func GetAvroStore() *AvroStore {
return &avroStore
}
// Abs returns the absolute value of x.
func Abs(x int64) int64 {
if x < 0 {
return -x
}
return x
}

View File

@@ -1,190 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package memorystore
import (
"errors"
"sync"
"github.com/ClusterCockpit/cc-lib/schema"
)
// BufferCap is the default buffer capacity.
// buffer.data will only ever grow up to its capacity and a new link
// in the buffer chain will be created if needed so that no copying
// of data or reallocation needs to happen on writes.
const BufferCap int = DefaultBufferCapacity
var bufferPool sync.Pool = sync.Pool{
New: func() any {
return &buffer{
data: make([]schema.Float, 0, BufferCap),
}
},
}
var (
ErrNoData error = errors.New("[METRICSTORE]> no data for this metric/level")
ErrDataDoesNotAlign error = errors.New("[METRICSTORE]> data from lower granularities does not align")
)
// Each metric on each level has its own buffer.
// This is where the actual values go.
// If `cap(data)` is reached, a new buffer is created and
// becomes the new head of a buffer list.
type buffer struct {
prev *buffer
next *buffer
data []schema.Float
frequency int64
start int64
archived bool
closed bool
}
func newBuffer(ts, freq int64) *buffer {
b := bufferPool.Get().(*buffer)
b.frequency = freq
b.start = ts - (freq / 2)
b.prev = nil
b.next = nil
b.archived = false
b.closed = false
b.data = b.data[:0]
return b
}
// If a new buffer was created, the new head is returned.
// Otherwise, the existing buffer is returned.
// Normally, only "newer" data should be written, but if the value would
// end up in the same buffer anyway, it is allowed.
func (b *buffer) write(ts int64, value schema.Float) (*buffer, error) {
if ts < b.start {
return nil, errors.New("[METRICSTORE]> cannot write value to buffer from past")
}
// idx := int((ts - b.start + (b.frequency / 3)) / b.frequency)
idx := int((ts - b.start) / b.frequency)
if idx >= cap(b.data) {
newbuf := newBuffer(ts, b.frequency)
newbuf.prev = b
b.next = newbuf
b = newbuf
idx = 0
}
// Overwriting value or writing value from past
if idx < len(b.data) {
b.data[idx] = value
return b, nil
}
// Fill up unwritten slots with NaN
for i := len(b.data); i < idx; i++ {
b.data = append(b.data, schema.NaN)
}
b.data = append(b.data, value)
return b, nil
}
func (b *buffer) end() int64 {
return b.firstWrite() + int64(len(b.data))*b.frequency
}
func (b *buffer) firstWrite() int64 {
return b.start + (b.frequency / 2)
}
// Return all known values from `from` to `to`. Gaps of information are represented as NaN.
// Simple linear interpolation is done between the two neighboring cells if possible.
// If values at the start or end are missing, instead of NaN values, the second and third
// return values contain the actual `from`/`to`.
// This function goes back the buffer chain if `from` is older than the current buffer's start.
// The loaded values are added to `data` and `data` is returned, possibly with a shorter length.
// If `data` is not long enough to hold all values, this function will panic!
func (b *buffer) read(from, to int64, data []schema.Float) ([]schema.Float, int64, int64, error) {
if from < b.firstWrite() {
if b.prev != nil {
return b.prev.read(from, to, data)
}
from = b.firstWrite()
}
i := 0
t := from
for ; t < to; t += b.frequency {
idx := int((t - b.start) / b.frequency)
if idx >= cap(b.data) {
if b.next == nil {
break
}
b = b.next
idx = 0
}
if idx >= len(b.data) {
if b.next == nil || to <= b.next.start {
break
}
data[i] += schema.NaN
} else if t < b.start {
data[i] += schema.NaN
} else {
data[i] += b.data[idx]
}
i++
}
return data[:i], from, t, nil
}
// Returns true if this buffer needs to be freed.
func (b *buffer) free(t int64) (delme bool, n int) {
if b.prev != nil {
delme, m := b.prev.free(t)
n += m
if delme {
b.prev.next = nil
if cap(b.prev.data) == BufferCap {
bufferPool.Put(b.prev)
}
b.prev = nil
}
}
end := b.end()
if end < t {
return true, n + 1
}
return false, n
}
// Call `callback` on every buffer that contains data in the range from `from` to `to`.
func (b *buffer) iterFromTo(from, to int64, callback func(b *buffer) error) error {
if b == nil {
return nil
}
if err := b.prev.iterFromTo(from, to, callback); err != nil {
return err
}
if from <= b.end() && b.start <= to {
return callback(b)
}
return nil
}
func (b *buffer) count() int64 {
res := int64(len(b.data))
if b.prev != nil {
res += b.prev.count()
}
return res
}
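// Illustrative sketch (not from the original source): buffers form a
// doubly-linked chain; write allocates a new head once cap(data) is
// exhausted, and read walks backwards via prev for older ranges.
//
//	b := newBuffer(1000, 10) // start = 995, firstWrite = 1000
//	b, _ = b.write(1000, 1.0)
//	b, _ = b.write(1030, 2.0) // slots 1010 and 1020 are padded with NaN
//	out := make([]schema.Float, 4)
//	data, from, to, _ := b.read(1000, 1040, out)
//	// data == [1.0, NaN, NaN, 2.0], from == 1000, to == 1040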

View File

@@ -1,761 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package memorystore
import (
"bufio"
"context"
"encoding/json"
"errors"
"fmt"
"io/fs"
"os"
"path"
"path/filepath"
"runtime"
"sort"
"strconv"
"strings"
"sync"
"sync/atomic"
"time"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/schema"
"github.com/linkedin/goavro/v2"
)
const (
CheckpointFilePerms = 0o644
CheckpointDirPerms = 0o755
GCTriggerInterval = DefaultGCTriggerInterval
)
// Whenever changed, update MarshalJSON as well!
type CheckpointMetrics struct {
Data []schema.Float `json:"data"`
Frequency int64 `json:"frequency"`
Start int64 `json:"start"`
}
type CheckpointFile struct {
Metrics map[string]*CheckpointMetrics `json:"metrics"`
Children map[string]*CheckpointFile `json:"children"`
From int64 `json:"from"`
To int64 `json:"to"`
}
var lastCheckpoint time.Time
func Checkpointing(wg *sync.WaitGroup, ctx context.Context) {
lastCheckpoint = time.Now()
if Keys.Checkpoints.FileFormat == "json" {
ms := GetMemoryStore()
go func() {
defer wg.Done()
d, err := time.ParseDuration(Keys.Checkpoints.Interval)
if err != nil {
cclog.Fatal(err)
}
if d <= 0 {
return
}
ticker := time.NewTicker(d)
defer ticker.Stop()
for {
select {
case <-ctx.Done():
return
case <-ticker.C:
cclog.Infof("[METRICSTORE]> start checkpointing (starting at %s)...", lastCheckpoint.Format(time.RFC3339))
now := time.Now()
n, err := ms.ToCheckpoint(Keys.Checkpoints.RootDir,
lastCheckpoint.Unix(), now.Unix())
if err != nil {
cclog.Errorf("[METRICSTORE]> checkpointing failed: %s", err.Error())
} else {
cclog.Infof("[METRICSTORE]> done: %d checkpoint files created", n)
lastCheckpoint = now
}
}
}
}()
} else {
go func() {
defer wg.Done()
select {
case <-ctx.Done():
return
case <-time.After(time.Duration(CheckpointBufferMinutes) * time.Minute):
GetAvroStore().ToCheckpoint(Keys.Checkpoints.RootDir, false)
}
ticker := time.NewTicker(DefaultAvroCheckpointInterval)
defer ticker.Stop()
for {
select {
case <-ctx.Done():
return
case <-ticker.C:
GetAvroStore().ToCheckpoint(Keys.Checkpoints.RootDir, false)
}
}
}()
}
}
// As `Float` implements a custom MarshalJSON() function,
// serializing an array of such types has more overhead
// than one would assume (because of extra allocations, interfaces and so on).
func (cm *CheckpointMetrics) MarshalJSON() ([]byte, error) {
buf := make([]byte, 0, 128+len(cm.Data)*8)
buf = append(buf, `{"frequency":`...)
buf = strconv.AppendInt(buf, cm.Frequency, 10)
buf = append(buf, `,"start":`...)
buf = strconv.AppendInt(buf, cm.Start, 10)
buf = append(buf, `,"data":[`...)
for i, x := range cm.Data {
if i != 0 {
buf = append(buf, ',')
}
if x.IsNaN() {
buf = append(buf, `null`...)
} else {
buf = strconv.AppendFloat(buf, float64(x), 'f', 1, 32)
}
}
buf = append(buf, `]}`...)
return buf, nil
}
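// Example output (illustrative): CheckpointMetrics{Frequency: 60, Start:
// 1700000000, Data: []schema.Float{1.5, schema.NaN}} serializes to
// {"frequency":60,"start":1700000000,"data":[1.5,null]}; NaN becomes JSON
// null and floats are written with one decimal digit.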
// Metrics stored at the lowest 2 levels are not stored away (root and cluster)!
// On a per-host basis a new JSON file is created. I have no idea if this will scale.
// The good thing: Only a host at a time is locked, so this function can run
// in parallel to writes/reads.
func (m *MemoryStore) ToCheckpoint(dir string, from, to int64) (int, error) {
levels := make([]*Level, 0)
selectors := make([][]string, 0)
m.root.lock.RLock()
for sel1, l1 := range m.root.children {
l1.lock.RLock()
for sel2, l2 := range l1.children {
levels = append(levels, l2)
selectors = append(selectors, []string{sel1, sel2})
}
l1.lock.RUnlock()
}
m.root.lock.RUnlock()
type workItem struct {
level *Level
dir string
selector []string
}
n, errs := int32(0), int32(0)
var wg sync.WaitGroup
wg.Add(Keys.NumWorkers)
work := make(chan workItem, Keys.NumWorkers*2)
for worker := 0; worker < Keys.NumWorkers; worker++ {
go func() {
defer wg.Done()
for workItem := range work {
if err := workItem.level.toCheckpoint(workItem.dir, from, to, m); err != nil {
if err == ErrNoNewArchiveData {
continue
}
cclog.Errorf("[METRICSTORE]> error while checkpointing %#v: %s", workItem.selector, err.Error())
atomic.AddInt32(&errs, 1)
} else {
atomic.AddInt32(&n, 1)
}
}
}()
}
for i := 0; i < len(levels); i++ {
dir := path.Join(dir, path.Join(selectors[i]...))
work <- workItem{
level: levels[i],
dir: dir,
selector: selectors[i],
}
}
close(work)
wg.Wait()
if errs > 0 {
return int(n), fmt.Errorf("[METRICSTORE]> %d errors happened while creating checkpoints (%d successes)", errs, n)
}
return int(n), nil
}
func (l *Level) toCheckpointFile(from, to int64, m *MemoryStore) (*CheckpointFile, error) {
l.lock.RLock()
defer l.lock.RUnlock()
retval := &CheckpointFile{
From: from,
To: to,
Metrics: make(map[string]*CheckpointMetrics),
Children: make(map[string]*CheckpointFile),
}
for metric, minfo := range m.Metrics {
b := l.metrics[minfo.offset]
if b == nil {
continue
}
allArchived := true
b.iterFromTo(from, to, func(b *buffer) error {
if !b.archived {
allArchived = false
}
return nil
})
if allArchived {
continue
}
data := make([]schema.Float, (to-from)/b.frequency+1)
data, start, end, err := b.read(from, to, data)
if err != nil {
return nil, err
}
for i := int((end - start) / b.frequency); i < len(data); i++ {
data[i] = schema.NaN
}
retval.Metrics[metric] = &CheckpointMetrics{
Frequency: b.frequency,
Start: start,
Data: data,
}
}
for name, child := range l.children {
val, err := child.toCheckpointFile(from, to, m)
if err != nil {
return nil, err
}
if val != nil {
retval.Children[name] = val
}
}
if len(retval.Children) == 0 && len(retval.Metrics) == 0 {
return nil, nil
}
return retval, nil
}
func (l *Level) toCheckpoint(dir string, from, to int64, m *MemoryStore) error {
cf, err := l.toCheckpointFile(from, to, m)
if err != nil {
return err
}
if cf == nil {
return ErrNoNewArchiveData
}
filepath := path.Join(dir, fmt.Sprintf("%d.json", from))
f, err := os.OpenFile(filepath, os.O_CREATE|os.O_WRONLY, CheckpointFilePerms)
if err != nil && os.IsNotExist(err) {
err = os.MkdirAll(dir, CheckpointDirPerms)
if err == nil {
f, err = os.OpenFile(filepath, os.O_CREATE|os.O_WRONLY, CheckpointFilePerms)
}
}
if err != nil {
return err
}
defer f.Close()
bw := bufio.NewWriter(f)
if err = json.NewEncoder(bw).Encode(cf); err != nil {
return err
}
return bw.Flush()
}
func (m *MemoryStore) FromCheckpoint(dir string, from int64, extension string) (int, error) {
var wg sync.WaitGroup
work := make(chan [2]string, Keys.NumWorkers)
n, errs := int32(0), int32(0)
wg.Add(Keys.NumWorkers)
for worker := 0; worker < Keys.NumWorkers; worker++ {
go func() {
defer wg.Done()
for host := range work {
lvl := m.root.findLevelOrCreate(host[:], len(m.Metrics))
nn, err := lvl.fromCheckpoint(m, filepath.Join(dir, host[0], host[1]), from, extension)
if err != nil {
cclog.Errorf("[METRICSTORE]> error while loading checkpoints for %s/%s: %s", host[0], host[1], err.Error())
atomic.AddInt32(&errs, 1)
}
atomic.AddInt32(&n, int32(nn))
}
}()
}
i := 0
clustersDir, err := os.ReadDir(dir)
for _, clusterDir := range clustersDir {
if !clusterDir.IsDir() {
err = errors.New("[METRICSTORE]> expected only directories at first level of checkpoints/ directory")
goto done
}
hostsDir, e := os.ReadDir(filepath.Join(dir, clusterDir.Name()))
if e != nil {
err = e
goto done
}
for _, hostDir := range hostsDir {
if !hostDir.IsDir() {
err = errors.New("[METRICSTORE]> expected only directories at second level of checkpoints/ directory")
goto done
}
i++
if i%Keys.NumWorkers == 0 && i > GCTriggerInterval {
// Forcing garbage collection runs here regularly during the loading of checkpoints
// will decrease the total heap size after loading everything back to memory is done.
// While loading data, the heap will grow fast, so the GC target size will double
// almost always. By forcing GCs here, we can keep it growing more slowly so that
// at the end, less memory is wasted.
runtime.GC()
}
work <- [2]string{clusterDir.Name(), hostDir.Name()}
}
}
done:
close(work)
wg.Wait()
if err != nil {
return int(n), err
}
if errs > 0 {
return int(n), fmt.Errorf("[METRICSTORE]> %d errors happened while creating checkpoints (%d successes)", errs, n)
}
return int(n), nil
}
// Metrics stored at the lowest 2 levels are not loaded (root and cluster)!
// This function can only be called once and before the very first write or read.
// Different hosts' data is loaded into memory in parallel.
func (m *MemoryStore) FromCheckpointFiles(dir string, from int64) (int, error) {
if _, err := os.Stat(dir); os.IsNotExist(err) {
// The directory does not exist, so create it using os.MkdirAll()
err := os.MkdirAll(dir, CheckpointDirPerms) // CheckpointDirPerms sets the permissions for the directory
if err != nil {
cclog.Fatalf("[METRICSTORE]> Error creating directory: %#v\n", err)
}
cclog.Debugf("[METRICSTORE]> %#v Directory created successfully", dir)
}
// Determine the configured checkpoint file format
fileFormat := Keys.Checkpoints.FileFormat
if fileFormat == "" {
fileFormat = "avro"
}
// Map to easily get the fallback format
oppositeFormat := map[string]string{
"json": "avro",
"avro": "json",
}
// First, attempt to load the specified format
if found, err := checkFilesWithExtension(dir, fileFormat); err != nil {
return 0, fmt.Errorf("[METRICSTORE]> error checking files with extension: %v", err)
} else if found {
cclog.Infof("[METRICSTORE]> Loading %s files because fileformat is %s", fileFormat, fileFormat)
return m.FromCheckpoint(dir, from, fileFormat)
}
// If not found, attempt the opposite format
altFormat := oppositeFormat[fileFormat]
if found, err := checkFilesWithExtension(dir, altFormat); err != nil {
return 0, fmt.Errorf("[METRICSTORE]> error checking files with extension: %v", err)
} else if found {
cclog.Infof("[METRICSTORE]> Loading %s files but fileformat is %s", altFormat, fileFormat)
return m.FromCheckpoint(dir, from, altFormat)
}
cclog.Print("[METRICSTORE]> No valid checkpoint files found in the directory")
return 0, nil
}
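// Illustrative startup usage (not from the original source; directory and
// retention window are assumptions):
//
//	ms := GetMemoryStore()
//	from := time.Now().Add(-48 * time.Hour).Unix()
//	files, err := ms.FromCheckpointFiles("./var/checkpoints", from)
//	if err != nil {
//		cclog.Fatalf("[METRICSTORE]> restore failed: %s", err.Error())
//	}
//	cclog.Infof("[METRICSTORE]> restored %d checkpoint files", files)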
func checkFilesWithExtension(dir string, extension string) (bool, error) {
found := false
err := filepath.Walk(dir, func(path string, info os.FileInfo, err error) error {
if err != nil {
return fmt.Errorf("[METRICSTORE]> error accessing path %s: %v", path, err)
}
if !info.IsDir() && filepath.Ext(info.Name()) == "."+extension {
found = true
return nil
}
return nil
})
if err != nil {
return false, fmt.Errorf("[METRICSTORE]> error walking through directories: %s", err)
}
return found, nil
}
func (l *Level) loadAvroFile(m *MemoryStore, f *os.File, from int64) error {
br := bufio.NewReader(f)
fileName := f.Name()[strings.LastIndex(f.Name(), "/")+1:]
resolution, err := strconv.ParseInt(fileName[0:strings.Index(fileName, "_")], 10, 64)
if err != nil {
return fmt.Errorf("[METRICSTORE]> error while reading avro file (resolution parsing) : %s", err)
}
fromTimestamp, err := strconv.ParseInt(fileName[strings.Index(fileName, "_")+1:len(fileName)-5], 10, 64)
// Same logic according to lineprotocol
fromTimestamp -= (resolution / 2)
if err != nil {
return fmt.Errorf("[METRICSTORE]> error converting timestamp from the avro file : %s", err)
}
// fmt.Printf("File : %s with resolution : %d\n", fileName, resolution)
var recordCounter int64 = 0
// Create a new OCF reader from the buffered reader
ocfReader, err := goavro.NewOCFReader(br)
if err != nil {
return fmt.Errorf("[METRICSTORE]> error creating OCF reader: %w", err)
}
metricsData := make(map[string]schema.FloatArray)
for ocfReader.Scan() {
datum, err := ocfReader.Read()
if err != nil {
return fmt.Errorf("[METRICSTORE]> error while reading avro file : %s", err)
}
record, ok := datum.(map[string]any)
if !ok {
return fmt.Errorf("[METRICSTORE]> failed to assert datum as map[string]interface{}")
}
for key, value := range record {
metricsData[key] = append(metricsData[key], schema.ConvertToFloat(value.(float64)))
}
recordCounter += 1
}
to := (fromTimestamp + (recordCounter / (60 / resolution) * 60))
if to < from {
return nil
}
for key, floatArray := range metricsData {
metricName := ReplaceKey(key)
if strings.Contains(metricName, SelectorDelimiter) {
subString := strings.Split(metricName, SelectorDelimiter)
lvl := l
for i := 0; i < len(subString)-1; i++ {
sel := subString[i]
if lvl.children == nil {
lvl.children = make(map[string]*Level)
}
child, ok := lvl.children[sel]
if !ok {
child = &Level{
metrics: make([]*buffer, len(m.Metrics)),
children: nil,
}
lvl.children[sel] = child
}
lvl = child
}
leafMetricName := subString[len(subString)-1]
err = lvl.createBuffer(m, leafMetricName, floatArray, fromTimestamp, resolution)
if err != nil {
return fmt.Errorf("[METRICSTORE]> error while creating buffers from avroReader : %s", err)
}
} else {
err = l.createBuffer(m, metricName, floatArray, fromTimestamp, resolution)
if err != nil {
return fmt.Errorf("[METRICSTORE]> error while creating buffers from avroReader : %s", err)
}
}
}
return nil
}
func (l *Level) createBuffer(m *MemoryStore, metricName string, floatArray schema.FloatArray, from int64, resolution int64) error {
n := len(floatArray)
b := &buffer{
frequency: resolution,
start: from,
data: floatArray[0:n:n],
prev: nil,
next: nil,
archived: true,
}
minfo, ok := m.Metrics[metricName]
if !ok {
return nil
}
prev := l.metrics[minfo.offset]
if prev == nil {
l.metrics[minfo.offset] = b
} else {
if prev.start > b.start {
return fmt.Errorf("[METRICSTORE]> buffer start time %d is before previous buffer start %d", b.start, prev.start)
}
b.prev = prev
prev.next = b
missingCount := ((int(b.start) - int(prev.start)) - len(prev.data)*int(b.frequency))
if missingCount > 0 {
missingCount /= int(b.frequency)
for range missingCount {
prev.data = append(prev.data, schema.NaN)
}
prev.data = prev.data[0:len(prev.data):len(prev.data)]
}
}
l.metrics[minfo.offset] = b
return nil
}
func (l *Level) loadJSONFile(m *MemoryStore, f *os.File, from int64) error {
br := bufio.NewReader(f)
cf := &CheckpointFile{}
if err := json.NewDecoder(br).Decode(cf); err != nil {
return err
}
if cf.To != 0 && cf.To < from {
return nil
}
if err := l.loadFile(cf, m); err != nil {
return err
}
return nil
}
func (l *Level) loadFile(cf *CheckpointFile, m *MemoryStore) error {
for name, metric := range cf.Metrics {
n := len(metric.Data)
b := &buffer{
frequency: metric.Frequency,
start: metric.Start,
data: metric.Data[0:n:n],
prev: nil,
next: nil,
archived: true,
}
minfo, ok := m.Metrics[name]
if !ok {
continue
}
prev := l.metrics[minfo.offset]
if prev == nil {
l.metrics[minfo.offset] = b
} else {
if prev.start > b.start {
return fmt.Errorf("[METRICSTORE]> buffer start time %d is before previous buffer start %d", b.start, prev.start)
}
b.prev = prev
prev.next = b
}
l.metrics[minfo.offset] = b
}
if len(cf.Children) > 0 && l.children == nil {
l.children = make(map[string]*Level)
}
for sel, childCf := range cf.Children {
child, ok := l.children[sel]
if !ok {
child = &Level{
metrics: make([]*buffer, len(m.Metrics)),
children: nil,
}
l.children[sel] = child
}
if err := child.loadFile(childCf, m); err != nil {
return err
}
}
return nil
}
func (l *Level) fromCheckpoint(m *MemoryStore, dir string, from int64, extension string) (int, error) {
direntries, err := os.ReadDir(dir)
if err != nil {
if os.IsNotExist(err) {
return 0, nil
}
return 0, err
}
allFiles := make([]fs.DirEntry, 0)
filesLoaded := 0
for _, e := range direntries {
if e.IsDir() {
child := &Level{
metrics: make([]*buffer, len(m.Metrics)),
children: make(map[string]*Level),
}
files, err := child.fromCheckpoint(m, path.Join(dir, e.Name()), from, extension)
filesLoaded += files
if err != nil {
return filesLoaded, err
}
l.children[e.Name()] = child
} else if strings.HasSuffix(e.Name(), "."+extension) {
allFiles = append(allFiles, e)
} else {
continue
}
}
files, err := findFiles(allFiles, from, extension, true)
if err != nil {
return filesLoaded, err
}
loaders := map[string]func(*MemoryStore, *os.File, int64) error{
"json": l.loadJSONFile,
"avro": l.loadAvroFile,
}
loader := loaders[extension]
for _, filename := range files {
// Use a closure to ensure file is closed immediately after use
err := func() error {
f, err := os.Open(path.Join(dir, filename))
if err != nil {
return err
}
defer f.Close()
return loader(m, f, from)
}()
if err != nil {
return filesLoaded, err
}
filesLoaded += 1
}
return filesLoaded, nil
}
// This will probably get very slow over time!
// A solution could be some sort of an index file in which all other files
// and the timespans they contain are listed.
func findFiles(direntries []fs.DirEntry, t int64, extension string, findMoreRecentFiles bool) ([]string, error) {
nums := map[string]int64{}
for _, e := range direntries {
if !strings.HasSuffix(e.Name(), "."+extension) {
continue
}
ts, err := strconv.ParseInt(e.Name()[strings.Index(e.Name(), "_")+1:len(e.Name())-5], 10, 64)
if err != nil {
return nil, err
}
nums[e.Name()] = ts
}
sort.Slice(direntries, func(i, j int) bool {
a, b := direntries[i], direntries[j]
return nums[a.Name()] < nums[b.Name()]
})
filenames := make([]string, 0)
for i := range direntries {
e := direntries[i]
ts1 := nums[e.Name()]
if findMoreRecentFiles && t <= ts1 {
filenames = append(filenames, e.Name())
}
if i == len(direntries)-1 {
continue
}
enext := direntries[i+1]
ts2 := nums[enext.Name()]
if findMoreRecentFiles {
if ts1 < t && t < ts2 {
filenames = append(filenames, e.Name())
}
} else {
if ts2 < t {
filenames = append(filenames, e.Name())
}
}
}
return filenames, nil
}

View File

@@ -1,115 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package memorystore
import (
"fmt"
"time"
)
const (
DefaultMaxWorkers = 10
DefaultBufferCapacity = 512
DefaultGCTriggerInterval = 100
DefaultAvroWorkers = 4
DefaultCheckpointBufferMin = 3
DefaultAvroCheckpointInterval = time.Minute
)
type MetricStoreConfig struct {
// Number of concurrent workers for checkpoint and archive operations.
// If not set or 0, defaults to min(runtime.NumCPU()/2+1, 10)
NumWorkers int `json:"num-workers"`
Checkpoints struct {
FileFormat string `json:"file-format"`
Interval string `json:"interval"`
RootDir string `json:"directory"`
Restore string `json:"restore"`
} `json:"checkpoints"`
Debug struct {
DumpToFile string `json:"dump-to-file"`
EnableGops bool `json:"gops"`
} `json:"debug"`
RetentionInMemory string `json:"retention-in-memory"`
Archive struct {
Interval string `json:"interval"`
RootDir string `json:"directory"`
DeleteInstead bool `json:"delete-instead"`
} `json:"archive"`
Subscriptions []struct {
// Channel name
SubscribeTo string `json:"subscribe-to"`
// Allow lines without a cluster tag, use this as default, optional
ClusterTag string `json:"cluster-tag"`
} `json:"subscriptions"`
}
var Keys MetricStoreConfig
// AggregationStrategy for aggregation over multiple values at different cpus/sockets/..., not time!
type AggregationStrategy int
const (
NoAggregation AggregationStrategy = iota
SumAggregation
AvgAggregation
)
func AssignAggregationStrategy(str string) (AggregationStrategy, error) {
switch str {
case "":
return NoAggregation, nil
case "sum":
return SumAggregation, nil
case "avg":
return AvgAggregation, nil
default:
return NoAggregation, fmt.Errorf("[METRICSTORE]> unknown aggregation strategy: %s", str)
}
}
type MetricConfig struct {
// Interval in seconds at which measurements are stored
Frequency int64
// Can be 'sum', 'avg' or null. Describes how to aggregate metrics from the same timestep over the hierarchy.
Aggregation AggregationStrategy
// Private, used internally...
offset int
}
var Metrics map[string]MetricConfig
func GetMetricFrequency(metricName string) (int64, error) {
if metric, ok := Metrics[metricName]; ok {
return metric.Frequency, nil
}
return 0, fmt.Errorf("[METRICSTORE]> metric %s not found", metricName)
}
// AddMetric registers a metric configuration under the given name. If the
// metric already exists, its frequency is raised to the maximum of the old
// and new values; otherwise the metric is added to the Metrics map.
func AddMetric(name string, metric MetricConfig) error {
if Metrics == nil {
Metrics = make(map[string]MetricConfig, 0)
}
if existingMetric, ok := Metrics[name]; ok {
if existingMetric.Frequency != metric.Frequency {
if existingMetric.Frequency < metric.Frequency {
existingMetric.Frequency = metric.Frequency
Metrics[name] = existingMetric
}
}
} else {
Metrics[name] = metric
}
return nil
}

View File

@@ -1,95 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package memorystore
const configSchema = `{
"type": "object",
"description": "Configuration specific to built-in metric-store.",
"properties": {
"checkpoints": {
"description": "Configuration for checkpointing the metrics within metric-store",
"type": "object",
"properties": {
"file-format": {
"description": "Specify the type of checkpoint file. There are 2 variants: 'avro' and 'json'. If nothing is specified, 'avro' is default.",
"type": "string"
},
"interval": {
"description": "Interval at which the metrics should be checkpointed.",
"type": "string"
},
"directory": {
"description": "Specify the parent directy in which the checkpointed files should be placed.",
"type": "string"
},
"restore": {
"description": "When cc-backend starts up, look for checkpointed files that are less than X hours old and load metrics from these selected checkpoint files.",
"type": "string"
}
}
},
"archive": {
"description": "Configuration for archiving the already checkpointed files.",
"type": "object",
"properties": {
"interval": {
"description": "Interval at which the checkpointed files should be archived.",
"type": "string"
},
"directory": {
"description": "Specify the parent directy in which the archived files should be placed.",
"type": "string"
}
}
},
"retention-in-memory": {
"description": "Keep the metrics within memory for given time interval. Retention for X hours, then the metrics would be freed.",
"type": "string"
},
"nats": {
"description": "Configuration for accepting published data through NATS.",
"type": "array",
"items": {
"type": "object",
"properties": {
"address": {
"description": "Address of the NATS server.",
"type": "string"
},
"username": {
"description": "Optional: If configured with username/password method.",
"type": "string"
},
"password": {
"description": "Optional: If configured with username/password method.",
"type": "string"
},
"creds-file-path": {
"description": "Optional: If configured with Credential File method. Path to your NATS cred file.",
"type": "string"
},
"subscriptions": {
"description": "Array of various subscriptions. Allows to subscibe to different subjects and publishers.",
"type": "array",
"items": {
"type": "object",
"properties": {
"subscribe-to": {
"description": "Channel name",
"type": "string"
},
"cluster-tag": {
"description": "Optional: Allow lines without a cluster tag, use this as default",
"type": "string"
}
}
}
}
}
}
}
}
}`
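// Illustrative configuration fragment matching the schema above (all values
// are assumptions, not shipped defaults); it decodes into the
// MetricStoreConfig struct from config.go:
//
//	{
//	  "checkpoints": {
//	    "file-format": "avro",
//	    "interval": "1h",
//	    "directory": "./var/checkpoints",
//	    "restore": "48h"
//	  },
//	  "archive": { "interval": "24h", "directory": "./var/archive" },
//	  "retention-in-memory": "48h"
//	}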

View File

@@ -1,112 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package memorystore
import (
"bufio"
"fmt"
"strconv"
)
func (b *buffer) debugDump(buf []byte) []byte {
if b.prev != nil {
buf = b.prev.debugDump(buf)
}
start, len, end := b.start, len(b.data), b.start+b.frequency*int64(len(b.data))
buf = append(buf, `{"start":`...)
buf = strconv.AppendInt(buf, start, 10)
buf = append(buf, `,"len":`...)
buf = strconv.AppendInt(buf, int64(len), 10)
buf = append(buf, `,"end":`...)
buf = strconv.AppendInt(buf, end, 10)
if b.archived {
buf = append(buf, `,"saved":true`...)
}
if b.next != nil {
buf = append(buf, `},`...)
} else {
buf = append(buf, `}`...)
}
return buf
}
func (l *Level) debugDump(m *MemoryStore, w *bufio.Writer, lvlname string, buf []byte, depth int) ([]byte, error) {
l.lock.RLock()
defer l.lock.RUnlock()
for i := 0; i < depth; i++ {
buf = append(buf, '\t')
}
buf = append(buf, '"')
buf = append(buf, lvlname...)
buf = append(buf, "\":{\n"...)
depth += 1
objitems := 0
for name, mc := range m.Metrics {
if b := l.metrics[mc.offset]; b != nil {
for i := 0; i < depth; i++ {
buf = append(buf, '\t')
}
buf = append(buf, '"')
buf = append(buf, name...)
buf = append(buf, `":[`...)
buf = b.debugDump(buf)
buf = append(buf, "],\n"...)
objitems++
}
}
for name, lvl := range l.children {
_, err := w.Write(buf)
if err != nil {
return nil, err
}
buf = buf[0:0]
buf, err = lvl.debugDump(m, w, name, buf, depth)
if err != nil {
return nil, err
}
buf = append(buf, ',', '\n')
objitems++
}
// remove final `,`:
if objitems > 0 {
buf = append(buf[0:len(buf)-1], '\n')
}
depth -= 1
for i := 0; i < depth; i++ {
buf = append(buf, '\t')
}
buf = append(buf, '}')
return buf, nil
}
func (m *MemoryStore) DebugDump(w *bufio.Writer, selector []string) error {
lvl := m.root.findLevel(selector)
if lvl == nil {
return fmt.Errorf("[METRICSTORE]> not found: %#v", selector)
}
buf := make([]byte, 0, 2048)
buf = append(buf, "{"...)
buf, err := lvl.debugDump(m, w, "data", buf, 0)
if err != nil {
return err
}
buf = append(buf, "}\n"...)
if _, err = w.Write(buf); err != nil {
return err
}
return w.Flush()
}

View File

@@ -1,92 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package memorystore
import (
"bufio"
"fmt"
"time"
)
// MaxMissingDataPoints is the number of missing data points up to which a metric is still
// considered healthy: if a node misses its last 5 data points, the healthCheck endpoint still
// reports it as healthy; anything beyond that marks the metric as stale.
const MaxMissingDataPoints int64 = 5
// MaxUnhealthyMetrics is the number of stale metrics up to which a node is still considered
// healthy. Works with MaxMissingDataPoints: as long as fewer than MaxUnhealthyMetrics metrics
// (including submetrics) have stopped receiving data for MaxMissingDataPoints points, the node
// is reported healthy; at or above this count it is reported unhealthy.
const MaxUnhealthyMetrics int64 = 5
func (b *buffer) healthCheck() int64 {
// Check if the buffer is empty
if b.data == nil {
return 1
}
bufferEnd := b.start + b.frequency*int64(len(b.data))
t := time.Now().Unix()
// Check if the buffer is too old
if t-bufferEnd > MaxMissingDataPoints*b.frequency {
return 1
}
return 0
}
func (l *Level) healthCheck(m *MemoryStore, count int64) (int64, error) {
l.lock.RLock()
defer l.lock.RUnlock()
for _, mc := range m.Metrics {
if b := l.metrics[mc.offset]; b != nil {
count += b.healthCheck()
}
}
for _, lvl := range l.children {
c, err := lvl.healthCheck(m, 0)
if err != nil {
return 0, err
}
count += c
}
return count, nil
}
func (m *MemoryStore) HealthCheck(w *bufio.Writer, selector []string) error {
lvl := m.root.findLevel(selector)
if lvl == nil {
return fmt.Errorf("[METRICSTORE]> not found: %#v", selector)
}
buf := make([]byte, 0, 25)
// buf = append(buf, "{"...)
var count int64 = 0
unhealthyMetricsCount, err := lvl.healthCheck(m, count)
if err != nil {
return err
}
if unhealthyMetricsCount < MaxUnhealthyMetrics {
buf = append(buf, "Healthy"...)
} else {
buf = append(buf, "Unhealthy"...)
}
// buf = append(buf, "}\n"...)
if _, err = w.Write(buf); err != nil {
return err
}
return w.Flush()
}
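// Illustrative usage (not from the original source; handler and selector
// values are assumptions): report the health verdict for one host over HTTP.
//
//	func healthHandler(w http.ResponseWriter, r *http.Request) {
//		ms := GetMemoryStore()
//		bw := bufio.NewWriter(w)
//		if err := ms.HealthCheck(bw, []string{"emmy", "host123"}); err != nil {
//			http.Error(w, err.Error(), http.StatusNotFound)
//		}
//	}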

View File

@@ -1,192 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package memorystore
import (
"sync"
"unsafe"
"github.com/ClusterCockpit/cc-lib/util"
)
// Could also be called "node" as this forms a node in a tree structure.
// Called Level because "node" might be confusing here.
// Can be either a leaf or an inner node. In this tree structure, inner nodes can
// also hold data (in `metrics`).
type Level struct {
children map[string]*Level
metrics []*buffer
lock sync.RWMutex
}
// Find the correct level for the given selector, creating it if
// it does not exist. Example selector in the context of the
// ClusterCockpit could be: []string{ "emmy", "host123", "cpu0" }.
// This function would probably benefit a lot from `level.children` being a `sync.Map`?
func (l *Level) findLevelOrCreate(selector []string, nMetrics int) *Level {
if len(selector) == 0 {
return l
}
// Allow concurrent reads:
l.lock.RLock()
var child *Level
var ok bool
if l.children == nil {
// Children map needs to be created...
l.lock.RUnlock()
} else {
child, ok = l.children[selector[0]]
l.lock.RUnlock()
if ok {
return child.findLevelOrCreate(selector[1:], nMetrics)
}
}
// The level does not exist, take write lock for unique access:
l.lock.Lock()
// While this thread waited for the write lock, another thread
// could have created the child node.
if l.children != nil {
child, ok = l.children[selector[0]]
if ok {
l.lock.Unlock()
return child.findLevelOrCreate(selector[1:], nMetrics)
}
}
child = &Level{
metrics: make([]*buffer, nMetrics),
children: nil,
}
if l.children != nil {
l.children[selector[0]] = child
} else {
l.children = map[string]*Level{selector[0]: child}
}
l.lock.Unlock()
return child.findLevelOrCreate(selector[1:], nMetrics)
}
func (l *Level) free(t int64) (int, error) {
l.lock.Lock()
defer l.lock.Unlock()
n := 0
for i, b := range l.metrics {
if b != nil {
delme, m := b.free(t)
n += m
if delme {
if cap(b.data) == BufferCap {
bufferPool.Put(b)
}
l.metrics[i] = nil
}
}
}
for _, child := range l.children {
m, err := child.free(t)
n += m
if err != nil {
return n, err
}
}
return n, nil
}
func (l *Level) sizeInBytes() int64 {
l.lock.RLock()
defer l.lock.RUnlock()
size := int64(0)
for _, b := range l.metrics {
if b != nil {
size += b.count() * int64(unsafe.Sizeof(util.Float(0)))
}
}
for _, child := range l.children {
size += child.sizeInBytes()
}
return size
}
func (l *Level) findLevel(selector []string) *Level {
if len(selector) == 0 {
return l
}
l.lock.RLock()
defer l.lock.RUnlock()
lvl := l.children[selector[0]]
if lvl == nil {
return nil
}
return lvl.findLevel(selector[1:])
}
func (l *Level) findBuffers(selector util.Selector, offset int, f func(b *buffer) error) error {
l.lock.RLock()
defer l.lock.RUnlock()
if len(selector) == 0 {
b := l.metrics[offset]
if b != nil {
return f(b)
}
for _, lvl := range l.children {
err := lvl.findBuffers(nil, offset, f)
if err != nil {
return err
}
}
return nil
}
sel := selector[0]
if len(sel.String) != 0 && l.children != nil {
lvl, ok := l.children[sel.String]
if ok {
err := lvl.findBuffers(selector[1:], offset, f)
if err != nil {
return err
}
}
return nil
}
if sel.Group != nil && l.children != nil {
for _, key := range sel.Group {
lvl, ok := l.children[key]
if ok {
err := lvl.findBuffers(selector[1:], offset, f)
if err != nil {
return err
}
}
}
return nil
}
if sel.Any && l.children != nil {
for _, lvl := range l.children {
if err := lvl.findBuffers(selector[1:], offset, f); err != nil {
return err
}
}
return nil
}
return nil
}
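// Illustrative selector (not from the original source; the element fields
// mirror the cases handled above and the cc-lib/util literal syntax is an
// assumption): pick host123 on cluster emmy, then fan out over all children.
//
//	sel := util.Selector{
//		{String: "emmy"},
//		{String: "host123"},
//		{Any: true},
//	}
//	err := lvl.findBuffers(sel, offset, func(b *buffer) error {
//		return nil // inspect or aggregate b here
//	})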

View File

@@ -1,258 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package memorystore
import (
"context"
"fmt"
"sync"
"time"
"github.com/ClusterCockpit/cc-backend/pkg/nats"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/schema"
"github.com/influxdata/line-protocol/v2/lineprotocol"
)
func ReceiveNats(ms *MemoryStore,
workers int,
ctx context.Context,
) error {
nc := nats.GetClient()
if nc == nil {
cclog.Warn("NATS client not initialized")
return nil
}
var wg sync.WaitGroup
channels := make([]chan []byte, 0, len(Keys.Subscriptions))
for _, sc := range Keys.Subscriptions {
clusterTag := sc.ClusterTag
if workers > 1 {
// One channel per subscription, so every message is decoded with the
// cluster tag of the subscription it arrived on.
msgs := make(chan []byte, workers*2)
channels = append(channels, msgs)
wg.Add(workers)
for range workers {
go func() {
defer wg.Done()
for m := range msgs {
dec := lineprotocol.NewDecoderWithBytes(m)
if err := DecodeLine(dec, ms, clusterTag); err != nil {
cclog.Errorf("error: %s", err.Error())
}
}
}()
}
nc.Subscribe(sc.SubscribeTo, func(subject string, data []byte) {
msgs <- data
})
} else {
nc.Subscribe(sc.SubscribeTo, func(subject string, data []byte) {
dec := lineprotocol.NewDecoderWithBytes(data)
if err := DecodeLine(dec, ms, clusterTag); err != nil {
cclog.Errorf("error: %s", err.Error())
}
})
}
cclog.Infof("NATS subscription to '%s' established", sc.SubscribeTo)
}
// Close the worker channels only once shutdown is signalled; closing them
// right away would panic as soon as a subscription delivers the next message.
go func() {
<-ctx.Done()
for _, msgs := range channels {
close(msgs)
}
}()
wg.Wait()
return nil
}
// Place `prefix` in front of `buf` but if possible,
// do that inplace in `buf`.
func reorder(buf, prefix []byte) []byte {
n := len(prefix)
m := len(buf)
if cap(buf) < m+n {
return append(prefix[:n:n], buf...)
} else {
buf = buf[:n+m]
for i := m - 1; i >= 0; i-- {
buf[i+n] = buf[i]
}
for i := range n {
buf[i] = prefix[i]
}
return buf
}
}
// Decode lines using dec and make write calls to the MemoryStore.
// If a line is missing its cluster tag, use clusterDefault as default.
func DecodeLine(dec *lineprotocol.Decoder,
ms *MemoryStore,
clusterDefault string,
) error {
// Reduce allocations in loop:
t := time.Now()
metric, metricBuf := Metric{}, make([]byte, 0, 16)
selector := make([]string, 0, 4)
typeBuf, subTypeBuf := make([]byte, 0, 16), make([]byte, 0)
// Optimize for the case where all lines in a "batch" are about the same
// cluster and host. By using `WriteToLevel` (level = host), we do not need
// to take the root- and cluster-level lock as often.
var lvl *Level = nil
prevCluster, prevHost := "", ""
var ok bool
for dec.Next() {
rawmeasurement, err := dec.Measurement()
if err != nil {
return err
}
// Needs to be copied because another call to dec.* would
// invalidate the returned slice.
metricBuf = append(metricBuf[:0], rawmeasurement...)
// The go compiler optimizes map[string(byteslice)] lookups:
metric.MetricConfig, ok = ms.Metrics[string(rawmeasurement)]
if !ok {
continue
}
typeBuf, subTypeBuf := typeBuf[:0], subTypeBuf[:0]
cluster, host := clusterDefault, ""
for {
key, val, err := dec.NextTag()
if err != nil {
return err
}
if key == nil {
break
}
// The go compiler optimizes string([]byte{...}) == "...":
switch string(key) {
case "cluster":
if string(val) == prevCluster {
cluster = prevCluster
} else {
cluster = string(val)
lvl = nil
}
case "hostname", "host":
if string(val) == prevHost {
host = prevHost
} else {
host = string(val)
lvl = nil
}
case "type":
if string(val) == "node" {
break
}
// We cannot be sure that the "type" tag comes before the "type-id" tag:
if len(typeBuf) == 0 {
typeBuf = append(typeBuf, val...)
} else {
typeBuf = reorder(typeBuf, val)
}
case "type-id":
typeBuf = append(typeBuf, val...)
case "subtype":
// We cannot be sure that the "subtype" tag comes before the "stype-id" tag:
if len(subTypeBuf) == 0 {
subTypeBuf = append(subTypeBuf, val...)
} else {
subTypeBuf = reorder(subTypeBuf, val)
}
case "stype-id":
subTypeBuf = append(subTypeBuf, val...)
default:
}
}
// If the cluster or host changed, the lvl was set to nil
if lvl == nil {
selector = selector[:2]
selector[0], selector[1] = cluster, host
lvl = ms.GetLevel(selector)
prevCluster, prevHost = cluster, host
}
// subtypes:
selector = selector[:0]
if len(typeBuf) > 0 {
selector = append(selector, string(typeBuf)) // <- Allocation :(
if len(subTypeBuf) > 0 {
selector = append(selector, string(subTypeBuf))
}
}
for {
key, val, err := dec.NextField()
if err != nil {
return err
}
if key == nil {
break
}
if string(key) != "value" {
return fmt.Errorf("host %s: unknown field: '%s' (value: %#v)", host, string(key), val)
}
if val.Kind() == lineprotocol.Float {
metric.Value = schema.Float(val.FloatV())
} else if val.Kind() == lineprotocol.Int {
metric.Value = schema.Float(val.IntV())
} else if val.Kind() == lineprotocol.Uint {
metric.Value = schema.Float(val.UintV())
} else {
return fmt.Errorf("host %s: unsupported value type in message: %s", host, val.Kind().String())
}
}
if t, err = dec.Time(lineprotocol.Second, t); err != nil {
t = time.Now()
if t, err = dec.Time(lineprotocol.Millisecond, t); err != nil {
t = time.Now()
if t, err = dec.Time(lineprotocol.Microsecond, t); err != nil {
t = time.Now()
if t, err = dec.Time(lineprotocol.Nanosecond, t); err != nil {
return fmt.Errorf("host %s: timestamp : %#v with error : %#v", host, t, err.Error())
}
}
}
}
// Every failing dec.Time path above has already returned, so err is nil here.
ts := t.Unix()
if Keys.Checkpoints.FileFormat != "json" {
LineProtocolMessages <- &AvroStruct{
MetricName: string(metricBuf),
Cluster: cluster,
Node: host,
Selector: append([]string{}, selector...),
Value: metric.Value,
Timestamp: ts,
}
}
if err := ms.WriteToLevel(lvl, selector, ts, []Metric{metric}); err != nil {
return err
}
}
return nil
}
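// Illustrative call (not from the original source; metric name, tags and
// timestamp are assumptions): decode one line-protocol message with second
// precision.
//
//	line := []byte("flops_any,cluster=emmy,hostname=host123,type=cpu,type-id=0 value=42.0 1700000000\n")
//	dec := lineprotocol.NewDecoderWithBytes(line)
//	if err := DecodeLine(dec, GetMemoryStore(), "emmy"); err != nil {
//		cclog.Errorf("decode failed: %s", err.Error())
//	}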

View File

@@ -1,429 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
// Package memorystore provides an efficient in-memory time-series metric storage system
// with support for hierarchical data organization, checkpointing, and archiving.
//
// The package organizes metrics in a tree structure (cluster → host → component) and
// provides concurrent read/write access to metric data with configurable aggregation strategies.
// Background goroutines handle periodic checkpointing (JSON or Avro format), archiving old data,
// and enforcing retention policies.
//
// Key features:
// - In-memory metric storage with configurable retention
// - Hierarchical data organization (selectors)
// - Concurrent checkpoint/archive workers
// - Support for sum and average aggregation
// - NATS integration for metric ingestion
package memorystore
import (
"bytes"
"context"
"encoding/json"
"errors"
"runtime"
"sync"
"time"
"github.com/ClusterCockpit/cc-backend/internal/config"
"github.com/ClusterCockpit/cc-backend/pkg/archive"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/resampler"
"github.com/ClusterCockpit/cc-lib/schema"
"github.com/ClusterCockpit/cc-lib/util"
)
var (
singleton sync.Once
msInstance *MemoryStore
// shutdownFunc stores the context cancellation function created in Init
// and is called during Shutdown to cancel all background goroutines
shutdownFunc context.CancelFunc
)
type Metric struct {
Name string
Value schema.Float
MetricConfig MetricConfig
}
type MemoryStore struct {
Metrics map[string]MetricConfig
root Level
}
func Init(rawConfig json.RawMessage, wg *sync.WaitGroup) {
startupTime := time.Now()
if rawConfig != nil {
config.Validate(configSchema, rawConfig)
dec := json.NewDecoder(bytes.NewReader(rawConfig))
// dec.DisallowUnknownFields()
if err := dec.Decode(&Keys); err != nil {
cclog.Abortf("[METRICSTORE]> Metric Store Config Init: Could not decode config file '%s'.\nError: %s\n", rawConfig, err.Error())
}
}
// Set NumWorkers from config or use default
if Keys.NumWorkers <= 0 {
Keys.NumWorkers = min(runtime.NumCPU()/2+1, DefaultMaxWorkers)
}
cclog.Debugf("[METRICSTORE]> Using %d workers for checkpoint/archive operations\n", Keys.NumWorkers)
// Helper function to add metric configuration
addMetricConfig := func(mc schema.MetricConfig) {
agg, err := AssignAggregationStrategy(mc.Aggregation)
if err != nil {
cclog.Warnf("Could not find aggregation strategy for metric config '%s': %s", mc.Name, err.Error())
}
AddMetric(mc.Name, MetricConfig{
Frequency: int64(mc.Timestep),
Aggregation: agg,
})
}
for _, c := range archive.Clusters {
for _, mc := range c.MetricConfig {
addMetricConfig(*mc)
}
for _, sc := range c.SubClusters {
for _, mc := range sc.MetricConfig {
addMetricConfig(mc)
}
}
}
// Initialize the singleton MemoryStore from the collected metric configurations
InitMetrics(Metrics)
ms := GetMemoryStore()
d, err := time.ParseDuration(Keys.Checkpoints.Restore)
if err != nil {
cclog.Fatal(err)
}
restoreFrom := startupTime.Add(-d)
cclog.Infof("[METRICSTORE]> Loading checkpoints newer than %s\n", restoreFrom.Format(time.RFC3339))
files, err := ms.FromCheckpointFiles(Keys.Checkpoints.RootDir, restoreFrom.Unix())
loadedData := ms.SizeInBytes() / 1024 / 1024 // In MB
if err != nil {
cclog.Fatalf("[METRICSTORE]> Loading checkpoints failed: %s\n", err.Error())
} else {
cclog.Infof("[METRICSTORE]> Checkpoints loaded (%d files, %d MB, that took %fs)\n", files, loadedData, time.Since(startupTime).Seconds())
}
// Try to use less memory by forcing a GC run here. The default GOGC
// value of 100 means that a GC is only triggered once the ratio of new
// allocations exceeds the previously active heap. Forcing a GC here
// sets that "previously active heap" to a minimum.
runtime.GC()
ctx, shutdown := context.WithCancel(context.Background())
wg.Add(4)
Retention(wg, ctx)
Checkpointing(wg, ctx)
Archiving(wg, ctx)
DataStaging(wg, ctx)
// Note: Signal handling has been removed from this function.
// The caller is responsible for handling shutdown signals and calling
// the shutdown() function when appropriate.
// Store the shutdown function for later use by Shutdown()
shutdownFunc = shutdown
err = ReceiveNats(ms, 1, ctx)
if err != nil {
cclog.Fatal(err)
}
}
// InitMetrics creates a new, initialized instance of a MemoryStore.
// Will panic if values in the metric configurations are invalid.
func InitMetrics(metrics map[string]MetricConfig) {
singleton.Do(func() {
offset := 0
for key, cfg := range metrics {
if cfg.Frequency == 0 {
panic("[METRICSTORE]> invalid frequency")
}
metrics[key] = MetricConfig{
Frequency: cfg.Frequency,
Aggregation: cfg.Aggregation,
offset: offset,
}
offset += 1
}
msInstance = &MemoryStore{
root: Level{
metrics: make([]*buffer, len(metrics)),
children: make(map[string]*Level),
},
Metrics: metrics,
}
})
}
func GetMemoryStore() *MemoryStore {
if msInstance == nil {
return nil
}
return msInstance
}
func Shutdown() {
// Check if memorystore was initialized
if msInstance == nil {
cclog.Debug("[METRICSTORE]> MemoryStore not initialized, skipping shutdown")
return
}
// Cancel the context to signal all background goroutines to stop
if shutdownFunc != nil {
shutdownFunc()
}
cclog.Infof("[METRICSTORE]> Writing to '%s'...\n", Keys.Checkpoints.RootDir)
var files int
var err error
ms := GetMemoryStore()
if Keys.Checkpoints.FileFormat == "json" {
files, err = ms.ToCheckpoint(Keys.Checkpoints.RootDir, lastCheckpoint.Unix(), time.Now().Unix())
} else {
files, err = GetAvroStore().ToCheckpoint(Keys.Checkpoints.RootDir, true)
close(LineProtocolMessages)
}
if err != nil {
cclog.Errorf("[METRICSTORE]> Writing checkpoint failed: %s\n", err.Error())
}
cclog.Infof("[METRICSTORE]> Done! (%d files written)\n", files)
}
func getName(m *MemoryStore, i int) string {
for key, val := range m.Metrics {
if val.offset == i {
return key
}
}
return ""
}
func Retention(wg *sync.WaitGroup, ctx context.Context) {
ms := GetMemoryStore()
go func() {
defer wg.Done()
d, err := time.ParseDuration(Keys.RetentionInMemory)
if err != nil {
cclog.Fatal(err)
}
if d <= 0 {
return
}
tickInterval := d / 2
if tickInterval <= 0 {
return
}
ticker := time.NewTicker(tickInterval)
defer ticker.Stop()
for {
select {
case <-ctx.Done():
return
case <-ticker.C:
t := time.Now().Add(-d)
cclog.Infof("[METRICSTORE]> start freeing buffers (older than %s)...\n", t.Format(time.RFC3339))
freed, err := ms.Free(nil, t.Unix())
if err != nil {
cclog.Errorf("[METRICSTORE]> freeing up buffers failed: %s\n", err.Error())
} else {
cclog.Infof("[METRICSTORE]> done: %d buffers freed\n", freed)
}
}
}
}()
}
// Write all values in `metrics` to the level specified by `selector` for time `ts`.
// Look at `findLevelOrCreate` for how selectors work.
func (m *MemoryStore) Write(selector []string, ts int64, metrics []Metric) error {
var ok bool
for i, metric := range metrics {
if metric.MetricConfig.Frequency == 0 {
metric.MetricConfig, ok = m.Metrics[metric.Name]
if !ok {
metric.MetricConfig.Frequency = 0
}
metrics[i] = metric
}
}
return m.WriteToLevel(&m.root, selector, ts, metrics)
}
func (m *MemoryStore) GetLevel(selector []string) *Level {
return m.root.findLevelOrCreate(selector, len(m.Metrics))
}
// WriteToLevel assumes that `minfo` in `metrics` is filled in
func (m *MemoryStore) WriteToLevel(l *Level, selector []string, ts int64, metrics []Metric) error {
l = l.findLevelOrCreate(selector, len(m.Metrics))
l.lock.Lock()
defer l.lock.Unlock()
for _, metric := range metrics {
if metric.MetricConfig.Frequency == 0 {
continue
}
b := l.metrics[metric.MetricConfig.offset]
if b == nil {
// First write to this metric and level
b = newBuffer(ts, metric.MetricConfig.Frequency)
l.metrics[metric.MetricConfig.offset] = b
}
nb, err := b.write(ts, metric.Value)
if err != nil {
return err
}
// Last write created a new buffer...
if b != nb {
l.metrics[metric.MetricConfig.offset] = nb
}
}
return nil
}
// Read returns all values for metric `metric` from `from` to `to` for the selected level(s).
// If the level does not hold the metric itself, the data will be aggregated recursively from the children.
// The second and third return value are the actual from/to for the data. Those can be different from
// the range asked for if no data was available.
func (m *MemoryStore) Read(selector util.Selector, metric string, from, to, resolution int64) ([]schema.Float, int64, int64, int64, error) {
if from > to {
return nil, 0, 0, 0, errors.New("[METRICSTORE]> invalid time range")
}
minfo, ok := m.Metrics[metric]
if !ok {
return nil, 0, 0, 0, errors.New("[METRICSTORE]> unknown metric: " + metric)
}
n, data := 0, make([]schema.Float, (to-from)/minfo.Frequency+1)
err := m.root.findBuffers(selector, minfo.offset, func(b *buffer) error {
cdata, cfrom, cto, err := b.read(from, to, data)
if err != nil {
return err
}
if n == 0 {
from, to = cfrom, cto
} else if from != cfrom || to != cto || len(data) != len(cdata) {
missingfront, missingback := int((from-cfrom)/minfo.Frequency), int((to-cto)/minfo.Frequency)
if missingfront != 0 {
return ErrDataDoesNotAlign
}
newlen := len(cdata) - missingback
if newlen < 1 {
return ErrDataDoesNotAlign
}
cdata = cdata[0:newlen]
if len(cdata) != len(data) {
return ErrDataDoesNotAlign
}
from, to = cfrom, cto
}
data = cdata
n += 1
return nil
})
if err != nil {
return nil, 0, 0, 0, err
} else if n == 0 {
return nil, 0, 0, 0, errors.New("[METRICSTORE]> metric or host not found")
} else if n > 1 {
if minfo.Aggregation == AvgAggregation {
normalize := 1. / schema.Float(n)
for i := 0; i < len(data); i++ {
data[i] *= normalize
}
} else if minfo.Aggregation != SumAggregation {
return nil, 0, 0, 0, errors.New("[METRICSTORE]> invalid aggregation")
}
}
data, resolution, err = resampler.LargestTriangleThreeBucket(data, minfo.Frequency, resolution)
if err != nil {
return nil, 0, 0, 0, err
}
return data, from, to, resolution, nil
}
// Free releases all buffers for the selected level and all its children that
// contain only values older than `t`.
func (m *MemoryStore) Free(selector []string, t int64) (int, error) {
return m.GetLevel(selector).free(t)
}
func (m *MemoryStore) FreeAll() error {
for k := range m.root.children {
delete(m.root.children, k)
}
return nil
}
func (m *MemoryStore) SizeInBytes() int64 {
return m.root.sizeInBytes()
}
// ListChildren , given a selector, returns a list of all children of the level
// selected.
func (m *MemoryStore) ListChildren(selector []string) []string {
lvl := &m.root
for lvl != nil && len(selector) != 0 {
lvl.lock.RLock()
next := lvl.children[selector[0]]
lvl.lock.RUnlock()
lvl = next
selector = selector[1:]
}
if lvl == nil {
return nil
}
lvl.lock.RLock()
defer lvl.lock.RUnlock()
children := make([]string, 0, len(lvl.children))
for child := range lvl.children {
children = append(children, child)
}
return children
}
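// Illustrative end-to-end sketch (not from the original source; metric name,
// selector and time range are assumptions):
//
//	InitMetrics(map[string]MetricConfig{
//		"flops_any": {Frequency: 60, Aggregation: AvgAggregation},
//	})
//	ms := GetMemoryStore()
//	now := time.Now().Unix()
//	_ = ms.Write([]string{"emmy", "host123"}, now, []Metric{
//		{Name: "flops_any", Value: 42.0},
//	})
//	data, from, to, res, err := ms.Read(
//		util.Selector{{String: "emmy"}, {String: "host123"}},
//		"flops_any", now-600, now, 60)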

View File

@@ -1,156 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package memorystore
import (
"testing"
"github.com/ClusterCockpit/cc-lib/schema"
)
func TestAssignAggregationStrategy(t *testing.T) {
tests := []struct {
name string
input string
expected AggregationStrategy
wantErr bool
}{
{"empty string", "", NoAggregation, false},
{"sum", "sum", SumAggregation, false},
{"avg", "avg", AvgAggregation, false},
{"invalid", "invalid", NoAggregation, true},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
result, err := AssignAggregationStrategy(tt.input)
if (err != nil) != tt.wantErr {
t.Errorf("AssignAggregationStrategy(%q) error = %v, wantErr %v", tt.input, err, tt.wantErr)
return
}
if result != tt.expected {
t.Errorf("AssignAggregationStrategy(%q) = %v, want %v", tt.input, result, tt.expected)
}
})
}
}
func TestAddMetric(t *testing.T) {
// Reset Metrics before test
Metrics = make(map[string]MetricConfig)
err := AddMetric("test_metric", MetricConfig{
Frequency: 60,
Aggregation: SumAggregation,
})
if err != nil {
t.Errorf("AddMetric() error = %v", err)
}
if _, ok := Metrics["test_metric"]; !ok {
t.Error("AddMetric() did not add metric to Metrics map")
}
// Test updating with higher frequency
err = AddMetric("test_metric", MetricConfig{
Frequency: 120,
Aggregation: SumAggregation,
})
if err != nil {
t.Errorf("AddMetric() error = %v", err)
}
if Metrics["test_metric"].Frequency != 120 {
t.Errorf("AddMetric() frequency = %d, want 120", Metrics["test_metric"].Frequency)
}
// Test updating with lower frequency (should not update)
err = AddMetric("test_metric", MetricConfig{
Frequency: 30,
Aggregation: SumAggregation,
})
if err != nil {
t.Errorf("AddMetric() error = %v", err)
}
if Metrics["test_metric"].Frequency != 120 {
t.Errorf("AddMetric() frequency = %d, want 120 (should not downgrade)", Metrics["test_metric"].Frequency)
}
}
func TestGetMetricFrequency(t *testing.T) {
// Reset Metrics before test
Metrics = map[string]MetricConfig{
"test_metric": {
Frequency: 60,
Aggregation: SumAggregation,
},
}
freq, err := GetMetricFrequency("test_metric")
if err != nil {
t.Errorf("GetMetricFrequency() error = %v", err)
}
if freq != 60 {
t.Errorf("GetMetricFrequency() = %d, want 60", freq)
}
_, err = GetMetricFrequency("nonexistent")
if err == nil {
t.Error("GetMetricFrequency() expected error for nonexistent metric")
}
}
func TestBufferWrite(t *testing.T) {
b := newBuffer(100, 10)
// Test writing value
nb, err := b.write(100, schema.Float(42.0))
if err != nil {
t.Errorf("buffer.write() error = %v", err)
}
if nb != b {
t.Error("buffer.write() created new buffer unexpectedly")
}
if len(b.data) != 1 {
t.Errorf("buffer.write() len(data) = %d, want 1", len(b.data))
}
if b.data[0] != schema.Float(42.0) {
t.Errorf("buffer.write() data[0] = %v, want 42.0", b.data[0])
}
// Test writing value from past (should error)
_, err = b.write(50, schema.Float(10.0))
if err == nil {
t.Error("buffer.write() expected error for past timestamp")
}
}
func TestBufferRead(t *testing.T) {
b := newBuffer(100, 10)
// Write some test data
b.write(100, schema.Float(1.0))
b.write(110, schema.Float(2.0))
b.write(120, schema.Float(3.0))
// Read data
data := make([]schema.Float, 3)
result, from, to, err := b.read(100, 130, data)
if err != nil {
t.Errorf("buffer.read() error = %v", err)
}
// Buffer read should return `from` as the timestamp of the first write
if from != 100 {
t.Errorf("buffer.read() from = %d, want 100", from)
}
if to != 130 {
t.Errorf("buffer.read() to = %d, want 130", to)
}
if len(result) != 3 {
t.Errorf("buffer.read() len(result) = %d, want 3", len(result))
}
}

File diff suppressed because it is too large

View File

@@ -1,124 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package memorystore
import (
"errors"
"math"
"github.com/ClusterCockpit/cc-lib/util"
)
type Stats struct {
Samples int
Avg util.Float
Min util.Float
Max util.Float
}
func (b *buffer) stats(from, to int64) (Stats, int64, int64, error) {
if from < b.start {
if b.prev != nil {
return b.prev.stats(from, to)
}
from = b.start
}
// TODO: Check if b.closed and if so and the full buffer is queried,
// use b.statistics instead of iterating over the buffer.
samples := 0
sum, min, max := 0.0, math.MaxFloat32, -math.MaxFloat32
var t int64
for t = from; t < to; t += b.frequency {
idx := int((t - b.start) / b.frequency)
if idx >= cap(b.data) {
b = b.next
if b == nil {
break
}
idx = 0
}
if t < b.start || idx >= len(b.data) {
continue
}
xf := float64(b.data[idx])
if math.IsNaN(xf) {
continue
}
samples += 1
sum += xf
min = math.Min(min, xf)
max = math.Max(max, xf)
}
return Stats{
Samples: samples,
Avg: util.Float(sum) / util.Float(samples),
Min: util.Float(min),
Max: util.Float(max),
}, from, t, nil
}
// Returns statistics for the requested metric on the selected node/level.
// Data is aggregated to the selected level the same way as in `MemoryStore.Read`.
// If `Stats.Samples` is zero, the statistics should not be considered valid.
func (m *MemoryStore) Stats(selector util.Selector, metric string, from, to int64) (*Stats, int64, int64, error) {
if from > to {
return nil, 0, 0, errors.New("invalid time range")
}
minfo, ok := m.Metrics[metric]
if !ok {
return nil, 0, 0, errors.New("unknown metric: " + metric)
}
n, samples := 0, 0
avg, min, max := util.Float(0), math.MaxFloat32, -math.MaxFloat32
err := m.root.findBuffers(selector, minfo.offset, func(b *buffer) error {
stats, cfrom, cto, err := b.stats(from, to)
if err != nil {
return err
}
if n == 0 {
from, to = cfrom, cto
} else if from != cfrom || to != cto {
return ErrDataDoesNotAlign
}
samples += stats.Samples
avg += stats.Avg
min = math.Min(min, float64(stats.Min))
max = math.Max(max, float64(stats.Max))
n += 1
return nil
})
if err != nil {
return nil, 0, 0, err
}
if n == 0 {
return nil, 0, 0, ErrNoData
}
if minfo.Aggregation == AvgAggregation {
avg /= util.Float(n)
} else if n > 1 && minfo.Aggregation != SumAggregation {
return nil, 0, 0, errors.New("invalid aggregation")
}
return &Stats{
Samples: samples,
Avg: avg,
Min: util.Float(min),
Max: util.Float(max),
}, from, to, nil
}
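// Sketch of a typical call (added for clarity; variable names are
// assumptions): the same selector semantics as MemoryStore.Read apply, and
// Samples == 0 marks an invalid result per the contract above.
//
//	stats, from, to, err := ms.Stats(
//		util.Selector{{String: "fritz"}, {String: "f0001"}},
//		"mem_bw", jobStart, jobStart+int64(duration))
//	if err == nil && stats.Samples > 0 {
//		fmt.Printf("avg=%v min=%v max=%v over [%d,%d]\n",
//			stats.Avg, stats.Min, stats.Max, from, to)
//	}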

View File

@@ -0,0 +1,382 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package metricDataDispatcher
import (
"context"
"fmt"
"math"
"time"
"github.com/ClusterCockpit/cc-backend/internal/config"
"github.com/ClusterCockpit/cc-backend/internal/graph/model"
"github.com/ClusterCockpit/cc-backend/internal/metricdata"
"github.com/ClusterCockpit/cc-backend/pkg/archive"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/lrucache"
"github.com/ClusterCockpit/cc-lib/resampler"
"github.com/ClusterCockpit/cc-lib/schema"
)
var cache *lrucache.Cache = lrucache.New(128 * 1024 * 1024)
func cacheKey(
job *schema.Job,
metrics []string,
scopes []schema.MetricScope,
resolution int,
) string {
// Duration and StartTime do not need to be in the cache key as StartTime is less unique than
// job.ID and the TTL of the cache entry makes sure it does not stay there forever.
return fmt.Sprintf("%d(%s):[%v],[%v]-%d",
job.ID, job.State, metrics, scopes, resolution)
}
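// Worked example (hypothetical values): for job.ID 1337 in state "running",
// metrics ["flops_any" "mem_bw"], scopes ["node"] and resolution 600, the
// Sprintf above yields the key
//
//	1337(running):[[flops_any mem_bw]],[[node]]-600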
// Fetches the metric data for a job.
func LoadData(job *schema.Job,
metrics []string,
scopes []schema.MetricScope,
ctx context.Context,
resolution int,
) (schema.JobData, error) {
data := cache.Get(cacheKey(job, metrics, scopes, resolution), func() (_ interface{}, ttl time.Duration, size int) {
var jd schema.JobData
var err error
if job.State == schema.JobStateRunning ||
job.MonitoringStatus == schema.MonitoringStatusRunningOrArchiving ||
config.Keys.DisableArchive {
repo, err := metricdata.GetMetricDataRepo(job.Cluster)
if err != nil {
return fmt.Errorf("METRICDATA/METRICDATA > no metric data repository configured for '%s'", job.Cluster), 0, 0
}
if scopes == nil {
scopes = append(scopes, schema.MetricScopeNode)
}
if metrics == nil {
cluster := archive.GetCluster(job.Cluster)
for _, mc := range cluster.MetricConfig {
metrics = append(metrics, mc.Name)
}
}
jd, err = repo.LoadData(job, metrics, scopes, ctx, resolution)
if err != nil {
if len(jd) != 0 {
cclog.Warnf("partial error: %s", err.Error())
// return err, 0, 0 // Reactivating will block archiving on one partial error
} else {
cclog.Error("Error while loading job data from metric repository")
return err, 0, 0
}
}
size = jd.Size()
} else {
var jd_temp schema.JobData
jd_temp, err = archive.GetHandle().LoadJobData(job)
if err != nil {
cclog.Error("Error while loading job data from archive")
return err, 0, 0
}
// Deep copy the cached archive hashmap
jd = metricdata.DeepCopy(jd_temp)
// Resampling for archived data.
// Pass the resolution from frontend here.
for _, v := range jd {
for _, v_ := range v {
timestep := 0
for i := 0; i < len(v_.Series); i += 1 {
v_.Series[i].Data, timestep, err = resampler.LargestTriangleThreeBucket(v_.Series[i].Data, v_.Timestep, resolution)
if err != nil {
return err, 0, 0
}
}
v_.Timestep = timestep
}
}
// Avoid sending unrequested data to the client:
if metrics != nil || scopes != nil {
if metrics == nil {
metrics = make([]string, 0, len(jd))
for k := range jd {
metrics = append(metrics, k)
}
}
res := schema.JobData{}
for _, metric := range metrics {
if perscope, ok := jd[metric]; ok {
if len(perscope) > 1 {
subset := make(map[schema.MetricScope]*schema.JobMetric)
for _, scope := range scopes {
if jm, ok := perscope[scope]; ok {
subset[scope] = jm
}
}
if len(subset) > 0 {
perscope = subset
}
}
res[metric] = perscope
}
}
jd = res
}
size = jd.Size()
}
ttl = 5 * time.Hour
if job.State == schema.JobStateRunning {
ttl = 2 * time.Minute
}
// FIXME: Review: Is this really necessary or correct?
// Note: Lines 147-170 formerly known as prepareJobData(jobData, scopes)
// For /monitoring/job/<job> and some other places, flops_any and mem_bw need
// to be available at the scope 'node'. If a job has a lot of nodes,
// statisticsSeries should be available so that a min/median/max Graph can be
// used instead of a lot of single lines.
// NOTE: New StatsSeries will always be calculated as 'min/median/max'
// Existing (archived) StatsSeries can be 'min/mean/max'!
const maxSeriesSize int = 15
for _, scopes := range jd {
for _, jm := range scopes {
if jm.StatisticsSeries != nil || len(jm.Series) <= maxSeriesSize {
continue
}
jm.AddStatisticsSeries()
}
}
nodeScopeRequested := false
for _, scope := range scopes {
if scope == schema.MetricScopeNode {
nodeScopeRequested = true
}
}
if nodeScopeRequested {
jd.AddNodeScope("flops_any")
jd.AddNodeScope("mem_bw")
}
// Round Resulting Stat Values
jd.RoundMetricStats()
return jd, ttl, size
})
if err, ok := data.(error); ok {
cclog.Error("Error in returned dataset")
return nil, err
}
return data.(schema.JobData), nil
}
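// Caller sketch (added commentary; variable names are assumptions): this is
// roughly how a resolver fetches job metric data through the dispatcher,
// which transparently picks the metric repository or the archive based on
// job state and serves repeated requests from the LRU cache.
//
//	jd, err := metricDataDispatcher.LoadData(job,
//		[]string{"flops_any", "mem_bw"},
//		[]schema.MetricScope{schema.MetricScopeNode},
//		ctx, 600)
//	if err != nil {
//		return nil, err
//	}
//	nodeData := jd["flops_any"][schema.MetricScopeNode]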
// Used for the jobsFootprint GraphQL-Query. TODO: Rename/Generalize.
func LoadAverages(
job *schema.Job,
metrics []string,
data [][]schema.Float,
ctx context.Context,
) error {
if job.State != schema.JobStateRunning && !config.Keys.DisableArchive {
return archive.LoadAveragesFromArchive(job, metrics, data) // #166 change also here?
}
repo, err := metricdata.GetMetricDataRepo(job.Cluster)
if err != nil {
return fmt.Errorf("METRICDATA/METRICDATA > no metric data repository configured for '%s'", job.Cluster)
}
stats, err := repo.LoadStats(job, metrics, ctx) // #166 how to handle stats for acc normalization?
if err != nil {
cclog.Errorf("Error while loading statistics for job %v (User %v, Project %v)", job.JobID, job.User, job.Project)
return err
}
for i, m := range metrics {
nodes, ok := stats[m]
if !ok {
data[i] = append(data[i], schema.NaN)
continue
}
sum := 0.0
for _, node := range nodes {
sum += node.Avg
}
data[i] = append(data[i], schema.Float(sum))
}
return nil
}
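// Minimal caller sketch (added; setup assumed): the jobsFootprint resolver
// collects one appended value per metric and job, so data is indexed
// [metric][job].
//
//	data := make([][]schema.Float, len(metrics))
//	for _, job := range jobs {
//		if err := metricDataDispatcher.LoadAverages(job, metrics, data, ctx); err != nil {
//			return err
//		}
//	}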
// Used for statsTable in frontend: returns scoped statistics by metric.
func LoadScopedJobStats(
job *schema.Job,
metrics []string,
scopes []schema.MetricScope,
ctx context.Context,
) (schema.ScopedJobStats, error) {
if job.State != schema.JobStateRunning && !config.Keys.DisableArchive {
return archive.LoadScopedStatsFromArchive(job, metrics, scopes)
}
repo, err := metricdata.GetMetricDataRepo(job.Cluster)
if err != nil {
return nil, fmt.Errorf("job %d: no metric data repository configured for '%s'", job.JobID, job.Cluster)
}
scopedStats, err := repo.LoadScopedStats(job, metrics, scopes, ctx)
if err != nil {
cclog.Errorf("error while loading scoped statistics for job %d (User %s, Project %s)", job.JobID, job.User, job.Project)
return nil, err
}
return scopedStats, nil
}
// Used for polar plots in frontend: aggregates per-node statistics into single per-metric values for the job.
func LoadJobStats(
job *schema.Job,
metrics []string,
ctx context.Context,
) (map[string]schema.MetricStatistics, error) {
if job.State != schema.JobStateRunning && !config.Keys.DisableArchive {
return archive.LoadStatsFromArchive(job, metrics)
}
data := make(map[string]schema.MetricStatistics, len(metrics))
repo, err := metricdata.GetMetricDataRepo(job.Cluster)
if err != nil {
return data, fmt.Errorf("job %d: no metric data repository configured for '%s'", job.JobID, job.Cluster)
}
stats, err := repo.LoadStats(job, metrics, ctx)
if err != nil {
cclog.Errorf("error while loading statistics for job %d (User %s, Project %s)", job.JobID, job.User, job.Project)
return data, err
}
for _, m := range metrics {
sum, avg, min, max := 0.0, 0.0, 0.0, 0.0
nodes, ok := stats[m]
if !ok {
data[m] = schema.MetricStatistics{Min: min, Avg: avg, Max: max}
continue
}
for _, node := range nodes {
sum += node.Avg
min = math.Min(min, node.Min)
max = math.Max(max, node.Max)
}
data[m] = schema.MetricStatistics{
Avg: (math.Round((sum/float64(job.NumNodes))*100) / 100),
Min: (math.Round(min*100) / 100),
Max: (math.Round(max*100) / 100),
}
}
return data, nil
}
// Used for the classic node/system view. Returns a map of nodes to a map of metrics.
func LoadNodeData(
cluster string,
metrics, nodes []string,
scopes []schema.MetricScope,
from, to time.Time,
ctx context.Context,
) (map[string]map[string][]*schema.JobMetric, error) {
repo, err := metricdata.GetMetricDataRepo(cluster)
if err != nil {
return nil, fmt.Errorf("METRICDATA/METRICDATA > no metric data repository configured for '%s'", cluster)
}
if metrics == nil {
for _, m := range archive.GetCluster(cluster).MetricConfig {
metrics = append(metrics, m.Name)
}
}
data, err := repo.LoadNodeData(cluster, metrics, nodes, scopes, from, to, ctx)
if err != nil {
if len(data) != 0 {
cclog.Warnf("partial error: %s", err.Error())
} else {
cclog.Error("Error while loading node data from metric repository")
return nil, err
}
}
if data == nil {
return nil, fmt.Errorf("METRICDATA/METRICDATA > the metric data repository for '%s' does not support this query", cluster)
}
return data, nil
}
func LoadNodeListData(
cluster, subCluster, nodeFilter string,
metrics []string,
scopes []schema.MetricScope,
resolution int,
from, to time.Time,
page *model.PageRequest,
ctx context.Context,
) (map[string]schema.JobData, int, bool, error) {
repo, err := metricdata.GetMetricDataRepo(cluster)
if err != nil {
return nil, 0, false, fmt.Errorf("METRICDATA/METRICDATA > no metric data repository configured for '%s'", cluster)
}
if metrics == nil {
for _, m := range archive.GetCluster(cluster).MetricConfig {
metrics = append(metrics, m.Name)
}
}
data, totalNodes, hasNextPage, err := repo.LoadNodeListData(cluster, subCluster, nodeFilter, metrics, scopes, resolution, from, to, page, ctx)
if err != nil {
if len(data) != 0 {
cclog.Warnf("partial error: %s", err.Error())
} else {
cclog.Error("Error while loading node data from metric repository")
return nil, totalNodes, hasNextPage, err
}
}
// NOTE: New StatsSeries will always be calculated as 'min/median/max'
const maxSeriesSize int = 8
for _, jd := range data {
for _, scopes := range jd {
for _, jm := range scopes {
if jm.StatisticsSeries != nil || len(jm.Series) < maxSeriesSize {
continue
}
jm.AddStatisticsSeries()
}
}
}
if data == nil {
return nil, totalNodes, hasNextPage, fmt.Errorf("METRICDATA/METRICDATA > the metric data repository for '%s' does not support this query", cluster)
}
return data, totalNodes, hasNextPage, nil
}
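// Worked example (added; hypothetical numbers) of the paging contract each
// repository implements for LoadNodeListData (see the cc-metric-store and
// Prometheus changes below): with 23 matching nodes, page.ItemsPerPage = 10
// and page.Page = 3, the backends compute
//
//	start := (page.Page - 1) * page.ItemsPerPage // 20
//	end := start + page.ItemsPerPage             // 30, clamped to 23
//	// -> nodes[20:23] is returned, totalNodes = 23, hasNextPage = false
//
// Note that a page index past the last page is not clamped by that code and
// would slice out of range; callers are expected to request valid pages.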

View File

@@ -1,8 +1,7 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg. // Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. // All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style // Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file. // license that can be found in the LICENSE file.
package metricdata package metricdata
import ( import (
@@ -12,10 +11,12 @@ import (
"encoding/json" "encoding/json"
"fmt" "fmt"
"net/http" "net/http"
"sort"
"strconv" "strconv"
"strings" "strings"
"time" "time"
"github.com/ClusterCockpit/cc-backend/internal/graph/model"
"github.com/ClusterCockpit/cc-backend/pkg/archive" "github.com/ClusterCockpit/cc-backend/pkg/archive"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger" cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/schema" "github.com/ClusterCockpit/cc-lib/schema"
@@ -23,7 +24,7 @@ import (
type CCMetricStoreConfig struct { type CCMetricStoreConfig struct {
Kind string `json:"kind"` Kind string `json:"kind"`
URL string `json:"url"` Url string `json:"url"`
Token string `json:"token"` Token string `json:"token"`
// If metrics are known to this MetricDataRepository under a different // If metrics are known to this MetricDataRepository under a different
@@ -41,9 +42,9 @@ type CCMetricStore struct {
queryEndpoint string queryEndpoint string
} }
type APIQueryRequest struct { type ApiQueryRequest struct {
Cluster string `json:"cluster"` Cluster string `json:"cluster"`
Queries []APIQuery `json:"queries"` Queries []ApiQuery `json:"queries"`
ForAllNodes []string `json:"for-all-nodes"` ForAllNodes []string `json:"for-all-nodes"`
From int64 `json:"from"` From int64 `json:"from"`
To int64 `json:"to"` To int64 `json:"to"`
@@ -51,7 +52,7 @@ type APIQueryRequest struct {
WithData bool `json:"with-data"` WithData bool `json:"with-data"`
} }
type APIQuery struct { type ApiQuery struct {
Type *string `json:"type,omitempty"` Type *string `json:"type,omitempty"`
SubType *string `json:"subtype,omitempty"` SubType *string `json:"subtype,omitempty"`
Metric string `json:"metric"` Metric string `json:"metric"`
@@ -62,12 +63,12 @@ type APIQuery struct {
Aggregate bool `json:"aggreg"` Aggregate bool `json:"aggreg"`
} }
type APIQueryResponse struct { type ApiQueryResponse struct {
Queries []APIQuery `json:"queries,omitempty"` Queries []ApiQuery `json:"queries,omitempty"`
Results [][]APIMetricData `json:"results"` Results [][]ApiMetricData `json:"results"`
} }
type APIMetricData struct { type ApiMetricData struct {
Error *string `json:"error"` Error *string `json:"error"`
Data []schema.Float `json:"data"` Data []schema.Float `json:"data"`
From int64 `json:"from"` From int64 `json:"from"`
@@ -78,14 +79,6 @@ type APIMetricData struct {
Max schema.Float `json:"max"` Max schema.Float `json:"max"`
} }
var (
hwthreadString = string(schema.MetricScopeHWThread)
coreString = string(schema.MetricScopeCore)
memoryDomainString = string(schema.MetricScopeMemoryDomain)
socketString = string(schema.MetricScopeSocket)
acceleratorString = string(schema.MetricScopeAccelerator)
)
func (ccms *CCMetricStore) Init(rawConfig json.RawMessage) error { func (ccms *CCMetricStore) Init(rawConfig json.RawMessage) error {
var config CCMetricStoreConfig var config CCMetricStoreConfig
if err := json.Unmarshal(rawConfig, &config); err != nil { if err := json.Unmarshal(rawConfig, &config); err != nil {
@@ -93,8 +86,8 @@ func (ccms *CCMetricStore) Init(rawConfig json.RawMessage) error {
return err return err
} }
ccms.url = config.URL ccms.url = config.Url
ccms.queryEndpoint = fmt.Sprintf("%s/api/query", config.URL) ccms.queryEndpoint = fmt.Sprintf("%s/api/query", config.Url)
ccms.jwt = config.Token ccms.jwt = config.Token
ccms.client = http.Client{ ccms.client = http.Client{
Timeout: 10 * time.Second, Timeout: 10 * time.Second,
@@ -132,8 +125,8 @@ func (ccms *CCMetricStore) toLocalName(metric string) string {
func (ccms *CCMetricStore) doRequest( func (ccms *CCMetricStore) doRequest(
ctx context.Context, ctx context.Context,
body *APIQueryRequest, body *ApiQueryRequest,
) (*APIQueryResponse, error) { ) (*ApiQueryResponse, error) {
buf := &bytes.Buffer{} buf := &bytes.Buffer{}
if err := json.NewEncoder(buf).Encode(body); err != nil { if err := json.NewEncoder(buf).Encode(body); err != nil {
cclog.Errorf("Error while encoding request body: %s", err.Error()) cclog.Errorf("Error while encoding request body: %s", err.Error())
@@ -166,7 +159,7 @@ func (ccms *CCMetricStore) doRequest(
return nil, fmt.Errorf("'%s': HTTP Status: %s", ccms.queryEndpoint, res.Status) return nil, fmt.Errorf("'%s': HTTP Status: %s", ccms.queryEndpoint, res.Status)
} }
var resBody APIQueryResponse var resBody ApiQueryResponse
if err := json.NewDecoder(bufio.NewReader(res.Body)).Decode(&resBody); err != nil { if err := json.NewDecoder(bufio.NewReader(res.Body)).Decode(&resBody); err != nil {
cclog.Errorf("Error while decoding result body: %s", err.Error()) cclog.Errorf("Error while decoding result body: %s", err.Error())
return nil, err return nil, err
@@ -188,7 +181,7 @@ func (ccms *CCMetricStore) LoadData(
return nil, err return nil, err
} }
req := APIQueryRequest{ req := ApiQueryRequest{
Cluster: job.Cluster, Cluster: job.Cluster,
From: job.StartTime, From: job.StartTime,
To: job.StartTime + int64(job.Duration), To: job.StartTime + int64(job.Duration),
@@ -277,13 +270,21 @@ func (ccms *CCMetricStore) LoadData(
return jobData, nil return jobData, nil
} }
var (
hwthreadString = string(schema.MetricScopeHWThread)
coreString = string(schema.MetricScopeCore)
memoryDomainString = string(schema.MetricScopeMemoryDomain)
socketString = string(schema.MetricScopeSocket)
acceleratorString = string(schema.MetricScopeAccelerator)
)
func (ccms *CCMetricStore) buildQueries( func (ccms *CCMetricStore) buildQueries(
job *schema.Job, job *schema.Job,
metrics []string, metrics []string,
scopes []schema.MetricScope, scopes []schema.MetricScope,
resolution int, resolution int,
) ([]APIQuery, []schema.MetricScope, error) { ) ([]ApiQuery, []schema.MetricScope, error) {
queries := make([]APIQuery, 0, len(metrics)*len(scopes)*len(job.Resources)) queries := make([]ApiQuery, 0, len(metrics)*len(scopes)*len(job.Resources))
assignedScope := []schema.MetricScope{} assignedScope := []schema.MetricScope{}
subcluster, scerr := archive.GetSubCluster(job.Cluster, job.SubCluster) subcluster, scerr := archive.GetSubCluster(job.Cluster, job.SubCluster)
@@ -305,7 +306,7 @@ func (ccms *CCMetricStore) buildQueries(
if len(mc.SubClusters) != 0 { if len(mc.SubClusters) != 0 {
isRemoved := false isRemoved := false
for _, scConfig := range mc.SubClusters { for _, scConfig := range mc.SubClusters {
if scConfig.Name == job.SubCluster && scConfig.Remove { if scConfig.Name == job.SubCluster && scConfig.Remove == true {
isRemoved = true isRemoved = true
break break
} }
@@ -346,7 +347,7 @@ func (ccms *CCMetricStore) buildQueries(
continue continue
} }
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: host.Hostname, Hostname: host.Hostname,
Aggregate: false, Aggregate: false,
@@ -364,7 +365,7 @@ func (ccms *CCMetricStore) buildQueries(
continue continue
} }
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: host.Hostname, Hostname: host.Hostname,
Aggregate: true, Aggregate: true,
@@ -378,7 +379,7 @@ func (ccms *CCMetricStore) buildQueries(
// HWThread -> HWThread // HWThread -> HWThread
if nativeScope == schema.MetricScopeHWThread && scope == schema.MetricScopeHWThread { if nativeScope == schema.MetricScopeHWThread && scope == schema.MetricScopeHWThread {
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: host.Hostname, Hostname: host.Hostname,
Aggregate: false, Aggregate: false,
@@ -394,7 +395,7 @@ func (ccms *CCMetricStore) buildQueries(
if nativeScope == schema.MetricScopeHWThread && scope == schema.MetricScopeCore { if nativeScope == schema.MetricScopeHWThread && scope == schema.MetricScopeCore {
cores, _ := topology.GetCoresFromHWThreads(hwthreads) cores, _ := topology.GetCoresFromHWThreads(hwthreads)
for _, core := range cores { for _, core := range cores {
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: host.Hostname, Hostname: host.Hostname,
Aggregate: true, Aggregate: true,
@@ -411,7 +412,7 @@ func (ccms *CCMetricStore) buildQueries(
if nativeScope == schema.MetricScopeHWThread && scope == schema.MetricScopeSocket { if nativeScope == schema.MetricScopeHWThread && scope == schema.MetricScopeSocket {
sockets, _ := topology.GetSocketsFromHWThreads(hwthreads) sockets, _ := topology.GetSocketsFromHWThreads(hwthreads)
for _, socket := range sockets { for _, socket := range sockets {
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: host.Hostname, Hostname: host.Hostname,
Aggregate: true, Aggregate: true,
@@ -426,7 +427,7 @@ func (ccms *CCMetricStore) buildQueries(
// HWThread -> Node // HWThread -> Node
if nativeScope == schema.MetricScopeHWThread && scope == schema.MetricScopeNode { if nativeScope == schema.MetricScopeHWThread && scope == schema.MetricScopeNode {
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: host.Hostname, Hostname: host.Hostname,
Aggregate: true, Aggregate: true,
@@ -441,7 +442,7 @@ func (ccms *CCMetricStore) buildQueries(
// Core -> Core // Core -> Core
if nativeScope == schema.MetricScopeCore && scope == schema.MetricScopeCore { if nativeScope == schema.MetricScopeCore && scope == schema.MetricScopeCore {
cores, _ := topology.GetCoresFromHWThreads(hwthreads) cores, _ := topology.GetCoresFromHWThreads(hwthreads)
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: host.Hostname, Hostname: host.Hostname,
Aggregate: false, Aggregate: false,
@@ -457,7 +458,7 @@ func (ccms *CCMetricStore) buildQueries(
if nativeScope == schema.MetricScopeCore && scope == schema.MetricScopeSocket { if nativeScope == schema.MetricScopeCore && scope == schema.MetricScopeSocket {
sockets, _ := topology.GetSocketsFromCores(hwthreads) sockets, _ := topology.GetSocketsFromCores(hwthreads)
for _, socket := range sockets { for _, socket := range sockets {
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: host.Hostname, Hostname: host.Hostname,
Aggregate: true, Aggregate: true,
@@ -473,7 +474,7 @@ func (ccms *CCMetricStore) buildQueries(
// Core -> Node // Core -> Node
if nativeScope == schema.MetricScopeCore && scope == schema.MetricScopeNode { if nativeScope == schema.MetricScopeCore && scope == schema.MetricScopeNode {
cores, _ := topology.GetCoresFromHWThreads(hwthreads) cores, _ := topology.GetCoresFromHWThreads(hwthreads)
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: host.Hostname, Hostname: host.Hostname,
Aggregate: true, Aggregate: true,
@@ -488,7 +489,7 @@ func (ccms *CCMetricStore) buildQueries(
// MemoryDomain -> MemoryDomain // MemoryDomain -> MemoryDomain
if nativeScope == schema.MetricScopeMemoryDomain && scope == schema.MetricScopeMemoryDomain { if nativeScope == schema.MetricScopeMemoryDomain && scope == schema.MetricScopeMemoryDomain {
sockets, _ := topology.GetMemoryDomainsFromHWThreads(hwthreads) sockets, _ := topology.GetMemoryDomainsFromHWThreads(hwthreads)
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: host.Hostname, Hostname: host.Hostname,
Aggregate: false, Aggregate: false,
@@ -503,7 +504,7 @@ func (ccms *CCMetricStore) buildQueries(
// MemoryDomain -> Node // MemoryDomain -> Node
if nativeScope == schema.MetricScopeMemoryDomain && scope == schema.MetricScopeNode { if nativeScope == schema.MetricScopeMemoryDomain && scope == schema.MetricScopeNode {
sockets, _ := topology.GetMemoryDomainsFromHWThreads(hwthreads) sockets, _ := topology.GetMemoryDomainsFromHWThreads(hwthreads)
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: host.Hostname, Hostname: host.Hostname,
Aggregate: true, Aggregate: true,
@@ -518,7 +519,7 @@ func (ccms *CCMetricStore) buildQueries(
// Socket -> Socket // Socket -> Socket
if nativeScope == schema.MetricScopeSocket && scope == schema.MetricScopeSocket { if nativeScope == schema.MetricScopeSocket && scope == schema.MetricScopeSocket {
sockets, _ := topology.GetSocketsFromHWThreads(hwthreads) sockets, _ := topology.GetSocketsFromHWThreads(hwthreads)
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: host.Hostname, Hostname: host.Hostname,
Aggregate: false, Aggregate: false,
@@ -533,7 +534,7 @@ func (ccms *CCMetricStore) buildQueries(
// Socket -> Node // Socket -> Node
if nativeScope == schema.MetricScopeSocket && scope == schema.MetricScopeNode { if nativeScope == schema.MetricScopeSocket && scope == schema.MetricScopeNode {
sockets, _ := topology.GetSocketsFromHWThreads(hwthreads) sockets, _ := topology.GetSocketsFromHWThreads(hwthreads)
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: host.Hostname, Hostname: host.Hostname,
Aggregate: true, Aggregate: true,
@@ -547,7 +548,7 @@ func (ccms *CCMetricStore) buildQueries(
// Node -> Node // Node -> Node
if nativeScope == schema.MetricScopeNode && scope == schema.MetricScopeNode { if nativeScope == schema.MetricScopeNode && scope == schema.MetricScopeNode {
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: host.Hostname, Hostname: host.Hostname,
Resolution: resolution, Resolution: resolution,
@@ -575,7 +576,7 @@ func (ccms *CCMetricStore) LoadStats(
return nil, err return nil, err
} }
req := APIQueryRequest{ req := ApiQueryRequest{
Cluster: job.Cluster, Cluster: job.Cluster,
From: job.StartTime, From: job.StartTime,
To: job.StartTime + int64(job.Duration), To: job.StartTime + int64(job.Duration),
@@ -621,6 +622,7 @@ func (ccms *CCMetricStore) LoadStats(
return stats, nil return stats, nil
} }
// Used for Job-View Statistics Table
func (ccms *CCMetricStore) LoadScopedStats( func (ccms *CCMetricStore) LoadScopedStats(
job *schema.Job, job *schema.Job,
metrics []string, metrics []string,
@@ -633,7 +635,7 @@ func (ccms *CCMetricStore) LoadScopedStats(
return nil, err return nil, err
} }
req := APIQueryRequest{ req := ApiQueryRequest{
Cluster: job.Cluster, Cluster: job.Cluster,
From: job.StartTime, From: job.StartTime,
To: job.StartTime + int64(job.Duration), To: job.StartTime + int64(job.Duration),
@@ -711,6 +713,7 @@ func (ccms *CCMetricStore) LoadScopedStats(
return scopedJobStats, nil return scopedJobStats, nil
} }
// Used for Systems-View Node-Overview
func (ccms *CCMetricStore) LoadNodeData( func (ccms *CCMetricStore) LoadNodeData(
cluster string, cluster string,
metrics, nodes []string, metrics, nodes []string,
@@ -718,7 +721,7 @@ func (ccms *CCMetricStore) LoadNodeData(
from, to time.Time, from, to time.Time,
ctx context.Context, ctx context.Context,
) (map[string]map[string][]*schema.JobMetric, error) { ) (map[string]map[string][]*schema.JobMetric, error) {
req := APIQueryRequest{ req := ApiQueryRequest{
Cluster: cluster, Cluster: cluster,
From: from.Unix(), From: from.Unix(),
To: to.Unix(), To: to.Unix(),
@@ -733,7 +736,7 @@ func (ccms *CCMetricStore) LoadNodeData(
} else { } else {
for _, node := range nodes { for _, node := range nodes {
for _, metric := range metrics { for _, metric := range metrics {
req.Queries = append(req.Queries, APIQuery{ req.Queries = append(req.Queries, ApiQuery{
Hostname: node, Hostname: node,
Metric: ccms.toRemoteName(metric), Metric: ccms.toRemoteName(metric),
Resolution: 0, // Default for Node Queries: Will return metric $Timestep Resolution Resolution: 0, // Default for Node Queries: Will return metric $Timestep Resolution
@@ -751,7 +754,7 @@ func (ccms *CCMetricStore) LoadNodeData(
var errors []string var errors []string
data := make(map[string]map[string][]*schema.JobMetric) data := make(map[string]map[string][]*schema.JobMetric)
for i, res := range resBody.Results { for i, res := range resBody.Results {
var query APIQuery var query ApiQuery
if resBody.Queries != nil { if resBody.Queries != nil {
query = resBody.Queries[i] query = resBody.Queries[i]
} else { } else {
@@ -802,23 +805,69 @@ func (ccms *CCMetricStore) LoadNodeData(
return data, nil return data, nil
} }
// Used for Systems-View Node-List
func (ccms *CCMetricStore) LoadNodeListData( func (ccms *CCMetricStore) LoadNodeListData(
cluster, subCluster string, cluster, subCluster, nodeFilter string,
nodes []string,
metrics []string, metrics []string,
scopes []schema.MetricScope, scopes []schema.MetricScope,
resolution int, resolution int,
from, to time.Time, from, to time.Time,
page *model.PageRequest,
ctx context.Context, ctx context.Context,
) (map[string]schema.JobData, error) { ) (map[string]schema.JobData, int, bool, error) {
// Note: Order of node data is not guaranteed after this point // 0) Init additional vars
var totalNodes int = 0
var hasNextPage bool = false
// 1) Get list of all nodes
var nodes []string
if subCluster != "" {
scNodes := archive.NodeLists[cluster][subCluster]
nodes = scNodes.PrintList()
} else {
subClusterNodeLists := archive.NodeLists[cluster]
for _, nodeList := range subClusterNodeLists {
nodes = append(nodes, nodeList.PrintList()...)
}
}
// 2) Filter nodes
if nodeFilter != "" {
filteredNodes := []string{}
for _, node := range nodes {
if strings.Contains(node, nodeFilter) {
filteredNodes = append(filteredNodes, node)
}
}
nodes = filteredNodes
}
// 2.1) Count total nodes && sort nodes (sort order is not preserved after the ccms returns)
totalNodes = len(nodes)
sort.Strings(nodes)
// 3) Apply paging
if len(nodes) > page.ItemsPerPage {
start := (page.Page - 1) * page.ItemsPerPage
end := start + page.ItemsPerPage
if end >= len(nodes) {
end = len(nodes)
hasNextPage = false
} else {
hasNextPage = true
}
nodes = nodes[start:end]
}
// Note: Order of node data is not guaranteed after this point, but contents match page and filter criteria
queries, assignedScope, err := ccms.buildNodeQueries(cluster, subCluster, nodes, metrics, scopes, resolution) queries, assignedScope, err := ccms.buildNodeQueries(cluster, subCluster, nodes, metrics, scopes, resolution)
if err != nil { if err != nil {
cclog.Errorf("Error while building node queries for Cluster %s, SubCLuster %s, Metrics %v, Scopes %v: %s", cluster, subCluster, metrics, scopes, err.Error()) cclog.Errorf("Error while building node queries for Cluster %s, SubCLuster %s, Metrics %v, Scopes %v: %s", cluster, subCluster, metrics, scopes, err.Error())
return nil, err return nil, totalNodes, hasNextPage, err
} }
req := APIQueryRequest{ req := ApiQueryRequest{
Cluster: cluster, Cluster: cluster,
Queries: queries, Queries: queries,
From: from.Unix(), From: from.Unix(),
@@ -830,13 +879,13 @@ func (ccms *CCMetricStore) LoadNodeListData(
resBody, err := ccms.doRequest(ctx, &req) resBody, err := ccms.doRequest(ctx, &req)
if err != nil { if err != nil {
cclog.Errorf("Error while performing request: %s", err.Error()) cclog.Errorf("Error while performing request: %s", err.Error())
return nil, err return nil, totalNodes, hasNextPage, err
} }
var errors []string var errors []string
data := make(map[string]schema.JobData) data := make(map[string]schema.JobData)
for i, row := range resBody.Results { for i, row := range resBody.Results {
var query APIQuery var query ApiQuery
if resBody.Queries != nil { if resBody.Queries != nil {
query = resBody.Queries[i] query = resBody.Queries[i]
} else { } else {
@@ -910,10 +959,10 @@ func (ccms *CCMetricStore) LoadNodeListData(
if len(errors) != 0 { if len(errors) != 0 {
/* Returns list of "partial errors" */ /* Returns list of "partial errors" */
return data, fmt.Errorf("METRICDATA/CCMS > Errors: %s", strings.Join(errors, ", ")) return data, totalNodes, hasNextPage, fmt.Errorf("METRICDATA/CCMS > Errors: %s", strings.Join(errors, ", "))
} }
return data, nil return data, totalNodes, hasNextPage, nil
} }
func (ccms *CCMetricStore) buildNodeQueries( func (ccms *CCMetricStore) buildNodeQueries(
@@ -923,8 +972,8 @@ func (ccms *CCMetricStore) buildNodeQueries(
metrics []string, metrics []string,
scopes []schema.MetricScope, scopes []schema.MetricScope,
resolution int, resolution int,
) ([]APIQuery, []schema.MetricScope, error) { ) ([]ApiQuery, []schema.MetricScope, error) {
queries := make([]APIQuery, 0, len(metrics)*len(scopes)*len(nodes)) queries := make([]ApiQuery, 0, len(metrics)*len(scopes)*len(nodes))
assignedScope := []schema.MetricScope{} assignedScope := []schema.MetricScope{}
// Get Topol before loop if subCluster given // Get Topol before loop if subCluster given
@@ -951,7 +1000,7 @@ func (ccms *CCMetricStore) buildNodeQueries(
if mc.SubClusters != nil { if mc.SubClusters != nil {
isRemoved := false isRemoved := false
for _, scConfig := range mc.SubClusters { for _, scConfig := range mc.SubClusters {
if scConfig.Name == subCluster && scConfig.Remove { if scConfig.Name == subCluster && scConfig.Remove == true {
isRemoved = true isRemoved = true
break break
} }
@@ -1007,7 +1056,7 @@ func (ccms *CCMetricStore) buildNodeQueries(
continue continue
} }
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: hostname, Hostname: hostname,
Aggregate: false, Aggregate: false,
@@ -1025,7 +1074,7 @@ func (ccms *CCMetricStore) buildNodeQueries(
continue continue
} }
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: hostname, Hostname: hostname,
Aggregate: true, Aggregate: true,
@@ -1039,7 +1088,7 @@ func (ccms *CCMetricStore) buildNodeQueries(
// HWThread -> HWThread // HWThread -> HWThread
if nativeScope == schema.MetricScopeHWThread && scope == schema.MetricScopeHWThread { if nativeScope == schema.MetricScopeHWThread && scope == schema.MetricScopeHWThread {
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: hostname, Hostname: hostname,
Aggregate: false, Aggregate: false,
@@ -1055,7 +1104,7 @@ func (ccms *CCMetricStore) buildNodeQueries(
if nativeScope == schema.MetricScopeHWThread && scope == schema.MetricScopeCore { if nativeScope == schema.MetricScopeHWThread && scope == schema.MetricScopeCore {
cores, _ := topology.GetCoresFromHWThreads(topology.Node) cores, _ := topology.GetCoresFromHWThreads(topology.Node)
for _, core := range cores { for _, core := range cores {
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: hostname, Hostname: hostname,
Aggregate: true, Aggregate: true,
@@ -1072,7 +1121,7 @@ func (ccms *CCMetricStore) buildNodeQueries(
if nativeScope == schema.MetricScopeHWThread && scope == schema.MetricScopeSocket { if nativeScope == schema.MetricScopeHWThread && scope == schema.MetricScopeSocket {
sockets, _ := topology.GetSocketsFromHWThreads(topology.Node) sockets, _ := topology.GetSocketsFromHWThreads(topology.Node)
for _, socket := range sockets { for _, socket := range sockets {
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: hostname, Hostname: hostname,
Aggregate: true, Aggregate: true,
@@ -1087,7 +1136,7 @@ func (ccms *CCMetricStore) buildNodeQueries(
// HWThread -> Node // HWThread -> Node
if nativeScope == schema.MetricScopeHWThread && scope == schema.MetricScopeNode { if nativeScope == schema.MetricScopeHWThread && scope == schema.MetricScopeNode {
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: hostname, Hostname: hostname,
Aggregate: true, Aggregate: true,
@@ -1102,7 +1151,7 @@ func (ccms *CCMetricStore) buildNodeQueries(
// Core -> Core // Core -> Core
if nativeScope == schema.MetricScopeCore && scope == schema.MetricScopeCore { if nativeScope == schema.MetricScopeCore && scope == schema.MetricScopeCore {
cores, _ := topology.GetCoresFromHWThreads(topology.Node) cores, _ := topology.GetCoresFromHWThreads(topology.Node)
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: hostname, Hostname: hostname,
Aggregate: false, Aggregate: false,
@@ -1118,7 +1167,7 @@ func (ccms *CCMetricStore) buildNodeQueries(
if nativeScope == schema.MetricScopeCore && scope == schema.MetricScopeSocket { if nativeScope == schema.MetricScopeCore && scope == schema.MetricScopeSocket {
sockets, _ := topology.GetSocketsFromCores(topology.Node) sockets, _ := topology.GetSocketsFromCores(topology.Node)
for _, socket := range sockets { for _, socket := range sockets {
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: hostname, Hostname: hostname,
Aggregate: true, Aggregate: true,
@@ -1134,7 +1183,7 @@ func (ccms *CCMetricStore) buildNodeQueries(
// Core -> Node // Core -> Node
if nativeScope == schema.MetricScopeCore && scope == schema.MetricScopeNode { if nativeScope == schema.MetricScopeCore && scope == schema.MetricScopeNode {
cores, _ := topology.GetCoresFromHWThreads(topology.Node) cores, _ := topology.GetCoresFromHWThreads(topology.Node)
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: hostname, Hostname: hostname,
Aggregate: true, Aggregate: true,
@@ -1149,7 +1198,7 @@ func (ccms *CCMetricStore) buildNodeQueries(
// MemoryDomain -> MemoryDomain // MemoryDomain -> MemoryDomain
if nativeScope == schema.MetricScopeMemoryDomain && scope == schema.MetricScopeMemoryDomain { if nativeScope == schema.MetricScopeMemoryDomain && scope == schema.MetricScopeMemoryDomain {
sockets, _ := topology.GetMemoryDomainsFromHWThreads(topology.Node) sockets, _ := topology.GetMemoryDomainsFromHWThreads(topology.Node)
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: hostname, Hostname: hostname,
Aggregate: false, Aggregate: false,
@@ -1164,7 +1213,7 @@ func (ccms *CCMetricStore) buildNodeQueries(
// MemoryDomain -> Node // MemoryDomain -> Node
if nativeScope == schema.MetricScopeMemoryDomain && scope == schema.MetricScopeNode { if nativeScope == schema.MetricScopeMemoryDomain && scope == schema.MetricScopeNode {
sockets, _ := topology.GetMemoryDomainsFromHWThreads(topology.Node) sockets, _ := topology.GetMemoryDomainsFromHWThreads(topology.Node)
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: hostname, Hostname: hostname,
Aggregate: true, Aggregate: true,
@@ -1179,7 +1228,7 @@ func (ccms *CCMetricStore) buildNodeQueries(
// Socket -> Socket // Socket -> Socket
if nativeScope == schema.MetricScopeSocket && scope == schema.MetricScopeSocket { if nativeScope == schema.MetricScopeSocket && scope == schema.MetricScopeSocket {
sockets, _ := topology.GetSocketsFromHWThreads(topology.Node) sockets, _ := topology.GetSocketsFromHWThreads(topology.Node)
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: hostname, Hostname: hostname,
Aggregate: false, Aggregate: false,
@@ -1194,7 +1243,7 @@ func (ccms *CCMetricStore) buildNodeQueries(
// Socket -> Node // Socket -> Node
if nativeScope == schema.MetricScopeSocket && scope == schema.MetricScopeNode { if nativeScope == schema.MetricScopeSocket && scope == schema.MetricScopeNode {
sockets, _ := topology.GetSocketsFromHWThreads(topology.Node) sockets, _ := topology.GetSocketsFromHWThreads(topology.Node)
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: hostname, Hostname: hostname,
Aggregate: true, Aggregate: true,
@@ -1208,7 +1257,7 @@ func (ccms *CCMetricStore) buildNodeQueries(
// Node -> Node // Node -> Node
if nativeScope == schema.MetricScopeNode && scope == schema.MetricScopeNode { if nativeScope == schema.MetricScopeNode && scope == schema.MetricScopeNode {
queries = append(queries, APIQuery{ queries = append(queries, ApiQuery{
Metric: remoteName, Metric: remoteName,
Hostname: hostname, Hostname: hostname,
Resolution: resolution, Resolution: resolution,

View File

@@ -2,7 +2,6 @@
// All rights reserved. This file is part of cc-backend. // All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style // Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file. // license that can be found in the LICENSE file.
package metricdata package metricdata
import ( import (
@@ -12,6 +11,7 @@ import (
"time" "time"
"github.com/ClusterCockpit/cc-backend/internal/config" "github.com/ClusterCockpit/cc-backend/internal/config"
"github.com/ClusterCockpit/cc-backend/internal/graph/model"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger" cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/schema" "github.com/ClusterCockpit/cc-lib/schema"
) )
@@ -34,94 +34,51 @@ type MetricDataRepository interface {
LoadNodeData(cluster string, metrics, nodes []string, scopes []schema.MetricScope, from, to time.Time, ctx context.Context) (map[string]map[string][]*schema.JobMetric, error) LoadNodeData(cluster string, metrics, nodes []string, scopes []schema.MetricScope, from, to time.Time, ctx context.Context) (map[string]map[string][]*schema.JobMetric, error)
// Return a map of hosts to a map of metrics to a map of scopes for multiple nodes. // Return a map of hosts to a map of metrics to a map of scopes for multiple nodes.
LoadNodeListData(cluster, subCluster string, nodes, metrics []string, scopes []schema.MetricScope, resolution int, from, to time.Time, ctx context.Context) (map[string]schema.JobData, error) LoadNodeListData(cluster, subCluster, nodeFilter string, metrics []string, scopes []schema.MetricScope, resolution int, from, to time.Time, page *model.PageRequest, ctx context.Context) (map[string]schema.JobData, int, bool, error)
} }
var upstreamMetricDataRepo MetricDataRepository var metricDataRepos map[string]MetricDataRepository = map[string]MetricDataRepository{}
// func Init() error { func Init() error {
// for _, cluster := range config.Clusters { for _, cluster := range config.Keys.Clusters {
// if cluster.MetricDataRepository != nil { if cluster.MetricDataRepository != nil {
// var kind struct { var kind struct {
// Kind string `json:"kind"` Kind string `json:"kind"`
// } }
// if err := json.Unmarshal(cluster.MetricDataRepository, &kind); err != nil { if err := json.Unmarshal(cluster.MetricDataRepository, &kind); err != nil {
// cclog.Warn("Error while unmarshaling raw json MetricDataRepository") cclog.Warn("Error while unmarshaling raw json MetricDataRepository")
// return err return err
// } }
//
// var mdr MetricDataRepository
// switch kind.Kind {
// case "cc-metric-store":
// mdr = &CCMetricStore{}
// case "prometheus":
// mdr = &PrometheusDataRepository{}
// case "test":
// mdr = &TestMetricDataRepository{}
// default:
// return fmt.Errorf("METRICDATA/METRICDATA > Unknown MetricDataRepository %v for cluster %v", kind.Kind, cluster.Name)
// }
//
// if err := mdr.Init(cluster.MetricDataRepository); err != nil {
// cclog.Errorf("Error initializing MetricDataRepository %v for cluster %v", kind.Kind, cluster.Name)
// return err
// }
// metricDataRepos[cluster.Name] = mdr
// }
// }
// return nil
// }
// func GetMetricDataRepo(cluster string) (MetricDataRepository, error) { var mdr MetricDataRepository
// var err error switch kind.Kind {
// repo, ok := metricDataRepos[cluster] case "cc-metric-store":
// mdr = &CCMetricStore{}
// if !ok { case "prometheus":
// err = fmt.Errorf("METRICDATA/METRICDATA > no metric data repository configured for '%s'", cluster) mdr = &PrometheusDataRepository{}
// } case "test":
// mdr = &TestMetricDataRepository{}
// return repo, err default:
// } return fmt.Errorf("METRICDATA/METRICDATA > Unknown MetricDataRepository %v for cluster %v", kind.Kind, cluster.Name)
}
// InitUpstreamRepos initializes global upstream metric data repository for the pull worker if err := mdr.Init(cluster.MetricDataRepository); err != nil {
func InitUpstreamRepos() error { cclog.Errorf("Error initializing MetricDataRepository %v for cluster %v", kind.Kind, cluster.Name)
if config.Keys.UpstreamMetricRepository == nil { return err
return nil }
metricDataRepos[cluster.Name] = mdr
}
} }
var kind struct {
Kind string `json:"kind"`
}
if err := json.Unmarshal(*config.Keys.UpstreamMetricRepository, &kind); err != nil {
cclog.Warn("Error while unmarshaling raw json UpstreamMetricRepository")
return err
}
var mdr MetricDataRepository
switch kind.Kind {
case "cc-metric-store":
mdr = &CCMetricStore{}
case "prometheus":
mdr = &PrometheusDataRepository{}
case "test":
mdr = &TestMetricDataRepository{}
default:
return fmt.Errorf("METRICDATA/METRICDATA > Unknown UpstreamMetricRepository %v", kind.Kind)
}
if err := mdr.Init(*config.Keys.UpstreamMetricRepository); err != nil {
cclog.Errorf("Error initializing UpstreamMetricRepository %v", kind.Kind)
return err
}
upstreamMetricDataRepo = mdr
cclog.Infof("Initialized global upstream metric repository '%s'", kind.Kind)
return nil return nil
} }
// GetUpstreamMetricDataRepo returns the global upstream metric data repository func GetMetricDataRepo(cluster string) (MetricDataRepository, error) {
func GetUpstreamMetricDataRepo() (MetricDataRepository, error) { var err error
if upstreamMetricDataRepo == nil { repo, ok := metricDataRepos[cluster]
return nil, fmt.Errorf("METRICDATA/METRICDATA > no upstream metric data repository configured")
if !ok {
err = fmt.Errorf("METRICDATA/METRICDATA > no metric data repository configured for '%s'", cluster)
} }
return upstreamMetricDataRepo, nil
return repo, err
} }

View File

@@ -2,7 +2,6 @@
// All rights reserved. This file is part of cc-backend. // All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style // Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file. // license that can be found in the LICENSE file.
package metricdata package metricdata
import ( import (
@@ -21,6 +20,7 @@ import (
"text/template" "text/template"
"time" "time"
"github.com/ClusterCockpit/cc-backend/internal/graph/model"
"github.com/ClusterCockpit/cc-backend/pkg/archive" "github.com/ClusterCockpit/cc-backend/pkg/archive"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger" cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/schema" "github.com/ClusterCockpit/cc-lib/schema"
@@ -494,17 +494,62 @@ func (pdb *PrometheusDataRepository) LoadScopedStats(
// Implemented by NHR@FAU; Used in NodeList-View // Implemented by NHR@FAU; Used in NodeList-View
func (pdb *PrometheusDataRepository) LoadNodeListData( func (pdb *PrometheusDataRepository) LoadNodeListData(
cluster, subCluster string, cluster, subCluster, nodeFilter string,
nodes []string,
metrics []string, metrics []string,
scopes []schema.MetricScope, scopes []schema.MetricScope,
resolution int, resolution int,
from, to time.Time, from, to time.Time,
page *model.PageRequest,
ctx context.Context, ctx context.Context,
) (map[string]schema.JobData, error) { ) (map[string]schema.JobData, int, bool, error) {
// Assumption: pdb.loadData() only returns series node-scope - use node scope for NodeList // Assumption: pdb.loadData() only returns series node-scope - use node scope for NodeList
// Fetch Data, based on pdb.LoadNodeData() // 0) Init additional vars
var totalNodes int = 0
var hasNextPage bool = false
// 1) Get list of all nodes
var nodes []string
if subCluster != "" {
scNodes := archive.NodeLists[cluster][subCluster]
nodes = scNodes.PrintList()
} else {
subClusterNodeLists := archive.NodeLists[cluster]
for _, nodeList := range subClusterNodeLists {
nodes = append(nodes, nodeList.PrintList()...)
}
}
// 2) Filter nodes
if nodeFilter != "" {
filteredNodes := []string{}
for _, node := range nodes {
if strings.Contains(node, nodeFilter) {
filteredNodes = append(filteredNodes, node)
}
}
nodes = filteredNodes
}
// 2.1) Count total nodes && sort nodes (sort order is not preserved after return)
totalNodes = len(nodes)
sort.Strings(nodes)
// 3) Apply paging
if len(nodes) > page.ItemsPerPage {
start := (page.Page - 1) * page.ItemsPerPage
end := start + page.ItemsPerPage
if end >= len(nodes) {
end = len(nodes)
hasNextPage = false
} else {
hasNextPage = true
}
nodes = nodes[start:end]
}
// 4) Fetch Data, based on pdb.LoadNodeData()
t0 := time.Now() t0 := time.Now()
// Map of hosts of jobData // Map of hosts of jobData
data := make(map[string]schema.JobData) data := make(map[string]schema.JobData)
@@ -527,12 +572,12 @@ func (pdb *PrometheusDataRepository) LoadNodeListData(
metricConfig := archive.GetMetricConfig(cluster, metric) metricConfig := archive.GetMetricConfig(cluster, metric)
if metricConfig == nil { if metricConfig == nil {
cclog.Warnf("Error in LoadNodeListData: Metric %s for cluster %s not configured", metric, cluster) cclog.Warnf("Error in LoadNodeListData: Metric %s for cluster %s not configured", metric, cluster)
return nil, errors.New("Prometheus config error") return nil, totalNodes, hasNextPage, errors.New("Prometheus config error")
} }
query, err := pdb.FormatQuery(metric, scope, nodes, cluster) query, err := pdb.FormatQuery(metric, scope, nodes, cluster)
if err != nil { if err != nil {
cclog.Warn("Error while formatting prometheus query") cclog.Warn("Error while formatting prometheus query")
return nil, err return nil, totalNodes, hasNextPage, err
} }
// ranged query over all nodes // ranged query over all nodes
@@ -544,7 +589,7 @@ func (pdb *PrometheusDataRepository) LoadNodeListData(
result, warnings, err := pdb.queryClient.QueryRange(ctx, query, r) result, warnings, err := pdb.queryClient.QueryRange(ctx, query, r)
if err != nil { if err != nil {
cclog.Errorf("Prometheus query error in LoadNodeData: %v\n", err) cclog.Errorf("Prometheus query error in LoadNodeData: %v\n", err)
return nil, errors.New("Prometheus query error") return nil, totalNodes, hasNextPage, errors.New("Prometheus query error")
} }
if len(warnings) > 0 { if len(warnings) > 0 {
cclog.Warnf("Warnings: %v\n", warnings) cclog.Warnf("Warnings: %v\n", warnings)
@@ -584,5 +629,5 @@ func (pdb *PrometheusDataRepository) LoadNodeListData(
} }
t1 := time.Since(t0) t1 := time.Since(t0)
cclog.Debugf("LoadNodeListData of %v nodes took %s", len(data), t1) cclog.Debugf("LoadNodeListData of %v nodes took %s", len(data), t1)
return data, nil return data, totalNodes, hasNextPage, nil
} }

View File

@@ -2,7 +2,6 @@
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package metricdata
import (
@@ -10,6 +9,7 @@ import (
	"encoding/json"
	"time"
+	"github.com/ClusterCockpit/cc-backend/internal/graph/model"
	"github.com/ClusterCockpit/cc-lib/schema"
)
@@ -17,7 +17,7 @@ var TestLoadDataCallback func(job *schema.Job, metrics []string, scopes []schema
	panic("TODO")
}
-// TestMetricDataRepository is only a mock for unit-testing.
+// Only a mock for unit-testing.
type TestMetricDataRepository struct{}
func (tmdr *TestMetricDataRepository) Init(_ json.RawMessage) error {
@@ -62,13 +62,59 @@ func (tmdr *TestMetricDataRepository) LoadNodeData(
}
func (tmdr *TestMetricDataRepository) LoadNodeListData(
-	cluster, subCluster string,
-	nodes []string,
+	cluster, subCluster, nodeFilter string,
	metrics []string,
	scopes []schema.MetricScope,
	resolution int,
	from, to time.Time,
+	page *model.PageRequest,
	ctx context.Context,
-) (map[string]schema.JobData, error) {
+) (map[string]schema.JobData, int, bool, error) {
	panic("TODO")
}
func DeepCopy(jd_temp schema.JobData) schema.JobData {
var jd schema.JobData
jd = make(schema.JobData, len(jd_temp))
for k, v := range jd_temp {
jd[k] = make(map[schema.MetricScope]*schema.JobMetric, len(jd_temp[k]))
for k_, v_ := range v {
jd[k][k_] = new(schema.JobMetric)
jd[k][k_].Series = make([]schema.Series, len(v_.Series))
for i := 0; i < len(v_.Series); i += 1 {
jd[k][k_].Series[i].Data = make([]schema.Float, len(v_.Series[i].Data))
copy(jd[k][k_].Series[i].Data, v_.Series[i].Data)
jd[k][k_].Series[i].Hostname = v_.Series[i].Hostname
jd[k][k_].Series[i].Id = v_.Series[i].Id
jd[k][k_].Series[i].Statistics.Avg = v_.Series[i].Statistics.Avg
jd[k][k_].Series[i].Statistics.Min = v_.Series[i].Statistics.Min
jd[k][k_].Series[i].Statistics.Max = v_.Series[i].Statistics.Max
}
jd[k][k_].Timestep = v_.Timestep
jd[k][k_].Unit.Base = v_.Unit.Base
jd[k][k_].Unit.Prefix = v_.Unit.Prefix
if v_.StatisticsSeries != nil {
// Init Slices
jd[k][k_].StatisticsSeries = new(schema.StatsSeries)
jd[k][k_].StatisticsSeries.Max = make([]schema.Float, len(v_.StatisticsSeries.Max))
jd[k][k_].StatisticsSeries.Min = make([]schema.Float, len(v_.StatisticsSeries.Min))
jd[k][k_].StatisticsSeries.Median = make([]schema.Float, len(v_.StatisticsSeries.Median))
jd[k][k_].StatisticsSeries.Mean = make([]schema.Float, len(v_.StatisticsSeries.Mean))
// Copy Data
copy(jd[k][k_].StatisticsSeries.Max, v_.StatisticsSeries.Max)
copy(jd[k][k_].StatisticsSeries.Min, v_.StatisticsSeries.Min)
copy(jd[k][k_].StatisticsSeries.Median, v_.StatisticsSeries.Median)
copy(jd[k][k_].StatisticsSeries.Mean, v_.StatisticsSeries.Mean)
// Handle Percentiles
for k__, v__ := range v_.StatisticsSeries.Percentiles {
if jd[k][k_].StatisticsSeries.Percentiles == nil {
// the destination map starts out nil; allocate it before writing to avoid a panic
jd[k][k_].StatisticsSeries.Percentiles = make(map[int][]schema.Float, len(v_.StatisticsSeries.Percentiles))
}
jd[k][k_].StatisticsSeries.Percentiles[k__] = make([]schema.Float, len(v__))
copy(jd[k][k_].StatisticsSeries.Percentiles[k__], v__)
}
} else {
jd[k][k_].StatisticsSeries = v_.StatisticsSeries
}
}
}
return jd
}


@@ -1,490 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
// Package metricdispatcher provides a unified interface for loading and caching job metric data.
//
// This package serves as a central dispatcher that routes metric data requests to the appropriate
// backend based on job state. For running jobs, data is fetched from the metric store (e.g., cc-metric-store).
// For completed jobs, data is retrieved from the file-based job archive.
//
// # Key Features
//
// - Automatic backend selection based on job state (running vs. archived)
// - LRU cache for performance optimization (128 MB default cache size)
// - Data resampling using Largest Triangle Three Bucket algorithm for archived data
// - Automatic statistics series generation for jobs with many nodes
// - Support for scoped metrics (node, socket, accelerator, core)
//
// # Cache Behavior
//
// Cached data has different TTL (time-to-live) values depending on job state:
// - Running jobs: 2 minutes (data changes frequently)
// - Completed jobs: 5 hours (data is static)
//
// The cache key is based on job ID, state, requested metrics, scopes, and resolution.
//
// # Usage
//
// The primary entry point is LoadData, which automatically handles both running and archived jobs:
//
// jobData, err := metricdispatcher.LoadData(job, metrics, scopes, ctx, resolution)
// if err != nil {
// // Handle error
// }
//
// For statistics only, use LoadJobStats, LoadScopedJobStats, or LoadAverages depending on the required format.
package metricdispatcher
import (
"context"
"fmt"
"math"
"time"
"github.com/ClusterCockpit/cc-backend/internal/config"
"github.com/ClusterCockpit/cc-backend/internal/memorystore"
"github.com/ClusterCockpit/cc-backend/pkg/archive"
cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
"github.com/ClusterCockpit/cc-lib/lrucache"
"github.com/ClusterCockpit/cc-lib/resampler"
"github.com/ClusterCockpit/cc-lib/schema"
)
// cache is an LRU cache with 128 MB capacity for storing loaded job metric data.
// The cache reduces load on both the metric store and archive backends.
var cache *lrucache.Cache = lrucache.New(128 * 1024 * 1024)
// cacheKey generates a unique cache key for a job's metric data based on job ID, state,
// requested metrics, scopes, and resolution. Duration and StartTime are intentionally excluded:
// job.ID already identifies the job uniquely, and the cache TTL keeps entries from persisting indefinitely.
func cacheKey(
job *schema.Job,
metrics []string,
scopes []schema.MetricScope,
resolution int,
) string {
return fmt.Sprintf("%d(%s):[%v],[%v]-%d",
job.ID, job.State, metrics, scopes, resolution)
}
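
As an illustration (job id and values hypothetical): for a completed job with database id 1337, metrics flops_any and mem_bw, node scope, and resolution 600, fmt renders the slices with its default bracket notation and the key comes out as

	1337(completed):[[flops_any mem_bw]],[[node]]-600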
// LoadData retrieves metric data for a job from the appropriate backend (memory store for running jobs,
// archive for completed jobs) and applies caching, resampling, and statistics generation as needed.
//
// For running jobs or when archive is disabled, data is fetched from the metric store.
// For completed archived jobs, data is loaded from the job archive and resampled if needed.
//
// Parameters:
// - job: The job for which to load metric data
// - metrics: List of metric names to load (nil loads all metrics for the cluster)
// - scopes: Metric scopes to include (nil defaults to node scope)
// - ctx: Context for cancellation and timeouts
// - resolution: Target number of data points for resampling (only applies to archived data)
//
// Returns the loaded job data and any error encountered. For partial errors (some metrics failed),
// the function returns the successfully loaded data with a warning logged.
func LoadData(job *schema.Job,
metrics []string,
scopes []schema.MetricScope,
ctx context.Context,
resolution int,
) (schema.JobData, error) {
data := cache.Get(cacheKey(job, metrics, scopes, resolution), func() (_ any, ttl time.Duration, size int) {
var jd schema.JobData
var err error
if job.State == schema.JobStateRunning ||
job.MonitoringStatus == schema.MonitoringStatusRunningOrArchiving ||
config.Keys.DisableArchive {
if scopes == nil {
scopes = append(scopes, schema.MetricScopeNode)
}
if metrics == nil {
cluster := archive.GetCluster(job.Cluster)
for _, mc := range cluster.MetricConfig {
metrics = append(metrics, mc.Name)
}
}
jd, err = memorystore.LoadData(job, metrics, scopes, ctx, resolution)
if err != nil {
if len(jd) != 0 {
cclog.Warnf("partial error loading metrics from store for job %d (user: %s, project: %s): %s",
job.JobID, job.User, job.Project, err.Error())
} else {
cclog.Errorf("failed to load job data from metric store for job %d (user: %s, project: %s): %s",
job.JobID, job.User, job.Project, err.Error())
return err, 0, 0
}
}
size = jd.Size()
} else {
var jdTemp schema.JobData
jdTemp, err = archive.GetHandle().LoadJobData(job)
if err != nil {
cclog.Errorf("failed to load job data from archive for job %d (user: %s, project: %s): %s",
job.JobID, job.User, job.Project, err.Error())
return err, 0, 0
}
jd = deepCopy(jdTemp)
// Resample archived data using Largest Triangle Three Bucket algorithm to reduce data points
// to the requested resolution, improving transfer performance and client-side rendering.
for _, v := range jd {
for _, v_ := range v {
timestep := int64(0)
for i := 0; i < len(v_.Series); i += 1 {
v_.Series[i].Data, timestep, err = resampler.LargestTriangleThreeBucket(v_.Series[i].Data, int64(v_.Timestep), int64(resolution))
if err != nil {
return err, 0, 0
}
}
v_.Timestep = int(timestep)
}
}
// Filter job data to only include requested metrics and scopes, avoiding unnecessary data transfer.
if metrics != nil || scopes != nil {
if metrics == nil {
metrics = make([]string, 0, len(jd))
for k := range jd {
metrics = append(metrics, k)
}
}
res := schema.JobData{}
for _, metric := range metrics {
if perscope, ok := jd[metric]; ok {
if len(perscope) > 1 {
subset := make(map[schema.MetricScope]*schema.JobMetric)
for _, scope := range scopes {
if jm, ok := perscope[scope]; ok {
subset[scope] = jm
}
}
if len(subset) > 0 {
perscope = subset
}
}
res[metric] = perscope
}
}
jd = res
}
size = jd.Size()
}
ttl = 5 * time.Hour
if job.State == schema.JobStateRunning {
ttl = 2 * time.Minute
}
// Generate statistics series for jobs with many nodes to enable min/median/max graphs
// instead of overwhelming the UI with individual node lines. Note that newly calculated
// statistics use min/median/max, while archived statistics may use min/mean/max.
const maxSeriesSize int = 15
for _, scopes := range jd {
for _, jm := range scopes {
if jm.StatisticsSeries != nil || len(jm.Series) <= maxSeriesSize {
continue
}
jm.AddStatisticsSeries()
}
}
nodeScopeRequested := false
for _, scope := range scopes {
if scope == schema.MetricScopeNode {
nodeScopeRequested = true
}
}
if nodeScopeRequested {
jd.AddNodeScope("flops_any")
jd.AddNodeScope("mem_bw")
}
// Round Resulting Stat Values
jd.RoundMetricStats()
return jd, ttl, size
})
if err, ok := data.(error); ok {
cclog.Errorf("error in cached dataset for job %d: %s", job.JobID, err.Error())
return nil, err
}
return data.(schema.JobData), nil
}
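
The closure handed to cache.Get runs at most once per key and returns the value together with its TTL and an approximate byte size for the eviction budget; note that errors are stored as cache values too and unwrapped by the type assertion above. A minimal sketch of that assumed contract (key and loader are hypothetical):

	v := cache.Get("jobdata:1337", func() (any, time.Duration, int) {
		payload, err := loadExpensive() // hypothetical loader
		if err != nil {
			return err, 0, 0 // the error becomes the cached value
		}
		return payload, 5 * time.Hour, payload.Size()
	})
	if err, ok := v.(error); ok {
		// handle the (possibly cached) error
	}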
// LoadAverages computes average values for the specified metrics across all nodes of a job.
// For running jobs, it loads statistics from the metric store. For completed jobs, it uses
// the pre-calculated averages from the job archive. The results are appended to the data slice.
func LoadAverages(
job *schema.Job,
metrics []string,
data [][]schema.Float,
ctx context.Context,
) error {
if job.State != schema.JobStateRunning && !config.Keys.DisableArchive {
return archive.LoadAveragesFromArchive(job, metrics, data) // #166 change also here?
}
stats, err := memorystore.LoadStats(job, metrics, ctx)
if err != nil {
cclog.Errorf("failed to load statistics from metric store for job %d (user: %s, project: %s): %s",
job.JobID, job.User, job.Project, err.Error())
return err
}
for i, m := range metrics {
nodes, ok := stats[m]
if !ok {
data[i] = append(data[i], schema.NaN)
continue
}
sum := 0.0
for _, node := range nodes {
sum += node.Avg
}
data[i] = append(data[i], schema.Float(sum))
}
return nil
}
// LoadScopedJobStats retrieves job statistics organized by metric scope (node, socket, core, accelerator).
// For running jobs, statistics are computed from the metric store. For completed jobs, pre-calculated
// statistics are loaded from the job archive.
func LoadScopedJobStats(
job *schema.Job,
metrics []string,
scopes []schema.MetricScope,
ctx context.Context,
) (schema.ScopedJobStats, error) {
if job.State != schema.JobStateRunning && !config.Keys.DisableArchive {
return archive.LoadScopedStatsFromArchive(job, metrics, scopes)
}
scopedStats, err := memorystore.LoadScopedStats(job, metrics, scopes, ctx)
if err != nil {
cclog.Errorf("failed to load scoped statistics from metric store for job %d (user: %s, project: %s): %s",
job.JobID, job.User, job.Project, err.Error())
return nil, err
}
return scopedStats, nil
}
// LoadJobStats retrieves aggregated statistics (min/avg/max) for each requested metric across all job nodes.
// For running jobs, statistics are computed from the metric store. For completed jobs, pre-calculated
// statistics are loaded from the job archive.
func LoadJobStats(
job *schema.Job,
metrics []string,
ctx context.Context,
) (map[string]schema.MetricStatistics, error) {
if job.State != schema.JobStateRunning && !config.Keys.DisableArchive {
return archive.LoadStatsFromArchive(job, metrics)
}
data := make(map[string]schema.MetricStatistics, len(metrics))
stats, err := memorystore.LoadStats(job, metrics, ctx)
if err != nil {
cclog.Errorf("failed to load statistics from metric store for job %d (user: %s, project: %s): %s",
job.JobID, job.User, job.Project, err.Error())
return data, err
}
for _, m := range metrics {
sum, avg, min, max := 0.0, 0.0, 0.0, 0.0
nodes, ok := stats[m]
if !ok {
data[m] = schema.MetricStatistics{Min: min, Avg: avg, Max: max}
continue
}
for _, node := range nodes {
sum += node.Avg
min = math.Min(min, node.Min)
max = math.Max(max, node.Max)
}
data[m] = schema.MetricStatistics{
Avg: (math.Round((sum/float64(job.NumNodes))*100) / 100),
Min: (math.Round(min*100) / 100),
Max: (math.Round(max*100) / 100),
}
}
return data, nil
}
// LoadNodeData retrieves metric data for specific nodes in a cluster within a time range.
// This is used for node monitoring views and system status pages. Data is always fetched from
// the metric store (not the archive) since it's for current/recent node status monitoring.
//
// Returns a nested map structure: node -> metric -> scoped data.
func LoadNodeData(
cluster string,
metrics, nodes []string,
scopes []schema.MetricScope,
from, to time.Time,
ctx context.Context,
) (map[string]map[string][]*schema.JobMetric, error) {
if metrics == nil {
for _, m := range archive.GetCluster(cluster).MetricConfig {
metrics = append(metrics, m.Name)
}
}
data, err := memorystore.LoadNodeData(cluster, metrics, nodes, scopes, from, to, ctx)
if err != nil {
if len(data) != 0 {
cclog.Warnf("partial error loading node data from metric store for cluster %s: %s", cluster, err.Error())
} else {
cclog.Errorf("failed to load node data from metric store for cluster %s: %s", cluster, err.Error())
return nil, err
}
}
if data == nil {
return nil, fmt.Errorf("metric store for cluster '%s' does not support node data queries", cluster)
}
return data, nil
}
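
The nesting mirrors the signature: for a host and metric (names illustrative),

	perScope := data["node001"]["cpu_load"] // []*schema.JobMetric, one entry per scope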
// LoadNodeListData retrieves time-series metric data for multiple nodes within a time range,
// with optional resampling and automatic statistics generation for large datasets.
// This is used for comparing multiple nodes or displaying node status over time.
//
// Returns a map of node names to their job-like metric data structures.
func LoadNodeListData(
cluster, subCluster string,
nodes []string,
metrics []string,
scopes []schema.MetricScope,
resolution int,
from, to time.Time,
ctx context.Context,
) (map[string]schema.JobData, error) {
if metrics == nil {
for _, m := range archive.GetCluster(cluster).MetricConfig {
metrics = append(metrics, m.Name)
}
}
data, err := memorystore.LoadNodeListData(cluster, subCluster, nodes, metrics, scopes, resolution, from, to, ctx)
if err != nil {
if len(data) != 0 {
cclog.Warnf("partial error loading node list data from metric store for cluster %s, subcluster %s: %s",
cluster, subCluster, err.Error())
} else {
cclog.Errorf("failed to load node list data from metric store for cluster %s, subcluster %s: %s",
cluster, subCluster, err.Error())
return nil, err
}
}
// Generate statistics series for datasets with many series to improve visualization performance.
// Statistics are calculated as min/median/max.
const maxSeriesSize int = 8
for _, jd := range data {
for _, scopes := range jd {
for _, jm := range scopes {
if jm.StatisticsSeries != nil || len(jm.Series) < maxSeriesSize {
continue
}
jm.AddStatisticsSeries()
}
}
}
if data == nil {
return nil, fmt.Errorf("metric store for cluster '%s' does not support node list queries", cluster)
}
return data, nil
}
// deepCopy creates a deep copy of JobData to prevent cache corruption when modifying
// archived data (e.g., during resampling). This ensures the cached archive data remains
// immutable while allowing per-request transformations.
func deepCopy(source schema.JobData) schema.JobData {
result := make(schema.JobData, len(source))
for metricName, scopeMap := range source {
result[metricName] = make(map[schema.MetricScope]*schema.JobMetric, len(scopeMap))
for scope, jobMetric := range scopeMap {
result[metricName][scope] = copyJobMetric(jobMetric)
}
}
return result
}
func copyJobMetric(src *schema.JobMetric) *schema.JobMetric {
dst := &schema.JobMetric{
Timestep: src.Timestep,
Unit: src.Unit,
Series: make([]schema.Series, len(src.Series)),
}
for i := range src.Series {
dst.Series[i] = copySeries(&src.Series[i])
}
if src.StatisticsSeries != nil {
dst.StatisticsSeries = copyStatisticsSeries(src.StatisticsSeries)
}
return dst
}
func copySeries(src *schema.Series) schema.Series {
dst := schema.Series{
Hostname: src.Hostname,
Id: src.Id,
Statistics: src.Statistics,
Data: make([]schema.Float, len(src.Data)),
}
copy(dst.Data, src.Data)
return dst
}
func copyStatisticsSeries(src *schema.StatsSeries) *schema.StatsSeries {
dst := &schema.StatsSeries{
Min: make([]schema.Float, len(src.Min)),
Mean: make([]schema.Float, len(src.Mean)),
Median: make([]schema.Float, len(src.Median)),
Max: make([]schema.Float, len(src.Max)),
}
copy(dst.Min, src.Min)
copy(dst.Mean, src.Mean)
copy(dst.Median, src.Median)
copy(dst.Max, src.Max)
if len(src.Percentiles) > 0 {
dst.Percentiles = make(map[int][]schema.Float, len(src.Percentiles))
for percentile, values := range src.Percentiles {
dst.Percentiles[percentile] = make([]schema.Float, len(values))
copy(dst.Percentiles[percentile], values)
}
}
return dst
}


@@ -1,125 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package metricdispatcher
import (
"testing"
"github.com/ClusterCockpit/cc-lib/schema"
)
func TestDeepCopy(t *testing.T) {
nodeId := "0"
original := schema.JobData{
"cpu_load": {
schema.MetricScopeNode: &schema.JobMetric{
Timestep: 60,
Unit: schema.Unit{Base: "load", Prefix: ""},
Series: []schema.Series{
{
Hostname: "node001",
Id: &nodeId,
Data: []schema.Float{1.0, 2.0, 3.0},
Statistics: schema.MetricStatistics{
Min: 1.0,
Avg: 2.0,
Max: 3.0,
},
},
},
StatisticsSeries: &schema.StatsSeries{
Min: []schema.Float{1.0, 1.5, 2.0},
Mean: []schema.Float{2.0, 2.5, 3.0},
Median: []schema.Float{2.0, 2.5, 3.0},
Max: []schema.Float{3.0, 3.5, 4.0},
Percentiles: map[int][]schema.Float{
25: {1.5, 2.0, 2.5},
75: {2.5, 3.0, 3.5},
},
},
},
},
}
copied := deepCopy(original)
original["cpu_load"][schema.MetricScopeNode].Series[0].Data[0] = 999.0
original["cpu_load"][schema.MetricScopeNode].StatisticsSeries.Min[0] = 888.0
original["cpu_load"][schema.MetricScopeNode].StatisticsSeries.Percentiles[25][0] = 777.0
if copied["cpu_load"][schema.MetricScopeNode].Series[0].Data[0] != 1.0 {
t.Errorf("Series data was not deeply copied: got %v, want 1.0",
copied["cpu_load"][schema.MetricScopeNode].Series[0].Data[0])
}
if copied["cpu_load"][schema.MetricScopeNode].StatisticsSeries.Min[0] != 1.0 {
t.Errorf("StatisticsSeries was not deeply copied: got %v, want 1.0",
copied["cpu_load"][schema.MetricScopeNode].StatisticsSeries.Min[0])
}
if copied["cpu_load"][schema.MetricScopeNode].StatisticsSeries.Percentiles[25][0] != 1.5 {
t.Errorf("Percentiles was not deeply copied: got %v, want 1.5",
copied["cpu_load"][schema.MetricScopeNode].StatisticsSeries.Percentiles[25][0])
}
if copied["cpu_load"][schema.MetricScopeNode].Timestep != 60 {
t.Errorf("Timestep not copied correctly: got %v, want 60",
copied["cpu_load"][schema.MetricScopeNode].Timestep)
}
if copied["cpu_load"][schema.MetricScopeNode].Series[0].Hostname != "node001" {
t.Errorf("Hostname not copied correctly: got %v, want node001",
copied["cpu_load"][schema.MetricScopeNode].Series[0].Hostname)
}
}
func TestDeepCopyNilStatisticsSeries(t *testing.T) {
original := schema.JobData{
"mem_used": {
schema.MetricScopeNode: &schema.JobMetric{
Timestep: 60,
Series: []schema.Series{
{
Hostname: "node001",
Data: []schema.Float{1.0, 2.0},
},
},
StatisticsSeries: nil,
},
},
}
copied := deepCopy(original)
if copied["mem_used"][schema.MetricScopeNode].StatisticsSeries != nil {
t.Errorf("StatisticsSeries should be nil, got %v",
copied["mem_used"][schema.MetricScopeNode].StatisticsSeries)
}
}
func TestDeepCopyEmptyPercentiles(t *testing.T) {
original := schema.JobData{
"cpu_load": {
schema.MetricScopeNode: &schema.JobMetric{
Timestep: 60,
Series: []schema.Series{},
StatisticsSeries: &schema.StatsSeries{
Min: []schema.Float{1.0},
Mean: []schema.Float{2.0},
Median: []schema.Float{2.0},
Max: []schema.Float{3.0},
Percentiles: nil,
},
},
},
}
copied := deepCopy(original)
if copied["cpu_load"][schema.MetricScopeNode].StatisticsSeries.Percentiles != nil {
t.Errorf("Percentiles should be nil when source is nil/empty")
}
}


@@ -1,68 +0,0 @@
// Copyright (C) NHR@FAU, University Erlangen-Nuremberg.
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package repository
import "time"
// RepositoryConfig holds configuration for repository operations.
// All fields have sensible defaults, so this configuration is optional.
type RepositoryConfig struct {
// CacheSize is the LRU cache size in bytes for job metadata and energy footprints.
// Default: 1MB (1024 * 1024 bytes)
CacheSize int
// MaxOpenConnections is the maximum number of open database connections.
// Default: 4
MaxOpenConnections int
// MaxIdleConnections is the maximum number of idle database connections.
// Default: 4
MaxIdleConnections int
// ConnectionMaxLifetime is the maximum amount of time a connection may be reused.
// Default: 1 hour
ConnectionMaxLifetime time.Duration
// ConnectionMaxIdleTime is the maximum amount of time a connection may be idle.
// Default: 1 hour
ConnectionMaxIdleTime time.Duration
// MinRunningJobDuration is the minimum duration in seconds for a job to be
// considered in "running jobs" queries. This filters out very short jobs.
// Default: 600 seconds (10 minutes)
MinRunningJobDuration int
}
// DefaultConfig returns the default repository configuration.
// These values are optimized for typical deployments.
func DefaultConfig() *RepositoryConfig {
return &RepositoryConfig{
CacheSize: 1 * 1024 * 1024, // 1MB
MaxOpenConnections: 4,
MaxIdleConnections: 4,
ConnectionMaxLifetime: time.Hour,
ConnectionMaxIdleTime: time.Hour,
MinRunningJobDuration: 600, // 10 minutes
}
}
// repoConfig is the package-level configuration instance.
// It is initialized with defaults and can be overridden via SetConfig.
var repoConfig *RepositoryConfig = DefaultConfig()
// SetConfig sets the repository configuration.
// This must be called before any repository initialization (Connect, GetJobRepository, etc.).
// If not called, default values from DefaultConfig() are used.
func SetConfig(cfg *RepositoryConfig) {
if cfg != nil {
repoConfig = cfg
}
}
// GetConfig returns the current repository configuration.
func GetConfig() *RepositoryConfig {
return repoConfig
}
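
Put together, the intended wiring reads as follows (a sketch assembled from the doc comments above; the database path is illustrative):

	repository.SetConfig(&repository.RepositoryConfig{
		CacheSize:             2 * 1024 * 1024, // 2MB cache
		MaxOpenConnections:    8,
		MinRunningJobDuration: 300,
	})
	repository.Connect("sqlite3", "./var/job.db")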


@@ -2,7 +2,6 @@
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package repository
import (
@@ -36,15 +35,21 @@ type DatabaseOptions struct {
	ConnectionMaxIdleTime time.Duration
}
-func setupSqlite(db *sql.DB) error {
+func setupSqlite(db *sql.DB) (err error) {
	pragmas := []string{
+		// "journal_mode = WAL",
+		// "busy_timeout = 5000",
+		// "synchronous = NORMAL",
+		// "cache_size = 1000000000", // 1GB
+		// "foreign_keys = true",
		"temp_store = memory",
+		// "mmap_size = 3000000000",
	}
	for _, pragma := range pragmas {
-		_, err := db.Exec("PRAGMA " + pragma)
+		_, err = db.Exec("PRAGMA " + pragma)
		if err != nil {
-			return err
+			return
		}
	}
@@ -55,44 +60,45 @@ func Connect(driver string, db string) {
	var err error
	var dbHandle *sqlx.DB
-	if driver != "sqlite3" {
-		cclog.Abortf("Unsupported database driver '%s'. Only 'sqlite3' is supported.\n", driver)
-	}
	dbConnOnce.Do(func() {
		opts := DatabaseOptions{
			URL:                   db,
-			MaxOpenConnections:    repoConfig.MaxOpenConnections,
-			MaxIdleConnections:    repoConfig.MaxIdleConnections,
-			ConnectionMaxLifetime: repoConfig.ConnectionMaxLifetime,
-			ConnectionMaxIdleTime: repoConfig.ConnectionMaxIdleTime,
+			MaxOpenConnections:    4,
+			MaxIdleConnections:    4,
+			ConnectionMaxLifetime: time.Hour,
+			ConnectionMaxIdleTime: time.Hour,
		}
-		// TODO: Have separate DB handles for Writes and Reads
-		// Optimize SQLite connection: https://kerkour.com/sqlite-for-servers
-		connectionURLParams := make(url.Values)
-		connectionURLParams.Add("_txlock", "immediate")
-		connectionURLParams.Add("_journal_mode", "WAL")
-		connectionURLParams.Add("_busy_timeout", "5000")
-		connectionURLParams.Add("_synchronous", "NORMAL")
-		connectionURLParams.Add("_cache_size", "1000000000")
-		connectionURLParams.Add("_foreign_keys", "true")
-		opts.URL = fmt.Sprintf("file:%s?%s", opts.URL, connectionURLParams.Encode())
+		switch driver {
+		case "sqlite3":
+			// TODO: Have separate DB handles for Writes and Reads
+			// Optimize SQLite connection: https://kerkour.com/sqlite-for-servers
+			connectionUrlParams := make(url.Values)
+			connectionUrlParams.Add("_txlock", "immediate")
+			connectionUrlParams.Add("_journal_mode", "WAL")
+			connectionUrlParams.Add("_busy_timeout", "5000")
+			connectionUrlParams.Add("_synchronous", "NORMAL")
+			connectionUrlParams.Add("_cache_size", "1000000000")
+			connectionUrlParams.Add("_foreign_keys", "true")
+			opts.URL = fmt.Sprintf("file:%s?%s", opts.URL, connectionUrlParams.Encode())
		if cclog.Loglevel() == "debug" {
			sql.Register("sqlite3WithHooks", sqlhooks.Wrap(&sqlite3.SQLiteDriver{}, &Hooks{}))
			dbHandle, err = sqlx.Open("sqlite3WithHooks", opts.URL)
		} else {
			dbHandle, err = sqlx.Open("sqlite3", opts.URL)
		}
+			setupSqlite(dbHandle.DB)
+		case "mysql":
+			opts.URL += "?multiStatements=true"
+			dbHandle, err = sqlx.Open("mysql", opts.URL)
+		default:
+			cclog.Abortf("DB Connection: Unsupported database driver '%s'.\n", driver)
+		}
		if err != nil {
-			cclog.Abortf("DB Connection: Could not connect to SQLite database with sqlx.Open().\nError: %s\n", err.Error())
-		}
-		err = setupSqlite(dbHandle.DB)
-		if err != nil {
-			cclog.Abortf("Failed sqlite db setup.\nError: %s\n", err.Error())
+			cclog.Abortf("DB Connection: Could not connect to '%s' database with sqlx.Open().\nError: %s\n", driver, err.Error())
		}
		dbHandle.SetMaxOpenConns(opts.MaxOpenConnections)
@@ -101,7 +107,7 @@ func Connect(driver string, db string) {
	dbHandle.SetConnMaxIdleTime(opts.ConnectionMaxIdleTime)
	dbConnInstance = &DBConnection{DB: dbHandle, Driver: driver}
-	err = checkDBVersion(dbHandle.DB)
+	err = checkDBVersion(driver, dbHandle.DB)
	if err != nil {
		cclog.Abortf("DB Connection: Failed DB version check.\nError: %s\n", err.Error())
	}
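
For a database at ./var/job.db (path illustrative), the sqlite3 branch above therefore opens a DSN along these lines; url.Values.Encode sorts the parameters alphabetically:

	file:./var/job.db?_busy_timeout=5000&_cache_size=1000000000&_foreign_keys=true&_journal_mode=WAL&_synchronous=NORMAL&_txlock=immediate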


@@ -2,61 +2,6 @@
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
// Package repository provides the data access layer for cc-backend using the repository pattern.
//
// The repository pattern abstracts database operations and provides a clean interface for
// data access. Each major entity (Job, User, Node, Tag) has its own repository with CRUD
// operations and specialized queries.
//
// # Database Connection
//
// Initialize the database connection before using any repository:
//
// repository.Connect("sqlite3", "./var/job.db")
//
// # Configuration
//
// Optional: Configure repository settings before initialization:
//
// repository.SetConfig(&repository.RepositoryConfig{
// CacheSize: 2 * 1024 * 1024, // 2MB cache
// MaxOpenConnections: 8, // Connection pool size
// MinRunningJobDuration: 300, // Filter threshold
// })
//
// If not configured, sensible defaults are used automatically.
//
// # Repositories
//
// - JobRepository: Job lifecycle management and querying
// - UserRepository: User management and authentication
// - NodeRepository: Cluster node state tracking
// - Tags: Job tagging and categorization
//
// # Caching
//
// Repositories use LRU caching to improve performance. Cache keys are constructed
// as "type:id" (e.g., "metadata:123"). Cache is automatically invalidated on
// mutations to maintain consistency.
//
// # Transaction Support
//
// For batch operations, use transactions:
//
// t, err := jobRepo.TransactionInit()
// if err != nil {
// return err
// }
// defer t.Rollback() // Rollback if not committed
//
// // Perform operations...
// jobRepo.TransactionAdd(t, query, args...)
//
// // Commit when done
// if err := t.Commit(); err != nil {
// return err
// }
package repository
import (
@@ -100,7 +45,7 @@ func GetJobRepository() *JobRepository {
			driver:    db.Driver,
			stmtCache: sq.NewStmtCache(db.DB),
-			cache:     lrucache.New(repoConfig.CacheSize),
+			cache:     lrucache.New(1024 * 1024),
		}
	})
	return jobRepoInstance
@@ -109,7 +54,7 @@ func GetJobRepository() *JobRepository {
var jobColumns []string = []string{
	"job.id", "job.job_id", "job.hpc_user", "job.project", "job.cluster", "job.subcluster",
	"job.start_time", "job.cluster_partition", "job.array_job_id", "job.num_nodes",
-	"job.num_hwthreads", "job.num_acc", "job.shared", "job.monitoring_status",
+	"job.num_hwthreads", "job.num_acc", "job.exclusive", "job.monitoring_status",
	"job.smt", "job.job_state", "job.duration", "job.walltime", "job.resources",
	"job.footprint", "job.energy",
}
@@ -118,7 +63,7 @@ var jobCacheColumns []string = []string{
	"job_cache.id", "job_cache.job_id", "job_cache.hpc_user", "job_cache.project", "job_cache.cluster",
	"job_cache.subcluster", "job_cache.start_time", "job_cache.cluster_partition",
	"job_cache.array_job_id", "job_cache.num_nodes", "job_cache.num_hwthreads",
-	"job_cache.num_acc", "job_cache.shared", "job_cache.monitoring_status", "job_cache.smt",
+	"job_cache.num_acc", "job_cache.exclusive", "job_cache.monitoring_status", "job_cache.smt",
	"job_cache.job_state", "job_cache.duration", "job_cache.walltime", "job_cache.resources",
	"job_cache.footprint", "job_cache.energy",
}
@@ -128,8 +73,8 @@ func scanJob(row interface{ Scan(...any) error }) (*schema.Job, error) {
	if err := row.Scan(
		&job.ID, &job.JobID, &job.User, &job.Project, &job.Cluster, &job.SubCluster,
-		&job.StartTime, &job.Partition, &job.ArrayJobID, &job.NumNodes, &job.NumHWThreads,
-		&job.NumAcc, &job.Shared, &job.MonitoringStatus, &job.SMT, &job.State,
+		&job.StartTime, &job.Partition, &job.ArrayJobId, &job.NumNodes, &job.NumHWThreads,
+		&job.NumAcc, &job.Exclusive, &job.MonitoringStatus, &job.SMT, &job.State,
		&job.Duration, &job.Walltime, &job.RawResources, &job.RawFootprint, &job.Energy); err != nil {
		cclog.Warnf("Error while scanning rows (Job): %v", err)
		return nil, err
@@ -156,22 +101,52 @@ func scanJob(row interface{ Scan(...any) error }) (*schema.Job, error) {
}
func (r *JobRepository) Optimize() error {
-	if _, err := r.DB.Exec(`VACUUM`); err != nil {
-		return err
+	var err error
+	switch r.driver {
+	case "sqlite3":
+		if _, err = r.DB.Exec(`VACUUM`); err != nil {
+			return err
+		}
+	case "mysql":
+		cclog.Info("Optimize currently not supported for mysql driver")
	}
	return nil
}
func (r *JobRepository) Flush() error {
-	if _, err := r.DB.Exec(`DELETE FROM jobtag`); err != nil {
-		return err
-	}
-	if _, err := r.DB.Exec(`DELETE FROM tag`); err != nil {
-		return err
-	}
-	if _, err := r.DB.Exec(`DELETE FROM job`); err != nil {
-		return err
+	var err error
+	switch r.driver {
+	case "sqlite3":
+		if _, err = r.DB.Exec(`DELETE FROM jobtag`); err != nil {
+			return err
+		}
+		if _, err = r.DB.Exec(`DELETE FROM tag`); err != nil {
+			return err
+		}
+		if _, err = r.DB.Exec(`DELETE FROM job`); err != nil {
+			return err
+		}
+	case "mysql":
+		if _, err = r.DB.Exec(`SET FOREIGN_KEY_CHECKS = 0`); err != nil {
+			return err
+		}
+		if _, err = r.DB.Exec(`TRUNCATE TABLE jobtag`); err != nil {
+			return err
+		}
+		if _, err = r.DB.Exec(`TRUNCATE TABLE tag`); err != nil {
+			return err
+		}
+		if _, err = r.DB.Exec(`TRUNCATE TABLE job`); err != nil {
+			return err
+		}
+		if _, err = r.DB.Exec(`SET FOREIGN_KEY_CHECKS = 1`); err != nil {
+			return err
+		}
	}
	return nil
}
@@ -289,50 +264,11 @@ func (r *JobRepository) FetchEnergyFootprint(job *schema.Job) (map[string]float64
	return job.EnergyFootprint, nil
}
-func (r *JobRepository) DeleteJobsBefore(startTime int64, omitTagged bool) (int, error) {
+func (r *JobRepository) DeleteJobsBefore(startTime int64) (int, error) {
	var cnt int
	q := sq.Select("count(*)").From("job").Where("job.start_time < ?", startTime)
-	if omitTagged {
-		q = q.Where("NOT EXISTS (SELECT 1 FROM jobtag WHERE jobtag.job_id = job.id)")
-	}
-	if err := q.RunWith(r.DB).QueryRow().Scan(&cnt); err != nil {
-		cclog.Errorf("Error counting jobs before %d: %v", startTime, err)
-		return 0, err
-	}
-	// Invalidate cache for jobs being deleted (get job IDs first)
-	if cnt > 0 {
-		var jobIds []int64
-		selectQuery := sq.Select("id").From("job").Where("job.start_time < ?", startTime)
-		if omitTagged {
-			selectQuery = selectQuery.Where("NOT EXISTS (SELECT 1 FROM jobtag WHERE jobtag.job_id = job.id)")
-		}
-		rows, err := selectQuery.RunWith(r.DB).Query()
-		if err == nil {
-			defer rows.Close()
-			for rows.Next() {
-				var id int64
-				if err := rows.Scan(&id); err == nil {
-					jobIds = append(jobIds, id)
-				}
-			}
-			// Invalidate cache entries
-			for _, id := range jobIds {
-				r.cache.Del(fmt.Sprintf("metadata:%d", id))
-				r.cache.Del(fmt.Sprintf("energyFootprint:%d", id))
-			}
-		}
-	}
+	q.RunWith(r.DB).QueryRow().Scan(cnt)
	qd := sq.Delete("job").Where("job.start_time < ?", startTime)
-	if omitTagged {
-		qd = qd.Where("NOT EXISTS (SELECT 1 FROM jobtag WHERE jobtag.job_id = job.id)")
-	}
	_, err := qd.RunWith(r.DB).Exec()
	if err != nil {
@@ -344,11 +280,7 @@ func (r *JobRepository) DeleteJobsBefore(startTime int64, omitTagged bool) (int,
	return cnt, err
}
-func (r *JobRepository) DeleteJobByID(id int64) error {
-	// Invalidate cache entries before deletion
-	r.cache.Del(fmt.Sprintf("metadata:%d", id))
-	r.cache.Del(fmt.Sprintf("energyFootprint:%d", id))
+func (r *JobRepository) DeleteJobById(id int64) error {
	qd := sq.Delete("job").Where("job.id = ?", id)
	_, err := qd.RunWith(r.DB).Exec()
@@ -405,10 +337,10 @@ func (r *JobRepository) FindColumnValue(user *schema.User, searchterm string, ta
	// theSql, args, theErr := theQuery.ToSql()
	// if theErr != nil {
-	// 	cclog.Warn("Error while converting query to sql")
+	// 	log.Warn("Error while converting query to sql")
	// 	return "", err
	// }
-	// cclog.Debugf("SQL query (FindColumnValue): `%s`, args: %#v", theSql, args)
+	// log.Debugf("SQL query (FindColumnValue): `%s`, args: %#v", theSql, args)
	err := theQuery.RunWith(r.stmtCache).QueryRow().Scan(&result)
@@ -518,14 +450,13 @@ func (r *JobRepository) AllocatedNodes(cluster string) (map[string]map[string]in
// FIXME: Set duration to requested walltime?
func (r *JobRepository) StopJobsExceedingWalltimeBy(seconds int) error {
	start := time.Now()
-	currentTime := time.Now().Unix()
	res, err := sq.Update("job").
		Set("monitoring_status", schema.MonitoringStatusArchivingFailed).
		Set("duration", 0).
		Set("job_state", schema.JobStateFailed).
		Where("job.job_state = 'running'").
		Where("job.walltime > 0").
-		Where("(? - job.start_time) > (job.walltime + ?)", currentTime, seconds).
+		Where(fmt.Sprintf("(%d - job.start_time) > (job.walltime + %d)", time.Now().Unix(), seconds)).
		RunWith(r.DB).Exec()
	if err != nil {
		cclog.Warn("Error while stopping jobs exceeding walltime")
@@ -545,10 +476,10 @@ func (r *JobRepository) StopJobsExceedingWalltimeBy(seconds int) error {
	return nil
}
-func (r *JobRepository) FindJobIdsByTag(tagID int64) ([]int64, error) {
+func (r *JobRepository) FindJobIdsByTag(tagId int64) ([]int64, error) {
	query := sq.Select("job.id").From("job").
		Join("jobtag ON jobtag.job_id = job.id").
-		Where(sq.Eq{"jobtag.tag_id": tagID}).Distinct()
+		Where(sq.Eq{"jobtag.tag_id": tagId}).Distinct()
	rows, err := query.RunWith(r.stmtCache).Query()
	if err != nil {
		cclog.Error("Error while running query")
@@ -557,15 +488,15 @@ func (r *JobRepository) FindJobIdsByTag(tagID int64) ([]int64, error) {
	jobIds := make([]int64, 0, 100)
	for rows.Next() {
-		var jobID int64
-		if err := rows.Scan(&jobID); err != nil {
+		var jobId int64
+		if err := rows.Scan(&jobId); err != nil {
			rows.Close()
			cclog.Warn("Error while scanning rows")
			return nil, err
		}
-		jobIds = append(jobIds, jobID)
+		jobIds = append(jobIds, jobId)
	}
	return jobIds, nil
@@ -574,21 +505,21 @@ func (r *JobRepository) FindJobIdsByTag(tagID int64) ([]int64, error) {
// FIXME: Reconsider filtering short jobs with hardcoded threshold
func (r *JobRepository) FindRunningJobs(cluster string) ([]*schema.Job, error) {
	query := sq.Select(jobColumns...).From("job").
-		Where("job.cluster = ?", cluster).
+		Where(fmt.Sprintf("job.cluster = '%s'", cluster)).
		Where("job.job_state = 'running'").
-		Where("job.duration > ?", repoConfig.MinRunningJobDuration)
+		Where("job.duration > 600")
	rows, err := query.RunWith(r.stmtCache).Query()
	if err != nil {
		cclog.Error("Error while running query")
		return nil, err
	}
-	defer rows.Close()
	jobs := make([]*schema.Job, 0, 50)
	for rows.Next() {
		job, err := scanJob(rows)
		if err != nil {
+			rows.Close()
			cclog.Warn("Error while scanning rows")
			return nil, err
		}
@@ -612,7 +543,7 @@ func (r *JobRepository) UpdateDuration() error {
	return nil
}
-func (r *JobRepository) FindJobsBetween(startTimeBegin int64, startTimeEnd int64, omitTagged bool) ([]*schema.Job, error) {
+func (r *JobRepository) FindJobsBetween(startTimeBegin int64, startTimeEnd int64) ([]*schema.Job, error) {
	var query sq.SelectBuilder
	if startTimeBegin == startTimeEnd || startTimeBegin > startTimeEnd {
@@ -621,14 +552,12 @@ func (r *JobRepository) FindJobsBetween(startTimeBegin int64, startTimeEnd int64
	if startTimeBegin == 0 {
		cclog.Infof("Find jobs before %d", startTimeEnd)
-		query = sq.Select(jobColumns...).From("job").Where("job.start_time < ?", startTimeEnd)
+		query = sq.Select(jobColumns...).From("job").Where(fmt.Sprintf(
+			"job.start_time < %d", startTimeEnd))
	} else {
		cclog.Infof("Find jobs between %d and %d", startTimeBegin, startTimeEnd)
-		query = sq.Select(jobColumns...).From("job").Where("job.start_time BETWEEN ? AND ?", startTimeBegin, startTimeEnd)
-	}
-	if omitTagged {
-		query = query.Where("NOT EXISTS (SELECT 1 FROM jobtag WHERE jobtag.job_id = job.id)")
+		query = sq.Select(jobColumns...).From("job").Where(fmt.Sprintf(
+			"job.start_time BETWEEN %d AND %d", startTimeBegin, startTimeEnd))
	}
	rows, err := query.RunWith(r.stmtCache).Query()
@@ -636,12 +565,12 @@ func (r *JobRepository) FindJobsBetween(startTimeBegin int64, startTimeEnd int64
		cclog.Error("Error while running query")
		return nil, err
	}
-	defer rows.Close()
	jobs := make([]*schema.Job, 0, 50)
	for rows.Next() {
		job, err := scanJob(rows)
		if err != nil {
+			rows.Close()
			cclog.Warn("Error while scanning rows")
			return nil, err
		}
@@ -653,16 +582,12 @@ func (r *JobRepository) FindJobsBetween(startTimeBegin int64, startTimeEnd int64
}
func (r *JobRepository) UpdateMonitoringStatus(job int64, monitoringStatus int32) (err error) {
-	// Invalidate cache entries as monitoring status affects job state
-	r.cache.Del(fmt.Sprintf("metadata:%d", job))
-	r.cache.Del(fmt.Sprintf("energyFootprint:%d", job))
	stmt := sq.Update("job").
		Set("monitoring_status", monitoringStatus).
		Where("job.id = ?", job)
	_, err = stmt.RunWith(r.stmtCache).Exec()
-	return err
+	return
}
func (r *JobRepository) Execute(stmt sq.UpdateBuilder) error {
@@ -699,11 +624,10 @@ func (r *JobRepository) UpdateEnergy(
	metricEnergy := 0.0
	if i, err := archive.MetricIndex(sc.MetricConfig, fp); err == nil {
		// Note: For DB data, calculate and save as kWh
-		switch sc.MetricConfig[i].Energy {
-		case "energy": // this metric has energy as unit (Joules or Wh)
+		if sc.MetricConfig[i].Energy == "energy" { // this metric has energy as unit (Joules or Wh)
			cclog.Warnf("Update EnergyFootprint for Job %d and Metric %s on cluster %s: Set to 'energy' in cluster.json: Not implemented, will return 0.0", jobMeta.JobID, jobMeta.Cluster, fp)
			// FIXME: Needs sum as stats type
-		case "power": // this metric has power as unit (Watt)
+		} else if sc.MetricConfig[i].Energy == "power" { // this metric has power as unit (Watt)
			// Energy: Power (in Watts) * Time (in Seconds)
			// Unit: (W * (s / 3600)) / 1000 = kWh
			// Round 2 Digits: round(Energy * 100) / 100
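
Worked through with illustrative numbers: a node averaging 250 W over a 7200 s job yields (250 * (7200.0 / 3600.0)) / 1000 = 0.5 kWh, i.e.

	metricEnergy = math.Round(((250.0*(7200.0/3600.0))/1000.0)*100) / 100 // 0.5 kWh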


@@ -2,7 +2,6 @@
// All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file.
package repository
import (
@@ -16,25 +15,24 @@ import (
const NamedJobCacheInsert string = `INSERT INTO job_cache (
	job_id, hpc_user, project, cluster, subcluster, cluster_partition, array_job_id, num_nodes, num_hwthreads, num_acc,
-	shared, monitoring_status, smt, job_state, start_time, duration, walltime, footprint, energy, energy_footprint, resources, meta_data
+	exclusive, monitoring_status, smt, job_state, start_time, duration, walltime, footprint, energy, energy_footprint, resources, meta_data
) VALUES (
	:job_id, :hpc_user, :project, :cluster, :subcluster, :cluster_partition, :array_job_id, :num_nodes, :num_hwthreads, :num_acc,
-	:shared, :monitoring_status, :smt, :job_state, :start_time, :duration, :walltime, :footprint, :energy, :energy_footprint, :resources, :meta_data
+	:exclusive, :monitoring_status, :smt, :job_state, :start_time, :duration, :walltime, :footprint, :energy, :energy_footprint, :resources, :meta_data
);`
const NamedJobInsert string = `INSERT INTO job (
	job_id, hpc_user, project, cluster, subcluster, cluster_partition, array_job_id, num_nodes, num_hwthreads, num_acc,
-	shared, monitoring_status, smt, job_state, start_time, duration, walltime, footprint, energy, energy_footprint, resources, meta_data
+	exclusive, monitoring_status, smt, job_state, start_time, duration, walltime, footprint, energy, energy_footprint, resources, meta_data
) VALUES (
	:job_id, :hpc_user, :project, :cluster, :subcluster, :cluster_partition, :array_job_id, :num_nodes, :num_hwthreads, :num_acc,
-	:shared, :monitoring_status, :smt, :job_state, :start_time, :duration, :walltime, :footprint, :energy, :energy_footprint, :resources, :meta_data
+	:exclusive, :monitoring_status, :smt, :job_state, :start_time, :duration, :walltime, :footprint, :energy, :energy_footprint, :resources, :meta_data
);`
func (r *JobRepository) InsertJob(job *schema.Job) (int64, error) {
	r.Mutex.Lock()
-	defer r.Mutex.Unlock()
	res, err := r.DB.NamedExec(NamedJobCacheInsert, job)
+	r.Mutex.Unlock()
	if err != nil {
		cclog.Warn("Error while NamedJobInsert")
		return 0, err
@@ -59,12 +57,12 @@ func (r *JobRepository) SyncJobs() ([]*schema.Job, error) {
		cclog.Errorf("Error while running query %v", err)
		return nil, err
	}
-	defer rows.Close()
	jobs := make([]*schema.Job, 0, 50)
	for rows.Next() {
		job, err := scanJob(rows)
		if err != nil {
+			rows.Close()
			cclog.Warn("Error while scanning rows")
			return nil, err
		}
@@ -72,7 +70,7 @@ func (r *JobRepository) SyncJobs() ([]*schema.Job, error) {
	}
	_, err = r.DB.Exec(
-		"INSERT INTO job (job_id, cluster, subcluster, start_time, hpc_user, project, cluster_partition, array_job_id, num_nodes, num_hwthreads, num_acc, shared, monitoring_status, smt, job_state, duration, walltime, footprint, energy, energy_footprint, resources, meta_data) SELECT job_id, cluster, subcluster, start_time, hpc_user, project, cluster_partition, array_job_id, num_nodes, num_hwthreads, num_acc, shared, monitoring_status, smt, job_state, duration, walltime, footprint, energy, energy_footprint, resources, meta_data FROM job_cache")
+		"INSERT INTO job (job_id, cluster, subcluster, start_time, hpc_user, project, cluster_partition, array_job_id, num_nodes, num_hwthreads, num_acc, exclusive, monitoring_status, smt, job_state, duration, walltime, footprint, energy, energy_footprint, resources, meta_data) SELECT job_id, cluster, subcluster, start_time, hpc_user, project, cluster_partition, array_job_id, num_nodes, num_hwthreads, num_acc, exclusive, monitoring_status, smt, job_state, duration, walltime, footprint, energy, energy_footprint, resources, meta_data FROM job_cache")
	if err != nil {
		cclog.Warnf("Error while Job sync: %v", err)
		return nil, err
@@ -110,39 +108,33 @@ func (r *JobRepository) Start(job *schema.Job) (id int64, err error) {
// Stop updates the job with the database id jobId using the provided arguments.
func (r *JobRepository) Stop(
-	jobID int64,
+	jobId int64,
	duration int32,
	state schema.JobState,
	monitoringStatus int32,
) (err error) {
-	// Invalidate cache entries as job state is changing
-	r.cache.Del(fmt.Sprintf("metadata:%d", jobID))
-	r.cache.Del(fmt.Sprintf("energyFootprint:%d", jobID))
	stmt := sq.Update("job").
		Set("job_state", state).
		Set("duration", duration).
		Set("monitoring_status", monitoringStatus).
-		Where("job.id = ?", jobID)
+		Where("job.id = ?", jobId)
	_, err = stmt.RunWith(r.stmtCache).Exec()
-	return err
+	return
}
func (r *JobRepository) StopCached(
-	jobID int64,
+	jobId int64,
	duration int32,
	state schema.JobState,
	monitoringStatus int32,
) (err error) {
-	// Note: StopCached updates job_cache table, not the main job table
-	// Cache invalidation happens when job is synced to main table
	stmt := sq.Update("job_cache").
		Set("job_state", state).
		Set("duration", duration).
		Set("monitoring_status", monitoringStatus).
-		Where("job_cache.id = ?", jobID)
+		Where("job.id = ?", jobId)
	_, err = stmt.RunWith(r.stmtCache).Exec()
-	return err
+	return
}


@@ -2,7 +2,6 @@
// All rights reserved. This file is part of cc-backend. // All rights reserved. This file is part of cc-backend.
// Use of this source code is governed by a MIT-style // Use of this source code is governed by a MIT-style
// license that can be found in the LICENSE file. // license that can be found in the LICENSE file.
package repository package repository
import ( import (
@@ -23,13 +22,13 @@ import (
// It returns a pointer to a schema.Job data structure and an error variable. // It returns a pointer to a schema.Job data structure and an error variable.
// To check if no job was found test err == sql.ErrNoRows // To check if no job was found test err == sql.ErrNoRows
func (r *JobRepository) Find( func (r *JobRepository) Find(
jobID *int64, jobId *int64,
cluster *string, cluster *string,
startTime *int64, startTime *int64,
) (*schema.Job, error) { ) (*schema.Job, error) {
start := time.Now() start := time.Now()
q := sq.Select(jobColumns...).From("job"). q := sq.Select(jobColumns...).From("job").
Where("job.job_id = ?", *jobID) Where("job.job_id = ?", *jobId)
if cluster != nil { if cluster != nil {
q = q.Where("job.cluster = ?", *cluster) q = q.Where("job.cluster = ?", *cluster)
@@ -45,12 +44,12 @@ func (r *JobRepository) Find(
} }
func (r *JobRepository) FindCached( func (r *JobRepository) FindCached(
jobID *int64, jobId *int64,
cluster *string, cluster *string,
startTime *int64, startTime *int64,
) (*schema.Job, error) { ) (*schema.Job, error) {
q := sq.Select(jobCacheColumns...).From("job_cache"). q := sq.Select(jobCacheColumns...).From("job_cache").
Where("job_cache.job_id = ?", *jobID) Where("job_cache.job_id = ?", *jobId)
if cluster != nil { if cluster != nil {
q = q.Where("job_cache.cluster = ?", *cluster) q = q.Where("job_cache.cluster = ?", *cluster)
@@ -64,19 +63,19 @@ func (r *JobRepository) FindCached(
return scanJob(q.RunWith(r.stmtCache).QueryRow()) return scanJob(q.RunWith(r.stmtCache).QueryRow())
} }
// FindAll executes a SQL query to find all batch jobs matching the given criteria. // Find executes a SQL query to find a specific batch job.
// Jobs are queried using the batch job id, and optionally filtered by cluster name // The job is queried using the batch job id, the cluster name,
// and start time (UNIX epoch time seconds). // and the start time of the job in UNIX epoch time seconds.
// It returns a slice of pointers to schema.Job data structures and an error variable. // It returns a pointer to a schema.Job data structure and an error variable.
// An empty slice is returned if no matching jobs are found. // To check if no job was found test err == sql.ErrNoRows
func (r *JobRepository) FindAll( func (r *JobRepository) FindAll(
jobID *int64, jobId *int64,
cluster *string, cluster *string,
startTime *int64, startTime *int64,
) ([]*schema.Job, error) { ) ([]*schema.Job, error) {
start := time.Now() start := time.Now()
q := sq.Select(jobColumns...).From("job"). q := sq.Select(jobColumns...).From("job").
Where("job.job_id = ?", *jobID) Where("job.job_id = ?", *jobId)
if cluster != nil { if cluster != nil {
q = q.Where("job.cluster = ?", *cluster) q = q.Where("job.cluster = ?", *cluster)
@@ -90,7 +89,6 @@ func (r *JobRepository) FindAll(
cclog.Error("Error while running query") cclog.Error("Error while running query")
return nil, err return nil, err
} }
defer rows.Close()
jobs := make([]*schema.Job, 0, 10) jobs := make([]*schema.Job, 0, 10)
for rows.Next() { for rows.Next() {
@@ -105,31 +103,25 @@ func (r *JobRepository) FindAll(
return jobs, nil return jobs, nil
} }
-// GetJobList returns job IDs for non-running jobs.
+// Get complete joblist only consisting of db ids.
 // This is useful to process large job counts and intended to be used
-// together with FindById to process jobs one by one.
-// Use limit and offset for pagination. Use limit=0 to get all results (not recommended for large datasets).
-func (r *JobRepository) GetJobList(limit int, offset int) ([]int64, error) {
+// together with FindById to process jobs one by one
+func (r *JobRepository) GetJobList() ([]int64, error) {
     query := sq.Select("id").From("job").
         Where("job.job_state != 'running'")
-    // Add pagination if limit is specified
-    if limit > 0 {
-        query = query.Limit(uint64(limit)).Offset(uint64(offset))
-    }
     rows, err := query.RunWith(r.stmtCache).Query()
     if err != nil {
         cclog.Error("Error while running query")
         return nil, err
     }
-    defer rows.Close()
     jl := make([]int64, 0, 1000)
     for rows.Next() {
         var id int64
         err := rows.Scan(&id)
         if err != nil {
+            rows.Close()
             cclog.Warn("Error while scanning rows")
             return nil, err
         }

@@ -140,13 +132,13 @@ func (r *JobRepository) GetJobList(limit int, offset int) ([]int64, error) {
     return jl, nil
 }
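On the base side the added limit/offset pair makes it possible to drain large job tables page by page. A sketch, with page size and per-job work hypothetical and r the JobRepository from above:

    const pageSize = 1000
    for offset := 0; ; offset += pageSize {
        ids, err := r.GetJobList(pageSize, offset)
        if err != nil {
            return err
        }
        for _, id := range ids {
            job, err := r.FindByIDDirect(id) // base-side spelling
            if err != nil {
                return err
            }
            process(job) // hypothetical per-job work
        }
        if len(ids) < pageSize {
            break // last page reached
        }
    }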
-// FindByID executes a SQL query to find a specific batch job.
+// FindById executes a SQL query to find a specific batch job.
 // The job is queried using the database id.
 // It returns a pointer to a schema.Job data structure and an error variable.
 // To check if no job was found test err == sql.ErrNoRows
-func (r *JobRepository) FindByID(ctx context.Context, jobID int64) (*schema.Job, error) {
+func (r *JobRepository) FindById(ctx context.Context, jobId int64) (*schema.Job, error) {
     q := sq.Select(jobColumns...).
-        From("job").Where("job.id = ?", jobID)
+        From("job").Where("job.id = ?", jobId)
     q, qerr := SecurityCheck(ctx, q)
     if qerr != nil {

@@ -156,14 +148,14 @@ func (r *JobRepository) FindByID(ctx context.Context, jobID int64) (*schema.Job,
     return scanJob(q.RunWith(r.stmtCache).QueryRow())
 }
-// FindByIDWithUser executes a SQL query to find a specific batch job.
+// FindByIdWithUser executes a SQL query to find a specific batch job.
 // The job is queried using the database id. The user is passed directly,
 // instead as part of the context.
 // It returns a pointer to a schema.Job data structure and an error variable.
 // To check if no job was found test err == sql.ErrNoRows
-func (r *JobRepository) FindByIDWithUser(user *schema.User, jobID int64) (*schema.Job, error) {
+func (r *JobRepository) FindByIdWithUser(user *schema.User, jobId int64) (*schema.Job, error) {
     q := sq.Select(jobColumns...).
-        From("job").Where("job.id = ?", jobID)
+        From("job").Where("job.id = ?", jobId)
     q, qerr := SecurityCheckWithUser(user, q)
     if qerr != nil {

@@ -173,24 +165,24 @@ func (r *JobRepository) FindByIDWithUser(user *schema.User, jobID int64) (*schem
     return scanJob(q.RunWith(r.stmtCache).QueryRow())
 }
-// FindByIDDirect executes a SQL query to find a specific batch job.
+// FindByIdDirect executes a SQL query to find a specific batch job.
 // The job is queried using the database id.
 // It returns a pointer to a schema.Job data structure and an error variable.
 // To check if no job was found test err == sql.ErrNoRows
-func (r *JobRepository) FindByIDDirect(jobID int64) (*schema.Job, error) {
+func (r *JobRepository) FindByIdDirect(jobId int64) (*schema.Job, error) {
     q := sq.Select(jobColumns...).
-        From("job").Where("job.id = ?", jobID)
+        From("job").Where("job.id = ?", jobId)
     return scanJob(q.RunWith(r.stmtCache).QueryRow())
 }
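All of these finders surface a missing row through database/sql rather than a custom error type. A short check, assuming the errors and database/sql packages are imported:

    job, err := r.FindByIdDirect(42)
    if errors.Is(err, sql.ErrNoRows) {
        // no job with this database id exists
    } else if err != nil {
        // genuine query or scan failure
    }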
-// FindByJobID executes a SQL query to find a specific batch job.
+// FindByJobId executes a SQL query to find a specific batch job.
 // The job is queried using the slurm id and the clustername.
 // It returns a pointer to a schema.Job data structure and an error variable.
 // To check if no job was found test err == sql.ErrNoRows
-func (r *JobRepository) FindByJobID(ctx context.Context, jobID int64, startTime int64, cluster string) (*schema.Job, error) {
+func (r *JobRepository) FindByJobId(ctx context.Context, jobId int64, startTime int64, cluster string) (*schema.Job, error) {
     q := sq.Select(jobColumns...).
         From("job").
-        Where("job.job_id = ?", jobID).
+        Where("job.job_id = ?", jobId).
         Where("job.cluster = ?", cluster).
         Where("job.start_time = ?", startTime)

@@ -206,10 +198,10 @@ func (r *JobRepository) FindByJobID(ctx context.Context, jobID int64, startTime
 // The job is queried using the slurm id,a username and the cluster.
 // It returns a bool.
 // If job was found, user is owner: test err != sql.ErrNoRows
-func (r *JobRepository) IsJobOwner(jobID int64, startTime int64, user string, cluster string) bool {
+func (r *JobRepository) IsJobOwner(jobId int64, startTime int64, user string, cluster string) bool {
     q := sq.Select("id").
         From("job").
-        Where("job.job_id = ?", jobID).
+        Where("job.job_id = ?", jobId).
         Where("job.hpc_user = ?", user).
         Where("job.cluster = ?", cluster).
         Where("job.start_time = ?", startTime)
@@ -264,25 +256,24 @@ func (r *JobRepository) FindConcurrentJobs(
         cclog.Errorf("Error while running query: %v", err)
         return nil, err
     }
-    defer rows.Close()

     items := make([]*model.JobLink, 0, 10)
     queryString := fmt.Sprintf("cluster=%s", job.Cluster)

     for rows.Next() {
-        var id, jobID, startTime sql.NullInt64
-        if err = rows.Scan(&id, &jobID, &startTime); err != nil {
+        var id, jobId, startTime sql.NullInt64
+        if err = rows.Scan(&id, &jobId, &startTime); err != nil {
             cclog.Warn("Error while scanning rows")
             return nil, err
         }

         if id.Valid {
-            queryString += fmt.Sprintf("&jobId=%d", int(jobID.Int64))
+            queryString += fmt.Sprintf("&jobId=%d", int(jobId.Int64))
             items = append(items,
                 &model.JobLink{
                     ID: fmt.Sprint(id.Int64),
-                    JobID: int(jobID.Int64),
+                    JobID: int(jobId.Int64),
                 })
         }
     }

@@ -292,22 +283,21 @@ func (r *JobRepository) FindConcurrentJobs(
         cclog.Errorf("Error while running query: %v", err)
         return nil, err
     }
-    defer rows.Close()

     for rows.Next() {
-        var id, jobID, startTime sql.NullInt64
-        if err := rows.Scan(&id, &jobID, &startTime); err != nil {
+        var id, jobId, startTime sql.NullInt64
+        if err := rows.Scan(&id, &jobId, &startTime); err != nil {
             cclog.Warn("Error while scanning rows")
             return nil, err
         }

         if id.Valid {
-            queryString += fmt.Sprintf("&jobId=%d", int(jobID.Int64))
+            queryString += fmt.Sprintf("&jobId=%d", int(jobId.Int64))
             items = append(items,
                 &model.JobLink{
                     ID: fmt.Sprint(id.Int64),
-                    JobID: int(jobID.Int64),
+                    JobID: int(jobId.Int64),
                 })
         }
     }
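The only functional difference in FindConcurrentJobs is when sql.Rows gets closed. The trade-off in isolation (an illustrative sketch, not either branch verbatim):

    func collectIDs(db sq.BaseRunner, q sq.SelectBuilder) ([]int64, error) {
        rows, err := q.RunWith(db).Query()
        if err != nil {
            return nil, err
        }
        defer rows.Close() // base side: connection released on every return path
        ids := make([]int64, 0)
        for rows.Next() {
            var id int64
            if err := rows.Scan(&id); err != nil {
                // the head side instead calls rows.Close() by hand here, which
                // works but must be repeated before every early return
                return nil, err
            }
            ids = append(ids, id)
        }
        return ids, rows.Err() // surface errors that ended the iteration
    }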

View File

@@ -12,7 +12,6 @@ import (
     "strings"
     "time"

-    "github.com/ClusterCockpit/cc-backend/internal/config"
     "github.com/ClusterCockpit/cc-backend/internal/graph/model"
     cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
     "github.com/ClusterCockpit/cc-lib/schema"

@@ -183,8 +182,8 @@ func BuildWhereClause(filter *model.JobFilter, query sq.SelectBuilder) sq.Select
         now := time.Now().Unix() // There does not seam to be a portable way to get the current unix timestamp accross different DBs.
         query = query.Where("(job.job_state != 'running' OR (? - job.start_time) > ?)", now, *filter.MinRunningFor)
     }
-    if filter.Shared != nil {
-        query = query.Where("job.shared = ?", *filter.Shared)
+    if filter.Exclusive != nil {
+        query = query.Where("job.exclusive = ?", *filter.Exclusive)
     }
     if filter.State != nil {
         states := make([]string, len(filter.State))

@@ -217,7 +216,7 @@ func BuildWhereClause(filter *model.JobFilter, query sq.SelectBuilder) sq.Select
     return query
 }

-func buildIntCondition(field string, cond *config.IntRange, query sq.SelectBuilder) sq.SelectBuilder {
+func buildIntCondition(field string, cond *schema.IntRange, query sq.SelectBuilder) sq.SelectBuilder {
     return query.Where(field+" BETWEEN ? AND ?", cond.From, cond.To)
 }

@@ -225,7 +224,7 @@ func buildFloatCondition(field string, cond *model.FloatRange, query sq.SelectBu
     return query.Where(field+" BETWEEN ? AND ?", cond.From, cond.To)
 }

-func buildTimeCondition(field string, cond *config.TimeRange, query sq.SelectBuilder) sq.SelectBuilder {
+func buildTimeCondition(field string, cond *schema.TimeRange, query sq.SelectBuilder) sq.SelectBuilder {
     if cond.From != nil && cond.To != nil {
         return query.Where(field+" BETWEEN ? AND ?", cond.From.Unix(), cond.To.Unix())
     } else if cond.From != nil {
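BuildWhereClause and the typed range helpers only accumulate squirrel conditions; the caller renders them at the end. A minimal sketch, assuming a filter value built elsewhere:

    query := sq.Select(jobColumns...).From("job")
    query = BuildWhereClause(filter, query)
    sqlStr, args, err := query.ToSql()
    // e.g. sqlStr: "SELECT ... FROM job WHERE job.exclusive = ?"
    // with args carrying the bound filter values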

View File

@@ -8,7 +8,6 @@ import (
     "context"
     "fmt"
     "testing"
-    "time"

     "github.com/ClusterCockpit/cc-lib/schema"
     _ "github.com/mattn/go-sqlite3"

@@ -17,31 +16,31 @@
 func TestFind(t *testing.T) {
     r := setup(t)

-    jobID, cluster, startTime := int64(398800), "fritz", int64(1675954712)
-    job, err := r.Find(&jobID, &cluster, &startTime)
+    jobId, cluster, startTime := int64(398998), "fritz", int64(1675957496)
+    job, err := r.Find(&jobId, &cluster, &startTime)
     if err != nil {
         t.Fatal(err)
     }

     // fmt.Printf("%+v", job)
-    if *job.ID != 345 {
-        t.Errorf("wrong summary for diagnostic \ngot: %d \nwant: 345", job.JobID)
+    if *job.ID != 5 {
+        t.Errorf("wrong summary for diagnostic 3\ngot: %d \nwant: 1366", job.JobID)
     }
 }

 func TestFindById(t *testing.T) {
     r := setup(t)
-    job, err := r.FindByID(getContext(t), 338)
+    job, err := r.FindById(getContext(t), 5)
     if err != nil {
         t.Fatal(err)
     }

     // fmt.Printf("%+v", job)
-    if job.JobID != 398793 {
-        t.Errorf("wrong summary for diagnostic \ngot: %d \nwant: 1404396", job.JobID)
+    if job.JobID != 398998 {
+        t.Errorf("wrong summary for diagnostic 3\ngot: %d \nwant: 1404396", job.JobID)
     }
 }

@@ -72,61 +71,3 @@ func TestGetTags(t *testing.T) {
         t.Errorf("wrong tag count \ngot: %d \nwant: 0", counts["bandwidth"])
     }
 }
-func TestFindJobsBetween(t *testing.T) {
-    r := setup(t)
-
-    // 1. Find a job to use (Find all jobs)
-    // We use a large time range to ensure we get something if it exists
-    jobs, err := r.FindJobsBetween(0, 9999999999, false)
-    if err != nil {
-        t.Fatal(err)
-    }
-    if len(jobs) == 0 {
-        t.Fatal("No jobs in test db")
-    }
-    targetJob := jobs[0]
-
-    // 2. Create a tag
-    tagName := fmt.Sprintf("testtag_%d", time.Now().UnixNano())
-    tagId, err := r.CreateTag("testtype", tagName, "global")
-    if err != nil {
-        t.Fatal(err)
-    }
-
-    // 3. Link Tag (Manually to avoid archive dependency side-effects in unit test)
-    _, err = r.DB.Exec("INSERT INTO jobtag (job_id, tag_id) VALUES (?, ?)", *targetJob.ID, tagId)
-    if err != nil {
-        t.Fatal(err)
-    }
-
-    // 4. Search with omitTagged = false (Should find the job)
-    jobsFound, err := r.FindJobsBetween(0, 9999999999, false)
-    if err != nil {
-        t.Fatal(err)
-    }
-    var found bool
-    for _, j := range jobsFound {
-        if *j.ID == *targetJob.ID {
-            found = true
-            break
-        }
-    }
-    if !found {
-        t.Errorf("Target job %d should be found when omitTagged=false", *targetJob.ID)
-    }
-
-    // 5. Search with omitTagged = true (Should NOT find the job)
-    jobsFiltered, err := r.FindJobsBetween(0, 9999999999, true)
-    if err != nil {
-        t.Fatal(err)
-    }
-    for _, j := range jobsFiltered {
-        if *j.ID == *targetJob.ID {
-            t.Errorf("Target job %d should NOT be found when omitTagged=true", *targetJob.ID)
-        }
-    }
-}
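The removed test links the tag row by hand to avoid archive side-effects. Should it ever be restored, registering the inverse statements with t.Cleanup would keep the fixture database stable between runs (a sketch, part of neither branch):

    t.Cleanup(func() {
        r.DB.Exec("DELETE FROM jobtag WHERE job_id = ? AND tag_id = ?", *targetJob.ID, tagId)
        r.DB.Exec("DELETE FROM tag WHERE id = ?", tagId)
    })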

View File

@@ -2,7 +2,6 @@
 // All rights reserved. This file is part of cc-backend.
 // Use of this source code is governed by a MIT-style
 // license that can be found in the LICENSE file.

 package repository

 import (

@@ -12,6 +11,7 @@ import (
     cclog "github.com/ClusterCockpit/cc-lib/ccLogger"
     "github.com/golang-migrate/migrate/v4"
+    "github.com/golang-migrate/migrate/v4/database/mysql"
     "github.com/golang-migrate/migrate/v4/database/sqlite3"
     "github.com/golang-migrate/migrate/v4/source/iofs"
 )

@@ -21,19 +21,40 @@ const Version uint = 10

 //go:embed migrations/*
 var migrationFiles embed.FS

-func checkDBVersion(db *sql.DB) error {
-    driver, err := sqlite3.WithInstance(db, &sqlite3.Config{})
-    if err != nil {
-        return err
-    }
-    d, err := iofs.New(migrationFiles, "migrations/sqlite3")
-    if err != nil {
-        return err
-    }
-    m, err := migrate.NewWithInstance("iofs", d, "sqlite3", driver)
-    if err != nil {
-        return err
+func checkDBVersion(backend string, db *sql.DB) error {
+    var m *migrate.Migrate
+
+    switch backend {
+    case "sqlite3":
+        driver, err := sqlite3.WithInstance(db, &sqlite3.Config{})
+        if err != nil {
+            return err
+        }
+        d, err := iofs.New(migrationFiles, "migrations/sqlite3")
+        if err != nil {
+            return err
+        }
+        m, err = migrate.NewWithInstance("iofs", d, "sqlite3", driver)
+        if err != nil {
+            return err
+        }
+    case "mysql":
+        driver, err := mysql.WithInstance(db, &mysql.Config{})
+        if err != nil {
+            return err
+        }
+        d, err := iofs.New(migrationFiles, "migrations/mysql")
+        if err != nil {
+            return err
+        }
+        m, err = migrate.NewWithInstance("iofs", d, "mysql", driver)
+        if err != nil {
+            return err
+        }
+    default:
+        cclog.Abortf("Migration: Unsupported database backend '%s'.\n", backend)
     }

     v, dirty, err := m.Version()

@@ -58,22 +79,37 @@ func checkDBVersion(db *sql.DB) error {
     return nil
 }

-func getMigrateInstance(db string) (m *migrate.Migrate, err error) {
-    d, err := iofs.New(migrationFiles, "migrations/sqlite3")
-    if err != nil {
-        return nil, err
-    }
+func getMigrateInstance(backend string, db string) (m *migrate.Migrate, err error) {
+    switch backend {
+    case "sqlite3":
+        d, err := iofs.New(migrationFiles, "migrations/sqlite3")
+        if err != nil {
+            cclog.Fatal(err)
+        }
         m, err = migrate.NewWithSourceInstance("iofs", d, fmt.Sprintf("sqlite3://%s?_foreign_keys=on", db))
         if err != nil {
-            return nil, err
+            return m, err
+        }
+    case "mysql":
+        d, err := iofs.New(migrationFiles, "migrations/mysql")
+        if err != nil {
+            return m, err
+        }
+        m, err = migrate.NewWithSourceInstance("iofs", d, fmt.Sprintf("mysql://%s?multiStatements=true", db))
+        if err != nil {
+            return m, err
+        }
+    default:
+        cclog.Abortf("Migration: Unsupported database backend '%s'.\n", backend)
     }

     return m, nil
 }

-func MigrateDB(db string) error {
-    m, err := getMigrateInstance(db)
+func MigrateDB(backend string, db string) error {
+    m, err := getMigrateInstance(backend, db)
     if err != nil {
         return err
     }

@@ -81,7 +117,7 @@ func MigrateDB(db string) error {
     v, dirty, err := m.Version()
     if err != nil {
         if err == migrate.ErrNilVersion {
-            cclog.Info("Legacy database without version or missing database file!")
+            cclog.Warn("Legacy database without version or missing database file!")
         } else {
             return err
         }

@@ -107,8 +143,8 @@ func MigrateDB(db string) error {
     return nil
 }

-func RevertDB(db string) error {
-    m, err := getMigrateInstance(db)
+func RevertDB(backend string, db string) error {
+    m, err := getMigrateInstance(backend, db)
     if err != nil {
         return err
     }

@@ -125,8 +161,8 @@ func RevertDB(db string) error {
     return nil
 }

-func ForceDB(db string) error {
-    m, err := getMigrateInstance(db)
+func ForceDB(backend string, db string) error {
+    m, err := getMigrateInstance(backend, db)
     if err != nil {
         return err
     }
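With the backend parameter restored, the DSN decides which driver golang-migrate uses. A sketch of both invocations (paths and credentials are hypothetical):

    // SQLite file database; _foreign_keys=on is appended by getMigrateInstance
    if err := repository.MigrateDB("sqlite3", "./var/job.db"); err != nil {
        cclog.Fatal(err)
    }

    // MySQL DSN in go-sql-driver format; multiStatements=true is appended
    if err := repository.MigrateDB("mysql", "user:pass@tcp(localhost:3306)/clustercockpit"); err != nil {
        cclog.Fatal(err)
    }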

View File

@@ -0,0 +1,5 @@
DROP TABLE IF EXISTS job;
DROP TABLE IF EXISTS tags;
DROP TABLE IF EXISTS jobtag;
DROP TABLE IF EXISTS configuration;
DROP TABLE IF EXISTS user;

View File

@@ -0,0 +1,66 @@
CREATE TABLE IF NOT EXISTS job (
id INTEGER AUTO_INCREMENT PRIMARY KEY ,
job_id BIGINT NOT NULL,
cluster VARCHAR(255) NOT NULL,
subcluster VARCHAR(255) NOT NULL,
start_time BIGINT NOT NULL, -- Unix timestamp
user VARCHAR(255) NOT NULL,
project VARCHAR(255) NOT NULL,
`partition` VARCHAR(255) NOT NULL,
array_job_id BIGINT NOT NULL,
duration INT NOT NULL DEFAULT 0,
walltime INT NOT NULL DEFAULT 0,
job_state VARCHAR(255) NOT NULL
CHECK(job_state IN ('running', 'completed', 'failed', 'cancelled',
'stopped', 'timeout', 'preempted', 'out_of_memory')),
meta_data TEXT, -- JSON
resources TEXT NOT NULL, -- JSON
num_nodes INT NOT NULL,
num_hwthreads INT NOT NULL,
num_acc INT NOT NULL,
smt TINYINT NOT NULL DEFAULT 1 CHECK(smt IN (0, 1 )),
exclusive TINYINT NOT NULL DEFAULT 1 CHECK(exclusive IN (0, 1, 2)),
monitoring_status TINYINT NOT NULL DEFAULT 1 CHECK(monitoring_status IN (0, 1, 2, 3)),
mem_used_max REAL NOT NULL DEFAULT 0.0,
flops_any_avg REAL NOT NULL DEFAULT 0.0,
mem_bw_avg REAL NOT NULL DEFAULT 0.0,
load_avg REAL NOT NULL DEFAULT 0.0,
net_bw_avg REAL NOT NULL DEFAULT 0.0,
net_data_vol_total REAL NOT NULL DEFAULT 0.0,
file_bw_avg REAL NOT NULL DEFAULT 0.0,
file_data_vol_total REAL NOT NULL DEFAULT 0.0,
UNIQUE (job_id, cluster, start_time)
);
CREATE TABLE IF NOT EXISTS tag (
id INTEGER PRIMARY KEY,
tag_type VARCHAR(255) NOT NULL,
tag_name VARCHAR(255) NOT NULL,
UNIQUE (tag_type, tag_name));
CREATE TABLE IF NOT EXISTS jobtag (
job_id INTEGER,
tag_id INTEGER,
PRIMARY KEY (job_id, tag_id),
FOREIGN KEY (job_id) REFERENCES job (id) ON DELETE CASCADE,
FOREIGN KEY (tag_id) REFERENCES tag (id) ON DELETE CASCADE);
CREATE TABLE IF NOT EXISTS user (
username varchar(255) PRIMARY KEY NOT NULL,
password varchar(255) DEFAULT NULL,
ldap tinyint NOT NULL DEFAULT 0, /* col called "ldap" for historic reasons, fills the "AuthSource" */
name varchar(255) DEFAULT NULL,
roles varchar(255) NOT NULL DEFAULT "[]",
email varchar(255) DEFAULT NULL);
CREATE TABLE IF NOT EXISTS configuration (
username varchar(255),
confkey varchar(255),
value varchar(255),
PRIMARY KEY (username, confkey),
FOREIGN KEY (username) REFERENCES user (username) ON DELETE CASCADE ON UPDATE NO ACTION);
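A short illustration of how the tables relate; all values are made up, and the foreign key on jobtag means deleting the job removes its tag links automatically:

    INSERT INTO job (job_id, cluster, subcluster, start_time, user, project,
                     `partition`, array_job_id, num_nodes, num_hwthreads,
                     num_acc, resources, job_state)
    VALUES (398998, 'fritz', 'main', 1675957496, 'alice', 'projA',
            'work', 0, 1, 72, 0, '[]', 'running');

    INSERT INTO tag (tag_type, tag_name) VALUES ('testtype', 'bandwidth');

    INSERT INTO jobtag (job_id, tag_id)
    SELECT j.id, t.id FROM job j, tag t
    WHERE j.job_id = 398998 AND j.cluster = 'fritz' AND j.start_time = 1675957496
      AND t.tag_type = 'testtype' AND t.tag_name = 'bandwidth';

    -- ON DELETE CASCADE cleans up the jobtag row:
    DELETE FROM job WHERE job_id = 398998 AND cluster = 'fritz' AND start_time = 1675957496;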

Some files were not shown because too many files have changed in this diff.