diff --git a/.env b/.env
index 4e9aa63..04f530a 100644
--- a/.env
+++ b/.env
@@ -2,15 +2,6 @@
 # CCBACKEND DEVEL DOCKER SETTINGS
 ########################################################################
-########################################################################
-# SLURM
-########################################################################
-SLURM_VERSION=22.05.6
-ARCH=aarch64
-MUNGE_UID=981
-SLURM_UID=982
-WORKER_UID=1000
-
 ########################################################################
 # INFLUXDB
 ########################################################################
@@ -22,27 +13,6 @@ INFLUXDB_BUCKET=ClusterCockpit
 # Whether or not to check SSL Cert in Symfony Client, Default: false
 INFLUXDB_SSL=false
-########################################################################
-# MARIADB
-########################################################################
-MARIADB_ROOT_PASSWORD=root
-MARIADB_DATABASE=ClusterCockpit
-MARIADB_USER=clustercockpit
-MARIADB_PASSWORD=clustercockpit
-MARIADB_PORT=3306
-
-#########################################
-# LDAP
-########################################################################
-LDAP_ADMIN_PASSWORD=mashup
-LDAP_ORGANISATION=NHR@FAU
-LDAP_DOMAIN=rrze.uni-erlangen.de
-
-########################################################################
-# PHPMyAdmin
-########################################################################
-PHPMYADMIN_PORT=8081
-
 ########################################################################
 # INTERNAL SETTINGS
 ########################################################################
diff --git a/.gitignore b/.gitignore
index 23c5b49..28989ba 100644
--- a/.gitignore
+++ b/.gitignore
@@ -3,6 +3,11 @@ data/job-archive/**
 data/influxdb
 data/sqldata
 data/cc-metric-store
+data/cc-metric-store-source
+data/ldap
+data/mariadb
+data/slurm
+data
 cc-backend
 cc-backend/**
 .vscode
diff --git a/README.md b/README.md
old mode 100644
new mode 100755
index 99ba46a..b196299
--- a/README.md
+++ b/README.md
@@ -1,74 +1,175 @@
 # cc-docker

 This is a `docker-compose` setup which provides a quickly started environment for ClusterCockpit development and testing, using `cc-backend`.

-A number of services is readily available as docker container (nats, cc-metric-store, InfluxDB, LDAP), or easily added by manual configuration (MySQL).
+A number of services are readily available as docker containers (nats, cc-metric-store, InfluxDB, LDAP, SLURM), or easily added by manual configuration (MariaDB).

 It includes the following containers:

-* nats (Default)
-* cc-metric-store (Default)
-* influxdb (Default)
-* openldap (Default)
-* mysql (Optional)
-* mariadb (Optional)
-* phpmyadmin (Optional)
+|Service full name|docker service name|port|
+| --- | --- | --- |
+|Slurm Controller service|slurmctld|6817|
+|Slurm Database service|slurmdbd|6819|
+|Slurm REST service with JWT authentication|slurmrestd|6820|
+|Slurm Worker|node01|6818|
+|MariaDB service|mariadb|3306|
+|InfluxDB service|influxdb|8086|
+|NATS service|nats|4222, 6222, 8222|
+|cc-metric-store service|cc-metric-store|8084|
+|OpenLDAP|openldap|389, 636|

-The setup comes with fixture data for a Job archive, cc-metric-store checkpoints, InfluxDB, MySQL, and a LDAP user directory.
+The setup comes with fixture data for a Job archive, cc-metric-store checkpoints, InfluxDB, MariaDB, and an LDAP user directory.
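+
+Once the stack is running, you can inspect that LDAP fixture directly. A minimal sketch (not part of the setup scripts), assuming the admin DN and password that this patch configures in `docker-compose.yml` and `misc/config.json` (`cn=admin,dc=example,dc=com` / `mashup`):
+
+```
+# query all posixAccount entries from the bundled openldap container
+ldapsearch -x -H ldap://localhost:389 \
+  -D "cn=admin,dc=example,dc=com" -w mashup \
+  -b "ou=users,dc=example,dc=com" "(objectclass=posixAccount)"
+```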
-## Known Issues
+## Prerequisites

-* `docker-compose` installed on Ubuntu (18.04, 20.04) via `apt-get` can not correctly parse `docker-compose.yml` due to version differences. Install latest version of `docker-compose` from https://docs.docker.com/compose/install/ instead.
-* You need to ensure that no other web server is running on ports 8080 (cc-backend), 8081 (phpmyadmin), 8084 (cc-metric-store), 8086 (nfluxDB), 4222 and 8222 (Nats), or 3306 (MySQL). If one or more ports are already in use, you habe to adapt the related config accordingly.
-* Existing VPN connections sometimes cause problems with docker. If `docker-compose` does not start up correctly, try disabling any active VPN connection. Refer to https://stackoverflow.com/questions/45692255/how-make-openvpn-work-with-docker for further information.
+For all the docker services to work correctly, you will need the following tools installed:

-## Configuration Templates
+1. `docker` and `docker-compose`
+2. `golang` (for compiling cc-metric-store)
+3. `perl` (for migrateTimestamps.pl) with the Cpanel::JSON::XS, Data::Dumper, Time::Piece, Sort::Versions and File::Slurp Perl modules
+4. `npm` (for cc-backend)
+5. `make` (for building the slurm base image)

-Located in `./templates`
-* `docker-compose.yml.default`: Docker-Compose file to setup cc-metric-store, InfluxDB, MariaDB, PhpMyadmin, and LDAP containers (Default). Used in `setupDev.sh`.
-* `docker-compose.yml.mysql`: Docker-Compose configuration template if MySQL is desired instead of MariaDB.
-* `env.default`: Environment variables for setup with cc-metric-store, InfluxDB, MariaDB, PhpMyadmin, and LDAP containers (Default). Used in `setupDev.sh`.
-* `env.mysql`: Additional environment variables required if MySQL is desired instead of MariaDB.
+It is also recommended to add your user to the `docker` group, since the `setupDev.sh` script assumes permission to run `docker` and `docker-compose`.
+
+You can use:
+
+```
+sudo groupadd docker
+sudo usermod -aG docker $USER
+
+# reboot so that the group change takes effect
+sudo shutdown -r now
+```
+
+Note: You can install all these dependencies via the predefined installation steps in `scripts/prerequisite_installation_script.sh`.
+
+If you are using a different Linux distribution, you will have to adapt `scripts/prerequisite_installation_script.sh` as well as `setupDev.sh`.

## Setup

1. Clone `cc-backend` repository in chosen base folder: `$> git clone https://github.com/ClusterCockpit/cc-backend.git`

-2. Run `$ ./setupDev.sh`: **NOTICE** The script will download files of a total size of 338MB (mostly for the InfluxDB data).
+2. Run `$ ./setupDev.sh`: **NOTICE** The script will download files of a total size of 338MB (mostly for the cc-metric-store data).

-3. The setup-script launches the supporting container stack in the background automatically if everything went well. Run `$> ./cc-backend/cc-backend` to start `cc-backend.`
+3. The setup script launches the supporting container stack in the background automatically if everything went well. Run `$> ./cc-backend/cc-backend -server -dev` to start `cc-backend`.

4. By default, you can access `cc-backend` in your browser at `http://localhost:8080`. You can shut down the cc-backend server by pressing `CTRL-C`, remember to also shut down all containers via `$> docker-compose down` afterwards.

5. You can restart the containers with: `$> docker-compose up -d`.
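+
+If a service does not come up, a quick sanity check (a sketch, assuming the default ports from the table above):
+
+```
+docker ps --format '{{.Names}}\t{{.Status}}'  # all containers should be "Up"
+curl -s http://localhost:8086/health          # InfluxDB v2 health endpoint
+curl -s http://localhost:8222/varz | head     # NATS monitoring endpoint
+```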
-## Post-Setup Adjustment for using `influxdb`
-
-When using `influxdb` as a metric database, one must adjust the following files:
-* `cc-backend/var/job-archive/emmy/cluster.json`
-* `cc-backend/var/job-archive/woody/cluster.json`
-
-In the JSON, exchange the content of the `metricDataRepository`-Entry (By default configured for `cc-metric-store`) with:
-```
-"metricDataRepository": {
-    "kind": "influxdb",
-    "url": "http://localhost:8086",
-    "token": "egLfcf7fx0FESqFYU3RpAAbj",
-    "bucket": "ClusterCockpit",
-    "org": "ClusterCockpit",
-    "skiptls": false
-}
-```
-
-
-## Usage
+## Credentials for logging into ClusterCockpit

Credentials for the preconfigured demo user are:
* User: `demo`
-* Password: `AdminDev`
+* Password: `demo`
+
+Credentials for the preconfigured LDAP user are:
+* User: `ldapuser`
+* Password: `ldapuser`

You can also login as regular user using any credential in the LDAP user directory at `./data/ldap/users.ldif`.

+## Preconfigured setup between docker services and ClusterCockpit components
+
+Once you have cloned the cc-backend repository and executed `setupDev.sh`, the script copies a preconfigured `misc/config.json` over `cc-backend/config.json`; cc-backend uses this file once you start the server.
+The preconfigured config.json attaches to:
+1. the MariaDB docker service on port 3306 (database: ccbackend)
+2. the OpenLDAP docker service on port 389
+3. the cc-metric-store docker service on port 8084
+
+cc-metric-store also has a preconfigured `config.json` in `cc-metric-store/config.json`, which attaches to the NATS docker service on port 4222 and subscribes to the subject 'hpc-nats'.
+
+In short, all the ClusterCockpit components and the docker services attach to each other like Lego pieces.
+
+## Docker commands to access the services
+
+> Note: You need to be in the cc-docker directory in order to execute any docker command.
+
+You can view all docker containers running on any of the VM instances by using this command:
+
+```
+$ docker ps
+```
+
+If you want to manually access one of these running docker services, you have to run a **`bash`** shell inside the container.
+
+> **`Example`**: If you want to run slurm commands like `sinfo`, `squeue` or `scontrol` on the slurm controller, you cannot invoke them directly from the host.
+
+You need to **`bash`** into the running service by using the following command:
+
+```
+$ docker exec -it <container_name> bash
+
+#example
+$ docker exec -it slurmctld bash
+
+#or
+$ docker exec -it mariadb bash
+```
+
+Once you have started a **`bash`** shell in a docker service, you can execute any service-related commands there.
+
+For ClusterCockpit development, however, you only need the ports of these docker services: use `localhost:<port>` to reach any of them, and configure `cc-backend/config.json` based on these services and ports.
+
+## Slurm setup in cc-docker
+
+### 1. Slurm controller
+
+Currently the slurm controller is aware of the one node that we have set up in our mini cluster, i.e. node01.
+
+In order to execute slurm commands, you need to **`bash`** into the **`slurmctld`** docker service:
+
+```
+$ docker exec -it slurmctld bash
+```
+
+Then you can run slurm controller commands. A few examples (output omitted) are:
+
+```
+$ sinfo
+
+or
+
+$ squeue
+
+or
+
+$ scontrol show nodes
+```
+
+### 2. Slurm REST service
+
+You do not need to **`bash`** into the slurmrestd service; you can access the REST API directly via `localhost:6820`. A simple example of how to curl the slurm REST API is given in `misc/curl_slurmrestd.sh` (a condensed version is sketched at the end of this README).
+
+You can directly use `curl_slurmrestd.sh` with a never-expiring JWT token (found in `data/slurm/secret/jwt_token.txt`).
+
+You may also use the never-expiring token directly from that file for any of your custom curl commands.
+
+## Known Issues
+
+* `docker-compose` installed on Ubuntu (18.04, 20.04) via `apt-get` cannot correctly parse `docker-compose.yml` due to version differences. Install the latest version of `docker-compose` from https://docs.docker.com/compose/install/ instead.
+* You need to ensure that no other web server is running on ports 8080 (cc-backend), 8084 (cc-metric-store), 8086 (InfluxDB), 4222 and 8222 (NATS), or 3306 (MariaDB). If one or more ports are already in use, you have to adapt the related config accordingly.
+* Existing VPN connections sometimes cause problems with docker. If `docker-compose` does not start up correctly, try disabling any active VPN connection. Refer to https://stackoverflow.com/questions/45692255/how-make-openvpn-work-with-docker for further information.
+
+## Docker services and restarting them
+
+You can find all the docker services in `docker-compose.yml`. Feel free to modify it.
+
+Whenever you modify it, please use
+
+```
+$ docker compose down
+```
+
+in order to shut down all the services in all the VMs (maininstance, nodeinstance, nodeinstance2) and then start all the services by using
+
+```
+$ docker compose up
+```
+
+
 TODO: Update job archive and all other metric data. The job archive with 1867 jobs originates from the second half of 2020. Roughly 2700 jobs from the first week of 2021 are loaded with data from InfluxDB. Some views of ClusterCockpit (e.g. the Users view) show the last week or month.
-To show some data there you have to set the filter to time periods with jobs (August 2020 to January 2021).
+To show some data there you have to set the filter to time periods with jobs (August 2020 to January 2021).
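+
+For reference, the request performed by `misc/curl_slurmrestd.sh` (see the Slurm REST section above) boils down to the following sketch:
+
+```
+# read the never-expiring JWT token generated by the slurm setup
+SLURM_JWT=$(cat data/slurm/secret/jwt_token.txt)
+curl -s 'http://localhost:6820/slurm/v0.0.39/node/node01' \
+  -H "X-SLURM-USER-NAME: root" \
+  -H "X-SLURM-USER-TOKEN: $SLURM_JWT"
+```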
\ No newline at end of file diff --git a/cc-metric-store/Dockerfile b/cc-metric-store/Dockerfile index 4284c98..e7e6d1d 100644 --- a/cc-metric-store/Dockerfile +++ b/cc-metric-store/Dockerfile @@ -1,10 +1,12 @@ -FROM golang:1.17 +FROM golang:1.22.4 RUN apt-get update RUN apt-get -y install git +RUN rm -rf /cc-metric-store + RUN git clone https://github.com/ClusterCockpit/cc-metric-store.git /cc-metric-store -RUN cd /cc-metric-store && go build +RUN cd /cc-metric-store && go build ./cmd/cc-metric-store # Reactivate when latest commit is available #RUN go get -d -v github.com/ClusterCockpit/cc-metric-store diff --git a/cc-metric-store/config.json b/cc-metric-store/config.json index 674c67c..a7173b2 100644 --- a/cc-metric-store/config.json +++ b/cc-metric-store/config.json @@ -1,28 +1,201 @@ { "metrics": { - "clock": { "frequency": 60, "aggregation": null, "scope": "node" }, - "cpi": { "frequency": 60, "aggregation": null, "scope": "node" }, - "cpu_load": { "frequency": 60, "aggregation": null, "scope": "node" }, - "flops_any": { "frequency": 60, "aggregation": null, "scope": "node" }, - "flops_dp": { "frequency": 60, "aggregation": null, "scope": "node" }, - "flops_sp": { "frequency": 60, "aggregation": null, "scope": "node" }, - "ib_bw": { "frequency": 60, "aggregation": null, "scope": "node" }, - "lustre_bw": { "frequency": 60, "aggregation": null, "scope": "node" }, - "mem_bw": { "frequency": 60, "aggregation": null, "scope": "node" }, - "mem_used": { "frequency": 60, "aggregation": null, "scope": "node" }, - "rapl_power": { "frequency": 60, "aggregation": null, "scope": "node" } + "debug_metric": { + "frequency": 60, + "aggregation": "avg" + }, + "clock": { + "frequency": 60, + "aggregation": "avg" + }, + "cpu_idle": { + "frequency": 60, + "aggregation": "avg" + }, + "cpu_iowait": { + "frequency": 60, + "aggregation": "avg" + }, + "cpu_irq": { + "frequency": 60, + "aggregation": "avg" + }, + "cpu_system": { + "frequency": 60, + "aggregation": "avg" + }, + "cpu_user": { + "frequency": 60, + "aggregation": "avg" + }, + "nv_mem_util": { + "frequency": 60, + "aggregation": "avg" + }, + "nv_temp": { + "frequency": 60, + "aggregation": "avg" + }, + "nv_sm_clock": { + "frequency": 60, + "aggregation": "avg" + }, + "acc_utilization": { + "frequency": 60, + "aggregation": "avg" + }, + "acc_mem_used": { + "frequency": 60, + "aggregation": "sum" + }, + "acc_power": { + "frequency": 60, + "aggregation": "sum" + }, + "flops_any": { + "frequency": 60, + "aggregation": "sum" + }, + "flops_dp": { + "frequency": 60, + "aggregation": "sum" + }, + "flops_sp": { + "frequency": 60, + "aggregation": "sum" + }, + "ib_recv": { + "frequency": 60, + "aggregation": "sum" + }, + "ib_xmit": { + "frequency": 60, + "aggregation": "sum" + }, + "ib_recv_pkts": { + "frequency": 60, + "aggregation": "sum" + }, + "ib_xmit_pkts": { + "frequency": 60, + "aggregation": "sum" + }, + "cpu_power": { + "frequency": 60, + "aggregation": "sum" + }, + "core_power": { + "frequency": 60, + "aggregation": "sum" + }, + "mem_power": { + "frequency": 60, + "aggregation": "sum" + }, + "ipc": { + "frequency": 60, + "aggregation": "avg" + }, + "cpu_load": { + "frequency": 60, + "aggregation": null + }, + "lustre_close": { + "frequency": 60, + "aggregation": null + }, + "lustre_open": { + "frequency": 60, + "aggregation": null + }, + "lustre_statfs": { + "frequency": 60, + "aggregation": null + }, + "lustre_read_bytes": { + "frequency": 60, + "aggregation": null + }, + "lustre_write_bytes": { + "frequency": 60, + "aggregation": null + }, + 
"net_bw": { + "frequency": 60, + "aggregation": null + }, + "file_bw": { + "frequency": 60, + "aggregation": null + }, + "mem_bw": { + "frequency": 60, + "aggregation": "sum" + }, + "mem_cached": { + "frequency": 60, + "aggregation": null + }, + "mem_used": { + "frequency": 60, + "aggregation": null + }, + "net_bytes_in": { + "frequency": 60, + "aggregation": null + }, + "net_bytes_out": { + "frequency": 60, + "aggregation": null + }, + "nfs4_read": { + "frequency": 60, + "aggregation": null + }, + "nfs4_total": { + "frequency": 60, + "aggregation": null + }, + "nfs4_write": { + "frequency": 60, + "aggregation": null + }, + "vectorization_ratio": { + "frequency": 60, + "aggregation": "avg" + } }, "checkpoints": { - "interval": 100000000000, + "interval": "12h", "directory": "/data/checkpoints", - "restore": 100000000000 + "restore": "48h" }, "archive": { - "interval": 100000000000, + "interval": "50h", "directory": "/data/archive" }, - "retention-in-memory": 100000000000, - "http-api-address": "0.0.0.0:8081", - "nats": "nats://cc-nats:4222", + "http-api": { + "address": "0.0.0.0:8084", + "https-cert-file": null, + "https-key-file": null + }, + "retention-in-memory": "48h", + "nats": [ + { + "address": "nats://nats:4222", + "username": "root", + "password": "root", + "subscriptions": [ + { + "subscribe-to": "hpc-nats", + "cluster-tag": "fritz" + }, + { + "subscribe-to": "hpc-nats", + "cluster-tag": "alex" + } + ] + } + ], "jwt-public-key": "kzfYrYy+TzpanWZHJ5qSdMj5uKUWgq74BWhQG6copP0=" -} +} \ No newline at end of file diff --git a/data/init.sh b/data/init.sh deleted file mode 100755 index 3bddade..0000000 --- a/data/init.sh +++ /dev/null @@ -1,34 +0,0 @@ -#!/usr/bin/env bash - -if [ -d symfony ]; then - echo "Data already initialized!" - echo -n "Perform a fresh initialisation? [yes to proceed / no to exit] " - read -r answer - if [ "$answer" == "yes" ]; then - echo "Cleaning directories ..." - rm -rf symfony - rm -rf job-archive - rm -rf influxdb/data/* - rm -rf sqldata/* - echo "done." - else - echo "Aborting ..." - exit - fi -fi - -mkdir symfony -wget https://hpc-mover.rrze.uni-erlangen.de/HPC-Data/0x7b58aefb/eig7ahyo6fo2bais0ephuf2aitohv1ai/job-archive_stable.tar.xz -tar xJf job-archive_stable.tar.xz -rm ./job-archive_stable.tar.xz - -# 101 is the uid and gid of the user and group www-data in the cc-php container running php-fpm. -# For a demo with no new jobs it is enough to give www read permissions on that directory. 
-# echo "This script needs to chown the job-archive directory so that the application can write to it:" -# sudo chown -R 82:82 ./job-archive - -mkdir -p influxdb/data -wget https://hpc-mover.rrze.uni-erlangen.de/HPC-Data/0x7b58aefb/eig7ahyo6fo2bais0ephuf2aitohv1ai/influxdbv2-data_stable.tar.xz -cd influxdb/data -tar xJf ../../influxdbv2-data_stable.tar.xz -rm ../../influxdbv2-data_stable.tar.xz diff --git a/data/ldap/users.ldif b/data/ldap/users.ldif deleted file mode 100644 index 79a390a..0000000 --- a/data/ldap/users.ldif +++ /dev/null @@ -1,1027 +0,0 @@ -# extended LDIF -# -# LDAPv3 -# base with scope subtree -# filter: (objectclass=*) -# requesting: ALL - -# people, hpc, rrze.uni-erlangen.de -dn: ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -objectClass: organizationalUnit -objectClass: top -ou: hpc - -# emmyUser1, hpc, rrze.uni-erlangen.de -dn: uid=emmyUser1,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: emmyUser1 -uidNumber: 10000 -gecos: Ann Watson -cn: emmyUser1 -homeDirectory: /home/hpc/emmyUser1 -userPassword: emmyUser1 - -# emmyUser10, hpc, rrze.uni-erlangen.de -dn: uid=emmyUser10,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: emmyUser10 -uidNumber: 10001 -gecos: Kenneth Wallis -cn: emmyUser10 -homeDirectory: /home/hpc/emmyUser10 -userPassword: emmyUser10 - -# emmyUser2, hpc, rrze.uni-erlangen.de -dn: uid=emmyUser2,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: emmyUser2 -uidNumber: 10002 -gecos: Lewis Bennett -cn: emmyUser2 -homeDirectory: /home/hpc/emmyUser2 -userPassword: emmyUser2 - -# emmyUser3, hpc, rrze.uni-erlangen.de -dn: uid=emmyUser3,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: emmyUser3 -uidNumber: 10003 -gecos: Darren Jenkins -cn: emmyUser3 -homeDirectory: /home/hpc/emmyUser3 -userPassword: emmyUser3 - -# emmyUser4, hpc, rrze.uni-erlangen.de -dn: uid=emmyUser4,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: emmyUser4 -uidNumber: 10004 -gecos: Terry Johnson -cn: emmyUser4 -homeDirectory: /home/hpc/emmyUser4 -userPassword: emmyUser4 - -# emmyUser5, hpc, rrze.uni-erlangen.de -dn: uid=emmyUser5,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: emmyUser5 -uidNumber: 10005 -gecos: Shaun Hurst -cn: emmyUser5 -homeDirectory: /home/hpc/emmyUser5 -userPassword: emmyUser5 - -# emmyUser6, hpc, rrze.uni-erlangen.de -dn: uid=emmyUser6,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: emmyUser6 -uidNumber: 10006 -gecos: Peter Peters -cn: emmyUser6 -homeDirectory: /home/hpc/emmyUser6 -userPassword: emmyUser6 - -# emmyUser7, hpc, rrze.uni-erlangen.de -dn: uid=emmyUser7,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: emmyUser7 -uidNumber: 10007 -gecos: Sean Davies -cn: emmyUser7 -homeDirectory: /home/hpc/emmyUser7 -userPassword: emmyUser7 - -# emmyUser8, hpc, rrze.uni-erlangen.de -dn: uid=emmyUser8,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount 
-uid: emmyUser8 -uidNumber: 10008 -gecos: Kyle Lawrence -cn: emmyUser8 -homeDirectory: /home/hpc/emmyUser8 -userPassword: emmyUser8 - -# emmyUser9, hpc, rrze.uni-erlangen.de -dn: uid=emmyUser9,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: emmyUser9 -uidNumber: 10009 -gecos: Ryan Edwards -cn: emmyUser9 -homeDirectory: /home/hpc/emmyUser9 -userPassword: emmyUser9 - -# influxUser1, hpc, rrze.uni-erlangen.de -dn: uid=influxUser1,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser1 -uidNumber: 10010 -gecos: Dale Sharpe -cn: influxUser1 -homeDirectory: /home/hpc/influxUser1 -userPassword: influxUser1 - -# influxUser10, hpc, rrze.uni-erlangen.de -dn: uid=influxUser10,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser10 -uidNumber: 10011 -gecos: Tracey McCarthy -cn: influxUser10 -homeDirectory: /home/hpc/influxUser10 -userPassword: influxUser10 - -# influxUser11, hpc, rrze.uni-erlangen.de -dn: uid=influxUser11,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser11 -uidNumber: 10012 -gecos: Douglas Harrison -cn: influxUser11 -homeDirectory: /home/hpc/influxUser11 -userPassword: influxUser11 - -# influxUser12, hpc, rrze.uni-erlangen.de -dn: uid=influxUser12,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser12 -uidNumber: 10013 -gecos: Kimberley Powell -cn: influxUser12 -homeDirectory: /home/hpc/influxUser12 -userPassword: influxUser12 - -# influxUser13, hpc, rrze.uni-erlangen.de -dn: uid=influxUser13,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser13 -uidNumber: 10014 -gecos: Patrick Hill -cn: influxUser13 -homeDirectory: /home/hpc/influxUser13 -userPassword: influxUser13 - -# influxUser14, hpc, rrze.uni-erlangen.de -dn: uid=influxUser14,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser14 -uidNumber: 10015 -gecos: Harriet Chadwick -cn: influxUser14 -homeDirectory: /home/hpc/influxUser14 -userPassword: influxUser14 - -# influxUser15, hpc, rrze.uni-erlangen.de -dn: uid=influxUser15,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser15 -uidNumber: 10016 -gecos: Annette Parker -cn: influxUser15 -homeDirectory: /home/hpc/influxUser15 -userPassword: influxUser15 - -# influxUser16, hpc, rrze.uni-erlangen.de -dn: uid=influxUser16,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser16 -uidNumber: 10017 -gecos: Owen Price -cn: influxUser16 -homeDirectory: /home/hpc/influxUser16 -userPassword: influxUser16 - -# influxUser17, hpc, rrze.uni-erlangen.de -dn: uid=influxUser17,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser17 -uidNumber: 10018 -gecos: Kyle Patel -cn: influxUser17 -homeDirectory: /home/hpc/influxUser17 -userPassword: influxUser17 - -# influxUser18, hpc, rrze.uni-erlangen.de -dn: 
uid=influxUser18,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser18 -uidNumber: 10019 -gecos: Denis Barber -cn: influxUser18 -homeDirectory: /home/hpc/influxUser18 -userPassword: influxUser18 - -# influxUser19, hpc, rrze.uni-erlangen.de -dn: uid=influxUser19,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser19 -uidNumber: 10020 -gecos: Diane Birch -cn: influxUser19 -homeDirectory: /home/hpc/influxUser19 -userPassword: influxUser19 - -# influxUser2, hpc, rrze.uni-erlangen.de -dn: uid=influxUser2,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser2 -uidNumber: 10021 -gecos: Jordan Walker -cn: influxUser2 -homeDirectory: /home/hpc/influxUser2 -userPassword: influxUser2 - -# influxUser20, hpc, rrze.uni-erlangen.de -dn: uid=influxUser20,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser20 -uidNumber: 10022 -gecos: Brian Wilson -cn: influxUser20 -homeDirectory: /home/hpc/influxUser20 -userPassword: influxUser20 - -# influxUser21, hpc, rrze.uni-erlangen.de -dn: uid=influxUser21,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser21 -uidNumber: 10023 -gecos: Molly Miller -cn: influxUser21 -homeDirectory: /home/hpc/influxUser21 -userPassword: influxUser21 - -# influxUser22, hpc, rrze.uni-erlangen.de -dn: uid=influxUser22,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser22 -uidNumber: 10024 -gecos: Reece Godfrey -cn: influxUser22 -homeDirectory: /home/hpc/influxUser22 -userPassword: influxUser22 - -# influxUser23, hpc, rrze.uni-erlangen.de -dn: uid=influxUser23,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser23 -uidNumber: 10025 -gecos: Antony Cooper -cn: influxUser23 -homeDirectory: /home/hpc/influxUser23 -userPassword: influxUser23 - -# influxUser24, hpc, rrze.uni-erlangen.de -dn: uid=influxUser24,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser24 -uidNumber: 10026 -gecos: Mark Evans -cn: influxUser24 -homeDirectory: /home/hpc/influxUser24 -userPassword: influxUser24 - -# influxUser25, hpc, rrze.uni-erlangen.de -dn: uid=influxUser25,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser25 -uidNumber: 10027 -gecos: Edward Coleman -cn: influxUser25 -homeDirectory: /home/hpc/influxUser25 -userPassword: influxUser25 - -# influxUser26, hpc, rrze.uni-erlangen.de -dn: uid=influxUser26,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser26 -uidNumber: 10028 -gecos: Lucy Marsden -cn: influxUser26 -homeDirectory: /home/hpc/influxUser26 -userPassword: influxUser26 - -# influxUser27, hpc, rrze.uni-erlangen.de -dn: uid=influxUser27,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser27 -uidNumber: 10029 -gecos: 
Leonard King -cn: influxUser27 -homeDirectory: /home/hpc/influxUser27 -userPassword: influxUser27 - -# influxUser28, hpc, rrze.uni-erlangen.de -dn: uid=influxUser28,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser28 -uidNumber: 10030 -gecos: Marion Harvey -cn: influxUser28 -homeDirectory: /home/hpc/influxUser28 -userPassword: influxUser28 - -# influxUser29, hpc, rrze.uni-erlangen.de -dn: uid=influxUser29,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser29 -uidNumber: 10031 -gecos: Jean Phillips -cn: influxUser29 -homeDirectory: /home/hpc/influxUser29 -userPassword: influxUser29 - -# influxUser3, hpc, rrze.uni-erlangen.de -dn: uid=influxUser3,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser3 -uidNumber: 10032 -gecos: Derek Sutton -cn: influxUser3 -homeDirectory: /home/hpc/influxUser3 -userPassword: influxUser3 - -# influxUser30, hpc, rrze.uni-erlangen.de -dn: uid=influxUser30,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser30 -uidNumber: 10033 -gecos: Marion Powell -cn: influxUser30 -homeDirectory: /home/hpc/influxUser30 -userPassword: influxUser30 - -# influxUser31, hpc, rrze.uni-erlangen.de -dn: uid=influxUser31,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser31 -uidNumber: 10034 -gecos: Laura Matthews -cn: influxUser31 -homeDirectory: /home/hpc/influxUser31 -userPassword: influxUser31 - -# influxUser32, hpc, rrze.uni-erlangen.de -dn: uid=influxUser32,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser32 -uidNumber: 10035 -gecos: Julie Bell -cn: influxUser32 -homeDirectory: /home/hpc/influxUser32 -userPassword: influxUser32 - -# influxUser33, hpc, rrze.uni-erlangen.de -dn: uid=influxUser33,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser33 -uidNumber: 10036 -gecos: Thomas Davies -cn: influxUser33 -homeDirectory: /home/hpc/influxUser33 -userPassword: influxUser33 - -# influxUser34, hpc, rrze.uni-erlangen.de -dn: uid=influxUser34,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser34 -uidNumber: 10037 -gecos: Robin Webster -cn: influxUser34 -homeDirectory: /home/hpc/influxUser34 -userPassword: influxUser34 - -# influxUser35, hpc, rrze.uni-erlangen.de -dn: uid=influxUser35,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser35 -uidNumber: 10038 -gecos: Josh Robinson -cn: influxUser35 -homeDirectory: /home/hpc/influxUser35 -userPassword: influxUser35 - -# influxUser36, hpc, rrze.uni-erlangen.de -dn: uid=influxUser36,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser36 -uidNumber: 10039 -gecos: Eileen Murphy -cn: influxUser36 -homeDirectory: /home/hpc/influxUser36 -userPassword: influxUser36 - -# influxUser37, hpc, rrze.uni-erlangen.de -dn: 
uid=influxUser37,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser37 -uidNumber: 10040 -gecos: Charlene Carter -cn: influxUser37 -homeDirectory: /home/hpc/influxUser37 -userPassword: influxUser37 - -# influxUser38, hpc, rrze.uni-erlangen.de -dn: uid=influxUser38,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser38 -uidNumber: 10041 -gecos: Declan Brown -cn: influxUser38 -homeDirectory: /home/hpc/influxUser38 -userPassword: influxUser38 - -# influxUser39, hpc, rrze.uni-erlangen.de -dn: uid=influxUser39,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser39 -uidNumber: 10042 -gecos: Lee Wilson -cn: influxUser39 -homeDirectory: /home/hpc/influxUser39 -userPassword: influxUser39 - -# influxUser4, hpc, rrze.uni-erlangen.de -dn: uid=influxUser4,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser4 -uidNumber: 10043 -gecos: Steven Collier -cn: influxUser4 -homeDirectory: /home/hpc/influxUser4 -userPassword: influxUser4 - -# influxUser40, hpc, rrze.uni-erlangen.de -dn: uid=influxUser40,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser40 -uidNumber: 10044 -gecos: Ashley Smith -cn: influxUser40 -homeDirectory: /home/hpc/influxUser40 -userPassword: influxUser40 - -# influxUser41, hpc, rrze.uni-erlangen.de -dn: uid=influxUser41,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser41 -uidNumber: 10045 -gecos: Alison Robinson -cn: influxUser41 -homeDirectory: /home/hpc/influxUser41 -userPassword: influxUser41 - -# influxUser42, hpc, rrze.uni-erlangen.de -dn: uid=influxUser42,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser42 -uidNumber: 10046 -gecos: Sandra Dunn -cn: influxUser42 -homeDirectory: /home/hpc/influxUser42 -userPassword: influxUser42 - -# influxUser43, hpc, rrze.uni-erlangen.de -dn: uid=influxUser43,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser43 -uidNumber: 10047 -gecos: Cheryl Price -cn: influxUser43 -homeDirectory: /home/hpc/influxUser43 -userPassword: influxUser43 - -# influxUser44, hpc, rrze.uni-erlangen.de -dn: uid=influxUser44,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser44 -uidNumber: 10048 -gecos: June Nicholson -cn: influxUser44 -homeDirectory: /home/hpc/influxUser44 -userPassword: influxUser44 - -# influxUser45, hpc, rrze.uni-erlangen.de -dn: uid=influxUser45,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser45 -uidNumber: 10049 -gecos: Olivia Potter -cn: influxUser45 -homeDirectory: /home/hpc/influxUser45 -userPassword: influxUser45 - -# influxUser46, hpc, rrze.uni-erlangen.de -dn: uid=influxUser46,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser46 -uidNumber: 10050 
-gecos: Melissa Welch -cn: influxUser46 -homeDirectory: /home/hpc/influxUser46 -userPassword: influxUser46 - -# influxUser47, hpc, rrze.uni-erlangen.de -dn: uid=influxUser47,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser47 -uidNumber: 10051 -gecos: Marc Sims -cn: influxUser47 -homeDirectory: /home/hpc/influxUser47 -userPassword: influxUser47 - -# influxUser48, hpc, rrze.uni-erlangen.de -dn: uid=influxUser48,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser48 -uidNumber: 10052 -gecos: Alan Harris -cn: influxUser48 -homeDirectory: /home/hpc/influxUser48 -userPassword: influxUser48 - -# influxUser49, hpc, rrze.uni-erlangen.de -dn: uid=influxUser49,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser49 -uidNumber: 10053 -gecos: Declan Harrison -cn: influxUser49 -homeDirectory: /home/hpc/influxUser49 -userPassword: influxUser49 - -# influxUser5, hpc, rrze.uni-erlangen.de -dn: uid=influxUser5,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser5 -uidNumber: 10054 -gecos: Maureen Hall -cn: influxUser5 -homeDirectory: /home/hpc/influxUser5 -userPassword: influxUser5 - -# influxUser50, hpc, rrze.uni-erlangen.de -dn: uid=influxUser50,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser50 -uidNumber: 10055 -gecos: Daniel Wilson -cn: influxUser50 -homeDirectory: /home/hpc/influxUser50 -userPassword: influxUser50 - -# influxUser51, hpc, rrze.uni-erlangen.de -dn: uid=influxUser51,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser51 -uidNumber: 10056 -gecos: Ben Palmer -cn: influxUser51 -homeDirectory: /home/hpc/influxUser51 -userPassword: influxUser51 - -# influxUser52, hpc, rrze.uni-erlangen.de -dn: uid=influxUser52,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser52 -uidNumber: 10057 -gecos: Sarah Lyons -cn: influxUser52 -homeDirectory: /home/hpc/influxUser52 -userPassword: influxUser52 - -# influxUser53, hpc, rrze.uni-erlangen.de -dn: uid=influxUser53,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser53 -uidNumber: 10058 -gecos: Frank Hill -cn: influxUser53 -homeDirectory: /home/hpc/influxUser53 -userPassword: influxUser53 - -# influxUser54, hpc, rrze.uni-erlangen.de -dn: uid=influxUser54,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser54 -uidNumber: 10059 -gecos: Elliott Brown -cn: influxUser54 -homeDirectory: /home/hpc/influxUser54 -userPassword: influxUser54 - -# influxUser55, hpc, rrze.uni-erlangen.de -dn: uid=influxUser55,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser55 -uidNumber: 10060 -gecos: Shirley Pritchard -cn: influxUser55 -homeDirectory: /home/hpc/influxUser55 -userPassword: influxUser55 - -# influxUser56, hpc, rrze.uni-erlangen.de -dn: 
uid=influxUser56,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser56 -uidNumber: 10061 -gecos: Sylvia Morris -cn: influxUser56 -homeDirectory: /home/hpc/influxUser56 -userPassword: influxUser56 - -# influxUser57, hpc, rrze.uni-erlangen.de -dn: uid=influxUser57,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser57 -uidNumber: 10062 -gecos: Arthur Green -cn: influxUser57 -homeDirectory: /home/hpc/influxUser57 -userPassword: influxUser57 - -# influxUser58, hpc, rrze.uni-erlangen.de -dn: uid=influxUser58,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser58 -uidNumber: 10063 -gecos: Steven Begum -cn: influxUser58 -homeDirectory: /home/hpc/influxUser58 -userPassword: influxUser58 - -# influxUser59, hpc, rrze.uni-erlangen.de -dn: uid=influxUser59,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser59 -uidNumber: 10064 -gecos: Joanne Barber -cn: influxUser59 -homeDirectory: /home/hpc/influxUser59 -userPassword: influxUser59 - -# influxUser6, hpc, rrze.uni-erlangen.de -dn: uid=influxUser6,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser6 -uidNumber: 10065 -gecos: Mohamed Henderson -cn: influxUser6 -homeDirectory: /home/hpc/influxUser6 -userPassword: influxUser6 - -# influxUser60, hpc, rrze.uni-erlangen.de -dn: uid=influxUser60,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser60 -uidNumber: 10066 -gecos: Nicola James -cn: influxUser60 -homeDirectory: /home/hpc/influxUser60 -userPassword: influxUser60 - -# influxUser61, hpc, rrze.uni-erlangen.de -dn: uid=influxUser61,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser61 -uidNumber: 10067 -gecos: Graham Cartwright -cn: influxUser61 -homeDirectory: /home/hpc/influxUser61 -userPassword: influxUser61 - -# influxUser62, hpc, rrze.uni-erlangen.de -dn: uid=influxUser62,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser62 -uidNumber: 10068 -gecos: Kirsty George -cn: influxUser62 -homeDirectory: /home/hpc/influxUser62 -userPassword: influxUser62 - -# influxUser63, hpc, rrze.uni-erlangen.de -dn: uid=influxUser63,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser63 -uidNumber: 10069 -gecos: Kelly Singh -cn: influxUser63 -homeDirectory: /home/hpc/influxUser63 -userPassword: influxUser63 - -# influxUser7, hpc, rrze.uni-erlangen.de -dn: uid=influxUser7,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser7 -uidNumber: 10070 -gecos: Rebecca Miles -cn: influxUser7 -homeDirectory: /home/hpc/influxUser7 -userPassword: influxUser7 - -# influxUser8, hpc, rrze.uni-erlangen.de -dn: uid=influxUser8,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser8 -uidNumber: 10071 
-gecos: Katy Higgins -cn: influxUser8 -homeDirectory: /home/hpc/influxUser8 -userPassword: influxUser8 - -# influxUser9, hpc, rrze.uni-erlangen.de -dn: uid=influxUser9,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: influxUser9 -uidNumber: 10072 -gecos: Aimee Hill -cn: influxUser9 -homeDirectory: /home/hpc/influxUser9 -userPassword: influxUser9 - -# woodyUser1, hpc, rrze.uni-erlangen.de -dn: uid=woodyUser1,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: woodyUser1 -uidNumber: 10073 -gecos: Jay Gordon -cn: woodyUser1 -homeDirectory: /home/hpc/woodyUser1 -userPassword: woodyUser1 - -# woodyUser2, hpc, rrze.uni-erlangen.de -dn: uid=woodyUser2,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: woodyUser2 -uidNumber: 10074 -gecos: Donna Kirby -cn: woodyUser2 -homeDirectory: /home/hpc/woodyUser2 -userPassword: woodyUser2 - -# woodyUser3, hpc, rrze.uni-erlangen.de -dn: uid=woodyUser3,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: woodyUser3 -uidNumber: 10075 -gecos: Marion Bevan -cn: woodyUser3 -homeDirectory: /home/hpc/woodyUser3 -userPassword: woodyUser3 - -# woodyUser4, hpc, rrze.uni-erlangen.de -dn: uid=woodyUser4,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: woodyUser4 -uidNumber: 10076 -gecos: Amber Harvey -cn: woodyUser4 -homeDirectory: /home/hpc/woodyUser4 -userPassword: woodyUser4 - -# woodyUser5, hpc, rrze.uni-erlangen.de -dn: uid=woodyUser5,ou=hpc,dc=rrze,dc=uni-erlangen,dc=de -loginShell: /bin/bash -gidNumber: 12000 -objectClass: account -objectClass: posixAccount -uid: woodyUser5 -uidNumber: 10077 -gecos: Ryan Hughes -cn: woodyUser5 -homeDirectory: /home/hpc/woodyUser5 -userPassword: woodyUser5 - diff --git a/data/mariadb/slurm.cnf b/data/mariadb/slurm.cnf deleted file mode 100644 index 512356a..0000000 --- a/data/mariadb/slurm.cnf +++ /dev/null @@ -1,5 +0,0 @@ -[mysqld] -innodb_buffer_pool_size=4096M -innodb_log_file_size=64M -innodb_lock_wait_timeout=900 -max_allowed_packet=16M diff --git a/data/slurm/home/config/slurm.conf b/data/slurm/home/config/slurm.conf deleted file mode 100644 index caa130b..0000000 --- a/data/slurm/home/config/slurm.conf +++ /dev/null @@ -1,48 +0,0 @@ -# slurm.conf file generated by configurator.html. -# Put this file on all nodes of your cluster. -# See the slurm.conf man page for more information. 
-#
-ClusterName=snowflake
-SlurmctldHost=slurmctld
-SlurmUser=slurm
-SlurmctldPort=6817
-SlurmdPort=6818
-MpiDefault=none
-ProctrackType=proctrack/linuxproc
-ReturnToService=1
-SlurmctldPidFile=/var/run/slurmctld.pid
-SlurmdPidFile=/var/run/slurmd.pid
-SlurmdSpoolDir=/var/spool/slurm/d
-StateSaveLocation=/var/spool/slurm/ctld
-SwitchType=switch/none
-TaskPlugin=task/affinity
-#
-# TIMERS
-InactiveLimit=0
-KillWait=30
-MinJobAge=300
-SlurmctldTimeout=120
-SlurmdTimeout=300
-Waittime=0
-#
-# SCHEDULING
-SchedulerType=sched/backfill
-SelectType=select/cons_tres
-#
-# LOGGING AND ACCOUNTING
-AccountingStorageHost=slurmdb
-AccountingStoragePort=6819
-AccountingStorageType=accounting_storage/slurmdbd
-AccountingStorageUser=slurm
-AccountingStoreFlags=job_script,job_comment,job_env,job_extra
-JobCompType=jobcomp/none
-JobAcctGatherFrequency=30
-JobAcctGatherType=jobacct_gather/linux
-SlurmctldDebug=info
-SlurmctldLogFile=/var/log/slurmctld.log
-SlurmdDebug=info
-SlurmdLogFile=/var/log/slurmd.log
-#
-# COMPUTE NODES
-NodeName=node0[1-2] CPUs=1 State=UNKNOWN
-PartitionName=main Nodes=ALL Default=YES MaxTime=INFINITE State=UP
diff --git a/dataGenerationScript.sh b/dataGenerationScript.sh
new file mode 100755
index 0000000..72efcd1
--- /dev/null
+++ b/dataGenerationScript.sh
@@ -0,0 +1,139 @@
+#!/bin/bash
+echo ""
+echo "|--------------------------------------------------------------------------------------|"
+echo "| This is the data generation script for the docker services |"
+echo "| Generating files required by docker services in data/ |"
+echo "|--------------------------------------------------------------------------------------|"
+
+# Download unedited checkpoint files to ./data/cc-metric-store-source/checkpoints
+# After this, migrateTimestamps.pl will run from setupDev.sh. This will update the timestamps
+# for all the checkpoint files, which then can be read by cc-metric-store.
+# cc-metric-store only reads data up to a certain age, e.g. 48 hours of data.
+# These checkpoint files have timestamps older than 48 hours and need to be updated by
+# migrateTimestamps.pl, which is automatically invoked from setupDev.sh.
+if [ ! -d data/cc-metric-store-source ]; then
+    mkdir -p data/cc-metric-store-source/checkpoints
+    cd data/cc-metric-store-source/checkpoints
+    wget https://hpc-mover.rrze.uni-erlangen.de/HPC-Data/0x7b58aefb/eig7ahyo6fo2bais0ephuf2aitohv1ai/cc-metric-store-checkpoints.tar.xz
+    tar xf cc-metric-store-checkpoints.tar.xz
+    rm cc-metric-store-checkpoints.tar.xz
+    cd ../../../
+else
+    echo "'data/cc-metric-store-source' already exists!"
+fi
+
+# A simple configuration file for the mariadb docker service.
+# Required because you can specify only one database per docker service.
+# This file mentions the database to be created for cc-backend.
+# This file is automatically picked up by mariadb after the docker service starts.
+if [ ! -d data/mariadb ]; then
+    mkdir -p data/mariadb
+    cat > data/mariadb/01.databases.sql < data/ldap/add_users.ldif < data/nats/docker-entrypoint.sh <>sample_alex.txt
+    done
+  done
+
+  # Nats client will publish the data from sample_alex.txt to 'hpc-nats' subject on this nats server
+  ./nats pub hpc-nats "\$(cat sample_alex.txt)" -s nats://0.0.0.0:4222 --user root --password root
+
+  # Generate data for fritz cluster.
Push to sample_fritz.txt + for metric in cpu_irq cpu_load mem_cached net_bytes_in cpu_user cpu_idle nfs4_read mem_used nfs4_write nfs4_total ib_xmit ib_xmit_pkts net_bytes_out cpu_iowait ib_recv cpu_system ib_recv_pkts; do + for hostname in f0201 f0202 f0203 f0204 f0205 f0206 f0207 f0208 f0209 f0210 f0211 f0212 f0213 f0214 f0215 f0217 f0218 f0219 f0220 f0221 f0222 f0223 f0224 f0225 f0226 f0227 f0228 f0229; do + echo "\$metric,cluster=fritz,hostname=\$hostname,type=node value=\$((1 + RANDOM % 100)).0 \$timestamp" >>sample_fritz.txt + done + done + + # Nats client will publish the data from sample_fritz.txt to 'hpc-nats' subject on this nats server + ./nats pub hpc-nats "\$(cat sample_fritz.txt)" -s nats://0.0.0.0:4222 --user root --password root + + rm sample_alex.txt + rm sample_fritz.txt + + sleep 1m + +done +EOF + +else + echo "'data/nats' already exists!" +fi + +# prepare folders for influxdb3 +if [ ! -d data/influxdb ]; then + mkdir -p data/influxdb/data + mkdir -p data/influxdb/config +else + echo "'data/influxdb' already exists!" +fi + +echo "" +echo "|--------------------------------------------------------------------------------------|" +echo "| Finished generating relevant files for docker services in data/ |" +echo "|--------------------------------------------------------------------------------------|" \ No newline at end of file diff --git a/docker-compose.yml b/docker-compose.yml old mode 100644 new mode 100755 index 345f60d..59f5891 --- a/docker-compose.yml +++ b/docker-compose.yml @@ -3,15 +3,19 @@ services: container_name: nats image: nats:alpine ports: - - "4222:4222" - - "8222:8222" + - "0.0.0.0:4222:4222" + - "0.0.0.0:8222:8222" + - "0.0.0.0:6222:6222" + volumes: + - ${DATADIR}/nats:/data + entrypoint: ["/bin/sh", "/data/docker-entrypoint.sh"] cc-metric-store: container_name: cc-metric-store build: context: ./cc-metric-store ports: - - "8084:8084" + - "0.0.0.0:8084:8084" volumes: - ${DATADIR}/cc-metric-store:/data depends_on: @@ -19,8 +23,8 @@ services: influxdb: container_name: influxdb - image: influxdb - command: ["--reporting-disabled"] + image: influxdb:latest + command: ["--reporting-disabled", "--log-level=debug"] environment: DOCKER_INFLUXDB_INIT_MODE: setup DOCKER_INFLUXDB_INIT_USERNAME: devel @@ -30,7 +34,7 @@ services: DOCKER_INFLUXDB_INIT_RETENTION: 100w DOCKER_INFLUXDB_INIT_ADMIN_TOKEN: ${INFLUXDB_ADMIN_TOKEN} ports: - - "127.0.0.1:${INFLUXDB_PORT}:8086" + - "0.0.0.0:8086:8086" volumes: - ${DATADIR}/influxdb/data:/var/lib/influxdb2 - ${DATADIR}/influxdb/config:/etc/influxdb2 @@ -40,9 +44,15 @@ services: image: osixia/openldap:1.5.0 command: --copy-service --loglevel debug environment: - - LDAP_ADMIN_PASSWORD=${LDAP_ADMIN_PASSWORD} - - LDAP_ORGANISATION=${LDAP_ORGANISATION} - - LDAP_DOMAIN=${LDAP_DOMAIN} + - LDAP_ADMIN_PASSWORD=mashup + - LDAP_ORGANISATION=Example Organization + - LDAP_DOMAIN=example.com + - LDAP_LOGGING=true + - LDAP_CONNECTION=default + - LDAP_CONNECTIONS=default + - LDAP_DEFAULT_HOSTS=0.0.0.0 + ports: + - "0.0.0.0:389:389" volumes: - ${DATADIR}/ldap:/container/service/slapd/assets/config/bootstrap/ldif/custom @@ -51,36 +61,18 @@ services: image: mariadb:latest command: ["--default-authentication-plugin=mysql_native_password"] environment: - MARIADB_ROOT_PASSWORD: ${MARIADB_ROOT_PASSWORD} + MARIADB_ROOT_PASSWORD: root MARIADB_DATABASE: slurm_acct_db MARIADB_USER: slurm MARIADB_PASSWORD: demo ports: - - "127.0.0.1:${MARIADB_PORT}:3306" + - "0.0.0.0:3306:3306" volumes: - - ${DATADIR}/mariadb:/etc/mysql/conf.d - # - 
${DATADIR}/sql-init:/docker-entrypoint-initdb.d + - ${DATADIR}/mariadb:/docker-entrypoint-initdb.d cap_add: - SYS_NICE - # mysql: - # container_name: mysql - # image: mysql:8.0.22 - # command: ["--default-authentication-plugin=mysql_native_password"] - # environment: - # MYSQL_ROOT_PASSWORD: ${MYSQL_ROOT_PASSWORD} - # MYSQL_DATABASE: ${MYSQL_DATABASE} - # MYSQL_USER: ${MYSQL_USER} - # MYSQL_PASSWORD: ${MYSQL_PASSWORD} - # ports: - # - "127.0.0.1:${MYSQL_PORT}:3306" - # # volumes: - # # - ${DATADIR}/sql-init:/docker-entrypoint-initdb.d - # # - ${DATADIR}/sqldata:/var/lib/mysql - # cap_add: - # - SYS_NICE - - slurm-controller: + slurmctld: container_name: slurmctld hostname: slurmctld build: @@ -89,40 +81,66 @@ services: volumes: - ${DATADIR}/slurm/home:/home - ${DATADIR}/slurm/secret:/.secret + - ./slurm/controller/slurm.conf:/home/config/slurm.conf + - /etc/timezone:/etc/timezone:ro + - /etc/localtime:/etc/localtime:ro + - ${DATADIR}/slurm/state:/var/lib/slurm/d + ports: + - "6817:6817" - slurm-database: - container_name: slurmdb - hostname: slurmdb + slurmdbd: + container_name: slurmdbd + hostname: slurmdbd build: context: ./slurm/database depends_on: - mariadb - - slurm-controller + - slurmctld privileged: true volumes: - ${DATADIR}/slurm/home:/home - ${DATADIR}/slurm/secret:/.secret + - ./slurm/database/slurmdbd.conf:/home/config/slurmdbd.conf + - /etc/timezone:/etc/timezone:ro + - /etc/localtime:/etc/localtime:ro + ports: + - "6819:6819" - slurm-worker01: + node01: container_name: node01 hostname: node01 build: context: ./slurm/worker depends_on: - - slurm-controller + - slurmctld privileged: true volumes: - ${DATADIR}/slurm/home:/home - ${DATADIR}/slurm/secret:/.secret + - ./slurm/worker/cgroup.conf:/home/config/cgroup.conf + - ./slurm/controller/slurm.conf:/home/config/slurm.conf + - /etc/timezone:/etc/timezone:ro + - /etc/localtime:/etc/localtime:ro + ports: + - "6818:6818" - # slurm-worker02: - # container_name: node02 - # hostname: node02 - # build: - # context: ./slurm/worker - # depends_on: - # - slurm-controller - # privileged: true - # volumes: - # - ${DATADIR}/slurm/home:/home - # - ${DATADIR}/slurm/secret:/.secret + slurmrestd: + container_name: slurmrestd + hostname: slurmrestd + build: + context: ./slurm/rest + environment: + - SLURM_JWT=daemon + - SLURMRESTD_DEBUG=9 + depends_on: + - slurmctld + privileged: true + volumes: + - ${DATADIR}/slurm/home:/home + - ${DATADIR}/slurm/secret:/.secret + - ./slurm/controller/slurm.conf:/home/config/slurm.conf + - ./slurm/rest/slurmrestd.conf:/home/config/slurmrestd.conf + - /etc/timezone:/etc/timezone:ro + - /etc/localtime:/etc/localtime:ro + ports: + - "6820:6820" \ No newline at end of file diff --git a/env-template.txt b/env-template.txt deleted file mode 100644 index 3bdeb8f..0000000 --- a/env-template.txt +++ /dev/null @@ -1,5 +0,0 @@ -SLURM_VERSION=22.05.6 -ARCH=aarch64 -MUNGE_UID=981 -SLURM_UID=982 -WORKER_UID=1000 diff --git a/migrateTimestamps.pl b/migrateTimestamps.pl index 5699c80..0ffa221 100755 --- a/migrateTimestamps.pl +++ b/migrateTimestamps.pl @@ -9,7 +9,6 @@ use File::Slurp; use Data::Dumper; use Time::Piece; use Sort::Versions; -use REST::Client; ### JOB-ARCHIVE my $localtime = localtime; @@ -19,80 +18,80 @@ my $archiveSrc = './data/job-archive-source'; my @ArchiveClusters; # Get clusters by job-archive/$subfolder -opendir my $dh, $archiveSrc or die "can't open directory: $!"; -while ( readdir $dh ) { - chomp; next if $_ eq '.' or $_ eq '..' 
or $_ eq 'job-archive'; +# opendir my $dh, $archiveSrc or die "can't open directory: $!"; +# while ( readdir $dh ) { +# chomp; next if $_ eq '.' or $_ eq '..' or $_ eq 'job-archive' or $_ eq 'version.txt'; - my $cluster = $_; - push @ArchiveClusters, $cluster; -} +# my $cluster = $_; +# push @ArchiveClusters, $cluster; +# } -# start for jobarchive -foreach my $cluster ( @ArchiveClusters ) { - print "Starting to update start- and stoptimes in job-archive for $cluster\n"; +# # start for jobarchive +# foreach my $cluster ( @ArchiveClusters ) { +# print "Starting to update start- and stoptimes in job-archive for $cluster\n"; - opendir my $dhLevel1, "$archiveSrc/$cluster" or die "can't open directory: $!"; - while ( readdir $dhLevel1 ) { - chomp; next if $_ eq '.' or $_ eq '..'; - my $level1 = $_; +# opendir my $dhLevel1, "$archiveSrc/$cluster" or die "can't open directory: $!"; +# while ( readdir $dhLevel1 ) { +# chomp; next if $_ eq '.' or $_ eq '..'; +# my $level1 = $_; - if ( -d "$archiveSrc/$cluster/$level1" ) { - opendir my $dhLevel2, "$archiveSrc/$cluster/$level1" or die "can't open directory: $!"; - while ( readdir $dhLevel2 ) { - chomp; next if $_ eq '.' or $_ eq '..'; - my $level2 = $_; - my $jobSource = "$archiveSrc/$cluster/$level1/$level2"; - my $jobTarget = "$archiveTarget/$cluster/$level1/$level2/"; - my $jobOrigin = $jobSource; - # check if files are directly accessible (old format) else get subfolders as file and update path - if ( ! -e "$jobSource/meta.json") { - my @folders = read_dir($jobSource); - if (!@folders) { - next; - } - # Only use first subfolder for now TODO - $jobSource = "$jobSource/".$folders[0]; - } - # check if subfolder contains file, else remove source and skip - if ( ! -e "$jobSource/meta.json") { - # rmtree $jobOrigin; - next; - } +# if ( -d "$archiveSrc/$cluster/$level1" ) { +# opendir my $dhLevel2, "$archiveSrc/$cluster/$level1" or die "can't open directory: $!"; +# while ( readdir $dhLevel2 ) { +# chomp; next if $_ eq '.' or $_ eq '..'; +# my $level2 = $_; +# my $jobSource = "$archiveSrc/$cluster/$level1/$level2"; +# my $jobTarget = "$archiveTarget/$cluster/$level1/$level2/"; +# my $jobOrigin = $jobSource; +# # check if files are directly accessible (old format) else get subfolders as file and update path +# if ( ! -e "$jobSource/meta.json") { +# my @folders = read_dir($jobSource); +# if (!@folders) { +# next; +# } +# # Only use first subfolder for now TODO +# $jobSource = "$jobSource/".$folders[0]; +# } +# # check if subfolder contains file, else remove source and skip +# if ( ! 
-e "$jobSource/meta.json") { +# # rmtree $jobOrigin; +# next; +# } - my $rawstr = read_file("$jobSource/meta.json"); - my $json = decode_json($rawstr); +# my $rawstr = read_file("$jobSource/meta.json"); +# my $json = decode_json($rawstr); - # NOTE Start meta.json iteration here - # my $random_number = int(rand(UPPERLIMIT)) + LOWERLIMIT; - # Set new startTime: Between 5 days and 1 day before now +# # NOTE Start meta.json iteration here +# # my $random_number = int(rand(UPPERLIMIT)) + LOWERLIMIT; +# # Set new startTime: Between 5 days and 1 day before now - # Remove id from attributes - $json->{startTime} = $epochtime - (int(rand(432000)) + 86400); - $json->{stopTime} = $json->{startTime} + $json->{duration}; +# # Remove id from attributes +# $json->{startTime} = $epochtime - (int(rand(432000)) + 86400); +# $json->{stopTime} = $json->{startTime} + $json->{duration}; - # Add starttime subfolder to target path - $jobTarget .= $json->{startTime}; +# # Add starttime subfolder to target path +# $jobTarget .= $json->{startTime}; - # target is not directory - if ( not -d $jobTarget ){ - # print "Writing files\n"; - # print "$cluster/$level1/$level2\n"; - make_path($jobTarget); +# # target is not directory +# if ( not -d $jobTarget ){ +# # print "Writing files\n"; +# # print "$cluster/$level1/$level2\n"; +# make_path($jobTarget); - my $outstr = encode_json($json); - write_file("$jobTarget/meta.json", $outstr); +# my $outstr = encode_json($json); +# write_file("$jobTarget/meta.json", $outstr); - my $datstr = read_file("$jobSource/data.json"); - write_file("$jobTarget/data.json", $datstr); - } else { - # rmtree $jobSource; - } - } - } - } -} -print "Done for job-archive\n"; -sleep(1); +# my $datstr = read_file("$jobSource/data.json.gz"); +# write_file("$jobTarget/data.json.gz", $datstr); +# } else { +# # rmtree $jobSource; +# } +# } +# } +# } +# } +# print "Done for job-archive\n"; +# sleep(1); ## CHECKPOINTS chomp(my $checkpointStart=`date --date 'TZ="Europe/Berlin" 0:00 7 days ago' +%s`); diff --git a/misc/config.json b/misc/config.json new file mode 100644 index 0000000..2977e72 --- /dev/null +++ b/misc/config.json @@ -0,0 +1,77 @@ +{ + "addr": "127.0.0.1:8080", + "short-running-jobs-duration": 300, + "archive": { + "kind": "file", + "path": "./var/job-archive" + }, + "jwts": { + "max-age": "2000h" + }, + "db-driver": "mysql", + "db": "root:root@tcp(0.0.0.0:3306)/ccbackend", + "ldap": { + "url": "ldap://0.0.0.0", + "user_base": "ou=users,dc=example,dc=com", + "search_dn": "cn=admin,dc=example,dc=com", + "user_bind": "uid={username},ou=users,dc=example,dc=com", + "user_filter": "(&(objectclass=posixAccount))", + "syncUserOnLogin": true + }, + "enable-resampling": { + "trigger": 30, + "resolutions": [ + 600, + 300, + 120, + 60 + ] + }, + "emission-constant": 317, + "clusters": [ + { + "name": "fritz", + "metricDataRepository": { + "kind": "cc-metric-store", + "url": "http://0.0.0.0:8084", + "token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJFZERTQSJ9.eyJ1c2VyIjoiYWRtaW4iLCJyb2xlcyI6WyJST0xFX0FETUlOIiwiUk9MRV9BTkFMWVNUIiwiUk9MRV9VU0VSIl19.d-3_3FZTsadPjDEdsWrrQ7nS0edMAR4zjl-eK7rJU3HziNBfI9PDHDIpJVHTNN5E5SlLGLFXctWyKAkwhXL-Dw" + }, + "filterRanges": { + "numNodes": { + "from": 1, + "to": 64 + }, + "duration": { + "from": 0, + "to": 86400 + }, + "startTime": { + "from": "2022-01-01T00:00:00Z", + "to": null + } + } + }, + { + "name": "alex", + "metricDataRepository": { + "kind": "cc-metric-store", + "url": "http://0.0.0.0:8084", + "token": 
"eyJ0eXAiOiJKV1QiLCJhbGciOiJFZERTQSJ9.eyJ1c2VyIjoiYWRtaW4iLCJyb2xlcyI6WyJST0xFX0FETUlOIiwiUk9MRV9BTkFMWVNUIiwiUk9MRV9VU0VSIl19.d-3_3FZTsadPjDEdsWrrQ7nS0edMAR4zjl-eK7rJU3HziNBfI9PDHDIpJVHTNN5E5SlLGLFXctWyKAkwhXL-Dw" + }, + "filterRanges": { + "numNodes": { + "from": 1, + "to": 64 + }, + "duration": { + "from": 0, + "to": 86400 + }, + "startTime": { + "from": "2022-01-01T00:00:00Z", + "to": null + } + } + } + ] +} \ No newline at end of file diff --git a/misc/curl_slurmrestd.sh b/misc/curl_slurmrestd.sh new file mode 100755 index 0000000..8168267 --- /dev/null +++ b/misc/curl_slurmrestd.sh @@ -0,0 +1,3 @@ +SLURM_JWT=$(cat data/slurm/secret/jwt_token.txt) +curl -X 'GET' -v 'http://localhost:6820/slurm/v0.0.39/node/node01' --location --silent --show-error -H "X-SLURM-USER-NAME: root" -H "X-SLURM-USER-TOKEN: $SLURM_JWT" +# curl -v --unix-socket data/slurm/tmp/slurmrestd.socket 'http://localhost:6820/slurm/v0.0.39/ping' \ No newline at end of file diff --git a/misc/jwt_verifier.py b/misc/jwt_verifier.py new file mode 100644 index 0000000..e9ec78e --- /dev/null +++ b/misc/jwt_verifier.py @@ -0,0 +1,27 @@ +#!/usr/bin/env python3 +import sys +import os +import pprint +import json +import time +from datetime import datetime, timedelta, timezone + +from jwt import JWT +from jwt.jwa import HS256 +from jwt.jwk import jwk_from_dict +from jwt.utils import b64decode,b64encode + +if len(sys.argv) != 2: + sys.exit("verify_jwt.py [JWT Token]"); + +with open("data/slurm/secret/jwt_hs256.key", "rb") as f: + priv_key = f.read() + +signing_key = jwk_from_dict({ + 'kty': 'oct', + 'k': b64encode(priv_key) +}) + +a = JWT() +b = a.decode(sys.argv[1], signing_key, algorithms=["HS256"]) +print(b) \ No newline at end of file diff --git a/scripts/prerequisite_installation_script.sh b/scripts/prerequisite_installation_script.sh new file mode 100644 index 0000000..e061a13 --- /dev/null +++ b/scripts/prerequisite_installation_script.sh @@ -0,0 +1,40 @@ +#!/bin/bash -l + +sudo apt-get update +sudo apt-get upgrade -f -y + +# Add Docker's official GPG key: +sudo apt-get update +sudo apt-get install ca-certificates curl +sudo install -m 0755 -d /etc/apt/keyrings +sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc +sudo chmod a+r /etc/apt/keyrings/docker.asc + +# Add the repository to Apt sources: +echo \ + "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \ + $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \ + sudo tee /etc/apt/sources.list.d/docker.list > /dev/null +sudo apt-get update + +sudo apt-get install -f -y gcc +sudo apt-get install -f -y npm +sudo apt-get install -f -y make +sudo apt-get install -f -y gh +sudo apt-get install -f -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin +sudo apt-get install -f -y docker-compose +sudo apt install perl -f -y libdatetime-perl libjson-perl +sudo apt-get install -f -y golang-go + +sudo cpan Cpanel::JSON::XS +sudo cpan File::Slurp +sudo cpan Data::Dumper +sudo cpan Time::Piece +sudo cpan Sort::Versions + +sudo groupadd docker +sudo usermod -aG docker ubuntu + +sudo shutdown -r -t 0 + + diff --git a/setupDev.sh b/setupDev.sh index 90aa011..2549181 100755 --- a/setupDev.sh +++ b/setupDev.sh @@ -1,48 +1,42 @@ #!/bin/bash +echo "" +echo "|--------------------------------------------------------------------------------------|" +echo "| Welcome to cc-docker automatic deployment script. 
|" +echo "| Make sure you have sudo rights to run docker services |" +echo "| This script assumes that docker command is added to sudo group |" +echo "| This means that docker commands do not explicitly require |" +echo "| 'sudo' keyword to run. You can use this following command: |" +echo "| |" +echo "| > sudo groupadd docker |" +echo "| > sudo usermod -aG docker $USER |" +echo "| |" +echo "| This will add docker to the sudo usergroup and all the docker |" +echo "| command will run as sudo by default without requiring |" +echo "| 'sudo' keyword. |" +echo "|--------------------------------------------------------------------------------------|" +echo "" -# Check cc-backend, touch job.db if exists +# Check cc-backend if exists if [ ! -d cc-backend ]; then echo "'cc-backend' not yet prepared! Please clone cc-backend repository before starting this script." echo -n "Stopped." exit -else - cd cc-backend - if [ ! -d var ]; then - mkdir var - touch var/job.db - else - echo "'cc-backend/var' exists. Cautiously exiting." - echo -n "Stopped." - exit - fi fi - -# Download unedited job-archive to ./data/job-archive-source -if [ ! -d data/job-archive-source ]; then - cd data - wget https://hpc-mover.rrze.uni-erlangen.de/HPC-Data/0x7b58aefb/eig7ahyo6fo2bais0ephuf2aitohv1ai/job-archive-demo.tar - tar xf job-archive-demo.tar - mv ./job-archive ./job-archive-source - rm ./job-archive-demo.tar - cd .. -else - echo "'data/job-archive-source' already exists!" +# Creates data directory if it does not exists. +# Contains all the mount points required by all the docker services +# and their static files. +if [ ! -d data ]; then + mkdir -m777 data fi -# Download unedited checkpoint files to ./data/cc-metric-store-source/checkpoints -if [ ! -d data/cc-metric-store-source ]; then - mkdir -p data/cc-metric-store-source/checkpoints - cd data/cc-metric-store-source/checkpoints - wget https://hpc-mover.rrze.uni-erlangen.de/HPC-Data/0x7b58aefb/eig7ahyo6fo2bais0ephuf2aitohv1ai/cc-metric-store-checkpoints.tar.xz - tar xf cc-metric-store-checkpoints.tar.xz - rm cc-metric-store-checkpoints.tar.xz - cd ../../../ -else - echo "'data/cc-metric-store-source' already exists!" -fi +# Invokes the dataGenerationScript.sh, which then populates the required +# static files by the docker services. These static files are required by docker services after startup. +chmod u+x dataGenerationScript.sh +./dataGenerationScript.sh -# Update timestamps +# Update timestamps for all the checkpoints in data/cc-metric-store-source +# and dumps new files in data/cc-metric-store. perl ./migrateTimestamps.pl # Create archive folder for rewritten ccms checkpoints @@ -51,32 +45,54 @@ if [ ! -d data/cc-metric-store/archive ]; then fi # cleanup sources -# rm -r ./data/job-archive-source -# rm -r ./data/cc-metric-store-source - -# prepare folders for influxdb2 -if [ ! -d data/influxdb ]; then - mkdir -p data/influxdb/data - mkdir -p data/influxdb/config/influx-configs -else - echo "'data/influxdb' already exists!" +if [ -d data/cc-metric-store-source ]; then + rm -r data/cc-metric-store-source fi -# Check dotenv-file and docker-compose-yml, copy accordingly if not present and build docker services -if [ ! -d .env ]; then - cp templates/env.default ./.env -fi +# Just in case user forgot manually shutdown the docker services. +docker-compose down +docker-compose down --remove-orphans -if [ ! -d docker-compose.yml ]; then - cp templates/docker-compose.yml.default ./docker-compose.yml -fi +# This automatically builds the base docker image for slurm. 
@@ -51,32 +45,54 @@ if [ ! -d data/cc-metric-store/archive ]; then
 fi
 
 # cleanup sources
-# rm -r ./data/job-archive-source
-# rm -r ./data/cc-metric-store-source
-
-# prepare folders for influxdb2
-if [ ! -d data/influxdb ]; then
-    mkdir -p data/influxdb/data
-    mkdir -p data/influxdb/config/influx-configs
-else
-    echo "'data/influxdb' already exists!"
+if [ -d data/cc-metric-store-source ]; then
+    rm -r data/cc-metric-store-source
 fi
 
-# Check dotenv-file and docker-compose-yml, copy accordingly if not present and build docker services
-if [ ! -d .env ]; then
-    cp templates/env.default ./.env
-fi
+# In case the user forgot to manually shut down the docker services.
+docker-compose down
+docker-compose down --remove-orphans
 
-if [ ! -d docker-compose.yml ]; then
-    cp templates/docker-compose.yml.default ./docker-compose.yml
-fi
+# This automatically builds the base docker image for slurm.
+# All the slurm docker services in docker-compose.yml refer to
+# the base image created from this directory.
+cd slurm/base/
+make
+cd ../..
 
+# Build and start all the docker services from docker-compose.yml.
 docker-compose build
-./cc-backend/cc-backend --init-db --add-user demo:admin:AdminDev
 docker-compose up -d
 
+cd cc-backend
+if [ ! -d var ]; then
+    wget https://hpc-mover.rrze.uni-erlangen.de/HPC-Data/0x7b58aefb/eig7ahyo6fo2bais0ephuf2aitohv1ai/job-archive-demo.tar
+    tar xf job-archive-demo.tar
+    rm ./job-archive-demo.tar
+
+    cp ./configs/env-template.txt .env
+    cp -f ../misc/config.json config.json
+
+    make
+
+    ./cc-backend -migrate-db
+    ./cc-backend --init-db --add-user demo:admin:demo
+    cd ..
+else
+    cd ..
+    echo "'cc-backend/var' exists. Cautiously exiting."
+fi
+
+echo ""
+echo "|--------------------------------------------------------------------------------------|"
+echo "| Check logs for each slurm service by using these commands:                            |"
+echo "|     docker-compose logs slurmctld                                                     |"
+echo "|     docker-compose logs slurmdbd                                                      |"
+echo "|     docker-compose logs slurmrestd                                                    |"
+echo "|     docker-compose logs node01                                                        |"
+echo "|======================================================================================|"
+echo "| Setup complete, containers are up by default: Shut down with 'docker-compose down'.   |"
+echo "| Use './cc-backend/cc-backend -server' to start cc-backend.                            |"
+echo "| Use scripts in /scripts to load data into influx or mariadb.                          |"
+echo "|--------------------------------------------------------------------------------------|"
 echo ""
-echo "Setup complete, containers are up by default: Shut down with 'docker-compose down'."
-echo "Use './cc-backend/cc-backend' to start cc-backend."
-echo "Use scripts in /scripts to load data into influx or mariadb."
-# ./cc-backend/cc-backend
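After setupDev.sh completes, a short smoke test can confirm the stack came up. A hedged sketch (service names and ports as declared in docker-compose.yml and misc/config.json; none of this is part of the script itself):

```bash
# All services should report 'Up'.
docker-compose ps

# Ports as mapped in docker-compose.yml: slurmrestd 6820, cc-metric-store 8084.
nc -z localhost 6820 && echo "slurmrestd reachable"
nc -z localhost 8084 && echo "cc-metric-store reachable"

# Tail the controller log if anything looks wrong.
docker-compose logs --tail=20 slurmctld
```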
diff --git a/slurm/base/Dockerfile b/slurm/base/Dockerfile
index a006cc2..ca6b27f 100644
--- a/slurm/base/Dockerfile
+++ b/slurm/base/Dockerfile
@@ -1,41 +1,40 @@
 FROM rockylinux:8
-MAINTAINER Jan Eitzinger
+LABEL org.opencontainers.image.authors="jan.eitzinger@fau.de"
 
-ENV SLURM_VERSION=22.05.6
-ENV ARCH=aarch64
+ENV SLURM_VERSION=24.05.3
+ENV HTTP_PARSER_VERSION=2.8.0
 
-RUN yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm -y
+RUN yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
+# http-parser is required by slurmrestd; fetch the RPM matching the build architecture.
+RUN ARCH=$(uname -m) && yum install -y https://rpmfind.net/linux/almalinux/8.10/PowerTools/$ARCH/os/Packages/http-parser-devel-${HTTP_PARSER_VERSION}-9.el8.$ARCH.rpm
 
 RUN groupadd -g 981 munge \
     && useradd -m -c "MUNGE Uid 'N' Gid Emporium" -d /var/lib/munge -u 981 -g munge -s /sbin/nologin munge \
-    && groupadd -g 982 slurm \
-    && useradd -m -c "Slurm workload manager" -d /var/lib/slurm -u 982 -g slurm -s /bin/bash slurm \
-    && groupadd -g 1000 worker \
-    && useradd -m -c "Workflow user" -d /home/worker -u 1000 -g worker -s /bin/bash worker
+    && groupadd -g 1000 slurm \
+    && useradd -m -c "Slurm workload manager" -d /var/lib/slurm -u 1000 -g slurm -s /bin/bash slurm \
+    && groupadd -g 982 worker \
+    && useradd -m -c "Workflow user" -d /home/worker -u 982 -g worker -s /bin/bash worker
 
-RUN yum install -y munge munge-libs
-RUN dnf --enablerepo=powertools install munge-devel -y
-RUN yum install rng-tools -y
+RUN yum install -y munge munge-libs rng-tools \
+    python3 gcc openssl openssl-devel \
+    openssh-server openssh-clients dbus-devel \
+    pam-devel numactl numactl-devel hwloc sudo \
+    lua readline-devel ncurses-devel man2html \
+    autoconf automake json-c-devel libjwt-devel \
+    libibmad libibumad rpm-build perl-ExtUtils-MakeMaker.noarch rpm-build make wget
 
-RUN yum install -y python3 gcc openssl openssl-devel \
-openssh-server openssh-clients dbus-devel \
-pam-devel numactl numactl-devel hwloc sudo \
-lua readline-devel ncurses-devel man2html \
-libibmad libibumad rpm-build perl-ExtUtils-MakeMaker.noarch rpm-build make wget
+RUN dnf --enablerepo=powertools install -y munge-devel rrdtool-devel lua-devel hwloc-devel mariadb-server mariadb-devel
 
-RUN dnf --enablerepo=powertools install rrdtool-devel lua-devel hwloc-devel rpm-build -y
-RUN dnf install mariadb-server mariadb-devel -y
-RUN mkdir /usr/local/slurm-tmp
-RUN cd /usr/local/slurm-tmp
-RUN wget https://download.schedmd.com/slurm/slurm-${SLURM_VERSION}.tar.bz2
-RUN rpmbuild -ta slurm-${SLURM_VERSION}.tar.bz2
+RUN mkdir -p /usr/local/slurm-tmp \
+    && cd /usr/local/slurm-tmp \
+    && wget https://download.schedmd.com/slurm/slurm-${SLURM_VERSION}.tar.bz2 \
+    && rpmbuild -ta --with slurmrestd --with jwt slurm-${SLURM_VERSION}.tar.bz2
 
-WORKDIR /root/rpmbuild/RPMS/${ARCH}
-RUN yum -y --nogpgcheck localinstall \
-    slurm-${SLURM_VERSION}-1.el8.${ARCH}.rpm \
-    slurm-perlapi-${SLURM_VERSION}-1.el8.${ARCH}.rpm \
-    slurm-slurmctld-${SLURM_VERSION}-1.el8.${ARCH}.rpm
-WORKDIR /
+RUN ARCH=$(uname -m) \
+    && yum -y --nogpgcheck localinstall \
+    /root/rpmbuild/RPMS/$ARCH/slurm-${SLURM_VERSION}*.$ARCH.rpm \
+    /root/rpmbuild/RPMS/$ARCH/slurm-perlapi-${SLURM_VERSION}*.$ARCH.rpm \
+    /root/rpmbuild/RPMS/$ARCH/slurm-slurmctld-${SLURM_VERSION}*.$ARCH.rpm
 
 VOLUME ["/home", "/.secret"]
 # 22: SSH
@@ -43,4 +42,5 @@ VOLUME ["/home", "/.secret"]
 # 6817: SlurmCtlD
 # 6818: SlurmD
 # 6819: SlurmDBD
-EXPOSE 22 6817 6818 6819
+# 6820: SlurmRestD
+EXPOSE 22 6817 6818 6819 6820
diff --git a/slurm/base/Makefile 
b/slurm/base/Makefile index dc0dff3..01029b8 100644 --- a/slurm/base/Makefile +++ b/slurm/base/Makefile @@ -1,6 +1,6 @@ include ../../.env IMAGE = clustercockpit/slurm.base - +SLURM_VERSION = 24.05.3 .PHONY: build clean build: diff --git a/slurm/controller/Dockerfile b/slurm/controller/Dockerfile index b627826..470748d 100644 --- a/slurm/controller/Dockerfile +++ b/slurm/controller/Dockerfile @@ -1,5 +1,5 @@ -FROM clustercockpit/slurm.base:22.05.6 -MAINTAINER Jan Eitzinger +FROM clustercockpit/slurm.base:24.05.3 +LABEL org.opencontainers.image.authors="jan.eitzinger@fau.de" # clean up RUN rm -f /root/rpmbuild/RPMS/slurm-*.rpm \ @@ -7,4 +7,5 @@ RUN rm -f /root/rpmbuild/RPMS/slurm-*.rpm \ && rm -rf /var/cache/yum COPY docker-entrypoint.sh /docker-entrypoint.sh +CMD ["/usr/sbin/init"] ENTRYPOINT ["/docker-entrypoint.sh"] diff --git a/slurm/controller/docker-entrypoint.sh b/slurm/controller/docker-entrypoint.sh index 75e36db..72ac3c9 100755 --- a/slurm/controller/docker-entrypoint.sh +++ b/slurm/controller/docker-entrypoint.sh @@ -1,23 +1,43 @@ #!/usr/bin/env bash set -e +# Determine the system architecture dynamically +ARCH=$(uname -m) +SLURM_VERSION="24.05.3" +SLURM_JWT=daemon +SLURMRESTD_SECURITY=disable_user_check + +_delete_secrets() { + if [ -f /.secret/munge.key ]; then + echo "Removing secrets" + sudo rm -rf /.secret/munge.key + sudo rm -rf /.secret/worker-secret.tar.gz + sudo rm -rf /.secret/setup-worker-ssh.sh + sudo rm -rf /.secret/jwt_hs256.key + sudo rm -rf /.secret/jwt_token.txt + + echo "Done removing secrets" + ls /.secret/ + fi +} + # start sshd server _sshd_host() { - if [ ! -d /var/run/sshd ]; then - mkdir /var/run/sshd - ssh-keygen -t rsa -f /etc/ssh/ssh_host_rsa_key -N '' - fi - echo "Starting sshd" - /usr/sbin/sshd + if [ ! -d /var/run/sshd ]; then + mkdir /var/run/sshd + ssh-keygen -t rsa -f /etc/ssh/ssh_host_rsa_key -N '' + fi + echo "Starting sshd" + /usr/sbin/sshd } # setup worker ssh to be passwordless _ssh_worker() { - if [[ ! -d /home/worker ]]; then + if [[ ! -d /home/worker ]]; then mkdir -p /home/worker chown -R worker:worker /home/worker fi - cat > /home/worker/setup-worker-ssh.sh </home/worker/setup-worker-ssh.sh < /etc/munge/munge.key" + sh -c "dd if=/dev/urandom bs=1 count=1024 > /etc/munge/munge.key" chown munge: /etc/munge/munge.key - chmod 400 /etc/munge/munge.key + chmod 600 /etc/munge/munge.key sudo -u munge /sbin/munged munge -n munge -n | unmunge @@ -61,31 +81,97 @@ _munge_start() { # copy secrets to /.secret directory for other nodes _copy_secrets() { - cp /home/worker/worker-secret.tar.gz /.secret/worker-secret.tar.gz - cp /home/worker/setup-worker-ssh.sh /.secret/setup-worker-ssh.sh - cp /etc/munge/munge.key /.secret/munge.key - rm -f /home/worker/worker-secret.tar.gz - rm -f /home/worker/setup-worker-ssh.sh + while [ ! -f /home/worker/worker-secret.tar.gz ]; do + echo -n "." 
+        sleep 1
+    done
+    cp /home/worker/worker-secret.tar.gz /.secret/worker-secret.tar.gz
+    cp /home/worker/setup-worker-ssh.sh /.secret/setup-worker-ssh.sh
+    cp /etc/munge/munge.key /.secret/munge.key
+    rm -f /home/worker/worker-secret.tar.gz
+    rm -f /home/worker/setup-worker-ssh.sh
+}
+
+_openssl_jwt_key() {
+
+    mkdir -p /var/spool/slurm/statesave
+    dd if=/dev/random of=/var/spool/slurm/statesave/jwt_hs256.key bs=32 count=1
+    chown slurm:slurm /var/spool/slurm/statesave/jwt_hs256.key
+    chmod 0600 /var/spool/slurm/statesave/jwt_hs256.key
+    chown slurm:slurm /var/spool/slurm/statesave
+    chmod 0755 /var/spool/slurm/statesave
+    # World-readable copy on the shared volume so the other containers and
+    # the host-side helper scripts can pick it up (dev setup only).
+    cp /var/spool/slurm/statesave/jwt_hs256.key /.secret/jwt_hs256.key
+    chmod 777 /.secret/jwt_hs256.key
+}
+
+_generate_jwt_token() {
+
+    secret_key=$(cat /var/spool/slurm/statesave/jwt_hs256.key)
+    start_time=$(date +%s)
+    exp_time=$((start_time + 100000000))
+    base64url() {
+        # Don't wrap, make URL-safe, delete trailer.
+        base64 -w 0 | tr '+/' '-_' | tr -d '='
+    }
+
+    jwt_header=$(echo -n '{"alg":"HS256","typ":"JWT"}' | base64url)
+
+    # Claims Slurm expects: iat, exp, and sun (the Slurm user name).
+    jwt_claims=$(cat <<EOF | jq -Mcj '.'
+{
+    "iat": $start_time,
+    "exp": $exp_time,
+    "sun": "root"
+}
+EOF
+    )
+    # jq -Mcj: Monochrome output, compact output, join lines
+
+    jwt_signature=$(echo -n "${jwt_header}.${jwt_claims}" |
+        openssl dgst -sha256 -hmac "$secret_key" -binary | base64url)
+
+    # Use the same colours as jwt.io, more-or-less.
+    echo "$(tput setaf 1)${jwt_header}$(tput sgr0).$(tput setaf 5)${jwt_claims}$(tput sgr0).$(tput setaf 6)${jwt_signature}$(tput sgr0)"
+
+    jwt="${jwt_header}.${jwt_claims}.${jwt_signature}"
+
+    echo "$jwt" >/.secret/jwt_token.txt
+    chmod 777 /.secret/jwt_token.txt
 }
 
 # run slurmctld
 _slurmctld() {
-    cd /root/rpmbuild/RPMS/aarch64
-    yum -y --nogpgcheck localinstall slurm-22.05.6-1.el8.aarch64.rpm \
-        slurm-perlapi-22.05.6-1.el8.aarch64.rpm \
-        slurm-slurmd-22.05.6-1.el8.aarch64.rpm \
-        slurm-torque-22.05.6-1.el8.aarch64.rpm \
-        slurm-slurmctld-22.05.6-1.el8.aarch64.rpm
+    cd /root/rpmbuild/RPMS/$ARCH
+
+    yum -y --nogpgcheck localinstall slurm-$SLURM_VERSION*.$ARCH.rpm \
+        slurm-perlapi-$SLURM_VERSION*.$ARCH.rpm \
+        slurm-slurmd-$SLURM_VERSION*.$ARCH.rpm \
+        slurm-torque-$SLURM_VERSION*.$ARCH.rpm \
+        slurm-slurmctld-$SLURM_VERSION*.$ARCH.rpm
     echo "checking for slurmdbd.conf"
     while [ ! -f /.secret/slurmdbd.conf ]; do
-        echo -n "."
+        echo "."
         sleep 1
     done
     echo ""
-    mkdir -p /var/spool/slurm/ctld /var/spool/slurm/d /var/log/slurm /etc/slurm
-    chown -R slurm: /var/spool/slurm/ctld /var/spool/slurm/d /var/log/slurm
+    mkdir -p /var/spool/slurm/ctld /var/spool/slurm/d /var/log/slurm /etc/slurm /var/run/slurm/d /var/run/slurm/ctld /var/lib/slurm/d /var/lib/slurm/ctld
+    chown -R slurm: /var/spool/slurm/ctld /var/spool/slurm/d /var/log/slurm /var/spool /var/lib /var/run/slurm/d /var/run/slurm/ctld /var/lib/slurm/d /var/lib/slurm/ctld
+    mkdir -p /etc/config
+    chown -R slurm: /etc/config
+
     touch /var/log/slurmctld.log
-    chown slurm: /var/log/slurmctld.log
+    chown -R slurm: /var/log/slurmctld.log
+    touch /var/log/slurmd.log
+    chown -R slurm: /var/log/slurmd.log
+
+    touch /var/lib/slurm/d/job_state
+    chown -R slurm: /var/lib/slurm/d/job_state
+    touch /var/lib/slurm/d/fed_mgr_state
+    chown -R slurm: /var/lib/slurm/d/fed_mgr_state
+    touch /var/run/slurm/d/slurmctld.pid
+    chown -R slurm: /var/run/slurm/d/slurmctld.pid
+    touch /var/run/slurm/d/slurmd.pid
+    chown -R slurm: /var/run/slurm/d/slurmd.pid
+
    if [[ ! 
-f /home/config/slurm.conf ]]; then echo "### Missing slurm.conf ###" exit @@ -95,15 +181,43 @@ _slurmctld() { chown slurm: /etc/slurm/slurm.conf chmod 600 /etc/slurm/slurm.conf fi - sacctmgr -i add cluster "snowflake" + + sudo yum install -y nc + sudo yum install -y procps + sudo yum install -y iputils + sudo yum install -y lsof + sudo yum install -y jq + + _openssl_jwt_key + + if [ ! -f /.secret/jwt_hs256.key ]; then + echo "### Missing jwt.key ###" + exit 1 + else + cp /.secret/jwt_hs256.key /etc/config/jwt_hs256.key + chown slurm: /etc/config/jwt_hs256.key + chmod 0600 /etc/config/jwt_hs256.key + fi + + _generate_jwt_token + + while ! nc -z slurmdbd 6819; do + echo "Waiting for slurmdbd to be ready..." + sleep 2 + done + + sacctmgr -i add cluster name=linux sleep 2s - echo "Starting slurmctld" + echo "Starting slurmctld" cp -f /etc/slurm/slurm.conf /.secret/ - /usr/sbin/slurmctld + /usr/sbin/slurmctld -Dvv + echo "Started slurmctld" } ### main ### +_delete_secrets _sshd_host + _ssh_worker _munge_start _copy_secrets diff --git a/slurm/controller/slurm.conf b/slurm/controller/slurm.conf new file mode 100644 index 0000000..f5a34ef --- /dev/null +++ b/slurm/controller/slurm.conf @@ -0,0 +1,108 @@ +# slurm.conf +# +# See the slurm.conf man page for more information. +# +ClusterName=linux +ControlMachine=slurmctld +ControlAddr=slurmctld +#BackupController= +#BackupAddr= +# +SlurmUser=slurm +#SlurmdUser=root +SlurmctldPort=6817 +SlurmdPort=6818 +AuthType=auth/munge +#JobCredentialPrivateKey= +#JobCredentialPublicCertificate= +StateSaveLocation=/var/lib/slurm/d +SlurmdSpoolDir=/var/spool/slurm/d +SwitchType=switch/none +MpiDefault=none +SlurmctldPidFile=/var/run/slurm/d/slurmctld.pid +SlurmdPidFile=/var/run/slurm/d/slurmd.pid +ProctrackType=proctrack/linuxproc +AuthAltTypes=auth/jwt +AuthAltParameters=jwt_key=/var/spool/slurm/statesave/jwt_hs256.key +#PluginDir= +#CacheGroups=0 +#FirstJobId= +ReturnToService=0 +#MaxJobCount= +#PlugStackConfig= +#PropagatePrioProcess= +#PropagateResourceLimits= +#PropagateResourceLimitsExcept= +#Prolog= +#Epilog= +#SrunProlog= +#SrunEpilog= +#TaskProlog= +#TaskEpilog= +TaskPlugin=task/affinity +#TrackWCKey=no +#TreeWidth=50 +#TmpFS= +#UsePAM= +# +# TIMERS +SlurmctldTimeout=300 +SlurmdTimeout=300 +InactiveLimit=0 +MinJobAge=300 +KillWait=30 +Waittime=0 +# +# SCHEDULING +SchedulerType=sched/backfill +#SchedulerAuth= +#SchedulerPort= +#SchedulerRootFilter= +# SelectType=select/con_res +SelectTypeParameters=CR_CPU_Memory +# FastSchedule=1 +#PriorityType=priority/multifactor +#PriorityDecayHalfLife=14-0 +#PriorityUsageResetPeriod=14-0 +#PriorityWeightFairshare=100000 +#PriorityWeightAge=1000 +#PriorityWeightPartition=10000 +#PriorityWeightJobSize=1000 +#PriorityMaxAge=1-0 +# +# LOGGING +SlurmctldDebug=6 +SlurmctldLogFile=/var/log/slurm/slurmctld.log +SlurmdDebug=6 +SlurmdLogFile=/var/log/slurm/slurmd.log +JobCompType=jobcomp/filetxt +JobCompLoc=/var/log/slurm/jobcomp.log +# +# ACCOUNTING +JobAcctGatherType=jobacct_gather/linux +#JobAcctGatherType=jobacct_gather/cgroup +#ProctrackType=proctrack/cgroup + +JobAcctGatherFrequency=30 +# +AccountingStorageType=accounting_storage/slurmdbd +AccountingStorageHost=slurmdbd +AccountingStoragePort=6819 +#AccountingStorageLoc=slurm_acct_db +#AccountingStoragePass= +#AccountingStorageUser= +# + +# COMPUTE NODES +PartitionName=DEFAULT Nodes=node01 +PartitionName=debug Nodes=node01 Default=YES MaxTime=INFINITE State=UP + +# # COMPUTE NODES +# NodeName=c[1-2] RealMemory=1000 State=UNKNOWN +NodeName=node01 CPUs=1 Boards=1 
SocketsPerBoard=1 CoresPerSocket=1 ThreadsPerCore=1 + +# # +# # PARTITIONS +# PartitionName=normal Default=yes Nodes=c[1-2] Priority=50 DefMemPerCPU=500 Shared=NO MaxNodes=2 MaxTime=5-00:00:00 DefaultTime=5-00:00:00 State=UP + +#PrEpPlugins=pika diff --git a/slurm/database/Dockerfile b/slurm/database/Dockerfile index b627826..470748d 100644 --- a/slurm/database/Dockerfile +++ b/slurm/database/Dockerfile @@ -1,5 +1,5 @@ -FROM clustercockpit/slurm.base:22.05.6 -MAINTAINER Jan Eitzinger +FROM clustercockpit/slurm.base:24.05.3 +LABEL org.opencontainers.image.authors="jan.eitzinger@fau.de" # clean up RUN rm -f /root/rpmbuild/RPMS/slurm-*.rpm \ @@ -7,4 +7,5 @@ RUN rm -f /root/rpmbuild/RPMS/slurm-*.rpm \ && rm -rf /var/cache/yum COPY docker-entrypoint.sh /docker-entrypoint.sh +CMD ["/usr/sbin/init"] ENTRYPOINT ["/docker-entrypoint.sh"] diff --git a/slurm/database/docker-entrypoint.sh b/slurm/database/docker-entrypoint.sh index 97aff4e..62b967d 100755 --- a/slurm/database/docker-entrypoint.sh +++ b/slurm/database/docker-entrypoint.sh @@ -1,6 +1,10 @@ #!/usr/bin/env bash set -e +# Determine the system architecture dynamically +ARCH=$(uname -m) +SLURM_VERSION="24.05.3" +SLURM_JWT=daemon SLURM_ACCT_DB_SQL=/slurm_acct_db.sql # start sshd server @@ -48,12 +52,16 @@ _wait_for_worker() { # run slurmdbd _slurmdbd() { - cd /root/rpmbuild/RPMS/aarch64 - yum -y --nogpgcheck localinstall slurm-22.05.6-1.el8.aarch64.rpm \ - slurm-perlapi-22.05.6-1.el8.aarch64.rpm \ - slurm-slurmdbd-22.05.6-1.el8.aarch64.rpm + cd /root/rpmbuild/RPMS/$ARCH + yum -y --nogpgcheck localinstall slurm-$SLURM_VERSION*.$ARCH.rpm \ + slurm-perlapi-$SLURM_VERSION*.$ARCH.rpm \ + slurm-slurmdbd-$SLURM_VERSION*.$ARCH.rpm mkdir -p /var/spool/slurm/d /var/log/slurm /etc/slurm - chown slurm: /var/spool/slurm/d /var/log/slurm + chown -R slurm: /var/spool/slurm/d /var/log/slurm + + mkdir -p /etc/config + chown -R slurm: /etc/config + if [[ ! -f /home/config/slurmdbd.conf ]]; then echo "### Missing slurmdbd.conf ###" exit @@ -62,10 +70,31 @@ _slurmdbd() { cp /home/config/slurmdbd.conf /etc/slurm/slurmdbd.conf chown slurm: /etc/slurm/slurmdbd.conf chmod 600 /etc/slurm/slurmdbd.conf + cp /etc/slurm/slurmdbd.conf /.secret/slurmdbd.conf fi + + echo "checking for jwt.key" + while [ ! -f /.secret/jwt_hs256.key ]; do + echo "." + sleep 1 + done + + mkdir -p /var/spool/slurm/statesave + chown slurm:slurm /var/spool/slurm/statesave + chmod 0755 /var/spool/slurm/statesave + cp /.secret/jwt_hs256.key /var/spool/slurm/statesave/jwt_hs256.key + chown slurm: /var/spool/slurm/statesave/jwt_hs256.key + chmod 0600 /var/spool/slurm/statesave/jwt_hs256.key + + echo "" + + sudo yum install -y nc + sudo yum install -y procps + sudo yum install -y iputils + echo "Starting slurmdbd" - cp /etc/slurm/slurmdbd.conf /.secret/slurmdbd.conf - /usr/sbin/slurmdbd + /usr/sbin/slurmdbd -Dvv + echo "Started slurmdbd" } ### main ### diff --git a/data/slurm/home/config/slurmdbd.conf b/slurm/database/slurmdbd.conf similarity index 57% rename from data/slurm/home/config/slurmdbd.conf rename to slurm/database/slurmdbd.conf index 6ee97ca..884a988 100644 --- a/data/slurm/home/config/slurmdbd.conf +++ b/slurm/database/slurmdbd.conf @@ -1,3 +1,8 @@ +# +# Example slurmdbd.conf file. +# +# See the slurmdbd.conf man page for more information. 
+# # Archive info #ArchiveJobs=yes #ArchiveDir="/tmp" @@ -8,16 +13,19 @@ # # Authentication info AuthType=auth/munge -AuthInfo=/var/run/munge/munge.socket.2 -# +#AuthInfo=/var/run/munge/munge.socket.2 +AuthAltTypes=auth/jwt +AuthAltParameters=jwt_key=/var/spool/slurm/statesave/jwt_hs256.key # slurmDBD info -DbdAddr=slurmdb -DbdHost=slurmdb +DbdAddr=slurmdbd +DbdHost=slurmdbd DbdPort=6819 SlurmUser=slurm +#MessageTimeout=300 DebugLevel=4 +#DefaultQOS=normal,standby LogFile=/var/log/slurm/slurmdbd.log -PidFile=/var/run/slurmdbd.pid +# PidFile=/var/run/slurmdbd/slurmdbd.pid #PluginDir=/usr/lib/slurm #PrivateData=accounts,users,usage,jobs #TrackWCKey=yes @@ -25,7 +33,6 @@ PidFile=/var/run/slurmdbd.pid # Database info StorageType=accounting_storage/mysql StorageHost=mariadb -StoragePort=3306 -StoragePass=demo StorageUser=slurm +StoragePass=demo StorageLoc=slurm_acct_db diff --git a/slurm/rest/Dockerfile b/slurm/rest/Dockerfile index b627826..470748d 100644 --- a/slurm/rest/Dockerfile +++ b/slurm/rest/Dockerfile @@ -1,5 +1,5 @@ -FROM clustercockpit/slurm.base:22.05.6 -MAINTAINER Jan Eitzinger +FROM clustercockpit/slurm.base:24.05.3 +LABEL org.opencontainers.image.authors="jan.eitzinger@fau.de" # clean up RUN rm -f /root/rpmbuild/RPMS/slurm-*.rpm \ @@ -7,4 +7,5 @@ RUN rm -f /root/rpmbuild/RPMS/slurm-*.rpm \ && rm -rf /var/cache/yum COPY docker-entrypoint.sh /docker-entrypoint.sh +CMD ["/usr/sbin/init"] ENTRYPOINT ["/docker-entrypoint.sh"] diff --git a/slurm/rest/docker-entrypoint.sh b/slurm/rest/docker-entrypoint.sh index 6ef6bcb..146ceff 100755 --- a/slurm/rest/docker-entrypoint.sh +++ b/slurm/rest/docker-entrypoint.sh @@ -1,108 +1,142 @@ #!/usr/bin/env bash set -e +# Determine the system architecture dynamically +ARCH=$(uname -m) +SLURM_VERSION="24.05.3" +# SLURMRESTD="/tmp/slurmrestd.socket" +SLURM_JWT=daemon + # start sshd server _sshd_host() { - if [ ! -d /var/run/sshd ]; then - mkdir /var/run/sshd - ssh-keygen -t rsa -f /etc/ssh/ssh_host_rsa_key -N '' - fi - /usr/sbin/sshd -} - -# setup worker ssh to be passwordless -_ssh_worker() { - if [[ ! -d /home/worker ]]; then - mkdir -p /home/worker - chown -R worker:worker /home/worker + if [ ! 
-d /var/run/sshd ]; then + mkdir /var/run/sshd + ssh-keygen -t rsa -f /etc/ssh/ssh_host_rsa_key -N '' fi - cat > /home/worker/setup-worker-ssh.sh < ~/.ssh/authorized_keys -chmod 0640 ~/.ssh/authorized_keys -cat >> ~/.ssh/config < /etc/munge/munge.key" - chown munge: /etc/munge/munge.key - chmod 400 /etc/munge/munge.key sudo -u munge /sbin/munged munge -n munge -n | unmunge remunge } -# copy secrets to /.secret directory for other nodes -_copy_secrets() { - cp /home/worker/worker-secret.tar.gz /.secret/worker-secret.tar.gz - cp thome/worker/setup-worker-ssh.sh /.secret/setup-worker-ssh.sh - cp /etc/munge/munge.key /.secret/munge.key - rm -f /home/worker/worker-secret.tar.gz - rm -f /home/worker/setup-worker-ssh.sh +_enable_slurmrestd() { + + cat >/usr/lib/systemd/system/slurmrestd.service < +FROM clustercockpit/slurm.base:24.05.3 +LABEL org.opencontainers.image.authors="jan.eitzinger@fau.de" # clean up RUN rm -f /root/rpmbuild/RPMS/slurm-*.rpm \ @@ -8,4 +8,5 @@ RUN rm -f /root/rpmbuild/RPMS/slurm-*.rpm \ WORKDIR /home/worker COPY docker-entrypoint.sh /docker-entrypoint.sh +CMD ["/usr/sbin/init"] ENTRYPOINT ["/docker-entrypoint.sh"] diff --git a/data/slurm/home/config/cgroup.conf b/slurm/worker/cgroup.conf similarity index 79% rename from data/slurm/home/config/cgroup.conf rename to slurm/worker/cgroup.conf index 728b80b..1f930c7 100644 --- a/data/slurm/home/config/cgroup.conf +++ b/slurm/worker/cgroup.conf @@ -1,3 +1,4 @@ +CgroupPlugin=disabled ConstrainCores=yes ConstrainDevices=no ConstrainRAMSpace=yes diff --git a/slurm/worker/docker-entrypoint.sh b/slurm/worker/docker-entrypoint.sh index 12ecf3e..e254c0a 100755 --- a/slurm/worker/docker-entrypoint.sh +++ b/slurm/worker/docker-entrypoint.sh @@ -1,6 +1,10 @@ #!/usr/bin/env bash set -e +# Determine the system architecture dynamically +ARCH=$(uname -m) +SLURM_VERSION="24.05.3" + # start sshd server _sshd_host() { if [ ! -d /var/run/sshd ]; then @@ -12,6 +16,10 @@ _sshd_host() { # start munge using existing key _munge_start_using_key() { + sudo yum install -y nc + sudo yum install -y procps + sudo yum install -y iputils + echo -n "cheking for munge.key" while [ ! -f /.secret/munge.key ]; do echo -n "." @@ -32,49 +40,67 @@ _munge_start_using_key() { # wait for worker user in shared /home volume _wait_for_worker() { + echo "checking for id_rsa.pub" if [ ! -f /home/worker/.ssh/id_rsa.pub ]; then - echo -n "checking for id_rsa.pub" + echo "checking for id_rsa.pub" while [ ! -f /home/worker/.ssh/id_rsa.pub ]; do echo -n "." sleep 1 done echo "" fi + echo "done checking for id_rsa.pub" + } _start_dbus() { - dbus-uuidgen > /var/lib/dbus/machine-id - mkdir -p /var/run/dbus - dbus-daemon --config-file=/usr/share/dbus-1/system.conf --print-address + dbus-uuidgen >/var/lib/dbus/machine-id + mkdir -p /var/run/dbus + dbus-daemon --config-file=/usr/share/dbus-1/system.conf --print-address } # run slurmd _slurmd() { - cd /root/rpmbuild/RPMS/aarch64 - yum -y --nogpgcheck localinstall slurm-22.05.6-1.el8.aarch64.rpm \ - slurm-perlapi-22.05.6-1.el8.aarch64.rpm \ - slurm-slurmd-22.05.6-1.el8.aarch64.rpm \ - slurm-torque-22.05.6-1.el8.aarch64.rpm - if [ ! -f /.secret/slurm.conf ]; then - echo -n "checking for slurm.conf" - while [ ! -f /.secret/slurm.conf ]; do - echo -n "." 
- sleep 1 - done - echo "" - fi - mkdir -p /var/spool/slurm/d /etc/slurm - chown slurm: /var/spool/slurm/d - cp /home/config/cgroup.conf /etc/slurm/cgroup.conf - chown slurm: /etc/slurm/cgroup.conf - chmod 600 /etc/slurm/cgroup.conf - cp /home/config/slurm.conf /etc/slurm/slurm.conf - chown slurm: /etc/slurm/slurm.conf - chmod 600 /etc/slurm/slurm.conf - touch /var/log/slurmd.log - chown slurm: /var/log/slurmd.log - echo -n "Starting slurmd" - /usr/sbin/slurmd + cd /root/rpmbuild/RPMS/$ARCH + yum -y --nogpgcheck localinstall slurm-$SLURM_VERSION*.$ARCH.rpm \ + slurm-perlapi-$SLURM_VERSION*.$ARCH.rpm \ + slurm-slurmd-$SLURM_VERSION*.$ARCH.rpm \ + slurm-torque-$SLURM_VERSION*.$ARCH.rpm + + echo "checking for slurm.conf" + if [ ! -f /.secret/slurm.conf ]; then + echo "checking for slurm.conf" + while [ ! -f /.secret/slurm.conf ]; do + echo -n "." + sleep 1 + done + echo "" + fi + echo "found slurm.conf" + + # sudo yum install -y nc + # sudo yum install -y procps + # sudo yum install -y iputils + + mkdir -p /var/spool/slurm/d /etc/slurm /var/run/slurm/d /var/log/slurm + chown slurm: /var/spool/slurm/d /var/run/slurm/d /var/log/slurm + cp /home/config/cgroup.conf /etc/slurm/cgroup.conf + chown slurm: /etc/slurm/cgroup.conf + chmod 600 /etc/slurm/cgroup.conf + cp /home/config/slurm.conf /etc/slurm/slurm.conf + chown slurm: /etc/slurm/slurm.conf + chmod 600 /etc/slurm/slurm.conf + touch /var/log/slurm/slurmd.log + chown slurm: /var/log/slurm/slurmd.log + + touch /var/run/slurm/d/slurmd.pid + chmod 600 /var/run/slurm/d/slurmd.pid + chown slurm: /var/run/slurm/d/slurmd.pid + + echo "Starting slurmd" + /usr/sbin/slurmstepd infinity & + /usr/sbin/slurmd -Dvv + echo "Started slurmd" } ### main ###
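With the worker entrypoint in place, the whole Slurm side can be sanity-checked from the host. A hedged sketch (container names as defined in docker-compose.yml; the commands are standard Slurm/MUNGE tools already installed in the images):

```bash
# The worker should register with the controller once slurmd is up.
docker exec slurmctld sinfo

# The controller should answer a ping from its own CLI.
docker exec slurmctld scontrol ping

# The shared munge key should round-trip on the worker.
docker exec node01 sh -c 'munge -n | unmunge'
```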