mirror of
https://github.com/ClusterCockpit/cc-backend
synced 2025-01-26 03:19:06 +01:00
read .env automatically, support systemd
parent
ff24d946fd
commit
f185d12078
56
README.md
@ -28,15 +28,24 @@ ln -s <your-existing-job-archive> ./var/job-archive
|
||||
# Create empty job.db (Will be initialized as SQLite3 database)
|
||||
touch ./var/job.db
|
||||
|
||||
# EDIT THE .env FILE BEFORE YOU DEPLOY (Change the secrets)!
|
||||
# If authentication is disabled, it can be empty.
|
||||
source .env
|
||||
|
||||
# This will first initialize the job.db database by traversing all
|
||||
# `meta.json` files in the job-archive. After that, a HTTP server on
|
||||
# the port 8080 will be running. The `--init-db` is only needed the first time.
|
||||
./cc-jobarchive --init-db --add-user <your-username>:admin:<your-password>
|
||||
# `meta.json` files in the job-archive and add a new user. `--no-server` will cause the
|
||||
# executable to stop once it has done that instead of starting a server.
|
||||
./cc-jobarchive --init-db --add-user <your-username>:admin:<your-password> --no-server
|
||||
|
||||
# Start a HTTP server (HTTPS can be enabled, the default port is 8080):
|
||||
./cc-jobarchive
|
||||
|
||||
# Show other options:
|
||||
./cc-jobarchive --help
|
||||
```
|
||||
|
||||
To run this program as a daemon, see [utils/systemd/README.md](./utils/systemd/README.md), which provides a systemd unit file and further explanation.
|
||||
|
||||
### Configuration
|
||||
|
||||
A config file in the JSON format can be provided using `--config` to override the defaults. Look at the beginning of `server.go` for the defaults and consequently the format of the configuration file.
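
A minimal sketch of how a JSON file can override built-in defaults: decoding into a struct that already holds the defaults leaves every field that is missing from the file untouched. `Config` and `loadConfig` are hypothetical stand-ins, and the sketch assumes the real code decodes into its pre-filled `programConfig` in the same way. Only the four JSON field names and the `:8080` default are taken from `ProgramConfig` in this commit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config is a hypothetical, reduced version of ProgramConfig.
// The JSON tags match the fields visible in this commit.
type Config struct {
	Addr                  string `json:"addr"`
	User                  string `json:"user"`
	Group                 string `json:"group"`
	DisableAuthentication bool   `json:"disable-authentication"`
}

// loadConfig starts from the defaults and lets the JSON document
// overwrite only the fields it actually mentions.
func loadConfig(data []byte) (Config, error) {
	cfg := Config{Addr: ":8080"} // default address, as in this commit
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Config{}, err
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig([]byte(`{"user": "clustercockpit", "group": "clustercockpit"}`))
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Printf("addr=%q user=%q group=%q\n", cfg.Addr, cfg.User, cfg.Group)
}
```

Because `addr` is absent from the example document, the default address survives the decode.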
|
||||
@ -45,9 +54,42 @@ A config file in the JSON format can be provided using `--config` to override th
This project uses [gqlgen](https://github.com/99designs/gqlgen) for the GraphQL API. The schema can be found in `./graph/schema.graphqls`. After changing it, you need to run `go run github.com/99designs/gqlgen` which will update `graph/model`. In case new resolvers are needed, they will be inserted into `graph/schema.resolvers.go`, where you will need to implement them.

### Project Structure

- `api/` contains the REST API. The routes defined there should be called whenever a job starts/stops. The API is documented in the OpenAPI 3.0 format in [./api/openapi.yaml](./api/openapi.yaml).
- `auth/` is where the (optional) authentication middleware can be found, which adds the currently authenticated user to the request context. The `user` table is created and managed here as well.
- `auth/ldap.go` contains everything to do with automatically syncing and authenticating users from an LDAP server.
- `config` handles the `cluster.json` files and the user-specific configurations (changeable via GraphQL) for the Web-UI such as the selected metrics etc.
- `frontend` is a submodule; this is where the Svelte-based frontend resides.
- `graph/generated` should *not* be touched.
- `graph/model` contains all types defined in the GraphQL schema not manually defined in `schema/`. Manually defined types have to be listed in `gqlgen.yml`.
- `graph/schema.graphqls` contains the GraphQL schema. Whenever you change it, you should call `go run github.com/99designs/gqlgen`.
- `graph/` contains the resolvers and handlers for the GraphQL API. Function signatures in `graph/schema.resolvers.go` are automatically generated.
- `metricdata/` handles getting and archiving the metrics associated with a job.
- `metricdata/metricdata.go` defines the interface `MetricDataRepository` and provides functions to the GraphQL and REST API for accessing a job's metrics. These functions automatically select the source for the metrics (the archive or one of the metric data repositories).
- `metricdata/archive.go` provides functions for fetching metrics from the job-archive and archiving a job to the job-archive.
- `metricdata/cc-metric-store.go` contains an implementation of the `MetricDataRepository` interface which can fetch data from a [cc-metric-store](https://github.com/ClusterCockpit/cc-metric-store).
- `metricdata/influxdb-v2` contains an implementation of the `MetricDataRepository` interface which can fetch data from an InfluxDBv2 database. It is currently disabled and out of date, and cannot be used at the time of writing.
- `schema/` contains type definitions used all over this project, extracted into this package because Go disallows cyclic dependencies between packages.
- `schema/float.go` contains a custom `float64` type which overrides JSON and GraphQL marshaling/unmarshaling. This is needed because a regular optional `Float` in GraphQL will map to `*float64` types in Go. Wrapping every single metric value in an allocation would be a lot of overhead.
- `schema/job.go` provides the types representing a job and its resources. Those can be used as type for a `meta.json` file and/or a row in the `job` table.
- `templates/` is mostly full of HTML templates and a small helper Go module.
- `utils/systemd` describes how to deploy/install this as a systemd service.
- `utils/` is mostly outdated. Look at the [cc-util repo](https://github.com/ClusterCockpit/cc-util) for more up-to-date scripts.
- `.env` *must* be changed before you deploy this. It contains a Base64 encoded [Ed25519](https://en.wikipedia.org/wiki/EdDSA) key-pair, the secret used for sessions and the password to the LDAP server if LDAP authentication is enabled.
- `gqlgen.yml` configures the behaviour and generation of [gqlgen](https://github.com/99designs/gqlgen).
- `init-db.go` initializes the `job` (and `tag` and `jobtag`) table if the `--init-db` flag is provided. Not only is the table created in the correct schema, but the job-archive is traversed as well.
- `server.go` contains the main function and starts the actual http server.
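
The idea behind `schema/float.go` can be sketched as follows. This is not the file's actual contents: the `Float` type below is a hypothetical stand-in, and using NaN as the in-memory marker for a missing value is an assumption made for illustration. The point is that a plain value type can still round-trip JSON `null` without a pointer (and therefore without one allocation per metric value).

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
	"strconv"
)

// Float is a float64 that encodes NaN as JSON null (assumed convention).
type Float float64

// IsNaN reports whether the value represents "missing".
func (f Float) IsNaN() bool { return math.IsNaN(float64(f)) }

// MarshalJSON writes null for NaN and a plain decimal number otherwise.
func (f Float) MarshalJSON() ([]byte, error) {
	if f.IsNaN() {
		return []byte("null"), nil
	}
	return strconv.AppendFloat(nil, float64(f), 'f', -1, 64), nil
}

// UnmarshalJSON maps null back to NaN instead of requiring a *float64.
func (f *Float) UnmarshalJSON(data []byte) error {
	if string(data) == "null" {
		*f = Float(math.NaN())
		return nil
	}
	v, err := strconv.ParseFloat(string(data), 64)
	if err != nil {
		return err
	}
	*f = Float(v)
	return nil
}

func main() {
	out, err := json.Marshal([]Float{1.5, Float(math.NaN()), 3})
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))

	var back []Float
	if err := json.Unmarshal(out, &back); err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}
	fmt.Println(len(back), back[1].IsNaN())
}
```

A slice of such values stays a flat `[]float64`-like block in memory, which is the overhead argument made in the `schema/float.go` entry above.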
|
||||
|
||||
### TODO
|
||||
|
||||
- [ ] Documentation
|
||||
- [ ] Write more TODOs
|
||||
- [ ] Caching
|
||||
- [ ] Generate JWTs based on the provided keys
|
||||
- [ ] fix frontend
|
||||
- [ ] write (unit) tests
|
||||
- [ ] make tokens and sessions (currently based on cookies) expire after some configurable time
|
||||
- [ ] when authenticating using a JWT, check if that user still exists
|
||||
- [ ] allow mysql as database and passing the database uri as environment variable
|
||||
- [ ] fix InfluxDB MetricDataRepository (new or old line-protocol format? Support node-level metrics only?)
|
||||
- [ ] support all metric scopes
|
||||
- [ ] documentation, comments in the code base
|
||||
- [ ] write more TODOs
|
||||
- [ ] caching
|
||||
|
13
api/rest.go
@ -9,6 +9,7 @@ import (
|
||||
"net/http"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"sync"
|
||||
|
||||
"github.com/ClusterCockpit/cc-jobarchive/config"
|
||||
"github.com/ClusterCockpit/cc-jobarchive/graph"
|
||||
@ -20,10 +21,11 @@ import (
|
||||
)
|
||||
|
||||
type RestApi struct {
|
||||
DB *sqlx.DB
|
||||
Resolver *graph.Resolver
|
||||
AsyncArchiving bool
|
||||
MachineStateDir string
|
||||
DB *sqlx.DB
|
||||
Resolver *graph.Resolver
|
||||
AsyncArchiving bool
|
||||
MachineStateDir string
|
||||
OngoingArchivings sync.WaitGroup
|
||||
}
|
||||
|
||||
func (api *RestApi) MountRoutes(r *mux.Router) {
|
||||
@ -233,6 +235,9 @@ func (api *RestApi) stopJob(rw http.ResponseWriter, r *http.Request) {
|
||||
}
|
||||
|
||||
doArchiving := func(job *schema.Job, ctx context.Context) error {
|
||||
api.OngoingArchivings.Add(1)
|
||||
defer api.OngoingArchivings.Done()
|
||||
|
||||
job.Duration = int32(req.StopTime - job.StartTime.Unix())
|
||||
jobMeta, err := metricdata.ArchiveJob(job, ctx)
|
||||
if err != nil {
|
||||
|
201
server.go
@ -1,14 +1,26 @@
|
||||
package main
|
||||
|
||||
import (
|
||||
"bufio"
|
||||
"context"
|
||||
"crypto/tls"
|
||||
"encoding/json"
|
||||
"errors"
|
||||
"flag"
|
||||
"fmt"
|
||||
"log"
|
||||
"net"
|
||||
"net/http"
|
||||
"net/url"
|
||||
"os"
|
||||
"os/exec"
|
||||
"os/signal"
|
||||
"os/user"
|
||||
"strconv"
|
||||
"strings"
|
||||
"sync"
|
||||
"syscall"
|
||||
"time"
|
||||
|
||||
"github.com/99designs/gqlgen/graphql/handler"
|
||||
"github.com/99designs/gqlgen/graphql/playground"
|
||||
@ -33,6 +45,10 @@ type ProgramConfig struct {
|
||||
// Address where the http (or https) server will listen on (for example: 'localhost:80').
|
||||
Addr string `json:"addr"`
|
||||
|
||||
// Drop root permissions once .env was read and the port was taken.
|
||||
User string `json:"user"`
|
||||
Group string `json:"group"`
|
||||
|
||||
// Disable authentication (for everything: API, Web-UI, ...)
|
||||
DisableAuthentication bool `json:"disable-authentication"`
|
||||
|
||||
@ -68,7 +84,7 @@ type ProgramConfig struct {
|
||||
}
|
||||
|
||||
var programConfig ProgramConfig = ProgramConfig{
|
||||
Addr: "0.0.0.0:8080",
|
||||
Addr: ":8080",
|
||||
DisableAuthentication: false,
|
||||
StaticFiles: "./frontend/public",
|
||||
DB: "./var/job.db",
|
||||
@ -116,6 +132,10 @@ func main() {
|
||||
flag.StringVar(&flagGenJWT, "jwt", "", "Generate and print a JWT for the user specified by the username")
|
||||
flag.Parse()
|
||||
|
||||
if err := loadEnv("./.env"); err != nil && !os.IsNotExist(err) {
|
||||
log.Fatalf("parsing './.env' file failed: %s", err.Error())
|
||||
}
|
||||
|
||||
if flagConfigFile != "" {
|
||||
data, err := os.ReadFile(flagConfigFile)
|
||||
if err != nil {
|
||||
@ -280,15 +300,67 @@ func main() {
|
||||
handlers.AllowedMethods([]string{"GET", "POST", "HEAD", "OPTIONS"}),
|
||||
handlers.AllowedOrigins([]string{"*"}))(handlers.LoggingHandler(os.Stdout, handlers.CompressHandler(r)))
|
||||
|
||||
// Start http or https server
|
||||
if programConfig.HttpsCertFile != "" && programConfig.HttpsKeyFile != "" {
|
||||
log.Printf("HTTPS server running at %s...", programConfig.Addr)
|
||||
err = http.ListenAndServeTLS(programConfig.Addr, programConfig.HttpsCertFile, programConfig.HttpsKeyFile, handler)
|
||||
} else {
|
||||
log.Printf("HTTP server running at %s...", programConfig.Addr)
|
||||
err = http.ListenAndServe(programConfig.Addr, handler)
|
||||
var wg sync.WaitGroup
|
||||
server := http.Server{
|
||||
ReadTimeout: 10 * time.Second,
|
||||
WriteTimeout: 10 * time.Second,
|
||||
Handler: handler,
|
||||
Addr: programConfig.Addr,
|
||||
}
|
||||
log.Fatal(err)
|
||||
|
||||
// Start http or https server
|
||||
|
||||
listener, err := net.Listen("tcp", programConfig.Addr)
|
||||
if err != nil {
|
||||
log.Fatal(err)
|
||||
}
|
||||
|
||||
if programConfig.HttpsCertFile != "" && programConfig.HttpsKeyFile != "" {
|
||||
cert, err := tls.LoadX509KeyPair(programConfig.HttpsCertFile, programConfig.HttpsKeyFile)
|
||||
if err != nil {
|
||||
log.Fatal(err)
|
||||
}
|
||||
listener = tls.NewListener(listener, &tls.Config{
|
||||
Certificates: []tls.Certificate{cert},
|
||||
})
|
||||
log.Printf("HTTPS server listening at %s...", programConfig.Addr)
|
||||
} else {
|
||||
log.Printf("HTTP server listening at %s...", programConfig.Addr)
|
||||
}
|
||||
|
||||
// Because this program will want to bind to a privileged port (like 80), the listener must
|
||||
// be established first, then the user can be changed, and after that,
|
||||
// the actual http server can be started.
|
||||
if err := dropPrivileges(); err != nil {
|
||||
log.Fatalf("error while changing user: %s", err.Error())
|
||||
}
|
||||
|
||||
wg.Add(1)
|
||||
go func() {
|
||||
defer wg.Done()
|
||||
if err := server.Serve(listener); err != nil && err != http.ErrServerClosed {
|
||||
log.Fatal(err)
|
||||
}
|
||||
}()
|
||||
|
||||
wg.Add(1)
|
||||
sigs := make(chan os.Signal, 1)
|
||||
signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)
|
||||
go func() {
|
||||
defer wg.Done()
|
||||
<-sigs
|
||||
systemdNotifiy(false, "shutting down")
|
||||
|
||||
// First shut down the server gracefully (waiting for all ongoing requests)
|
||||
server.Shutdown(context.Background())
|
||||
|
||||
// Then, wait for any async archivings still pending...
|
||||
api.OngoingArchivings.Wait()
|
||||
}()
|
||||
|
||||
systemdNotifiy(true, "running")
|
||||
wg.Wait()
|
||||
log.Print("Graceful shutdown completed!")
|
||||
}
|
||||
|
||||
func monitoringRoutes(router *mux.Router, resolver *graph.Resolver) {
|
||||
@ -448,3 +520,114 @@ func monitoringRoutes(router *mux.Router, resolver *graph.Resolver) {
|
||||
})
|
||||
})
|
||||
}
|
||||
|
||||
func loadEnv(file string) error {
|
||||
f, err := os.Open(file)
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
|
||||
defer f.Close()
|
||||
s := bufio.NewScanner(bufio.NewReader(f))
|
||||
for s.Scan() {
|
||||
line := s.Text()
|
||||
if strings.HasPrefix(line, "#") || len(line) == 0 {
|
||||
continue
|
||||
}
|
||||
|
||||
if strings.Contains(line, "#") {
|
||||
return errors.New("'#' are only supported at the start of a line")
|
||||
}
|
||||
|
||||
line = strings.TrimPrefix(line, "export ")
|
||||
parts := strings.SplitN(line, "=", 2)
|
||||
if len(parts) != 2 {
|
||||
return fmt.Errorf("unsupported line: %#v", line)
|
||||
}
|
||||
|
||||
key := strings.TrimSpace(parts[0])
|
||||
val := strings.TrimSpace(parts[1])
|
||||
		if strings.HasPrefix(val, "\"") {
			// A value consisting of a single `"` would make the slice below panic.
			if len(val) < 2 || !strings.HasSuffix(val, "\"") {
				return fmt.Errorf("unsupported line: %#v", line)
			}

			runes := []rune(val[1 : len(val)-1])
			sb := strings.Builder{}
			for i := 0; i < len(runes); i++ {
				if runes[i] == '\\' {
					i++
					// Guard against a backslash at the very end of the quoted string.
					if i >= len(runes) {
						return fmt.Errorf("unsupported line: %#v", line)
					}
					switch runes[i] {
					case 'n':
						sb.WriteRune('\n')
					case 'r':
						sb.WriteRune('\r')
					case 't':
						sb.WriteRune('\t')
					case '"':
						sb.WriteRune('"')
					default:
						return fmt.Errorf("unsupported escape sequence in quoted string: backslash %q", runes[i])
					}
|
||||
continue
|
||||
}
|
||||
sb.WriteRune(runes[i])
|
||||
}
|
||||
|
||||
val = sb.String()
|
||||
}
|
||||
|
||||
os.Setenv(key, val)
|
||||
}
|
||||
|
||||
return s.Err()
|
||||
}
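
To make the accepted `.env` syntax easier to see, here is a small sketch that applies the same per-line rules as `loadEnv` to a few sample lines. `parseEnvLine` and `unescape` are hypothetical helpers written for this illustration and are not part of the commit; the variable names in the sample lines are made up. The sketch also rejects a trailing backslash and a lone quote, which is an assumption about the intended behaviour.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// unescape resolves the escapes \n, \r, \t and \" inside a quoted value.
func unescape(s string) (string, error) {
	var sb strings.Builder
	runes := []rune(s)
	for i := 0; i < len(runes); i++ {
		if runes[i] != '\\' {
			sb.WriteRune(runes[i])
			continue
		}
		i++
		if i >= len(runes) {
			return "", errors.New("trailing backslash in quoted string")
		}
		switch runes[i] {
		case 'n':
			sb.WriteRune('\n')
		case 'r':
			sb.WriteRune('\r')
		case 't':
			sb.WriteRune('\t')
		case '"':
			sb.WriteRune('"')
		default:
			return "", fmt.Errorf("unsupported escape sequence: backslash %q", runes[i])
		}
	}
	return sb.String(), nil
}

// parseEnvLine handles one line; skip is true for comments and empty lines.
func parseEnvLine(line string) (key, val string, skip bool, err error) {
	if line == "" || strings.HasPrefix(line, "#") {
		return "", "", true, nil
	}
	if strings.Contains(line, "#") {
		return "", "", false, errors.New("'#' is only supported at the start of a line")
	}
	k, v, found := strings.Cut(strings.TrimPrefix(line, "export "), "=")
	if !found {
		return "", "", false, fmt.Errorf("unsupported line: %q", line)
	}
	key, val = strings.TrimSpace(k), strings.TrimSpace(v)
	if strings.HasPrefix(val, `"`) {
		if len(val) < 2 || !strings.HasSuffix(val, `"`) {
			return "", "", false, fmt.Errorf("unsupported line: %q", line)
		}
		if val, err = unescape(val[1 : len(val)-1]); err != nil {
			return "", "", false, err
		}
	}
	return key, val, false, nil
}

func main() {
	for _, line := range []string{
		`# a comment`,
		`export EXAMPLE_KEY="line one\nline two"`,
		`EXAMPLE_PLAIN = value`,
		`EXAMPLE_BAD=value # trailing comment`,
	} {
		key, val, skip, err := parseEnvLine(line)
		fmt.Printf("%q: skip=%t key=%q val=%q err=%v\n", line, skip, key, val, err)
	}
}
```

Note that, as in `loadEnv`, a `#` anywhere after the first column is an error, so values containing `#` cannot be expressed.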
|
||||
|
||||
func dropPrivileges() error {
	if programConfig.Group != "" {
		g, err := user.LookupGroup(programConfig.Group)
		if err != nil {
			return err
		}

		// Do not ignore a parse error: gid would silently default to 0 (root).
		gid, err := strconv.Atoi(g.Gid)
		if err != nil {
			return err
		}
		if err := syscall.Setgid(gid); err != nil {
			return err
		}
	}

	if programConfig.User != "" {
		u, err := user.Lookup(programConfig.User)
		if err != nil {
			return err
		}

		// Same as above: a failed conversion must not end up as uid 0.
		uid, err := strconv.Atoi(u.Uid)
		if err != nil {
			return err
		}
		if err := syscall.Setuid(uid); err != nil {
			return err
		}
	}

	return nil
}
|
||||
|
||||
// If started via systemd, inform systemd that we are running:
|
||||
// https://www.freedesktop.org/software/systemd/man/sd_notify.html
|
||||
func systemdNotifiy(ready bool, status string) {
|
||||
if os.Getenv("NOTIFY_SOCKET") == "" {
|
||||
// Not started using systemd
|
||||
return
|
||||
}
|
||||
|
||||
args := []string{fmt.Sprintf("--pid=%d", os.Getpid())}
|
||||
if ready {
|
||||
args = append(args, "--ready")
|
||||
}
|
||||
|
||||
if status != "" {
|
||||
args = append(args, fmt.Sprintf("--status=%s", status))
|
||||
}
|
||||
|
||||
cmd := exec.Command("systemd-notify", args...)
|
||||
cmd.Run() // errors ignored on purpose, there is not much to do anyways.
|
||||
}
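
As an alternative to spawning `systemd-notify`, the notification can be sent from the process itself. The sketch below is a different technique from the one in this commit: it writes the `sd_notify` datagram directly to the Unix socket named by `NOTIFY_SOCKET`. The function names `notifyState` and `sdNotify` are hypothetical; the `READY=1` and `STATUS=` fields are the ones defined by the `sd_notify` protocol linked above.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"strings"
)

// notifyState builds the newline-separated payload for sd_notify.
func notifyState(ready bool, status string) string {
	var fields []string
	if ready {
		fields = append(fields, "READY=1")
	}
	if status != "" {
		fields = append(fields, "STATUS="+status)
	}
	return strings.Join(fields, "\n")
}

// sdNotify sends state to systemd. It returns false without an error
// when the process was not started by systemd (NOTIFY_SOCKET unset).
func sdNotify(state string) (bool, error) {
	socket := os.Getenv("NOTIFY_SOCKET")
	if socket == "" || state == "" {
		return false, nil
	}

	// On Linux, Go treats a leading '@' in the name as an abstract socket.
	conn, err := net.DialUnix("unixgram", nil, &net.UnixAddr{Name: socket, Net: "unixgram"})
	if err != nil {
		return false, err
	}
	defer conn.Close()

	if _, err := conn.Write([]byte(state)); err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	sent, err := sdNotify(notifyState(true, "running"))
	fmt.Println("sent:", sent, "err:", err)
}
```

Since the datagram then comes from the main process rather than from a child, a unit using this approach could likely use the stricter `NotifyAccess=main`, though that has not been tested against this service file.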
|
||||
|
30
utils/systemd/README.md
Normal file
@ -0,0 +1,30 @@
# How to run this as a systemd daemon

The files in this directory assume that you install the Golang version of ClusterCockpit to `/var/clustercockpit`. If you do not like that, you can choose any other location, but make sure to replace all paths that begin with `/var/clustercockpit` in the `clustercockpit.service` file!

If you have not installed [yarn](https://yarnpkg.com/getting-started/install) and [go](https://go.dev/doc/install) already, do that (Golang is available in most package managers).

The `config.json` can have the optional fields *user* and *group*. If provided, the application will call [setgid](https://man7.org/linux/man-pages/man2/setgid.2.html) and [setuid](https://man7.org/linux/man-pages/man2/setuid.2.html), in that order, after having read the config file and having bound to a TCP port (so that it can take a privileged port), but before it starts accepting any connections. This is good for security, but means that the directories `frontend/public`, `var/` and `templates/` must be readable by that user and `var/` writable as well (all paths are relative to the repository root). The `.env` and `config.json` files might contain secrets and should not be readable by that user. If those files are changed, the server has to be restarted.
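
The ordering described above (bind first, drop privileges, only then serve) can be sketched like this. `listenThenDrop` and the no-op `drop` callback are hypothetical placeholders for illustration; in the real program the callback would be the code that calls setgid and setuid, which requires root and is therefore left out here.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// listenThenDrop binds addr while the process may still be privileged and
// only afterwards runs drop. The listener is closed if drop fails.
func listenThenDrop(addr string, drop func() error) (net.Listener, error) {
	l, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, err
	}
	if err := drop(); err != nil {
		l.Close()
		return nil, err
	}
	return l, nil
}

func main() {
	// Port 0 picks any free port so that the sketch runs without root.
	l, err := listenThenDrop("127.0.0.1:0", func() error { return nil })
	if err != nil {
		fmt.Println("startup failed:", err)
		return
	}
	fmt.Println("bound", l.Addr(), "before dropping privileges")

	srv := &http.Server{
		Handler:      http.NotFoundHandler(),
		ReadTimeout:  10 * time.Second,
		WriteTimeout: 10 * time.Second,
	}
	done := make(chan error, 1)
	go func() { done <- srv.Serve(l) }()

	// Graceful shutdown: wait for in-flight requests, bounded by a timeout.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil {
		fmt.Println("shutdown failed:", err)
	}
	if err := <-done; !errors.Is(err, http.ErrServerClosed) {
		fmt.Println("unexpected serve error:", err)
	}
}
```

Connections that arrive between binding and serving wait in the kernel's accept queue, so nothing is handled until the process has given up its privileges.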

```sh
# 1.: Clone this repository to /var/clustercockpit
git clone git@github.com:ClusterCockpit/cc-specifications.git /var/clustercockpit

# 2.: Install all dependencies and build everything
cd /var/clustercockpit
go get && go build && (cd ./frontend && yarn install && yarn build)

# 3.: Modify the `./config.json` file from the directory which contains this README.md to your liking and put it in the repo root
cp ./utils/systemd/config.json ./config.json
vim ./config.json # do your thing...

# 4.: Add the systemd service unit file
sudo ln -s /var/clustercockpit/utils/systemd/clustercockpit.service /etc/systemd/system/clustercockpit.service

# 5.: Enable and start the server
sudo systemctl enable clustercockpit.service # optional (if enabled, the service starts automatically at boot)
sudo systemctl start clustercockpit.service

# Check what's going on:
sudo journalctl -u clustercockpit.service
```
16
utils/systemd/clustercockpit.service
Normal file
@ -0,0 +1,16 @@
[Unit]
Description=ClusterCockpit Web Server (Go edition)
Documentation=https://github.com/ClusterCockpit/cc-backend
Wants=network-online.target
After=network-online.target

[Service]
WorkingDirectory=/var/clustercockpit
Type=notify
NotifyAccess=all
Restart=on-failure
TimeoutStopSec=100
ExecStart=/var/clustercockpit/cc-jobarchive --config ./config.json

[Install]
WantedBy=multi-user.target
7
utils/systemd/config.json
Normal file
@ -0,0 +1,7 @@
{
    "addr": "0.0.0.0:443",
    "https-cert-file": "/etc/letsencrypt/live/<...>/fullchain.pem",
    "https-key-file": "/etc/letsencrypt/live/<...>/privkey.pem",
    "user": "clustercockpit",
    "group": "clustercockpit"
}