cc-metric-collector/collectors
Thomas Gruber 195d0794b0
Merge develop branch into main (#106)
* Add cpu_used (all-cpu_idle) to CpustatCollector

* Update to line-protocol/v2

* Update runonce.yml with Golang 1.20

* Update fsnotify in LIKWID Collector

* Use not a pointer to line-protocol.Encoder

* Simplify Makefile

* Use only as many arguments as required

* Allow sum function to handle non float types

* Allow values to be a slice of type float64, float32, int, int64, int32, bool

* Use generic function to simplify code

* Add missing case for type []int32

* Use generic function to compute minimum

* Use generic function to compute maximum

* Use generic function to compute average

* Add error value to sumAnyType

* Use generic function to compute median

* For older versions of go slices is not part of the installation

* Remove old entries from go.sum

* Use simpler sort function

* Compute metrics ib_total and ib_total_pkts

* Add aggregated metrics.
Add missing units

* Update likwidMetric.go

Fixes a potential bug when `fsnotify.NewWatcher()` fails with an error

* Completely avoid memory allocations in infinibandMetric read()

* Fixed initialization: Initialization and measurements should run in the same thread

---------

Co-authored-by: Holger Obermaier <40787752+ho-ob@users.noreply.github.com>
2023-08-29 14:12:49 +02:00
beegfsmetaMetric.go Add latest development to main branch (#89) 2022-10-10 12:23:51 +02:00
beegfsmetaMetric.md Beegfs collector (#50) 2022-03-04 14:35:47 +01:00
beegfsstorageMetric.go Add latest development to main branch (#89) 2022-10-10 12:23:51 +02:00
beegfsstorageMetric.md Beegfs collector (#50) 2022-03-04 14:35:47 +01:00
collectorManager.go Merge develop branch into main (#96) 2022-12-14 17:02:39 +01:00
cpufreqCpuinfoMetric.go Merge develop branch into main (#96) 2022-12-14 17:02:39 +01:00
cpufreqCpuinfoMetric.md Rename cpu type to hwthread (#69) 2022-05-13 14:09:45 +02:00
cpufreqMetric.go Merge develop branch into main (#96) 2022-12-14 17:02:39 +01:00
cpufreqMetric.md Merge develop branch into main (#96) 2022-12-14 17:02:39 +01:00
cpustatMetric.go Merge develop branch into main (#106) 2023-08-29 14:12:49 +02:00
cpustatMetric.md Merge develop branch into main (#106) 2023-08-29 14:12:49 +02:00
customCmdMetric.go Use customcmd commands if they did not error. (#101) 2023-02-28 12:02:01 +01:00
customCmdMetric.md Modularize the whole thing (#16) 2022-01-25 15:37:43 +01:00
diskstatMetric.go Add latest development to main branch (#89) 2022-10-10 12:23:51 +02:00
diskstatMetric.md Split diskstat Collector (#38) 2022-02-21 12:44:26 +01:00
gpfsMetric.go Merge develop branch into main (#106) 2023-08-29 14:12:49 +02:00
gpfsMetric.md Merge develop branch into main (#106) 2023-08-29 14:12:49 +02:00
infinibandMetric.go Merge develop branch into main (#106) 2023-08-29 14:12:49 +02:00
infinibandMetric.md Merge develop branch into main (#106) 2023-08-29 14:12:49 +02:00
iostatMetric.go Add latest development to main branch (#89) 2022-10-10 12:23:51 +02:00
iostatMetric.md Split diskstat Collector (#38) 2022-02-21 12:44:26 +01:00
ipmiMetric.go Merge develop branch into main (#96) 2022-12-14 17:02:39 +01:00
ipmiMetric.md Merge develop branch into main (#96) 2022-12-14 17:02:39 +01:00
likwidMetric.go Merge develop branch into main (#106) 2023-08-29 14:12:49 +02:00
likwidMetric.md Fix for LIKWID collector with separate measurement thread and inotify watcher on the LIKWID lock (#97) 2022-12-20 12:59:33 +01:00
loadavgMetric.go Add latest development to main branch (#89) 2022-10-10 12:23:51 +02:00
loadavgMetric.md Modularize the whole thing (#16) 2022-01-25 15:37:43 +01:00
lustreMetric.go Add latest development to main branch (#89) 2022-10-10 12:23:51 +02:00
lustreMetric.md Derived metrics (#65) 2022-03-15 16:09:47 +01:00
Makefile Merge develop branch into main (#106) 2023-08-29 14:12:49 +02:00
memstatMetric.go Add latest development to main branch (#89) 2022-10-10 12:23:51 +02:00
memstatMetric.md Modularize the whole thing (#16) 2022-01-25 15:37:43 +01:00
metricCollector.go Add latest development to main branch (#89) 2022-10-10 12:23:51 +02:00
netstatMetric.go Add latest development to main branch (#89) 2022-10-10 12:23:51 +02:00
netstatMetric.md Derived metrics (#57) 2022-03-11 13:48:18 +01:00
nfs3Metric.md Split NfsCollector in Nfs3Collector and Nfs4Collector (#28) 2022-02-07 15:43:01 +01:00
nfs4Metric.md Split NfsCollector in Nfs3Collector and Nfs4Collector (#28) 2022-02-07 15:43:01 +01:00
nfsiostatMetric.go Merge develop branch into main (#96) 2022-12-14 17:02:39 +01:00
nfsiostatMetric.md Merge develop branch into main (#96) 2022-12-14 17:02:39 +01:00
nfsMetric.go Add latest development to main branch (#89) 2022-10-10 12:23:51 +02:00
numastatsMetric.go Merge develop branch into main (#96) 2022-12-14 17:02:39 +01:00
numastatsMetric.md Merge develop branch into main (#96) 2022-12-14 17:02:39 +01:00
nvidiaMetric.go Add latest development to main branch (#89) 2022-10-10 12:23:51 +02:00
nvidiaMetric.md Option to use MIG slice name as subtype-id in NvidiaCollector 2022-05-13 15:26:47 +02:00
raplMetric.go Merge develop branch into main (#96) 2022-12-14 17:02:39 +01:00
raplMetric.md Merge develop branch into main (#96) 2022-12-14 17:02:39 +01:00
README.md Merge develop branch into main (#96) 2022-12-14 17:02:39 +01:00
rocmsmiMetric.go Add latest development to main branch (#89) 2022-10-10 12:23:51 +02:00
rocmsmiMetric.md AMD ROCm SMI collector (#77) 2022-05-25 15:55:43 +02:00
sampleMetric.go Merge develop branch into main (#96) 2022-12-14 17:02:39 +01:00
sampleTimerMetric.go Merge develop branch into main (#96) 2022-12-14 17:02:39 +01:00
schedstatMetric.go Add latest development to main branch (#89) 2022-10-10 12:23:51 +02:00
schedstatMetric.md cpustatMetric.go: Use derived values instead of absolute values (#83) 2022-09-07 14:13:06 +02:00
selfMetric.go Add latest development to main branch (#89) 2022-10-10 12:23:51 +02:00
selfMetric.md Add latest development to main branch (#89) 2022-10-10 12:23:51 +02:00
tempMetric.go Add latest development to main branch (#89) 2022-10-10 12:23:51 +02:00
tempMetric.md Modularize the whole thing (#16) 2022-01-25 15:37:43 +01:00
topprocsMetric.go Add latest development to main branch (#89) 2022-10-10 12:23:51 +02:00
topprocsMetric.md Modularize the whole thing (#16) 2022-01-25 15:37:43 +01:00

CCMetric collectors

This folder contains the collectors for the cc-metric-collector.

Configuration

{
    "collector_type" : {
        <collector specific configuration>
    }
}

In contrast to the configuration files for sinks and receivers, the collectors' configuration is not a list but a single JSON object that maps each collector type to its own configuration dict. This is required because we didn't manage to partially read the type before loading the remaining configuration. We are eager to change this to the same format used by sinks and receivers.
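
As a rough illustration of this layout, the top-level object can be decoded into a map from collector type to a raw, still-undecoded sub-config, which each collector then parses on its own. This is a minimal sketch, not the project's actual loading code: the function name parseCollectorConfig and the collector names in the example are assumptions made for illustration.

```go
package main

import (
    "encoding/json"
    "fmt"
    "sort"
)

// parseCollectorConfig splits the top-level JSON object into one raw
// sub-config per collector type. The function name is hypothetical.
func parseCollectorConfig(data []byte) (map[string]json.RawMessage, error) {
    var configs map[string]json.RawMessage
    if err := json.Unmarshal(data, &configs); err != nil {
        return nil, fmt.Errorf("parsing collector config: %w", err)
    }
    return configs, nil
}

func main() {
    // Collector names and options below are illustrative only.
    data := []byte(`{
        "sample": {"exclude_metrics": ["sample_metric"]},
        "other": {}
    }`)
    configs, err := parseCollectorConfig(data)
    if err != nil {
        fmt.Println(err)
        return
    }
    // Sort the keys so the output order is stable.
    names := make([]string, 0, len(configs))
    for name := range configs {
        names = append(names, name)
    }
    sort.Strings(names)
    for _, name := range names {
        fmt.Printf("%s -> %s\n", name, configs[name])
    }
}
```

Each raw sub-config has the same shape as the json.RawMessage argument that Init() receives.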

Available collectors

Todos

  • Aggregate metrics to a higher topology entity (e.g. sum hwthread metrics into a socket metric). This needs to be configurable

Contributing own collectors

A collector reads data from any source, parses it into metrics and submits these metrics to the metric-collector. A collector provides the following functions:

  • Name() string: Return the name of the collector
  • Init(config json.RawMessage) error: Initializes the collector using the given collector-specific config in JSON. Check if needed files/commands exist, ...
  • Initialized() bool: Check if a collector is successfully initialized
  • Read(duration time.Duration, output chan ccMetric.CCMetric): Read, parse and submit data to the output channel as CCMetric. If the collector has to measure anything for some duration, use the provided function argument duration.
  • Close(): Closes down the collector.

It is recommended to call setup() in the Init() function.

Finally, the collector needs to be registered in collectorManager.go. There is a list of collectors called AvailableCollectors, which is a map (collector_type_string -> pointer to MetricCollector interface). Add a new entry with a descriptive name and the new collector.
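
A registration entry might look roughly like the sketch below. It is self-contained, so the MetricCollector interface is cut down to two methods and SampleCollector is a stub. The key "sample" and the helper lookupCollector are assumptions for illustration and are not taken from collectorManager.go.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// MetricCollector is reduced to two methods to keep the sketch short.
type MetricCollector interface {
    Name() string
    Init(config json.RawMessage) error
}

// SampleCollector is a stub standing in for a real collector.
type SampleCollector struct {
    name string
}

func (m *SampleCollector) Name() string { return m.name }

func (m *SampleCollector) Init(config json.RawMessage) error {
    m.name = "SampleCollector"
    return nil
}

// AvailableCollectors maps the collector type used in the config file
// to a collector instance. The key "sample" is illustrative.
var AvailableCollectors = map[string]MetricCollector{
    "sample": new(SampleCollector),
}

// lookupCollector is a hypothetical helper that resolves a configured
// collector type to its registered instance.
func lookupCollector(collectorType string) (MetricCollector, error) {
    c, ok := AvailableCollectors[collectorType]
    if !ok {
        return nil, fmt.Errorf("unknown collector type %q", collectorType)
    }
    return c, nil
}

func main() {
    c, err := lookupCollector("sample")
    if err != nil {
        fmt.Println(err)
        return
    }
    if err := c.Init(nil); err != nil {
        fmt.Println(err)
        return
    }
    fmt.Println("registered collector:", c.Name())
}
```

The map key is the string that appears as "collector_type" in the configuration file, so it should be short and descriptive.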

Sample collector

package collectors

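import (
    "encoding/json"
    "fmt"
    "time"

    // The logger import path is an assumption that mirrors the ccMetric
    // path below; adjust it to the actual repository layout.
    cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
    lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)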

// Struct for the collector-specific JSON config
type SampleCollectorConfig struct {
    ExcludeMetrics []string `json:"exclude_metrics"`
}

type SampleCollector struct {
    metricCollector
    config SampleCollectorConfig
}

func (m *SampleCollector) Init(config json.RawMessage) error {
    // Check if already initialized
    if m.init {
        return nil
    }

    m.name = "SampleCollector"
    m.setup()
    if len(config) > 0 {
        err := json.Unmarshal(config, &m.config)
        if err != nil {
            return err
        }
    }
    m.meta = map[string]string{"source": m.name, "group": "Sample"}

    m.init = true
    return nil
}

func (m *SampleCollector) Read(interval time.Duration, output chan lp.CCMetric) {
    if !m.init {
        return
    }
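    // Tags for the metric; if type != node, use the proper type and type-id
    tags := map[string]string{"type": "node"}

    // GetMetric() is a placeholder for the collector-specific data source
    x, err := GetMetric()
    if err != nil {
        cclog.ComponentError(m.name, fmt.Sprintf("Read(): %v", err))
        // Do not submit a metric if reading failed
        return
    }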

    // Each metric has exactly one field: value !
    value := map[string]interface{}{"value": int64(x)}
    if y, err := lp.New("sample_metric", tags, m.meta, value, time.Now()); err == nil {
        output <- y
    }
}

func (m *SampleCollector) Close() {
    // Mark the collector as uninitialized so that Read() becomes a no-op
    m.init = false
}