cc-metric-collector/collectors

CCMetric collectors

This folder contains the collectors for the cc-metric-collector.

Configuration

{
    "collector_type" : {
        <collector specific configuration>
    }
}

In contrast to the configuration files for sinks and receivers, the collector configuration is not a list but a JSON object with one entry per collector type. This was necessary because we did not manage to read the type first and load the remaining configuration afterwards. We intend to switch to the same format as sinks and receivers.
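As an illustration of the keyed layout, the following minimal sketch splits such a configuration into one raw JSON blob per collector type, so that each blob can be handed to the matching collector later. The helper `parseCollectorConfig` and the collector names `sample` and `other` are assumptions made for this example and are not taken from the repository.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// parseCollectorConfig (hypothetical helper) splits the top-level JSON object
// into one raw, still undecoded JSON blob per collector type.
func parseCollectorConfig(data []byte) (map[string]json.RawMessage, error) {
	var cfg map[string]json.RawMessage
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	return cfg, nil
}

func main() {
	// Hypothetical configuration; names and options are for illustration only.
	data := []byte(`{
		"sample": {"exclude_metrics": ["a"]},
		"other": {}
	}`)

	cfg, err := parseCollectorConfig(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	// Map iteration order is random in Go, so sort the keys for stable output.
	names := make([]string, 0, len(cfg))
	for name := range cfg {
		names = append(names, name)
	}
	sort.Strings(names)

	for _, name := range names {
		// cfg[name] is what a collector would receive in Init().
		fmt.Printf("%s -> %s\n", name, cfg[name])
	}
}
```

Because the key already names the collector type, the collector-specific part can stay undecoded until the collector's `Init()` runs.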

Available collectors

Todos

  • Aggregate metrics to a higher topology entity (sum hwthread metrics to a socket metric, ...). This needs to be configurable.

Contributing own collectors

A collector reads data from any source, parses it into metrics and submits these metrics to the metric-collector. A collector provides the following functions:

  • Name() string: Return the name of the collector
  • Init(config json.RawMessage) error: Initialize the collector using the given collector-specific config in JSON. Check if needed files/commands exist, ...
  • Initialized() bool: Check if a collector is successfully initialized
  • Read(duration time.Duration, output chan ccMetric.CCMetric): Read, parse and submit data to the output channel as CCMetric. If the collector has to measure anything for some duration, use the provided function argument duration.
  • Close(): Shut down the collector.
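The functions listed above can be written down as a Go interface. The sketch below is self-contained, so it replaces `ccMetric.CCMetric` with a stand-in type named `CCMetric`, and `noopCollector` is a hypothetical collector that exists only to show that a type satisfies the interface. The actual definition in the repository may differ in detail.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// CCMetric is a stand-in for ccMetric.CCMetric so the sketch compiles on its own.
type CCMetric interface {
	Name() string
}

// MetricCollector lists the five functions described above.
type MetricCollector interface {
	Name() string
	Init(config json.RawMessage) error
	Initialized() bool
	Read(duration time.Duration, output chan CCMetric)
	Close()
}

// noopCollector is a hypothetical collector that never submits a metric.
type noopCollector struct {
	init bool
}

func (c *noopCollector) Name() string { return "noop" }

func (c *noopCollector) Init(config json.RawMessage) error {
	c.init = true
	return nil
}

func (c *noopCollector) Initialized() bool { return c.init }

func (c *noopCollector) Read(duration time.Duration, output chan CCMetric) {}

func (c *noopCollector) Close() { c.init = false }

// Compile-time check that *noopCollector implements MetricCollector.
var _ MetricCollector = (*noopCollector)(nil)

func main() {
	var c MetricCollector = &noopCollector{}
	if err := c.Init(nil); err != nil {
		fmt.Println("init failed:", err)
		return
	}
	fmt.Println(c.Name(), "initialized:", c.Initialized())
	c.Close()
}
```

The `var _ MetricCollector = ...` line is a common Go idiom: the build fails if one of the functions is missing or has a different signature.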

It is recommended to call setup() in the Init() function.

Finally, the collector needs to be registered in collectorManager.go. There is a map called AvailableCollectors (collector_type_string -> pointer to MetricCollector interface). Add a new entry with a descriptive name and the new collector.
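A registration along these lines might look as follows. This is a reduced, self-contained sketch: `Collector` stands in for the MetricCollector interface, the key `"sample"` is a hypothetical collector type string, and `lookupCollector` is an illustrative helper rather than a function from collectorManager.go.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Collector is a reduced stand-in for the MetricCollector interface.
type Collector interface {
	Name() string
	Init(config json.RawMessage) error
}

// SampleCollector is a minimal collector used only for this sketch.
type SampleCollector struct {
	name string
}

func (m *SampleCollector) Name() string { return m.name }

func (m *SampleCollector) Init(config json.RawMessage) error {
	m.name = "SampleCollector"
	return nil
}

// AvailableCollectors maps the collector type string used in the
// configuration file to a collector instance.
var AvailableCollectors = map[string]Collector{
	"sample": new(SampleCollector), // hypothetical entry for the new collector
}

// lookupCollector (hypothetical helper) resolves a configured type string.
func lookupCollector(kind string) (Collector, error) {
	c, ok := AvailableCollectors[kind]
	if !ok {
		return nil, fmt.Errorf("unknown collector type %q", kind)
	}
	return c, nil
}

func main() {
	c, err := lookupCollector("sample")
	if err != nil {
		fmt.Println(err)
		return
	}
	if err := c.Init(nil); err != nil {
		fmt.Println("init failed:", err)
		return
	}
	fmt.Println("registered collector:", c.Name())
}
```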

Sample collector

package collectors

import (
    "encoding/json"
    "fmt"
    "time"

    cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
    lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)

// Struct for the collector-specific JSON config
type SampleCollectorConfig struct {
    ExcludeMetrics []string `json:"exclude_metrics"`
}

type SampleCollector struct {
    metricCollector
    config SampleCollectorConfig
}

func (m *SampleCollector) Init(config json.RawMessage) error {
    // Check if already initialized
    if m.init {
        return nil
    }

    m.name = "SampleCollector"
    m.setup()
    if len(config) > 0 {
        err := json.Unmarshal(config, &m.config)
        if err != nil {
            return err
        }
    }
    m.meta = map[string]string{"source": m.name, "group": "Sample"}

    m.init = true
    return nil
}

func (m *SampleCollector) Read(interval time.Duration, output chan lp.CCMetric) {
    if !m.init {
        return
    }
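    // tags for the metric, if type != node use proper type and type-id
    tags := map[string]string{"type": "node"}

    // GetMetric() is a placeholder for the collector's actual data source
    x, err := GetMetric()
    if err != nil {
        cclog.ComponentError(m.name, fmt.Sprintf("Read(): %v", err))
        // Do not submit a metric if reading failed
        return
    }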

    // Each metric has exactly one field: value !
    value := map[string]interface{}{"value": int64(x)}
    if y, err := lp.New("sample_metric", tags, m.meta, value, time.Now()); err == nil {
        output <- y
    }
}

func (m *SampleCollector) Close() {
    // Mark the collector as uninitialized so that Read() returns early
    m.init = false
}