Merge branch 'develop' into main

This commit is contained in:
Thomas Roehl
2022-02-21 14:32:24 +01:00
101 changed files with 8324 additions and 2641 deletions

View File

@@ -1,304 +1,59 @@
# CCMetric collectors
This folder contains the collectors for the cc-metric-collector.
# `metricCollector.go`
The base class/configuration is located in `metricCollector.go`.
# Collectors
* `memstatMetric.go`: Reads `/proc/meminfo` to calculate **node** metrics. It also combines values to the metric `mem_used`
* `loadavgMetric.go`: Reads `/proc/loadavg` and submits **node** metrics:
* `netstatMetric.go`: Reads `/proc/net/dev` and submits **node** metrics for all network devices.
* `lustreMetric.go`: Reads Lustre's stats files and submits **node** metrics:
* `infinibandMetric.go`: Reads InfiniBand metrics. It uses the `perfquery` command to read the **node** metrics but can fall back to sysfs counters in case `perfquery` does not work.
* `likwidMetric.go`: Reads hardware performance events using LIKWID. It submits **socket** and **cpu** metrics.
* `cpustatMetric.go`: Reads CPU-specific values from `/proc/stat`
* `topprocsMetric.go`: Reads the top X processes by CPU usage; X is configurable
* `nvidiaMetric.go`: Reads data about Nvidia GPUs using the NVML library
* `tempMetric.go`: Reads temperature data from `/sys/class/hwmon/hwmon*`
* `ipmiMetric.go`: Collects data from `ipmitool` or, as a fallback, from `ipmi-sensors`
* `customCmdMetric.go`: Runs commands or reads files and submits the output (which has to be in InfluxDB line protocol!)
If any of the collectors cannot be initialized, it is excluded from all further reads. For example, if the Lustre stats file is not a valid path, no Lustre-specific metrics will be recorded.
# Collector configuration
```json
"collectors": [
"tempstat"
],
"collect_config": {
"tempstat": {
"tag_override": {
"hwmon0" : {
"type" : "socket",
"type-id" : "0"
},
"hwmon1" : {
"type" : "socket",
"type-id" : "1"
}
}
}
}
```
Each collector entry in `collect_config` follows the generic pattern:
```json
{
"collector_type" : {
<collector specific configuration>
}
}
```
The configuration of the collectors in the main config file consists of two parts: the active collectors (`collectors`) and the collector configuration (`collect_config`). At startup, all collectors in the `collectors` list are initialized and, if successfully initialized, added to the active collectors for metric retrieval. At initialization, the collector-specific configuration from the `collect_config` section is handed over. Each collector has its own configuration options; check the collector-specific section.
In contrast to the configuration files for sinks and receivers, the collectors' configuration is not a list but a set of dicts. This is required because we did not manage to read the type separately before loading the remaining configuration. We are eager to change this to the same format.
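A minimal sketch of how this format is consumed, mirroring the collector manager shown later in this commit: the file decodes into a map from collector name to raw JSON, and each raw config is handed to the matching collector's `Init()` (the config snippet is made up).
```go
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// Hypothetical config in the set-of-dicts format
	raw := []byte(`{"memstat": {"exclude_metrics": ["mem_used"]}}`)
	var config map[string]json.RawMessage
	if err := json.Unmarshal(raw, &config); err != nil {
		panic(err)
	}
	for name, collectorCfg := range config {
		// in collectorManager.go: AvailableCollectors[name].Init(collectorCfg)
		fmt.Printf("collector %s gets config %s\n", name, string(collectorCfg))
	}
}
```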
## `memstat`
```json
"memstat": {
"exclude_metrics": [
"mem_used"
]
}
```
The `memstat` collector reads data from `/proc/meminfo` and outputs a handful of **node** metrics. If a metric is not required, it can be excluded from forwarding to the sink. The derivation of `mem_used` is illustrated after the metric list below.
Metrics:
* `mem_total`
* `mem_sreclaimable`
* `mem_slab`
* `mem_free`
* `mem_buffers`
* `mem_cached`
* `mem_available`
* `mem_shared`
* `swap_total`
* `swap_free`
* `mem_used` = `mem_total` - (`mem_free` + `mem_buffers` + `mem_cached`)
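For illustration, the `mem_used` derivation with made-up values (in kB):
```go
package main

import "fmt"

func main() {
	// Hypothetical values parsed from /proc/meminfo (kB)
	memTotal, memFree, memBuffers, memCached := int64(32000000), int64(12000000), int64(500000), int64(8000000)
	memUsed := memTotal - (memFree + memBuffers + memCached)
	fmt.Println("mem_used:", memUsed) // 11500000
}
```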
## `loadavg`
```json
"loadavg": {
"exclude_metrics": [
"proc_run"
]
}
```
The `loadavg` collector reads data from `/proc/loadavg` and outputs a handful of **node** metrics. If a metric is not required, it can be excluded from forwarding to the sink.
Metrics:
* `load_one`
* `load_five`
* `load_fifteen`
* `proc_run`
* `proc_total`
## `netstat`
```json
"netstat": {
"exclude_devices": [
"lo"
]
}
```
The `netstat` collector reads data from `/proc/net/dev` and outputs a handful of **node** metrics. If a device is not required, it can be excluded from forwarding to the sink. Commonly the `lo` device should be excluded.
Metrics:
* `bytes_in`
* `bytes_out`
* `pkts_in`
* `pkts_out`
The device name is added as tag `device`.
## `diskstat`
```json
"diskstat": {
"exclude_metrics": [
"read_ms"
]
}
```
The `diskstat` collector reads data from `/proc/diskstats` and outputs a handful of **node** metrics. If a metric is not required, it can be excluded from forwarding to the sink.
Metrics:
* `reads`
* `reads_merged`
* `read_sectors`
* `read_ms`
* `writes`
* `writes_merged`
* `writes_sectors`
* `writes_ms`
* `ioops`
* `ioops_ms`
* `ioops_weighted_ms`
* `discards`
* `discards_merged`
* `discards_sectors`
* `discards_ms`
* `flushes`
* `flushes_ms`
The device name is added as tag `device`.
## `cpustat`
```json
"netstat": {
"exclude_metrics": [
"cpu_idle"
]
}
```
The `cpustat` collector reads data from `/proc/stat` and outputs a handful of **node** and **hwthread** metrics. If a metric is not required, it can be excluded from forwarding to the sink. A small parsing sketch follows the metric list below.
Metrics:
* `cpu_user`
* `cpu_nice`
* `cpu_system`
* `cpu_idle`
* `cpu_iowait`
* `cpu_irq`
* `cpu_softirq`
* `cpu_steal`
* `cpu_guest`
* `cpu_guest_nice`
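As a rough sketch of the underlying parsing (made-up counter values; the collector code in this commit reports each counter as a percentage of the line total):
```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

func main() {
	// Made-up /proc/stat line for one hwthread
	line := "cpu0 4705 150 1120 16250 520 30 45 0 0 0"
	names := []string{"cpu_user", "cpu_nice", "cpu_system", "cpu_idle",
		"cpu_iowait", "cpu_irq", "cpu_softirq", "cpu_steal",
		"cpu_guest", "cpu_guest_nice"}
	fields := strings.Fields(line)
	values := make(map[string]float64)
	total := 0.0
	for i, name := range names {
		if x, err := strconv.ParseFloat(fields[i+1], 64); err == nil {
			values[name] = x
			total += x
		}
	}
	// Report each counter as a percentage of the line total
	for name, v := range values {
		fmt.Printf("%s %.2f%%\n", name, v*100.0/total)
	}
}
```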
## `likwid`
```json
"likwid": {
"eventsets": [
{
"events": {
"FIXC1": "ACTUAL_CPU_CLOCK",
"FIXC2": "MAX_CPU_CLOCK",
"PMC0": "RETIRED_INSTRUCTIONS",
"PMC1": "CPU_CLOCKS_UNHALTED",
"PMC2": "RETIRED_SSE_AVX_FLOPS_ALL",
"PMC3": "MERGE",
"DFC0": "DRAM_CHANNEL_0",
"DFC1": "DRAM_CHANNEL_1",
"DFC2": "DRAM_CHANNEL_2",
"DFC3": "DRAM_CHANNEL_3"
},
"metrics": [
{
"name": "ipc",
"calc": "PMC0/PMC1",
"socket_scope": false,
"publish": true
},
{
"name": "flops_any",
"calc": "0.000001*PMC2/time",
"socket_scope": false,
"publish": true
},
{
"name": "clock_mhz",
"calc": "0.000001*(FIXC1/FIXC2)/inverseClock",
"socket_scope": false,
"publish": true
},
{
"name": "mem1",
"calc": "0.000001*(DFC0+DFC1+DFC2+DFC3)*64.0/time",
"socket_scope": true,
"publish": false
}
]
},
{
"events": {
"DFC0": "DRAM_CHANNEL_4",
"DFC1": "DRAM_CHANNEL_5",
"DFC2": "DRAM_CHANNEL_6",
"DFC3": "DRAM_CHANNEL_7",
"PWR0": "RAPL_CORE_ENERGY",
"PWR1": "RAPL_PKG_ENERGY"
},
"metrics": [
{
"name": "pwr_core",
"calc": "PWR0/time",
"socket_scope": false,
"publish": true
},
{
"name": "pwr_pkg",
"calc": "PWR1/time",
"socket_scope": true,
"publish": true
},
{
"name": "mem2",
"calc": "0.000001*(DFC0+DFC1+DFC2+DFC3)*64.0/time",
"socket_scope": true,
"publish": false
}
]
}
],
"globalmetrics": [
{
"name": "mem_bw",
"calc": "mem1+mem2",
"socket_scope": true,
"publish": true
}
]
}
```
_Example config suitable for AMD Zen3_
The `likwid` collector reads hardware performance counters on the **hwthread** and **socket** level. The configuration looks quite complicated, but it is basically copy&paste from [LIKWID's performance groups](https://github.com/RRZE-HPC/likwid/tree/master/groups). The collector went through multiple iterations; earlier versions tried to use the performance groups directly but lacked flexibility. The current way of configuration provides the most flexibility.
The logic is as follows: there are multiple eventsets, each consisting of a list of counters plus events and a list of metrics. If you compare a common performance group with the example setting above, there is not much difference:
```
EVENTSET -> "events": {
FIXC1 ACTUAL_CPU_CLOCK -> "FIXC1": "ACTUAL_CPU_CLOCK",
FIXC2 MAX_CPU_CLOCK -> "FIXC2": "MAX_CPU_CLOCK",
PMC0 RETIRED_INSTRUCTIONS -> "PMC0" : "RETIRED_INSTRUCTIONS",
PMC1 CPU_CLOCKS_UNHALTED -> "PMC1" : "CPU_CLOCKS_UNHALTED",
PMC2 RETIRED_SSE_AVX_FLOPS_ALL -> "PMC2": "RETIRED_SSE_AVX_FLOPS_ALL",
PMC3 MERGE -> "PMC3": "MERGE",
-> }
```
The metrics follow the same scheme:
```
METRICS -> "metrics": [
IPC PMC0/PMC1 -> {
-> "name" : "IPC",
-> "calc" : "PMC0/PMC1",
-> "socket_scope": false,
-> "publish": true
-> }
-> ]
```
The `socket_scope` option selects whether the metric is submitted per socket or per hwthread. If a metric is only used for internal calculations, you can set `publish = false`.
Since some metrics can only be gathered in multiple measurements (like the memory bandwidth on AMD Zen3 chips), configure multiple eventsets like in the example config and use the `globalmetrics` section to combine them. **Be aware** that the combination might be misleading because the "behavior" of a metric changes over time and the multiple measurements might count different computing phases.
# Available collectors
* [`cpustat`](./cpustatMetric.md)
* [`memstat`](./memstatMetric.md)
* [`iostat`](./iostatMetric.md)
* [`diskstat`](./diskstatMetric.md)
* [`loadavg`](./loadavgMetric.md)
* [`netstat`](./netstatMetric.md)
* [`ibstat`](./infinibandMetric.md)
* [`ibstat_perfquery`](./infinibandPerfQueryMetric.md)
* [`tempstat`](./tempMetric.md)
* [`lustrestat`](./lustreMetric.md)
* [`likwid`](./likwidMetric.md)
* [`nvidia`](./nvidiaMetric.md)
* [`customcmd`](./customCmdMetric.md)
* [`ipmistat`](./ipmiMetric.md)
* [`topprocs`](./topprocsMetric.md)
* [`nfs3stat`](./nfs3Metric.md)
* [`nfs4stat`](./nfs4Metric.md)
* [`cpufreq`](./cpufreqMetric.md)
* [`cpufreq_cpuinfo`](./cpufreqCpuinfoMetric.md)
* [`numastats`](./numastatMetric.md)
* [`gpfs`](./gpfsMetric.md)
## Todos
* [ ] Exclude devices for `diskstat` collector
* [ ] Aggregate metrics to a higher topology entity (sum hwthread metrics to a socket metric, ...). Needs to be configurable
# Contributing own collectors
A collector reads data from any source, parses it into metrics and submits these metrics to the metric collector. A collector provides the following functions:
* `Name() string`: Return the name of the collector
* `Init(config json.RawMessage) error`: Initializes the collector using the given collector-specific config in JSON. Check whether needed files/commands exist, ...
* `Initialized() bool`: Check if a collector is successfully initialized
* `Read(duration time.Duration, output chan ccMetric.CCMetric)`: Read, parse and submit data to the `output` channel as [`CCMetric`](../internal/ccMetric/README.md). If the collector has to measure anything for some duration, use the provided function argument `duration`.
* `Close()`: Closes down the collector.
It is recommended to call `setup()` in the `Init()` function.
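Taken together, a collector satisfies an interface along these lines (a sketch derived from the list above; the authoritative definition lives in `metricCollector.go`):
```go
package collectors

import (
	"encoding/json"
	"time"

	lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)

// Sketch of the collector contract implied by the functions above
type MetricCollector interface {
	Name() string                                         // name of the collector
	Init(config json.RawMessage) error                    // initialize from collector-specific JSON config
	Initialized() bool                                    // true after a successful Init()
	Read(duration time.Duration, output chan lp.CCMetric) // read, parse and submit metrics
	Close()                                               // shut the collector down
}
```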
Finally, the collector needs to be registered in `collectorManager.go`. There is a list of collectors called `AvailableCollectors` which is a map (`collector_type_string` -> pointer to a `MetricCollector` interface implementation). Add a new entry with a descriptive name and the new collector.
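For illustration, registering the sample collector from the next section would add one entry to that map (the `"sample"` key is a made-up name):
```go
var AvailableCollectors = map[string]MetricCollector{
	// ... existing entries ...
	"sample": new(SampleCollector),
}
```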
## Sample collector
@@ -307,8 +62,9 @@ package collectors
import (
"encoding/json"
"fmt"
"time"
cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)
// Struct for the collector-specific JSON config
@@ -317,11 +73,11 @@ type SampleCollectorConfig struct {
}
type SampleCollector struct {
metricCollector
config SampleCollectorConfig
}
func (m *SampleCollector) Init(config json.RawMessage) error {
// Check if already initialized
if m.init {
return nil
@@ -335,21 +91,28 @@ func (m *SampleCollector) Init(config []byte) error {
return err
}
}
m.meta = map[string]string{"source": m.name, "group": "Sample"}
m.init = true
return nil
}
func (m *SampleCollector) Read(interval time.Duration, output chan lp.CCMetric) {
if !m.init {
return
}
// tags for the metric, if type != node use proper type and type-id
tags := map[string]string{"type" : "node"}
x, err := GetMetric()
if err != nil {
cclog.ComponentError(m.name, fmt.Sprintf("Read(): %v", err))
}
// Each metric has exactly one field: value !
value := map[string]interface{}{"value": int(x)}
y, err := lp.New("sample_metric", tags, value, time.Now())
if err == nil {
*out = append(*out, y)
value := map[string]interface{}{"value": int64(x)}
if y, err := lp.New("sample_metric", tags, m.meta, value, time.Now()); err == nil {
output <- y
}
}
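The sample omits `Close()`; a minimal sketch, assuming nothing needs to be torn down besides the init flag (the pattern most collectors in this folder follow):
```go
func (m *SampleCollector) Close() {
	// Mark the collector as uninitialized so Read() becomes a no-op
	m.init = false
}
```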

View File

@@ -0,0 +1,173 @@
package collectors
import (
"encoding/json"
"os"
"sync"
"time"
cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
mct "github.com/ClusterCockpit/cc-metric-collector/internal/multiChanTicker"
)
// Map of all available metric collectors
var AvailableCollectors = map[string]MetricCollector{
"likwid": new(LikwidCollector),
"loadavg": new(LoadavgCollector),
"memstat": new(MemstatCollector),
"netstat": new(NetstatCollector),
"ibstat": new(InfinibandCollector),
"ibstat_perfquery": new(InfinibandPerfQueryCollector),
"lustrestat": new(LustreCollector),
"cpustat": new(CpustatCollector),
"topprocs": new(TopProcsCollector),
"nvidia": new(NvidiaCollector),
"customcmd": new(CustomCmdCollector),
"iostat": new(IOstatCollector),
"diskstat": new(DiskstatCollector),
"tempstat": new(TempCollector),
"ipmistat": new(IpmiCollector),
"gpfs": new(GpfsCollector),
"cpufreq": new(CPUFreqCollector),
"cpufreq_cpuinfo": new(CPUFreqCpuInfoCollector),
"nfs3stat": new(Nfs3Collector),
"nfs4stat": new(Nfs4Collector),
"numastats": new(NUMAStatsCollector),
}
// Metric collector manager data structure
type collectorManager struct {
collectors []MetricCollector // List of metric collectors to use
output chan lp.CCMetric // Output channels
done chan bool // channel to finish / stop metric collector manager
ticker mct.MultiChanTicker // periodically ticking once each interval
duration time.Duration // duration (for metrics that measure over a given duration)
wg *sync.WaitGroup // wait group for all goroutines in cc-metric-collector
config map[string]json.RawMessage // json encoded config for collector manager
}
// Metric collector manager access functions
type CollectorManager interface {
Init(ticker mct.MultiChanTicker, duration time.Duration, wg *sync.WaitGroup, collectConfigFile string) error
AddOutput(output chan lp.CCMetric)
Start()
Close()
}
// Init initializes a new metric collector manager by setting up:
// * output channel
// * done channel
// * wait group synchronization for goroutines (from variable wg)
// * ticker (from variable ticker)
// * configuration (read from config file in variable collectConfigFile)
// Initialization is done for all configured collectors
func (cm *collectorManager) Init(ticker mct.MultiChanTicker, duration time.Duration, wg *sync.WaitGroup, collectConfigFile string) error {
cm.collectors = make([]MetricCollector, 0)
cm.output = nil
cm.done = make(chan bool)
cm.wg = wg
cm.ticker = ticker
cm.duration = duration
// Read collector config file
configFile, err := os.Open(collectConfigFile)
if err != nil {
cclog.Error(err.Error())
return err
}
defer configFile.Close()
jsonParser := json.NewDecoder(configFile)
err = jsonParser.Decode(&cm.config)
if err != nil {
cclog.Error(err.Error())
return err
}
// Initialize configured collectors
for collectorName, collectorCfg := range cm.config {
if _, found := AvailableCollectors[collectorName]; !found {
cclog.ComponentError("CollectorManager", "SKIP unknown collector", collectorName)
continue
}
collector := AvailableCollectors[collectorName]
err = collector.Init(collectorCfg)
if err != nil {
cclog.ComponentError("CollectorManager", "Collector", collectorName, "initialization failed:", err.Error())
continue
}
cclog.ComponentDebug("CollectorManager", "ADD COLLECTOR", collector.Name())
cm.collectors = append(cm.collectors, collector)
}
return nil
}
// Start starts the metric collector manager
func (cm *collectorManager) Start() {
tick := make(chan time.Time)
cm.ticker.AddChannel(tick)
cm.wg.Add(1)
go func() {
defer cm.wg.Done()
// Collector manager is done
done := func() {
// close all metric collectors
for _, c := range cm.collectors {
c.Close()
}
close(cm.done)
cclog.ComponentDebug("CollectorManager", "DONE")
}
// Wait for done signal or timer event
for {
select {
case <-cm.done:
done()
return
case t := <-tick:
for _, c := range cm.collectors {
// Wait for done signal or execute the collector
select {
case <-cm.done:
done()
return
default:
// Read metrics from collector c
cclog.ComponentDebug("CollectorManager", c.Name(), t)
c.Read(cm.duration, cm.output)
}
}
}
}
}()
// Collector manager is started
cclog.ComponentDebug("CollectorManager", "STARTED")
}
// AddOutput adds the output channel to the metric collector manager
func (cm *collectorManager) AddOutput(output chan lp.CCMetric) {
cm.output = output
}
// Close finishes / stops the metric collector manager
func (cm *collectorManager) Close() {
cclog.ComponentDebug("CollectorManager", "CLOSE")
cm.done <- true
// wait for close of channel cm.done
<-cm.done
}
// New creates a new initialized metric collector manager
func New(ticker mct.MultiChanTicker, duration time.Duration, wg *sync.WaitGroup, collectConfigFile string) (CollectorManager, error) {
cm := new(collectorManager)
err := cm.Init(ticker, duration, wg, collectConfigFile)
if err != nil {
return nil, err
}
return cm, err
}
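A minimal usage sketch of this manager (the `mct.NewTicker` constructor name and the channel buffer size are assumptions; a `collectors.json` must exist):
```go
package main

import (
	"fmt"
	"sync"
	"time"

	"github.com/ClusterCockpit/cc-metric-collector/collectors"
	lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
	mct "github.com/ClusterCockpit/cc-metric-collector/internal/multiChanTicker"
)

func main() {
	var wg sync.WaitGroup
	ticker := mct.NewTicker(10 * time.Second) // assumed constructor
	output := make(chan lp.CCMetric, 100)

	manager, err := collectors.New(ticker, time.Second, &wg, "collectors.json")
	if err != nil {
		panic(err)
	}
	manager.AddOutput(output)
	manager.Start()

	// Consume a few metrics, then shut everything down
	for i := 0; i < 3; i++ {
		fmt.Println((<-output).Name())
	}
	manager.Close()
	wg.Wait()
}
```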

View File

@@ -2,14 +2,16 @@ package collectors
import (
"bufio"
"encoding/json"
"fmt"
"log"
"os"
"strconv"
"strings"
"time"
lp "github.com/influxdata/line-protocol"
cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)
//
@@ -21,41 +23,55 @@ import (
type CPUFreqCpuInfoCollectorTopology struct {
processor string // logical processor number (continuous, starting at 0)
coreID string // socket local core ID
coreID_int int64
physicalPackageID string // socket / package ID
physicalPackageID_int int64
numPhysicalPackages string // number of sockets / packages
numPhysicalPackages_int int64
isHT bool
numNonHT string // number of non hyperthreading processors
numNonHT_int int64
tagSet map[string]string
}
type CPUFreqCpuInfoCollector struct {
metricCollector
topology []*CPUFreqCpuInfoCollectorTopology
}
func (m *CPUFreqCpuInfoCollector) Init(config json.RawMessage) error {
// Check if already initialized
if m.init {
return nil
}
m.setup()
m.name = "CPUFreqCpuInfoCollector"
m.meta = map[string]string{
"source": m.name,
"group": "CPU",
"unit": "MHz",
}
const cpuInfoFile = "/proc/cpuinfo"
file, err := os.Open(cpuInfoFile)
if err != nil {
return fmt.Errorf("Failed to open '%s': %v", cpuInfoFile, err)
return fmt.Errorf("Failed to open file '%s': %v", cpuInfoFile, err)
}
defer file.Close()
// Collect topology information from file cpuinfo
foundFreq := false
processor := ""
var numNonHT_int int64 = 0
coreID := ""
physicalPackageID := ""
var maxPhysicalPackageID int64 = 0
m.topology = make([]*CPUFreqCpuInfoCollectorTopology, 0)
coreSeenBefore := make(map[string]bool)
// Read cpuinfo file, line by line
scanner := bufio.NewScanner(file)
for scanner.Scan() {
lineSplit := strings.Split(scanner.Text(), ":")
@@ -81,39 +97,41 @@ func (m *CPUFreqCpuInfoCollector) Init(config []byte) error {
len(coreID) > 0 &&
len(physicalPackageID) > 0 {
topology := new(CPUFreqCpuInfoCollectorTopology)
// Processor
topology.processor = processor
// Core ID
topology.coreID = coreID
topology.coreID_int, err = strconv.ParseInt(coreID, 10, 64)
if err != nil {
return fmt.Errorf("Unable to convert coreID to int: %v", err)
return fmt.Errorf("Unable to convert coreID '%s' to int64: %v", coreID, err)
}
// Physical package ID
topology.physicalPackageID = physicalPackageID
topology.physicalPackageID_int, err = strconv.ParseInt(physicalPackageID, 10, 64)
if err != nil {
return fmt.Errorf("Unable to convert physicalPackageID to int: %v", err)
return fmt.Errorf("Unable to convert physicalPackageID '%s' to int64: %v", physicalPackageID, err)
}
// increase maximum socket / package ID, when required
if topology.physicalPackageID_int > maxPhysicalPackageID {
maxPhysicalPackageID = topology.physicalPackageID_int
}
// is hyperthread?
globalID := physicalPackageID + ":" + coreID
topology.isHT = coreSeenBefore[globalID]
coreSeenBefore[globalID] = true
if !topology.isHT {
// increase number of non hyper thread cores
numNonHT_int++
}
// store collected topology information
m.topology = append(m.topology, topology)
// reset topology information
foundFreq = false
@@ -126,18 +144,15 @@ func (m *CPUFreqCpuInfoCollector) Init(config []byte) error {
numPhysicalPackageID_int := maxPhysicalPackageID + 1
numPhysicalPackageID := fmt.Sprint(numPhysicalPackageID_int)
numNonHT := fmt.Sprint(numNonHT_int)
for _, t := range m.topology {
t.numPhysicalPackages = numPhysicalPackageID
t.numPhysicalPackages_int = numPhysicalPackageID_int
t.numNonHT = numNonHT
t.numNonHT_int = numNonHT_int
t.tagSet = map[string]string{
"type": "cpu",
"type-id": t.processor,
"num_core": t.numNonHT,
"package_id": t.physicalPackageID,
"num_package": t.numPhysicalPackages,
"type": "cpu",
"type-id": t.processor,
"package_id": t.physicalPackageID,
}
}
@@ -145,14 +160,18 @@ func (m *CPUFreqCpuInfoCollector) Init(config []byte) error {
return nil
}
func (m *CPUFreqCpuInfoCollector) Read(interval time.Duration, output chan lp.CCMetric) {
// Check if already initialized
if !m.init {
return
}
const cpuInfoFile = "/proc/cpuinfo"
file, err := os.Open(cpuInfoFile)
if err != nil {
log.Printf("Failed to open '%s': %v", cpuInfoFile, err)
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to open file '%s': %v", cpuInfoFile, err))
return
}
defer file.Close()
@@ -167,16 +186,17 @@ func (m *CPUFreqCpuInfoCollector) Read(interval time.Duration, out *[]lp.Mutable
// frequency
if key == "cpu MHz" {
t := &m.topology[processorCounter]
t := m.topology[processorCounter]
if !t.isHT {
value, err := strconv.ParseFloat(strings.TrimSpace(lineSplit[1]), 64)
if err != nil {
log.Printf("Failed to convert cpu MHz to float: %v", err)
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to convert cpu MHz '%s' to float64: %v", lineSplit[1], err))
return
}
y, err := lp.New("cpufreq", t.tagSet, map[string]interface{}{"value": value}, now)
if err == nil {
*out = append(*out, y)
if y, err := lp.New("cpufreq", t.tagSet, m.meta, map[string]interface{}{"value": value}, now); err == nil {
output <- y
}
}
processorCounter++

View File

@@ -0,0 +1,10 @@
## `cpufreq_cpuinfo` collector
```json
"cpufreq_cpuinfo": {}
```
The `cpufreq_cpuinfo` collector reads the clock frequency from `/proc/cpuinfo` and outputs a handful of **cpu** metrics.
Metrics:
* `cpufreq`

View File

@@ -1,48 +1,30 @@
package collectors
import (
"bufio"
"encoding/json"
"fmt"
"log"
"os"
"io/ioutil"
"path/filepath"
"strconv"
"strings"
"time"
lp "github.com/influxdata/line-protocol"
cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
"golang.org/x/sys/unix"
)
type CPUFreqCollectorTopology struct {
processor string // logical processor number (continuous, starting at 0)
coreID string // socket local core ID
coreID_int int64
physicalPackageID string // socket / package ID
physicalPackageID_int int64
numPhysicalPackages string // number of sockets / packages
numPhysicalPackages_int int64
isHT bool
numNonHT string // number of non hyperthreading processors
numNonHT_int int64
scalingCurFreqFile string
tagSet map[string]string
}
@@ -56,14 +38,19 @@ type CPUFreqCollectorTopology struct {
// See: https://www.kernel.org/doc/html/latest/admin-guide/pm/cpufreq.html
//
type CPUFreqCollector struct {
metricCollector
topology []CPUFreqCollectorTopology
config struct {
ExcludeMetrics []string `json:"exclude_metrics,omitempty"`
}
}
func (m *CPUFreqCollector) Init(config json.RawMessage) error {
// Check if already initialized
if m.init {
return nil
}
m.name = "CPUFreqCollector"
m.setup()
if len(config) > 0 {
@@ -72,54 +59,61 @@ func (m *CPUFreqCollector) Init(config []byte) error {
return err
}
}
m.meta = map[string]string{
"source": m.name,
"group": "CPU",
"unit": "MHz",
}
// Loop for all CPU directories
baseDir := "/sys/devices/system/cpu"
globPattern := filepath.Join(baseDir, "cpu[0-9]*")
cpuDirs, err := filepath.Glob(globPattern)
if err != nil {
return fmt.Errorf("CPUFreqCollector.Init() unable to glob files with pattern %s: %v", globPattern, err)
return fmt.Errorf("Unable to glob files with pattern '%s': %v", globPattern, err)
}
if cpuDirs == nil {
return fmt.Errorf("CPUFreqCollector.Init() unable to find any files with pattern %s", globPattern)
return fmt.Errorf("Unable to find any files with pattern '%s'", globPattern)
}
// Initialize CPU topology
m.topology = make([]CPUFreqCollectorTopology, len(cpuDirs))
for _, cpuDir := range cpuDirs {
processor := strings.TrimPrefix(cpuDir, "/sys/devices/system/cpu/cpu")
processor_int, err := strconv.ParseInt(processor, 10, 64)
if err != nil {
return fmt.Errorf("CPUFreqCollector.Init() unable to convert cpuID to int: %v", err)
return fmt.Errorf("Unable to convert cpuID '%s' to int64: %v", processor, err)
}
// Read package ID
physicalPackageIDFile := filepath.Join(cpuDir, "topology", "physical_package_id")
line, err := ioutil.ReadFile(physicalPackageIDFile)
if err != nil {
return fmt.Errorf("CPUFreqCollector.Init() unable to convert packageID to int: %v", err)
return fmt.Errorf("Unable to read physical package ID from file '%s': %v", physicalPackageIDFile, err)
}
physicalPackageID := strings.TrimSpace(string(line))
physicalPackageID_int, err := strconv.ParseInt(physicalPackageID, 10, 64)
if err != nil {
return fmt.Errorf("Unable to convert packageID '%s' to int64: %v", physicalPackageID, err)
}
// Read core ID
coreIDFile := filepath.Join(cpuDir, "topology", "core_id")
line, err = ioutil.ReadFile(coreIDFile)
if err != nil {
return fmt.Errorf("CPUFreqCollector.Init() unable to convert coreID to int: %v", err)
return fmt.Errorf("Unable to read core ID from file '%s': %v", coreIDFile, err)
}
coreID := strings.TrimSpace(string(line))
coreID_int, err := strconv.ParseInt(coreID, 10, 64)
if err != nil {
return fmt.Errorf("Unable to convert coreID '%s' to int64: %v", coreID, err)
}
// Check access to current frequency file
scalingCurFreqFile := filepath.Join(cpuDir, "cpufreq", "scaling_cur_freq")
err = unix.Access(scalingCurFreqFile, unix.R_OK)
if err != nil {
return fmt.Errorf("CPUFreqCollector.Init() unable to access %s: %v", scalingCurFreqFile, err)
return fmt.Errorf("Unable to access file '%s': %v", scalingCurFreqFile, err)
}
t := &m.topology[processor_int]
@@ -142,8 +136,8 @@ func (m *CPUFreqCollector) Init(config []byte) error {
}
// number of non hyper thread cores and packages / sockets
var numNonHT_int int64 = 0
var maxPhysicalPackageID int64 = 0
for i := range m.topology {
t := &m.topology[i]
@@ -167,11 +161,9 @@ func (m *CPUFreqCollector) Init(config []byte) error {
t.numNonHT = numNonHT
t.numNonHT_int = numNonHT_int
t.tagSet = map[string]string{
"type": "cpu",
"type-id": t.processor,
"num_core": t.numNonHT,
"package_id": t.physicalPackageID,
"num_package": t.numPhysicalPackages,
"type": "cpu",
"type-id": t.processor,
"package_id": t.physicalPackageID,
}
}
@@ -179,7 +171,8 @@ func (m *CPUFreqCollector) Init(config []byte) error {
return nil
}
func (m *CPUFreqCollector) Read(interval time.Duration, output chan lp.CCMetric) {
// Check if already initialized
if !m.init {
return
}
@@ -194,20 +187,23 @@ func (m *CPUFreqCollector) Read(interval time.Duration, out *[]lp.MutableMetric)
}
// Read current frequency
line, err := ioutil.ReadFile(t.scalingCurFreqFile)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to read file '%s': %v", t.scalingCurFreqFile, err))
continue
}
cpuFreq, err := strconv.ParseInt(strings.TrimSpace(string(line)), 10, 64)
if err != nil {
log.Printf("CPUFreqCollector.Read(): Failed to convert CPU frequency '%s': %v", line, err)
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to convert CPU frequency '%s' to int64: %v", line, err))
continue
}
y, err := lp.New("cpufreq", t.tagSet, map[string]interface{}{"value": cpuFreq}, now)
if err == nil {
*out = append(*out, y)
if y, err := lp.New("cpufreq", t.tagSet, m.meta, map[string]interface{}{"value": cpuFreq}, now); err == nil {
output <- y
}
}
}

View File

@@ -0,0 +1,11 @@
## `cpufreq` collector
```json
"cpufreq": {
"exclude_metrics": []
}
```
The `cpufreq` collector reads the clock frequency from `/sys/devices/system/cpu/cpu*/cpufreq` and outputs a handful of **cpu** metrics.
Metrics:
* `cpufreq`

View File

@@ -1,14 +1,16 @@
package collectors
import (
"bufio"
"encoding/json"
"fmt"
"io/ioutil"
"os"
"strconv"
"strings"
"time"
lp "github.com/influxdata/line-protocol"
cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)
const CPUSTATFILE = `/proc/stat`
@@ -18,72 +20,129 @@ type CpustatCollectorConfig struct {
}
type CpustatCollector struct {
metricCollector
config CpustatCollectorConfig
matches map[string]int
cputags map[string]map[string]string
nodetags map[string]string
num_cpus_metric lp.CCMetric
}
func (m *CpustatCollector) Init(config json.RawMessage) error {
m.name = "CpustatCollector"
m.setup()
m.meta = map[string]string{"source": m.name, "group": "CPU", "unit": "Percent"}
m.nodetags = map[string]string{"type": "node"}
if len(config) > 0 {
err := json.Unmarshal(config, &m.config)
if err != nil {
return err
}
}
matches := map[string]int{
"cpu_user": 1,
"cpu_nice": 2,
"cpu_system": 3,
"cpu_idle": 4,
"cpu_iowait": 5,
"cpu_irq": 6,
"cpu_softirq": 7,
"cpu_steal": 8,
"cpu_guest": 9,
"cpu_guest_nice": 10,
}
m.matches = make(map[string]int)
for match, index := range matches {
doExclude := false
for _, exclude := range m.config.ExcludeMetrics {
if match == exclude {
doExclude = true
break
}
}
if !doExclude {
m.matches[match] = index
}
}
// Check input file
file, err := os.Open(string(CPUSTATFILE))
if err != nil {
cclog.ComponentError(m.name, err.Error())
return err
}
defer file.Close()
// Pre-generate tags for all CPUs
num_cpus := 0
m.cputags = make(map[string]map[string]string)
scanner := bufio.NewScanner(file)
for scanner.Scan() {
line := scanner.Text()
linefields := strings.Fields(line)
if strings.HasPrefix(linefields[0], "cpu") && strings.Compare(linefields[0], "cpu") != 0 {
cpustr := strings.TrimLeft(linefields[0], "cpu")
cpu, _ := strconv.Atoi(cpustr)
m.cputags[linefields[0]] = map[string]string{"type": "cpu", "type-id": fmt.Sprintf("%d", cpu)}
num_cpus++
}
}
m.init = true
return nil
}
func (m *CpustatCollector) parseStatLine(linefields []string, tags map[string]string, output chan lp.CCMetric) {
values := make(map[string]float64)
total := 0.0
for match, index := range m.matches {
if len(match) > 0 {
x, err := strconv.ParseInt(linefields[index], 0, 64)
if err == nil {
values[match] = float64(x)
total += values[match]
}
}
}
t := time.Now()
for name, value := range values {
y, err := lp.New(name, tags, m.meta, map[string]interface{}{"value": (value * 100.0) / total}, t)
if err == nil {
output <- y
}
}
}
func (m *CpustatCollector) Read(interval time.Duration, output chan lp.CCMetric) {
if !m.init {
return
}
num_cpus := 0
file, err := os.Open(string(CPUSTATFILE))
if err != nil {
cclog.ComponentError(m.name, err.Error())
return
}
defer file.Close()
scanner := bufio.NewScanner(file)
for scanner.Scan() {
line := scanner.Text()
linefields := strings.Fields(line)
if strings.Compare(linefields[0], "cpu") == 0 {
m.parseStatLine(linefields, m.nodetags, output)
} else if strings.HasPrefix(linefields[0], "cpu") {
m.parseStatLine(linefields, m.cputags[linefields[0]], output)
num_cpus++
}
}
num_cpus_metric, err := lp.New("num_cpus",
m.nodetags,
m.meta,
map[string]interface{}{"value": int(num_cpus)},
time.Now(),
)
if err == nil {
output <- num_cpus_metric
}
}

View File

@@ -0,0 +1,23 @@
## `cpustat` collector
```json
"cpustat": {
"exclude_metrics": [
"cpu_idle"
]
}
```
The `cpustat` collector reads data from `/proc/stat` and outputs a handful of **node** and **hwthread** metrics. The values are reported as percentages of the total CPU time. If a metric is not required, it can be excluded from forwarding to the sink.
Metrics:
* `cpu_user`
* `cpu_nice`
* `cpu_system`
* `cpu_idle`
* `cpu_iowait`
* `cpu_irq`
* `cpu_softirq`
* `cpu_steal`
* `cpu_guest`
* `cpu_guest_nice`

View File

@@ -9,7 +9,13 @@ import (
"strings"
"time"
<<<<<<< HEAD
lp "github.com/influxdata/line-protocol"
=======
ccmetric "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
influx "github.com/influxdata/line-protocol"
>>>>>>> develop
)
const CUSTOMCMDPATH = `/home/unrz139/Work/cc-metric-collector/collectors/custom`
@@ -21,17 +27,18 @@ type CustomCmdCollectorConfig struct {
}
type CustomCmdCollector struct {
metricCollector
handler *influx.MetricHandler
parser *influx.Parser
config CustomCmdCollectorConfig
commands []string
files []string
}
func (m *CustomCmdCollector) Init(config json.RawMessage) error {
var err error
m.name = "CustomCmdCollector"
m.meta = map[string]string{"source": m.name, "group": "Custom"}
if len(config) > 0 {
err = json.Unmarshal(config, &m.config)
if err != nil {
@@ -61,8 +68,8 @@ func (m *CustomCmdCollector) Init(config []byte) error {
if len(m.files) == 0 && len(m.commands) == 0 {
return errors.New("No metrics to collect")
}
m.handler = influx.NewMetricHandler()
m.parser = influx.NewParser(m.handler)
m.parser.SetTimeFunc(DefaultTime)
m.init = true
return nil
@@ -72,7 +79,7 @@ var DefaultTime = func() time.Time {
return time.Unix(42, 0)
}
func (m *CustomCmdCollector) Read(interval time.Duration, output chan lp.CCMetric) {
if !m.init {
return
}
@@ -95,9 +102,10 @@ func (m *CustomCmdCollector) Read(interval time.Duration, out *[]lp.MutableMetri
if skip {
continue
}
y := ccmetric.FromInfluxMetric(c)
if err == nil {
*out = append(*out, y)
output <- y
}
}
}
@@ -117,9 +125,9 @@ func (m *CustomCmdCollector) Read(interval time.Duration, out *[]lp.MutableMetri
if skip {
continue
}
y := ccmetric.FromInfluxMetric(f)
if err == nil {
*out = append(*out, y)
output <- y
}
}
}

View File

@@ -0,0 +1,20 @@
## `customcmd` collector
```json
"customcmd": {
"exclude_metrics": [
"mymetric"
],
"files" : [
"/var/run/myapp.metrics"
],
"commands" : [
"/usr/local/bin/getmetrics.pl"
]
}
```
The `customcmd` collector reads data from files and from the output of executed commands. The files and commands can output multiple metrics (separated by newlines), but they have to be in the [InfluxDB line protocol](https://docs.influxdata.com/influxdb/cloud/reference/syntax/line-protocol/). If a metric is not parsable, it is skipped. If a metric is not required, it can be excluded from forwarding to the sink.
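For illustration, a file or command could emit lines like these (hypothetical metric names; measurement, tags, a single `value` field and an optional timestamp):
```
myapp_bandwidth,type=node,device=eth0 value=12.5 1660000000000000000
myapp_requests,type=node value=42i 1660000000000000000
```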

View File

@@ -1,113 +1,111 @@
package collectors
import (
"io/ioutil"
lp "github.com/influxdata/line-protocol"
// "log"
"bufio"
"encoding/json"
"errors"
"strconv"
"fmt"
"os"
"strings"
"syscall"
"time"
cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)
const MOUNTFILE = `/proc/self/mounts`
type DiskstatCollectorConfig struct {
ExcludeMetrics []string `json:"exclude_metrics,omitempty"`
}
type DiskstatCollector struct {
metricCollector
//matches map[string]int
config IOstatCollectorConfig
//devices map[string]IOstatCollectorEntry
}
func (m *DiskstatCollector) Init(config json.RawMessage) error {
m.name = "DiskstatCollector"
m.meta = map[string]string{"source": m.name, "group": "Disk"}
m.setup()
if len(config) > 0 {
err = json.Unmarshal(config, &m.config)
err := json.Unmarshal(config, &m.config)
if err != nil {
return err
}
}
file, err := os.Open(string(MOUNTFILE))
if err != nil {
cclog.ComponentError(m.name, err.Error())
return err
}
defer file.Close()
m.init = true
return nil
}
func (m *DiskstatCollector) Read(interval time.Duration, output chan lp.CCMetric) {
if !m.init {
return
}
file, err := os.Open(string(MOUNTFILE))
if err != nil {
cclog.ComponentError(m.name, err.Error())
return
}
defer file.Close()
part_max_used := uint64(0)
scanner := bufio.NewScanner(file)
for scanner.Scan() {
line := scanner.Text()
if len(line) == 0 {
continue
}
if !strings.HasPrefix(line, "/dev") {
continue
}
linefields := strings.Fields(line)
if strings.Contains(linefields[0], "loop") {
continue
}
if strings.Contains(linefields[1], "boot") {
continue
}
path := strings.Replace(linefields[1], `\040`, " ", -1)
stat := syscall.Statfs_t{}
err := syscall.Statfs(path, &stat)
if err != nil {
fmt.Println(err.Error())
return
}
tags := map[string]string{"type": "node", "device": linefields[0]}
total := (stat.Blocks * uint64(stat.Bsize)) / uint64(1000000000)
y, err := lp.New("disk_total", tags, m.meta, map[string]interface{}{"value": total}, time.Now())
if err == nil {
y.AddMeta("unit", "GBytes")
output <- y
}
free := (stat.Bfree * uint64(stat.Bsize)) / uint64(1000000000)
y, err = lp.New("disk_free", tags, m.meta, map[string]interface{}{"value": free}, time.Now())
if err == nil {
y.AddMeta("unit", "GBytes")
output <- y
}
perc := (100 * (total - free)) / total
if perc > part_max_used {
part_max_used = perc
}
}
y, err := lp.New("part_max_used", map[string]string{"type": "node"}, m.meta, map[string]interface{}{"value": part_max_used}, time.Now())
if err == nil {
y.AddMeta("unit", "percent")
output <- y
}
}

View File

@@ -0,0 +1,21 @@
## `diskstat` collector
```json
"diskstat": {
"exclude_metrics": [
"disk_total"
]
}
```
The `diskstat` collector reads data from `/proc/self/mounts` and outputs a handful of **node** metrics. If a metric is not required, it can be excluded from forwarding to the sink.
Metrics per device (with `device` tag):
* `disk_total` (unit `GBytes`)
* `disk_free` (unit `GBytes`)
Global metrics:
* `part_max_used` (unit `percent`)

View File

@@ -7,24 +7,32 @@ import (
"fmt"
"io/ioutil"
"log"
"os"
"os/exec"
"os/user"
"strconv"
"strings"
"time"
lp "github.com/influxdata/line-protocol"
cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)
type GpfsCollector struct {
metricCollector
tags map[string]string
config struct {
Mmpmon string `json:"mmpmon"`
Mmpmon string `json:"mmpmon_path,omitempty"`
ExcludeFilesystem []string `json:"exclude_filesystem,omitempty"`
}
skipFS map[string]struct{}
}
func (m *GpfsCollector) Init(config json.RawMessage) error {
// Check if already initialized
if m.init {
return nil
}
var err error
m.name = "GpfsCollector"
m.setup()
@@ -40,27 +48,40 @@ func (m *GpfsCollector) Init(config []byte) error {
return err
}
}
m.meta = map[string]string{
"source": m.name,
"group": "GPFS",
}
m.tags = map[string]string{
"type": "node",
"filesystem": "",
}
m.skipFS = make(map[string]struct{})
for _, fs := range m.config.ExcludeFilesystem {
m.skipFS[fs] = struct{}{}
}
// GPFS / IBM Spectrum Scale file system statistics can only be queried by user root
user, err := user.Current()
if err != nil {
return fmt.Errorf("GpfsCollector.Init(): Failed to get current user: %v", err)
return fmt.Errorf("Failed to get current user: %v", err)
}
if user.Uid != "0" {
return fmt.Errorf("GpfsCollector.Init(): GPFS file system statistics can only be queried by user root")
return fmt.Errorf("GPFS file system statistics can only be queried by user root")
}
// Check if mmpmon is in executable search path
_, err = exec.LookPath(m.config.Mmpmon)
if err != nil {
return fmt.Errorf("GpfsCollector.Init(): Failed to find mmpmon binary '%s': %v", m.config.Mmpmon, err)
return fmt.Errorf("Failed to find mmpmon binary '%s': %v", m.config.Mmpmon, err)
}
m.init = true
return nil
}
func (m *GpfsCollector) Read(interval time.Duration, output chan lp.CCMetric) {
// Check if already initialized
if !m.init {
return
}
@@ -77,12 +98,15 @@ func (m *GpfsCollector) Read(interval time.Duration, out *[]lp.MutableMetric) {
cmd.Stderr = cmdStderr
err := cmd.Run()
if err != nil {
fmt.Fprintf(os.Stderr, "GpfsCollector.Read(): Failed to execute command \"%s\": %s\n", cmd.String(), err.Error())
fmt.Fprintf(os.Stderr, "GpfsCollector.Read(): command exit code: \"%d\"\n", cmd.ProcessState.ExitCode())
data, _ := ioutil.ReadAll(cmdStderr)
fmt.Fprintf(os.Stderr, "GpfsCollector.Read(): command stderr: \"%s\"\n", string(data))
data, _ = ioutil.ReadAll(cmdStdout)
fmt.Fprintf(os.Stderr, "GpfsCollector.Read(): command stdout: \"%s\"\n", string(data))
dataStdErr, _ := ioutil.ReadAll(cmdStderr)
dataStdOut, _ := ioutil.ReadAll(cmdStdout)
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to execute command \"%s\": %v\n", cmd.String(), err),
fmt.Sprintf("Read(): command exit code: \"%d\"\n", cmd.ProcessState.ExitCode()),
fmt.Sprintf("Read(): command stderr: \"%s\"\n", string(dataStdErr)),
fmt.Sprintf("Read(): command stdout: \"%s\"\n", string(dataStdOut)),
)
return
}
@@ -90,194 +114,163 @@ func (m *GpfsCollector) Read(interval time.Duration, out *[]lp.MutableMetric) {
scanner := bufio.NewScanner(cmdStdout)
for scanner.Scan() {
lineSplit := strings.Fields(scanner.Text())
if lineSplit[0] == "_fs_io_s_" {
key_value := make(map[string]string)
for i := 1; i < len(lineSplit); i += 2 {
key_value[lineSplit[i]] = lineSplit[i+1]
}
// Ignore keys:
// _n_: node IP address,
// _nn_: node name,
// _cl_: cluster name,
// _d_: number of disks
// Only process lines starting with _fs_io_s_
if lineSplit[0] != "_fs_io_s_" {
continue
}
filesystem, ok := key_value["_fs_"]
if !ok {
fmt.Fprintf(os.Stderr, "GpfsCollector.Read(): Failed to get filesystem name.\n")
continue
}
key_value := make(map[string]string)
for i := 1; i < len(lineSplit); i += 2 {
key_value[lineSplit[i]] = lineSplit[i+1]
}
// Ignore keys:
// _n_: node IP address,
// _nn_: node name,
// _cl_: cluster name,
// _d_: number of disks
filesystem, ok := key_value["_fs_"]
if !ok {
cclog.ComponentError(
m.name,
"Read(): Failed to get filesystem name.")
continue
}
// Skip excluded filesystems
if _, skip := m.skipFS[filesystem]; skip {
continue
}
m.tags["filesystem"] = filesystem
// return code
rc, err := strconv.Atoi(key_value["_rc_"])
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to convert return code '%s' to int: %v", key_value["_rc_"], err))
continue
}
if rc != 0 {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Filesystem '%s' is not ok.", filesystem))
continue
}
sec, err := strconv.ParseInt(key_value["_t_"], 10, 64)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to convert seconds '%s' to int64: %v", key_value["_t_"], err))
continue
}
msec, err := strconv.ParseInt(key_value["_tu_"], 10, 64)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to convert micro seconds '%s' to int64: %v", key_value["_tu_"], err))
continue
}
timestamp := time.Unix(sec, msec*1000)
// bytes read
bytesRead, err := strconv.ParseInt(key_value["_br_"], 10, 64)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to convert bytes read '%s' to int64: %v", key_value["_br_"], err))
continue
}
if y, err := lp.New("gpfs_bytes_read", m.tags, m.meta, map[string]interface{}{"value": bytesRead}, timestamp); err == nil {
output <- y
}
// bytes written
bytesWritten, err := strconv.ParseInt(key_value["_bw_"], 10, 64)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to convert bytes written '%s' to int64: %v", key_value["_bw_"], err))
continue
}
if y, err := lp.New("gpfs_bytes_written", m.tags, m.meta, map[string]interface{}{"value": bytesWritten}, timestamp); err == nil {
output <- y
}
// number of opens
numOpens, err := strconv.ParseInt(key_value["_oc_"], 10, 64)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to convert number of opens '%s' to int64: %v", key_value["_oc_"], err))
continue
}
if y, err := lp.New("gpfs_num_opens", m.tags, m.meta, map[string]interface{}{"value": numOpens}, timestamp); err == nil {
output <- y
}
// number of closes
numCloses, err := strconv.ParseInt(key_value["_cc_"], 10, 64)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to convert number of closes: '%s' to int64: %v", key_value["_cc_"], err))
continue
}
if y, err := lp.New("gpfs_num_closes", m.tags, m.meta, map[string]interface{}{"value": numCloses}, timestamp); err == nil {
output <- y
}
// number of reads
numReads, err := strconv.ParseInt(key_value["_rdc_"], 10, 64)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to convert number of reads: '%s' to int64: %v", key_value["_rdc_"], err))
continue
}
if y, err := lp.New("gpfs_num_reads", m.tags, m.meta, map[string]interface{}{"value": numReads}, timestamp); err == nil {
output <- y
}
// number of writes
numWrites, err := strconv.ParseInt(key_value["_wc_"], 10, 64)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to convert number of writes: '%s' to int64: %v", key_value["_wc_"], err))
continue
}
if y, err := lp.New("gpfs_num_writes", m.tags, m.meta, map[string]interface{}{"value": numWrites}, timestamp); err == nil {
output <- y
}
// number of read directories
numReaddirs, err := strconv.ParseInt(key_value["_dir_"], 10, 64)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to convert number of read directories: '%s' to int64: %v", key_value["_dir_"], err))
continue
}
if y, err := lp.New("gpfs_num_readdirs", m.tags, m.meta, map[string]interface{}{"value": numReaddirs}, timestamp); err == nil {
output <- y
}
// Number of inode updates
numInodeUpdates, err := strconv.ParseInt(key_value["_iu_"], 10, 64)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to convert number of inode updates: '%s' to int: %v", key_value["_iu_"], err))
continue
}
if y, err := lp.New("gpfs_num_inode_updates", m.tags, m.meta, map[string]interface{}{"value": numInodeUpdates}, timestamp); err == nil {
output <- y
}
}
}

collectors/gpfsMetric.md
View File

@@ -0,0 +1,30 @@
## `gpfs` collector
```json
"ibstat": {
"mmpmon_path": "/path/to/mmpmon",
"exclude_filesystem": [
"fs1"
]
}
```
The `gpfs` collector uses the `mmpmon` command to read performance metrics for
GPFS / IBM Spectrum Scale filesystems.
The reported filesystems can be filtered with the `exclude_filesystem` option
in the configuration.
The path to the `mmpmon` command can be configured with the `mmpmon_path` option
in the configuration.
Metrics:
* `gpfs_bytes_read`
* `gpfs_bytes_written`
* `gpfs_num_opens`
* `gpfs_num_closes`
* `gpfs_num_reads`
* `gpfs_num_writes`
* `gpfs_num_readdirs`
* `gpfs_num_inode_updates`
The collector adds a `filesystem` tag to all metrics.

View File

@@ -3,282 +3,168 @@ package collectors
import (
"fmt"
"io/ioutil"
"log"
"os/exec"
"os"
lp "github.com/influxdata/line-protocol"
cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
"golang.org/x/sys/unix"
// "os"
"encoding/json"
"errors"
"path/filepath"
"strconv"
"strings"
"time"
)
const IB_BASEPATH = `/sys/class/infiniband/`
type InfinibandCollectorInfo struct {
LID string // IB local Identifier (LID)
device string // IB device
port string // IB device port
portCounterFiles map[string]string // mapping counter name -> sysfs file
tagSet map[string]string // corresponding tag list
}
type InfinibandCollector struct {
metricCollector
config struct {
ExcludeDevices []string `json:"exclude_devices,omitempty"` // IB device to exclude e.g. mlx5_0
}
info []*InfinibandCollectorInfo
}
func (m *InfinibandCollector) Help() {
fmt.Println("This collector includes all devices that can be found below ", IBBASEPATH)
fmt.Println("and where any of the ports provides a 'lid' file (glob ", IBBASEPATH, "/<dev>/ports/<port>/lid).")
fmt.Println("The devices can be filtered with the 'exclude_devices' option in the configuration.")
fmt.Println("For each found LIDs the collector calls the 'perfquery' command")
fmt.Println("The path to the 'perfquery' command can be configured with the 'perfquery_path' option")
fmt.Println("in the configuration")
fmt.Println("")
fmt.Println("Full configuration object:")
fmt.Println("\"ibstat\" : {")
fmt.Println(" \"perfquery_path\" : \"path/to/perfquery\" # if omitted, it searches in $PATH")
fmt.Println(" \"exclude_devices\" : [\"dev1\"]")
fmt.Println("}")
fmt.Println("")
fmt.Println("Metrics:")
fmt.Println("- ib_recv")
fmt.Println("- ib_xmit")
fmt.Println("- ib_recv_pkts")
fmt.Println("- ib_xmit_pkts")
}
// Init initializes the Infiniband collector by walking through files below IB_BASEPATH
func (m *InfinibandCollector) Init(config json.RawMessage) error {
// Check if already initialized
if m.init {
return nil
}
func (m *InfinibandCollector) Init(config []byte) error {
var err error
m.name = "InfinibandCollector"
m.use_perfquery = false
m.setup()
m.tags = map[string]string{"type": "node"}
m.meta = map[string]string{
"source": m.name,
"group": "Network",
}
if len(config) > 0 {
err = json.Unmarshal(config, &m.config)
if err != nil {
return err
}
}
if len(m.config.PerfQueryPath) == 0 {
path, err := exec.LookPath("perfquery")
if err == nil {
m.config.PerfQueryPath = path
}
}
m.lids = make(map[string]map[string]string)
p := fmt.Sprintf("%s/*/ports/*/lid", string(IBBASEPATH))
files, err := filepath.Glob(p)
for _, f := range files {
lid, err := ioutil.ReadFile(f)
if err == nil {
plist := strings.Split(strings.Replace(f, string(IBBASEPATH), "", -1), "/")
skip := false
for _, d := range m.config.ExcludeDevices {
if d == plist[0] {
skip = true
}
}
if !skip {
m.lids[plist[0]] = make(map[string]string)
m.lids[plist[0]][plist[2]] = string(lid)
}
}
}
for _, ports := range m.lids {
for port, lid := range ports {
args := fmt.Sprintf("-r %s %s 0xf000", lid, port)
command := exec.Command(m.config.PerfQueryPath, args)
command.Wait()
_, err := command.Output()
if err == nil {
m.use_perfquery = true
}
break
}
break
}
if len(m.lids) > 0 {
m.init = true
} else {
err = errors.New("No usable devices")
}
return err
}
func DoPerfQuery(cmd string, dev string, lid string, port string, tags map[string]string, out *[]lp.MutableMetric) error {
args := fmt.Sprintf("-r %s %s 0xf000", lid, port)
command := exec.Command(cmd, args)
command.Wait()
stdout, err := command.Output()
// Loop for all InfiniBand directories
globPattern := filepath.Join(IB_BASEPATH, "*", "ports", "*")
ibDirs, err := filepath.Glob(globPattern)
if err != nil {
log.Print(err)
return err
return fmt.Errorf("Unable to glob files with pattern %s: %v", globPattern, err)
}
if ibDirs == nil {
return fmt.Errorf("Unable to find any directories with pattern %s", globPattern)
}
ll := strings.Split(string(stdout), "\n")
for _, line := range ll {
if strings.HasPrefix(line, "PortRcvData") || strings.HasPrefix(line, "RcvData") {
lv := strings.Fields(line)
v, err := strconv.ParseFloat(lv[1], 64)
if err == nil {
y, err := lp.New("ib_recv", tags, map[string]interface{}{"value": float64(v)}, time.Now())
if err == nil {
*out = append(*out, y)
}
for _, path := range ibDirs {
// Skip, when no LID is assigned
line, err := ioutil.ReadFile(filepath.Join(path, "lid"))
if err != nil {
continue
}
LID := strings.TrimSpace(string(line))
if LID == "0x0" {
continue
}
// Get device and port component
pathSplit := strings.Split(path, string(os.PathSeparator))
device := pathSplit[4]
port := pathSplit[6]
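// Example: "/sys/class/infiniband/mlx5_0/ports/1" splits into
// ["", "sys", "class", "infiniband", "mlx5_0", "ports", "1"],
// so index 4 holds the device and index 6 the port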
// Skip excluded devices
skip := false
for _, excludedDevice := range m.config.ExcludeDevices {
if excludedDevice == device {
skip = true
break
}
}
if strings.HasPrefix(line, "PortXmitData") || strings.HasPrefix(line, "XmtData") {
lv := strings.Fields(line)
v, err := strconv.ParseFloat(lv[1], 64)
if err == nil {
y, err := lp.New("ib_xmit", tags, map[string]interface{}{"value": float64(v)}, time.Now())
if err == nil {
*out = append(*out, y)
}
}
}
if strings.HasPrefix(line, "PortRcvPkts") || strings.HasPrefix(line, "RcvPkts") {
lv := strings.Fields(line)
v, err := strconv.ParseFloat(lv[1], 64)
if err == nil {
y, err := lp.New("ib_recv_pkts", tags, map[string]interface{}{"value": float64(v)}, time.Now())
if err == nil {
*out = append(*out, y)
}
}
}
if strings.HasPrefix(line, "PortXmitPkts") || strings.HasPrefix(line, "XmtPkts") {
lv := strings.Fields(line)
v, err := strconv.ParseFloat(lv[1], 64)
if err == nil {
y, err := lp.New("ib_xmit_pkts", tags, map[string]interface{}{"value": float64(v)}, time.Now())
if err == nil {
*out = append(*out, y)
}
if skip {
continue
}
// Check access to counter files
countersDir := filepath.Join(path, "counters")
portCounterFiles := map[string]string{
"ib_recv": filepath.Join(countersDir, "port_rcv_data"),
"ib_xmit": filepath.Join(countersDir, "port_xmit_data"),
"ib_recv_pkts": filepath.Join(countersDir, "port_rcv_packets"),
"ib_xmit_pkts": filepath.Join(countersDir, "port_xmit_packets"),
}
for _, counterFile := range portCounterFiles {
err := unix.Access(counterFile, unix.R_OK)
if err != nil {
return fmt.Errorf("Unable to access %s: %v", counterFile, err)
}
}
m.info = append(m.info,
&InfinibandCollectorInfo{
LID: LID,
device: device,
port: port,
portCounterFiles: portCounterFiles,
tagSet: map[string]string{
"type": "node",
"device": device,
"port": port,
"lid": LID,
},
})
}
if len(m.info) == 0 {
return fmt.Errorf("Found no IB devices")
}
m.init = true
return nil
}
func DoSysfsRead(dev string, lid string, port string, tags map[string]string, out *[]lp.MutableMetric) error {
path := fmt.Sprintf("%s/%s/ports/%s/counters/", string(IBBASEPATH), dev, port)
buffer, err := ioutil.ReadFile(fmt.Sprintf("%s/port_rcv_data", path))
if err == nil {
data := strings.Replace(string(buffer), "\n", "", -1)
v, err := strconv.ParseFloat(data, 64)
if err == nil {
y, err := lp.New("ib_recv", tags, map[string]interface{}{"value": float64(v)}, time.Now())
if err == nil {
*out = append(*out, y)
}
}
}
buffer, err = ioutil.ReadFile(fmt.Sprintf("%s/port_xmit_data", path))
if err == nil {
data := strings.Replace(string(buffer), "\n", "", -1)
v, err := strconv.ParseFloat(data, 64)
if err == nil {
y, err := lp.New("ib_xmit", tags, map[string]interface{}{"value": float64(v)}, time.Now())
if err == nil {
*out = append(*out, y)
}
}
}
buffer, err = ioutil.ReadFile(fmt.Sprintf("%s/port_rcv_packets", path))
if err == nil {
data := strings.Replace(string(buffer), "\n", "", -1)
v, err := strconv.ParseFloat(data, 64)
if err == nil {
y, err := lp.New("ib_recv_pkts", tags, map[string]interface{}{"value": float64(v)}, time.Now())
if err == nil {
*out = append(*out, y)
}
}
}
buffer, err = ioutil.ReadFile(fmt.Sprintf("%s/port_xmit_packets", path))
if err == nil {
data := strings.Replace(string(buffer), "\n", "", -1)
v, err := strconv.ParseFloat(data, 64)
if err == nil {
y, err := lp.New("ib_xmit_pkts", tags, map[string]interface{}{"value": float64(v)}, time.Now())
if err == nil {
*out = append(*out, y)
}
}
}
return nil
}
// Read reads Infiniband counter files below IB_BASEPATH
func (m *InfinibandCollector) Read(interval time.Duration, output chan lp.CCMetric) {
func (m *InfinibandCollector) Read(interval time.Duration, out *[]lp.MutableMetric) {
if m.init {
for dev, ports := range m.lids {
for port, lid := range ports {
tags := map[string]string{"type": "node", "device": dev, "port": port}
if m.use_perfquery {
DoPerfQuery(m.config.PerfQueryPath, dev, lid, port, tags, out)
} else {
DoSysfsRead(dev, lid, port, tags, out)
}
}
}
// Check if already initialized
if !m.init {
return
}
// buffer, err := ioutil.ReadFile(string(LIDFILE))
now := time.Now()
for _, info := range m.info {
for counterName, counterFile := range info.portCounterFiles {
line, err := ioutil.ReadFile(counterFile)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to read from file '%s': %v", counterFile, err))
continue
}
data := strings.TrimSpace(string(line))
v, err := strconv.ParseInt(data, 10, 64)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to convert Infininiband metrice %s='%s' to int64: %v", counterName, data, err))
continue
}
if y, err := lp.New(counterName, info.tagSet, m.meta, map[string]interface{}{"value": v}, now); err == nil {
output <- y
}
}
// if err != nil {
// log.Print(err)
// return
// }
// args := fmt.Sprintf("-r %s 1 0xf000", string(buffer))
// command := exec.Command(PERFQUERY, args)
// command.Wait()
// stdout, err := command.Output()
// if err != nil {
// log.Print(err)
// return
// }
// ll := strings.Split(string(stdout), "\n")
// for _, line := range ll {
// if strings.HasPrefix(line, "PortRcvData") || strings.HasPrefix(line, "RcvData") {
// lv := strings.Fields(line)
// v, err := strconv.ParseFloat(lv[1], 64)
// if err == nil {
// y, err := lp.New("ib_recv", m.tags, map[string]interface{}{"value": float64(v)}, time.Now())
// if err == nil {
// *out = append(*out, y)
// }
// }
// }
// if strings.HasPrefix(line, "PortXmitData") || strings.HasPrefix(line, "XmtData") {
// lv := strings.Fields(line)
// v, err := strconv.ParseFloat(lv[1], 64)
// if err == nil {
// y, err := lp.New("ib_xmit", m.tags, map[string]interface{}{"value": float64(v)}, time.Now())
// if err == nil {
// *out = append(*out, y)
// }
// }
// }
// }
}
}
func (m *InfinibandCollector) Close() {


@@ -0,0 +1,26 @@
## `ibstat` collector
```json
"ibstat": {
"exclude_devices": [
"mlx4"
]
}
```
The `ibstat` collector includes all InfiniBand devices that can be
found below `/sys/class/infiniband/` and where any of the ports provides a
LID file (`/sys/class/infiniband/<dev>/ports/<port>/lid`).
The devices can be filtered with the `exclude_devices` option in the configuration.
For each found LID the collector reads data through the sysfs files below `/sys/class/infiniband/<device>`.
Metrics:
* `ib_recv`
* `ib_xmit`
* `ib_recv_pkts`
* `ib_xmit_pkts`
The collector adds the tags `device`, `port`, and `lid` to all metrics.
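For orientation, the sysfs files the collector reads typically look like this (device and port names are examples):
```
/sys/class/infiniband/mlx5_0/ports/1/lid
/sys/class/infiniband/mlx5_0/ports/1/counters/port_rcv_data
/sys/class/infiniband/mlx5_0/ports/1/counters/port_xmit_data
/sys/class/infiniband/mlx5_0/ports/1/counters/port_rcv_packets
/sys/class/infiniband/mlx5_0/ports/1/counters/port_xmit_packets
```
Ports whose `lid` file contains `0x0` are skipped.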


@@ -0,0 +1,232 @@
package collectors
import (
"fmt"
"io/ioutil"
"log"
"os/exec"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
// "os"
"encoding/json"
"errors"
"path/filepath"
"strconv"
"strings"
"time"
)
const PERFQUERY = `/usr/sbin/perfquery`
type InfinibandPerfQueryCollector struct {
metricCollector
tags map[string]string
lids map[string]map[string]string
config struct {
ExcludeDevices []string `json:"exclude_devices,omitempty"`
PerfQueryPath string `json:"perfquery_path"`
}
}
func (m *InfinibandPerfQueryCollector) Init(config json.RawMessage) error {
var err error
m.name = "InfinibandCollectorPerfQuery"
m.setup()
m.meta = map[string]string{"source": m.name, "group": "Network"}
m.tags = map[string]string{"type": "node"}
if len(config) > 0 {
err = json.Unmarshal(config, &m.config)
if err != nil {
return err
}
}
if len(m.config.PerfQueryPath) == 0 {
path, err := exec.LookPath("perfquery")
if err == nil {
m.config.PerfQueryPath = path
}
}
m.lids = make(map[string]map[string]string)
p := fmt.Sprintf("%s/*/ports/*/lid", string(IB_BASEPATH))
files, err := filepath.Glob(p)
if err != nil {
return err
}
for _, f := range files {
lid, err := ioutil.ReadFile(f)
if err == nil {
plist := strings.Split(strings.Replace(f, string(IB_BASEPATH), "", -1), "/")
skip := false
for _, d := range m.config.ExcludeDevices {
if d == plist[0] {
skip = true
}
}
if !skip {
m.lids[plist[0]] = make(map[string]string)
m.lids[plist[0]][plist[2]] = string(lid)
}
}
}
for _, ports := range m.lids {
for port, lid := range ports {
args := fmt.Sprintf("-r %s %s 0xf000", lid, port)
command := exec.Command(m.config.PerfQueryPath, args)
command.Wait()
_, err := command.Output()
if err != nil {
return fmt.Errorf("Failed to execute %s: %v", m.config.PerfQueryPath, err)
}
}
}
if len(m.lids) == 0 {
return errors.New("No usable IB devices")
}
m.init = true
return nil
}
func (m *InfinibandPerfQueryCollector) doPerfQuery(cmd string, dev string, lid string, port string, tags map[string]string, output chan lp.CCMetric) error {
args := fmt.Sprintf("-r %s %s 0xf000", lid, port)
command := exec.Command(cmd, args)
command.Wait()
stdout, err := command.Output()
if err != nil {
log.Print(err)
return err
}
ll := strings.Split(string(stdout), "\n")
for _, line := range ll {
if strings.HasPrefix(line, "PortRcvData") || strings.HasPrefix(line, "RcvData") {
lv := strings.Fields(line)
v, err := strconv.ParseFloat(lv[1], 64)
if err == nil {
y, err := lp.New("ib_recv", tags, m.meta, map[string]interface{}{"value": float64(v)}, time.Now())
if err == nil {
output <- y
}
}
}
if strings.HasPrefix(line, "PortXmitData") || strings.HasPrefix(line, "XmtData") {
lv := strings.Fields(line)
v, err := strconv.ParseFloat(lv[1], 64)
if err == nil {
y, err := lp.New("ib_xmit", tags, m.meta, map[string]interface{}{"value": float64(v)}, time.Now())
if err == nil {
output <- y
}
}
}
if strings.HasPrefix(line, "PortRcvPkts") || strings.HasPrefix(line, "RcvPkts") {
lv := strings.Fields(line)
v, err := strconv.ParseFloat(lv[1], 64)
if err == nil {
y, err := lp.New("ib_recv_pkts", tags, m.meta, map[string]interface{}{"value": float64(v)}, time.Now())
if err == nil {
output <- y
}
}
}
if strings.HasPrefix(line, "PortXmitPkts") || strings.HasPrefix(line, "XmtPkts") {
lv := strings.Fields(line)
v, err := strconv.ParseFloat(lv[1], 64)
if err == nil {
y, err := lp.New("ib_xmit_pkts", tags, m.meta, map[string]interface{}{"value": float64(v)}, time.Now())
if err == nil {
output <- y
}
}
}
if strings.HasPrefix(line, "PortRcvPkts") || strings.HasPrefix(line, "RcvPkts") {
lv := strings.Fields(line)
v, err := strconv.ParseFloat(lv[1], 64)
if err == nil {
y, err := lp.New("ib_recv_pkts", tags, m.meta, map[string]interface{}{"value": float64(v)}, time.Now())
if err == nil {
output <- y
}
}
}
if strings.HasPrefix(line, "PortXmitPkts") || strings.HasPrefix(line, "XmtPkts") {
lv := strings.Fields(line)
v, err := strconv.ParseFloat(lv[1], 64)
if err == nil {
y, err := lp.New("ib_xmit_pkts", tags, m.meta, map[string]interface{}{"value": float64(v)}, time.Now())
if err == nil {
output <- y
}
}
}
}
return nil
}
func (m *InfinibandPerfQueryCollector) Read(interval time.Duration, output chan lp.CCMetric) {
if m.init {
for dev, ports := range m.lids {
for port, lid := range ports {
tags := map[string]string{
"type": "node",
"device": dev,
"port": port,
"lid": lid}
path := fmt.Sprintf("%s/%s/ports/%s/counters/", string(IB_BASEPATH), dev, port)
buffer, err := ioutil.ReadFile(fmt.Sprintf("%s/port_rcv_data", path))
if err == nil {
data := strings.Replace(string(buffer), "\n", "", -1)
v, err := strconv.ParseFloat(data, 64)
if err == nil {
y, err := lp.New("ib_recv", tags, m.meta, map[string]interface{}{"value": float64(v)}, time.Now())
if err == nil {
output <- y
}
}
}
buffer, err = ioutil.ReadFile(fmt.Sprintf("%s/port_xmit_data", path))
if err == nil {
data := strings.Replace(string(buffer), "\n", "", -1)
v, err := strconv.ParseFloat(data, 64)
if err == nil {
y, err := lp.New("ib_xmit", tags, m.meta, map[string]interface{}{"value": float64(v)}, time.Now())
if err == nil {
output <- y
}
}
}
buffer, err = ioutil.ReadFile(fmt.Sprintf("%s/port_rcv_packets", path))
if err == nil {
data := strings.Replace(string(buffer), "\n", "", -1)
v, err := strconv.ParseFloat(data, 64)
if err == nil {
y, err := lp.New("ib_recv_pkts", tags, m.meta, map[string]interface{}{"value": float64(v)}, time.Now())
if err == nil {
output <- y
}
}
}
buffer, err = ioutil.ReadFile(fmt.Sprintf("%s/port_xmit_packets", path))
if err == nil {
data := strings.Replace(string(buffer), "\n", "", -1)
v, err := strconv.ParseFloat(data, 64)
if err == nil {
y, err := lp.New("ib_xmit_pkts", tags, m.meta, map[string]interface{}{"value": float64(v)}, time.Now())
if err == nil {
output <- y
}
}
}
}
}
}
}
func (m *InfinibandPerfQueryCollector) Close() {
m.init = false
}


@@ -0,0 +1,28 @@
## `ibstat_perfquery` collector
```json
"ibstat_perfquery": {
"perfquery_path": "/path/to/perfquery",
"exclude_devices": [
"mlx4"
]
}
```
The `ibstat_perfquery` collector includes all InfiniBand devices that can be
found below `/sys/class/infiniband/` and where any of the ports provides a
LID file (`/sys/class/infiniband/<dev>/ports/<port>/lid`).
The devices can be filtered with the `exclude_devices` option in the configuration.
For each found LID the collector calls the `perfquery` command; the path to the
`perfquery` command can be configured with the `perfquery_path` option in the configuration.
Metrics:
* `ib_recv`
* `ib_xmit`
* `ib_recv_pkts`
* `ib_xmit_pkts`
The collector adds the tags `device`, `port`, and `lid` to all metrics.
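The parser keys on the counter-name prefix of each output line and expects the name and value to be whitespace-separated. An illustrative dump it could handle (real `perfquery` output differs between versions) looks like:
```
PortXmitData 12345678
PortRcvData 23456789
PortXmitPkts 3456789
PortRcvPkts 4567890
```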

155
collectors/iostatMetric.go Normal file

@@ -0,0 +1,155 @@
package collectors
import (
"bufio"
"os"
cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
// "log"
"encoding/json"
"errors"
"strconv"
"strings"
"time"
)
const IOSTATFILE = `/proc/diskstats`
const IOSTAT_SYSFSPATH = `/sys/block`
type IOstatCollectorConfig struct {
ExcludeMetrics []string `json:"exclude_metrics,omitempty"`
}
type IOstatCollectorEntry struct {
lastValues map[string]int64
tags map[string]string
}
type IOstatCollector struct {
metricCollector
matches map[string]int
config IOstatCollectorConfig
devices map[string]IOstatCollectorEntry
}
func (m *IOstatCollector) Init(config json.RawMessage) error {
var err error
m.name = "IOstatCollector"
m.meta = map[string]string{"source": m.name, "group": "Disk"}
m.setup()
if len(config) > 0 {
err = json.Unmarshal(config, &m.config)
if err != nil {
return err
}
}
// https://www.kernel.org/doc/html/latest/admin-guide/iostats.html
matches := map[string]int{
"io_reads": 3,
"io_reads_merged": 4,
"io_read_sectors": 5,
"io_read_ms": 6,
"io_writes": 7,
"io_writes_merged": 8,
"io_writes_sectors": 9,
"io_writes_ms": 10,
"io_ioops": 11,
"io_ioops_ms": 12,
"io_ioops_weighted_ms": 13,
"io_discards": 14,
"io_discards_merged": 15,
"io_discards_sectors": 16,
"io_discards_ms": 17,
"io_flushes": 18,
"io_flushes_ms": 19,
}
m.devices = make(map[string]IOstatCollectorEntry)
m.matches = make(map[string]int)
for k, v := range matches {
if _, skip := stringArrayContains(m.config.ExcludeMetrics, k); !skip {
m.matches[k] = v
}
}
if len(m.matches) == 0 {
return errors.New("no metrics to collect")
}
file, err := os.Open(string(IOSTATFILE))
if err != nil {
cclog.ComponentError(m.name, err.Error())
return err
}
defer file.Close()
scanner := bufio.NewScanner(file)
for scanner.Scan() {
line := scanner.Text()
linefields := strings.Fields(line)
device := linefields[2]
if strings.Contains(device, "loop") {
continue
}
values := make(map[string]int64)
for m := range m.matches {
values[m] = 0
}
m.devices[device] = IOstatCollectorEntry{
tags: map[string]string{
"device": linefields[2],
"type": "node",
},
lastValues: values,
}
}
m.init = true
return err
}
func (m *IOstatCollector) Read(interval time.Duration, output chan lp.CCMetric) {
if !m.init {
return
}
file, err := os.Open(string(IOSTATFILE))
if err != nil {
cclog.ComponentError(m.name, err.Error())
return
}
defer file.Close()
scanner := bufio.NewScanner(file)
for scanner.Scan() {
line := scanner.Text()
if len(line) == 0 {
continue
}
linefields := strings.Fields(line)
device := linefields[2]
if strings.Contains(device, "loop") {
continue
}
if _, ok := m.devices[device]; !ok {
continue
}
entry := m.devices[device]
for name, idx := range m.matches {
if idx < len(linefields) {
x, err := strconv.ParseInt(linefields[idx], 0, 64)
if err == nil {
diff := x - entry.lastValues[name]
y, err := lp.New(name, entry.tags, m.meta, map[string]interface{}{"value": int(diff)}, time.Now())
if err == nil {
output <- y
}
}
entry.lastValues[name] = x
}
}
m.devices[device] = entry
}
}
func (m *IOstatCollector) Close() {
m.init = false
}


@@ -0,0 +1,34 @@
## `iostat` collector
```json
"iostat": {
"exclude_metrics": [
"read_ms"
],
}
```
The `iostat` collector reads data from `/proc/diskstats` and outputs a handful of **node** metrics. If a metric is not required, it can be excluded from being forwarded to the sink. A sample input line is shown after the metric list.
Metrics:
* `io_reads`
* `io_reads_merged`
* `io_read_sectors`
* `io_read_ms`
* `io_writes`
* `io_writes_merged`
* `io_writes_sectors`
* `io_writes_ms`
* `io_ioops`
* `io_ioops_ms`
* `io_ioops_weighted_ms`
* `io_discards`
* `io_discards_merged`
* `io_discards_sectors`
* `io_discards_ms`
* `io_flushes`
* `io_flushes_ms`
The device name is added as tag `device`. For more details, see https://www.kernel.org/doc/html/latest/admin-guide/iostats.html
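An illustrative (shortened, made-up values) `/proc/diskstats` line:
```
259 0 nvme0n1 8626187 3252 543602970 1593305 1066375 650933 84110560 2223110 0 1135612 3816416 12000 0 98304 5600 700 90
```
Counting whitespace-separated fields from 0, field 3 (`8626187`) is `io_reads`, field 5 is `io_read_sectors`, field 7 is `io_writes`, and so on up to field 19 (`io_flushes_ms`); the collector reports the per-interval difference of these cumulative counters.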


@@ -10,11 +10,11 @@ import (
"strings"
"time"
lp "github.com/influxdata/line-protocol"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)
const IPMITOOL_PATH = `/usr/bin/ipmitool`
const IPMISENSORS_PATH = `/usr/sbin/ipmi-sensors`
const IPMITOOL_PATH = `ipmitool`
const IPMISENSORS_PATH = `ipmi-sensors`
type IpmiCollectorConfig struct {
ExcludeDevices []string `json:"exclude_devices"`
@@ -23,37 +23,44 @@ type IpmiCollectorConfig struct {
}
type IpmiCollector struct {
MetricCollector
tags map[string]string
matches map[string]string
config IpmiCollectorConfig
metricCollector
//tags map[string]string
//matches map[string]string
config IpmiCollectorConfig
ipmitool string
ipmisensors string
}
func (m *IpmiCollector) Init(config []byte) error {
func (m *IpmiCollector) Init(config json.RawMessage) error {
m.name = "IpmiCollector"
m.setup()
m.meta = map[string]string{"source": m.name, "group": "IPMI"}
m.config.IpmitoolPath = string(IPMITOOL_PATH)
m.config.IpmisensorsPath = string(IPMISENSORS_PATH)
m.ipmitool = ""
m.ipmisensors = ""
if len(config) > 0 {
err := json.Unmarshal(config, &m.config)
if err != nil {
return err
}
}
_, err1 := os.Stat(m.config.IpmitoolPath)
_, err2 := os.Stat(m.config.IpmisensorsPath)
if err1 != nil {
m.config.IpmitoolPath = ""
p, err := exec.LookPath(m.config.IpmitoolPath)
if err == nil {
m.ipmitool = p
}
if err2 != nil {
m.config.IpmisensorsPath = ""
p, err = exec.LookPath(m.config.IpmisensorsPath)
if err == nil {
m.ipmisensors = p
}
if err1 != nil && err2 != nil {
if len(m.ipmitool) == 0 && len(m.ipmisensors) == 0 {
return errors.New("No IPMI reader found")
}
m.init = true
return nil
}
func ReadIpmiTool(cmd string, out *[]lp.MutableMetric) {
func (m *IpmiCollector) readIpmiTool(cmd string, output chan lp.CCMetric) {
command := exec.Command(cmd, "sensor")
command.Wait()
stdout, err := command.Output()
@@ -74,24 +81,25 @@ func ReadIpmiTool(cmd string, out *[]lp.MutableMetric) {
name := strings.ToLower(strings.Replace(strings.Trim(lv[0], " "), " ", "_", -1))
unit := strings.Trim(lv[2], " ")
if unit == "Volts" {
unit = "V"
unit = "Volts"
} else if unit == "degrees C" {
unit = "C"
unit = "degC"
} else if unit == "degrees F" {
unit = "F"
unit = "degF"
} else if unit == "Watts" {
unit = "W"
unit = "Watts"
}
y, err := lp.New(name, map[string]string{"unit": unit, "type": "node"}, map[string]interface{}{"value": v}, time.Now())
y, err := lp.New(name, map[string]string{"type": "node"}, m.meta, map[string]interface{}{"value": v}, time.Now())
if err == nil {
*out = append(*out, y)
y.AddMeta("unit", unit)
output <- y
}
}
}
}
func ReadIpmiSensors(cmd string, out *[]lp.MutableMetric) {
func (m *IpmiCollector) readIpmiSensors(cmd string, output chan lp.CCMetric) {
command := exec.Command(cmd, "--comma-separated-output", "--sdr-cache-recreate")
command.Wait()
@@ -109,25 +117,28 @@ func ReadIpmiSensors(cmd string, out *[]lp.MutableMetric) {
v, err := strconv.ParseFloat(lv[3], 64)
if err == nil {
name := strings.ToLower(strings.Replace(lv[1], " ", "_", -1))
y, err := lp.New(name, map[string]string{"unit": lv[4], "type": "node"}, map[string]interface{}{"value": v}, time.Now())
y, err := lp.New(name, map[string]string{"type": "node"}, m.meta, map[string]interface{}{"value": v}, time.Now())
if err == nil {
*out = append(*out, y)
if len(lv) > 4 {
y.AddMeta("unit", lv[4])
}
output <- y
}
}
}
}
}
func (m *IpmiCollector) Read(interval time.Duration, out *[]lp.MutableMetric) {
func (m *IpmiCollector) Read(interval time.Duration, output chan lp.CCMetric) {
if len(m.config.IpmitoolPath) > 0 {
_, err := os.Stat(m.config.IpmitoolPath)
if err == nil {
ReadIpmiTool(m.config.IpmitoolPath, out)
m.readIpmiTool(m.config.IpmitoolPath, output)
}
} else if len(m.config.IpmisensorsPath) > 0 {
_, err := os.Stat(m.config.IpmisensorsPath)
if err == nil {
ReadIpmiSensors(m.config.IpmisensorsPath, out)
m.readIpmiSensors(m.config.IpmisensorsPath, output)
}
}
}

16
collectors/ipmiMetric.md Normal file

@@ -0,0 +1,16 @@
## `ipmistat` collector
```json
"ipmistat": {
"ipmitool_path": "/path/to/ipmitool",
"ipmisensors_path": "/path/to/ipmi-sensors",
}
```
The `ipmistat` collector reads data from `ipmitool` (`ipmitool sensor`) or `ipmi-sensors` (`ipmi-sensors --sdr-cache-recreate --comma-separated-output`).
The metrics depend on the output of the underlying tools but contain temperature, power and energy metrics.
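For illustration, `ipmitool sensor` prints one pipe-separated row per sensor; an example row (shortened, exact columns vary by BMC and tool version):
```
CPU1 Temp        | 38.000     | degrees C  | ok
```
The collector lowercases the sensor name and replaces spaces with underscores (here: `cpu1_temp`), parses the second column as the value, and normalizes the unit string (`degrees C` becomes `degC`).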


@@ -2,7 +2,7 @@ package collectors
/*
#cgo CFLAGS: -I./likwid
#cgo LDFLAGS: -L./likwid -llikwid -llikwid-hwloc -lm
#cgo LDFLAGS: -L./likwid -llikwid -llikwid-hwloc -lm -Wl,--unresolved-symbols=ignore-in-object-files
#include <stdlib.h>
#include <likwid.h>
*/
@@ -13,55 +13,111 @@ import (
"errors"
"fmt"
"io/ioutil"
"log"
"math"
"os"
"regexp"
"strconv"
"strings"
"time"
"unsafe"
lp "github.com/influxdata/line-protocol"
"gopkg.in/Knetic/govaluate.v2"
cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
topo "github.com/ClusterCockpit/cc-metric-collector/internal/ccTopology"
agg "github.com/ClusterCockpit/cc-metric-collector/internal/metricAggregator"
"github.com/NVIDIA/go-nvml/pkg/dl"
)
type MetricScope string
const (
METRIC_SCOPE_HWTHREAD = iota
METRIC_SCOPE_CORE
METRIC_SCOPE_LLC
METRIC_SCOPE_NUMA
METRIC_SCOPE_DIE
METRIC_SCOPE_SOCKET
METRIC_SCOPE_NODE
)
func (ms MetricScope) String() string {
return string(ms)
}
func (ms MetricScope) Likwid() string {
LikwidDomains := map[string]string{
"cpu": "",
"core": "",
"llc": "C",
"numadomain": "M",
"die": "D",
"socket": "S",
"node": "N",
}
return LikwidDomains[string(ms)]
}
func (ms MetricScope) Granularity() int {
for i, g := range GetAllMetricScopes() {
if ms == g {
return i
}
}
return -1
}
func GetAllMetricScopes() []MetricScope {
return []MetricScope{"cpu" /*, "core", "llc", "numadomain", "die",*/, "socket", "node"}
}
const (
LIKWID_LIB_NAME = "liblikwid.so"
LIKWID_LIB_DL_FLAGS = dl.RTLD_LAZY | dl.RTLD_GLOBAL
)
type LikwidCollectorMetricConfig struct {
Name string `json:"name"`
Calc string `json:"calc"`
Socket_scope bool `json:"socket_scope"`
Publish bool `json:"publish"`
Name string `json:"name"` // Name of the metric
Calc string `json:"calc"` // Calculation for the metric using
//Aggr string `json:"aggregation"` // if scope unequal to LIKWID metric scope, the values are combined (sum, min, max, mean or avg, median)
Scope MetricScope `json:"scope"` // scope for calculation. subscopes are aggregated using the 'aggregation' function
Publish bool `json:"publish"`
granulatity MetricScope
}
type LikwidCollectorEventsetConfig struct {
Events map[string]string `json:"events"`
Metrics []LikwidCollectorMetricConfig `json:"metrics"`
Events map[string]string `json:"events"`
granulatity map[string]MetricScope
Metrics []LikwidCollectorMetricConfig `json:"metrics"`
}
type LikwidCollectorConfig struct {
Eventsets []LikwidCollectorEventsetConfig `json:"eventsets"`
Metrics []LikwidCollectorMetricConfig `json:"globalmetrics"`
ExcludeMetrics []string `json:"exclude_metrics"`
ForceOverwrite bool `json:"force_overwrite"`
Metrics []LikwidCollectorMetricConfig `json:"globalmetrics,omitempty"`
ForceOverwrite bool `json:"force_overwrite,omitempty"`
InvalidToZero bool `json:"invalid_to_zero,omitempty"`
}
type LikwidCollector struct {
MetricCollector
cpulist []C.int
sock2tid map[int]int
metrics map[C.int]map[string]int
groups []C.int
config LikwidCollectorConfig
results map[int]map[int]map[string]interface{}
mresults map[int]map[int]map[string]float64
gmresults map[int]map[string]float64
basefreq float64
metricCollector
cpulist []C.int
cpu2tid map[int]int
sock2tid map[int]int
scopeRespTids map[MetricScope]map[int]int
metrics map[C.int]map[string]int
groups []C.int
config LikwidCollectorConfig
results map[int]map[int]map[string]interface{}
mresults map[int]map[int]map[string]float64
gmresults map[int]map[string]float64
basefreq float64
running bool
}
type LikwidMetric struct {
name string
search string
socket_scope bool
group_idx int
name string
search string
scope MetricScope
group_idx int
}
func eventsToEventStr(events map[string]string) string {
@@ -72,12 +128,27 @@ func eventsToEventStr(events map[string]string) string {
return strings.Join(elist, ",")
}
func getGranularity(counter, event string) MetricScope {
if strings.HasPrefix(counter, "PMC") || strings.HasPrefix(counter, "FIXC") {
return "cpu"
} else if strings.Contains(counter, "BOX") || strings.Contains(counter, "DEV") {
return "socket"
} else if strings.HasPrefix(counter, "PWR") {
if event == "RAPL_CORE_ENERGY" {
return "cpu"
} else {
return "socket"
}
}
return "unknown"
}
func getBaseFreq() float64 {
var freq float64 = math.NaN()
C.power_init(0)
info := C.get_powerInfo()
if float64(info.baseFrequency) != 0 {
freq = float64(info.baseFrequency)
freq = float64(info.baseFrequency) * 1e3
} else {
buffer, err := ioutil.ReadFile("/sys/devices/system/cpu/cpu0/cpufreq/bios_limit")
if err == nil {
@@ -91,21 +162,102 @@ func getBaseFreq() float64 {
return freq
}
func getSocketCpus() map[C.int]int {
slist := SocketList()
var cpu C.int
outmap := make(map[C.int]int)
for _, s := range slist {
t := C.CString(fmt.Sprintf("S%d", s))
clen := C.cpustr_to_cpulist(t, &cpu, 1)
if int(clen) == 1 {
outmap[cpu] = s
func (m *LikwidCollector) initGranularity() {
splitRegex := regexp.MustCompile("[+-/*()]")
for _, evset := range m.config.Eventsets {
evset.granulatity = make(map[string]MetricScope)
for counter, event := range evset.Events {
gran := getGranularity(counter, event)
if gran.Granularity() >= 0 {
evset.granulatity[counter] = gran
}
}
for i, metric := range evset.Metrics {
s := splitRegex.Split(metric.Calc, -1)
gran := MetricScope("cpu")
evset.Metrics[i].granulatity = gran
for _, x := range s {
if _, ok := evset.Events[x]; ok {
if evset.granulatity[x].Granularity() > gran.Granularity() {
gran = evset.granulatity[x]
}
}
}
evset.Metrics[i].granulatity = gran
}
}
return outmap
for i, metric := range m.config.Metrics {
s := splitRegex.Split(metric.Calc, -1)
gran := MetricScope("cpu")
m.config.Metrics[i].granulatity = gran
for _, x := range s {
for _, evset := range m.config.Eventsets {
for _, m := range evset.Metrics {
if m.Name == x && m.granulatity.Granularity() > gran.Granularity() {
gran = m.granulatity
}
}
}
}
m.config.Metrics[i].granulatity = gran
}
}
func (m *LikwidCollector) Init(config []byte) error {
type TopoResolveFunc func(cpuid int) int
func (m *LikwidCollector) getResponsiblities() map[MetricScope]map[int]int {
get_cpus := func(scope MetricScope) map[int]int {
var slist []int
var cpu C.int
var input func(index int) string
switch scope {
case "node":
slist = []int{0}
input = func(index int) string { return "N:0" }
case "socket":
input = func(index int) string { return fmt.Sprintf("%s%d:0", scope.Likwid(), index) }
slist = topo.SocketList()
// case "numadomain":
// input = func(index int) string { return fmt.Sprintf("%s%d:0", scope.Likwid(), index) }
// slist = topo.NumaNodeList()
// cclog.Debug(scope, " ", input(0), " ", slist)
// case "die":
// input = func(index int) string { return fmt.Sprintf("%s%d:0", scope.Likwid(), index) }
// slist = topo.DieList()
// case "llc":
// input = fmt.Sprintf("%s%d:0", scope.Likwid(), s)
// slist = topo.LLCacheList()
case "cpu":
input = func(index int) string { return fmt.Sprintf("%d", index) }
slist = topo.CpuList()
case "hwthread":
input = func(index int) string { return fmt.Sprintf("%d", index) }
slist = topo.CpuList()
}
outmap := make(map[int]int)
for _, s := range slist {
t := C.CString(input(s))
clen := C.cpustr_to_cpulist(t, &cpu, 1)
if int(clen) == 1 {
outmap[s] = m.cpu2tid[int(cpu)]
} else {
cclog.Error(fmt.Sprintf("Cannot determine responsible CPU for %s", input(s)))
outmap[s] = -1
}
C.free(unsafe.Pointer(t))
}
return outmap
}
scopes := GetAllMetricScopes()
complete := make(map[MetricScope]map[int]int)
for _, s := range scopes {
complete[s] = get_cpus(s)
}
return complete
}
func (m *LikwidCollector) Init(config json.RawMessage) error {
var ret C.int
m.name = "LikwidCollector"
if len(config) > 0 {
@@ -114,36 +266,78 @@ func (m *LikwidCollector) Init(config []byte) error {
return err
}
}
lib := dl.New(LIKWID_LIB_NAME, LIKWID_LIB_DL_FLAGS)
if lib == nil {
return fmt.Errorf("error instantiating DynamicLibrary for %s", LIKWID_LIB_NAME)
}
if m.config.ForceOverwrite {
cclog.ComponentDebug(m.name, "Set LIKWID_FORCE=1")
os.Setenv("LIKWID_FORCE", "1")
}
m.setup()
cpulist := CpuList()
m.cpulist = make([]C.int, len(cpulist))
slist := getSocketCpus()
m.sock2tid = make(map[int]int)
m.meta = map[string]string{"source": m.name, "group": "PerfCounter"}
cclog.ComponentDebug(m.name, "Get cpulist and init maps and lists")
cpulist := topo.CpuList()
m.cpulist = make([]C.int, len(cpulist))
m.cpu2tid = make(map[int]int)
for i, c := range cpulist {
m.cpulist[i] = C.int(c)
if sid, found := slist[m.cpulist[i]]; found {
m.sock2tid[sid] = i
}
m.cpu2tid[c] = i
}
m.results = make(map[int]map[int]map[string]interface{})
m.mresults = make(map[int]map[int]map[string]float64)
m.gmresults = make(map[int]map[string]float64)
cclog.ComponentDebug(m.name, "initialize LIKWID topology")
ret = C.topology_init()
if ret != 0 {
return errors.New("Failed to initialize LIKWID topology")
}
if m.config.ForceOverwrite {
os.Setenv("LIKWID_FORCE", "1")
err := errors.New("failed to initialize LIKWID topology")
cclog.ComponentError(m.name, err.Error())
return err
}
// Determine which counter works at which level. PMC*: cpu, *BOX*: socket, ...
m.initGranularity()
// Generate map for MetricScope -> scope_id (like socket id) -> responsible id (offset in cpulist)
m.scopeRespTids = m.getResponsiblities()
cclog.ComponentDebug(m.name, "initialize LIKWID perfmon module")
ret = C.perfmon_init(C.int(len(m.cpulist)), &m.cpulist[0])
if ret != 0 {
C.topology_finalize()
return errors.New("Failed to initialize LIKWID topology")
err := errors.New("failed to initialize LIKWID topology")
cclog.ComponentError(m.name, err.Error())
return err
}
// This is for the global metrics computation test
globalParams := make(map[string]interface{})
globalParams["time"] = float64(1.0)
globalParams["inverseClock"] = float64(1.0)
// While adding the events, we test the metrics whether they can be computed at all
for i, evset := range m.config.Eventsets {
estr := eventsToEventStr(evset.Events)
// Generate parameter list for the metric computing test
params := make(map[string]interface{})
params["time"] = float64(1.0)
params["inverseClock"] = float64(1.0)
for counter := range evset.Events {
params[counter] = float64(1.0)
}
for _, metric := range evset.Metrics {
// Try to evaluate the metric
_, err := agg.EvalFloat64Condition(metric.Calc, params)
if err != nil {
cclog.ComponentError(m.name, "Calculation for metric", metric.Name, "failed:", err.Error())
continue
}
// If the metric is not in the parameter list for the global metrics, add it
if _, ok := globalParams[metric.Name]; !ok {
globalParams[metric.Name] = float64(1.0)
}
}
// Now we add the list of events to likwid
cstr := C.CString(estr)
gid := C.perfmon_addEventSet(cstr)
if gid >= 0 {
@@ -155,153 +349,208 @@ func (m *LikwidCollector) Init(config []byte) error {
for tid := range m.cpulist {
m.results[i][tid] = make(map[string]interface{})
m.mresults[i][tid] = make(map[string]float64)
m.gmresults[tid] = make(map[string]float64)
if i == 0 {
m.gmresults[tid] = make(map[string]float64)
}
}
}
for _, metric := range m.config.Metrics {
// Try to evaluate the global metric
_, err := agg.EvalFloat64Condition(metric.Calc, globalParams)
if err != nil {
cclog.ComponentError(m.name, "Calculation for metric", metric.Name, "failed:", err.Error())
continue
}
}
// If no event set could be added, shut down LikwidCollector
if len(m.groups) == 0 {
C.perfmon_finalize()
C.topology_finalize()
return errors.New("No LIKWID performance group initialized")
err := errors.New("no LIKWID performance group initialized")
cclog.ComponentError(m.name, err.Error())
return err
}
m.basefreq = getBaseFreq()
cclog.ComponentDebug(m.name, "BaseFreq", m.basefreq)
m.init = true
return nil
}
func (m *LikwidCollector) Read(interval time.Duration, out *[]lp.MutableMetric) {
// take a measurement for 'interval' seconds of event set index 'group'
func (m *LikwidCollector) takeMeasurement(group int, interval time.Duration) error {
var ret C.int
gid := m.groups[group]
ret = C.perfmon_setupCounters(gid)
if ret != 0 {
gctr := C.GoString(C.perfmon_getGroupName(gid))
err := fmt.Errorf("failed to setup performance group %d (%s)", gid, gctr)
return err
}
ret = C.perfmon_startCounters()
if ret != 0 {
gctr := C.GoString(C.perfmon_getGroupName(gid))
err := fmt.Errorf("failed to start performance group %d (%s)", gid, gctr)
return err
}
m.running = true
time.Sleep(interval)
m.running = false
ret = C.perfmon_stopCounters()
if ret != 0 {
gctr := C.GoString(C.perfmon_getGroupName(gid))
err := fmt.Errorf("failed to stop performance group %d (%s)", gid, gctr)
return err
}
return nil
}
// Get all measurement results for an event set, derive the metric values out of the measurement results and send it
func (m *LikwidCollector) calcEventsetMetrics(group int, interval time.Duration, output chan lp.CCMetric) error {
var eidx C.int
evset := m.config.Eventsets[group]
gid := m.groups[group]
invClock := float64(1.0 / m.basefreq)
// Go over events and get the results
for eidx = 0; int(eidx) < len(evset.Events); eidx++ {
ctr := C.perfmon_getCounterName(gid, eidx)
ev := C.perfmon_getEventName(gid, eidx)
gctr := C.GoString(ctr)
gev := C.GoString(ev)
// MetricScope for the counter (and if needed the event)
scope := getGranularity(gctr, gev)
// Get the map scope-id -> tids
// This way we read less counters like only the responsible hardware thread for a socket
scopemap := m.scopeRespTids[scope]
for _, tid := range scopemap {
if tid >= 0 {
m.results[group][tid]["time"] = interval.Seconds()
m.results[group][tid]["inverseClock"] = invClock
res := C.perfmon_getLastResult(gid, eidx, C.int(tid))
m.results[group][tid][gctr] = float64(res)
}
}
}
// Go over the event set metrics, derive the value out of the event:counter values and send it
for _, metric := range evset.Metrics {
// The metric scope is determined in the Init() function
// Get the map scope-id -> tids
scopemap := m.scopeRespTids[metric.Scope]
for domain, tid := range scopemap {
if tid >= 0 {
value, err := agg.EvalFloat64Condition(metric.Calc, m.results[group][tid])
if err != nil {
cclog.ComponentError(m.name, "Calculation for metric", metric.Name, "failed:", err.Error())
continue
}
m.mresults[group][tid][metric.Name] = value
if m.config.InvalidToZero && math.IsNaN(value) {
value = 0.0
}
if m.config.InvalidToZero && math.IsInf(value, 0) {
value = 0.0
}
// Now we have the result, send it with the proper tags
if !math.IsNaN(value) {
if metric.Publish {
tags := map[string]string{"type": metric.Scope.String()}
if metric.Scope != "node" {
tags["type-id"] = fmt.Sprintf("%d", domain)
}
fields := map[string]interface{}{"value": value}
y, err := lp.New(metric.Name, tags, m.meta, fields, time.Now())
if err == nil {
output <- y
}
}
}
}
}
}
return nil
}
// Go over the global metrics, derive the value out of the event sets' metric values and send it
func (m *LikwidCollector) calcGlobalMetrics(interval time.Duration, output chan lp.CCMetric) error {
for _, metric := range m.config.Metrics {
scopemap := m.scopeRespTids[metric.Scope]
for domain, tid := range scopemap {
if tid >= 0 {
// Here we generate parameter list
params := make(map[string]interface{})
for j := range m.groups {
for mname, mres := range m.mresults[j][tid] {
params[mname] = mres
}
}
// Evaluate the metric
value, err := agg.EvalFloat64Condition(metric.Calc, params)
if err != nil {
cclog.ComponentError(m.name, "Calculation for metric", metric.Name, "failed:", err.Error())
continue
}
m.gmresults[tid][metric.Name] = value
if m.config.InvalidToZero && math.IsNaN(value) {
value = 0.0
}
if m.config.InvalidToZero && math.IsInf(value, 0) {
value = 0.0
}
// Now we have the result, send it with the proper tags
if !math.IsNaN(value) {
if metric.Publish {
tags := map[string]string{"type": metric.Scope.String()}
if metric.Scope != "node" {
tags["type-id"] = fmt.Sprintf("%d", domain)
}
fields := map[string]interface{}{"value": value}
y, err := lp.New(metric.Name, tags, m.meta, fields, time.Now())
if err == nil {
output <- y
}
}
}
}
}
}
return nil
}
// main read function taking multiple measurement rounds, each 'interval' seconds long
func (m *LikwidCollector) Read(interval time.Duration, output chan lp.CCMetric) {
if !m.init {
return
}
var ret C.int
for i, gid := range m.groups {
evset := m.config.Eventsets[i]
ret = C.perfmon_setupCounters(gid)
if ret != 0 {
log.Print("Failed to setup performance group ", C.perfmon_getGroupName(gid))
continue
}
ret = C.perfmon_startCounters()
if ret != 0 {
log.Print("Failed to start performance group ", C.perfmon_getGroupName(gid))
continue
}
time.Sleep(interval)
ret = C.perfmon_stopCounters()
if ret != 0 {
log.Print("Failed to stop performance group ", C.perfmon_getGroupName(gid))
continue
}
var eidx C.int
for tid := range m.cpulist {
for eidx = 0; int(eidx) < len(evset.Events); eidx++ {
ctr := C.perfmon_getCounterName(gid, eidx)
gctr := C.GoString(ctr)
res := C.perfmon_getLastResult(gid, eidx, C.int(tid))
m.results[i][tid][gctr] = float64(res)
}
m.results[i][tid]["time"] = interval.Seconds()
m.results[i][tid]["inverseClock"] = float64(1.0 / m.basefreq)
for _, metric := range evset.Metrics {
expression, err := govaluate.NewEvaluableExpression(metric.Calc)
if err != nil {
log.Print(err.Error())
continue
}
result, err := expression.Evaluate(m.results[i][tid])
if err != nil {
log.Print(err.Error())
continue
}
m.mresults[i][tid][metric.Name] = float64(result.(float64))
}
}
}
for _, metric := range m.config.Metrics {
for tid := range m.cpulist {
var params map[string]interface{}
expression, err := govaluate.NewEvaluableExpression(metric.Calc)
if err != nil {
log.Print(err.Error())
continue
}
params = make(map[string]interface{})
for j := range m.groups {
for mname, mres := range m.mresults[j][tid] {
params[mname] = mres
}
}
result, err := expression.Evaluate(params)
if err != nil {
log.Print(err.Error())
continue
}
m.gmresults[tid][metric.Name] = float64(result.(float64))
}
}
for i := range m.groups {
evset := m.config.Eventsets[i]
for _, metric := range evset.Metrics {
_, skip := stringArrayContains(m.config.ExcludeMetrics, metric.Name)
if metric.Publish && !skip {
if metric.Socket_scope {
for sid, tid := range m.sock2tid {
y, err := lp.New(metric.Name,
map[string]string{"type": "socket", "type-id": fmt.Sprintf("%d", int(sid))},
map[string]interface{}{"value": m.mresults[i][tid][metric.Name]},
time.Now())
if err == nil {
*out = append(*out, y)
}
}
} else {
for tid, cpu := range m.cpulist {
y, err := lp.New(metric.Name,
map[string]string{"type": "cpu", "type-id": fmt.Sprintf("%d", int(cpu))},
map[string]interface{}{"value": m.mresults[i][tid][metric.Name]},
time.Now())
if err == nil {
*out = append(*out, y)
}
}
}
}
}
}
for _, metric := range m.config.Metrics {
_, skip := stringArrayContains(m.config.ExcludeMetrics, metric.Name)
if metric.Publish && !skip {
if metric.Socket_scope {
for sid, tid := range m.sock2tid {
y, err := lp.New(metric.Name,
map[string]string{"type": "socket", "type-id": fmt.Sprintf("%d", int(sid))},
map[string]interface{}{"value": m.gmresults[tid][metric.Name]},
time.Now())
if err == nil {
*out = append(*out, y)
}
}
} else {
for tid, cpu := range m.cpulist {
y, err := lp.New(metric.Name,
map[string]string{"type": "cpu", "type-id": fmt.Sprintf("%d", int(cpu))},
map[string]interface{}{"value": m.gmresults[tid][metric.Name]},
time.Now())
if err == nil {
*out = append(*out, y)
}
}
}
// measure event set 'i' for 'interval' seconds
err := m.takeMeasurement(i, interval)
if err != nil {
cclog.ComponentError(m.name, err.Error())
return
}
// read measurements and derive event set metrics
m.calcEventsetMetrics(i, interval, output)
}
// use the event set metrics to derive the global metrics
m.calcGlobalMetrics(interval, output)
}
func (m *LikwidCollector) Close() {
if m.init {
cclog.ComponentDebug(m.name, "Closing ...")
m.init = false
if m.running {
cclog.ComponentDebug(m.name, "Stopping counters")
C.perfmon_stopCounters()
}
cclog.ComponentDebug(m.name, "Finalize LIKWID perfmon module")
C.perfmon_finalize()
cclog.ComponentDebug(m.name, "Finalize LIKWID topology module")
C.topology_finalize()
cclog.ComponentDebug(m.name, "Closing done")
}
}

148
collectors/likwidMetric.md Normal file

@@ -0,0 +1,148 @@
## `likwid` collector
The `likwid` collector is probably the most complicated collector. The LIKWID library is included as a static library with *direct* access mode. The *direct* access mode is suitable if the daemon is executed by a root user. The static library does not contain the performance groups, so all information needs to be provided in the configuration.
The `likwid` configuration consists of two parts, the "eventsets" and "globalmetrics":
- An event set list itself has two parts, the "events" and a set of derivable "metrics". Each of the "events" is a counter:event pair in LIKWID's syntax. The "metrics" are a list of formulas to derive the metric value from the measurements of the "events". Each metric has a name, the formula, a scope and a publish flag. Counter names can be used like variables in the formulas, so `PMC0+PMC1` sums the measurements of the two events configured in the counters `PMC0` and `PMC1`. The scope tells the collector whether it is a metric for each hardware thread (`cpu`) or each CPU socket (`socket`). The last one is the publish flag, which tells the collector whether a metric should be sent to the router.
- The global metrics are metrics which require data from all event set measurements to be derived. The inputs are the metrics in the event sets. Similar to the metrics in the event sets, the global metrics are defined by a name, a formula, a scope and a publish flag. See event set metrics for details. The only difference is that there is no access to the raw event measurements anymore but only to the metrics. So, the idea is to derive a metric in the "eventsets" section and reuse it in the "globalmetrics" part. If you need a metric only for deriving the global metrics, disable forwarding of the event set metrics. **Be aware** that the combination might be misleading because the "behavior" of a metric changes over time and the multiple measurements might count different computing phases.
Additional options:
- `force_overwrite`: Same as setting `LIKWID_FORCE=1`. In case counters are already in use, LIKWID overwrites their configuration to do its measurements.
- `invalid_to_zero`: In some cases, the calculations result in `NaN` or `Inf`. With this option, all `NaN` and `Inf` values are replaced with `0.0`.
### Available metric scopes
Hardware performance counters are scattered all over the system nowadays. A counter covers a specific part of the system. While there are hardware-thread-specific counters for CPU cycles, instructions and so on, others are specific to a whole CPU socket/package. To address that, the collector lets you specify a 'scope' for each metric.
- `cpu` : One metric per CPU hardware thread with the tags `"type" : "cpu"` and `"type-id" : "$cpu_id"`
- `socket` : One metric per CPU socket/package with the tags `"type" : "socket"` and `"type-id" : "$socket_id"`
**Note:** You cannot specify `socket` scope for a metric that is measured at `cpu` scope, so some expert knowledge or lookup work in the [Likwid Wiki](https://github.com/RRZE-HPC/likwid/wiki) is required. Get the scope of each counter from the *Architecture* pages; as soon as one counter in a metric is socket-specific, the whole metric is socket-specific (see the example after the guideline list).
As a guideline:
- All counters `FIXCx`, `PMCy` and `TMAz` have the scope `cpu`
- All counters names containing `BOX` have the scope `socket`
- All `PWRx` counters have scope `socket`, except `"PWR1" : "RAPL_CORE_ENERGY"` has `cpu` scope
- All `DFCx` counters have scope `socket`
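For example, a metric that combines the hardware-thread counter `PMC0` with the DRAM channel counter `DFC0` from the example configuration below must be declared with `socket` scope, because `DFC0` only exists per socket (the metric name is illustrative):
```json
{
  "name": "dram_reads_per_instr",
  "calc": "DFC0/PMC0",
  "scope": "socket",
  "publish": true
}
```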
### Example configuration
```json
"likwid": {
"force_overwrite" : false,
"nan_to_zero" : false,
"eventsets": [
{
"events": {
"FIXC1": "ACTUAL_CPU_CLOCK",
"FIXC2": "MAX_CPU_CLOCK",
"PMC0": "RETIRED_INSTRUCTIONS",
"PMC1": "CPU_CLOCKS_UNHALTED",
"PMC2": "RETIRED_SSE_AVX_FLOPS_ALL",
"PMC3": "MERGE",
"DFC0": "DRAM_CHANNEL_0",
"DFC1": "DRAM_CHANNEL_1",
"DFC2": "DRAM_CHANNEL_2",
"DFC3": "DRAM_CHANNEL_3"
},
"metrics": [
{
"name": "ipc",
"calc": "PMC0/PMC1",
"scope": "cpu",
"publish": true
},
{
"name": "flops_any",
"calc": "0.000001*PMC2/time",
"scope": "cpu",
"publish": true
},
{
"name": "clock_mhz",
"calc": "0.000001*(FIXC1/FIXC2)/inverseClock",
"scope": "cpu",
"publish": true
},
{
"name": "mem1",
"calc": "0.000001*(DFC0+DFC1+DFC2+DFC3)*64.0/time",
"scope": "socket",
"publish": false
}
]
},
{
"events": {
"DFC0": "DRAM_CHANNEL_4",
"DFC1": "DRAM_CHANNEL_5",
"DFC2": "DRAM_CHANNEL_6",
"DFC3": "DRAM_CHANNEL_7",
"PWR0": "RAPL_CORE_ENERGY",
"PWR1": "RAPL_PKG_ENERGY"
},
"metrics": [
{
"name": "pwr_core",
"calc": "PWR0/time",
"scope": "socket",
"publish": true
},
{
"name": "pwr_pkg",
"calc": "PWR1/time",
"scope": "socket",
"publish": true
},
{
"name": "mem2",
"calc": "0.000001*(DFC0+DFC1+DFC2+DFC3)*64.0/time",
"scope": "socket",
"publish": false
}
]
}
],
"globalmetrics": [
{
"name": "mem_bw",
"calc": "mem1+mem2",
"scope": "socket",
"publish": true
}
]
}
```
### How to get the eventsets and metrics from LIKWID
The `likwid` collector reads hardware performance counters at a **cpu** and **socket** level. The configuration looks quite complicated, but it is basically copy&paste from [LIKWID's performance groups](https://github.com/RRZE-HPC/likwid/tree/master/groups). The collector went through multiple iterations that tried to use the performance groups directly, but that approach lacked flexibility. The current way of configuring it provides the most flexibility.
The logic is as follows: there are multiple event sets, each consisting of a list of counters+events and a list of metrics. If you compare a common performance group with the example setting above, there is not much difference:
```
EVENTSET -> "events": {
FIXC1 ACTUAL_CPU_CLOCK -> "FIXC1": "ACTUAL_CPU_CLOCK",
FIXC2 MAX_CPU_CLOCK -> "FIXC2": "MAX_CPU_CLOCK",
PMC0 RETIRED_INSTRUCTIONS -> "PMC0" : "RETIRED_INSTRUCTIONS",
PMC1 CPU_CLOCKS_UNHALTED -> "PMC1" : "CPU_CLOCKS_UNHALTED",
PMC2 RETIRED_SSE_AVX_FLOPS_ALL -> "PMC2": "RETIRED_SSE_AVX_FLOPS_ALL",
PMC3 MERGE -> "PMC3": "MERGE",
-> }
```
The metrics follow the same procedure:
```
METRICS -> "metrics": [
IPC PMC0/PMC1 -> {
-> "name" : "IPC",
-> "calc" : "PMC0/PMC1",
-> "scope": "cpu",
-> "publish": true
-> }
-> ]
```


@@ -2,29 +2,39 @@ package collectors
import (
"encoding/json"
"fmt"
"io/ioutil"
"strconv"
"strings"
"time"
lp "github.com/influxdata/line-protocol"
cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)
const LOADAVGFILE = `/proc/loadavg`
type LoadavgCollectorConfig struct {
ExcludeMetrics []string `json:"exclude_metrics,omitempty"`
}
//
// LoadavgCollector collects:
// * load average of last 1, 5 & 15 minutes
// * number of processes currently runnable
// * total number of processes in system
//
// See: https://www.kernel.org/doc/html/latest/filesystems/proc.html
//
const LOADAVGFILE = "/proc/loadavg"
type LoadavgCollector struct {
MetricCollector
metricCollector
tags map[string]string
load_matches []string
load_skips []bool
proc_matches []string
config LoadavgCollectorConfig
proc_skips []bool
config struct {
ExcludeMetrics []string `json:"exclude_metrics,omitempty"`
}
}
func (m *LoadavgCollector) Init(config []byte) error {
func (m *LoadavgCollector) Init(config json.RawMessage) error {
m.name = "LoadavgCollector"
m.setup()
if len(config) > 0 {
@@ -33,45 +43,82 @@ func (m *LoadavgCollector) Init(config []byte) error {
return err
}
}
m.meta = map[string]string{
"source": m.name,
"group": "LOAD"}
m.tags = map[string]string{"type": "node"}
m.load_matches = []string{"load_one", "load_five", "load_fifteen"}
m.proc_matches = []string{"proc_run", "proc_total"}
m.load_matches = []string{
"load_one",
"load_five",
"load_fifteen"}
m.load_skips = make([]bool, len(m.load_matches))
m.proc_matches = []string{
"proc_run",
"proc_total"}
m.proc_skips = make([]bool, len(m.proc_matches))
for i, name := range m.load_matches {
_, m.load_skips[i] = stringArrayContains(m.config.ExcludeMetrics, name)
}
for i, name := range m.proc_matches {
_, m.proc_skips[i] = stringArrayContains(m.config.ExcludeMetrics, name)
}
m.init = true
return nil
}
func (m *LoadavgCollector) Read(interval time.Duration, out *[]lp.MutableMetric) {
var skip bool
func (m *LoadavgCollector) Read(interval time.Duration, output chan lp.CCMetric) {
if !m.init {
return
}
buffer, err := ioutil.ReadFile(string(LOADAVGFILE))
buffer, err := ioutil.ReadFile(LOADAVGFILE)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to read file '%s': %v", LOADAVGFILE, err))
return
}
now := time.Now()
// Load metrics
ls := strings.Split(string(buffer), ` `)
for i, name := range m.load_matches {
x, err := strconv.ParseFloat(ls[i], 64)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to convert '%s' to float64: %v", ls[i], err))
continue
}
if m.load_skips[i] {
continue
}
y, err := lp.New(name, m.tags, m.meta, map[string]interface{}{"value": x}, now)
if err == nil {
_, skip = stringArrayContains(m.config.ExcludeMetrics, name)
y, err := lp.New(name, m.tags, map[string]interface{}{"value": float64(x)}, time.Now())
if err == nil && !skip {
*out = append(*out, y)
}
output <- y
}
}
// Process metrics
lv := strings.Split(ls[3], `/`)
for i, name := range m.proc_matches {
x, err := strconv.ParseFloat(lv[i], 64)
if err == nil {
_, skip = stringArrayContains(m.config.ExcludeMetrics, name)
y, err := lp.New(name, m.tags, map[string]interface{}{"value": float64(x)}, time.Now())
if err == nil && !skip {
*out = append(*out, y)
}
x, err := strconv.ParseInt(lv[i], 10, 64)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to convert '%s' to float64: %v", lv[i], err))
continue
}
if m.proc_skips[i] {
continue
}
y, err := lp.New(name, m.tags, m.meta, map[string]interface{}{"value": x}, now)
if err == nil {
output <- y
}
}
}


@@ -0,0 +1,19 @@
## `loadavg` collector
```json
"loadavg": {
"exclude_metrics": [
"proc_run"
]
}
```
The `loadavg` collector reads data from `/proc/loadavg` and outputs a handful of **node** metrics. If a metric is not required, it can be excluded from being forwarded to the sink. A sample input line is shown after the metric list.
Metrics:
* `load_one`
* `load_five`
* `load_fifteen`
* `proc_run`
* `proc_total`
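A sample `/proc/loadavg` line:
```
0.52 0.45 0.39 2/1337 42424
```
The first three fields map to `load_one`, `load_five` and `load_fifteen`; the fourth field splits at `/` into `proc_run` (`2`) and `proc_total` (`1337`); the trailing last-PID field is ignored by the collector.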


@@ -3,31 +3,84 @@ package collectors
import (
"encoding/json"
"errors"
"io/ioutil"
"log"
"fmt"
"os/exec"
"os/user"
"strconv"
"strings"
"time"
lp "github.com/influxdata/line-protocol"
cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)
const LUSTREFILE = `/proc/fs/lustre/llite/lnec-XXXXXX/stats`
const LUSTRE_SYSFS = `/sys/fs/lustre`
const LCTL_CMD = `lctl`
const LCTL_OPTION = `get_param`
type LustreCollectorConfig struct {
Procfiles []string `json:"procfiles"`
LCtlCommand string `json:"lctl_command"`
ExcludeMetrics []string `json:"exclude_metrics"`
SendAllMetrics bool `json:"send_all_metrics"`
}
type LustreCollector struct {
MetricCollector
metricCollector
tags map[string]string
matches map[string]map[string]int
devices []string
stats map[string]map[string]int64
config LustreCollectorConfig
lctl string
}
func (m *LustreCollector) Init(config []byte) error {
func (m *LustreCollector) getDeviceDataCommand(device string) []string {
statsfile := fmt.Sprintf("llite.%s.stats", device)
command := exec.Command(m.lctl, LCTL_OPTION, statsfile)
command.Wait()
stdout, _ := command.Output()
return strings.Split(string(stdout), "\n")
}
func (m *LustreCollector) getDevices() []string {
devices := make([]string, 0)
// //Version reading devices from sysfs
// globPattern := filepath.Join(LUSTRE_SYSFS, "llite/*/stats")
// files, err := filepath.Glob(globPattern)
// if err != nil {
// return devices
// }
// for _, f := range files {
// pathlist := strings.Split(f, "/")
// devices = append(devices, pathlist[4])
// }
data := m.getDeviceDataCommand("*")
for _, line := range data {
if strings.HasPrefix(line, "llite") {
linefields := strings.Split(line, ".")
if len(linefields) > 2 {
devices = append(devices, linefields[1])
}
}
}
return devices
}
// //Version reading the stats data of a device from sysfs
// func (m *LustreCollector) getDeviceDataSysfs(device string) []string {
// llitedir := filepath.Join(LUSTRE_SYSFS, "llite")
// devdir := filepath.Join(llitedir, device)
// statsfile := filepath.Join(devdir, "stats")
// buffer, err := ioutil.ReadFile(statsfile)
// if err != nil {
// return make([]string, 0)
// }
// return strings.Split(string(buffer), "\n")
// }
func (m *LustreCollector) Init(config json.RawMessage) error {
var err error
m.name = "LustreCollector"
if len(config) > 0 {
@@ -38,66 +91,120 @@ func (m *LustreCollector) Init(config []byte) error {
}
m.setup()
m.tags = map[string]string{"type": "node"}
m.meta = map[string]string{"source": m.name, "group": "Lustre"}
defmatches := map[string]map[string]int{
"read_bytes": {"lustre_read_bytes": 6, "lustre_read_requests": 1},
"write_bytes": {"lustre_write_bytes": 6, "lustre_write_requests": 1},
"open": {"lustre_open": 1},
"close": {"lustre_close": 1},
"setattr": {"lustre_setattr": 1},
"getattr": {"lustre_getattr": 1},
"statfs": {"lustre_statfs": 1},
"inode_permission": {"lustre_inode_permission": 1}}
// Lustre file system statistics can only be queried by user root
user, err := user.Current()
if err != nil {
cclog.ComponentError(m.name, "Failed to get current user:", err.Error())
return err
}
if user.Uid != "0" {
cclog.ComponentError(m.name, "Lustre file system statistics can only be queried by user root:", err.Error())
return err
}
m.matches = make(map[string]map[string]int)
for lineprefix, names := range defmatches {
for metricname, offset := range names {
_, skip := stringArrayContains(m.config.ExcludeMetrics, metricname)
if skip {
continue
}
if _, prefixExist := m.matches[lineprefix]; !prefixExist {
m.matches[lineprefix] = make(map[string]int)
}
if _, metricExist := m.matches[lineprefix][metricname]; !metricExist {
m.matches[lineprefix][metricname] = offset
}
}
}
p, err := exec.LookPath(m.config.LCtlCommand)
if err != nil {
p, err = exec.LookPath(LCTL_CMD)
if err != nil {
return err
}
}
m.lctl = p
devices := m.getDevices()
if len(devices) == 0 {
return errors.New("no metrics to collect")
}
m.stats = make(map[string]map[string]int64)
for _, d := range devices {
m.stats[d] = make(map[string]int64)
for _, names := range m.matches {
for metricname := range names {
m.stats[d][metricname] = 0
}
}
}
m.init = true
return nil
}
func (m *LustreCollector) Read(interval time.Duration, output chan lp.CCMetric) {
if !m.init {
return
}
for device, devData := range m.stats {
stats := m.getDeviceDataCommand(device)
processed := []string{}
for _, line := range stats {
lf := strings.Fields(line)
if len(lf) > 1 {
if fields, ok := m.matches[lf[0]]; ok {
for name, idx := range fields {
x, err := strconv.ParseInt(lf[idx], 0, 64)
if err != nil {
continue
}
value := x - devData[name]
devData[name] = x
if value < 0 {
value = 0
}
y, err := lp.New(name, m.tags, m.meta, map[string]interface{}{"value": value}, time.Now())
if err == nil {
y.AddTag("device", device)
if strings.Contains(name, "byte") {
y.AddMeta("unit", "Byte")
}
output <- y
if m.config.SendAllMetrics {
processed = append(processed, name)
}
}
}
}
}
}
if m.config.SendAllMetrics {
for name := range devData {
if _, done := stringArrayContains(processed, name); !done {
y, err := lp.New(name, m.tags, m.meta, map[string]interface{}{"value": 0}, time.Now())
if err == nil {
y.AddTag("device", device)
if strings.Contains(name, "byte") {
y.AddMeta("unit", "Byte")
}
output <- y
}
}
}
}
}
}


@@ -0,0 +1,29 @@
## `lustrestat` collector
```json
"lustrestat": {
"procfiles" : [
"/proc/fs/lustre/llite/lnec-XXXXXX/stats"
],
"exclude_metrics": [
"setattr",
"getattr"
]
}
```
The `lustrestat` collector queries Lustre client statistics via `lctl get_param llite.*.stats`. Since these statistics can only be read by user root, the collector initializes only when run as root.
Metrics:
* `lustre_read_bytes`
* `lustre_read_requests`
* `lustre_write_bytes`
* `lustre_write_requests`
* `lustre_open`
* `lustre_close`
* `lustre_getattr`
* `lustre_setattr`
* `lustre_statfs`
* `lustre_inode_permission`
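For reference, a shortened, illustrative `lctl get_param llite.*.stats` output as parsed by the collector; per matched line, field 1 holds the request count and, for the `*_bytes` lines, field 6 holds the byte sum:

```
llite.lnec-XXXXXX.stats=
read_bytes          126 samples [bytes] 4096 1048576 134217728
write_bytes          53 samples [bytes] 4096 1048576 67108864
open                 12 samples [regs]
close                12 samples [regs]
```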


@@ -10,7 +10,7 @@ import (
"strings"
"time"
lp "github.com/influxdata/line-protocol"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)
const MEMSTATFILE = `/proc/meminfo`
@@ -20,14 +20,14 @@ type MemstatCollectorConfig struct {
}
type MemstatCollector struct {
metricCollector
stats map[string]int64
tags map[string]string
matches map[string]string
config MemstatCollectorConfig
}
func (m *MemstatCollector) Init(config json.RawMessage) error {
var err error
m.name = "MemstatCollector"
if len(config) > 0 {
@@ -36,6 +36,7 @@ func (m *MemstatCollector) Init(config []byte) error {
return err
}
}
m.meta = map[string]string{"source": m.name, "group": "Memory", "unit": "kByte"}
m.stats = make(map[string]int64)
m.matches = make(map[string]string)
m.tags = map[string]string{"type": "node"}
@@ -65,7 +66,7 @@ func (m *MemstatCollector) Init(config []byte) error {
return err
}
func (m *MemstatCollector) Read(interval time.Duration, output chan lp.CCMetric) {
if !m.init {
return
}
@@ -93,13 +94,13 @@ func (m *MemstatCollector) Read(interval time.Duration, out *[]lp.MutableMetric)
for match, name := range m.matches {
if _, exists := m.stats[match]; !exists {
err = fmt.Errorf("Parse error for %s : %s", match, name)
log.Print(err)
continue
}
y, err := lp.New(name, m.tags, m.meta, map[string]interface{}{"value": int(float64(m.stats[match]) * 1.0e-3)}, time.Now())
if err == nil {
output <- y
}
}
@@ -108,18 +109,18 @@ func (m *MemstatCollector) Read(interval time.Duration, out *[]lp.MutableMetric)
if _, cached := m.stats[`Cached`]; cached {
memUsed := m.stats[`MemTotal`] - (m.stats[`MemFree`] + m.stats[`Buffers`] + m.stats[`Cached`])
_, skip := stringArrayContains(m.config.ExcludeMetrics, "mem_used")
y, err := lp.New("mem_used", m.tags, map[string]interface{}{"value": int(float64(memUsed) * 1.0e-3)}, time.Now())
y, err := lp.New("mem_used", m.tags, m.meta, map[string]interface{}{"value": int(float64(memUsed) * 1.0e-3)}, time.Now())
if err == nil && !skip {
output <- y
}
}
}
}
if _, found := m.stats[`MemShared`]; found {
_, skip := stringArrayContains(m.config.ExcludeMetrics, "mem_shared")
y, err := lp.New("mem_shared", m.tags, map[string]interface{}{"value": int(float64(m.stats[`MemShared`]) * 1.0e-3)}, time.Now())
y, err := lp.New("mem_shared", m.tags, m.meta, map[string]interface{}{"value": int(float64(m.stats[`MemShared`]) * 1.0e-3)}, time.Now())
if err == nil && !skip {
output <- y
}
}
}


@@ -0,0 +1,27 @@
## `memstat` collector
```json
"memstat": {
"exclude_metrics": [
"mem_used"
]
}
```
The `memstat` collector reads data from `/proc/meminfo` and outputs a handful of **node** metrics. If a metric is not required, it can be excluded from being forwarded to the sink.
Metrics:
* `mem_total`
* `mem_sreclaimable`
* `mem_slab`
* `mem_free`
* `mem_buffers`
* `mem_cached`
* `mem_available`
* `mem_shared`
* `swap_total`
* `swap_free`
* `mem_used` = `mem_total` - (`mem_free` + `mem_buffers` + `mem_cached`)
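As a worked example with illustrative values (all in kByte, as read from `/proc/meminfo`): with `MemTotal=65536000`, `MemFree=2048000`, `Buffers=512000` and `Cached=8192000`, the collector reports `mem_used = 65536000 - (2048000 + 512000 + 8192000) = 54784000`.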


@@ -1,40 +1,48 @@
package collectors
import (
"errors"
lp "github.com/influxdata/line-protocol"
"encoding/json"
"fmt"
"io/ioutil"
"log"
"strconv"
"strings"
"time"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)
type MetricCollector interface {
Name() string
Init(config json.RawMessage) error
Initialized() bool
Read(duration time.Duration, output chan lp.CCMetric)
Close()
}
type metricCollector struct {
name string
init bool
meta map[string]string
}
// Name() returns the name of the metric collector
func (c *metricCollector) Name() string {
return c.name
}
func (c *metricCollector) setup() error {
return nil
}
// Initialized() indicates whether the metric collector has been initialized.
func (c *metricCollector) Initialized() bool {
return c.init
}
// intArrayContains scans an array of ints for the given value.
// If the value is found, the corresponding array index is returned.
// The bool value signals success or failure.
func intArrayContains(array []int, str int) (int, bool) {
for i, a := range array {
if a == str {
@@ -44,6 +52,9 @@ func intArrayContains(array []int, str int) (int, bool) {
return -1, false
}
// stringArrayContains scans an array of strings for the given value str.
// If the value is found, the corresponding array index is returned.
// The bool value signals success or failure.
func stringArrayContains(array []string, str string) (int, bool) {
for i, a := range array {
if a == str {
@@ -103,27 +114,13 @@ func CpuList() []int {
return cpulist
}
// RemoveFromStringList removes the string r from the array of strings s
// If r is not contained in the array an error is returned
func RemoveFromStringList(s []string, r string) ([]string, error) {
for i, item := range s {
if r == item {
return append(s[:i], s[i+1:]...), nil
}
}
return s, fmt.Errorf("No such string in list")
}
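// A minimal sketch (editor's illustration, not part of the repository) of a new
// collector: embed metricCollector and implement Init, Read and Close; Name()
// and Initialized() are inherited. DummyCollector and its metric are made up.
type DummyCollector struct {
metricCollector
}
func (m *DummyCollector) Init(config json.RawMessage) error {
m.name = "DummyCollector"
m.meta = map[string]string{"source": m.name, "group": "Dummy"}
m.init = true
return nil
}
func (m *DummyCollector) Read(interval time.Duration, output chan lp.CCMetric) {
if !m.init {
return
}
// Emit a single constant node-level metric
y, err := lp.New("dummy", map[string]string{"type": "node"}, m.meta, map[string]interface{}{"value": 42}, time.Now())
if err == nil {
output <- y
}
}
func (m *DummyCollector) Close() {
m.init = false
}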


@@ -1,86 +1,138 @@
package collectors
import (
"bufio"
"encoding/json"
"io/ioutil"
"log"
"errors"
"os"
"strconv"
"strings"
"time"
lp "github.com/influxdata/line-protocol"
cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)
const NETSTATFILE = `/proc/net/dev`
type NetstatCollectorConfig struct {
ExcludeDevices []string `json:"exclude_devices"`
IncludeDevices []string `json:"include_devices"`
}
type NetstatCollectorMetric struct {
index int
lastValue float64
}
type NetstatCollector struct {
metricCollector
config NetstatCollectorConfig
matches map[string]map[string]NetstatCollectorMetric
devtags map[string]map[string]string
lastTimestamp time.Time
}
func (m *NetstatCollector) Init(config json.RawMessage) error {
m.name = "NetstatCollector"
m.setup()
m.lastTimestamp = time.Now()
m.meta = map[string]string{"source": m.name, "group": "Network"}
m.devtags = make(map[string]map[string]string)
nameIndexMap := map[string]int{
"net_bytes_in": 1,
"net_pkts_in": 2,
"net_bytes_out": 9,
"net_pkts_out": 10,
}
m.matches = make(map[string]map[string]NetstatCollectorMetric)
if len(config) > 0 {
err := json.Unmarshal(config, &m.config)
if err != nil {
cclog.ComponentError(m.name, "Error reading config:", err.Error())
return err
}
}
file, err := os.Open(string(NETSTATFILE))
if err != nil {
cclog.ComponentError(m.name, err.Error())
return err
}
defer file.Close()
scanner := bufio.NewScanner(file)
for scanner.Scan() {
l := scanner.Text()
if !strings.Contains(l, ":") {
continue
}
f := strings.Fields(l)
dev := strings.Trim(f[0], ": ")
if _, ok := stringArrayContains(m.config.IncludeDevices, dev); ok {
m.matches[dev] = make(map[string]NetstatCollectorMetric)
for name, idx := range nameIndexMap {
m.matches[dev][name] = NetstatCollectorMetric{
index: idx,
lastValue: 0,
}
}
m.devtags[dev] = map[string]string{"device": dev, "type": "node"}
}
}
if len(m.devtags) == 0 {
return errors.New("no devices to collector metrics found")
}
m.init = true
return nil
}
func (m *NetstatCollector) Read(interval time.Duration, output chan lp.CCMetric) {
if !m.init {
return
}
now := time.Now()
file, err := os.Open(string(NETSTATFILE))
if err != nil {
cclog.ComponentError(m.name, err.Error())
return
}
defer file.Close()
tdiff := now.Sub(m.lastTimestamp)
scanner := bufio.NewScanner(file)
for scanner.Scan() {
l := scanner.Text()
if !strings.Contains(l, ":") {
continue
}
f := strings.Fields(l)
dev := strings.Trim(f[0], ":")
if devmetrics, ok := m.matches[dev]; ok {
for name, data := range devmetrics {
v, err := strconv.ParseFloat(f[data.index], 64)
if err == nil {
vdiff := v - data.lastValue
value := vdiff / tdiff.Seconds()
if data.lastValue == 0 {
value = 0
}
data.lastValue = v
y, err := lp.New(name, m.devtags[dev], m.meta, map[string]interface{}{"value": value}, now)
if err == nil {
switch {
case strings.Contains(name, "byte"):
y.AddMeta("unit", "bytes/sec")
case strings.Contains(name, "pkt"):
y.AddMeta("unit", "packets/sec")
}
output <- y
}
devmetrics[name] = data
}
}
}
}
m.lastTimestamp = time.Now()
}
func (m *NetstatCollector) Close() {


@@ -0,0 +1,21 @@
## `netstat` collector
```json
"netstat": {
"include_devices": [
"eth0"
]
}
```
The `netstat` collector reads data from `/proc/net/dev` and outputs a handful of **node** metrics. The `include_devices` list specifies which network devices are measured. **Note**: Most other collectors use an _exclude_ list instead of an include list.
Metrics:
* `net_bytes_in` (`unit=bytes/sec`)
* `net_bytes_out` (`unit=bytes/sec`)
* `net_pkts_in` (`unit=packets/sec`)
* `net_pkts_out` (`unit=packets/sec`)
The device name is added as tag `device`.
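The metrics are reported as rates: for two readings `v_now` and `v_last` of a `/proc/net/dev` counter taken `t` seconds apart, the collector emits `(v_now - v_last) / t`. For example, 1500000 bytes received over a 10 s interval yield `net_bytes_in = 150000` (bytes/sec). The first reading after startup reports `0` because no previous value exists.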

collectors/nfs3Metric.md

@@ -0,0 +1,39 @@
## `nfs3stat` collector
```json
"nfs3stat": {
"nfsstat" : "/path/to/nfsstat",
"exclude_metrics": [
"nfs3_total"
]
}
```
The `nfs3stat` collector reads data from the `nfsstat` command and outputs a handful of **node** metrics. If a metric is not required, it can be excluded from being forwarded to the sink. It is currently not possible to get the metrics per mount point.
Metrics:
* `nfs3_total`
* `nfs3_null`
* `nfs3_getattr`
* `nfs3_setattr`
* `nfs3_lookup`
* `nfs3_access`
* `nfs3_readlink`
* `nfs3_read`
* `nfs3_write`
* `nfs3_create`
* `nfs3_mkdir`
* `nfs3_symlink`
* `nfs3_remove`
* `nfs3_rmdir`
* `nfs3_rename`
* `nfs3_link`
* `nfs3_readdir`
* `nfs3_readdirplus`
* `nfs3_fsstat`
* `nfs3_fsinfo`
* `nfs3_pathconf`
* `nfs3_commit`
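For reference, the collector parses the list output of `nfsstat -l`; a shortened, illustrative sample of the five-column lines it matches:

```
nfs v3 client        total:      418
nfs v3 client         read:      121
nfs v3 client        write:       75
```

The second column selects the NFS version, the fourth the metric name, and the fifth the counter value.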

collectors/nfs4Metric.md

@@ -0,0 +1,62 @@
## `nfs4stat` collector
```json
"nfs4stat": {
"nfsstat" : "/path/to/nfsstat",
"exclude_metrics": [
"nfs4_total"
]
}
```
The `nfs4stat` collector reads data from the `nfsstat` command and outputs a handful of **node** metrics. If a metric is not required, it can be excluded from being forwarded to the sink. It is currently not possible to get the metrics per mount point.
Metrics:
* `nfs4_total`
* `nfs4_null`
* `nfs4_read`
* `nfs4_write`
* `nfs4_commit`
* `nfs4_open`
* `nfs4_open_conf`
* `nfs4_open_noat`
* `nfs4_open_dgrd`
* `nfs4_close`
* `nfs4_setattr`
* `nfs4_fsinfo`
* `nfs4_renew`
* `nfs4_setclntid`
* `nfs4_confirm`
* `nfs4_lock`
* `nfs4_lockt`
* `nfs4_locku`
* `nfs4_access`
* `nfs4_getattr`
* `nfs4_lookup`
* `nfs4_lookup_root`
* `nfs4_remove`
* `nfs4_rename`
* `nfs4_link`
* `nfs4_symlink`
* `nfs4_create`
* `nfs4_pathconf`
* `nfs4_statfs`
* `nfs4_readlink`
* `nfs4_readdir`
* `nfs4_server_caps`
* `nfs4_delegreturn`
* `nfs4_getacl`
* `nfs4_setacl`
* `nfs4_rel_lkowner`
* `nfs4_exchange_id`
* `nfs4_create_session`
* `nfs4_destroy_session`
* `nfs4_sequence`
* `nfs4_get_lease_time`
* `nfs4_reclaim_comp`
* `nfs4_secinfo_no`
* `nfs4_bind_conn_to_ses`

collectors/nfsMetric.go

@@ -0,0 +1,174 @@
package collectors
import (
"encoding/json"
"fmt"
"log"
// "os"
"os/exec"
"strconv"
"strings"
"time"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)
// The first part contains the code for the general nfsCollector.
// Below, the general nfsCollector is specialized into Nfs3Collector and Nfs4Collector.
const NFSSTAT_EXEC = `nfsstat`
type NfsCollectorData struct {
current int64
last int64
}
type nfsCollector struct {
metricCollector
tags map[string]string
version string
config struct {
Nfsstats string `json:"nfsstat"`
ExcludeMetrics []string `json:"exclude_metrics,omitempty"`
}
data map[string]NfsCollectorData
}
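// initStats runs `nfsstat -l` once and records the current counter values.
// Assumed line format (five whitespace-separated fields): "nfs <version> client <name>: <value>".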
func (m *nfsCollector) initStats() error {
cmd := exec.Command(m.config.Nfsstats, `-l`)
buffer, err := cmd.Output()
if err == nil {
for _, line := range strings.Split(string(buffer), "\n") {
lf := strings.Fields(line)
if len(lf) != 5 {
continue
}
if lf[1] == m.version {
name := strings.Trim(lf[3], ":")
if _, exist := m.data[name]; !exist {
value, err := strconv.ParseInt(lf[4], 0, 64)
if err == nil {
x := m.data[name]
x.current = value
x.last = 0
m.data[name] = x
}
}
}
}
}
return err
}
func (m *nfsCollector) updateStats() error {
cmd := exec.Command(m.config.Nfsstats, `-l`)
buffer, err := cmd.Output()
if err == nil {
for _, line := range strings.Split(string(buffer), "\n") {
lf := strings.Fields(line)
if len(lf) != 5 {
continue
}
if lf[1] == m.version {
name := strings.Trim(lf[3], ":")
if _, exist := m.data[name]; exist {
value, err := strconv.ParseInt(lf[4], 0, 64)
if err == nil {
x := m.data[name]
x.last = x.current
x.current = value
m.data[name] = x
}
}
}
}
}
return err
}
func (m *nfsCollector) MainInit(config json.RawMessage) error {
m.config.Nfsstats = string(NFSSTAT_EXEC)
// Read JSON configuration
if len(config) > 0 {
err := json.Unmarshal(config, &m.config)
if err != nil {
log.Print(err.Error())
return err
}
}
m.meta = map[string]string{
"source": m.name,
"group": "NFS",
}
m.tags = map[string]string{
"type": "node",
}
// Check if nfsstat is in executable search path
_, err := exec.LookPath(m.config.Nfsstats)
if err != nil {
return fmt.Errorf("NfsCollector.Init(): Failed to find nfsstat binary '%s': %v", m.config.Nfsstats, err)
}
m.data = make(map[string]NfsCollectorData)
m.initStats()
m.init = true
return nil
}
func (m *nfsCollector) Read(interval time.Duration, output chan lp.CCMetric) {
if !m.init {
return
}
timestamp := time.Now()
m.updateStats()
prefix := ""
switch m.version {
case "v3":
prefix = "nfs3"
case "v4":
prefix = "nfs4"
default:
prefix = "nfs"
}
for name, data := range m.data {
if _, skip := stringArrayContains(m.config.ExcludeMetrics, name); skip {
continue
}
value := data.current - data.last
y, err := lp.New(fmt.Sprintf("%s_%s", prefix, name), m.tags, m.meta, map[string]interface{}{"value": value}, timestamp)
if err == nil {
y.AddMeta("version", m.version)
output <- y
}
}
}
func (m *nfsCollector) Close() {
m.init = false
}
type Nfs3Collector struct {
nfsCollector
}
type Nfs4Collector struct {
nfsCollector
}
func (m *Nfs3Collector) Init(config json.RawMessage) error {
m.name = "Nfs3Collector"
m.version = `v3`
m.setup()
return m.MainInit(config)
}
func (m *Nfs4Collector) Init(config json.RawMessage) error {
m.name = "Nfs4Collector"
m.version = `v4`
m.setup()
return m.MainInit(config)
}


@@ -2,15 +2,16 @@ package collectors
import (
"bufio"
"encoding/json"
"fmt"
"log"
"os"
"path/filepath"
"strconv"
"strings"
"time"
lp "github.com/influxdata/line-protocol"
cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)
//
@@ -42,11 +43,11 @@ type NUMAStatsCollectorTopolgy struct {
}
type NUMAStatsCollector struct {
metricCollector
topology []NUMAStatsCollectorTopolgy
}
func (m *NUMAStatsCollector) Init(config json.RawMessage) error {
// Check if already initialized
if m.init {
return nil
@@ -54,25 +55,29 @@ func (m *NUMAStatsCollector) Init(config []byte) error {
m.name = "NUMAStatsCollector"
m.setup()
m.meta = map[string]string{
"source": m.name,
"group": "NUMA",
}
// Loop for all NUMA node directories
baseDir := "/sys/devices/system/node"
globPattern := filepath.Join(baseDir, "node[0-9]*")
base := "/sys/devices/system/node/node"
globPattern := base + "[0-9]*"
dirs, err := filepath.Glob(globPattern)
if err != nil {
return fmt.Errorf("unable to glob files with pattern %s", globPattern)
return fmt.Errorf("unable to glob files with pattern '%s'", globPattern)
}
if dirs == nil {
return fmt.Errorf("unable to find any files with pattern %s", globPattern)
return fmt.Errorf("unable to find any files with pattern '%s'", globPattern)
}
m.topology = make([]NUMAStatsCollectorTopolgy, 0, len(dirs))
for _, dir := range dirs {
node := strings.TrimPrefix(dir, "/sys/devices/system/node/node")
node := strings.TrimPrefix(dir, base)
file := filepath.Join(dir, "numastat")
m.topology = append(m.topology,
NUMAStatsCollectorTopolgy{
file: file,
tagSet: map[string]string{"domain": node},
tagSet: map[string]string{"memoryDomain": node},
})
}
@@ -80,7 +85,7 @@ func (m *NUMAStatsCollector) Init(config []byte) error {
return nil
}
func (m *NUMAStatsCollector) Read(interval time.Duration, output chan lp.CCMetric) {
if !m.init {
return
}
@@ -92,9 +97,14 @@ func (m *NUMAStatsCollector) Read(interval time.Duration, out *[]lp.MutableMetri
now := time.Now()
file, err := os.Open(t.file)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to open file '%s': %v", t.file, err))
return
}
scanner := bufio.NewScanner(file)
// Read line by line
for scanner.Scan() {
split := strings.Fields(scanner.Text())
if len(split) != 2 {
@@ -103,12 +113,20 @@ func (m *NUMAStatsCollector) Read(interval time.Duration, out *[]lp.MutableMetri
key := split[0]
value, err := strconv.ParseInt(split[1], 10, 64)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to convert %s='%s' to int64: %v", key, split[1], err))
continue
}
y, err := lp.New("numastats_"+key, t.tagSet, map[string]interface{}{"value": value}, now)
y, err := lp.New(
"numastats_"+key,
t.tagSet,
m.meta,
map[string]interface{}{"value": value},
now,
)
if err == nil {
output <- y
}
}


@@ -0,0 +1,15 @@
## `numastat` collector
```json
"numastat": {}
```
The `numastat` collector reads data from `/sys/devices/system/node/node*/numastat` and outputs a handful of **memoryDomain** metrics. See: https://www.kernel.org/doc/html/latest/admin-guide/numastat.html
Metrics:
* `numastats_numa_hit`: A process wanted to allocate memory from this node, and succeeded.
* `numastats_numa_miss`: A process wanted to allocate memory from another node, but ended up with memory from this node.
* `numastats_numa_foreign`: A process wanted to allocate on this node, but ended up with memory from another node.
* `numastats_local_node`: A process ran on this node's CPU, and got memory from this node.
* `numastats_other_node`: A process ran on a different node's CPU, and got memory from this node.
* `numastats_interleave_hit`: Interleaving wanted to allocate from this node and succeeded.
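For illustration, a `numastat` file contains one `<key> <value>` pair per line, e.g.:

```
numa_hit 98218976
numa_miss 0
numa_foreign 0
interleave_hit 58407
local_node 98175286
other_node 43690
```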


@@ -7,19 +7,28 @@ import (
"log"
"time"
cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
"github.com/NVIDIA/go-nvml/pkg/nvml"
lp "github.com/influxdata/line-protocol"
)
type NvidiaCollectorConfig struct {
ExcludeMetrics []string `json:"exclude_metrics,omitempty"`
ExcludeDevices []string `json:"exclude_devices,omitempty"`
AddPciInfoTag bool `json:"add_pci_info_tag,omitempty"`
}
type NvidiaCollectorDevice struct {
device nvml.Device
excludeMetrics map[string]bool
tags map[string]string
}
type NvidiaCollector struct {
metricCollector
num_gpus int
config NvidiaCollectorConfig
gpus []NvidiaCollectorDevice
}
func (m *NvidiaCollector) CatchPanic() {
@@ -29,9 +38,10 @@ func (m *NvidiaCollector) CatchPanic() {
}
}
func (m *NvidiaCollector) Init(config json.RawMessage) error {
var err error
m.name = "NvidiaCollector"
m.config.AddPciInfoTag = false
m.setup()
if len(config) > 0 {
err = json.Unmarshal(config, &m.config)
@@ -39,224 +49,415 @@ func (m *NvidiaCollector) Init(config []byte) error {
return err
}
}
m.meta = map[string]string{
"source": m.name,
"group": "Nvidia",
}
m.num_gpus = 0
defer m.CatchPanic()
// Initialize NVIDIA Management Library (NVML)
ret := nvml.Init()
if ret != nvml.SUCCESS {
err = errors.New(nvml.ErrorString(ret))
cclog.ComponentError(m.name, "Unable to initialize NVML", err.Error())
return err
}
// Number of NVIDIA GPUs
num_gpus, ret := nvml.DeviceGetCount()
if ret != nvml.SUCCESS {
err = errors.New(nvml.ErrorString(ret))
cclog.ComponentError(m.name, "Unable to get device count", err.Error())
return err
}
// For all GPUs
m.gpus = make([]NvidiaCollectorDevice, num_gpus)
for i := 0; i < num_gpus; i++ {
g := &m.gpus[i]
// Skip excluded devices
str_i := fmt.Sprintf("%d", i)
if _, skip := stringArrayContains(m.config.ExcludeDevices, str_i); skip {
continue
}
// Get device handle
device, ret := nvml.DeviceGetHandleByIndex(i)
if ret != nvml.SUCCESS {
err = errors.New(nvml.ErrorString(ret))
cclog.ComponentError(m.name, "Unable to get device at index", i, ":", err.Error())
return err
}
g.device = device
// Add tags
g.tags = map[string]string{
"type": "accelerator",
"type-id": str_i,
}
// Add excluded metrics
g.excludeMetrics = map[string]bool{}
for _, e := range m.config.ExcludeMetrics {
g.excludeMetrics[e] = true
}
// Add PCI info as tag
if m.config.AddPciInfoTag {
pciInfo, ret := nvml.DeviceGetPciInfo(g.device)
if ret != nvml.SUCCESS {
err = errors.New(nvml.ErrorString(ret))
cclog.ComponentError(m.name, "Unable to get PCI info for device at index", i, ":", err.Error())
return err
}
g.tags["pci_identifier"] = fmt.Sprintf(
"%08X:%02X:%02X.0",
pciInfo.Domain,
pciInfo.Bus,
pciInfo.Device)
}
}
m.init = true
return nil
}
func (m *NvidiaCollector) Read(interval time.Duration, output chan lp.CCMetric) {
if !m.init {
return
}
for i := 0; i < m.num_gpus; i++ {
device, ret := nvml.DeviceGetHandleByIndex(i)
if ret != nvml.SUCCESS {
log.Fatalf("Unable to get device at index %d: %v", i, nvml.ErrorString(ret))
return
}
_, skip := stringArrayContains(m.config.ExcludeDevices, fmt.Sprintf("%d", i))
if skip {
continue
}
tags := map[string]string{"type": "accelerator", "type-id": fmt.Sprintf("%d", i)}
util, ret := nvml.DeviceGetUtilizationRates(device)
if ret == nvml.SUCCESS {
_, skip = stringArrayContains(m.config.ExcludeMetrics, "util")
y, err := lp.New("util", tags, map[string]interface{}{"value": float64(util.Gpu)}, time.Now())
if err == nil && !skip {
*out = append(*out, y)
}
_, skip = stringArrayContains(m.config.ExcludeMetrics, "mem_util")
y, err = lp.New("mem_util", tags, map[string]interface{}{"value": float64(util.Memory)}, time.Now())
if err == nil && !skip {
*out = append(*out, y)
for i := range m.gpus {
device := &m.gpus[i]
if !device.excludeMetrics["nv_util"] || !device.excludeMetrics["nv_mem_util"] {
// Retrieves the current utilization rates for the device's major subsystems.
//
// Available utilization rates
// * Gpu: Percent of time over the past sample period during which one or more kernels was executing on the GPU.
// * Memory: Percent of time over the past sample period during which global (device) memory was being read or written
//
// Note:
// * During driver initialization when ECC is enabled one can see high GPU and Memory Utilization readings.
// This is caused by ECC Memory Scrubbing mechanism that is performed during driver initialization.
// * On MIG-enabled GPUs, querying device utilization rates is not currently supported.
util, ret := nvml.DeviceGetUtilizationRates(device.device)
if ret == nvml.SUCCESS {
if !device.excludeMetrics["nv_util"] {
y, err := lp.New("nv_util", device.tags, m.meta, map[string]interface{}{"value": float64(util.Gpu)}, time.Now())
if err == nil {
y.AddMeta("unit", "%")
output <- y
}
}
if !device.excludeMetrics["nv_mem_util"] {
y, err := lp.New("nv_mem_util", device.tags, m.meta, map[string]interface{}{"value": float64(util.Memory)}, time.Now())
if err == nil {
y.AddMeta("unit", "%")
output <- y
}
}
}
}
if !device.excludeMetrics["nv_mem_total"] || !device.excludeMetrics["nv_fb_memory"] {
// Retrieves the amount of used, free and total memory available on the device, in bytes.
//
// Enabling ECC reduces the amount of total available memory, due to the extra required parity bits.
//
// The reported amount of used memory is equal to the sum of memory allocated by all active channels on the device.
//
// Available memory info:
// * Free: Unallocated FB memory (in bytes).
// * Total: Total installed FB memory (in bytes).
// * Used: Allocated FB memory (in bytes). Note that the driver/GPU always sets aside a small amount of memory for bookkeeping.
//
// Note:
// In MIG mode, if device handle is provided, the API returns aggregate information, only if the caller has appropriate privileges.
// Per-instance information can be queried by using specific MIG device handles.
meminfo, ret := nvml.DeviceGetMemoryInfo(device.device)
if ret == nvml.SUCCESS {
if !device.excludeMetrics["nv_mem_total"] {
t := float64(meminfo.Total) / (1024 * 1024)
y, err := lp.New("nv_mem_total", device.tags, m.meta, map[string]interface{}{"value": t}, time.Now())
if err == nil {
y.AddMeta("unit", "MByte")
output <- y
}
}
if !device.excludeMetrics["nv_fb_memory"] {
f := float64(meminfo.Used) / (1024 * 1024)
y, err := lp.New("nv_fb_memory", device.tags, m.meta, map[string]interface{}{"value": f}, time.Now())
if err == nil {
y.AddMeta("unit", "MByte")
output <- y
}
}
}
}
if !device.excludeMetrics["nv_temp"] {
// Retrieves the current temperature readings for the device, in degrees C.
//
// Available temperature sensors:
// * TEMPERATURE_GPU: Temperature sensor for the GPU die.
// * NVML_TEMPERATURE_COUNT
temp, ret := nvml.DeviceGetTemperature(device.device, nvml.TEMPERATURE_GPU)
if ret == nvml.SUCCESS {
y, err := lp.New("nv_temp", device.tags, m.meta, map[string]interface{}{"value": float64(temp)}, time.Now())
if err == nil {
y.AddMeta("unit", "degC")
output <- y
}
}
}
if !device.excludeMetrics["nv_fan"] {
// Retrieves the intended operating speed of the device's fan.
//
// Note: The reported speed is the intended fan speed.
// If the fan is physically blocked and unable to spin, the output will not match the actual fan speed.
//
// For all discrete products with dedicated fans.
//
// The fan speed is expressed as a percentage of the product's maximum noise tolerance fan speed.
// This value may exceed 100% in certain cases.
fan, ret := nvml.DeviceGetFanSpeed(device.device)
if ret == nvml.SUCCESS {
y, err := lp.New("nv_fan", device.tags, m.meta, map[string]interface{}{"value": float64(fan)}, time.Now())
if err == nil {
y.AddMeta("unit", "%")
output <- y
}
}
}
if !device.excludeMetrics["nv_ecc_mode"] {
// Retrieves the current and pending ECC modes for the device.
//
// For Fermi or newer fully supported devices. Only applicable to devices with ECC.
// Requires NVML_INFOROM_ECC version 1.0 or higher.
//
// Changing ECC modes requires a reboot.
// The "pending" ECC mode refers to the target mode following the next reboot.
_, ecc_pend, ret := nvml.DeviceGetEccMode(device.device)
if ret == nvml.SUCCESS {
var y lp.CCMetric
var err error
switch ecc_pend {
case nvml.FEATURE_DISABLED:
y, err = lp.New("nv_ecc_mode", device.tags, m.meta, map[string]interface{}{"value": "OFF"}, time.Now())
case nvml.FEATURE_ENABLED:
y, err = lp.New("nv_ecc_mode", device.tags, m.meta, map[string]interface{}{"value": "ON"}, time.Now())
default:
y, err = lp.New("nv_ecc_mode", device.tags, m.meta, map[string]interface{}{"value": "UNKNOWN"}, time.Now())
}
if err == nil {
output <- y
}
} else if ret == nvml.ERROR_NOT_SUPPORTED {
y, err := lp.New("nv_ecc_mode", device.tags, m.meta, map[string]interface{}{"value": "N/A"}, time.Now())
if err == nil {
output <- y
}
}
}
if !device.excludeMetrics["nv_perf_state"] {
// Retrieves the current performance state for the device.
//
// Allowed PStates:
// 0: Maximum Performance.
// ..
// 15: Minimum Performance.
// 32: Unknown performance state.
pState, ret := nvml.DeviceGetPerformanceState(device.device)
if ret == nvml.SUCCESS {
y, err := lp.New("nv_perf_state", device.tags, m.meta, map[string]interface{}{"value": fmt.Sprintf("P%d", int(pState))}, time.Now())
if err == nil {
output <- y
}
}
}
if !device.excludeMetrics["nv_power_usage_report"] {
// Retrieves power usage for this GPU in milliwatts and its associated circuitry (e.g. memory)
//
// On Fermi and Kepler GPUs the reading is accurate to within +/- 5% of current power draw.
//
// It is only available if power management mode is supported
power, ret := nvml.DeviceGetPowerUsage(device.device)
if ret == nvml.SUCCESS {
y, err := lp.New("nv_power_usage_report", device.tags, m.meta, map[string]interface{}{"value": float64(power) / 1000}, time.Now())
if err == nil {
y.AddMeta("unit", "watts")
output <- y
}
}
}
// Retrieves the current clock speeds for the device.
//
// Available clock information:
// * CLOCK_GRAPHICS: Graphics clock domain.
// * CLOCK_SM: Streaming Multiprocessor clock domain.
// * CLOCK_MEM: Memory clock domain.
if !device.excludeMetrics["nv_graphics_clock_report"] {
graphicsClock, ret := nvml.DeviceGetClockInfo(device.device, nvml.CLOCK_GRAPHICS)
if ret == nvml.SUCCESS {
y, err := lp.New("nv_graphics_clock_report", device.tags, m.meta, map[string]interface{}{"value": float64(graphicsClock)}, time.Now())
if err == nil {
y.AddMeta("unit", "MHz")
output <- y
}
}
}
if !device.excludeMetrics["nv_sm_clock_report"] {
smClock, ret := nvml.DeviceGetClockInfo(device.device, nvml.CLOCK_SM)
if ret == nvml.SUCCESS {
y, err := lp.New("nv_sm_clock_report", device.tags, m.meta, map[string]interface{}{"value": float64(smClock)}, time.Now())
if err == nil {
y.AddMeta("unit", "MHz")
output <- y
}
}
}
if !device.excludeMetrics["nv_mem_clock_report"] {
memClock, ret := nvml.DeviceGetClockInfo(device.device, nvml.CLOCK_MEM)
if ret == nvml.SUCCESS {
y, err := lp.New("nv_mem_clock_report", device.tags, m.meta, map[string]interface{}{"value": float64(memClock)}, time.Now())
if err == nil {
y.AddMeta("unit", "MHz")
output <- y
}
}
}
// Retrieves the maximum clock speeds for the device.
//
// Available clock information:
// * CLOCK_GRAPHICS: Graphics clock domain.
// * CLOCK_SM: Streaming multiprocessor clock domain.
// * CLOCK_MEM: Memory clock domain.
// * CLOCK_VIDEO: Video encoder/decoder clock domain.
// * CLOCK_COUNT: Count of clock types.
//
// Note:
// On GPUs from the Fermi family, current P0 clocks (reported by nvmlDeviceGetClockInfo) can differ from max clocks by a few MHz.
if !device.excludeMetrics["nv_max_graphics_clock"] {
max_gclk, ret := nvml.DeviceGetMaxClockInfo(device.device, nvml.CLOCK_GRAPHICS)
if ret == nvml.SUCCESS {
y, err := lp.New("nv_max_graphics_clock", device.tags, m.meta, map[string]interface{}{"value": float64(max_gclk)}, time.Now())
if err == nil {
y.AddMeta("unit", "MHz")
output <- y
}
}
}
if !device.excludeMetrics["nv_max_sm_clock"] {
maxSmClock, ret := nvml.DeviceGetMaxClockInfo(device.device, nvml.CLOCK_SM)
if ret == nvml.SUCCESS {
y, err := lp.New("nv_max_sm_clock", device.tags, m.meta, map[string]interface{}{"value": float64(maxSmClock)}, time.Now())
if err == nil {
y.AddMeta("unit", "MHz")
output <- y
}
}
}
if !device.excludeMetrics["nv_max_mem_clock"] {
maxMemClock, ret := nvml.DeviceGetMaxClockInfo(device.device, nvml.CLOCK_MEM)
if ret == nvml.SUCCESS {
y, err := lp.New("nv_max_mem_clock", device.tags, m.meta, map[string]interface{}{"value": float64(maxMemClock)}, time.Now())
if err == nil {
y.AddMeta("unit", "MHz")
output <- y
}
}
}
if !device.excludeMetrics["nv_ecc_db_error"] {
// Retrieves the total ECC error counts for the device.
//
// For Fermi or newer fully supported devices.
// Only applicable to devices with ECC.
// Requires NVML_INFOROM_ECC version 1.0 or higher.
// Requires ECC Mode to be enabled.
//
// The total error count is the sum of errors across each of the separate memory systems,
// i.e. the total set of errors across the entire device.
ecc_db, ret := nvml.DeviceGetTotalEccErrors(device.device, nvml.MEMORY_ERROR_TYPE_UNCORRECTED, nvml.AGGREGATE_ECC)
if ret == nvml.SUCCESS {
y, err := lp.New("nv_ecc_db_error", device.tags, m.meta, map[string]interface{}{"value": float64(ecc_db)}, time.Now())
if err == nil {
output <- y
}
}
}
if !device.excludeMetrics["nv_ecc_sb_error"] {
ecc_sb, ret := nvml.DeviceGetTotalEccErrors(device.device, nvml.MEMORY_ERROR_TYPE_CORRECTED, nvml.AGGREGATE_ECC)
if ret == nvml.SUCCESS {
y, err := lp.New("nv_ecc_sb_error", device.tags, m.meta, map[string]interface{}{"value": float64(ecc_sb)}, time.Now())
if err == nil {
output <- y
}
}
}
if !device.excludeMetrics["nv_power_man_limit"] {
// Retrieves the power management limit associated with this device.
//
// For Fermi or newer fully supported devices.
//
// The power limit defines the upper boundary for the card's power draw.
// If the card's total power draw reaches this limit the power management algorithm kicks in.
pwr_limit, ret := nvml.DeviceGetPowerManagementLimit(device.device)
if ret == nvml.SUCCESS {
y, err := lp.New("nv_power_man_limit", device.tags, m.meta, map[string]interface{}{"value": float64(pwr_limit) / 1000}, time.Now())
if err == nil {
y.AddMeta("unit", "watts")
output <- y
}
}
}
if !device.excludeMetrics["nv_encoder_util"] {
// Retrieves the current utilization and sampling size in microseconds for the Encoder
//
// For Kepler or newer fully supported devices.
//
// Note: On MIG-enabled GPUs, querying encoder utilization is not currently supported.
enc_util, _, ret := nvml.DeviceGetEncoderUtilization(device.device)
if ret == nvml.SUCCESS {
y, err := lp.New("nv_encoder_util", device.tags, m.meta, map[string]interface{}{"value": float64(enc_util)}, time.Now())
if err == nil {
y.AddMeta("unit", "%")
output <- y
}
}
}
if !device.excludeMetrics["nv_decoder_util"] {
// Retrieves the current utilization and sampling size in microseconds for the Decoder
//
// For Kepler or newer fully supported devices.
//
// Note: On MIG-enabled GPUs, querying decoder utilization is not currently supported.
dec_util, _, ret := nvml.DeviceGetDecoderUtilization(device.device)
if ret == nvml.SUCCESS {
y, err := lp.New("nv_decoder_util", device.tags, m.meta, map[string]interface{}{"value": float64(dec_util)}, time.Now())
if err == nil {
y.AddMeta("unit", "%")
output <- y
}
}
}
}


@@ -0,0 +1,40 @@
## `nvidia` collector
```json
"nvidia": {
"exclude_devices" : [
"0","1"
],
"exclude_metrics": [
"nv_fb_memory",
"nv_fan"
]
}
```
Metrics:
* `nv_util`
* `nv_mem_util`
* `nv_mem_total`
* `nv_fb_memory`
* `nv_temp`
* `nv_fan`
* `nv_ecc_mode`
* `nv_perf_state`
* `nv_power_usage_report`
* `nv_graphics_clock_report`
* `nv_sm_clock_report`
* `nv_mem_clock_report`
* `nv_max_graphics_clock`
* `nv_max_sm_clock`
* `nv_max_mem_clock`
* `nv_ecc_db_error`
* `nv_ecc_sb_error`
* `nv_power_man_limit`
* `nv_encoder_util`
* `nv_decoder_util`
It uses a separate `type` tag (`accelerator`) in the metrics. The output metric looks like this:
`<name>,type=accelerator,type-id=<nvidia-gpu-id> value=<metric value> <timestamp>`


@@ -4,104 +4,227 @@ import (
"encoding/json"
"fmt"
"io/ioutil"
"os"
"path/filepath"
"strconv"
"strings"
"time"
lp "github.com/influxdata/line-protocol"
cclog "github.com/ClusterCockpit/cc-metric-collector/internal/ccLogger"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)
const HWMON_PATH = `/sys/class/hwmon`
// See: https://www.kernel.org/doc/html/latest/hwmon/sysfs-interface.html
// /sys/class/hwmon/hwmon*/name -> coretemp
// /sys/class/hwmon/hwmon*/temp*_label -> Core 0
// /sys/class/hwmon/hwmon*/temp*_input -> 27800 = 27.8°C
// /sys/class/hwmon/hwmon*/temp*_max -> 86000 = 86.0°C
// /sys/class/hwmon/hwmon*/temp*_crit -> 100000 = 100.0°C
type TempCollectorSensor struct {
name string
label string
metricName string // Default: name_label
file string
maxTempName string
maxTemp int64
critTempName string
critTemp int64
tags map[string]string
}
type TempCollector struct {
metricCollector
config struct {
ExcludeMetrics []string `json:"exclude_metrics"`
TagOverride map[string]map[string]string `json:"tag_override"`
ReportMaxTemp bool `json:"report_max_temperature"`
ReportCriticalTemp bool `json:"report_critical_temperature"`
}
sensors []*TempCollectorSensor
}
func (m *TempCollector) Init(config json.RawMessage) error {
// Check if already initialized
if m.init {
return nil
}
m.name = "TempCollector"
m.setup()
if len(config) > 0 {
err := json.Unmarshal(config, &m.config)
if err != nil {
return err
}
}
m.meta = map[string]string{
"source": m.name,
"group": "IPMI",
"unit": "degC",
}
m.sensors = make([]*TempCollectorSensor, 0)
// Find all temperature sensor files
globPattern := filepath.Join("/sys/class/hwmon", "*", "temp*_input")
inputFiles, err := filepath.Glob(globPattern)
if err != nil {
return fmt.Errorf("Unable to glob files with pattern '%s': %v", globPattern, err)
}
if inputFiles == nil {
return fmt.Errorf("Unable to find any files with pattern '%s'", globPattern)
}
// Get sensor name for each temperature sensor file
for _, file := range inputFiles {
sensor := new(TempCollectorSensor)
// sensor name
nameFile := filepath.Join(filepath.Dir(file), "name")
name, err := ioutil.ReadFile(nameFile)
if err == nil {
sensor.name = strings.TrimSpace(string(name))
}
// sensor label
labelFile := strings.TrimSuffix(file, "_input") + "_label"
label, err := ioutil.ReadFile(labelFile)
if err == nil {
sensor.label = strings.TrimSpace(string(label))
}
// sensor metric name
switch {
case len(sensor.name) == 0 && len(sensor.label) == 0:
continue
case sensor.name == "coretemp" && strings.HasPrefix(sensor.label, "Core ") ||
sensor.name == "coretemp" && strings.HasPrefix(sensor.label, "Package id "):
sensor.metricName = "temp_" + sensor.label
case len(sensor.name) != 0 && len(sensor.label) != 0:
sensor.metricName = sensor.name + "_" + sensor.label
case len(sensor.name) != 0:
sensor.metricName = sensor.name
case len(sensor.label) != 0:
sensor.metricName = sensor.label
}
sensor.metricName = strings.ToLower(sensor.metricName)
sensor.metricName = strings.Replace(sensor.metricName, " ", "_", -1)
// Add temperature prefix, if required
if !strings.Contains(sensor.metricName, "temp") {
sensor.metricName = "temp_" + sensor.metricName
}
// Sensor file
sensor.file = file
// Sensor tags
sensor.tags = map[string]string{
"type": "node",
}
// Apply tag override configuration
for key, newtags := range m.config.TagOverride {
if strings.Contains(sensor.file, key) {
sensor.tags = newtags
break
}
}
// max temperature
if m.config.ReportMaxTemp {
maxTempFile := strings.TrimSuffix(file, "_input") + "_max"
if buffer, err := ioutil.ReadFile(maxTempFile); err == nil {
if x, err := strconv.ParseInt(strings.TrimSpace(string(buffer)), 10, 64); err == nil {
sensor.maxTempName = strings.Replace(sensor.metricName, "temp", "max_temp", 1)
sensor.maxTemp = x / 1000
}
}
}
// critical temperature
if m.config.ReportCriticalTemp {
criticalTempFile := strings.TrimSuffix(file, "_input") + "_crit"
if buffer, err := ioutil.ReadFile(criticalTempFile); err == nil {
if x, err := strconv.ParseInt(strings.TrimSpace(string(buffer)), 10, 64); err == nil {
sensor.critTempName = strings.Replace(sensor.metricName, "temp", "crit_temp", 1)
sensor.critTemp = x / 1000
}
}
}
m.sensors = append(m.sensors, sensor)
}
// Empty sensors map
if len(m.sensors) == 0 {
return fmt.Errorf("No temperature sensors found")
}
// Finished initialization
m.init = true
return nil
}
func (m *TempCollector) Read(interval time.Duration, output chan lp.CCMetric) {
for _, sensor := range m.sensors {
// Read sensor file
buffer, err := ioutil.ReadFile(sensor.file)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to read file '%s': %v", sensor.file, err))
continue
}
x, err := strconv.ParseInt(strings.TrimSpace(string(buffer)), 10, 64)
if err != nil {
cclog.ComponentError(
m.name,
fmt.Sprintf("Read(): Failed to convert temperature '%s' to int64: %v", buffer, err))
continue
}
x /= 1000
y, err := lp.New(
sensor.metricName,
sensor.tags,
m.meta,
map[string]interface{}{"value": x},
time.Now(),
)
if err == nil {
output <- y
}
// max temperature
if m.config.ReportMaxTemp && sensor.maxTemp != 0 {
y, err := lp.New(
sensor.maxTempName,
sensor.tags,
m.meta,
map[string]interface{}{"value": sensor.maxTemp},
time.Now(),
)
if err == nil {
output <- y
}
}
// critical temperature
if m.config.ReportCriticalTemp && sensor.critTemp != 0 {
y, err := lp.New(
sensor.critTempName,
sensor.tags,
m.meta,
map[string]interface{}{"value": sensor.critTemp},
time.Now(),
)
if err == nil {
output <- y
}
}
}
}
func (m *TempCollector) Close() {

collectors/tempMetric.md

@@ -0,0 +1,22 @@
## `tempstat` collector
```json
"tempstat": {
"tag_override" : {
"<device like hwmon1>" : {
"type" : "socket",
"type-id" : "0"
}
},
"exclude_metrics": [
"metric1",
"metric2"
]
}
```
The `tempstat` collector reads the temperature data from `/sys/class/hwmon/<device>/tempX_{input,label}`.
Metrics:
* `temp_*`: The metric name is taken from the `label` files.
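For illustration, a hwmon sensor with `name` = `coretemp` and `temp1_label` = `Core 0` results in the metric `temp_core_0`: names and labels are lower-cased, spaces are replaced by `_`, and a `temp_` prefix is added if missing.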


@@ -9,7 +9,7 @@ import (
"strings"
"time"
lp "github.com/influxdata/line-protocol"
lp "github.com/ClusterCockpit/cc-metric-collector/internal/ccMetric"
)
const MAX_NUM_PROCS = 10
@@ -20,15 +20,16 @@ type TopProcsCollectorConfig struct {
}
type TopProcsCollector struct {
metricCollector
tags map[string]string
config TopProcsCollectorConfig
}
func (m *TopProcsCollector) Init(config json.RawMessage) error {
var err error
m.name = "TopProcsCollector"
m.tags = map[string]string{"type": "node"}
m.meta = map[string]string{"source": m.name, "group": "TopProcs"}
if len(config) > 0 {
err = json.Unmarshal(config, &m.config)
if err != nil {
@@ -51,7 +52,7 @@ func (m *TopProcsCollector) Init(config []byte) error {
return nil
}
func (m *TopProcsCollector) Read(interval time.Duration, output chan lp.CCMetric) {
if !m.init {
return
}
@@ -66,9 +67,9 @@ func (m *TopProcsCollector) Read(interval time.Duration, out *[]lp.MutableMetric
lines := strings.Split(string(stdout), "\n")
for i := 1; i < m.config.Num_procs+1; i++ {
name := fmt.Sprintf("topproc%d", i)
y, err := lp.New(name, m.tags, m.meta, map[string]interface{}{"value": string(lines[i])}, time.Now())
if err == nil {
output <- y
}
}
}


@@ -0,0 +1,15 @@
## `topprocs` collector
```json
"topprocs": {
"num_procs": 5
}
```
The `topprocs` collector reports the top X processes sorted by CPU utilization (`ps -Ao comm --sort=-pcpu`), where X is set by `num_procs`.
In contrast to most other collectors, the metric value is a `string`.
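For illustration, with `num_procs: 5` the collector emits the metrics `topproc1` through `topproc5`, each carrying a process command name as its string value, e.g. `topproc1,type=node value="firefox" <timestamp>`.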