Update for ccUnits and more README and tests

Thomas Roehl 2022-03-11 12:26:31 +01:00
parent fc34a1d91d
commit 13156f84eb
4 changed files with 197 additions and 51 deletions

@ -1,30 +1,43 @@
# ccUnits - A unit system for ClusterCockpit
When working with metrics, the same unit often appears under different names. There are a lot of real-world examples like 'kB' and 'Kbyte'. In CC Metric Collector, the collectors read data from different sources, which may use different units, or the programmer specifies a unit for a metric by hand. In order to enable unit comparison and conversion, the ccUnits package provides some helpers:
When working with metrics, the same unit often appears under different names. There are a lot of real-world examples like 'kB' and 'Kbyte'. In CC Metric Collector, the collectors read data from different sources, which may use different units, or the programmer specifies a unit for a metric by hand. The ccUnits system is not comparable with the SI unit system. If you are looking for a package for the SI units, see [here](https://pkg.go.dev/github.com/gurre/si).
In order to enable unit comparison and conversion, the ccUnits package provides some helpers:
There are basically two important functions:
```go
NewUnit(unit string) Unit
GetUnitPrefixFactor(in, out Unit) (float64, error) // Get the prefix difference for conversion
GetUnitPrefixFactor(in Unit, out Unit) (func(value float64) float64, error) // Get conversion function for the value
type Unit interface {
Valid() bool
String() string
Short() string
AddDivisorUnit(div Measure)
}
```
In order to get the "normalized" string unit back, you can use:
In order to get the "normalized" string unit back or test for validity, you can use:
```go
u := NewUnit("MB")
fmt.Printf("Long string %q", u.String())
fmt.Printf("Short string %q", u.Short())
fmt.Println(u.Valid()) // true
fmt.Printf("Long string %q", u.String()) // MegaBytes
fmt.Printf("Short string %q", u.Short()) // MBytes
v := NewUnit("foo")
fmt.Println(v.Valid()) // false
```
If you have two units and need the conversion factor:
If you have two units and need the conversion function:
```go
u1 := NewUnit("kB")
u2 := NewUnit("MBytes")
factor, err := GetUnitPrefixFactor(u1, u2) // Returns an error if the units have different measures
convFunc, err := GetUnitPrefixFactor(u1, u2) // Returns an error if the units have different measures
if err == nil {
v2 := v1 * factor
v2 := convFunc(v1)
}
```
(In the ClusterCockpit ecosystem, the separation between values and units is useful since they are commonly not stored as a single entity: the value is a field in the CCMetric, while the unit is a tag or meta information.)
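To illustrate the idea of a conversion function for prefixes, here is a minimal, self-contained sketch. It does not use the ccUnits package itself: the `Prefix` type, its constants and `prefixConverter` are hypothetical stand-ins, assuming that each prefix carries its decimal scale.

```go
package main

import "fmt"

// Prefix is a hypothetical stand-in for the package's prefix type.
// In this sketch each prefix simply carries its decimal scale.
type Prefix float64

const (
	Base Prefix = 1
	Kilo Prefix = 1e3
	Mega Prefix = 1e6
)

// prefixConverter returns a function that rescales a value given with
// prefix `in` to the same quantity expressed with prefix `out`.
func prefixConverter(in, out Prefix) func(float64) float64 {
	factor := float64(in) / float64(out)
	return func(value float64) float64 { return value * factor }
}

func main() {
	kiloToMega := prefixConverter(Kilo, Mega)
	// 2500 kB expressed in MB
	fmt.Printf("%.3f\n", kiloToMega(2500))
}
```

Returning a function instead of a plain factor keeps the call site identical for conversions that are not a simple multiplication, such as temperatures.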
If you have a metric and want to derive a bandwidth or an events-per-second rate from it, you can reuse the original unit:
```go
@ -34,10 +47,11 @@ if err == nil {
if ok {
out_unit = NewUnit(in_unit)
out_unit.AddDivisorUnit(NewMeasure("seconds"))
seconds := timeDiff.Seconds()
y, err := lp.New(metric.Name()+"_bw",
metric.Tags(),
metric.Meta(),
map[string]interface{}{"value": value/time},
map[string]interface{}{"value": value/seconds},
metric.Time())
if err == nil {
y.AddMeta("unit", out_unit.Short())
@ -46,6 +60,21 @@ if err == nil {
}
```
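The rate computation in the snippet above can be tried in isolation. The following is a minimal sketch: `perSecond` is a hypothetical helper that is not part of ccUnits, and the sample numbers are made up.

```go
package main

import (
	"fmt"
	"time"
)

// perSecond is a hypothetical helper (not part of ccUnits). It divides a
// value by the elapsed time in seconds and reports false if no time elapsed,
// so that callers never divide by zero.
func perSecond(value, seconds float64) (float64, bool) {
	if seconds <= 0 {
		return 0, false
	}
	return value / seconds, true
}

func main() {
	timeDiff := 2 * time.Second // assumed interval between two reads
	if rate, ok := perSecond(3000, timeDiff.Seconds()); ok {
		fmt.Printf("%.1f per second\n", rate)
	}
}
```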
## Special unit detection
Some measures like Bytes and Flops are indivisible. Consequently, prefixes like Milli, Micro and Nano are not useful for them. This is quite handy since a unit `mb` for `MBytes` is not uncommon but would by default be parsed as "MilliBytes".
A special parsing rule applies to the following measures: if the prefix is `Milli`, it is replaced by `Mega`.
- `Bytes`
- `Flops`
- `Packets`
- `Events`
- `Cycles`
- `Requests`
This means the prefixes `Micro` (like `ubytes`) and `Nano` (like `nflops/sec`) are not allowed and return an invalid unit.
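The rule described in this section could be sketched as follows. This is an illustration under assumptions, not the package's code: the types, constants, the `indivisible` set and `adjustPrefix` are hypothetical, and the rejection of `Micro` and `Nano` follows the description above rather than a specific function of the package.

```go
package main

import "fmt"

// Hypothetical stand-ins for the package's prefix and measure types.
type Prefix int
type Measure int

const (
	Base Prefix = iota
	Milli
	Micro
	Nano
	Mega
)

const (
	None Measure = iota
	Bytes
	Flops
	Watt
)

// indivisible lists the measures that cannot have fractional prefixes
// (shortened for this sketch).
var indivisible = map[Measure]bool{Bytes: true, Flops: true}

// adjustPrefix applies the special rule: for indivisible measures Milli
// becomes Mega, while Micro and Nano make the unit invalid. It returns the
// (possibly changed) prefix and whether the combination is valid.
func adjustPrefix(p Prefix, m Measure) (Prefix, bool) {
	if !indivisible[m] {
		return p, true
	}
	switch p {
	case Milli:
		return Mega, true
	case Micro, Nano:
		return p, false
	}
	return p, true
}

func main() {
	p, ok := adjustPrefix(Milli, Bytes)
	fmt.Println(p == Mega, ok) // "mb" is read as MBytes
	_, ok = adjustPrefix(Micro, Flops)
	fmt.Println(ok) // "uflops" is rejected
	p, ok = adjustPrefix(Milli, Watt)
	fmt.Println(p == Milli, ok) // "mW" stays MilliWatt
}
```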
## Supported prefixes
```go
@ -90,4 +119,26 @@ const (
)
```
There is a regular expression for each of the measures like `^([bB][yY]?[tT]?[eE]?[sS]?)` for the `Bytes` measure.
There is a regular expression for each of the measures like `^([bB][yY]?[tT]?[eE]?[sS]?)` for the `Bytes` measure.
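To see which spellings the `Bytes` expression accepts, it can be tried with Go's standard `regexp` package. In this small sketch, `matchedBytesPart` is a hypothetical helper name; the expression is the one quoted above. Note that the expression is anchored only at the start, so it matches a leading part of the input.

```go
package main

import (
	"fmt"
	"regexp"
)

// The expression for the Bytes measure, as quoted in the README.
var bytesRegex = regexp.MustCompile(`^([bB][yY]?[tT]?[eE]?[sS]?)`)

// matchedBytesPart is a hypothetical helper that returns the leading part
// of s matched by the Bytes expression, or "" if there is no match.
func matchedBytesPart(s string) string {
	return bytesRegex.FindString(s)
}

func main() {
	for _, s := range []string{"B", "byte", "Bytes", "bs", "flops"} {
		fmt.Printf("%q -> %q\n", s, matchedBytesPart(s))
	}
}
```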
## New units
If the selected units are not suitable for your metric, feel free to send a PR.
### New prefix
For a new prefix, add it to the big `const` in `ccUnitPrefix.go` and adjust the prefix-unit-splitting regular expression. Afterwards, you have to add cases to the three functions `String()`, `Prefix()` and `NewPrefix()`. `NewPrefix()` contains the parser (`k` or `K` -> `Kilo`). The other two are used for output: `String()` outputs a longer version of the prefix (`Kilo`), while `Prefix()` returns only the short notation (`K`).
### New measure
Adding new prefixes is probably rare, but adding a new measure is a more common task. First, add it to the big `const` in `ccUnitMeasure.go`. Then create a regular expression matching the measure (and pre-compile it like the others) and add the expression matching to `NewMeasure()`. The `String()` and `Short()` functions return descriptive strings for the measure in long form (like `Hertz`) and short form (like `Hz`).
If there are special conversion rules between measures and you want to convert one measure to another, like temperatures in Celsius to Fahrenheit, a special case in `GetUnitPrefixFactor()` is required.
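A special case for a pair of measures could be structured like the following sketch. It is self-contained and does not use the package: `Measure`, its constants and `measureConverter` are hypothetical stand-ins; only the two temperature formulas are the usual Celsius/Fahrenheit relations.

```go
package main

import "fmt"

// Measure and its constants are hypothetical stand-ins for the real ones.
type Measure int

const (
	Bytes Measure = iota
	TemperatureC
	TemperatureF
)

// measureConverter returns a conversion function between two measures, or
// an error if no rule is known for the pair.
func measureConverter(in, out Measure) (func(float64) float64, error) {
	switch {
	case in == out:
		return func(v float64) float64 { return v }, nil
	case in == TemperatureC && out == TemperatureF:
		return func(v float64) float64 { return v*1.8 + 32 }, nil
	case in == TemperatureF && out == TemperatureC:
		return func(v float64) float64 { return (v - 32) / 1.8 }, nil
	}
	return nil, fmt.Errorf("no conversion from measure %d to measure %d", in, out)
}

func main() {
	if conv, err := measureConverter(TemperatureC, TemperatureF); err == nil {
		fmt.Printf("%.1f\n", conv(100)) // boiling point of water in Fahrenheit
	}
	if _, err := measureConverter(Bytes, TemperatureC); err != nil {
		fmt.Println("error:", err)
	}
}
```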
### Special parsing rules
The two parsers for prefix and measure are called under the hood by `NewUnit()`, and some special rules might apply. As in the section on special unit detection above, special rules for your new measure might be required. Currently there are two special cases:
- Measures that are indivisible, like Flops, Bytes, Events, ..., cannot use `Milli`, `Micro` and `Nano`. The prefix `m` is forced to `M` for these measures.
- If the prefix is `p`/`P` (`Peta`) or `e`/`E` (`Exa`) and the measure is not detectable, detection is retried with the prefix letter treated as part of the measure. For example, the first round tries (prefix `p`, measure `ackets`), which fails, so it retries with (measure `packets`, no prefix).
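The retry described in the second case can be sketched with a deliberately simplified parser. Everything here is an assumption for illustration: `splitUnit`, the `measures` table and the single-letter prefix set are hypothetical and much cruder than the regular expressions used by the package.

```go
package main

import (
	"fmt"
	"strings"
)

// measures is a hypothetical table of known measure names (lower case,
// full names only) used instead of the package's regular expressions.
var measures = map[string]bool{"packets": true, "events": true, "bytes": true}

// splitUnit strips a one-letter prefix and looks up the rest as a measure.
// If that fails and the prefix letter is 'p' or 'e', it retries with the
// whole string as the measure and no prefix.
func splitUnit(s string) (prefix, measure string, ok bool) {
	lower := strings.ToLower(s)
	if len(lower) > 1 {
		first, rest := lower[:1], lower[1:]
		if strings.Contains("kmgtpe", first) {
			if measures[rest] {
				return first, rest, true
			}
			if (first == "p" || first == "e") && measures[lower] {
				return "", lower, true // retry: the letter belongs to the measure
			}
			return "", "", false
		}
	}
	if measures[lower] {
		return "", lower, true
	}
	return "", "", false
}

func main() {
	for _, s := range []string{"packets", "PBytes", "events", "kbytes", "foo"} {
		p, m, ok := splitUnit(s)
		fmt.Printf("%q -> prefix %q, measure %q, valid %v\n", s, p, m, ok)
	}
}
```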

@ -94,7 +94,7 @@ func (m *Measure) Short() string {
const bytesRegexStr = `^([bB][yY]?[tT]?[eE]?[sS]?)`
const flopsRegexStr = `^([fF][lL]?[oO]?[pP]?[sS]?)`
const percentRegexStr = `^(%%|[pP]ercent)`
const percentRegexStr = `^(%|[pP]ercent)`
const degreeCRegexStr = `^(deg[Cc]|°[cC])`
const degreeFRegexStr = `^(deg[fF]|°[fF])`
const rpmRegexStr = `^([rR][pP][mM])`
@ -105,6 +105,7 @@ const energyRegexStr = `^([jJ][oO]?[uU]?[lL]?[eE]?[sS]?)`
const cyclesRegexStr = `^([cC][yY][cC]?[lL]?[eE]?[sS]?)`
const requestsRegexStr = `^([rR][eE][qQ][uU]?[eE]?[sS]?[tT]?[sS]?)`
const packetsRegexStr = `^([pP][aA]?[cC]?[kK][eE]?[tT][sS]?)`
const eventsRegexStr = `^([eE][vV]?[eE]?[nN][tT][sS]?)`
var bytesRegex = regexp.MustCompile(bytesRegexStr)
var flopsRegex = regexp.MustCompile(flopsRegexStr)
@ -119,6 +120,7 @@ var energyRegex = regexp.MustCompile(energyRegexStr)
var cyclesRegex = regexp.MustCompile(cyclesRegexStr)
var requestsRegex = regexp.MustCompile(requestsRegexStr)
var packetsRegex = regexp.MustCompile(packetsRegexStr)
var eventsRegex = regexp.MustCompile(eventsRegexStr)
func NewMeasure(unit string) Measure {
var match []string
@ -174,5 +176,9 @@ func NewMeasure(unit string) Measure {
if match != nil {
return Packets
}
match = eventsRegex.FindStringSubmatch(unit)
if match != nil {
return Events
}
return None
}

@ -5,71 +5,140 @@ import (
"strings"
)
type Unit struct {
scale Prefix
type unit struct {
prefix Prefix
measure Measure
divMeasure Measure
}
func (u *Unit) String() string {
type Unit interface {
Valid() bool
String() string
Short() string
AddDivisorUnit(div Measure)
getPrefix() Prefix
getMeasure() Measure
getDivMeasure() Measure
}
func (u *unit) Valid() bool {
return u.measure != None
}
func (u *unit) String() string {
if u.divMeasure != None {
return fmt.Sprintf("%s%s/%s", u.scale.String(), u.measure.String(), u.divMeasure.String())
return fmt.Sprintf("%s%s/%s", u.prefix.String(), u.measure.String(), u.divMeasure.String())
} else {
return fmt.Sprintf("%s%s", u.scale.String(), u.measure.String())
return fmt.Sprintf("%s%s", u.prefix.String(), u.measure.String())
}
}
func (u *Unit) Short() string {
func (u *unit) Short() string {
if u.divMeasure != None {
return fmt.Sprintf("%s%s/%s", u.scale.Prefix(), u.measure.Short(), u.divMeasure.Short())
return fmt.Sprintf("%s%s/%s", u.prefix.Prefix(), u.measure.Short(), u.divMeasure.Short())
} else {
return fmt.Sprintf("%s%s", u.scale.Prefix(), u.measure.Short())
return fmt.Sprintf("%s%s", u.prefix.Prefix(), u.measure.Short())
}
}
func (u *Unit) AddDivisorUnit(div Measure) {
func (u *unit) AddDivisorUnit(div Measure) {
u.divMeasure = div
}
func GetPrefixFactor(in Prefix, out Prefix) float64 {
func (u *unit) getPrefix() Prefix {
return u.prefix
}
func (u *unit) getMeasure() Measure {
return u.measure
}
func (u *unit) getDivMeasure() Measure {
return u.divMeasure
}
func GetPrefixFactor(in Prefix, out Prefix) func(value float64) float64 {
var factor = 1.0
var in_scale = 1.0
var out_scale = 1.0
var in_prefix = 1.0
var out_prefix = 1.0
if in != Base {
in_scale = float64(in)
in_prefix = float64(in)
}
if out != Base {
out_scale = float64(out)
out_prefix = float64(out)
}
factor = in_scale / out_scale
return factor
factor = in_prefix / out_prefix
return func(value float64) float64 { return value * factor }
}
func GetUnitPrefixFactor(in Unit, out Unit) (float64, error) {
if in.measure != out.measure || in.divMeasure != out.divMeasure {
return 1.0, fmt.Errorf("invalid measures in in and out Unit")
func GetUnitPrefixFactor(in Unit, out Unit) (func(value float64) float64, error) {
if in.getMeasure() == TemperatureC && out.getMeasure() == TemperatureF {
return func(value float64) float64 { return (value * 1.8) + 32 }, nil
} else if in.getMeasure() == TemperatureF && out.getMeasure() == TemperatureC {
return func(value float64) float64 { return (value - 32) / 1.8 }, nil
} else if in.getMeasure() != out.getMeasure() || in.getDivMeasure() != out.getDivMeasure() {
return func(value float64) float64 { return 1.0 }, fmt.Errorf("invalid measures in in and out Unit")
}
return GetPrefixFactor(in.scale, out.scale), nil
return GetPrefixFactor(in.getPrefix(), out.getPrefix()), nil
}
func NewUnit(unit string) Unit {
u := Unit{
scale: Base,
func NewUnit(unitStr string) Unit {
u := &unit{
prefix: Base,
measure: None,
divMeasure: None,
}
matches := prefixRegex.FindStringSubmatch(unit)
matches := prefixRegex.FindStringSubmatch(unitStr)
if len(matches) > 2 {
u.scale = NewPrefix(matches[1])
pre := NewPrefix(matches[1])
measures := strings.Split(matches[2], "/")
u.measure = NewMeasure(measures[0])
// Special case for 'm' as scale for Bytes as there is nothing like MilliBytes
if u.measure == Bytes && u.scale == Milli {
u.scale = Mega
m := NewMeasure(measures[0])
// Special case for prefix 'p' or 'P' (Peta) and measures starting with 'p' or 'P'
// like 'packets' or 'percent'. Same for 'e' or 'E' (Exa) for measures starting with
// 'e' or 'E' like 'events'
if m == None && pre == Base {
if strings.ToLower(matches[1]) == "p" || strings.ToLower(matches[1]) == "e" {
t := NewMeasure(matches[1] + measures[0])
if t != None {
m = t
pre = Base
}
}
}
div := None
if len(measures) > 1 {
u.divMeasure = NewMeasure(measures[1])
div = NewMeasure(measures[1])
}
// Special case for 'm' as prefix for Bytes as there is nothing like MilliBytes
switch m {
case Bytes:
if pre == Milli {
pre = Mega
}
case Flops:
if pre == Milli {
pre = Mega
}
case Packets:
if pre == Milli {
pre = Mega
}
case Events:
if pre == Milli {
pre = Mega
}
case Cycles:
if pre == Milli {
pre = Mega
}
case Requests:
if pre == Milli {
pre = Mega
}
}
u.prefix = pre
u.measure = m
u.divMeasure = div
}
return u
}

@ -40,16 +40,36 @@ func TestUnitsExact(t *testing.T) {
{"degC", NewUnit("degC")},
{"degf", NewUnit("degF")},
{"°f", NewUnit("degF")},
{"events", NewUnit("events")},
{"event", NewUnit("events")},
{"EveNts", NewUnit("events")},
{"reqs", NewUnit("requests")},
{"requests", NewUnit("requests")},
{"Requests", NewUnit("requests")},
{"cyc", NewUnit("cycles")},
{"cy", NewUnit("cycles")},
{"Cycles", NewUnit("cycles")},
{"J", NewUnit("Joules")},
{"Joule", NewUnit("Joules")},
{"joule", NewUnit("Joules")},
{"W", NewUnit("Watt")},
{"Watts", NewUnit("Watt")},
{"watt", NewUnit("Watt")},
{"s", NewUnit("seconds")},
{"sec", NewUnit("seconds")},
{"secs", NewUnit("seconds")},
{"RPM", NewUnit("rpm")},
{"rPm", NewUnit("rpm")},
}
compareUnitExact := func(in, out Unit) bool {
if in.measure == out.measure && in.divMeasure == out.divMeasure && in.scale == out.scale {
if in.getMeasure() == out.getMeasure() && in.getDivMeasure() == out.getDivMeasure() && in.getPrefix() == out.getPrefix() {
return true
}
return false
}
for _, c := range testCases {
u := NewUnit(c.in)
if !compareUnitExact(u, c.want) {
if (!u.Valid()) || (!compareUnitExact(u, c.want)) {
t.Errorf("func NewUnit(%q) == %q, want %q", c.in, u.String(), c.want.String())
}
}
@ -57,9 +77,9 @@ func TestUnitsExact(t *testing.T) {
func TestUnitsDifferentPrefix(t *testing.T) {
testCases := []struct {
in string
want Unit
scaleFactor float64
in string
want Unit
prefixFactor float64
}{
{"kb", NewUnit("Bytes"), 1000},
{"Mb", NewUnit("Bytes"), 1000000},
@ -72,19 +92,19 @@ func TestUnitsDifferentPrefix(t *testing.T) {
{"mb", NewUnit("MBytes"), 1.0},
}
compareUnitWithPrefix := func(in, out Unit, factor float64) bool {
if in.measure == out.measure && in.divMeasure == out.divMeasure {
if f := GetPrefixFactor(in.scale, out.scale); f == factor {
if in.getMeasure() == out.getMeasure() && in.getDivMeasure() == out.getDivMeasure() {
if f := GetPrefixFactor(in.getPrefix(), out.getPrefix()); f(1.0) == factor {
return true
} else {
fmt.Println(f)
fmt.Println(f(1.0))
}
}
return false
}
for _, c := range testCases {
u := NewUnit(c.in)
if !compareUnitWithPrefix(u, c.want, c.scaleFactor) {
t.Errorf("func NewUnit(%q) == %q, want %q with factor %f", c.in, u.String(), c.want.String(), c.scaleFactor)
if (!u.Valid()) || (!compareUnitWithPrefix(u, c.want, c.prefixFactor)) {
t.Errorf("func NewUnit(%q) == %q, want %q with factor %f", c.in, u.String(), c.want.String(), c.prefixFactor)
}
}
}