rocm_smi collector
  "rocm_smi": {
    "exclude_devices": [
      "0", "1", "00000000:ff:01.0"
    ],
    "exclude_metrics": [
      "rocm_mm_util",
      "rocm_temp_vrsoc"
    ],
    "use_pci_info_as_type_id": true,
    "add_pci_info_tag": false,
    "add_serial_meta": false
  }
The rocm_smi collector can be configured to leave out specific devices with the exclude_devices option. Each entry is either the logical ID of a device (its index in the list of available devices) or its PCI address in a format similar to NVML's (%08X:%02X:%02X.0). Metrics (listed below) that should not be sent to the MetricRouter can be excluded with the exclude_metrics option.
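To make the two accepted forms of an exclude_devices entry concrete, here is a minimal Python sketch of how such an entry could be built and matched. The helper names pci_address and is_excluded are hypothetical and not part of the collector, and the case-insensitive comparison is an assumption of this sketch, since the text does not say whether hex digits must be upper or lower case.

```python
from typing import Iterable


def pci_address(domain: int, bus: int, device: int) -> str:
    """Format a PCI address as %08X:%02X:%02X.0 (function number fixed at 0)."""
    return "%08X:%02X:%02X.0" % (domain, bus, device)


def is_excluded(index: int, pci_addr: str, exclude_devices: Iterable[str]) -> bool:
    """Return True if a device matches an entry by logical ID or PCI address."""
    # Assumption: entries are compared case-insensitively.
    wanted = {entry.lower() for entry in exclude_devices}
    return str(index) in wanted or pci_addr.lower() in wanted


addr = pci_address(0, 0xFF, 0x01)
print(addr)  # 00000000:FF:01.0
print(is_excluded(2, addr, ["0", "1", "00000000:ff:01.0"]))  # True
print(is_excluded(2, addr, ["0", "1"]))  # False
```

In this sketch, device 2 is excluded only through its PCI address, while devices 0 and 1 would be excluded through their logical IDs.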
The metrics sent by the rocm_smi collector use accelerator as their type tag. By default, the type-id is the device handle index. With the use_pci_info_as_type_id option, the PCI ID is used as type-id instead. To send both values as tags, activate the add_pci_info_tag option: the device handle index stays the type-id and the PCI ID is added as a separate pci_identifier tag.
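The effect of the two tagging options can be sketched in Python as follows. The function build_tags is hypothetical and only mirrors the behaviour described above; it is not the collector's code. The text does not say what happens when both options are enabled, so giving use_pci_info_as_type_id precedence in that case is an assumption of this sketch.

```python
def build_tags(index: int, pci_id: str,
               use_pci_info_as_type_id: bool = False,
               add_pci_info_tag: bool = False) -> dict:
    """Return the tag set for one device under the given options."""
    # Default: type-id is the device handle index.
    tags = {"type": "accelerator", "type-id": str(index)}
    if use_pci_info_as_type_id:
        # Assumption: this option wins if both options are enabled.
        tags["type-id"] = pci_id
    elif add_pci_info_tag:
        # Keep the index as type-id and add the PCI ID as a separate tag.
        tags["pci_identifier"] = pci_id
    return tags


pci = "00000000:FF:01.0"
print(build_tags(0, pci))
print(build_tags(0, pci, use_pci_info_as_type_id=True))
print(build_tags(0, pci, add_pci_info_tag=True))
```

The three calls correspond to the default, to use_pci_info_as_type_id, and to add_pci_info_tag, in that order.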
Optionally, the device serial number can be added to the meta information with the add_serial_meta option. Meta information is not sent to the sinks unless configured otherwise.
Metrics:
rocm_gfx_util
rocm_umc_util
rocm_mm_util
rocm_avg_power
rocm_temp_mem
rocm_temp_hotspot
rocm_temp_edge
rocm_temp_vrgfx
rocm_temp_vrsoc
rocm_temp_vrmem
rocm_gfx_clock
rocm_soc_clock
rocm_u_clock
rocm_v0_clock
rocm_v1_clock
rocm_d0_clock
rocm_d1_clock
rocm_temp_hbm
Some metrics carry an additional sub-type tag (stype). For example, the rocm_temp_hbm metrics set stype=device,stype-id=<HBM_slice_number>.