Device nodes are not guaranteed to be consistent over time #134

Scandiravian · 2023-07-19T19:00:52Z

Currently the smartctl exporter only attaches the device label to all metrics except smartctl_device. This was introduced in #83.

This can unfortunately lead to issues, since device nodes are not guaranteed to be consistent over time, so after a reboot /dev/sdc might for instance become /dev/sda.

This makes it difficult to create dashboards in Grafana that track, for instance, temperature over time, since a query will break after a reboot.

If there is a goal to limit the number of labels sent, I think it would be better to switch to using the serial number as the identifying label sent with metrics. These are not guaranteed to be unique, though I think a conflict will be unlikely in most cases.


k0ste · 2023-07-19T19:22:45Z

> This can unfortunately lead to issues, since device nodes are not guaranteed to be consistent over time, so after a reboot /dev/sdc might for instance become /dev/sda.

> This makes it difficult to create dashboards in Grafana that track, for instance, temperature over time, since a query will break after a reboot.

Which query exactly breaks?

kfox1111 · 2023-07-19T20:00:32Z

I could see it breaking if you squashed an alert, and then the drive letters flip after reboot, and then the wrong drive is squashed...

disk-by-path might be another way to get a more stable identifier.

Scandiravian · 2023-07-20T08:21:00Z

> Which query exactly breaks?

@k0ste Since PromQL doesn't support many-to-many joins and a new timeseries is created for each unique combination of labels, it's not possible to do a "join" between smartctl_device and any of the metrics that only have device to identify them by. In such a case I don't think it's possible to determine by querying which actual piece of hardware is, for instance, overheating; it can only be done by manually looking it up.

Then there's also the issue that every time a device node is reassigned, any graph that tracks history becomes wrong. If I'm tracking changes in disk space used, power cycles, etc., and have alerts based on a percentage increase, those might trigger on a reboot when the device node is reassigned to a drive with different metrics.

Finally, with three servers, each with four drives, that'll eventually create 48 different timeseries in smartctl_device (each machine has 16 different ways to combine device node+serial number). This doesn't affect queries directly, but it does make it difficult to determine what is the current "right" one.
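The arithmetic behind the 48 series can be checked with a minimal sketch. The host names, device nodes, and serial numbers below are made up for illustration; the only assumption is that any drive can end up on any of the four device nodes of its own host.

```python
from itertools import product

# Hypothetical inventory: 3 hosts, each with 4 drives (all names are made up).
hosts = ["server-a", "server-b", "server-c"]
device_nodes = ["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"]
serials_per_host = {
    host: [f"{host}-SN{i}" for i in range(1, 5)] for host in hosts
}


def possible_series(host):
    """Every (instance, device, serial_number) label set this host could emit."""
    return {
        (host, node, serial)
        for node, serial in product(device_nodes, serials_per_host[host])
    }


per_host = {host: len(possible_series(host)) for host in hosts}
total = sum(per_host.values())
print(per_host, total)
```

Each host contributes 4 x 4 = 16 label sets, so three hosts give 48 in the worst case.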

Scandiravian · 2023-07-20T08:45:27Z

> disk-by-path might be another way to get a more stable identifier.

@kfox1111 I did consider suggesting this as well, though I decided against it, as that information is either available through smartctl or is not consistent (a UUID, for instance, changes when a disk is formatted).

As I understand it, the use case for this exporter is to track individual pieces of hardware over time, for instance to see whether a drive is about to fail. I think the easiest way to set up good tracking is to identify each hard drive by an ID that doesn't change over time. I think the serial number is the only piece of data available that works for that, though I'm by no means a hardware expert, so there might be (and probably is) a smarter solution than I can think of 😅

k0ste · 2023-07-20T09:45:55Z

@Scandiravian if you operate by Linux device name, this is totally wrong; you should operate only by the device serial_number. Linux device names are not persistent, for example:

One of your 36 disks, for example /dev/sdaa, was powered off (backplane/PSU problem).
When power returned, the disk came back as /dev/sdab.
What will you do? Reboot the kernel?

All your recording rules / alerts should look like this:

smartctl_device{form_factor="3.5 inches"} * on (instance, device)
  group_left () smartctl_device_temperature > 30

In this case, at any given moment, the device label will be the same across all metrics. This is how meta labels were designed.
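To make the join concrete, here is a minimal Python sketch that emulates what `* on (instance, device) group_left ()` does for a single evaluation instant. The function name `join_group_left` and all sample label values are made up for illustration; this is not how Prometheus implements vector matching, only a model of the result.

```python
def join_group_left(info_series, value_series, on=("instance", "device")):
    """Emulate `info * on(instance, device) group_left() value`.

    Each series is a (labels_dict, value) pair. The result keeps the labels
    of the left-hand (info) series and multiplies the two values.
    """
    right_by_key = {}
    for labels, value in value_series:
        key = tuple(labels[name] for name in on)
        if key in right_by_key:
            # Prometheus would reject this as many-to-many matching.
            raise ValueError(f"duplicate right-hand series for {key}")
        right_by_key[key] = value

    result = []
    for labels, value in info_series:
        key = tuple(labels[name] for name in on)
        if key in right_by_key:
            result.append((dict(labels), value * right_by_key[key]))
    return result


# Made-up samples: the info metric has value 1 and carries the serial number.
smartctl_device = [
    ({"instance": "host1", "device": "sda", "serial_number": "SN-A"}, 1),
    ({"instance": "host1", "device": "sdb", "serial_number": "SN-B"}, 1),
]
smartctl_device_temperature = [
    ({"instance": "host1", "device": "sda"}, 41),
    ({"instance": "host1", "device": "sdb"}, 28),
]

joined = join_group_left(smartctl_device, smartctl_device_temperature)
hot = [(labels["serial_number"], value) for labels, value in joined if value > 30]
print(hot)
```

Because both metrics are scraped together, the device label matches at that instant, and the serial number from smartctl_device ends up on the temperature value.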

kfox1111 · 2023-07-20T16:10:03Z

I think there is a use case for querying both by disk-by-path (so you can identify slot 3 in node B in queries) as well as drive serial_numbers so you can track a drive no matter where it shows up.

kennethso168 · 2024-02-21T10:53:02Z

> I think there is a use case for querying both by disk-by-path (so you can identify slot 3 in node B in queries) as well as drive serial_numbers so you can track a drive no matter where it shows up.

Agree. I believe that to support both use cases, it may be appropriate to revert #83. Users could then configure Prometheus to drop the relevant label(s) in the scrape config.

I have forked and reverted #83 for my own use. If deemed appropriate I can make a PR as well.

k0ste · 2024-02-21T11:14:59Z

> > I think there is a use case for querying both by disk-by-path (so you can identify slot 3 in node B in queries) as well as drive serial_numbers so you can track a drive no matter where it shows up.

> Agree. I believe that to support both use cases, it may be appropriate to revert #83. Users could then configure Prometheus to drop the relevant label(s) in the scrape config.

> I have forked and reverted #83 for my own use. If deemed appropriate I can make a PR as well.

This is impossible to "resolve" on the Prometheus side, because before Prometheus can drop something, it first has to download it.

What exactly is the issue you have with the current design?

kennethso168 · 2024-02-21T14:01:41Z

Oh, my Google-fu must have been really bad yesterday.

I would like to track the lifetime (e.g. Total Bytes Written) of a disk consistently. Yesterday I tested by intentionally causing a flip in device node. And the stats were, as expected, flipped.

You have provided an alert rule example, which inspired me to do something like this

That was still two series for the same drive before and after the device node flip. I would really like to join them as one. Then I was stuck.

In fact, I was using VictoriaMetrics instead of Prometheus. I tried using MetricsQL label_del function to drop the device label. I got duplicate output timeseries. I was stuck and thought that it was impossible to solve without changing the labels exported in the exporter. Thus I forked the project and added back exporting of labels including serial of the hard drive etc. and configured my scrape config to drop the device label
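One plausible reading of the duplicate-series result: when device is the only label that distinguishes two drives on the same host, removing it leaves identical label sets. The sketch below illustrates that with made-up series; `drop_label` and `duplicate_label_sets` are hypothetical helpers, not MetricsQL or exporter functions.

```python
from collections import Counter


def drop_label(series, name):
    """Return copies of the label sets with one label removed."""
    return [{k: v for k, v in labels.items() if k != name} for labels in series]


def duplicate_label_sets(series):
    """Label sets that occur more than once, i.e. series that would collide."""
    counts = Counter(tuple(sorted(labels.items())) for labels in series)
    return [dict(key) for key, n in counts.items() if n > 1]


# Made-up series for two drives on one host, identified only by `device`.
without_serial = [
    {"instance": "fileserver", "attribute_name": "Total_LBAs_Written", "device": "sda"},
    {"instance": "fileserver", "attribute_name": "Total_LBAs_Written", "device": "sdb"},
]
# The same series if a serial number label were also attached.
with_serial = [
    dict(labels, serial_number=serial)
    for labels, serial in zip(without_serial, ["SN-A", "SN-B"])
]

collisions_without = duplicate_label_sets(drop_label(without_serial, "device"))
collisions_with = duplicate_label_sets(drop_label(with_serial, "device"))
print(collisions_without, collisions_with)
```

With a serial number on every series, dropping device is safe; without it, the two drives become indistinguishable.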

And after your reply I Googled again and came across "metricsQL: add function for merging time series values based on label value".

That inspired me to come up with the following query and my problem is solved!

max without (device) (
    smartctl_device{form_factor="3.5 inches"} 
        * on (instance, device) group_left() 
    smartctl_device_attribute{instance=~'fileserver',attribute_name="Total_LBAs_Written",attribute_value_type="raw"}
) * 512
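Why this yields one continuous series per drive can be shown with a small Python model of `max without (device)`. The samples stand for the already joined result at two evaluation times, before and after a device node flip; all label values and numbers are made up, and `max_without` is a hypothetical helper.

```python
def max_without(samples, drop):
    """Emulate `max without (<drop>) (...)` for one evaluation timestamp.

    `samples` is a list of (labels_dict, value). Series that differ only in
    the dropped label are merged and the largest value is kept.
    """
    merged = {}
    for labels, value in samples:
        key = tuple(sorted((k, v) for k, v in labels.items() if k != drop))
        merged[key] = max(value, merged.get(key, value))
    return merged


# Made-up joined samples: the two drives swap device nodes across the reboot.
before_reboot = [
    ({"instance": "fileserver", "device": "sda", "serial_number": "SN-A"}, 1000),
    ({"instance": "fileserver", "device": "sdb", "serial_number": "SN-B"}, 5000),
]
after_reboot = [
    ({"instance": "fileserver", "device": "sdb", "serial_number": "SN-A"}, 1010),
    ({"instance": "fileserver", "device": "sda", "serial_number": "SN-B"}, 5020),
]

# Collect bytes written per serial number, assuming 512-byte LBAs as in the query.
history = {}
for samples in (before_reboot, after_reboot):
    for key, value in max_without(samples, "device").items():
        history.setdefault(dict(key)["serial_number"], []).append(value * 512)
print(history)
```

Once device is aggregated away, the remaining labels identify the drive itself, so the values before and after the flip land in the same series.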

And for calculating the rate of increase:

Still open to discussion on whether adding a serial number label is necessary.

k0ste · 2024-02-24T15:02:22Z

> That inspired me to come up with the following query and my problem is solved!

Good to hear that!

You can also write this query like this:

sum by (attribute_name, model_name, serial_number) # <- result labels, that you actually need
  (smartctl_device_attribute{attribute_name="Total_LBAs_Written",
  attribute_value_type="raw"}
    * on (instance, device) group_right(attribute_name)
  smartctl_device{form_factor="3.5 inches"}
) * 512

The more you practice and the more production cases you resolve, the more experience you gain in building dashboards that work for you. For me, the priority is "don't SSH into the host to find something out".
On our host dashboard we have a panel where supply or engineering teams can get answers:

how many disks are installed on a host
which disk models and serial numbers
how long each disk has been in operation
disk health (your SMART attributes)

Something like this:

We also have our own thresholds (the value for a disk replacement alert), and you can sort Grafana tables by attribute value, for example

Good luck!

lazywebm · 2024-05-02T15:35:39Z

I'm in the process of setting up this exporter for the first time, on a system with 25 SATA disks attached. #83 is not a good change in my opinion. I would always want to have the device serial number in the metrics output, for the reasons described above (unstable identification via /dev/sdX).

Until then, I'll use my own forked version as well.

Informatic · 2024-07-14T17:43:10Z

IMO a somewhat correct solution for this is to replace /dev/sd* with proper /dev/disk/by-id/... symlinks. This should in fact be solvable by passing a (seemingly) undocumented --device by-id option to smartctl --scan:

# smartctl --scan --device by-id 
/dev/disk/by-id/ata-CT500MX500SSD1_XXX -d scsi # /dev/disk/by-id/ata-CT500MX500SSD1_XXX, SCSI device
/dev/disk/by-id/ata-Hitachi_HUS724030ALE641_XXX -d scsi # /dev/disk/by-id/ata-Hitachi_HUS724030ALE641_XXX, SCSI device
/dev/disk/by-id/ata-ST31000528AS_XXX -d scsi # /dev/disk/by-id/ata-ST31000528AS_XXX, SCSI device
/dev/disk/by-id/ata-TOSHIBA_HDWD130_XXX -d scsi # /dev/disk/by-id/ata-TOSHIBA_HDWD130_XXX, SCSI device
/dev/disk/by-id/ata-TOSHIBA_HDWD130_YYY -d scsi # /dev/disk/by-id/ata-TOSHIBA_HDWD130_YYY, SCSI device
/dev/disk/by-id/ata-TOSHIBA_HDWD130_ZZZ -d scsi # /dev/disk/by-id/ata-TOSHIBA_HDWD130_ZZZ, SCSI device
/dev/nvme0 -d nvme # /dev/nvme0, NVMe device

As seen above, it's not perfect since it only applies to SATA/SAS devices, but this should be easily solvable regardless in smartmontools.
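As a rough illustration of consuming that output, the sketch below parses scan lines of the shape shown above and flags devices that did not get a by-id path. The helper names `parse_scan_line` and `is_by_id_path` are made up, the sample text is copied from the output above, and this is not how the exporter itself handles scanning.

```python
def parse_scan_line(line):
    """Split one `smartctl --scan` line into (device_path, device_type).

    Expected shape, as in the output above:
        <path> -d <type> # <comment>
    Returns None for blank lines or lines that do not match.
    """
    fields = line.split("#", 1)[0].split()
    if len(fields) == 3 and fields[1] == "-d":
        return fields[0], fields[2]
    return None


def is_by_id_path(path):
    """True if the path is a /dev/disk/by-id/ symlink."""
    return path.startswith("/dev/disk/by-id/")


scan_output = """\
/dev/disk/by-id/ata-CT500MX500SSD1_XXX -d scsi # /dev/disk/by-id/ata-CT500MX500SSD1_XXX, SCSI device
/dev/nvme0 -d nvme # /dev/nvme0, NVMe device
"""

devices = [d for d in map(parse_scan_line, scan_output.splitlines()) if d]
not_by_id = [path for path, _ in devices if not is_by_id_path(path)]
print(devices, not_by_id)
```

In this sample only the NVMe device falls outside /dev/disk/by-id/, which matches the limitation noted above.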

I'll (try to) prepare a PR adding a flag to enable this behaviour (since it's a breaking change compared to what was there before).

Informatic mentioned this issue Jul 14, 2024

Add support for device types and predictable device paths #235

Closed
