Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(input.intel_pmt): Handle telem devices without numa_node attribute #13977

Merged
merged 1 commit into from
Sep 26, 2023

Conversation

bensallen
Copy link
Contributor

When the intel_pmt input plugin is enabled currently, it will cause telegraf to fail on startup like so:

telegraf[47878]: 2023-09-21T22:27:09Z E! [telegraf] Error running agent: could not initialize input inputs.intel_pmt: error while exploring pmt sysfs: error while evaluating symlink "/sys/class/intel_pmt/telem0/numa_node": lstat /sys/devices/pci0000:00/0000:00:0a.0/intel_vsec.telemetry.0/intel_pmt/telem0/numa_node: no such file or directory

While intel_pmt telem devices do not have a numa_node attribute, their parent intel_vsec devices do. For example the current behavior:

$ ls -l /sys/class/intel_pmt/telem0/device/numa_node
ls: cannot access '/sys/class/intel_pmt/telem0/device/numa_node': No such file or directory

Versus traversing up to intel_vsec device:

$ ls -l /sys/class/intel_pmt/telem0/device/../numa_node
-rw-r--r--. 1 root root 4096 Sep 18 14:13 /sys/class/intel_pmt/telem0/device/../numa_node

Thus update explorePmtInSysfs() to traverse up to the intel_vsec device to find numa_node.

Note, filepath.Join() will interpret the .. ahead of the filepath.EvalSymlinks, thus we evalSymlinks /sys/class/intel_pmt/telem0/device and then use filepath.Join() to traverse up.

Tested on Fedora 38's 6.4.14-200.fc38.x86_64 kernel and OpenSUSE's 15.4's 5.14.21-150400.24.55-default.

  • [N/A] Updated associated README.md.
  • [N/A] Wrote appropriate unit tests.
  • Pull request title or commits are in conventional commit format

resolves #13976

When the intel_pmt input plugin is enabled currently, it will cause
telegraf to fail on startup like so:

```
telegraf[47878]: 2023-09-21T22:27:09Z E! [telegraf] Error running agent: could not initialize input inputs.intel_pmt: error while exploring pmt sysfs: error while evaluating symlink "/sys/class/intel_pmt/telem0/numa_node": lstat /sys/devices/pci0000:00/0000:00:0a.0/intel_vsec.telemetry.0/intel_pmt/telem0/numa_node: no such file or directory
```

While intel_pmt telem devices do not have a numa_node attribute, their
parent intel_vsec devices do. For example the current behavior:

```
$ ls -l /sys/class/intel_pmt/telem0/device/numa_node
ls: cannot access '/sys/class/intel_pmt/telem0/device/numa_node': No such file or directory
```

Versus traversing up to intel_vsec device:

```
$ ls -l /sys/class/intel_pmt/telem0/device/../numa_node
-rw-r--r--. 1 root root 4096 Sep 18 14:13 /sys/class/intel_pmt/telem0/device/../numa_node
```

Thus update explorePmtInSysfs() to traverse up to the intel_vsec device
to find numa_node.

Note, filepath.Join() will interpret the `..` ahead of the filepath.EvalSymlinks,
thus we evalSymlinks `/sys/class/intel_pmt/telem0/device` and then use filepath.Join() to traverse up.

Tested on Fedora 38's 6.4.14-200.fc38.x86_64 kernel and OpenSUSE's 15.4's 5.14.21-150400.24.55-default.
@telegraf-tiger
Copy link
Contributor

Thanks so much for the pull request!
🤝 ✒️ Just a reminder that the CLA has not yet been signed, and we'll need it before merging. Please sign the CLA when you get a chance, then post a comment here saying !signed-cla

@telegraf-tiger telegraf-tiger bot added fix pr to fix corresponding bug plugin/input 1. Request for new input plugins 2. Issues/PRs that are related to input plugins labels Sep 22, 2023
@bensallen
Copy link
Contributor Author

!signed-cla

@bensallen
Copy link
Contributor Author

@jakubsikorski it would be good to have you review this if possible.

@powersj powersj requested a review from p-zak September 22, 2023 16:56
Copy link
Contributor

@jakubsikorski jakubsikorski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Ben,

It seems different versions of the kernel behave differently here. This depends on the pmt_telemetry/intel_vsec device driver.

If the driver is bus/auxiliary: there won't be a numa_node file and it will fail, just as you described.
If the driver is bus/platform: numa_node file will be present and it will work as expected.

Traversing up the driver will become bus/pci. It could be intel_vsec as you described in newer versions of the kernel or intel_pmt, but it will be a pci driver, which should always have a numa_node file, even if the node is unknown (value is -1 then).

So the changes you propose will indeed fix that.
Thanks for finding this.

Copy link
Contributor

@powersj powersj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@p-zak @jakubsikorski thank you for reviewing!

@powersj powersj added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Sep 25, 2023
Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this!

@srebhan srebhan changed the title fix(input.intel_pmt): telem devices do not have a numa_node attribute fix(input.intel_pmt): Handle telem devices without numa_node attribute Sep 26, 2023
@srebhan srebhan merged commit 9a95ef1 into influxdata:master Sep 26, 2023
5 checks passed
@github-actions github-actions bot added this to the v1.28.2 milestone Sep 26, 2023
powersj pushed a commit that referenced this pull request Oct 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix pr to fix corresponding bug plugin/input 1. Request for new input plugins 2. Issues/PRs that are related to input plugins ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

intel_pmt input plugin fails initialization looking for numa_node in sysfs
5 participants