Auto enrollment of nodes #23

Open
chrisdoherty4 opened this issue Jan 18, 2023 · 11 comments
Labels
status/discussion The scope and kind of work is still in discussion theme/feature New functionality in Tinkerbell

Comments

@chrisdoherty4
Member

Overview

There have been various requests to auto enroll devices with some sort of MAC filtering. Auto enrollment could mean bringing a device online ready to process workflows, or it could mean defining a default workflow to be run on all devices that auto enroll.

It may be useful to think of running a default workflow as an independently configurable feature from auto enrolling a device. This would help define auto enrollment as simply bringing a Tink Worker online on said device and subsequently allow operators to manually define workflows as well as define an automated approach.
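One way to picture the split described above is a policy object where the MAC filter and the default workflow are separate, independently set options. This is only a minimal sketch: `EnrollmentPolicy`, `allowed_mac_prefixes`, and `default_workflow` are hypothetical names made up for illustration and do not exist in Tinkerbell.

```python
from dataclasses import dataclass, field
from typing import List, Optional


def normalize_mac(mac: str) -> str:
    """Lower-case a MAC address (or prefix) and use ':' as the separator."""
    return mac.strip().lower().replace("-", ":")


@dataclass
class EnrollmentPolicy:
    # Hypothetical settings; none of these names come from Tinkerbell.
    auto_enroll: bool = False
    allowed_mac_prefixes: List[str] = field(default_factory=list)
    # Configured independently of auto_enroll, as suggested in the overview.
    default_workflow: Optional[str] = None

    def allows(self, mac: str) -> bool:
        """Return True if the device may auto enroll (empty filter allows all)."""
        if not self.auto_enroll:
            return False
        if not self.allowed_mac_prefixes:
            return True
        mac = normalize_mac(mac)
        return any(mac.startswith(normalize_mac(p)) for p in self.allowed_mac_prefixes)

    def plan(self, mac: str) -> dict:
        """Describe what would happen when an unknown device shows up."""
        enrolled = self.allows(mac)
        return {
            "enroll": enrolled,
            # A default workflow only applies to devices that enrolled.
            "workflow": self.default_workflow if enrolled else None,
        }
```

With `default_workflow` left unset, an enrolled device simply comes online and waits for an operator to define a workflow; setting it gives the automated approach without changing the enrollment rules.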

@chrisdoherty4 chrisdoherty4 added theme/feature New functionality in Tinkerbell status/discussion The scope and kind of work is still in discussion labels Jan 18, 2023
@chrisdoherty4 chrisdoherty4 moved this to Discussion in Tinkerbell Roadmap Jan 18, 2023
@jacobweinstock
Member

Linked issue: tinkerbell/smee#178

@jacobweinstock
Member

Linked discussion: tinkerbell/tink#522

@jacobweinstock
Member

Leaving this in discussion until it is broken down a bit more (for example: auto network booted vs. auto provisioned).

@chrisdoherty4
Member Author

In the last discussion we concluded it would be useful to break this feature into 2: (1) Auto enrollment of hardware in the Tinkerbell stack and (2) Running default workflows. We'll tweak the summary of this roadmap item for (1) and have a separate roadmap item for (2).

@pedroalvesbatista

Very nice, I would like to participate in those discussions and even do some PRs. This is definitely something great to have: when working with a larger fleet of machines, auto-enrollment and workers being able to become "discoverable" (maybe by adding this to some config files, like discoverable:true) would help a lot.

@pedroalvesbatista

Based on today's meeting, I suggest we take two routes and start to "design" something more tangible.

  1. Node auto-enrollment and node "sniffing" - challenges and opportunities
    1.1 - Identify which services can play a part in the band:
  • Rufio could use some entries from the BMCs after Hook boots up and fetches a bunch of node HW information
  • Hegel could retain the previous HW information
  • Boots can read the entries from Hegel and provide them all to the Provisioner
  • Perhaps design and implement a new service (and suggest a name) to collect HW metadata, generate a hardware.yml for default workflow execution, and even serve as a skeleton for customized deployments based on information about specific nodes (MAC addresses, GPU/CPU profiling, memory characteristics such as size or access type like NUMA, etc.)
  2. Design a Request for Enhancement (RFE) proposal and map the impacts on the current code base and the project as a whole

This would be the last part, building on the previous one. Based on that, we need to:
2.1 - Design the features and implement small pieces along with quick experiments
2.2 - Collect data and thoughts on how components are interacting, and on side effects and trade-offs
2.3 - Decide to go for an alpha and beta version of the whole stack with everything in place
2.4 - Ask for feedback from the community and look to use cases to demonstrate how Tinkerbell behaves in real-world deployments

Anything else to throw in here?
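As a rough illustration of the "collect HW metadata and generate a hardware.yml skeleton" idea above, the sketch below turns a dictionary of discovered facts into a nested structure that could later be serialized to YAML. Everything here is an assumption for discussion purposes: the `hardware_skeleton` function, the input keys, and the output layout (including the `profile` section and the `discovered` label) are hypothetical and are not taken from the actual Tinkerbell Hardware schema.

```python
def hardware_skeleton(facts: dict) -> dict:
    """Build a hardware-description skeleton from discovered facts.

    `facts` is a hypothetical input shape, for example:
    {"macs": [...], "hostname": "...", "cpu_count": 8, "memory_mib": 32768,
     "gpus": [...], "numa_nodes": 2}
    """
    # De-duplicate and normalize MAC addresses so output is deterministic.
    macs = sorted({m.strip().lower() for m in facts.get("macs", [])})
    if not macs:
        raise ValueError("at least one MAC address is required")

    # Fall back to a name derived from the first MAC when no hostname is known.
    name = facts.get("hostname") or "node-" + macs[0].replace(":", "")

    return {
        "kind": "Hardware",  # illustrative only; check the real schema
        "metadata": {"name": name, "labels": {"discovered": "true"}},
        "spec": {
            "interfaces": [{"dhcp": {"mac": mac}} for mac in macs],
            # Hypothetical section holding the profiling data mentioned above.
            "profile": {
                "cpu_count": facts.get("cpu_count"),
                "memory_mib": facts.get("memory_mib"),
                "gpus": list(facts.get("gpus", [])),
                "numa_nodes": facts.get("numa_nodes"),
            },
        },
    }
```

An operator could treat the result as a starting point and edit it by hand, which matches the "skeleton for customized deployments" framing rather than a fully automatic pipeline.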

@mddeff

mddeff commented Jul 28, 2023

Unsure if this is the forum for providing community feedback/use-cases, or if that should be saved for the RFE discussion, but we have a few distinct use cases where hardware auto-discovery/having a default workflow would be super helpful.

One of the things that might be difficult to reconcile is whether you've already "discovered" a piece of hardware before. I.e., does one need a 100% match between the existing hardware profile and the 'new' hardware profile for them to be 'the same' (so that another hardware profile/object is not created)? What happens if it's a match except for one PCI-E card being removed (update the old hardware object or create a new one)? Just some things to think about.
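To make the matching question concrete, here is one possible rule, sketched under stated assumptions: identity is decided only by identifiers that are assumed to be stable (a chassis serial or any shared MAC address), and differences in everything else, such as a removed PCI-E card, count as drift on the same machine. The `Profile` type and `classify` function are hypothetical, and this is just one of several reasonable policies, not a decision made in this thread.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Profile:
    # Hypothetical hardware profile; field names are illustrative only.
    serial: str
    macs: FrozenSet[str]
    pci_devices: FrozenSet[str]


def classify(known: Profile, seen: Profile) -> str:
    """Compare a stored profile with a freshly discovered one.

    Returns "match" (identical), "update" (same machine, changed parts),
    or "new" (no stable identifier in common).
    """
    same_serial = bool(known.serial) and known.serial == seen.serial
    shared_mac = bool(known.macs & seen.macs)
    if not (same_serial or shared_mac):
        return "new"
    if known == seen:
        return "match"
    # Same identity but different components, e.g. a PCI-E card was removed.
    return "update"
```

Under this rule the PCI-E example above yields "update", so the existing hardware object would be amended instead of a second one being created. A stricter policy could require both the serial and a MAC to agree before treating two profiles as the same machine.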

@chrisdoherty4
Member Author

Hi @mddeff. This is definitely the right place to provide feedback, so thank you!

@jacobweinstock
Member

linked PR: tinkerbell/smee#460

@jacobweinstock
Member

jacobweinstock commented Jul 16, 2024

Design doc: #38

jacobweinstock added a commit to tinkerbell/smee that referenced this issue Aug 2, 2024
Auto netboot capability:

## Description

<!--- Please describe what this PR is going to change -->
This adds the ability to network boot machines that do not have an existing Hardware object. This is part of the implementation of this roadmap item: tinkerbell/roadmap#23

## Why is this needed

<!--- Link to issue you have raised -->

Fixes: #

## How Has This Been Tested?
<!--- Please describe in detail how you tested your changes. -->
<!--- Include details of your testing environment, and the tests you ran to -->
<!--- see how your change affects other areas of the code, etc. -->


## How are existing users impacted? What migration steps/scripts do we need?

<!--- Fixes a bug, unblocks installation, removes a component of the stack etc -->
<!--- Requires a DB migration script, etc. -->


## Checklist:

I have:

- [ ] updated the documentation and/or roadmap (if required)
- [ ] added unit or e2e tests
- [ ] provided instructions on how to upgrade
@jacobweinstock
Member

#40 is related as it would enable a very flexible way to specify a Template for unknown Workers.

4 participants