Make the agent behave more like an agent #165

Closed
wants to merge 1 commit into from

Member

@Itxaka Itxaka commented Oct 17, 2023

This aligns the agent with what we expect when it is started as a service: something that keeps running and checks from time to time.


Signed-off-by: Itxaka <[email protected]>
@Itxaka Itxaka requested a review from mudler October 17, 2023 18:54
@Itxaka
Member Author

Itxaka commented Oct 17, 2023

Is this OK, @mudler? Before this, on Alpine the service kept restarting over and over, as it was more or less set up as a service.

This makes it behave more like a service, but I'm not sure about the implications of calling the bootstrap event over and over.

Is this correct? Do we expect it to behave like this?

@Itxaka
Member Author

Itxaka commented Oct 17, 2023

The thing with this is:

  • kairos-agent is managed as a one-off service on Alpine
  • this means that the service is supposed to start and then go away
  • currently this makes the provider's logging take over the whole of tty1 and ttyS0, which does not allow logging in or anything
  • I was thinking of moving the agent to be more like a normal service
  • but in its current implementation it runs and then exits, which marks the service as failed and makes openrc restart it again and again
  • so I thought about making it more service-like

currently exploring other ways of dealing with it.

@Itxaka
Member Author

Itxaka commented Oct 17, 2023

I think the real problem with this is:

kairos-agent behaves differently depending on whether there is a provider:

  • if there is no provider, it just runs and exits
  • if there is a provider, it waits for the provider to end before exiting

@Itxaka
Member Author

Itxaka commented Oct 17, 2023

default inittab:

# /etc/inittab

::sysinit:/sbin/openrc sysinit
::sysinit:/sbin/openrc boot
::wait:/sbin/openrc default

# Set up a couple of getty's
tty1::respawn:/sbin/getty 38400 tty1
tty2::respawn:/sbin/getty 38400 tty2
tty3::respawn:/sbin/getty 38400 tty3
tty4::respawn:/sbin/getty 38400 tty4
tty5::respawn:/sbin/getty 38400 tty5
tty6::respawn:/sbin/getty 38400 tty6

# Put a getty on the serial port
#

# Stuff to do for the 3-finger salute
::ctrlaltdel:/sbin/reboot

# Stuff to do before rebooting
::shutdown:/sbin/openrc shutdown

ttyS0::respawn:/sbin/getty -L ttyS0 115200 vt100

Notice how the default runlevel has a wait on it, so it won't start the ttys until it has started everything.

But the status shows that kairos-agent under a provider is marked as starting forever:

Runlevel: default
 cos-setup-network                                                                         [  started  ]
 edgevpn                                                                       [  started 00:11:39 (0) ]
 cos-setup-boot                                                                            [  started  ]
 kairos-agent                                                                              [ starting  ]
 cos-setup-reconcile                                                           [  started 00:11:40 (0) ]
Dynamic Runlevel: hotplugged
Dynamic Runlevel: needed/wanted
 hostname                                                                                  [  started  ]
 sysfs                                                                                     [  started  ]
 devfs                                                                                     [  started  ]
 modules                                                                                   [  started  ]
 hwclock                                                                                   [  started  ]
 fsck                                                                                      [  started  ]
 root                                                                                      [  started  ]
 localmount                                                                                [  started  ]
 dbus                                                                                      [  started  ]

@Itxaka
Member Author

Itxaka commented Oct 17, 2023

well, adding a & in the service file also fixes this LMAO

// capture ctrl+c and exit cleanly
channel := make(chan os.Signal, 1)
signal.Notify(channel, os.Interrupt)
go func() {
Member


This will spawn a new goroutine every time Run is called again, so the number of goroutines will grow indefinitely.

Member

@mudler mudler left a comment


Just a small nit, but this is going in the right direction.

@Itxaka Itxaka added the blocked label Oct 18, 2023
@Itxaka Itxaka closed this May 16, 2024
@Itxaka Itxaka deleted the agent_behave_like_agent branch May 16, 2024 09:28