Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent deadlocking event-handler on missing recordings #44659

Merged
merged 1 commit into from
Jul 29, 2024

Conversation

rosstimothy
Copy link
Contributor

@rosstimothy rosstimothy commented Jul 25, 2024

If the number of session.upload events without a corresponding session recording is higher than the configured concurrency, then the event-handler could deadlock while attempting to find said recording. To prevent this from occurring processing of session recordings has been modified such that:

  • if a recording does not exist and the session.upload event is from more than 48 hours in the past it is assumed to be lost and no more attempts to process the recording will happen
  • all processing of recordings that were not found is now done in a separate background routine, instead of inline with event processing

The storage directory will now contain information about sessions with a missing recording. The background routine will process them periodically, capping the number of attempts per recording at 3.

Changelog: reduce the probability that the event-handler deadlocks when encountering errors processing session recordings

@rosstimothy rosstimothy changed the title Prevent deadlocking on missing recordings Prevent event-handler deadlocking on missing recordings Jul 25, 2024
@rosstimothy rosstimothy changed the title Prevent event-handler deadlocking on missing recordings Prevent deadlocking event-handler on missing recordings Jul 25, 2024
@tigrato tigrato self-requested a review July 25, 2024 21:06
integrations/event-handler/app.go Outdated Show resolved Hide resolved
integrations/event-handler/session_events_job.go Outdated Show resolved Hide resolved
integrations/event-handler/session_events_job.go Outdated Show resolved Hide resolved
integrations/event-handler/session_events_job.go Outdated Show resolved Hide resolved
@rosstimothy rosstimothy force-pushed the tross/event_handler_deadlock branch from 7f1e152 to e9c931e Compare July 26, 2024 16:32
@rosstimothy rosstimothy marked this pull request as ready for review July 26, 2024 16:34
@gravitational gravitational deleted a comment from github-actions bot Jul 26, 2024
integrations/lib/signals.go Outdated Show resolved Hide resolved
integrations/event-handler/app.go Outdated Show resolved Hide resolved
integrations/event-handler/state.go Show resolved Hide resolved
integrations/event-handler/state.go Show resolved Hide resolved
integrations/event-handler/state.go Outdated Show resolved Hide resolved
@rosstimothy rosstimothy force-pushed the tross/event_handler_deadlock branch 2 times, most recently from dede267 to ff60700 Compare July 29, 2024 13:26
@public-teleport-github-review-bot public-teleport-github-review-bot bot removed the request for review from hugoShaka July 29, 2024 14:02
If the number of session.upload events without a corresponding
session recording is higher than the configured concurrency, then
the event-handler could deadlock while attempting to find said
recording. To prevent this from occurring processing of session
recordings has been modified such that:

- if a recording does not exist and the session.upload event is
  from more than 48 hours in the past it is assumed to be lost
  and no more attempts to process the recording will happen
- all processing of recordings that were not found is now done in
  a separate background routine, instead of inline with event
  processing

The storage directory will now contain information about sessions
with a missing recording. The background routine will process them
periodically, capping the number of attempts per recording at 3.
@rosstimothy rosstimothy force-pushed the tross/event_handler_deadlock branch from ff60700 to 04dcbe8 Compare July 29, 2024 17:56
@rosstimothy rosstimothy enabled auto-merge July 29, 2024 18:03
@rosstimothy rosstimothy added this pull request to the merge queue Jul 29, 2024
Merged via the queue into master with commit d38e2c3 Jul 29, 2024
40 checks passed
@rosstimothy rosstimothy deleted the tross/event_handler_deadlock branch July 29, 2024 18:41
@public-teleport-github-review-bot

@rosstimothy See the table below for backport results.

Branch Result
branch/v14 Failed
branch/v15 Failed
branch/v16 Failed

rosstimothy added a commit that referenced this pull request Jul 29, 2024
If the number of session.upload events without a corresponding
session recording is higher than the configured concurrency, then
the event-handler could deadlock while attempting to find said
recording. To prevent this from occurring processing of session
recordings has been modified such that:

- if a recording does not exist and the session.upload event is
  from more than 48 hours in the past it is assumed to be lost
  and no more attempts to process the recording will happen
- all processing of recordings that were not found is now done in
  a separate background routine, instead of inline with event
  processing

The storage directory will now contain information about sessions
with a missing recording. The background routine will process them
periodically, capping the number of attempts per recording at 3.
rosstimothy added a commit that referenced this pull request Jul 29, 2024
If the number of session.upload events without a corresponding
session recording is higher than the configured concurrency, then
the event-handler could deadlock while attempting to find said
recording. To prevent this from occurring processing of session
recordings has been modified such that:

- if a recording does not exist and the session.upload event is
  from more than 48 hours in the past it is assumed to be lost
  and no more attempts to process the recording will happen
- all processing of recordings that were not found is now done in
  a separate background routine, instead of inline with event
  processing

The storage directory will now contain information about sessions
with a missing recording. The background routine will process them
periodically, capping the number of attempts per recording at 3.
github-merge-queue bot pushed a commit that referenced this pull request Jul 29, 2024
)

If the number of session.upload events without a corresponding
session recording is higher than the configured concurrency, then
the event-handler could deadlock while attempting to find said
recording. To prevent this from occurring processing of session
recordings has been modified such that:

- if a recording does not exist and the session.upload event is
  from more than 48 hours in the past it is assumed to be lost
  and no more attempts to process the recording will happen
- all processing of recordings that were not found is now done in
  a separate background routine, instead of inline with event
  processing

The storage directory will now contain information about sessions
with a missing recording. The background routine will process them
periodically, capping the number of attempts per recording at 3.
github-merge-queue bot pushed a commit that referenced this pull request Jul 29, 2024
)

If the number of session.upload events without a corresponding
session recording is higher than the configured concurrency, then
the event-handler could deadlock while attempting to find said
recording. To prevent this from occurring processing of session
recordings has been modified such that:

- if a recording does not exist and the session.upload event is
  from more than 48 hours in the past it is assumed to be lost
  and no more attempts to process the recording will happen
- all processing of recordings that were not found is now done in
  a separate background routine, instead of inline with event
  processing

The storage directory will now contain information about sessions
with a missing recording. The background routine will process them
periodically, capping the number of attempts per recording at 3.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants