[backfill daemon run retries 3/n] retries of runs in completed backfills should not be considered part of the backfill #25900

Contributor

@jamiedemaria jamiedemaria commented Nov 13, 2024

Summary & Motivation

If a run is retried after a backfill is complete, that run is given the backfill tag but has no effect on the backfill itself. This can cause confusion. Imagine a backfill in which a single asset-partition failed. The backfill completes, then a user retries the failed asset and the retry succeeds. The retried run shows up in the list of runs for the backfill, but the overview tab still shows the partition as failed, because the status is locked when the backfill completes.

We should be stricter about when run retries are considered part of the backfill. We decided in https://github.com/dagster-io/internal/discussions/12460 that retries launched while the backfill is in progress are part of the backfill, but retries launched after the backfill is complete are not.

To make this change we need to remove the backfill tag from retried runs if the backfill is not in progress.
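A minimal sketch of that rule, not the actual Dagster implementation: the helper name `tags_for_retry`, the `BackfillStatus` enum, and the tag key value are assumptions made for illustration. The backfill tag is copied onto the retry only while the backfill is still in progress.

```python
from enum import Enum
from typing import Dict, Mapping, Optional

# Assumed tag key, used here only for illustration.
BACKFILL_ID_TAG = "dagster/backfill"


class BackfillStatus(Enum):
    # Hypothetical status names; the real status enum may differ.
    IN_PROGRESS = "IN_PROGRESS"
    COMPLETED = "COMPLETED"


def tags_for_retry(
    parent_tags: Mapping[str, str],
    backfill_status: Optional[BackfillStatus],
) -> Dict[str, str]:
    """Copy the parent run's tags, keeping the backfill tag only
    while the backfill is still in progress."""
    keep_backfill_tag = backfill_status is BackfillStatus.IN_PROGRESS
    return {
        key: val
        for key, val in parent_tags.items()
        if key != BACKFILL_ID_TAG or keep_backfill_tag
    }
```

With this shape, a retry launched after completion carries no backfill tag, so it would not appear in the backfill's run list.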

How I Tested These Changes

New unit tests.

Manually launched a retry of a backfill-launched run after the backfill was complete. No backfill tags were added.
[Screenshot 2024-12-02 at 1 55 09 PM]

Changelog

Manual retries of runs launched by backfills are no longer considered part of the backfill if the backfill is complete when the retry is launched.

@jamiedemaria jamiedemaria changed the title retries of runs in completed backfills should not be considered part of the backfill [backfill daemon run retries 3/n] retries of runs in completed backfills should not be considered part of the backfill Nov 13, 2024
@jamiedemaria jamiedemaria marked this pull request as ready for review December 2, 2024 18:55
)
parent_run_tags = {}
if use_parent_run_tags:
    parent_run_tags = {
Member

this applies to the old code too but maybe this should be parent_run_tags_to_include?

parent_run_tags = {
    key: val
    for key, val in parent_run.tags.items()
    if key not in TAGS_TO_OMIT_ON_RETRY and key not in TAGS_TO_MAYBE_OMIT_ON_RETRY
Member

given that this is the sole place that TAGS_TO_OMIT_ON_RETRY and TAGS_TO_MAYBE_OMIT_ON_RETRY are used I think we could keep it as one list of tags vs. splitting them out into two lists? should it all just be TAGS_TO_MAYBE_OMIT_ON_RETRY?

Contributor Author

and then just not have a condition to add back the tags that are in the TAGS_TO_OMIT_ON_RETRY list? that sounds good to me
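One way the single-list idea from this thread could look, as a rough sketch rather than the merged code: the function name `retry_tags`, the `backfill_in_progress` flag, and the list contents are hypothetical. Every tag in the one list is dropped from the copy, and only the backfill tag has a condition that adds it back.

```python
from typing import Dict, Mapping

# Hypothetical contents; the real list in the codebase may differ.
BACKFILL_ID_TAG = "dagster/backfill"
TAGS_TO_MAYBE_OMIT_ON_RETRY = {BACKFILL_ID_TAG, "dagster/step_selection"}


def retry_tags(
    parent_tags: Mapping[str, str], backfill_in_progress: bool
) -> Dict[str, str]:
    # Drop every tag in the single list from the copied tags.
    tags = {
        key: val
        for key, val in parent_tags.items()
        if key not in TAGS_TO_MAYBE_OMIT_ON_RETRY
    }
    # Only the backfill tag has an add-back condition; tags that should
    # always be omitted simply have no such branch.
    if backfill_in_progress and BACKFILL_ID_TAG in parent_tags:
        tags[BACKFILL_ID_TAG] = parent_tags[BACKFILL_ID_TAG]
    return tags
```

The trade-off is that the list name no longer says which tags are always dropped; that information lives in the absence of an add-back branch.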

Base automatically changed from jamie/backfill-daemon-accounts-for-retries to master December 5, 2024 18:36
@jamiedemaria jamiedemaria merged commit 8891639 into master Dec 5, 2024
1 check failed
@jamiedemaria jamiedemaria deleted the jamie/remove-backfill-tags-for-run-retries-after-backfill-complete branch December 5, 2024 19:12
cmpadden pushed a commit that referenced this pull request Dec 5, 2024
…lls should not be considered part of the backfill (#25900)

pskinnerthyme pushed a commit to pskinnerthyme/dagster that referenced this pull request Dec 16, 2024
…lls should not be considered part of the backfill (dagster-io#25900)

2 participants