Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework "update freshness" to not error with library -> ingest #1339

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

fvankrieken
Copy link
Contributor

@fvankrieken fvankrieken commented Dec 20, 2024

We are going to have a failing run this weekend without these changes.

#1097 added specific logic on how to handle an ingest run when we've already archived that version of a dataset. However, it didn't account for when the previous archived "vintage" of that version had come from library. Given that we're in the middle of this migration, and there are some slight data changes occuring (that would cause the validation from #1097 to error), I decided to clean up the code a little and simply "pass" - no overwrite, no updating timestamps - when the existing version came from library

Commits are quite atomic. I moved around some testing code hence the several commits, it should make commit-by-commit review easier. I thought pulling out the logic of "validating" (@sf-dcp 's favorite word) the new dataset versus existing versions to a function which makes no changes, and then based on the enum returned run decides what to do.

Integration Tests

Running in my dev bucket, first run a job via library by using a branch of mine without latest main, so opendata job still uses library. This should get us latest versions of datasets, archived by library. I cancelled it as to save runtime. There are also the DOT ones which went private and failed.

Then, run a job on main with my dev bucket. See failures (except for the ones that weren't actually archived in the previous step). There are also some internal server errors happening? 500 errors. It seems like we might be getting rate limited for real. Anyways, see this job here to see "successful" failure

Then, run a job on this branch with my dev bucket. Getting one bizarre s3 error that I'm not going to try to troubleshoot this second

@fvankrieken fvankrieken force-pushed the fvk-ingest-library-switchover branch 2 times, most recently from 7e74279 to 982983b Compare December 20, 2024 14:58
@fvankrieken fvankrieken marked this pull request as ready for review December 20, 2024 15:20
@fvankrieken fvankrieken force-pushed the fvk-ingest-library-switchover branch 6 times, most recently from 2e67f53 to ea47c5e Compare December 20, 2024 19:14
@fvankrieken fvankrieken force-pushed the fvk-ingest-library-switchover branch from ea47c5e to c9ddc2e Compare December 20, 2024 19:22
Copy link

codecov bot commented Dec 20, 2024

Codecov Report

Attention: Patch coverage is 90.24390% with 4 lines in your changes missing coverage. Please review.

Project coverage is 70.07%. Comparing base (a041b06) to head (c9ddc2e).

Files with missing lines Patch % Lines
dcpy/lifecycle/ingest/run.py 66.66% 2 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1339      +/-   ##
==========================================
+ Coverage   70.00%   70.07%   +0.06%     
==========================================
  Files         114      115       +1     
  Lines        6121     6145      +24     
  Branches      702      706       +4     
==========================================
+ Hits         4285     4306      +21     
- Misses       1690     1692       +2     
- Partials      146      147       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: New
Development

Successfully merging this pull request may close these issues.

4 participants