Rework "update freshness" to not error with library -> ingest #1339
We are going to have a failing run this weekend without these changes.
#1097 added specific logic for handling an `ingest` run when we've already archived that version of a dataset. However, it didn't account for the case where the previous archived "vintage" of that version had come from `library`. Given that we're in the middle of this migration, and there are some slight data changes occurring (which would cause the validation from #1097 to error), I decided to clean up the code a little and simply "pass" (no overwrite, no updating timestamps) when the existing version came from `library`.
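In effect, the fix is an early return when the prior archive was produced by `library`. A minimal sketch, assuming a hypothetical `existing_vintage` record with a `source` field (these are illustrative names, not the actual dcpy identifiers):

```python
# Minimal sketch of the pass-through behavior; names here are assumptions,
# not the actual dcpy code.
def update_freshness(existing_vintage, new_dataset) -> None:
    if existing_vintage.source == "library":
        # Mid-migration, library-archived data can differ slightly from what
        # ingest produces, so the #1097 validation would error spuriously.
        # No overwrite, no timestamp updates -- just pass.
        return
    # ...the existing #1097 comparison/timestamp logic would continue here...
```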
Commits are quite atomic. I moved around some testing code, hence the several commits; it should make commit-by-commit review easier. I pulled the logic of "validating" (@sf-dcp's favorite word) the new dataset against existing versions out into a function that makes no changes, and then, based on the enum it returns, `run` decides what to do (see the sketch below).
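Roughly the shape of the refactor, as a sketch: the enum values, function names, and helpers below are all assumptions for illustration, not the actual code.

```python
from enum import Enum, auto

# All names below are illustrative assumptions, not actual dcpy identifiers.
class VersionComparison(Enum):
    NEW_VERSION = auto()          # nothing archived yet: archive normally
    MATCHES_EXISTING = auto()     # same data re-pulled: just bump timestamps
    ARCHIVED_BY_LIBRARY = auto()  # prior vintage came from library: pass
    CONFLICT = auto()             # same version, different data: error


def validate_against_existing(existing_vintage, new_dataset) -> VersionComparison:
    """Pure comparison: inspects existing vs. new state, changes nothing."""
    if existing_vintage is None:
        return VersionComparison.NEW_VERSION
    if existing_vintage.source == "library":
        return VersionComparison.ARCHIVED_BY_LIBRARY
    if existing_vintage.data_matches(new_dataset):  # hypothetical method
        return VersionComparison.MATCHES_EXISTING
    return VersionComparison.CONFLICT


def archive(dataset): ...                 # stub for illustration
def update_timestamps(vintage): ...       # stub for illustration


def run(existing_vintage, new_dataset) -> None:
    match validate_against_existing(existing_vintage, new_dataset):
        case VersionComparison.NEW_VERSION:
            archive(new_dataset)
        case VersionComparison.MATCHES_EXISTING:
            update_timestamps(existing_vintage)
        case VersionComparison.ARCHIVED_BY_LIBRARY:
            pass  # no overwrite, no timestamp updates
        case VersionComparison.CONFLICT:
            raise ValueError("archived data for this version differs from new pull")
```

The point of the split is that the comparison itself is side-effect free, so the decision logic can be tested on its own while `run` stays a thin dispatch.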
## Integration Tests
1. Running in my dev bucket, first run a job via `library`, using a branch of mine without latest main so the opendata job still uses `library`. This should get us the latest versions of datasets, archived by `library`. I cancelled it so as to save runtime. There are also the DOT ones, which went private and failed.
2. Then, run a job on main with my dev bucket. See failures (except for the datasets that weren't actually archived in the previous step). There are also some internal server errors happening (500s); it seems like we might be getting rate limited for real. Anyways, see this job here for the "successful" failure.
3. Then, run a job on this branch with my dev bucket. Getting one bizarre S3 error that I'm not going to try to troubleshoot this second.