This ticket captures some of our discussions and longer-term plans for approaching continuous file deletions from the source MAXAR delivery bucket on MCP, post-ingest and post-validation.
The first meeting is set for PI Planning week, but this ticket will likely make it into Sprint 1.
Scope and Plan approach to continuous deletions of MCP MAXAR delivery bucket files after ingestion
Topics to discuss (Draft)
Log sync: sending manifests from CBA PROD (NGAP) over to MCP?
A DAG or other process to process the manifests and confirm granules
Scope of this process: are we processing the entire (huge) manifest every time the process runs, or is there a more efficient way to do this?
CMR queries to verify publication (including validating the path to the CBA PROD bucket in the CMR records)
(There is some existing code that does some of this already; see the old NGAP Deletes task.)
Other Topics?
Note: reference to the starter ticket on the Cumulus side of this work (take a look at this ticket during the meeting): #328
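To make the CMR-verification topic above concrete, here is a minimal sketch of what that check could look like. It uses the public CMR granule search endpoint; the granule UR, provider name, and bucket name below are illustrative placeholders, not confirmed values, and the `fetch` parameter is injectable purely so the logic can be exercised without hitting CMR.

```python
# Hedged sketch of the "verify publication in CMR" check. The provider,
# granule UR, and expected bucket path are placeholder assumptions.
import json
import urllib.parse
import urllib.request

CMR_GRANULE_SEARCH = "https://cmr.earthdata.nasa.gov/search/granules.json"

def granule_published(granule_ur, provider, expected_bucket, fetch=None):
    """Return True if CMR returns the granule and at least one of its
    links points at the expected CBA PROD bucket."""
    params = urllib.parse.urlencode(
        {"granule_ur": granule_ur, "provider": provider}
    )
    url = f"{CMR_GRANULE_SEARCH}?{params}"
    # `fetch` is injectable for testing; the default does a real HTTP GET.
    if fetch is None:
        fetch = lambda u: urllib.request.urlopen(u).read()
    body = json.loads(fetch(url))
    entries = body.get("feed", {}).get("entry", [])
    if not entries:
        return False
    links = entries[0].get("links", [])
    return any(expected_bucket in link.get("href", "") for link in links)
```

This checks both halves of the topic at once: that the granule exists in CMR, and that its record actually references the CBA PROD bucket path.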
A possible starting point for this discussion: what are we doing with the manifests that come in from CBA PROD (NGAP)? Are we processing all of the manifest data each time this process runs?
Don't forget, there is a slight delay after ingestion before the items show up in ORCA.
It is notably easier if we have a list of expected granules generated from 'somewhere' (maybe parsing CloudWatch data before going to the manifests, or direct S3 queries?).
Maybe we run a slow process that attempts to sync the entire manifest once per week?
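One way to avoid re-processing the entire huge manifest on every run, per the efficiency concern above, is to checkpoint the set of keys already verified and only look at new entries. This is a sketch under assumptions: the manifest entry shape (`{"key": ...}`) and the checkpoint file location are illustrative, not part of any agreed design.

```python
# Incremental manifest processing sketch. A checkpoint of already-handled
# keys is persisted between runs so each run only touches new entries.
import json
from pathlib import Path

def load_checkpoint(path):
    """Read the set of keys handled by previous runs (empty on first run)."""
    p = Path(path)
    if p.exists():
        return set(json.loads(p.read_text()))
    return set()

def save_checkpoint(path, done):
    Path(path).write_text(json.dumps(sorted(done)))

def new_entries(manifest_entries, done):
    """Yield only the manifest entries not seen on a previous run."""
    for entry in manifest_entries:
        if entry["key"] not in done:
            yield entry
```

The weekly full-sync idea floated above could coexist with this: a slow job that ignores the checkpoint once a week would catch anything the incremental passes missed.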
krisstanton changed the title from "Scope and Plan approach to Continuous delivery MCP file Deletes after ingestion" to "Scope and Plan approach to continuous deletions of MCP MAXAR delivery bucket files after ingestion" on Jun 26, 2024.
We had a meeting to discuss how to approach the continuous deletes.
At the end of the meeting, the current approach (from the Cumulus perspective) is a DAG running on MCP that will check these 4 verifications:
Files for a Granule exist in CBA PROD
Files for a Granule exist in CBA PROD (ORCA)
Granules are published in EarthData CMR
External metrics data has been verified
Once all 4 verifications pass, the DAG will take 2 actions:
Delete the files for the granule from the MCP MAXAR delivery bucket
Remove only the corresponding granule entry from the DynamoDB table where we normally insert file lists and checksum info
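The verify-then-act flow described above can be sketched as a single driver function. This is not the actual DAG code; the check and action callables are placeholders to be filled in with the real S3, ORCA, CMR, and metrics logic, and the names are mine, not from the ticket.

```python
# Sketch of the DAG's core logic: run the 4 verifications and only if
# every one passes, perform the 2 deletion actions.
def process_granule(granule_id, checks, actions):
    """checks: mapping of name -> callable(granule_id) -> bool.
    actions: callables(granule_id), run only if every check passes."""
    results = {name: check(granule_id) for name, check in checks.items()}
    if not all(results.values()):
        # Leave the files in place and report which verification failed,
        # so nothing is deleted on a partial pass.
        failed = [name for name, ok in results.items() if not ok]
        return {"deleted": False, "failed": failed}
    for action in actions:
        action(granule_id)
    return {"deleted": True, "failed": []}
```

Here the four checks would be "files in CBA PROD", "files in ORCA", "published in CMR", and "external metrics verified", and the two actions would be the MCP MAXAR delivery bucket delete and the DynamoDB entry removal. Keeping the checks as independent callables also maps cleanly onto separate DAG tasks if that is how the DAG ends up structured.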
There will also be some involvement in assisting with the logic in the DAG that does these checks.
We are in the process of updating some of the tickets to reflect this work.