Update move_current_la to only move dataset specific data #127

patrick-troy · 2024-11-05T14:56:15Z

Currently the move_current_la_sensor triggers once for each dataset (cin/ssda903), but it moves all files both times, which is inefficient. It might be worth adjusting this to only move (and remove) one dataset's worth of data each time.

Example of how it's inefficient:
Assume this is a fresh platform
cin clean success -> move_current_la -> no workspace current files to delete -> add new cin files to workspace current
ssda903 clean success -> move_current_la -> delete cin workspace current files -> add old cin and new ssda903 files to workspace current
ssda903 clean success -> move_current_la -> delete cin and ssda903 workspace current files -> add old cin and new ssda903 files to workspace current

This could be adjusted to:
cin clean success -> move_current_la -> no workspace current files to delete -> add new cin files to workspace current
ssda903 clean success -> move_current_la -> no ssda903 workspace current files to delete -> add new ssda903 files to workspace current (retain old cin files)
ssda903 clean success -> move_current_la -> delete just ssda903 workspace current files -> add new ssda903 files to workspace current (retain old cin files)
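
A minimal sketch of the adjusted behaviour, for illustration only. It assumes a hypothetical layout in which each dataset has its own subfolder under both the clean output and workspace current (`<root>/<dataset>/...`); the real pipeline may organise and access files differently. The function name `move_current_for_dataset` and its parameters are made up for this example.

```python
import shutil
from pathlib import Path


def move_current_for_dataset(clean_dir: Path, current_dir: Path, dataset: str) -> list[Path]:
    """Replace only `dataset`'s files in workspace current.

    Assumes (hypothetically) one subfolder per dataset under both roots.
    Files belonging to other datasets are never touched.
    """
    target = current_dir / dataset

    # Delete just this dataset's existing current files, if there are any.
    if target.exists():
        shutil.rmtree(target)
    target.mkdir(parents=True)

    # Copy the newly cleaned files for this dataset only.
    copied = []
    for src in sorted((clean_dir / dataset).glob("*")):
        if src.is_file():
            dest = target / src.name
            shutil.copy2(src, dest)
            copied.append(dest)
    return copied
```

Called with `dataset="ssda903"`, this leaves any existing cin files in workspace current untouched, which matches the "retain old cin files" steps above.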

We do still want to keep a separate run for each dataset, as the concatenate_sensor follows on from this. The concatenate_sensor works per dataset (so it runs for both cin and ssda903), and Dagster will only trigger one run for each run_id. So if we convert move_current_la_sensor to create just one run_id for both datasets, the concatenate_sensor will only trigger for one dataset when we need it to trigger for both.
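
To illustrate why the per-dataset runs need to stay, here is a small simulation in plain Python. It does not call Dagster; it only assumes the de-duplication described above behaves like a set of already-seen keys. The key format and the function names `run_key` and `launch` are hypothetical.

```python
def run_key(upstream_run_id: str, dataset: str) -> str:
    """Hypothetical key format: one key per (upstream run, dataset) pair."""
    return f"{upstream_run_id}:{dataset}"


def launch(keys: list[str], seen: set[str]) -> list[str]:
    """Simulate de-duplication: only keys not seen before launch a run."""
    launched = []
    for key in keys:
        if key not in seen:
            seen.add(key)
            launched.append(key)
    return launched


# One shared key for both datasets: the second request is dropped.
shared = launch(["abc123", "abc123"], set())

# One key per dataset: both requests launch.
per_dataset = launch([run_key("abc123", "cin"), run_key("abc123", "ssda903")], set())
```

Under that assumption, `shared` holds a single run while `per_dataset` holds two, so including the dataset in the key is what keeps the downstream sensor firing for both cin and ssda903.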

patrick-troy self-assigned this Nov 5, 2024
