DAGs powered by imars-etl can fail when they try to load a product that already exists in the database. Is this really what we want, though? It might make more sense for the operator to be marked as "skipped".
This might be possible using AirflowSkipException, but I am not sure. It also raises the question of whether we actually want to overwrite the file when this happens. That complexity is perhaps best handled by adding options to the imars-etl tool. Some possible usage scenarios:
1. overwrite the file; I don't care about the old one.
2. overwrite the file, but maybe keep the old version somewhere too.
3. don't overwrite the file; I don't know what I am doing.
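On the "skipped" idea: Airflow marks a task as skipped when its callable raises `AirflowSkipException` (from `airflow.exceptions`). A minimal sketch of that approach follows. `ProductExistsError`, `load_or_skip`, and `load_fn` are hypothetical names for illustration, not part of imars-etl's actual API, and the import fallback is only there so the snippet runs without Airflow installed.

```python
try:
    from airflow.exceptions import AirflowSkipException
except ImportError:
    # Fallback so this sketch runs without Airflow installed.
    class AirflowSkipException(Exception):
        pass


class ProductExistsError(Exception):
    """Hypothetical: raised when the product is already in the database."""


def load_or_skip(load_fn, filepath):
    """Run load_fn(filepath); turn a duplicate-product error into a skip."""
    try:
        return load_fn(filepath)
    except ProductExistsError as err:
        # Raising this inside a task makes Airflow mark it "skipped", not "failed".
        raise AirflowSkipException(f"already loaded: {filepath}") from err
```

A function like this could be used as the `python_callable` of a `PythonOperator`, assuming the load step can be made to raise a distinguishable exception for duplicates.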
Advanced checking could involve comparing hashes of the files and raising an error only if the files differ. That functionality would probably resolve most of the cases where I am seeing this right now, actually.
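A rough sketch of what the hash check might look like, assuming SHA-256 and assuming the hash of the already-loaded file is stored somewhere it can be looked up. The function names `sha256_of` and `check_duplicate` are made up for illustration.

```python
import hashlib
import logging


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large products need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_duplicate(new_path, stored_hash):
    """Warn and return True if the new file matches stored_hash; raise if not."""
    if sha256_of(new_path) == stored_hash:
        logging.warning("identical product already loaded: %s", new_path)
        return True
    raise ValueError(f"{new_path} differs from the product already loaded")
```

With this, re-running a load on an unchanged file only produces a warning, while a genuinely different file under the same product record still fails loudly.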
A rough plan to start:
- implement (2)
  - add new status for overwritten versions... or maybe even a new table? oof.
  - move overwritten files to a holding tank, with a cronjob running periodic cleanups
- add hash-checking
  - add hash column to table (this may come in handy later anyway)
  - check hash on duplicate; error if they differ, only warn if they are the same.
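The holding-tank part of the plan could be sketched roughly as below. The directory layout, the 30-day default, and the function names `move_to_holding` and `clean_holding` are all assumptions for illustration; the cleanup function is what a cronjob would call.

```python
import shutil
import time
from pathlib import Path


def move_to_holding(old_file, holding_dir):
    """Move a file about to be overwritten into the holding directory."""
    holding_dir = Path(holding_dir)
    holding_dir.mkdir(parents=True, exist_ok=True)
    # Prefix with a timestamp so repeated overwrites do not collide.
    dest = holding_dir / f"{time.time_ns()}_{Path(old_file).name}"
    shutil.move(str(old_file), str(dest))
    # Reset mtime so the file's age counts from when it entered the tank.
    dest.touch()
    return dest


def clean_holding(holding_dir, max_age_days=30, now=None):
    """Delete held files older than max_age_days; return the removed paths."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for path in Path(holding_dir).iterdir():
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Tracking age by file mtime keeps the cleanup independent of the database, at the cost of not recording which product record each held file belonged to; a status column or separate table would be needed for that.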