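Feature description
Two flaws exist with pipeline `refresh` for the `delta` table format on the `filesystem` destination:

1. `refresh="drop_sources"` leaves an empty `_delta_log` folder behind for a dropped table (Repro 1).
2. `refresh="drop_data"` drops the table instead of truncating it (Repro 2).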
Repro (1):

    import dlt
    from dlt.destinations import filesystem
    from tests.pipeline.utils import airtable_emojis

    source = airtable_emojis().with_resources("📆 Schedule", "🦚Peacock")
    for resource in source.selected_resources.values():
        resource.apply_hints(table_format="delta")

    pipe = dlt.pipeline(
        pipeline_name="refresh_repro",
        pipelines_dir="_storage",
        destination=filesystem("_storage")
    )
    pipe.run(source)
    pipe.run(source.with_resources("🦚Peacock"), refresh="drop_sources")

    # actual: empty folder `/_schedule/_delta_log` remains
    # expected: `/_schedule/_delta_log` no longer exists
Repro (2):

    import dlt
    from dlt.destinations import filesystem
    from tests.pipeline.utils import airtable_emojis

    source = airtable_emojis().with_resources("📆 Schedule", "🦚Peacock")
    for resource in source.selected_resources.values():
        resource.apply_hints(table_format="delta")

    pipe = dlt.pipeline(
        pipeline_name="refresh_repro",
        pipelines_dir="_storage",
        destination=filesystem("_storage")
    )
    pipe.run(source)
    pipe.run(source.with_resources("📆 Schedule"), refresh="drop_data")

    # actual: _schedule table has single commit (/_schedule/_delta_log/00000000000000000000.json)
    #         (in SQL terms: table got DROPped)
    # expected: _schedule table has two commits (in SQL terms: table got TRUNCATEd)
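To check the "actual" versus "expected" commit counts from Repro (2) without a Delta reader, one option is to count the commit files in the table's `_delta_log` folder. The sketch below assumes a local filesystem and the zero-padded 20-digit `.json` naming seen in the repro comment. `count_delta_commits` is a hypothetical helper written for illustration, not part of dlt.

```python
import re
from pathlib import Path

# Commit files look like 00000000000000000000.json (20 digits + ".json").
_COMMIT_FILE = re.compile(r"\d{20}\.json")


def count_delta_commits(table_dir):
    """Count commit files in `<table_dir>/_delta_log` (hypothetical helper).

    Returns 0 if the log folder does not exist. Checkpoints and other
    files in the log folder are ignored.
    """
    log_dir = Path(table_dir) / "_delta_log"
    if not log_dir.is_dir():
        return 0
    return sum(
        1
        for entry in log_dir.iterdir()
        if entry.is_file() and _COMMIT_FILE.fullmatch(entry.name)
    )
```

After the second `pipe.run(...)` in Repro (2), calling this on the `_schedule` table folder would return 1 with the current behavior and 2 with the expected behavior. The full path to that folder depends on the dataset layout, which the repro does not spell out.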
Are you a dlt user?
Yes, I'm already a dlt user.
Use case
No response
Proposed solution
Custom implementations of `drop_tables` and `truncate_tables` for `delta`. Currently, the generic `filesystem` implementations are applied.
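As a rough illustration of the drop behavior expected in Repro (1), the sketch below removes a table folder together with its `_delta_log` subfolder. It assumes a local filesystem and uses only the standard library. `drop_delta_table` is a hypothetical name, and this is not how dlt implements `drop_tables`. Truncation is not covered here, since per Repro (2) it is expected to add a new commit to the table rather than delete files.

```python
import shutil
from pathlib import Path


def drop_delta_table(table_dir):
    """Remove a table folder and everything under it (hypothetical helper).

    This includes the `_delta_log` subfolder, so no empty log folder is
    left behind. Returns True if the folder existed, False otherwise.
    """
    path = Path(table_dir)
    if not path.is_dir():
        return False
    shutil.rmtree(path)
    return True
```

Removing the whole folder in one step, rather than deleting files one by one, is what keeps an empty `_delta_log` folder from remaining.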
Related issues
#1742 (comment)