master merge for 0.4.7 release #1126

rudolfix · 2024-03-21T19:12:48Z

Description

Merges devel to master for a 0.4.7 release

* rewrites incremental: computation of hashes vastly reduced, fixed wrong criteria when to deduplicate, unique index in arrow frames rarely created
* initial tests for ordered, random and overlapping incremental ranges
* clarifies what deduplication in incremental means
* handles no deduplication case explicitly, more tests
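The incremental rewrite listed above mentions computing far fewer hashes during deduplication. One common way to get that effect is to hash only the rows whose cursor value sits exactly on the boundary (the last seen value), since only those can be re-delivered duplicates. The sketch below illustrates that general idea with the standard library only. It is not dlt's implementation: `row_hash`, `incremental_filter`, and the `ts` cursor field are hypothetical names chosen for illustration, and it assumes a monotonically increasing cursor.

```python
import hashlib
import json
from typing import Any, Dict, Iterable, List, Optional, Set, Tuple

Row = Dict[str, Any]


def row_hash(row: Row) -> str:
    """Stable content hash of a row (hypothetical helper, not a dlt API)."""
    payload = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def incremental_filter(
    rows: Iterable[Row],
    cursor: str,
    last_value: Optional[Any],
    seen_hashes: Set[str],
) -> Tuple[List[Row], Optional[Any], Set[str]]:
    """Return (new rows, new last value, hashes of rows at the new boundary).

    Rows below the previous boundary are dropped without hashing, rows above
    it are kept without hashing, and only rows exactly on the boundary are
    hashed and compared against the stored hashes.
    """
    out: List[Row] = []
    new_last = last_value
    boundary_rows: List[Row] = []  # rows whose cursor equals the running maximum
    for row in rows:
        value = row[cursor]
        if last_value is not None:
            if value < last_value:
                continue  # already loaded in an earlier run
            if value == last_value and row_hash(row) in seen_hashes:
                continue  # duplicate of a row seen on the old boundary
        out.append(row)
        if new_last is None or value > new_last:
            new_last = value
            boundary_rows = [row]
        elif value == new_last:
            boundary_rows.append(row)
    new_hashes = {row_hash(r) for r in boundary_rows}
    if new_last == last_value:
        # The boundary did not move, so earlier boundary hashes stay relevant.
        new_hashes |= seen_hashes
    return out, new_last, new_hashes


# Usage: two consecutive runs where the second re-delivers boundary rows.
run1 = [{"id": 1, "ts": 1}, {"id": 2, "ts": 2}, {"id": 3, "ts": 2}]
out1, last1, seen1 = incremental_filter(run1, "ts", None, set())

run2 = [{"id": 2, "ts": 2}, {"id": 3, "ts": 2}, {"id": 4, "ts": 2}, {"id": 5, "ts": 3}]
out2, last2, seen2 = incremental_filter(run2, "ts", last1, seen1)
```

In this sketch the stored state stays small because it only ever holds hashes for rows at the current maximum cursor value.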

* Add note about google secret name normalization
* Change space -> whitespace

* feat(airflow): allow re-using sources in airflow wrapper
* lint fix

* feat(core): drop default value for write disposition
* don't use default value in apply_hints
* applies default write disposition in empty apply hints

---------

Co-authored-by: Marcin Rudolf

* start sink
* parquet sink prototype
* some more sink implementations
* finish first batch of helpers
* add missing tests and fix linting
* make configuration more versatile
* implement sink function progress state
* move to iterator
* persist sink load state in pipeline state
* fix unrelated typo
* move sink state storage to loadpackage state
* additional pr fixes
* disable creating empty state file on loadpackage init
* add sink docs page
* small changes
* make loadstorage state versioned and separate out common base functions
* restrict access of destinations to load package state in accessor functions
* fix tests
* add tests for state and new injectable context
* fix linter
* fix linter error
* some pr fixes
* more pr fixes
* small readme changes
* add load id to loadpackage info in current
* add support for directly passing through the naming convention to the sink
* add support for batch size zero (filepath passthrough)
* use patched version of flake8 encoding
* fix tests
* add support for secrets and config in sink
* update sink docs
* revert encodings branch
* fix small linting problem
* add support for config specs
* add possibility to create a resolved partial
* add lock for resolving config; add test for nested configs
* change resolved partial method to dedicated function
* change signatures in decorator; lock injection context for wrapped functions; small pr fixes
* fixes bug in inject wrapper refactor
* mark destination decorator as experimental in the docs
* change injection context locking strategy; forward generic destination call params into config; small fixes
* make tests independent from gcp imports
* move generic destination tests into common tests section destinations
* fix global instantiation test after file move
* add tests for locking injection context
* make inject test a bit better; make simple test for loading load package without state
* skip generic destination in init test

* first version of embedded snippets check
* add missing code block types where needed
* small change to snippets script
* fix all parser problems in code blocks
* add better error messages and add check to ci
* add linting of embedded snippets
* small improvement for snippets linting
* remove one ignored error code
* add ruff dep
* add mypy (comment out for now)
* fix bug in script
* ignore lint setup for embedded snippets
* fix linting and small mypy adjustments
* switches from shell to sh as shell block type
* make snippet checker code nicer
* small script changes and readme
* add lint and type check count

* docs(kafka): describe the possible sync issues

---------

Co-authored-by: rudolfix

* Cancel previous CI runs for PRs when new commits pushed
* Small readme example pipeline code formatting
* Include matrix.os in group
* Use max-parallel: 3 for lint and common tests
* Skip matrix.os for linting
* Skip matrix.os for destination workflows
* Revert common tests workflow
* Parallelize linting
* Use only ubuntu to run linting
* Run qdrant tests only on linux
* Remove redundant qdrant workflow checks for windows os

* rename tests file
* add setting to skip dlt internal tables and columns in custom destination
* add nesting level setting to custom destination; update readme
* use correct internal dlt schema item marker; propagate the max_nesting_level the correct way from the destination caps
* add example for custom destination bigquery
* fix embedded snippet checker output
* add custom destination example to docs
* update custom destination example
* pin flake8-encodings to fork
* fix snippet marker
* ignore google imports
* Docs: fix custom destination (#1113)
* removed sink mentions, fixed code snippets
* rename title
* trigger tests
* trigger tests 2
* revert changes
* small edits
* pin databind.json python package
* pin databind core
* add bigquery extra for snippets tests
* updates to the readme
* rename function for nesting level test

---------

Co-authored-by: Alena Astrakhantseva

* add a script to check whether our poetry lockfile is in order
* small script changes
* convert script to python; move tools to tools folder
* add encoding information

* splits pandas and arrow imports
* fixes arrow without pandas deps test

* always register SPEC for f when injecting, fixes base tests
* always synthesizes a spec even if fields are not added, keeps the base class fields if sig not annotated
* fixes how base and spec are used in sink factory
* don't fail on destination instantiation if no callable arg provided
* add docstring for decorator; rename config spec; format generic destination example
* allow destination decorator to be used without args; remove some unneeded things from the destination tests
* add tests for base spec of custom destination; fix tests for source decorator
* improves sink spec test

---------

Co-authored-by: Dave

* Streamlit improvements
* Refactor menu and create custom widgets
* Add tag component
* Add destination name to sidebar
* Adjust headers
* Set color scheme
* Fix linting issues
* Move import dlt to the top
* Use default tag type
* Use lower contrast shade of white
* Use smaller heading element
* Handle dummy destination errors and implement custom light and dark modes
* Cleanup streamlit app code
* Fix linting issues
* Fix linting issues
* Extract components from streamlit explorer code
* Cleanup redundant data display
* Add tag bold text option
* Cast label to string
* Rework document sections and display resource incremental if given
* Do not display primary and merge keys if they are not specified
* Integrate streamlit app tests
* Fix mypy issue
* Add general rendering test checks for streamlit
* Set default color mode as light
* Display resource state in expandable area
* More rendering checks
* Cleanup tests
* Add test case for streamlit app with dummy dlt destination
* Ignore B015 error
* Fix linting errors
* Add hot reload support for streamlit via DLT_STREAMLIT_HOT_RELOAD variable
* Move non-essential info from sidebar to load info page
* Expand pipeline summary block by default
* Sort table by column names
* Fix mypy errors
* Pass pipelines dir and use bool argument for streamlit hot reload
* Keep resource state expanded if it is missing the state
* Use DLT_PIPELINES_DIR in load info as well
* Remove unused import
* Do not sort table column
* Extract pipeline attaching logic
* Use pipeline name from cli arguments for load_info page
* Move pipeline_state_info into blocks
* Remove comment
* Move menu into blocks
* Extract pipeline loading logic
* Show simple unordered list with pipeline summary
* Cleanup redundant code
* Adjust tests
* Remove unused code
* Move dashboard into pages
* Refactor querying logic and stop using deprecated experimental caching for streamlit
* Fix mypy errors
* Use get_dlt_data_dir to resolve pipelines_dir
* Add more tests and checks
* Pass DLT_PIPELINES_DIR instead of modifying DLT_DATA_DIR
* Restore os environment after streamlit app exits
* Remove max-parallel for linting
* Allow linting to fail fast and fix linting errors
* Fix mypy errors
* Format code
* Show message when pipelines dir is passed
* Show message if pipelines_dir is passed
* Copy streamlit package check to streamlit_app module `__init__`
* Adjust mypy ignore
* Fix mypy issues
* moves load info around
* Pass pipeline params after terminator
* Remove info message when pipelines dir is passed
* Restore system args after test
* Fix linting error
* Adjust streamlit command arguments passing
* Add comments
* Manually enforce column order

---------

Co-authored-by: Marcin Rudolf

* allows getting data tables that have seen data
* always runs empty jobs through normalizer, marks them as seen data
* defines and tracks special empty list items in extractor
* adds dlt mark and tests for materialize table items
* does not run through empty normalize when table has seen data

* Add example link to the custom destination page
* Fix grammar
* Update fix grammar instructions
* docs sink cross references
* fixes grammar

---------

Co-authored-by: Marcin Rudolf

netlify · 2024-03-21T19:13:03Z

✅ Deploy Preview for dlt-hub-docs ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | `f1ec901` |
| 🔍 Latest deploy log | https://app.netlify.com/sites/dlt-hub-docs/deploys/65fc8cdaeca5ef0008a8fda4 |
| 😎 Deploy Preview | https://deploy-preview-1126--dlt-hub-docs.netlify.app |

rudolfix and others added 26 commits March 8, 2024 20:28

Fix grammar in /docs/dlt-ecosystem/destinations/ (#1070)

ab213fa

Fix grammar in docs: batch 2 (#1075)

6604289

Docs: Add note about google secret name normalization (#1056)

5275449

* Add note about google secret name normalization
* Change space -> whitespace

validates class instances in typed dict (#1082)

a0e83b7

bumps for pre-release 0.4.7a0

8d7ca4c

feat(airflow): allow re-using sources in airflow wrapper (#1080)

ee5db59

* feat(airflow): allow re-using sources in airflow wrapper
* lint fix

feat(core): drop default value for write disposition (#1057)

e622300

* feat(core): drop default value for write disposition
* don't use default value in apply_hints
* applies default write disposition in empty apply hints

---------

Co-authored-by: Marcin Rudolf

docs(airflow): add description of new decompose methods (#1072)

3a007a4

Clarify process for enhancements and bug fixes (#1096)

5e0b8b4

docs(kafka): describe the possible sync issues (#1100)

50af01b

* docs(kafka): describe the possible sync issues

---------

Co-authored-by: rudolfix

improves state upgrade path error message (#1108)

7e30318

add grammar fixing script to docs tools (#1117)

8b28226

add a script to check whether our poetry lockfile is in order (#1103)

36cf442

* add a script to check whether our poetry lockfile is in order
* small script changes
* convert script to python; move tools to tools folder
* add encoding information

splits pandas and arrow imports (#1112)

f52e2e4

* splits pandas and arrow imports
* fixes arrow without pandas deps test

improve no schema upgrade path exception (#1125)

3990973

Remove old streamlit app (#1124)

262d3ba

Add example link to the custom destination page (#1120)

04410ab

* Add example link to the custom destination page
* Fix grammar
* Update fix grammar instructions
* docs sink cross references
* fixes grammar

---------

Co-authored-by: Marcin Rudolf

bumps to 0.4.7

9c1d808

rudolfix added 2 commits March 21, 2024 20:29

Merge branch 'master' into devel

2eae039

fixes snippet, unlinks lint_me at the end

f1ec901

rudolfix merged commit be12a1c into master Mar 22, 2024
43 of 54 checks passed
