master merge for 0.4.7 release #1126

Merged 28 commits on Mar 22, 2024

Commits
641d7ba
simplifies and fixes incremental / fixes #971 (#1062)
rudolfix Mar 8, 2024
ab213fa
Fix grammar in /docs/dlt-ecosystem/destinations/ (#1070)
burnash Mar 11, 2024
6604289
Fix grammar in docs: batch 2 (#1075)
burnash Mar 11, 2024
5275449
Docs: Add note about google secret name normalization (#1056)
sultaniman Mar 12, 2024
a0e83b7
validates class instances in typed dict (#1082)
rudolfix Mar 12, 2024
8d7ca4c
bumps for pre-release 0.4.7a0
rudolfix Mar 12, 2024
ee5db59
feat(airflow): allow re-using sources in airflow wrapper (#1080)
IlyaFaer Mar 12, 2024
e622300
feat(core): drop default value for write disposition (#1057)
IlyaFaer Mar 12, 2024
7f43e76
Generic destination / sink decorator (#1065)
sh-rp Mar 14, 2024
3a007a4
docs(airflow): add description of new decompose methods (#1072)
IlyaFaer Mar 14, 2024
5e0b8b4
Clarify process for enhancements and bug fixes (#1096)
burnash Mar 15, 2024
ce701b5
check embedded code blocks (#1093)
sh-rp Mar 18, 2024
50af01b
docs(kafka): describe the possible sync issues (#1100)
IlyaFaer Mar 18, 2024
7e30318
improves state upgrade path error message (#1108)
sh-rp Mar 18, 2024
82ea4fd
Cancel previous CI runs for PRs when new commits pushed (#1109)
sultaniman Mar 19, 2024
713aa31
Extend custom destination (#1107)
sh-rp Mar 20, 2024
8b28226
add grammar fixing script to docs tools (#1117)
sh-rp Mar 20, 2024
36cf442
add a script to check whether our poetry lockfile is in order (#1103)
sh-rp Mar 20, 2024
f52e2e4
splits pandas and arrow imports (#1112)
rudolfix Mar 20, 2024
1f2b4ce
custom destination fixes (#1119)
rudolfix Mar 21, 2024
3a815bc
Streamlit improvements (#1060)
sultaniman Mar 21, 2024
3990973
improve no schema upgrade path exception (#1125)
sh-rp Mar 21, 2024
92bf3a0
materializes table schemas for empty tables (#1122)
rudolfix Mar 21, 2024
262d3ba
Remove old streamlit app (#1124)
sultaniman Mar 21, 2024
04410ab
Add example link to the custom destination page (#1120)
VioletM Mar 21, 2024
9c1d808
bumps to 0.4.7
rudolfix Mar 21, 2024
2eae039
Merge branch 'master' into devel
rudolfix Mar 21, 2024
f1ec901
fixes snippet, unlinks lint_me at the end
rudolfix Mar 21, 2024
11 changes: 8 additions & 3 deletions .github/workflows/lint.yml
@@ -8,6 +8,10 @@ on:
- devel
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

jobs:
get_docs_changes:
uses: ./.github/workflows/get_docs_changes.yml
@@ -17,9 +21,10 @@ jobs:
needs: get_docs_changes
if: needs.get_docs_changes.outputs.changes_outside_docs == 'true'
strategy:
fail-fast: false
fail-fast: true
matrix:
os: ["ubuntu-latest", "macos-latest", "windows-latest"]
os:
- ubuntu-latest
python-version: ["3.8.x", "3.9.x", "3.10.x", "3.11.x"]

defaults:
@@ -75,4 +80,4 @@ jobs:
- name: Check matrix job results
if: contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled')
run: |
echo "One or more matrix job tests failed or were cancelled. You may need to re-run them." && exit 1
echo "One or more matrix job tests failed or were cancelled. You may need to re-run them." && exit 1
4 changes: 4 additions & 0 deletions .github/workflows/test_airflow.yml
@@ -7,6 +7,10 @@ on:
- devel
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

jobs:
get_docs_changes:
uses: ./.github/workflows/get_docs_changes.yml
4 changes: 4 additions & 0 deletions .github/workflows/test_build_images.yml
@@ -7,6 +7,10 @@ on:
- devel
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

jobs:
get_docs_changes:
uses: ./.github/workflows/get_docs_changes.yml
18 changes: 17 additions & 1 deletion .github/workflows/test_common.yml
@@ -1,4 +1,3 @@

name: test common

on:
@@ -8,6 +7,10 @@ on:
- devel
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

env:
RUNTIME__LOG_LEVEL: ERROR

@@ -92,6 +95,19 @@ jobs:
name: Run smoke tests with minimum deps Windows
shell: cmd

- name: Install pyarrow
run: poetry install --no-interaction -E duckdb -E cli -E parquet --with sentry-sdk

- run: |
poetry run pytest tests/pipeline/test_pipeline_extra.py -k arrow
if: runner.os != 'Windows'
name: Run pipeline tests with pyarrow but no pandas installed
- run: |
poetry run pytest tests/pipeline/test_pipeline_extra.py -k arrow
if: runner.os == 'Windows'
name: Run pipeline tests with pyarrow but no pandas installed Windows
shell: cmd

- name: Install pipeline dependencies
run: poetry install --no-interaction -E duckdb -E cli -E parquet --with sentry-sdk --with pipeline

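The new steps above install pyarrow without pandas and run the arrow pipeline tests, exercising the import split from "splits pandas and arrow imports (#1112)". Below is a minimal sketch of the kind of guarded import this enables; the function names and module layout are illustrative assumptions, not dlt's actual code.

```python
# Illustrative sketch (not dlt's actual module layout): import pyarrow and
# pandas independently so that arrow-only environments keep working.
try:
    import pyarrow as pa
except ImportError:
    pa = None

try:
    import pandas as pd
except ImportError:
    pd = None


def rows_to_arrow(rows: list) -> "pa.Table":
    """Build an Arrow table from a list of dicts; needs pyarrow only."""
    if pa is None:
        raise ImportError("pyarrow is required to build Arrow tables")
    return pa.Table.from_pylist(rows)


def rows_to_dataframe(rows: list) -> "pd.DataFrame":
    """Build a DataFrame; pandas is required only for this path."""
    if pd is None:
        raise ImportError("pandas is required to build DataFrames")
    return pd.DataFrame(rows)
```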
4 changes: 4 additions & 0 deletions .github/workflows/test_dbt_cloud.yml
@@ -8,6 +8,10 @@ on:
- devel
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

env:
# all credentials must be present to be passed to dbt cloud
DBT_CLOUD__ACCOUNT_ID: ${{ secrets.DBT_CLOUD__ACCOUNT_ID }}
4 changes: 4 additions & 0 deletions .github/workflows/test_dbt_runner.yml
@@ -8,6 +8,10 @@ on:
- devel
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

env:

DLT_SECRETS_TOML: ${{ secrets.DLT_SECRETS_TOML }}
4 changes: 4 additions & 0 deletions .github/workflows/test_destination_athena.yml
@@ -8,6 +8,10 @@ on:
- devel
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

env:
DLT_SECRETS_TOML: ${{ secrets.DLT_SECRETS_TOML }}

4 changes: 4 additions & 0 deletions .github/workflows/test_destination_athena_iceberg.yml
@@ -8,6 +8,10 @@ on:
- devel
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

env:
DLT_SECRETS_TOML: ${{ secrets.DLT_SECRETS_TOML }}

4 changes: 4 additions & 0 deletions .github/workflows/test_destination_bigquery.yml
@@ -8,6 +8,10 @@ on:
- devel
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

env:
DLT_SECRETS_TOML: ${{ secrets.DLT_SECRETS_TOML }}

4 changes: 4 additions & 0 deletions .github/workflows/test_destination_databricks.yml
@@ -8,6 +8,10 @@ on:
- devel
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

env:
DLT_SECRETS_TOML: ${{ secrets.DLT_SECRETS_TOML }}

4 changes: 4 additions & 0 deletions .github/workflows/test_destination_mssql.yml
@@ -8,6 +8,10 @@ on:
- devel
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

env:

DLT_SECRETS_TOML: ${{ secrets.DLT_SECRETS_TOML }}
15 changes: 7 additions & 8 deletions .github/workflows/test_destination_qdrant.yml
@@ -7,6 +7,10 @@ on:
- devel
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

env:
DLT_SECRETS_TOML: ${{ secrets.DLT_SECRETS_TOML }}

@@ -28,7 +32,8 @@ jobs:
strategy:
fail-fast: false
matrix:
os: ["ubuntu-latest", "macos-latest", "windows-latest"]
os:
- ubuntu-latest
defaults:
run:
shell: bash
@@ -64,13 +69,7 @@ jobs:
run: poetry install --no-interaction -E qdrant -E parquet --with sentry-sdk --with pipeline
- run: |
poetry run pytest tests/load/
if: runner.os != 'Windows'
name: Run tests Linux/MAC
- run: |
poetry run pytest tests/load/
if: runner.os == 'Windows'
name: Run tests Windows
shell: cmd
name: Run tests Linux

matrix_job_required_check:
name: Qdrant loader tests
4 changes: 4 additions & 0 deletions .github/workflows/test_destination_snowflake.yml
@@ -8,6 +8,10 @@ on:
- devel
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

env:
DLT_SECRETS_TOML: ${{ secrets.DLT_SECRETS_TOML }}

6 changes: 5 additions & 1 deletion .github/workflows/test_destination_synapse.yml
@@ -7,6 +7,10 @@ on:
- devel
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

env:
DLT_SECRETS_TOML: ${{ secrets.DLT_SECRETS_TOML }}

@@ -24,7 +28,7 @@ jobs:
run_loader:
name: Tests Synapse loader
needs: get_docs_changes
if: needs.get_docs_changes.outputs.changes_outside_docs == 'true'
if: needs.get_docs_changes.outputs.changes_outside_docs == 'true'
strategy:
fail-fast: false
matrix:
4 changes: 4 additions & 0 deletions .github/workflows/test_destinations.yml
@@ -8,6 +8,10 @@ on:
- devel
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

env:

DLT_SECRETS_TOML: ${{ secrets.DLT_SECRETS_TOML }}
6 changes: 5 additions & 1 deletion .github/workflows/test_doc_snippets.yml
@@ -8,6 +8,10 @@ on:
- devel
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

env:
DLT_SECRETS_TOML: ${{ secrets.DLT_SECRETS_TOML }}

@@ -54,7 +58,7 @@ jobs:

- name: Install dependencies
# if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true'
run: poetry install --no-interaction -E duckdb -E weaviate -E parquet -E qdrant --with docs,sentry-sdk --without airflow
run: poetry install --no-interaction -E duckdb -E weaviate -E parquet -E qdrant -E bigquery --with docs,sentry-sdk --without airflow

- name: create secrets.toml
run: pwd && echo "$DLT_SECRETS_TOML" > docs/website/docs/.dlt/secrets.toml
4 changes: 4 additions & 0 deletions .github/workflows/test_local_destinations.yml
@@ -10,6 +10,10 @@ on:
- devel
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

env:
DLT_SECRETS_TOML: ${{ secrets.DLT_SECRETS_TOML }}

9 changes: 9 additions & 0 deletions CONTRIBUTING.md
@@ -12,6 +12,15 @@ Thank you for considering contributing to **dlt**! We appreciate your help in ma
6. [Publishing (Maintainers Only)](#publishing-maintainers-only)
7. [Resources](#resources)

## Before You Begin

- **Proposing significant changes or enhancements**: If you're thinking about making significant changes, make sure to [submit an issue](https://github.com/dlt-hub/dlt/issues/new/choose) first. This ensures your efforts align with the project's direction and that you don't invest time on a feature that may not be merged.

- **Fixing bugs**:
- **Check existing issues**: search [open issues](https://github.com/dlt-hub/dlt/issues) to see if the bug you've found is already reported.
- If **not reported**, [create a new issue](https://github.com/dlt-hub/dlt/issues/new/choose). You're more than welcome to fix it and submit a pull request with your solution. Thank you!
- If the bug is **already reported**, please leave a comment on that issue stating you're working on fixing it. This helps keep everyone updated and avoids duplicate efforts.

## Getting Started

To get started, follow these steps:
8 changes: 5 additions & 3 deletions Makefile
@@ -47,7 +47,8 @@ dev: has-poetry
poetry install --all-extras --with airflow --with docs --with providers --with pipeline --with sentry-sdk

lint:
./check-package.sh
./tools/check-package.sh
poetry run python ./tools/check-lockfile.py
poetry run mypy --config-file mypy.ini dlt tests
poetry run flake8 --max-line-length=200 dlt
poetry run flake8 --max-line-length=200 tests --exclude tests/reflection/module_cases
@@ -60,8 +61,9 @@ format:
# poetry run isort ./

test-and-lint-snippets:
poetry run mypy --config-file mypy.ini docs/website docs/examples
poetry run flake8 --max-line-length=200 docs/website docs/examples
cd docs/tools && poetry run python check_embedded_snippets.py full
poetry run mypy --config-file mypy.ini docs/website docs/examples docs/tools --exclude docs/tools/lint_setup
poetry run flake8 --max-line-length=200 docs/website docs/examples docs/tools
cd docs/website/docs && poetry run pytest --ignore=node_modules

lint-security:
3 changes: 3 additions & 0 deletions README.md
@@ -41,19 +41,22 @@ Load chess game data from chess.com API and save it in DuckDB:
```python
import dlt
from dlt.sources.helpers import requests

# Create a dlt pipeline that will load
# chess player data to the DuckDB destination
pipeline = dlt.pipeline(
pipeline_name='chess_pipeline',
destination='duckdb',
dataset_name='player_data'
)

# Grab some player data from Chess.com API
data = []
for player in ['magnuscarlsen', 'rpragchess']:
response = requests.get(f'https://api.chess.com/pub/player/{player}')
response.raise_for_status()
data.append(response.json())

# Extract, normalize, and load the data
pipeline.run(data, table_name='player')
```
3 changes: 3 additions & 0 deletions dlt/__init__.py
@@ -29,6 +29,8 @@

from dlt import sources
from dlt.extract.decorators import source, resource, transformer, defer
from dlt.destinations.decorators import destination

from dlt.pipeline import (
pipeline as _pipeline,
run,
@@ -62,6 +64,7 @@
"resource",
"transformer",
"defer",
"destination",
"pipeline",
"run",
"attach",
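The `dlt.destination` export added above comes from the custom destination work in this release (#1065, #1107, #1119). A rough sketch of how such a sink can be used is below; the `batch_size` parameter and the `(items, table)` signature are assumptions to verify against the dlt 0.4.7 docs, not the definitive API.

```python
import dlt


# Hedged sketch: a callable decorated with dlt.destination acts as a custom sink.
# The decorator parameters and the (items, table) signature are assumptions to
# check against the dlt 0.4.7 documentation.
@dlt.destination(batch_size=10)
def print_sink(items, table) -> None:
    # items: a batch of extracted data items; table: the schema of their table
    print(f"received {len(items)} items for table {table['name']}")


pipeline = dlt.pipeline("custom_sink_demo", destination=print_sink)
pipeline.run([{"id": 1}, {"id": 2}], table_name="demo")
```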
6 changes: 6 additions & 0 deletions dlt/cli/_dlt.py
@@ -443,6 +443,12 @@ def main() -> int:
pipe_cmd.add_argument(
"--list-pipelines", "-l", default=False, action="store_true", help="List local pipelines"
)
pipe_cmd.add_argument(
"--hot-reload",
default=False,
action="store_true",
help="Reload streamlit app (for core development)",
)
pipe_cmd.add_argument("pipeline_name", nargs="?", help="Pipeline name")
pipe_cmd.add_argument("--pipelines-dir", help="Pipelines working directory", default=None)
pipe_cmd.add_argument(