From 0ca4f6bd5fa789e4930afda3765594cc95df215c Mon Sep 17 00:00:00 2001
From: dat-a-man <98139823+dat-a-man@users.noreply.github.com>
Date: Wed, 6 Mar 2024 12:04:11 +0000
Subject: [PATCH] Added a note on incremental behaviour

---
 .../website/docs/general-usage/incremental-loading.md | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/docs/website/docs/general-usage/incremental-loading.md b/docs/website/docs/general-usage/incremental-loading.md
index dd52c9c750..3aaa1678b9 100644
--- a/docs/website/docs/general-usage/incremental-loading.md
+++ b/docs/website/docs/general-usage/incremental-loading.md
@@ -296,6 +296,17 @@ We just yield all the events and `dlt` does the filtering (using `id` column dec
 Github returns events ordered from newest to oldest so we declare the `rows_order` as **descending** to [stop requesting more pages once the incremental value is out of range](#declare-row-order-to-not-request-unnecessary-data).
 We stop requesting more data from the API after finding first event with `created_at` earlier than `initial_value`.
 
+:::note
+**Note on Incremental Cursor Behavior:**
+When using incremental cursors for loading data, it's important to understand how dlt handles records in relation to the cursor's
+last value. dlt will load only those records for which the incremental cursor value is higher than the last known value of the cursor.
+This means that any records with a cursor value lower than or equal to the last recorded value will be ignored during the loading process.
+This behavior ensures efficiency by avoiding the reprocessing of records that have already been loaded, but it can lead to confusion if
+there are expectations of loading older records that fall below the current cursor threshold. If your use case requires the inclusion of
+such records, consider adjusting your data extraction logic or using a full refresh strategy where appropriate.
+:::
+
+
 ### max, min or custom `last_value_func`
 
 `dlt.sources.incremental` allows to choose a function that orders (compares) cursor values to current `last_value`.
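
The cursor rule described in the note, and the way an ordering function such as `max` or `min` changes which side of the cursor counts as "new", can be illustrated with a small plain-Python model. This is only a sketch of the behaviour as the note states it, not `dlt`'s actual implementation: the helper `filter_new_rows` and the sample rows are hypothetical, and details such as deduplication of boundary values are deliberately left out.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

Row = Dict[str, Any]


def filter_new_rows(
    rows: List[Row],
    cursor_path: str,
    last_value: Any,
    last_value_func: Callable[[Sequence[Any]], Any] = max,
) -> Tuple[List[Row], Any]:
    """Hypothetical helper: keep only rows that move the cursor past last_value."""
    kept: List[Row] = []
    new_last = last_value
    for row in rows:
        value = row[cursor_path]
        # A row counts as new only if combining it with the stored cursor
        # yields something other than the stored cursor itself.
        # With max: value > last_value. With min: value < last_value.
        if last_value_func((value, last_value)) != last_value:
            kept.append(row)
            # Track the cursor value that would be stored for the next run.
            new_last = last_value_func((value, new_last))
    return kept, new_last


# Sample data (assumed for illustration); ISO dates compare correctly as strings.
events = [
    {"id": 1, "created_at": "2024-03-01"},
    {"id": 2, "created_at": "2024-03-05"},
    {"id": 3, "created_at": "2024-03-06"},
]

newer, next_cursor = filter_new_rows(events, "created_at", "2024-03-05", max)
older, prev_cursor = filter_new_rows(events, "created_at", "2024-03-05", min)
```

With `max`, the row whose `created_at` equals the stored cursor is skipped along with everything older, which matches the note. Swapping in `min` reverses the direction, so only rows below the stored value are kept.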