PyAirbyte BigQuery Source Issue #46938
jroman-sh asked this question in Source Python CDK (unanswered)
Airbyte Versions
airbyte==0.18.1
airbyte-api==0.49.4
airbyte-cdk==5.13.0
airbyte-protocol-models==0.5.1
airbyte_protocol_models_dataclasses==0.13.0
airbyte_protocol_models_pdv2==0.13.0
Problem Statement
I'm trying to sync data between BigQuery and a PostgreSQL instance, but I'm running into an issue that I believe is caused by a problem in Airbyte's internal implementation.
Below is some mock code I’ve prepared to demonstrate my case:
This simple code syncs a table called twinco-staging.playground.test to PostgreSQL. The table, which is a mock table, looks like the following:
When I run this simple sync job I get the following error.
Looking at the log:
The error occurs because Airbyte tries to get the cursor of the table in order to get its size, but the cursor is null. This happens at IncrementalUtils.kt:20.
However, since I'm doing a full refresh and my source does not have any cursor field, Airbyte should not be looking at the cursor.
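To make the failure mode concrete, here is a minimal Python sketch of the behavior described above. It is only an analogy: ConfiguredStream and get_cursor_field are hypothetical names invented for illustration, not Airbyte or PyAirbyte APIs, and the TypeError stands in for the NullPointerException raised on the Kotlin side.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ConfiguredStream:
    """Hypothetical stand-in for a configured stream; not an Airbyte class."""
    name: str
    sync_mode: str                     # "full_refresh" or "incremental"
    cursor_field: Optional[List[str]]  # None when no cursor is configured


def get_cursor_field(stream: ConfiguredStream) -> str:
    # Assumed shape of the problem: the size of the cursor field is read
    # without first checking whether a cursor field exists at all.
    if len(stream.cursor_field) == 0:  # raises TypeError when cursor_field is None
        raise ValueError(f"No cursor field specified for stream {stream.name}")
    return stream.cursor_field[0]


def cursor_lookup_fails(stream: ConfiguredStream) -> bool:
    """Return True when the lookup blows up on a missing cursor."""
    try:
        get_cursor_field(stream)
    except TypeError:
        return True
    return False


# A stream marked incremental but without any cursor field, as in this report.
stream = ConfiguredStream(name="test", sync_mode="incremental", cursor_field=None)
print(cursor_lookup_fails(stream))  # True
```

A stream that does carry a cursor field, for example cursor_field=["id"], passes through the same lookup without error.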
When debugging my code to dig into the issue, I looked at the configured catalog of my source:
Here you can see that cursor_field=None. You may wonder why sync_mode=<SyncMode.incremental: 'incremental'>. This is configured by default by PyAirbyte's Source class, and it's not possible to configure it otherwise.
Code Issue
Looking back at the first lines of the log we can see the following:
The method validateCursorFieldForIncrementalTables calls getCursorFieldOptional, which calls getCursorField, where the NullPointerException happens.
Looking at this method's implementation, we can see the following condition, which I believe is wrong:
AbstractDBSource.kt:207
For each stream we calculate hasSourceDefinedCursor, which is a boolean variable that is true when the source has a cursor. As we saw before, my source does not have a cursor.
Then we skip the validation if at least one of the following conditions is true:
!tableNameToTable.containsKey(fullyQualifiedTableName): If the table name is not a key in tableNameToTable
airbyteStream.syncMode != SyncMode.INCREMENTAL: If it's not incremental, skip cursor validation
hasSourceDefinedCursor
Otherwise getCursorFieldOptional is called. This does not make logical sense. I believe that condition must be !hasSourceDefinedCursor
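The difference between the current and the proposed condition can be sketched in Python as follows. This is a simplified model under stated assumptions, not the code from AbstractDBSource.kt: the Stream class, its fields, and both functions are hypothetical names that only mirror the three conditions listed above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Stream:
    """Hypothetical model of one configured stream; not Airbyte's own type."""
    table_known: bool                # stands in for tableNameToTable.containsKey(...)
    incremental: bool                # stands in for syncMode == SyncMode.INCREMENTAL
    has_source_defined_cursor: bool  # stands in for hasSourceDefinedCursor
    cursor_field: Optional[List[str]]


def skips_validation_current(s: Stream) -> bool:
    # Condition as described in this post: skip when the source HAS a cursor.
    return (not s.table_known) or (not s.incremental) or s.has_source_defined_cursor


def skips_validation_proposed(s: Stream) -> bool:
    # Proposed change: skip when the source does NOT have a cursor.
    return (not s.table_known) or (not s.incremental) or (not s.has_source_defined_cursor)


def validate(s: Stream, skips: Callable[[Stream], bool]) -> str:
    if skips(s):
        return "skipped"
    if s.cursor_field is None:
        # Models the point where the NullPointerException is reported.
        return "null cursor"
    return "validated"


# The stream from this report: known table, marked incremental, no cursor at all.
my_stream = Stream(table_known=True, incremental=True,
                   has_source_defined_cursor=False, cursor_field=None)

print(validate(my_stream, skips_validation_current))   # null cursor
print(validate(my_stream, skips_validation_proposed))  # skipped
```

Note that under the proposed condition a stream that does have a source-defined cursor would go through validation instead of being skipped, so whether that matches the intended behavior is something the maintainers would need to confirm.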
What do you think, guys? If this is actually an issue, can anyone help me write it?
Thanks