-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug] Using BigQuery adapter, validation_errors - Invalid cast from STRING to JSON #42
Comments
Hi @MatthewDaniels thank you for opening this issue. I am aware of this JSON BigQuery rollout; however, I wasn't aware that this would impact the Xero package as we do not perform any JSON parsing or JSON operations in the package. It seems that this issue is a being caused by some of your Xero connectors that you are unioning have the However, when looking into this further I noticed we don't even bring this field into the downstream staging model (other any other downstream models). As a result, I believe the best fix here is to simply remove the field from the macro so it is not attempted to be unioned along with all the other fields being used in the package. Is this field crucial to any other analytics you are performing? If not, I would suggest we remove the field. Here is a branch you may test out that removes the field. Let me know if this solves your immediate issue. If you do not have any immediate concerns with removing this field then we may take the next steps to roll this out in an official release.
packages:
- git: https://github.com/fivetran/dbt_xero.git
revision: patch/json-union
warn-unpinned: false Let me know how this branch goes! |
Thankyou so much for getting back to me so quickly on this - I really appreciate it and all the work that has gone into these packages! Unfortunately I am facing the same issue after updating my package to use the patch branch (although have had partial success due to your explanation @fivetran-joemarkiewicz 👍 ) - perhaps I am missing something as I am fairly new to DBT (but not to coding / scripting / data engineering) - this also makes it difficult to step through the processes to debug myself (so any pointers would be massively welcomed!). The steps I have followed are:
I got the same error and the compiled SQL in the targets directory has the validation_errors still present in it for some reason. I also then attempted to explicitly execute I also searched the This is the contents of my packages.yml packages:
# - package: fivetran/xero
# version: [">=0.6.0", "<0.7.0"]
- git: https://github.com/fivetran/dbt_xero.git
revision: patch/json-union
warn-unpinned: false This is the contents of my package-lock.yml: packages:
- git: https://github.com/fivetran/dbt_xero.git
revision: patch/json-union
- git: https://github.com/fivetran/dbt_xero_source.git
revision: patch/json-union
- package: fivetran/fivetran_utils
version: 0.4.8
- package: dbt-labs/dbt_utils
version: 1.1.1
sha1_hash: 69caaeb46e02a33fe81eefb9f7da1f5c2eac34e0
This is the last few lines of the CLI output:
💯 Partial successI tested with a SINGLE SCHEMA (ie: without the unioning, which as noted may include some of the errors) and I did not get the problem (likely because the single schema had no validation errors). I am super interested to understand why this patch may not have excluded the validation errors column altogether. |
Hi @MatthewDaniels I am glad you are seeing partial success! 🎉 To get full success, can you try and also run If that doesn't work, I will take a deeper look to see if there is something else I need to adjust. |
Hi @fivetran-joemarkiewicz, sorry I wasn't clear (or rather added likely too much info 😄 ) in my response (and thanks again for the super quick response!), but I did attempt to explicitly execute Both with no change to the compiled SQL, and unsure how to follow up from here, sorry! |
Updated Understanding of the IssueHi @MatthewDaniels thanks for adding the additional context. I was confused why removing the field did not seem to work. However, upon further investigation I found that the macro I changed in the patch is not actually the cause of the issue. Instead we can see that within the In your case, it seems that some of your Xero connectors have received the JSON update while others have not, this results in the union_relations macro attempting to union the same field, but with differing datatypes. Thus resulting in the error you see. Possible Interim SolutionI was able to reproduce this error locally and made an adjustment to the underlying union_data macro that then calls the dbt_utils.union_relations macro to include and Please note you will need to fully delete your Ultimately, if this does solve the issue you are more than welcome to continue using this branch. However, I would encourage opening a support ticket and raising this datatype inconsistency to our support team to be addressed at the connector level. Thanks! |
This is awesome insight, thanks heaps @fivetran-joemarkiewicz . I can confirm that this worked with all of the schemas included! 🎉 With regards to the updating of the field to JSON in the Fivetran connector - I have found three of the 9 that have not been updated - one of which has not even synced in the last 6 months (so will need to fix that!). I think two of the sources have very little updates (if any) so perhaps Fivetran may not have completed the updates due to that? At any rate - my DBT issue is resolved and I thank you so much for the help! |
Thanks for sharing @MatthewDaniels! I am happy to hear that this patch resolved your immediate issue and that you will be able to work more closely with support to understand any updates that need to be applied to the connectors. In the meantime, I will actually keep this issue open (but will mark as won't fix) for the time being in case any others run into the same issue you experienced. Thanks again! |
Is there an existing issue for this?
Describe the issue
First, thanks for the great work on these models!
When using BigQuery as the data warehouse for my data and running the XERO transform DBT models, there is one temp step that fails, causing the invoice line items model to be skipped.
The step is stg_xero__invoice_line_item_tmp and the error is in the following line:
cast('validation_errors' as JSON) as 'validation_errors'
(this ends up on line 26 of the rendered SQL in BQ, but for my use case, I am using multiple schemas to create a single set of output models)
The error being returned by BigQuery is the following:
Invalid cast from STRING to JSON at [111:26]
(again, the line numbers are likely specific to my project as I have multiple input schemas)
I noticed recently for the XERO connectors (and a few others) in Fivetran whilst using BigQuery as the destination that there was the following notification and wondered if the two were related?
I am unsure how to troubleshoot further and support where I can -
Relevant error log or model output
Using
threads: 4
in the profile.No other project specific vars or configurations are used.
Package versions
What database are you using dbt with?
bigquery
dbt Version
After attempting an update, I get the same issues with the following output from
dbt --version
Additional Context
No response
Are you willing to open a PR to help address this issue?
The text was updated successfully, but these errors were encountered: