Automatically read JSON types #1203

Open
Guilherme-B opened this issue Mar 25, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

Guilherme-B commented Mar 25, 2024

As noted in the repository, Spark does not support JSON types, so the BQ connector converts each JSON record into a String. Spark 3.4.0 introduced a new method, `to`, which casts a DataFrame to a target schema. However, that cast fails for these columns, which forces us to use the fromJson function instead. That in turn means storing a separate StructType for every column that needs to be parsed; in other words, hardcoding.

Is there any other way to do this? Can we somehow determine whether a column has the JSON type in BigQuery (that metadata does not seem to be available on the DataFrame once it has been read) and, if so, retrieve only the schema portion relevant to that column and convert it into a StructType?
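To make the first half of the question concrete, here is a minimal sketch of how the JSON columns could be identified outside Spark, assuming the table schema has already been fetched in the shape the BigQuery REST API returns (a `fields` list whose entries carry `name`, `type`, and optional nested `fields`). The helper name `json_column_paths` and the sample schema are hypothetical, made up for illustration; this is not part of the connector.

```python
import json


def json_column_paths(fields, prefix=""):
    """Return dotted paths of every field whose BigQuery type is JSON.

    `fields` is assumed to be a list of dicts shaped like the
    REST API's table schema entries (name / type / fields).
    """
    paths = []
    for field in fields:
        path = prefix + field["name"]
        field_type = field["type"].upper()
        if field_type == "JSON":
            paths.append(path)
        elif field_type in ("RECORD", "STRUCT"):
            # Recurse into nested records so nested JSON fields are found too.
            paths.extend(json_column_paths(field.get("fields", []), path + "."))
    return paths


# Hypothetical schema, for illustration only.
SAMPLE_SCHEMA = json.loads("""
{
  "fields": [
    {"name": "id", "type": "INTEGER", "mode": "REQUIRED"},
    {"name": "payload", "type": "JSON", "mode": "NULLABLE"},
    {"name": "meta", "type": "RECORD", "mode": "NULLABLE", "fields": [
      {"name": "source", "type": "STRING"},
      {"name": "attributes", "type": "JSON"}
    ]}
  ]
}
""")

print(json_column_paths(SAMPLE_SCHEMA["fields"]))
# → ['payload', 'meta.attributes']
```

This only answers which columns are JSON. As far as I understand, a BigQuery JSON column carries no sub-schema of its own, so the StructType would still have to come from somewhere else, for example by inferring it from sample values with Spark's `schema_of_json` before calling `from_json`.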

@davidrabinowitz davidrabinowitz added the enhancement New feature or request label Mar 26, 2024