Add a method to add schema information to a BigQueryRelation #1232

Open
wants to merge 1 commit into
base: develop
DJSagarAhire
Contributor

This PR adds a method for attaching schema information to a relation. A plugin that wishes to request validation can call it.

The current intention is for Wrangler to call this in its transform() method to supply schema information, which Wrangler already has in the oSchema variable.

@DJSagarAhire DJSagarAhire added build Trigger unit test build bq-pushdown labels Apr 19, 2023
@DJSagarAhire DJSagarAhire requested review from tivv and fernst April 19, 2023 10:32
@DJSagarAhire DJSagarAhire self-assigned this Apr 19, 2023
* @param schema The schema.
* @return A new relation with the schema added.
*/
public Relation addSchema(@Nullable Schema schema) {
Contributor

Please validate that there is no existing schema, unless you expect schema overrides.

Contributor Author

What should we ideally be doing in case a schema already exists? Since we're returning a new Relation, I think overriding the schema should be fine.
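To make the two options in this exchange concrete, here is a minimal sketch, not the code in this PR. `SketchRelation` is a hypothetical stand-in for the relation class, and a plain `Map<String, String>` (field name to type name) stands in for the real `Schema` type. Both variants return a new instance and leave the receiver unchanged; they differ only in whether an existing schema is rejected or replaced.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a relation; not the actual BigQueryRelation class.
final class SketchRelation {
  private final String expression;
  // Field name -> type name; null means no schema has been attached yet.
  private final Map<String, String> schema;

  SketchRelation(String expression, Map<String, String> schema) {
    this.expression = expression;
    // Defensive copy so later changes to the caller's map cannot leak in.
    this.schema = schema == null ? null : new LinkedHashMap<>(schema);
  }

  Map<String, String> getSchema() {
    return schema == null ? null : Collections.unmodifiableMap(schema);
  }

  // Option A (reviewer's suggestion): refuse to replace an existing schema.
  SketchRelation addSchemaStrict(Map<String, String> newSchema) {
    if (schema != null) {
      throw new IllegalStateException("Relation already has a schema");
    }
    return new SketchRelation(expression, newSchema);
  }

  // Option B (author's suggestion): always override; the receiver is untouched.
  SketchRelation addSchemaOverriding(Map<String, String> newSchema) {
    return new SketchRelation(expression, newSchema);
  }
}
```

With Option A, an accidental second call fails fast; with Option B, the last caller wins silently, which is only safe if overrides are an expected use case.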

Contributor

I don't know how you plan to use this method. Generally, it's better to introduce methods when they are needed (even with a mock implementation) than vice versa.

Contributor Author

I've mentioned that already in the PR description above. I plan to use it in Wrangler in transform() just like any other Relation operation. I don't see any other way to add a schema to the Relation.

Contributor

Wrangler should have nothing to do with schema management. How would Wrangler know the schema of a SQL expression?

Contributor

Also, it's an unneeded burden on the plugin. Given the known schema of the original relation and the known schema of all expressions, the resulting schema should be constructed automatically by the platform.
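As a rough illustration of what platform-side derivation could look like, the sketch below computes an output schema from an input schema plus the known result type of each expression. Everything here is an assumption made for illustration: `DerivedSchemaSketch`, its method names, and the use of a `Map<String, String>` (field name to type name) in place of the real `Schema` type are hypothetical, and the platform may not work this way.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not part of the platform API.
final class DerivedSchemaSketch {
  private DerivedSchemaSketch() { }

  // Adding or replacing a column: output = input plus one field whose type
  // is the (assumed known) result type of the expression.
  static Map<String, String> setColumn(Map<String, String> input, String column, String exprType) {
    Map<String, String> out = new LinkedHashMap<>(input);
    out.put(column, exprType);
    return out;
  }

  // Dropping a column: output = input minus that field.
  static Map<String, String> dropColumn(Map<String, String> input, String column) {
    Map<String, String> out = new LinkedHashMap<>(input);
    out.remove(column);
    return out;
  }

  // Selecting: output contains exactly the selected columns, in selection order.
  static Map<String, String> select(Map<String, String> selectedTypes) {
    return new LinkedHashMap<>(selectedTypes);
  }
}
```

Under this approach the plugin never supplies a schema itself; each operation returns a new map and leaves its input untouched, so the derived schema can travel with the new relation.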

@DJSagarAhire DJSagarAhire changed the title Add a method to add schema information to a relation Add a method to add schema information to a BigQueryRelation Apr 19, 2023
@fernst fernst requested a review from tivv April 19, 2023 20:00
Labels
bq-pushdown build Trigger unit test build
3 participants