[spark] Support table options via SQL conf for Spark Engine #4393
Conversation
@YannByron @Aitozi Hi, would you kindly review this?
Do we need to support tables with the same name in different databases/catalogs, just like Flink's global options do? #2104
I think we should find a unified approach that works for both Flink and Spark.
@Aitozi @JingsongLi Thanks for the reply. +1 for unifying this.
@JingsongLi @Aitozi Table options format: dynamic table options will override global options if there are conflicts. WDYT?
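A minimal sketch of the precedence described here: when the same key is set both globally and at the table level, the table-level (dynamic) value wins. The key and values below are illustrative only.

```scala
// Illustrative sketch: table-level options override global options on conflict.
val globalOptions = Map("scan.tag-name" -> "tag_global")
val tableOptions  = Map("scan.tag-name" -> "tag_3")

// With ++, entries from the right-hand map override those on the left,
// so table-level options take precedence over global ones.
val effective = globalOptions ++ tableOptions
assert(effective("scan.tag-name") == "tag_3")
```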
@Aitozi I've updated the dynamic global options format for Flink.
Got it, LGTM.
Why does the Flink format contain ${catalogName}, but the Spark one does not?
@Zouxxyy I've updated the Spark table option format.
@Zouxxyy @JingsongLi CI has passed, please take a look.
Thanks @xiangyuf, looks good to me!
// Build the catalog context from the table options merged with matching
// entries from the Spark session's SQL conf.
val catalogContext = CatalogContext.create(
  Options.fromMap(mergeSQLConf(options)),
  SparkSession.active.sessionState.newHadoopConf())
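For context, a hypothetical sketch of what a helper like mergeSQLConf could do under the assumptions in this thread: overlay session conf entries that match a per-table prefix onto the table's own options, with session entries winning on conflict. The prefix format, names, and signature below are illustrative, not the PR's actual implementation.

```scala
import scala.collection.JavaConverters._
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: merge "spark.paimon.<db>.<table>." keys from the
// session SQL conf into the table's options; conf entries win on conflict.
def mergeSQLConfSketch(
    options: java.util.Map[String, String],
    dbName: String,
    tableName: String): java.util.Map[String, String] = {
  val prefix = s"spark.paimon.$dbName.$tableName."
  val fromConf = SparkSession.active.conf.getAll.collect {
    case (k, v) if k.startsWith(prefix) => k.stripPrefix(prefix) -> v
  }
  (options.asScala ++ fromConf).asJava
}
```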
For SparkSource loadTable, maybe we can just keep the original way, because it is simpler and more general to set table-level config through spark.read.format("paimon").options(). And it is difficult to get the catalog name and db name here.
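For reference, a minimal sketch of that original way: a per-read table option set through the DataFrame API, which applies only to that read rather than globally. The warehouse path and tag name are illustrative.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// Table-level option for this read only; path and tag name are illustrative.
val df = spark.read
  .format("paimon")
  .option("scan.tag-name", "tag_3")
  .load("/path/to/warehouse/db.db/t1")
df.show()
```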
Purpose
Linked issue: close #4371
In some cases, users may want to use Spark time travel by setting a property such as SET spark.paimon.scan.tag-name=tag_3. However, this property takes effect globally if the Spark job reads multiple tables at the same time. It would be better to support table options via SQL conf for the Spark engine, so users can specify different time travel options for different tables, as in the sketch below:
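A minimal sketch of the desired usage, assuming the per-table key format spark.paimon.&lt;dbName&gt;.&lt;tableName&gt;.&lt;key&gt; discussed in this thread; database, table, and tag names are illustrative.

```scala
// Each table gets its own time travel option instead of one global setting.
spark.sql("SET spark.paimon.db1.t1.scan.tag-name=tag_1")
spark.sql("SET spark.paimon.db1.t2.scan.tag-name=tag_2")

spark.sql("SELECT * FROM db1.t1").show()
spark.sql("SELECT * FROM db1.t2").show()
```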
Tests
API and Format
Documentation