
[Feature][core] Flink SQL task Support insert result preview #3893

Closed
3 tasks done
MactavishCui opened this issue Oct 31, 2024 · 11 comments · Fixed by #3897
Labels: New Feature

Comments

MactavishCui (Contributor) commented Oct 31, 2024

Search before asking

  • I had searched in the issues and found no similar feature requirement.

Description

A short description of your feature

The insert results of a Flink SQL task can be previewed during debugging, and the debug inserts do not affect production environment data.

Why is this feature useful to most users?

Situation
Currently, Dinky only supports previewing the result of a single SELECT statement, which is inconvenient when debugging Flink SQL tasks. Every sink of a multi-sink Flink SQL task needs a separate SELECT statement just to check what would be inserted, and those SELECT statements must then be changed back to INSERT statements before production deployment; errors can slip in during these rewrites, and frequently switching statements wastes users' time (see the sketch below).
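To make the pain point concrete, here is a minimal sketch (the table names ods_orders, ods_payments, dwd_orders and dwd_payments are made up for illustration) of a two-sink task in its production form and in the SELECT-based debug form users currently have to switch to and from:

```sql
-- Production version of a two-sink task: results go straight to the real sinks,
-- so there is nothing to preview in Dinky today.
INSERT INTO dwd_orders   SELECT id, user_id, amount, ts      FROM ods_orders;
INSERT INTO dwd_payments SELECT id, order_id, pay_amount, ts FROM ods_payments;

-- Debug version: each INSERT is temporarily rewritten as a SELECT just to preview
-- the result, and must be changed back before deployment.
SELECT id, user_id, amount, ts      FROM ods_orders;
SELECT id, order_id, pay_amount, ts FROM ods_payments;
```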

Additional information:

The real-time data warehouse platform of Meituan NAU has implemented a similar feature: insert functions are mocked during debugging, and debug results are written to and read back from S3. Statements do not need to be changed before deployment, result preview for multi-insert tasks is supported, and production data is not affected. REF: https://zhuanlan.zhihu.com/p/532657279

Use case

A possible solution:

Dinky already supports settings such as job auto-cancel and a maximum number of captured rows, and data preview is implemented through SelectResult.class. To stay compatible with this existing logic and reuse as much previous code as possible, I designed the following scheme: a customized connector saves the inserted data into accumulators, so the results can be caught through TableResult.class and handled in much the same way as SelectResult.class, which lets more code be reused. If a task is set to be mocked, SqlExplainer changes the connector options of each sink table to the customized mock connector.
Solution
[Diagram: MockFunction]
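As a rough sketch of the rewrite described above, assuming a placeholder connector identifier 'dinky-mock' (not the actual implementation), the explainer would only swap the sink's WITH options and leave the schema and the INSERT statements untouched:

```sql
-- Sink DDL as written by the user, pointing at the production database.
CREATE TABLE dwd_orders (
  id BIGINT,
  amount DECIMAL(10, 2)
) WITH (
  'connector' = 'jdbc',
  'url' = 'jdbc:mysql://prod-host:3306/dw',
  'table-name' = 'dwd_orders'
);

-- The same DDL after the task is mocked: rows are captured into accumulators and
-- surfaced through TableResult instead of being written to the real database.
CREATE TABLE dwd_orders (
  id BIGINT,
  amount DECIMAL(10, 2)
) WITH (
  'connector' = 'dinky-mock'
);
```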

My attempt

Based on the scheme described above, I have implemented this feature in my local repository. The results are shown below:
[Screenshot: insert result preview]
If this proposal is approved, I will submit a PR in the following steps:

  1. Customized connector.
  2. MockExplainer.class: rewrites INSERT statements into mocked statements based on a template.
  3. Code changes in the core and admin modules: back-end implementation.
  4. Front-end implementation.

Looking forward to your reply and discussion.

Related issues

No.

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

MactavishCui added the New Feature and Waiting for reply labels on Oct 31, 2024
aiwenmo removed the Waiting for reply label on Nov 1, 2024
aiwenmo (Contributor) commented Nov 1, 2024

Your idea is very good.
Our IDE is currently being refactored.
Please implement it based on the latest IDE.
We are looking forward to your code.

MactavishCui (Contributor, Author) commented:
@aiwenmo I have submitted a draft PR: https://github.com/DataLinkDC/dinky/pull/3897. I see that both https://github.com/DataLinkDC/dinky/pull/3889 and https://github.com/DataLinkDC/dinky/pull/3854 are working on the data studio or SQL-execution refactoring, and both carry the 1.2.0 milestone tag. Will all of the refactoring be included in version 1.2.0? Can I submit my PR against the refactored code after Dinky 1.2.0 is published? I will refactor my code for this feature and switch the PR status to 'ready for review' once all the IDE refactoring you mentioned is ready. Looking forward to your reply.

Zzm0809 (Contributor) commented Nov 3, 2024

@MactavishCui Thank you for your contribution. As you can see, the task submission process and the overall DataStudio module on the front end are being refactored. This refactoring will be released in 1.2.0, so your PR needs to wait for the above two PRs to be merged and then make synchronous changes. I suggest you follow the progress of those two PRs; you will also be notified when they are merged.

MactavishCui (Contributor, Author) commented:

@Zzm0809 Thanks for your reply! I will read the refactored code, redesign my implementation, make synchronous changes, and update the draft PR linked to this issue after the related PRs are merged. I will also keep working on other issues and hope to make more contributions to Dinky!

Zzm0809 (Contributor) commented Nov 4, 2024


ok, thanks!!

Pirate5946 commented Nov 5, 2024

Let me piggyback on this thread with a question: does Dinky v1.1.0 with Flink SQL 1.19 support one task that writes multiple INSERT INTO statements to different sinks? @Zzm0809 @aiwenmo

Zzm0809 (Contributor) commented Nov 5, 2024


Supported. Just write the multiple SQL statements directly; the statement set is already enabled by default.
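For illustration, a minimal sketch with made-up table names: several INSERT INTO statements targeting different sinks can be written in one Dinky task and are submitted together as a statement set.

```sql
-- One Dinky task, two INSERT statements into different sinks;
-- Dinky wraps them in a statement set by default.
INSERT INTO sink_orders SELECT id, amount FROM ods_orders;
INSERT INTO sink_users  SELECT id, name   FROM ods_users;
```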

Zzm0809 (Contributor) commented Nov 7, 2024

Hi @MactavishCui, the front-end code refactoring has come to an end for the time being. You can pull the latest code and develop the front-end part. The back-end part, #3889, is planned for release-1.3.0 and will have no impact on your existing code.

MactavishCui (Contributor, Author) commented:

@Zzm0809 Thanks for the notification! I will refactor the front-end part of my code and set the linked draft PR to 'ready for review' in the next few days!

MactavishCui (Contributor, Author) commented:
Hi @Zzm0809, I have also implemented this feature in the new data studio front end. My PR is ready for review now! Looking forward to your review and discussion!

Zzm0809 (Contributor) commented Nov 9, 2024


OK
