Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Core] Support branch batch/streaming read and write #2748

Merged
merged 21 commits into from
Feb 7, 2024

Conversation

TaoZex
Copy link
Contributor

@TaoZex TaoZex commented Jan 20, 2024

Purpose

Write: After data is written, the new snapshot is written to the corresponding branch directory
Read: Read the snapshot in the corresponding branch directory to get the data file position

Linked issue:
#2727
#2726

subtask of #1795

Tests

API and Format

Next

support merge branch

@xiangyuf
Copy link
Contributor

Hi @TaoZex, I browsed through your request and thx for ur work. In my humble opinion, this PR contains lots of changes and can be separated into several PRs in order to improve the efficiency of code review. For example, we can have one PR for changes in BranchManager, TagManager, SnapShotManager and SchemaManager to support some basic operations of Branch. After that, we can launch another PR for branch Streaming/Batch read or Streaming/Batch write. By doing this, we can make the code review work more easy here. And once the basic operations of Branch have been merged, we can implement the replace/merge operation of Branch. WDYT? Also, I'd like to hear more from @FangYongs .

@TaoZex
Copy link
Contributor Author

TaoZex commented Jan 30, 2024

Hi @TaoZex, I browsed through your request and thx for ur work. In my humble opinion, this PR contains lots of changes and can be separated into several PRs in order to improve the efficiency of code review. For example, we can have one PR for changes in BranchManager, TagManager, SnapShotManager and SchemaManager to support some basic operations of Branch. After that, we can launch another PR for branch Streaming/Batch read or Streaming/Batch write. By doing this, we can make the code review work more easy here. And once the basic operations of Branch have been merged, we can implement the replace/merge operation of Branch. WDYT? Also, I'd like to hear more from @FangYongs .

Thank you for your suggestion. We have discussed related issues in #2608. In order to be compatible with the original design and reduce changes, the form of changes is similar to the following figure:
image
The current approach is the result of the previous discussion, and there are not many changes shown in the figure above, so I still hope to merge the current pr. Thanks again for your advice.

@TaoZex TaoZex requested a review from FangYongs January 30, 2024 07:00
@TaoZex
Copy link
Contributor Author

TaoZex commented Jan 31, 2024

Timeout in ci, please rerun it. Thanks.

@schnappi17
Copy link
Contributor

+1 for this, thanks @TaoZex for contribution

@schnappi17 schnappi17 merged commit 9bd6020 into apache:master Feb 7, 2024
9 checks passed
@TaoZex
Copy link
Contributor Author

TaoZex commented Feb 7, 2024

Thanks for every reviewer. Happy New Year! @FangYongs @schnappi17 @xiangyuf

@xiangyuf
Copy link
Contributor

xiangyuf commented Feb 7, 2024

@TaoZex Great work, thx for contribution.

@TaoZex TaoZex deleted the branch_read_and_write branch February 14, 2024 15:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants