[Core][Flink][Spark] Add deletedFileTotalLenInBytes in result of OrphanFilesClean #4545
+1
@JingsongLi
    deleteFiles::add,
    p -> {
        try {
            deletedFilesSizeInBytes.addAndGet(fileIO.getFileSize(p));
getFileSize should always be called together with delete; otherwise, I think there may be execution-efficiency issues. getFileSize should be executed in parallel too.
@JingsongLi Thank you very much for your suggestions.
I would like to confirm the code improvement plan with you:
Do you mean that getFileStatus should execute only once in OrphanFilesClean#createFileCleaner, and that fileCleaner can directly return the size of the deleted file?
I just thought about it: maybe we don't need to invoke getFileSize at all. The files come from listStatus, which already contains the file size.
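The idea in this thread can be sketched as follows. This is a minimal illustration, not the code in this PR: `ListedFile`, `CleanResult`, and `clean` are hypothetical names, and the deleter is a plain `Consumer<String>` standing in for the real file-system call. It assumes each listed entry already carries its length, so the total is accumulated during the (parallel) delete without a separate size lookup per file.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

/** Illustrative sketch only; none of these types are Paimon classes. */
class OrphanCleanSketch {

    /** Hypothetical listing entry: a path plus the length reported by the listing. */
    static final class ListedFile {
        final String path;
        final long lengthInBytes;

        ListedFile(String path, long lengthInBytes) {
            this.path = path;
            this.lengthInBytes = lengthInBytes;
        }
    }

    /** Hypothetical result holder: number of deleted files and their total size. */
    static final class CleanResult {
        final long deletedFileCount;
        final long deletedFileTotalLenInBytes;

        CleanResult(long deletedFileCount, long deletedFileTotalLenInBytes) {
            this.deletedFileCount = deletedFileCount;
            this.deletedFileTotalLenInBytes = deletedFileTotalLenInBytes;
        }
    }

    static CleanResult clean(List<ListedFile> orphans, Consumer<String> deleter) {
        AtomicLong count = new AtomicLong();
        AtomicLong bytes = new AtomicLong();
        // Deletes run in parallel; atomic counters keep the accumulation thread-safe.
        orphans.parallelStream().forEach(file -> {
            deleter.accept(file.path);
            count.incrementAndGet();
            // The length comes from the listing, so no extra size lookup is issued here.
            bytes.addAndGet(file.lengthInBytes);
        });
        return new CleanResult(count.get(), bytes.get());
    }

    public static void main(String[] args) {
        List<ListedFile> orphans = Arrays.asList(
                new ListedFile("bucket-0/data-1.orc", 1024L),
                new ListedFile("bucket-0/data-2.orc", 2048L));
        // A real deleter would remove the file; this one does nothing.
        CleanResult result = clean(orphans, path -> { });
        System.out.println(result.deletedFileCount + " files, "
                + result.deletedFileTotalLenInBytes + " bytes");
    }
}
```

In this sketch the size is counted even if the deleter silently fails; a real implementation would presumably only add the length after a successful delete.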
OK, good idea.
I will give it a try.
Thanks.
@JingsongLi
Hi, please take a look. Thanks.
Thanks @wwj6591812, looks good to me!
Purpose
At my company, after we run a RemoveOrphanFilesAction, we want to know not only the number of orphan files that were removed, but also how much storage capacity was freed.
This PR adds deletedFileTotalLenInBytes to the result of OrphanFilesClean.
Linked issue: close #xxx
Tests
API and Format
Documentation