Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(core-clp): Add
BoundedReader
to prevent out-of-bound reads in segmented input streams. #624feat(core-clp): Add
BoundedReader
to prevent out-of-bound reads in segmented input streams. #624Changes from 9 commits
7dad86e
6ac50eb
ed35cb5
34ad765
47b0cc0
7aff0c6
2ab88fd
8977dc4
93c7d3b
6c3a537
6321c0c
e54d434
cbe2073
4dcb7ff
d261ca2
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add
[[nodiscard]]
for any non-void returnsThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we add a doc string to explain why we override the default implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reason is really just that BoundedReader can't delegate to a potentially more efficient implementation in the underlying reader (since it won't respect the bounds), and most code really shouldn't use this interface anyway since its a performance trap.
Can add a docstring saying as much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, let's add a doc string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since parameters are not used, shall we add
[[maybe_unused]]
to silence clang-tidy warnings?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this only for supporting a new test case?
What you intend to do makes sense to me, but It's bit annoying that the behavior of standard seek interface allows seeking beyond the file ending position, so I feel we need some justification when we decide to change the behavior.
@kirkrodrigues any comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would classify this as a bug I'm fixing in the StringReader class so that I can use it for tests. Every other reader we have will EOF if you seek past the end from what I've seen.
Actually, the way the rest of this class is written if you first seek past the end of the input buffer it will happily let you read data beyond the end of the buffer. I.e. it is very explicitly a bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, The FileReader internally calls fseeko, which I believe would allow seeking beyond the end of file.
BufferedFileReader and BufferReader would return ErrorCode_Truncated, and won't update the pos at all if the pos is greater than the maximum length,
From the consistency point of view, maybe we should let it return ErrorCode_Truncated. But also wonder if the rest of your code is dependent on the current behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FileReader
) will rely on the lower implementation to return an error if the seek fails.BufferReader
) will return an error if the seek is past the end, but they won't modifym_pos
.NetworkReader
) will return an error if the seek is past the end, and they will modifym_pos
.I think the practical implementation is for
try_seek_from_begin
to:pos
m_pos
should be updated to just past the last byte.ErrorCode_EndOfFile
.The reason to update
m_pos
even though we get topos
is because for some implementations like FileReader, we can't easily check what the last byte is until we seek, and if we seek up to the end of the file, we may not be able to seek backwards to the originalm_pos
.If y'all agree, we should open a GH issue to refactor the existing implementations. And for this implementation, we implement the proposal above (basically what Devin's implemented---although on error, we're still updating
m_cur_pos
, even tough the error may not be EOF?).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, the standard behavior sounds good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense to me.
Also went and updated BoundedReader to only update
m_curr_pos
on error if that error is EOF.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use
constexpr std::string_view
for const stringsThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm using constexpr char[] because StringReader takes std::string const& and I don't want to manually initialize an std::string every time I open a StringReader.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
constexpr char[]
andconstexpr std::string_view
won't make a difference: check hereThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I know, I've read the coding guidelines, and I understand that both ways of doing it are functionally the same. What I'm saying is that I don't like the explicit std::string{} initialization that you have to do when passing string_view in this specific case since it wastes horizontal space and reads worse.
I'll make the change to avoid wasting time, but I think it makes the code less readable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about using
std::array
to: