feat(core-clp): Add `BoundedReader` to prevent out-of-bound reads in segmented input streams. #624

gibber9809 · 2024-12-05T19:39:23Z

Description

This PR adds a BoundedReader class that can help avoid backwards seeks when reading input streams segmented into several logical chunks. The BoundedReader is a ReaderInterface that prevents reading or seeking beyond a certain "bound" byte offset.

This is used in a follow-up PR to ensure that readers for different parts of a single-file archive being streamed over the network do not accidentally read past the end of their section (this happens frequently with readers that buffer input beyond what the user has requested, such as ZstdDecompressor).

For example, consider an input stream divided into the following logical chunks:
| zstd stream 1 | header bytes | zstd stream 2 |

If a ZstdDecompressor reading zstd stream 1 directly wraps that input stream, it will almost certainly end up consuming the header bytes and some parts of zstd stream 2 while populating its internal buffer. In fact, this sort of speculative buffering is required if we want ZstdDecompressor to be performant. As a result, reading the header bytes and decompressing zstd stream 2 requires first seeking backwards in the original input stream.

BoundedReader addresses this problem by wrapping the input stream so that the ZstdDecompressor is unable to consume any bytes beyond the end of the logical section it belongs to, meaning that the following header and stream sections can always be read without backwards seeks.
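To make the idea concrete, here is a minimal sketch of a bounded wrapper over an in-memory stream. It is not the code in this PR: `SketchError`, `SketchReader`, `SketchStringReader`, and `SketchBoundedReader` are hypothetical, simplified stand-ins for CLP's `ErrorCode`, `ReaderInterface`, `StringReader`, and `BoundedReader`, and only the read path is shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical stand-ins for CLP's ErrorCode and ReaderInterface (simplified).
enum class SketchError { Success, EndOfFile };

class SketchReader {
public:
    virtual ~SketchReader() = default;
    virtual auto try_read(char* buf, std::size_t num_bytes_to_read, std::size_t& num_bytes_read)
            -> SketchError = 0;
    virtual auto get_pos() const -> std::size_t = 0;
};

// In-memory reader playing the role of the segmented input stream.
class SketchStringReader : public SketchReader {
public:
    explicit SketchStringReader(std::string data) : m_data{std::move(data)} {}

    auto try_read(char* buf, std::size_t num_bytes_to_read, std::size_t& num_bytes_read)
            -> SketchError override {
        num_bytes_read = std::min(num_bytes_to_read, m_data.size() - m_pos);
        if (0 == num_bytes_read && num_bytes_to_read > 0) {
            return SketchError::EndOfFile;
        }
        std::memcpy(buf, m_data.data() + m_pos, num_bytes_read);
        m_pos += num_bytes_read;
        return SketchError::Success;
    }

    auto get_pos() const -> std::size_t override { return m_pos; }

private:
    std::string m_data;
    std::size_t m_pos{0};
};

// Wraps another reader and never requests bytes at or beyond `bound`.
class SketchBoundedReader : public SketchReader {
public:
    SketchBoundedReader(SketchReader* reader, std::size_t bound)
            : m_reader{reader},
              m_bound{bound} {
        assert(nullptr != reader);
        m_curr_pos = m_reader->get_pos();
    }

    auto try_read(char* buf, std::size_t num_bytes_to_read, std::size_t& num_bytes_read)
            -> SketchError override {
        num_bytes_read = 0;
        if (m_curr_pos >= m_bound && num_bytes_to_read > 0) {
            return SketchError::EndOfFile;
        }
        // Clamp the request so the underlying reader never advances past the bound.
        auto const clamped = std::min(num_bytes_to_read, m_bound - m_curr_pos);
        auto const rc = m_reader->try_read(buf, clamped, num_bytes_read);
        m_curr_pos += num_bytes_read;
        return rc;
    }

    auto get_pos() const -> std::size_t override { return m_curr_pos; }

private:
    SketchReader* m_reader;
    std::size_t m_bound;
    std::size_t m_curr_pos{0};
};
```

With a stream laid out as four bytes of chunk 1 followed by header bytes, a greedy 8-byte read through a `SketchBoundedReader` with a bound of 4 returns only the first four bytes, and the underlying stream is left positioned exactly at the start of the header, so no backwards seek is needed.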

This BoundedReader approach has some advantages over approaches that allow backwards seeking by buffering the input stream:

- It uses less memory and requires less data copying.
- The implementation is simple and easy to verify.
- The BoundedReader approach helps prevent a whole class of bugs involving faulty readers reading past the end of their logical section, and can help catch issues with corrupt archives.

Validation performed

Added tests for seeking and reading edge cases

Summary by CodeRabbit

New Features
- Introduced the BoundedReader class to manage reading limits within input streams.
- Added unit tests for BoundedReader functionalities, ensuring robust error handling and boundary checks.
Bug Fixes
- Enhanced error handling in the StringReader class to prevent out-of-bounds seeking.

coderabbitai · 2024-12-05T19:39:31Z

Walkthrough

The pull request introduces changes to the CLP project by adding a new class, BoundedReader, along with its corresponding header and test files. The BoundedReader class implements methods for reading data with boundary checks and error handling. Additionally, modifications are made to the StringReader class to enhance error handling in the try_seek_from_begin method. The CMakeLists.txt file is updated to include the new source and test files, while existing configurations and functionalities remain unchanged.

Changes

| File Path | Change Summary |
| --- | --- |
| components/core/CMakeLists.txt | Added new source file `BoundedReader.cpp`, header `BoundedReader.hpp`, and test file `test-BoundedReader.cpp`. |
| components/core/src/clp/StringReader.cpp | Updated `try_seek_from_begin` method to include a condition for out-of-bounds position handling. |
| components/core/src/clp/BoundedReader.cpp | Introduced `BoundedReader` class with methods `try_seek_from_begin` and `try_read`, implementing boundary checks. |
| components/core/src/clp/BoundedReader.hpp | Added `BoundedReader` class definition, constructor, and method overrides for `ReaderInterface`. |
| components/core/tests/test-BoundedReader.cpp | Created unit tests for `BoundedReader` using Catch2 framework, covering various functionalities. |

Possibly related PRs

test(clp-s): Add end-to-end test case for compression and extraction. #595: This PR adds end-to-end tests for the clp_s module, which includes functionality related to reading and writing data, similar to the new BoundedReader class introduced in the main PR.
refactor(core): Fix clang-tidy warnings in the streaming compressor interface and its implementations; Modernize and refactor test-StreamingCompression for conciseness. #599: This PR refactors the streaming compression tests and modifies the Compressor class, which is relevant to the BoundedReader class's methods for reading and seeking, as both involve handling data streams and error management.

Suggested reviewers

kirkrodrigues

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4dcb7ff and d261ca2.

📒 Files selected for processing (1)

components/core/tests/test-BoundedReader.cpp (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

components/core/tests/test-BoundedReader.cpp

coderabbitai

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (2)

components/core/src/clp/CheckpointReader.cpp (1)
10-12: Clarify condition by using equality check

Since m_cur_pos is set to be at most m_checkpoint in line 5, the condition m_cur_pos >= m_checkpoint will only be true when m_cur_pos == m_checkpoint. For clarity, consider changing the condition to:
if (m_cur_pos == m_checkpoint) {
    return ErrorCode_EndOfFile;
}
components/core/tests/test-CheckpointReader.cpp (1)

9-94: Add tests for null m_reader scenarios

Currently, there are no tests covering the case where CheckpointReader is constructed with a nullptr for m_reader. Adding such tests would enhance coverage and ensure the class handles this scenario gracefully.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between ec0821d and 34ad765.

📒 Files selected for processing (5)

components/core/CMakeLists.txt (2 hunks)
components/core/src/clp/CheckpointReader.cpp (1 hunks)
components/core/src/clp/CheckpointReader.hpp (1 hunks)
components/core/src/clp/StringReader.cpp (1 hunks)
components/core/tests/test-CheckpointReader.cpp (1 hunks)

🔇 Additional comments (4)

components/core/src/clp/CheckpointReader.hpp (1)

16-74: Overall class implementation is correct

The CheckpointReader class correctly implements the necessary methods from ReaderInterface and enforces the checkpoint limit as intended.

components/core/src/clp/StringReader.cpp (1)

44-47: Enhanced error handling for seeking beyond the end

The added condition in try_seek_from_begin properly handles attempts to seek beyond the end of the input string. It sets the position to the end of the string and returns ErrorCode_EndOfFile, preventing undefined behaviour.

components/core/tests/test-CheckpointReader.cpp (1)

9-94: Unit tests comprehensively cover CheckpointReader functionality

The test cases thoroughly validate the behaviour of CheckpointReader, including reading and seeking operations relative to the checkpoint and the end of the underlying stream.

components/core/CMakeLists.txt (1)

356-357: New files correctly added to the build configuration

The source files CheckpointReader.cpp, CheckpointReader.hpp, and the test file test-CheckpointReader.cpp have been appropriately included in the CMakeLists.txt, ensuring they are part of the build and test processes.

Also applies to: 554-554

components/core/src/clp/CheckpointReader.hpp

components/core/src/clp/CheckpointReader.cpp

haiqi96

Talked with Devin offline, and there is a concrete example that explains the use case of this class. Devin, please add it to the PR when you get a chance.

In general, the PR makes sense to me. While there might be a more elegant design to achieve the same target, I feel we can go with the design in this PR given that:
1. The class design is simple and straightforward.
2. We have a rather tight deadline, including other upcoming changes.

Left a few comments. We can also rename the class to BoundedReader, unless @kirkrodrigues has other naming suggestions.

Note that this class has a different purpose from BufferedFileReader. BufferedFileReader is designed for users who know a specific pos they will need to seek back to, whereas this class is designed for users who know a specific pos that they don't want to seek beyond.

components/core/src/clp/CheckpointReader.hpp

haiqi96 · 2024-12-06T16:26:28Z

components/core/src/clp/StringReader.cpp

@@ -41,6 +41,10 @@ ErrorCode StringReader::try_read(char* buf, size_t num_bytes_to_read, size_t& nu
 }

 ErrorCode StringReader::try_seek_from_begin(size_t pos) {
+    if (pos > input_string.size()) {


Is this only for supporting a new test case?

What you intend to do makes sense to me, but it's a bit annoying that the standard seek interface allows seeking beyond the end of the file, so I feel we need some justification when we decide to change the behavior.

@kirkrodrigues any comment?

I would classify this as a bug I'm fixing in the StringReader class so that I can use it for tests. Every other reader we have will EOF if you seek past the end from what I've seen.

Actually, the way the rest of this class is written, if you first seek past the end of the input buffer, it will happily let you read data beyond the end of the buffer. I.e., it is very explicitly a bug.

Hmm, the FileReader internally calls fseeko, which I believe would allow seeking beyond the end of the file.

BufferedFileReader and BufferReader would return ErrorCode_Truncated, and won't update the pos at all if the pos is greater than the maximum length.

From a consistency point of view, maybe we should let it return ErrorCode_Truncated. But I also wonder if the rest of your code is dependent on the current behavior.

The implementations of ReaderInterface have inconsistent behaviour.

Some (e.g., FileReader) will rely on the lower implementation to return an error if the seek fails.

Some (e.g. BufferReader) will return an error if the seek is past the end, but they won't modify m_pos.

Some (e.g., NetworkReader) will return an error if the seek is past the end, and they will modify m_pos.

Each of the above returns different error codes.

I think the practical implementation is for try_seek_from_begin to:

- try to seek until pos
- if that's past the end of the medium (file/buffer/etc.), m_pos should be updated to just past the last byte.
- the method should return ErrorCode_EndOfFile.

The reason to update m_pos even though we don't get to pos is that for some implementations like FileReader, we can't easily check where the last byte is until we seek, and if we seek up to the end of the file, we may not be able to seek backwards to the original m_pos.
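For reference, the proposed semantics can be sketched on a simple in-memory reader as below. This is only an illustration under assumptions: `SketchSeekResult`, `SketchBufferReader`, `sketch_try_seek_from_begin`, and `sketch_bytes_remaining` are hypothetical names, not CLP APIs, and a real `FileReader` would have to discover the end of the medium by actually seeking.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type standing in for ErrorCode_Success / ErrorCode_EndOfFile.
enum class SketchSeekResult { Success, EndOfFile };

// Hypothetical in-memory reader state.
struct SketchBufferReader {
    std::string data;
    std::size_t pos;
};

// Seeks to `pos` if possible; otherwise clamps the position to just past the
// last byte and reports end-of-file.
auto sketch_try_seek_from_begin(SketchBufferReader& reader, std::size_t pos) -> SketchSeekResult {
    if (pos > reader.data.size()) {
        reader.pos = reader.data.size();
        return SketchSeekResult::EndOfFile;
    }
    reader.pos = pos;
    return SketchSeekResult::Success;
}

// Because the position is always clamped, a later read can rely on this invariant
// instead of reading beyond the end of the buffer.
auto sketch_bytes_remaining(SketchBufferReader const& reader) -> std::size_t {
    assert(reader.pos <= reader.data.size());
    return reader.data.size() - reader.pos;
}
```

Seeking exactly to the end of the data is treated as a success here; only positions strictly past the end yield the end-of-file result.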

If y'all agree, we should open a GH issue to refactor the existing implementations. And for this implementation, we implement the proposal above (basically what Devin's implemented, although on error, we're still updating m_cur_pos, even though the error may not be EOF?).

Sure, the standard behavior sounds good to me.

That makes sense to me.

Also went and updated BoundedReader to only update m_curr_pos on error if that error is EOF.

components/core/tests/test-CheckpointReader.cpp

components/core/src/clp/CheckpointReader.cpp

Co-authored-by: haiqi96 <[email protected]>

kirkrodrigues · 2024-12-07T00:04:23Z

I'm fine with BoundedReader.

coderabbitai

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (4)

components/core/src/clp/BoundedReader.hpp (2)
22-28: Document Exceptions Thrown in Constructor

In the constructor, exceptions are thrown if m_reader is nullptr or if m_curr_pos exceeds m_bound. It's advisable to document these exceptions clearly in the class interface to inform users of potential exceptions during object instantiation.

38-38: Modify try_get_pos to Reflect Bounded Position

Currently, try_get_pos delegates directly to m_reader->try_get_pos(pos). Consider modifying it to return m_curr_pos instead, ensuring that users receive the position within the bounded context, adhering to the boundary constraints enforced by BoundedReader.

Apply this diff to adjust the method:
-auto try_get_pos(size_t& pos) -> ErrorCode override { return m_reader->try_get_pos(pos); }
+auto try_get_pos(size_t& pos) -> ErrorCode override {
+    pos = m_curr_pos;
+    return ErrorCode_Success;
+}
components/core/tests/test-BoundedReader.cpp (2)
22-33: Clarify Test Section Name for Accuracy

The test section named "BoundedReader does not allow reads beyond end of underlying stream." may be misleading. The test actually verifies that reads are limited to the available data without causing errors when attempting to read beyond the stream's end. Consider renaming the section for clarity.

Apply this change:
-SECTION("BoundedReader does not allow reads beyond end of underlying stream.") {
+SECTION("BoundedReader limits reads to available data when reading beyond stream end.") {
76-86: Remove Unused Variables in Test Section

In the test section "BoundedReader does not allow seeks beyond checkpoint.", the variables buf and num_bytes_read are declared but not used. Removing these unused variables will clean up the test code.

Apply this diff:
-    char buf[cTestStringLen];
-    size_t num_bytes_read{};

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 2ab88fd and 8977dc4.

📒 Files selected for processing (4)

components/core/CMakeLists.txt (2 hunks)
components/core/src/clp/BoundedReader.cpp (1 hunks)
components/core/src/clp/BoundedReader.hpp (1 hunks)
components/core/tests/test-BoundedReader.cpp (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

components/core/CMakeLists.txt

🧰 Additional context used

📓 Path-based instructions (3)

components/core/tests/test-BoundedReader.cpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

components/core/src/clp/BoundedReader.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

components/core/src/clp/BoundedReader.cpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

🔇 Additional comments (2)

components/core/src/clp/BoundedReader.cpp (2)

4-15: Verify Error Handling in try_seek_from_begin Method

The method try_seek_from_begin adjusts next_pos based on m_bound and handles errors from the underlying reader. However, when ErrorCode_EndOfFile is returned from m_reader->try_seek_from_begin, m_curr_pos is updated to next_pos. Please verify that this behaviour correctly reflects the end-of-file condition and does not lead to inconsistencies in m_curr_pos.

17-38: Ensure Consistent Handling of Partial Reads in try_read

The try_read method correctly limits num_bytes_to_read to prevent reading beyond m_bound. After the read operation, it handles end-of-file scenarios, especially when ErrorCode_EndOfFile is returned with num_bytes_read equal to zero. Please confirm that this logic aligns with the expected behaviour of the underlying reader, particularly in cases of partial reads.

components/core/src/clp/BoundedReader.cpp

haiqi96 · 2024-12-07T23:54:19Z

Left another two nit comments. Otherwise, the code looks good to me.

Co-authored-by: haiqi96 <[email protected]>

haiqi96

Looks good to me. @kirkrodrigues do you want to do another round of review?

haiqi96 · 2024-12-09T15:14:27Z

@gibber9809 Can you please also open an issue to track the proposed change for commonizing seek interface?

gibber9809 · 2024-12-09T15:42:48Z

> @gibber9809 Can you please also open an issue to track the proposed change for commonizing seek interface?

Added issue #628.

LinZhihao-723

Reviewed the header + implementation files, didn't go through unit tests yet

components/core/src/clp/BoundedReader.hpp

LinZhihao-723 · 2024-12-09T23:22:59Z

components/core/src/clp/BoundedReader.hpp

+    auto
+    try_read(char* buf, size_t num_bytes_to_read, size_t& num_bytes_read) -> ErrorCode override;
+
+    auto try_read_to_delimiter(char delim, bool keep_delimiter, bool append, std::string& str)


Shall we add a doc string to explain why we override the default implementation?

The reason is really just that BoundedReader can't delegate to a potentially more efficient implementation in the underlying reader (since it won't respect the bounds), and most code really shouldn't use this interface anyway since it's a performance trap.

Can add a docstring saying as much.

Sure, let's add a doc string

LinZhihao-723 · 2024-12-09T23:23:33Z

components/core/src/clp/BoundedReader.hpp

+    auto
+    try_read(char* buf, size_t num_bytes_to_read, size_t& num_bytes_read) -> ErrorCode override;
+
+    auto try_read_to_delimiter(char delim, bool keep_delimiter, bool append, std::string& str)


Since parameters are not used, shall we add [[maybe_unused]] to silence clang-tidy warnings?

LinZhihao-723 · 2024-12-09T23:26:17Z

components/core/src/clp/BoundedReader.hpp

+     * @return ErrorCode_Success on success
+     * @return ErrorCode_errno on failure
+     */
+    auto try_get_pos(size_t& pos) -> ErrorCode override { return m_reader->try_get_pos(pos); }


We should add [[nodiscard]] for any non-void returns

components/core/src/clp/BoundedReader.hpp

components/core/src/clp/BoundedReader.cpp

Co-authored-by: Lin Zhihao <[email protected]>

coderabbitai

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (3)

components/core/src/clp/BoundedReader.hpp (3)
12-13: Correct grammatical error in the class description.

In the comment explaining the purpose of BoundedReader, there's a missing preposition.

Please modify the comment as follows to improve readability:
- * want to prevent a reader for an earlier segment consuming any bytes from a later segment.
+ * want to prevent a reader for an earlier segment from consuming any bytes from a later segment.
58-59: Consistent formatting for function declaration.

For better readability, place the return type and function name on the same line.

Modify the declaration as follows:
-auto
-try_read(char* buf, size_t num_bytes_to_read, size_t& num_bytes_read) -> ErrorCode override;
+auto try_read(char* buf, size_t num_bytes_to_read, size_t& num_bytes_read) -> ErrorCode override;
67-69: Consider using smart pointers for member variables to manage ownership and lifetime.

Using a raw pointer for m_reader does not enforce lifetime management, potentially leading to dangling references if the underlying reader is destroyed before BoundedReader.

Consider using std::shared_ptr or std::unique_ptr to manage the ownership:
-    ReaderInterface* m_reader{nullptr};
+    std::shared_ptr<ReaderInterface> m_reader;
This change would ensure that the underlying reader remains valid for the lifetime of BoundedReader.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 6c3a537 and 6321c0c.

📒 Files selected for processing (1)

components/core/src/clp/BoundedReader.hpp (1 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

components/core/src/clp/BoundedReader.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

🔇 Additional comments (5)

components/core/src/clp/BoundedReader.hpp (5)

4-4: Include missing headers for size_t and ErrorCode.

The file is missing includes for cstddef (for size_t) and the header where ErrorCode is defined. This could lead to compilation issues.

26-27: Verify the boundary condition in the constructor.

Currently, the check uses m_curr_pos > m_bound. Should the condition be m_curr_pos >= m_bound to prevent the position from being equal to the bound?

Please confirm whether the equality case should be considered an error, ensuring that the reader does not start at the exact boundary position.

38-38: Add [[nodiscard]] attribute to non-void return functions.

To prevent unintended ignoring of return codes, consider adding [[nodiscard]] to the try_get_pos method.

47-47: Add [[nodiscard]] attribute to try_seek_from_begin.

Since the method returns an ErrorCode, adding [[nodiscard]] encourages checking the return value.

61-64: Add [[maybe_unused]] to unused parameters to avoid warnings.

The parameters in try_read_to_delimiter are unused, which may trigger compiler warnings.

Consider updating the method signature:

-auto try_read_to_delimiter(char delim, bool keep_delimiter, bool append, std::string& str)
+auto try_read_to_delimiter([[maybe_unused]] char delim, [[maybe_unused]] bool keep_delimiter, [[maybe_unused]] bool append, [[maybe_unused]] std::string& str)

Alternatively, you can omit parameter names if they are unused:

-auto try_read_to_delimiter(char delim, bool keep_delimiter, bool append, std::string& str)
+auto try_read_to_delimiter(char, bool, bool, std::string&) -> ErrorCode override {

components/core/src/clp/BoundedReader.hpp

LinZhihao-723

The implementation looks good to me. Sorry that there are still a few suggestions for fixing clang-tidy warnings in the test file. Your IDE/clang-tidy run might also complain about REQUIRE macro, but we can ignore that for now since we're planning to fix this by upgrading catch2 to a higher version.

LinZhihao-723 · 2024-12-12T22:55:47Z

components/core/tests/test-BoundedReader.cpp

+#include "../src/clp/StringReader.hpp"
+
+TEST_CASE("Test Bounded Reader", "[BoundedReader]") {
+    constexpr char cTestString[]{"0123456789"};


We can use constexpr std::string_view for const strings

I'm using constexpr char[] because StringReader takes std::string const& and I don't want to manually initialize an std::string every time I open a StringReader.

This was to follow our guideline here: https://www.notion.so/yscope/WIP-Coding-Guidelines-9a308b847a5343958ba3cb97a850be66?pvs=4#13604e4d9e6b80b09a18d4f71a89f1c8

From compiler's perspective using constexpr char[] and constexpr std::string_view won't make a difference: check here

Yes, I know, I've read the coding guidelines, and I understand that both ways of doing it are functionally the same. What I'm saying is that I don't like the explicit std::string{} initialization that you have to do when passing string_view in this specific case since it wastes horizontal space and reads worse.

I'll make the change to avoid wasting time, but I think it makes the code less readable.
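The readability trade-off being discussed can be seen in a small sketch. `sketch_open` is a hypothetical function that only mirrors the `std::string const&` parameter type mentioned above; it is not `StringReader::open`, and the constant names are likewise made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

constexpr char cSketchCharArray[]{"0123456789"};
constexpr std::string_view cSketchStringView{"0123456789"};

// Hypothetical function taking the same parameter type as described in the thread.
auto sketch_open(std::string const& input) -> std::size_t {
    return input.size();
}

auto sketch_open_both() -> std::size_t {
    // A char array converts to std::string implicitly at the call site.
    auto const from_array = sketch_open(cSketchCharArray);
    // A std::string_view does not, so the std::string must be constructed explicitly.
    auto const from_view = sketch_open(std::string{cSketchStringView});
    assert(from_array == from_view);
    return from_array;
}
```

Both calls construct a temporary `std::string`; the difference is only whether the conversion is spelled out at the call site.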

components/core/tests/test-BoundedReader.cpp

LinZhihao-723 · 2024-12-12T22:59:20Z

components/core/tests/test-BoundedReader.cpp

+        clp::StringReader string_reader;
+        string_reader.open(cTestString);
+        clp::BoundedReader bounded_reader{&string_reader, cTestStringLen + 1};
+        char buf[cTestStringLen + 1];


How about using std::array to:

- Silence clang-tidy warnings
- Enforce initialization on the allocated array memory
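As a rough illustration of the suggestion, the sketch below swaps a raw `char buf[N]` for a value-initialized `std::array`. The names `cSketchTestStringLen` and `sketch_fill_buffer` are hypothetical and are not taken from the test file.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr std::size_t cSketchTestStringLen{10};

// Hypothetical helper: copies `src_len` bytes into a zero-initialized buffer that
// has room for one extra byte, so the result is always null-terminated.
auto sketch_fill_buffer(char const* src, std::size_t src_len)
        -> std::array<char, cSketchTestStringLen + 1> {
    // `{}` value-initializes every element to '\0', unlike an uninitialized `char buf[N];`.
    std::array<char, cSketchTestStringLen + 1> buf{};
    assert(src_len < buf.size());
    std::memcpy(buf.data(), src, src_len);
    return buf;
}
```

`buf.data()` and `buf.size()` can then be passed wherever a `char*` and a length are expected.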

gibber9809 · 2024-12-13T15:34:39Z

> The implementation looks good to me. Sorry that there are still a few suggestions for fixing clang-tidy warnings in the test file. Your IDE/clang-tidy run might also complain about REQUIRE macro, but we can ignore that for now since we're planning to fix this by upgrading catch2 to a higher version.

Interesting. Yeah, I didn't get any clang-tidy warnings (besides some incorrect ones) running from the command line, likely because it was getting confused about the catch2 macro expansions. I'm guessing CLion quietly does a lot of extra configuration to deal with this sort of thing.

LinZhihao-723

For the PR title, how about:

feat(core-clp): Add `BoundedReader` to prevent out-of-bound reads in segmented input streams.

gibber9809 added 4 commits December 5, 2024 18:11

Add CheckpointReader that prevents reading beyond a certain point in an underlying stream

7dad86e

Fix bug in StringReader where StringReader permitted seeks beyond end of input

6ac50eb

Update code style of CheckpointReader to conform with standards

ed35cb5

Add tests for CheckpointReader

34ad765

gibber9809 requested a review from haiqi96 December 5, 2024 19:39

coderabbitai bot reviewed Dec 5, 2024

haiqi96 reviewed Dec 6, 2024

gibber9809 and others added 3 commits December 6, 2024 13:31

Apply suggestions from code review

47b0cc0

Co-authored-by: haiqi96 <[email protected]>

Address review comments

7aff0c6

Ensure CheckpointReader sets num_bytes_read to zero on EOF

2ab88fd

gibber9809 requested a review from haiqi96 December 6, 2024 19:03

Rename to BoundedReader, rename to m_curr_pos, only update curr_pos on error if EOF

8977dc4

gibber9809 changed the title ~~feat(clp): Add CheckpointReader class to help avoid backwards seeks when reading input streams segmented into logical chunks.~~ feat(clp): Add BoundedReader class to help avoid backwards seeks when reading input streams segmented into logical chunks. Dec 7, 2024

gibber9809 requested a review from kirkrodrigues December 7, 2024 15:16

coderabbitai bot reviewed Dec 7, 2024

haiqi96 reviewed Dec 7, 2024

haiqi96 reviewed Dec 7, 2024

Update components/core/src/clp/BoundedReader.cpp

93c7d3b

Co-authored-by: haiqi96 <[email protected]>

gibber9809 requested a review from haiqi96 December 8, 2024 16:05

haiqi96 previously approved these changes Dec 9, 2024

gibber9809 mentioned this pull request Dec 9, 2024

Refactor CLP ReaderInterface implementations to achieve consistent seek behaviour. #628

Open

LinZhihao-723 requested changes Dec 9, 2024

Apply suggestions from code review

6c3a537

Co-authored-by: Lin Zhihao <[email protected]>

gibber9809 dismissed haiqi96’s stale review via 6c3a537 December 10, 2024 16:29

Apply suggestions from code review

6321c0c

Co-authored-by: Lin Zhihao <[email protected]>

coderabbitai bot reviewed Dec 10, 2024

Address review comments

e54d434

gibber9809 requested a review from LinZhihao-723 December 10, 2024 18:32

Fix clang-tidy warnings

cbe2073

LinZhihao-723 requested changes Dec 12, 2024

Minor updates to test-BoundedReader.cpp

4dcb7ff

gibber9809 requested a review from LinZhihao-723 December 13, 2024 15:50

Use string_view in BoundedReader test

d261ca2

LinZhihao-723 approved these changes Dec 13, 2024

gibber9809 changed the title ~~feat(clp): Add BoundedReader class to help avoid backwards seeks when reading input streams segmented into logical chunks.~~ feat(core-clp): Add BoundedReader to prevent out-of-bound reads in segmented input streams. Dec 13, 2024

gibber9809 merged commit ddba9b9 into y-scope:main Dec 13, 2024
21 checks passed
