-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(core): Fix clang-tidy warnings in the streaming compressor interface and its implementations; Modernize and refactor test-StreamingCompression
for conciseness.
#599
Conversation
WalkthroughThe pull request introduces several changes across multiple files, primarily focusing on the Changes
Suggested reviewers
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
🧹 Outside diff range and nitpick comments (10)
components/core/src/clp/streaming_compression/passthrough/Compressor.cpp (1)
45-47
: Consider adding null check for file_writerWhile the FileWriter is passed by reference, it would be safer to validate that it's properly initialized before assignment.
auto Compressor::open(FileWriter& file_writer, [[maybe_unused]] int compression_level) -> void { + if (!file_writer.is_open()) { + throw OperationFailed(ErrorCode_BadParam, __FILENAME__, __LINE__); + } m_compressed_stream_file_writer = &file_writer; }components/core/src/clp/streaming_compression/Compressor.hpp (2)
4-9
: Consider using more portable standard headersThe inclusion of <sys/types.h> might affect cross-platform compatibility. Consider using for fixed-width integer types or for size_t and ptrdiff_t instead.
-#include <sys/types.h> +#include <cstdint>
52-63
: Enhance documentation for unsupported operationsWhile the code correctly marks unsupported operations, consider expanding the documentation to explain why these operations are unsupported in the context of streaming compression. This would help future maintainers understand the design decisions.
components/core/src/clp/streaming_compression/passthrough/Compressor.hpp (1)
50-76
: LGTM! Well-designed interface with forward compatibility.The interface is well-documented and properly prepared for future compression implementations:
- Consistent use of trailing return types
- Addition of compression_level parameter with [[maybe_unused]] attribute shows good forward planning for derived compressor implementations
Consider documenting the valid range and meaning of compression_level values in the open() method's documentation for future implementers.
components/core/src/clp/streaming_compression/zstd/Compressor.cpp (2)
89-93
: Enhance error message with compression details.Consider adding compression level and data size to the error message for better debugging context.
- SPDLOG_ERROR( - "streaming_compression::zstd::Compressor: ZSTD_compressStream() error: {}", - ZSTD_getErrorName(compress_result) + SPDLOG_ERROR( + "streaming_compression::zstd::Compressor: ZSTD_compressStream() error: {} (buffer size: {}, compressed: {})", + ZSTD_getErrorName(compress_result), + uncompressed_stream_block.size, + m_compressed_stream_block.pos
170-173
: Add documentation for the error checking helper.Consider adding a documentation comment explaining the purpose of this helper method and the specific error conditions it checks for.
+/** + * Checks if a ZSTD function result indicates an error condition. + * @param size_t_function_result The result from a ZSTD function call + * @return true if the result indicates an error, false otherwise + */ auto Compressor::zstd_is_error(size_t size_t_function_result) -> bool {components/core/tests/test-StreamingCompression.cpp (4)
61-65
: Optimise buffer initialisation for efficiencyInstead of reserving and pushing back each element individually, you can initialise
uncompressed_buffer
with the desired size and assign values directly. This approach improves performance and code clarity.Apply this diff to optimise the buffer initialisation:
-std::vector<char> uncompressed_buffer{}; -uncompressed_buffer.reserve(cUncompressedDataSize); -for (size_t i{0}; i < cUncompressedDataSize; ++i) { - uncompressed_buffer.push_back((char)('a' + (i % cUncompressedDataPatternPeriod))); -} +std::vector<char> uncompressed_buffer(cUncompressedDataSize); +for (size_t i = 0; i < cUncompressedDataSize; ++i) { + uncompressed_buffer[i] = static_cast<char>('a' + (i % cUncompressedDataPatternPeriod)); +}
86-97
: Optimise buffer clearing and prevent potential issuesClearing the entire
decompressed_buffer
withmemset
on each iteration is unnecessary and inefficient. Additionally, ensure that the buffer size matches thechunk_size
being processed to avoid out-of-bounds access.Apply this diff to optimise the buffer handling:
for (auto chunk_size : compression_chunk_sizes) { - memset(decompressed_buffer.data(), 0, cUncompressedDataSize); + decompressed_buffer.resize(chunk_size); + memset(decompressed_buffer.data(), 0, chunk_size); REQUIRE( (ErrorCode_Success == decompressor->get_decompressed_stream_region( uncompressed_bytes, decompressed_buffer.data(), chunk_size )) ); REQUIRE((memcmp(uncompressed_buffer.data(), decompressed_buffer.data(), chunk_size) == 0)); uncompressed_bytes += chunk_size; }
23-25
: Remove unusedusing
directivesThe
using
directives on lines 23 and 24 forCompressor
andDecompressor
may not be necessary if fully qualified names are used throughout the code. Removing unused directives cleans up the code.Consider removing these lines if they are not required.
87-87
: Simplify buffer clearingSince you're overwriting the
decompressed_buffer
inget_decompressed_stream_region
, clearing it withmemset
may not be needed. Removing this step can improve performance.Apply this diff to remove unnecessary
memset
:- memset(decompressed_buffer.data(), 0, chunk_size);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (9)
components/core/src/clp/TraceableException.hpp
(1 hunks)components/core/src/clp/streaming_compression/Compressor.hpp
(3 hunks)components/core/src/clp/streaming_compression/Constants.hpp
(0 hunks)components/core/src/clp/streaming_compression/passthrough/Compressor.cpp
(2 hunks)components/core/src/clp/streaming_compression/passthrough/Compressor.hpp
(2 hunks)components/core/src/clp/streaming_compression/zstd/Compressor.cpp
(8 hunks)components/core/src/clp/streaming_compression/zstd/Compressor.hpp
(4 hunks)components/core/src/clp/streaming_compression/zstd/Constants.hpp
(1 hunks)components/core/tests/test-StreamingCompression.cpp
(1 hunks)
💤 Files with no reviewable changes (1)
- components/core/src/clp/streaming_compression/Constants.hpp
✅ Files skipped from review due to trivial changes (2)
- components/core/src/clp/TraceableException.hpp
- components/core/src/clp/streaming_compression/zstd/Constants.hpp
🔇 Additional comments (15)
components/core/src/clp/streaming_compression/passthrough/Compressor.cpp (5)
Line range hint 9-24
: LGTM! Robust error handling and input validation
The write method implementation is solid with proper null checks and error handling. The early return for zero-length data is an efficient optimization.
25-32
: LGTM! Proper null check before operation
The flush implementation correctly validates the writer state before proceeding with the operation.
33-40
: LGTM! Consistent error handling pattern
The method follows a good pattern of returning error codes for operations that can fail gracefully, rather than throwing exceptions.
41-43
: LGTM! Simple and idempotent implementation
The close method is appropriately simple for a passthrough implementation, and can be safely called multiple times.
3-6
: Verify the necessity of TraceableException.hpp inclusion
The header TraceableException.hpp
is included but not directly used in the visible code. Let's verify if it's required by any of the included headers or error handling mechanisms.
✅ Verification successful
The inclusion of TraceableException.hpp is necessary and correctly used
The TraceableException.hpp inclusion is required as the Compressor class throws OperationFailed exceptions which inherit from TraceableException. This is evidenced by:
- The OperationFailed class definition in Compressor.hpp inherits from TraceableException
- The Compressor.cpp implementation throws OperationFailed exceptions in multiple places:
- When writing with an uninitialized stream
- When writing with invalid parameters
- When flushing an uninitialized stream
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if TraceableException is used by any of the included headers
# or if OperationFailed inherits from it
# Check if TraceableException is used in ErrorCode.hpp or Compressor.hpp
rg -l "TraceableException" --type cpp --type hpp
# Check inheritance hierarchy
ast-grep --pattern 'class OperationFailed : $$$TraceableException$$$'
Length of output: 148
Script:
#!/bin/bash
# Check if TraceableException is used in any source files
rg -l "TraceableException"
# Check content of ErrorCode.hpp to understand dependencies
rg -p "class.*Exception|TraceableException" "ErrorCode.hpp"
# Look for any class that inherits from TraceableException
ast-grep --pattern 'class $_ : public TraceableException'
# Check the content of Compressor.cpp for exception usage
cat "components/core/src/clp/streaming_compression/passthrough/Compressor.cpp"
Length of output: 10094
Script:
#!/bin/bash
# Check if OperationFailed is defined in ErrorCode.hpp and its relationship with TraceableException
rg -p "class OperationFailed" "components/core/src/clp/ErrorCode.hpp"
# Check the content of TraceableException.hpp
cat "components/core/src/clp/TraceableException.hpp"
# Look for OperationFailed usage in the codebase
rg -p "class.*OperationFailed" --type cpp --type hpp
Length of output: 1607
Script:
#!/bin/bash
# Check for OperationFailed class definition
rg -p "class OperationFailed"
# Check if OperationFailed is used in the error handling
rg -A 2 "throw OperationFailed" "components/core/src/clp/streaming_compression/passthrough/Compressor.cpp"
# Look for OperationFailed in header files
rg -p "class.*OperationFailed" "components/core/src/clp/streaming_compression/passthrough/Compressor.hpp"
Length of output: 15709
components/core/src/clp/streaming_compression/Compressor.hpp (3)
28-31
: Well-structured exception handling with modern C++ features
Good use of [[nodiscard]] attribute and trailing return type syntax for the what() method. This improves code clarity and helps prevent accidental return value ignoring.
39-44
: Excellent implementation of Rule of Five
The explicit declaration of copy/move semantics follows modern C++ best practices. This makes the class's memory management behaviour clear and prevents accidental object copying.
79-81
: Good use of encapsulation and modern initialization
The change to private access and use of brace initialization follows good C++ practices. This ensures proper encapsulation and prevents undefined member variable states.
components/core/src/clp/streaming_compression/passthrough/Compressor.hpp (4)
4-11
: LGTM! Well-organized includes following best practices.
The includes are properly organized with standard headers first, followed by project headers.
26-28
: LGTM! Good use of modern C++ features.
The what()
method properly uses trailing return type syntax and the [[nodiscard]] attribute, following modern C++ best practices.
32-42
: LGTM! Excellent RAII implementation.
The class properly implements the rule of five with:
- Explicit deletion of copy operations
- Defaulted move operations with correct noexcept specification
- Clear ownership semantics
80-80
: LGTM! Good use of in-class initialization.
The pointer member is properly initialized to nullptr using in-class initialization.
components/core/src/clp/streaming_compression/zstd/Compressor.cpp (3)
3-13
: LGTM! Well-organized include statements.
The includes are properly organized into standard library, third-party, and project-specific headers, following good C++ practices.
153-154
:
Fix incorrect error message reference.
The error message mentions ZSTD_compressStream2()
but the actual call is to ZSTD_flushStream()
.
- "streaming_compression::zstd::Compressor: ZSTD_compressStream2() error: {}",
+ "streaming_compression::zstd::Compressor: ZSTD_flushStream() error: {}",
Likely invalid or redundant comment.
Line range hint 17-31
: Verify member initialization order matches declaration order.
While the member initializer list is well-structured, ensure the initialization order matches the declaration order in the header file to prevent potential initialization issues.
components/core/src/clp/streaming_compression/zstd/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/zstd/Compressor.hpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/zstd/Compressor.hpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (3)
components/core/src/clp/streaming_compression/passthrough/Compressor.hpp (1)
81-81
: Consider using std::optional for the file writerWhile the nullptr initialization is correct, consider using std::optional<FileWriter&> for m_compressed_stream_file_writer to make the optional nature of the reference more explicit in the type system.
- FileWriter* m_compressed_stream_file_writer{nullptr}; + std::optional<std::reference_wrapper<FileWriter>> m_compressed_stream_file_writer{std::nullopt};components/core/tests/test-StreamingCompression.cpp (2)
27-37
: Add documentation for the chosen test data sizesThe constants are well-defined, but would benefit from comments explaining:
- Why 128MB was chosen as the uncompressed data size
- The rationale behind the chunk size distribution
Add documentation like this:
-constexpr size_t cUncompressedDataSize{128L * 1024 * 1024}; // 128MB +// 128MB - Large enough to test realistic streaming scenarios while keeping test duration reasonable +constexpr size_t cUncompressedDataSize{128L * 1024 * 1024}; -constexpr auto cCompressionChunkSizes = std::to_array<size_t>( +// Test various chunk sizes from 1% to 100% of total size to verify streaming behaviour +constexpr auto cCompressionChunkSizes = std::to_array<size_t>(
101-101
: Verify successful file cleanupThe cleanup should verify that the file was successfully removed.
Add verification:
- boost::filesystem::remove(compressed_file_path); + REQUIRE(boost::filesystem::remove(compressed_file_path)); + REQUIRE(!boost::filesystem::exists(compressed_file_path));
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (6)
components/core/src/clp/streaming_compression/Compressor.hpp
(1 hunks)components/core/src/clp/streaming_compression/passthrough/Compressor.cpp
(2 hunks)components/core/src/clp/streaming_compression/passthrough/Compressor.hpp
(2 hunks)components/core/src/clp/streaming_compression/zstd/Compressor.cpp
(8 hunks)components/core/src/clp/streaming_compression/zstd/Compressor.hpp
(2 hunks)components/core/tests/test-StreamingCompression.cpp
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- components/core/src/clp/streaming_compression/Compressor.hpp
- components/core/src/clp/streaming_compression/passthrough/Compressor.cpp
- components/core/src/clp/streaming_compression/zstd/Compressor.cpp
- components/core/src/clp/streaming_compression/zstd/Compressor.hpp
🔇 Additional comments (7)
components/core/src/clp/streaming_compression/passthrough/Compressor.hpp (4)
4-5
: LGTM! Well-organized includes
The addition of is appropriate for size_t usage, and the include order follows best practices.
23-28
: Excellent use of modern C++ features
The updates demonstrate good practices:
- Uniform initialization syntax
- [[nodiscard]] attribute for what()
- Clear and descriptive error message
50-77
: Good interface design with appropriate attributes
The interface methods demonstrate:
- Consistent use of trailing return types
- Appropriate use of [[nodiscard]] for error-returning methods
- Clear documentation
- Proper handling of unused parameters
32-42
: Well-implemented move semantics and initialization
The class correctly implements the Rule of Five with appropriate move semantics and explicit deletion of copy operations.
Let's verify if compression_level needs to be stored:
components/core/tests/test-StreamingCompression.cpp (3)
1-25
: LGTM! Well-organized header inclusions
The header organization and using declarations are clear and follow modern C++ practices.
46-47
: LGTM! Good use of smart pointers
The use of std::unique_ptr for compressor and decompressor instances ensures proper resource management.
61-68
: LGTM! Proper buffer initialization
The buffers are correctly sized using resize() and initialized appropriately.
components/core/src/clp/streaming_compression/passthrough/Compressor.hpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🧹 Outside diff range and nitpick comments (5)
components/core/src/clp/streaming_compression/passthrough/Compressor.cpp (2)
Line range hint
15-30
: LGTM! Consider adding Doxygen documentationThe implementation includes proper null checks and error handling. Consider adding Doxygen documentation to describe the error conditions and parameter requirements.
+/** + * Writes data to the underlying file writer. + * @param data Pointer to the data buffer to write + * @param data_length Length of the data to write + * @throws OperationFailed if writer is not initialized or parameters are invalid + */ auto Compressor::write(char const* data, size_t const data_length) -> void {
51-53
: Consider adding state validation in open()The current implementation might benefit from these improvements:
- Validate that we're not already open (m_compressed_stream_file_writer != nullptr)
- Add validation for the input FileWriter
- Consider calling close() first if already open
auto Compressor::open(FileWriter& file_writer) -> void { + if (m_compressed_stream_file_writer != nullptr) { + close(); + } m_compressed_stream_file_writer = &file_writer; }components/core/tests/test-StreamingCompression.cpp (3)
30-40
: Consider using more descriptive constant namesThe constant
cAlphabetLength
could be renamed to better reflect its purpose in generating test data. Consider usingcTestDataPatternLength
orcUncompressedDataPatternPeriod
to better describe its role in the pattern generation.
57-61
: Consider using std::fill_n with a modulo operationThe current buffer initialization could be more efficient using standard library algorithms.
Consider this more efficient approach:
-for (size_t i{0}; i < cBufferSize; ++i) { - uncompressed_buffer.at(i) = static_cast<char>(('a' + (i % cAlphabetLength))); -} +std::fill_n(uncompressed_buffer.begin(), cBufferSize, + [i = 0, pattern = std::string(cAlphabetLength, '\0')] + (void) mutable { + return 'a' + (i++ % cAlphabetLength); + });
101-101
: Add error handling for file cleanupFile removal should be verified to ensure proper cleanup.
Add error handling:
-boost::filesystem::remove(compressed_file_path); +REQUIRE(boost::filesystem::remove(compressed_file_path));
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (6)
components/core/src/clp/streaming_compression/Compressor.hpp
(1 hunks)components/core/src/clp/streaming_compression/passthrough/Compressor.cpp
(2 hunks)components/core/src/clp/streaming_compression/passthrough/Compressor.hpp
(2 hunks)components/core/src/clp/streaming_compression/zstd/Compressor.cpp
(8 hunks)components/core/src/clp/streaming_compression/zstd/Compressor.hpp
(3 hunks)components/core/tests/test-StreamingCompression.cpp
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- components/core/src/clp/streaming_compression/Compressor.hpp
- components/core/src/clp/streaming_compression/passthrough/Compressor.hpp
- components/core/src/clp/streaming_compression/zstd/Compressor.cpp
- components/core/src/clp/streaming_compression/zstd/Compressor.hpp
🔇 Additional comments (6)
components/core/src/clp/streaming_compression/passthrough/Compressor.cpp (4)
3-8
: LGTM! Well-organized header includes
The header organization follows good practices with standard library headers separated from project headers.
31-38
: LGTM! Clean implementation
The flush implementation properly validates the writer state and delegates to the underlying writer.
39-46
: LGTM! Proper error handling pattern
The implementation follows the try_ pattern correctly with appropriate error handling and output parameter usage.
47-50
: Verify resource management in close()
The method only nulls the pointer without performing any cleanup. Please verify if the underlying FileWriter
requires any cleanup or if its lifecycle is managed elsewhere.
✅ Verification successful
Resource management is correctly implemented
The FileWriter
is managed externally and only referenced through a raw pointer in the Compressor
class. The close()
method correctly nulls this pointer as the ownership and cleanup responsibility lies with the caller, which is evident from:
- The
open(FileWriter& file_writer)
method takes a reference and stores its address, indicating external ownership - The
FileWriter
itself has proper cleanup in its ownclose()
method as seen in the codebase - The passthrough compressor is just a wrapper that doesn't take ownership of the writer
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for FileWriter destructor or cleanup patterns
ast-grep --pattern 'class FileWriter {
$$$
~FileWriter() {
$$$
}
$$$
}'
Length of output: 85
Script:
#!/bin/bash
# Let's search for FileWriter class definition and its usage patterns
rg -A 10 "class.*FileWriter"
# Also search for any cleanup or close methods in FileWriter
rg -A 5 "void.*close\(\)"
# Search for smart pointer usage with FileWriter
rg "unique_ptr.*FileWriter|shared_ptr.*FileWriter"
Length of output: 61765
Script:
#!/bin/bash
# Let's search for the specific FileWriter implementation used in this context
rg -A 10 "m_compressed_stream_file_writer" components/core/src/clp/streaming_compression/passthrough/
# Also search for the FileWriter ownership pattern in the parent class
rg -A 10 "class.*Compressor.*\{" components/core/src/clp/streaming_compression/
Length of output: 9095
components/core/tests/test-StreamingCompression.cpp (2)
1-26
: LGTM! Well-organized includes and using declarations
The includes are properly organized following the standard convention (standard library, third-party, project headers), and the using declarations improve code readability without polluting the global namespace.
75-79
:
Add error handling for memory-mapped file operations
Memory-mapped file operations should be verified for success.
Add these checks:
clp::ReadOnlyMemoryMappedFile const memory_mapped_compressed_file{compressed_file_path};
+REQUIRE(memory_mapped_compressed_file.is_open());
auto const compressed_file_view{memory_mapped_compressed_file.get_view()};
+REQUIRE(compressed_file_view.data() != nullptr);
+REQUIRE(compressed_file_view.size() > 0);
decompressor->open(compressed_file_view.data(), compressed_file_view.size());
+REQUIRE(ErrorCode_Success == decompressor->get_last_error());
Likely invalid or redundant comment.
components/core/src/clp/streaming_compression/passthrough/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/zstd/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/zstd/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/zstd/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/zstd/Compressor.cpp
Outdated
Show resolved
Hide resolved
Co-authored-by: Lin Zhihao <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (6)
components/core/tests/test-StreamingCompression.cpp (4)
1-4
: Add missing headerThe code uses std::equal but the header is not included. While it might work due to transitive includes, it's better to explicitly include headers for the algorithms we use.
#include <algorithm> #include <array> #include <cstring> +#include <numeric> #include <memory>
30-30
: Consider making buffer size configurableA fixed 128MB buffer size might be too large for some systems or CI environments. Consider making this configurable or reducing the size for testing purposes.
- constexpr size_t cBufferSize{128L * 1024 * 1024}; // 128MB + constexpr size_t cBufferSize{16L * 1024 * 1024}; // 16MB for testing
68-68
: Consider specifying compression level for ZStdWhen using ZStd compression, it's beneficial to specify a compression level based on your performance vs compression ratio requirements.
- compressor->open(file_writer); + // Choose level based on requirements: 1 (fast) to 22 (best compression) + compressor->open(file_writer, 3); // Example: balanced level
101-101
: Add error handling for file cleanupThe file removal operation should be verified to ensure cleanup succeeds.
- boost::filesystem::remove(compressed_file_path); + REQUIRE(boost::filesystem::remove(compressed_file_path));components/core/src/clp/streaming_compression/zstd/Compressor.cpp (2)
37-48
: Consider moving initializations to member declarationsFollowing modern C++ guidelines, consider moving these initializations to the member declarations in the header file. This approach:
- Reduces constructor complexity
- Ensures members are always properly initialized
- Follows the "initialize-at-declaration" principle
For example:
class Compressor { private: FileWriter* m_compressed_stream_file_writer{nullptr}; ZSTD_CStream* m_compression_stream{ZSTD_createCStream()}; bool m_compression_stream_contains_data{false}; // ... other members };
116-123
: Consider adding explicit buffer size checkWhile the current check for
pos > 0
is correct, consider adding an explicit size check to make the code more maintainable and self-documenting.- if (m_compressed_stream_block.pos > 0) { + const size_t bytes_compressed = m_compressed_stream_block.pos; + if (bytes_compressed > 0 && bytes_compressed <= m_compressed_stream_block_size) { m_compressed_stream_file_writer->write( static_cast<char const*>(m_compressed_stream_block.dst), - m_compressed_stream_block.pos + bytes_compressed ); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
components/core/src/clp/streaming_compression/Compressor.hpp
(1 hunks)components/core/src/clp/streaming_compression/zstd/Compressor.cpp
(8 hunks)components/core/tests/test-StreamingCompression.cpp
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- components/core/src/clp/streaming_compression/Compressor.hpp
🔇 Additional comments (3)
components/core/tests/test-StreamingCompression.cpp (1)
31-39
: Validate compression chunk sizes
The chunk sizes should be validated to ensure they don't exceed the buffer size.
Add this validation:
constexpr auto cCompressionChunkSizes = std::to_array<size_t>(
{cBufferSize / 100,
cBufferSize / 50,
cBufferSize / 25,
cBufferSize / 10,
cBufferSize / 5,
cBufferSize / 2,
cBufferSize}
);
+ static_assert(
+ std::ranges::all_of(cCompressionChunkSizes, [](size_t size) {
+ return size <= cBufferSize;
+ }),
+ "Chunk sizes must not exceed buffer size"
+ );
components/core/src/clp/streaming_compression/zstd/Compressor.cpp (2)
3-14
: LGTM! Well-organized includes.
The includes are properly organized into distinct sections (standard library, third-party, and project includes) and all are necessary for the implementation.
173-174
:
Fix incorrect error message
The error message references ZSTD_compressStream2
but the actual function call is to ZSTD_flushStream
.
- "streaming_compression::zstd::Compressor: ZSTD_compressStream2() error: {}",
+ "streaming_compression::zstd::Compressor: ZSTD_flushStream() error: {}",
Likely invalid or redundant comment.
components/core/src/clp/streaming_compression/zstd/Compressor.cpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (12)
components/core/src/clp/streaming_compression/Decompressor.hpp (2)
Line range hint
1-65
: Consider documenting the compression type handling requirement.With the removal of compression type management from the base class, it would be beneficial to add documentation explaining that derived classes are responsible for handling their specific compression type.
Add class-level documentation:
namespace clp::streaming_compression { +/** + * Abstract base class for streaming decompression. + * Derived classes are responsible for handling their specific compression type + * and implementing the decompression operations. + */ class Decompressor : public ReaderInterface {
Line range hint
1-65
: Consider enhancing error handling for compression type mismatches.Since compression type management is now delegated to derived classes, consider adding mechanisms to prevent or detect compression type mismatches at runtime.
Suggestions:
- Add a virtual method to query the compression type supported by each derived class
- Consider adding validation in the
open
methods to verify the compressed data matches the expected format- Enhance the
OperationFailed
exception to include compression type mismatch scenarioscomponents/core/src/clp/streaming_compression/Compressor.hpp (2)
27-29
: Consider enhancing the error messageThe current error message is quite generic. Consider including more context about what operation failed or what the error code means.
- return "streaming_compression::Compressor operation failed"; + return "Streaming compression operation failed: " + get_error_code_string(get_error_code());
71-75
: Consider future RAII implementationWhile the current
open()
method works, consider these points for future refactoring:
- A RAII design would be more idiomatic, preventing resource leaks
- The method could benefit from a compression level parameter for derived classes
This could be addressed in a future PR by:
- Moving initialization to the constructor
- Adding a compression level parameter:
virtual auto open(FileWriter& file_writer, int compression_level = 0) -> void = 0;components/core/src/clp/streaming_compression/passthrough/Decompressor.hpp (2)
Line range hint
29-33
: LGTM! Consider adding inline comments for clarity.The member initialization list properly initializes all members in the correct order with appropriate default values. To improve readability, consider adding brief inline comments explaining the purpose of each initialization.
Decompressor() - : m_input_type(InputType::NotInitialized), - m_compressed_data_buf(nullptr), - m_compressed_data_buf_len(0), - m_decompressed_stream_pos(0) {} + : m_input_type(InputType::NotInitialized), // Start in uninitialized state + m_compressed_data_buf(nullptr), // No input buffer + m_compressed_data_buf_len(0), // Zero buffer length + m_decompressed_stream_pos(0) {} // Start at beginning of stream
Line range hint
13-24
: Excellent architectural improvements!The simplified constructor and removal of compression type management:
- Improves separation of concerns
- Reduces coupling between compression implementations
- Makes the codebase more maintainable
- Facilitates the future addition of LZMA compression
This aligns well with the PR's objectives and modern C++ design principles.
components/core/tests/test-StreamingCompression.cpp (2)
31-41
: Consider adding documentation for the buffer size choiceWhile the buffer size and chunk size distributions are reasonable, it would be helpful to document why 128MB was chosen as the buffer size and how the chunk size ratios were determined.
Add a comment explaining the rationale:
- constexpr size_t cBufferSize{128L * 1024 * 1024}; // 128MB + // 128MB buffer size chosen to test both small and large data chunks while + // staying within reasonable memory constraints for testing environments + constexpr size_t cBufferSize{128L * 1024 * 1024};
101-112
: Enhance cleanup robustnessWhile the sanity check is good, the cleanup process could be more robust.
Add error handling for cleanup:
// Cleanup - boost::filesystem::remove(compressed_file_path); + REQUIRE(boost::filesystem::exists(compressed_file_path)); + REQUIRE(boost::filesystem::remove(compressed_file_path)); + REQUIRE(!boost::filesystem::exists(compressed_file_path));components/core/src/clp/streaming_compression/zstd/Compressor.cpp (3)
Line range hint
13-23
: Consider moving initialization to member initializer list.According to the project's coding guidelines, member initialization should be moved to the initializer list. Additionally, the nullptr check suggests that
m_compression_stream
is initialized before the constructor body, but we're checking its validity after the fact.Consider refactoring to:
-Compressor::Compressor() { +Compressor::Compressor() + : m_compression_stream(ZSTD_createCStream()) +{ if (nullptr == m_compression_stream) { SPDLOG_ERROR("streaming_compression::zstd::Compressor: ZSTD_createCStream() error"); throw OperationFailed(ErrorCode_Failure, __FILENAME__, __LINE__); } }
Line range hint
30-37
: Simplify error checking using ZSTD_isError.The current error checking pattern using
ZSTD_getErrorCode
is more complex than necessary. ZStd's documentation recommends usingZSTD_isError
for error detection.Consider simplifying to:
- if (ZSTD_error_no_error != ZSTD_getErrorCode(init_result)) { + if (ZSTD_isError(init_result)) { SPDLOG_ERROR( "streaming_compression::zstd::Compressor: ZSTD_initCStream() error: {}", ZSTD_getErrorName(init_result) ); throw OperationFailed(ErrorCode_Failure, __FILENAME__, __LINE__); }
Line range hint
101-111
: Maintain consistent error checking pattern.For consistency with other methods, use
ZSTD_isError
for error detection.Consider updating to:
- if (ZSTD_error_no_error != ZSTD_getErrorCode(end_stream_result)) { + if (ZSTD_isError(end_stream_result)) { SPDLOG_ERROR( "streaming_compression::zstd::Compressor: ZSTD_endStream() error: {}", ZSTD_getErrorName(end_stream_result) ); throw OperationFailed(ErrorCode_Failure, __FILENAME__, __LINE__); }components/core/src/clp/streaming_compression/zstd/Decompressor.cpp (1)
Line range hint
36-46
: Remove unreachablebreak;
statements afterreturn
andthrow
statements in switch casesThe
break;
statements followingreturn
andthrow
statements within the switch cases are unreachable and can be removed to clean up the code.Apply this diff to remove the unreachable
break;
statements:return ErrorCode_Success; - break; case InputType::File: { // ... break; } default: throw OperationFailed(ErrorCode_Unsupported, __FILENAME__, __LINE__); - break; }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (14)
components/core/CMakeLists.txt
(0 hunks)components/core/src/clp/clg/CMakeLists.txt
(0 hunks)components/core/src/clp/clo/CMakeLists.txt
(0 hunks)components/core/src/clp/clp/CMakeLists.txt
(0 hunks)components/core/src/clp/streaming_compression/Compressor.hpp
(1 hunks)components/core/src/clp/streaming_compression/Constants.hpp
(0 hunks)components/core/src/clp/streaming_compression/Decompressor.hpp
(1 hunks)components/core/src/clp/streaming_compression/passthrough/Compressor.cpp
(2 hunks)components/core/src/clp/streaming_compression/passthrough/Compressor.hpp
(2 hunks)components/core/src/clp/streaming_compression/passthrough/Decompressor.hpp
(1 hunks)components/core/src/clp/streaming_compression/zstd/Compressor.cpp
(8 hunks)components/core/src/clp/streaming_compression/zstd/Compressor.hpp
(2 hunks)components/core/src/clp/streaming_compression/zstd/Decompressor.cpp
(1 hunks)components/core/tests/test-StreamingCompression.cpp
(1 hunks)
💤 Files with no reviewable changes (5)
- components/core/CMakeLists.txt
- components/core/src/clp/clg/CMakeLists.txt
- components/core/src/clp/clo/CMakeLists.txt
- components/core/src/clp/clp/CMakeLists.txt
- components/core/src/clp/streaming_compression/Constants.hpp
🚧 Files skipped from review as they are similar to previous changes (3)
- components/core/src/clp/streaming_compression/passthrough/Compressor.cpp
- components/core/src/clp/streaming_compression/passthrough/Compressor.hpp
- components/core/src/clp/streaming_compression/zstd/Compressor.hpp
🔇 Additional comments (11)
components/core/src/clp/streaming_compression/Decompressor.hpp (2)
27-27
: LGTM! The default constructor simplifies the class interface.
The change from a parameterized constructor to a default constructor aligns well with the removal of compression type management from the base class.
27-27
: Verify the impact on derived decompressor classes.
The removal of compression type management from the base class means derived classes must now handle this responsibility internally.
✅ Verification successful
Let me gather more information about the decompressor classes and their implementations.
Let me check the constructors of the derived classes to see how they handle compression types.
The removal of compression type management from the base class is safe
The codebase analysis shows that the derived decompressor classes in both namespaces (glt
and clp
) already handle compression types appropriately:
- In
glt::streaming_compression
, derived classes likezstd::Decompressor
andpassthrough::Decompressor
explicitly set their compression types in their constructors - In
clp::streaming_compression
, the base class no longer manages compression types, and derived classes have been updated accordingly - In
clp_s
, compression types are still managed through constructor parameters
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Find all derived decompressor classes to verify they handle compression type properly
# Search for classes that inherit from Decompressor
ast-grep --pattern 'class $CLASS : $$$public $$$Decompressor$$$'
# Search for compression type usage in derived classes
rg -l "class.*:.*public.*Decompressor" | xargs rg "CompressorType|compression_type"
Length of output: 314
Script:
#!/bin/bash
# Find all files containing Decompressor class definitions
fd -e hpp -e cpp -e h -e c | xargs rg "class.*Decompressor" -B 2 -A 5
# Look for compression type related enums and definitions
rg -g '*.{hpp,cpp,h,c}' "enum.*CompressorType|struct.*CompressorType"
Length of output: 7494
Script:
#!/bin/bash
# Check constructors in derived decompressor classes
rg -g '*.{hpp,cpp,h,c}' "Decompressor\s*\([^)]*\)" -B 2 -A 2
# Check if there are any other compression type related methods in derived classes
rg -g '*.{hpp,cpp,h,c}' "get_compression_type|set_compression_type" -B 2 -A 2
Length of output: 11503
components/core/src/clp/streaming_compression/Compressor.hpp (2)
4-9
: LGTM: Include directives are now more focused
The header includes have been optimized to include only what's necessary, improving compilation times and reducing dependencies.
33-44
: LGTM: Special member functions follow modern C++ practices
The class properly implements the Rule of Five with explicit handling of all special member functions. The comments follow the latest coding standards.
components/core/tests/test-StreamingCompression.cpp (4)
1-27
: Well-organized includes and using declarations
The includes are properly organized and grouped logically. The addition of algorithm, memory, and numeric headers supports the modern C++ features used in the refactored code.
58-64
: Good choice of test data pattern
The repeating alphabet pattern is well-suited for compression testing as it provides both predictable patterns and variability. The use of Array is appropriate for fixed-size buffers.
44-56
: 🛠️ Refactor suggestion
Add error handling for compression device initialization
While using std::unique_ptr is good practice, we should verify the initialization succeeded.
Add these checks:
SECTION("ZStd single phase compression") {
compressor = std::make_unique<clp::streaming_compression::zstd::Compressor>();
decompressor = std::make_unique<clp::streaming_compression::zstd::Decompressor>();
+ REQUIRE(compressor != nullptr);
+ REQUIRE(decompressor != nullptr);
+ REQUIRE(ErrorCode_Success == compressor->get_last_error());
+ REQUIRE(ErrorCode_Success == decompressor->get_last_error());
}
Likely invalid or redundant comment.
76-99
: 🛠️ Refactor suggestion
Add error handling for memory-mapped file operations
While the decompression logic is sound, it needs proper error handling for memory mapping operations.
Add these checks:
clp::ReadOnlyMemoryMappedFile const memory_mapped_compressed_file{compressed_file_path};
+ REQUIRE(memory_mapped_compressed_file.is_open());
auto const compressed_file_view{memory_mapped_compressed_file.get_view()};
+ REQUIRE(compressed_file_view.data() != nullptr);
+ REQUIRE(compressed_file_view.size() > 0);
decompressor->open(compressed_file_view.data(), compressed_file_view.size());
+ REQUIRE(ErrorCode_Success == decompressor->get_last_error());
Likely invalid or redundant comment.
components/core/src/clp/streaming_compression/zstd/Compressor.cpp (2)
3-10
: LGTM! Well-organized include directives.
The includes are properly organized into three sections: standard library, third-party libraries, and project headers, improving readability and maintainability.
81-88
: Verify buffer capacity before writing.
The code writes to the file writer without explicitly verifying if the buffer has sufficient capacity.
components/core/src/clp/streaming_compression/zstd/Decompressor.cpp (1)
Line range hint 74-94
: Ensure that try_read_exact_length
is defined and implemented correctly
The function try_read_exact_length
is called but its implementation is not provided in the current context. Please verify that try_read_exact_length
is defined and correctly handles the required functionality.
Run the following script to verify the definition of try_read_exact_length
in the codebase:
✅ Verification successful
Let me gather more context about the inheritance hierarchy and implementation to ensure the correct try_read_exact_length
is being used.
Let me check one final piece to understand the base Decompressor class implementation.
try_read_exact_length
is properly defined and accessible through inheritance
The verification shows that try_read_exact_length
is correctly available to the Zstd Decompressor class through the inheritance chain:
- The Zstd
Decompressor
inherits from basestreaming_compression::Decompressor
- The base
Decompressor
inherits fromReaderInterface
ReaderInterface
provides the implementation oftry_read_exact_length
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Search for the definition of `try_read_exact_length` within the codebase.
# Test: Find the function definition. Expect: The function should be defined.
rg -A 5 -B 2 'ErrorCode\s+.*::try_read_exact_length\('
Length of output: 2467
Script:
#!/bin/bash
# Find the class definition of the Decompressor to understand inheritance
rg -B 3 -A 3 "class.*Decompressor\s*:" components/core/src/clp/streaming_compression/zstd/Decompressor.cpp
# Find includes in the Decompressor file to understand dependencies
rg -B 1 "^#include" components/core/src/clp/streaming_compression/zstd/Decompressor.cpp
Length of output: 368
Script:
#!/bin/bash
# Check the header file for class definition and inheritance
cat components/core/src/clp/streaming_compression/zstd/Decompressor.hpp
# Look for ReaderInterface inheritance or usage
rg -B 2 -A 2 "ReaderInterface" components/core/src/clp/streaming_compression/zstd/Decompressor.hpp
Length of output: 5806
Script:
#!/bin/bash
# Check the base Decompressor class definition
cat components/core/src/clp/streaming_compression/Decompressor.hpp
# Also check if there's any direct inheritance from ReaderInterface
rg -B 2 -A 2 "class.*Decompressor.*:.*ReaderInterface" components/core/src/clp/streaming_compression/
Length of output: 2577
components/core/src/clp/streaming_compression/zstd/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/zstd/Compressor.cpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (4)
components/core/src/clp/streaming_compression/zstd/Compressor.cpp (4)
3-10
: Consider using absolute paths for local includesWhile the includes are well-organized (standard → third-party → local), using relative paths (
../../
) for local includes could be fragile during refactoring. Consider using absolute paths from the project root.
Line range hint
30-37
: Standardize error checking patternFor consistency with ZStd's recommended approach and to align with other error checks in the codebase, consider using
ZSTD_isError
:- if (ZSTD_error_no_error != ZSTD_getErrorCode(init_result)) { + if (ZSTD_isError(init_result)) {
Line range hint
101-108
: Standardize error checking and enhance documentation
- Use
ZSTD_isError
for consistency- Consider adding the specific buffer size requirement in the comment
- if (ZSTD_error_no_error != ZSTD_getErrorCode(end_stream_result)) { + if (ZSTD_isError(end_stream_result)) { // Note: Output buffer is large enough (ZSTD_CStreamOutSize()) that it is guaranteed // to have enough room to be able to flush the entire buffer
Line range hint
1-154
: Consider centralizing error handlingTo improve maintainability and reduce duplication, consider:
- Creating a utility function for ZStd error checking
- Using RAII patterns for resource management
- Creating a custom exception class for ZStd errors
Example utility function:
static void check_zstd_error(size_t result, std::string_view operation) { if (ZSTD_isError(result)) { SPDLOG_ERROR("streaming_compression::zstd::Compressor: {} error: {}", operation, ZSTD_getErrorName(result)); throw OperationFailed(ErrorCode_Failure, __FILENAME__, __LINE__); } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
components/core/src/clp/streaming_compression/zstd/Compressor.cpp
(8 hunks)
🔇 Additional comments (1)
components/core/src/clp/streaming_compression/zstd/Compressor.cpp (1)
Line range hint 13-22
: Verify initialization of m_compression_stream
The constructor checks if m_compression_stream
is nullptr, but its initialization isn't visible in this file. This suggests the member is initialized in the class declaration. Consider moving the initialization and error checking into a single location for better maintainability.
components/core/src/clp/streaming_compression/zstd/Compressor.cpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/zstd/Compressor.cpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also try to resolve this comment: #599 (comment)
After that is should be OK to merge (assuming Kirk is ok with #599 (comment))
components/core/src/clp/streaming_compression/passthrough/Decompressor.hpp
Outdated
Show resolved
Hide resolved
components/core/src/clp/streaming_compression/zstd/Compressor.hpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (2)
components/core/src/clp/streaming_compression/zstd/Compressor.hpp (2)
4-4
: Include<cstddef>
only if necessaryPlease verify whether including
<cstddef>
is required. If it's not used in this header file, consider removing it to reduce unnecessary dependencies.
77-79
: Inline implementation ofopen
may increase compile timeImplementing
open(FileWriter& file_writer)
inline in the header might increase compile times and expose implementation details. Consider moving the implementation to the source file to improve encapsulation.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
components/core/src/clp/streaming_compression/zstd/Compressor.cpp
(8 hunks)components/core/src/clp/streaming_compression/zstd/Compressor.hpp
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- components/core/src/clp/streaming_compression/zstd/Compressor.cpp
🔇 Additional comments (5)
components/core/src/clp/streaming_compression/zstd/Compressor.hpp (5)
26-28
: Confirm the use of [[nodiscard]]
attribute
The [[nodiscard]]
attribute on the what()
method may not be meaningful since it's an override of a standard library function. Ensure that adding [[nodiscard]]
is appropriate here.
43-44
: Good use of move semantics with noexcept
The move constructor and move assignment operator are correctly defaulted and marked as noexcept
, which is good for exception safety and allows optimizations.
65-65
: Appropriate use of [[nodiscard]]
on try_get_pos
The [[nodiscard]]
attribute appropriately emphasizes the importance of checking the return value of try_get_pos
.
98-98
: Ensure proper initialization and cleanup of m_compression_stream
Initializing m_compression_stream
with ZSTD_createCStream()
directly in the member initializer list is acceptable, but ensure that it is properly freed in the destructor to prevent resource leaks.
101-101
: Initialization of m_compressed_stream_block_buffer
Initializing m_compressed_stream_block_buffer
with ZSTD_CStreamOutSize()
is appropriate. Ensure that the order of member initializations matches the order of declarations to avoid any issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the PR title, how about:
refactor(core): Fix clang-tidy warnings in the streaming compressor interface and its implementations; Modernize and refactor `test-StreamingCompression` for conciseness.
test-StreamingCompression
for conciseness.
…nterface and its implementations; Modernize and refactor `test-StreamingCompression` for conciseness. (y-scope#599) Co-authored-by: Bingran Hu <[email protected]> Co-authored-by: Lin Zhihao <[email protected]>
Description
Clang tidy compressor-related files in streaming compression in preparation for porting LZMA compression workflow.
Refactor and clang-tidy the unit test for easier addition of the LZMA tests in the future.
No new features are added except for the following:
size_t
result returned from ZStd API functions indicates an error status.Validation performed
Passes all the original and refactored unit tests.
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Refactor