-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add message pipe #27
Conversation
Bump spdlog version to solve conflict with fmtlib. Add using external fmt to spdlog build. Add PIC in fmt build. Add process to boost build.
WalkthroughThe pull request introduces several enhancements to the build configuration and source files of the project. Key changes include updates to the Changes
Possibly related PRs
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 16
🧹 Outside diff range and nitpick comments (25)
src/spider/.clang-format (1)
13-14
: Remove unnecessary empty alternative in regex patternThe regex pattern includes an empty alternative
|
at the start which serves no purpose and could be removed.Apply this diff to simplify the pattern:
- - Regex: "^<(|absl|boost|catch2|fmt|mariadb|msgpack|spdlog)" + - Regex: "^<(absl|boost|catch2|fmt|mariadb|msgpack|spdlog)"src/spider/worker/message_pipe.hpp (1)
1-9
: LGTM! Consider adding file documentation.The include guards and header organization are well-structured. The IWYU pragmas indicate thoughtful dependency management.
Consider adding a brief file-level documentation comment explaining the purpose and usage of this message pipe facility:
#ifndef SPIDER_WORKER_MESSAGE_PIPE_HPP #define SPIDER_WORKER_MESSAGE_PIPE_HPP +/** + * @file message_pipe.hpp + * @brief Provides message passing functionality through pipes for worker communication + * + * This header defines interfaces for synchronous and asynchronous message passing + * using Boost.Asio pipes and POSIX stream descriptors. + */ + #include <functional>tests/CMakeLists.txt (1)
16-20
: Consider using CMake's path handling functionsThe current path construction using string concatenation could be fragile. Consider using
cmake_path()
orfile(REAL_PATH)
for more robust path handling.Here's a more robust approach:
set(SPIDER_TEST_WORKER_SOURCES) +set(SPIDER_SRC_DIR "${CMAKE_CURRENT_SOURCE_DIR}/../src/spider") foreach(worker_source ${SPIDER_WORKER_SOURCES}) - list(APPEND SPIDER_TEST_WORKER_SOURCES "../src/spider/${worker_source}") + cmake_path(APPEND SPIDER_SRC_DIR "${worker_source}" OUTPUT_VARIABLE source_path) + list(APPEND SPIDER_TEST_WORKER_SOURCES "${source_path}") endforeach()src/spider/CMakeLists.txt (2)
38-44
: Separate headers from source filesConsider creating a separate SPIDER_WORKER_HEADERS variable for .hpp files to maintain better CMake structure.
set(SPIDER_WORKER_SOURCES worker/message_pipe.cpp + CACHE INTERNAL + "spider worker source files" +) + +set(SPIDER_WORKER_HEADERS worker/message_pipe.hpp worker/TaskExecutorMessage.hpp CACHE INTERNAL - "spider worker source files" + "spider worker header files" )
50-59
: Add version constraints for spdlog dependencyWhile Boost dependencies are well-specified, consider adding version constraints for spdlog to ensure compatibility.
+find_package(spdlog 1.9.0 REQUIRED) target_link_libraries( spider_worker PRIVATE Boost::headers Boost::filesystem Boost::process Boost::program_options Boost::system spdlog::spdlog )
tools/scripts/lib_install/spdlog.sh (1)
Line range hint
1-7
: Update dependency documentationThe script now requires the fmt library due to the external spdlog configuration, but this isn't documented in the dependencies section.
Add fmt to the dependencies list:
# Dependencies: # - cmake # - curl # - g++ +# - fmt # NOTE: Dependencies should be installed outside the script to allow the script to be largely distro-agnostic
src/spider/worker/FunctionManager.cpp (1)
13-38
: Consider adding error logging in the catch blockWhile the error handling is comprehensive, adding logging in the catch block would help with debugging when type errors occur.
} catch (msgpack::type_error& e) { + // Log the error for debugging purposes + spdlog::debug("Failed to unpack message: {}", e.what()); return std::nullopt; }tests/worker/test-MessagePipe.cpp (3)
36-36
: Consider making the timeout duration configurable.The 5-second timeout is hardcoded. Consider making this a named constant or configuration parameter for better maintainability.
+ static constexpr auto RESPONSE_TIMEOUT = std::chrono::seconds(5); - REQUIRE(future.wait_for(std::chrono::seconds(5)) == std::future_status::ready); + REQUIRE(future.wait_for(RESPONSE_TIMEOUT) == std::future_status::ready);
41-52
: Simplify nested validation logic.The nested if statements can be simplified since we've already verified the presence of values with REQUIRE.
- if (response_option.has_value()) { - msgpack::sbuffer const& response_buffer = response_option.value(); - REQUIRE(spider::worker::TaskExecutorResponseType::Result - == spider::worker::get_response_type(response_buffer)); - std::optional<std::tuple<std::string, int>> const parse_response - = spider::core::response_get_result<std::tuple<std::string, int>>(response_buffer); - REQUIRE(parse_response.has_value()); - if (parse_response.has_value()) { - std::tuple<std::string, int> result = parse_response.value(); - REQUIRE(cSampleResult == result); - } - } + msgpack::sbuffer const& response_buffer = response_option.value(); + REQUIRE(spider::worker::TaskExecutorResponseType::Result + == spider::worker::get_response_type(response_buffer)); + std::optional<std::tuple<std::string, int>> const parse_response + = spider::core::response_get_result<std::tuple<std::string, int>>(response_buffer); + REQUIRE(parse_response.has_value()); + REQUIRE(cSampleResult == parse_response.value());
16-53
: Consider adding more test cases for error scenarios.The current test case only covers the happy path. Consider adding test cases for:
- Invalid message format
- Pipe disconnection
- Timeout scenarios
- Buffer overflow conditions
Would you like me to help generate additional test cases for these scenarios?
src/spider/worker/TaskExecutorMessage.hpp (4)
6-6
: Consider documenting the IWYU pragmaWhile the IWYU pragma is correctly used, it would be helpful to add a brief comment explaining why this pragma is necessary for MsgPack.hpp.
9-16
: Add documentation for enum valuesEach response type should be documented to explain its purpose and when it's used. This helps maintainers understand the message protocol.
Example:
enum class TaskExecutorResponseType : std::uint8_t { + /// Indicates an invalid or unrecognized response Unknown = 0, + /// Contains the successful execution result Result, + /// Contains error information from execution Error, // ... document other states };
40-54
: Reduce code duplication with get_response_typeConsider creating a template function to handle the common message parsing logic.
+template<typename T> +inline auto get_message_type(msgpack::sbuffer const& buffer) -> T { + // NOLINTBEGIN(cppcoreguidelines-pro-type-union-access,cppcoreguidelines-pro-bounds-pointer-arithmetic) + msgpack::object_handle const handle = msgpack::unpack(buffer.data(), buffer.size()); + msgpack::object const object = handle.get(); + if (object.type != msgpack::type::ARRAY || object.via.array.size < 2) { + return T::Unknown; + } + msgpack::object const header = object.via.array.ptr[0]; + try { + return header.as<T>(); + } catch (msgpack::type_error const&) { + return T::Unknown; + } + // NOLINTEND(cppcoreguidelines-pro-type-union-access,cppcoreguidelines-pro-bounds-pointer-arithmetic) +} + +inline auto get_response_type(msgpack::sbuffer const& buffer) -> TaskExecutorResponseType { + return get_message_type<TaskExecutorResponseType>(buffer); +} + +inline auto get_request_type(msgpack::sbuffer const& buffer) -> TaskExecutorRequestType { + return get_message_type<TaskExecutorRequestType>(buffer); +}
1-70
: Consider adding protocol versioning and documentationTo ensure forward compatibility and maintainability:
- Add a version field to the message protocol
- Create protocol documentation explaining the message format, types, and their use cases
- Consider adding a protocol validation layer
This will help with future protocol evolution and make it easier for new developers to understand the messaging system.
tests/worker/test-FunctionManager.cpp (3)
Line range hint
28-62
: Consider enhancing error handling validation.While the test covers both successful and error scenarios well, consider these improvements:
- Validate error message content in addition to error codes
- Simplify optional handling using std::optional methods
Here's a suggested improvement for the error handling section:
- if (wrong_result_option.has_value()) { - REQUIRE(spider::core::FunctionInvokeError::WrongNumberOfArguments - == std::get<0>(wrong_result_option.value())); - } + REQUIRE(std::get<0>(wrong_result_option.value()) == + spider::core::FunctionInvokeError::WrongNumberOfArguments); + REQUIRE(!std::get<1>(wrong_result_option.value()).empty());
64-75
: Consider adding error case tests for tuple returns.The happy path is well tested, but consider adding tests for:
- Wrong argument types
- Wrong number of arguments
- Edge cases with empty strings
Would you like me to provide example test cases for these scenarios?
Additional test coverage needed for Data handling
Looking at the codebase, while there are some Data-related tests in test-DataStorage.cpp, they don't fully cover edge cases. The Data class implementation in src/spider/core/Data.hpp shows that it accepts any string values without validation, making it important to test boundary conditions.
Specific test cases to add:
- Empty strings for both key and value
- Very large strings to ensure no implicit size limitations
- Strings with special characters and Unicode content
- Edge cases around UUID generation when using the default constructor
- Duplicate key scenarios (partially covered in DataStorage tests)
🔗 Analysis chain
Line range hint
77-93
: Consider adding edge cases for Data handling.The current test verifies basic functionality well. Consider adding tests for:
- Empty data values
- Very large data values
- Special characters in data
Let's check if there are any existing edge case tests elsewhere:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for other Data-related tests rg -l "TEST_CASE.*Data" tests/Length of output: 32
Script:
#!/bin/bash # Let's try a broader search for Data-related tests rg "TEST_CASE.*[^a-zA-Z]Data" tests/ -A 5 # Also search for any Data usage in tests rg "spider::core::Data" tests/ -A 2Length of output: 3701
Script:
#!/bin/bash # Let's examine the Data class implementation to understand what edge cases should be tested rg -A 10 "class Data" src/ # Also check if there are any size limits or validation in Data implementation ast-grep --pattern 'Data::Data($$$) { $$$ }'Length of output: 1491
src/spider/worker/message_pipe.cpp (3)
19-19
: Usestd::size_t
instead ofsize_t
for consistencyConsider using
std::size_t
to explicitly specify the namespace and maintain consistency across the codebase.
21-46
: Eliminate code duplication insend_message
functionsThe two
send_message
functions have identical implementations except for the type of the pipe parameter. Consider templating the function or using a common base class to reduce code duplication and improve maintainability.Here's how you might refactor using a template:
template <typename Pipe> auto send_message(Pipe& pipe, msgpack::sbuffer const& request) -> bool { try { size_t const size = request.size(); std::string const size_str = fmt::format("{:016d}", size); boost::asio::write(pipe, boost::asio::buffer(size_str)); boost::asio::write(pipe, boost::asio::buffer(request.data(), size)); return true; } catch (boost::system::system_error const& e) { spdlog::error("Failed to send message: {}", e.what()); return false; } }
91-99
: Consistent error logging for EOF conditionWhen checking for
boost::asio::error::eof
, ensure that the error logging is consistent with the synchronous version. Currently, the asynchronous function may not log all relevant errors.Consider aligning the error handling with
receive_message
.src/spider/worker/FunctionManager.hpp (5)
19-20
: Include guard verification for "TaskExecutorMessage.hpp"Ensure that
"TaskExecutorMessage.hpp"
has proper include guards or uses#pragma once
to prevent multiple inclusions, which could lead to compilation issues.
99-103
: Consolidatecreate_error_response
andcreate_error_buffer
to reduce duplicationBoth functions serve similar purposes in creating error responses. Merging them could simplify the codebase and eliminate redundancy.
210-213
: Include exception details in result parsing error messagesAdding
e.what()
to the error message would provide more context, aiding in debugging efforts.Apply this diff to include exception details:
} catch (msgpack::type_error& e) { return create_error_response( FunctionInvokeError::ResultParsingError, - fmt::format("Cannot parse result.") + fmt::format("Cannot parse result: {}", e.what()) ); }
215-218
: Provide exception information in function execution errorsIncluding the exception message
e.what()
in the error response would offer valuable insights during debugging.Apply this diff to enhance the error message:
} catch (std::exception& e) { return create_error_response( FunctionInvokeError::FunctionExecutionError, - "Function execution error" + fmt::format("Function execution error: {}", e.what()) ); }
253-254
: Avoid returning raw pointers fromget_function
Returning a raw pointer may lead to safety issues. Consider returning a reference or an
std::optional<const Function&>
to enhance safety and express the possibility of a missing function.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (16)
CMakeLists.txt
(3 hunks)src/spider/.clang-format
(1 hunks)src/spider/CMakeLists.txt
(2 hunks)src/spider/core/BoostAsio.hpp
(1 hunks)src/spider/worker/FunctionManager.cpp
(1 hunks)src/spider/worker/FunctionManager.hpp
(6 hunks)src/spider/worker/TaskExecutorMessage.hpp
(1 hunks)src/spider/worker/message_pipe.cpp
(1 hunks)src/spider/worker/message_pipe.hpp
(1 hunks)tests/CMakeLists.txt
(1 hunks)tests/worker/test-FunctionManager.cpp
(4 hunks)tests/worker/test-MessagePipe.cpp
(1 hunks)tools/scripts/lib_install/boost.sh
(1 hunks)tools/scripts/lib_install/fmtlib.sh
(1 hunks)tools/scripts/lib_install/linux/install-lib.sh
(1 hunks)tools/scripts/lib_install/spdlog.sh
(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- src/spider/core/BoostAsio.hpp
🧰 Additional context used
🪛 cppcheck (2.10-2)
tests/worker/test-FunctionManager.cpp
[error] 28-28: syntax error
(syntaxError)
🔇 Additional comments (26)
src/spider/.clang-format (1)
13-14
: Verify completeness of library headers list
Let's ensure all required libraries for the message pipe feature are included in the pattern.
✅ Verification successful
Library headers pattern in .clang-format is complete
The current pattern "^<(|absl|boost|catch2|fmt|mariadb|msgpack|spdlog)"
correctly covers all third-party library headers used in the codebase. The verification shows:
- All major libraries are included: Abseil (
absl
), Boost (boost
), Catch2 (catch2
), fmt (fmt
), MariaDB (mariadb
), MessagePack (msgpack
), and spdlog (spdlog
) - The remaining includes are either:
- Standard C++ headers (like
<vector>
,<string>
, etc.) - Project-specific includes using quotes
- Nested includes from the already-covered libraries
- Standard C++ headers (like
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for any missing library includes that should be added to .clang-format
# Search for library includes in source files
echo "Searching for library includes..."
rg -g '*.{cpp,hpp,h}' '^\s*#include\s*<([^>]+)>' -o | \
sed -E 's/.*<([^/]+).*/\1/' | \
sort -u | \
grep -vE '^(absl|boost|catch2|fmt|mariadb|msgpack|spdlog|spider|std.*)'
echo "Checking CMake dependencies..."
# Look for find_package calls in CMake files
fd -e cmake -e txt -x grep -h "find_package" {}
Length of output: 1117
Script:
#!/bin/bash
# Let's check for any includes from Abseil, Boost, or other libraries that might be hidden in subdirectories
echo "Checking for library includes with path components..."
rg -g '*.{cpp,hpp,h}' '^\s*#include\s*<([^>]+)>' -o | \
grep -v '^#include <[^/]*>' | \
sort -u
# Also check for any third-party includes that might be using quotes instead of angle brackets
echo "Checking for quoted includes..."
rg -g '*.{cpp,hpp,h}' '^\s*#include\s*"([^"]+)"' -o | \
grep -v '\.\./' | \
grep -v '\./' | \
sort -u
Length of output: 11602
tools/scripts/lib_install/linux/install-lib.sh (1)
24-24
: Verify spdlog version compatibility
The version update to spdlog 1.15.0 looks good, but let's verify its stability and compatibility with the external fmt library requirement.
✅ Verification successful
The previous results show that spdlog 1.15.0 is a stable release, but there's a known issue with VS2017 and potential compatibility concerns with older fmt versions. Let's verify the fmt library configuration in the codebase to ensure compatibility.
Version update to spdlog 1.15.0 is compatible and properly configured
The update is safe to proceed because:
- The project correctly uses fmt 8.0.1, which is compatible with spdlog 1.15.0
- SPDLOG_FMT_EXTERNAL is properly configured in both CMake and build scripts
- While there's an open VS2017 issue, it doesn't affect this Linux-specific installation script
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify spdlog 1.15.0 compatibility and release information
# Check if spdlog 1.15.0 is tagged as stable
curl -s "https://api.github.com/repos/gabime/spdlog/releases/tags/v1.15.0" | jq -r '.prerelease'
# Check minimum required fmt version for spdlog 1.15.0
rg -A 5 "FMT_VERSION|fmt_version" "$(fd -t f -e cmake -e txt CMakeLists)"
# Verify if any known issues exist with 1.15.0
curl -s "https://api.github.com/repos/gabime/spdlog/issues?state=all" | \
jq -r '.[] | select(.title | contains("1.15.0")) | {title: .title, state: .state, labels: [.labels[].name]}'
Length of output: 877
Script:
#!/bin/bash
# Let's check for fmt library version requirements and usage in the codebase
# Search for fmt version requirements in CMake files
rg -A 3 "find_package\(fmt"
# Check if SPDLOG_FMT_EXTERNAL is defined
rg "SPDLOG_FMT_EXTERNAL|FMT_EXTERNAL"
# Look for fmt library usage patterns
rg -l "fmt::|#include.*fmt"
Length of output: 568
src/spider/worker/message_pipe.hpp (2)
22-24
: LGTM! Proper namespace and include guard closing.
The closing comments for both namespace and include guard enhance readability in larger codebases.
12-20
: Verify error handling and message protocol details.
The boolean and optional return types suggest basic error handling, but we should verify the implementation details.
✅ Verification successful
Let me gather more information about the message protocol implementation.
Error handling and message protocol implementation are well-structured and robust
The implementation shows a solid message protocol with:
- Fixed-size header (16 bytes) containing message length
- Proper error handling for both sync and async operations
- Comprehensive error logging for all failure scenarios
- Graceful EOF handling
- Consistent error propagation through boolean/optional returns
Error handling covers:
- System errors during read/write operations
- Header parsing failures
- Size mismatches
- EOF conditions
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check implementation details for error handling and message protocol
# Look for error handling patterns in the implementation
echo "Checking error handling implementation..."
rg -A 5 "catch|throw|error|fail" src/spider/worker/message_pipe.cpp
# Look for message format/protocol details
echo "Checking message protocol implementation..."
ast-grep --pattern 'msgpack::pack($_)'
Length of output: 2467
Script:
#!/bin/bash
# Check the message protocol implementation details
echo "Checking message protocol implementation in message_pipe.cpp..."
rg -A 10 "send_message|receive_message" src/spider/worker/message_pipe.cpp
# Look for message size/format constants
echo "Checking message format constants..."
rg "constexpr|#define.*size" src/spider/worker/message_pipe.cpp
Length of output: 2460
tests/CMakeLists.txt (1)
9-9
: LGTM: Test file addition is well-structured
The addition of test-MessagePipe.cpp follows the established pattern for test organization and is properly integrated into the test suite.
src/spider/CMakeLists.txt (2)
2-7
: LGTM: Core sources configuration is well-structured
The CACHE INTERNAL setting ensures proper variable scope accessibility, and the source files are logically organized.
15-16
: Verify Boost.Asio and MessagePack dependencies
The addition of BoostAsio.hpp and MsgPack.hpp suggests new dependencies. Let's verify the configuration.
✅ Verification successful
Let me check the CMakeLists.txt content to verify the dependency configuration.
Dependencies are properly configured
The dependencies for both Boost.Asio and MessagePack are correctly set up:
- Boost is configured with required components including
system
(which provides Asio) - MessagePack is properly configured via
find_package(msgpack-cxx 7.0.0 REQUIRED)
- Both libraries are correctly linked:
- Boost via
Boost::boost
andBoost::system
- MessagePack via
msgpack-cxx
- Boost via
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for proper dependency configuration
# Check if Boost.Asio is properly configured
rg -l "find_package.*Boost.*asio" --type cmake
# Check if MessagePack is properly configured
rg -l "find_package.*msgpack" --type cmake
# Verify usage of these dependencies
rg -l "include.*BoostAsio\.hpp"
rg -l "include.*MsgPack\.hpp"
Length of output: 594
Script:
#!/bin/bash
# Check root CMakeLists.txt for dependency configuration
cat CMakeLists.txt
# Check spider's CMakeLists.txt
cat src/spider/CMakeLists.txt
Length of output: 6302
tools/scripts/lib_install/fmtlib.sh (1)
67-67
: LGTM: Position Independent Code flag is correctly added
The addition of -DCMAKE_POSITION_INDEPENDENT_CODE=ON
is appropriate for building shared libraries and aligns with the project's broader changes.
src/spider/worker/FunctionManager.cpp (4)
1-11
: LGTM! Well-organized header includes
The header organization follows best practices with proper grouping (standard library, external, then internal headers).
62-67
: LGTM! Clean and efficient implementation
The method follows best practices with proper const usage and efficient map lookup.
71-72
: LGTM! Proper use of BOOST_DLL_ALIAS
The DLL alias is correctly implemented with appropriate NOLINT directive.
40-49
: Verify message size constraints
The function should validate the message size to prevent potential memory issues with very large error messages.
tests/worker/test-MessagePipe.cpp (2)
1-14
: LGTM! Well-structured includes with appropriate pragmas.
The includes are well-organized and the IWYU/NOLINT pragmas are appropriately used for test code.
1-56
: Verify test integration in the test suite.
Let's ensure this test is properly integrated into the build system and test suite.
✅ Verification successful
Let me check the CMakeLists.txt content to verify how the test is integrated.
Test is properly integrated in the test suite
The test file is correctly integrated:
- Listed in
SPIDER_TEST_SOURCES
variable in tests/CMakeLists.txt - Included in the
unitTest
executable target - Has proper library linkage with required dependencies (Boost, Catch2)
- Is the only MessagePipe test file, which is appropriate given its comprehensive coverage of the message pipe functionality
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify test integration in CMake configuration
# Check if the test file is included in CMakeLists.txt
echo "Checking test file inclusion in CMakeLists.txt..."
fd -t f "CMakeLists.txt" tests/ -x grep -l "test-MessagePipe.cpp"
# Check for other message pipe related tests
echo "Checking for related test files..."
fd -t f "test.*MessagePipe.*\\.cpp$" tests/
Length of output: 351
Script:
#!/bin/bash
# Check the content of tests/CMakeLists.txt
cat tests/CMakeLists.txt
Length of output: 962
src/spider/worker/TaskExecutorMessage.hpp (1)
66-68
: LGTM! MessagePack registration is properly implemented
The enum registration is correctly placed in the global namespace with clear documentation.
tests/worker/test-FunctionManager.cpp (1)
24-26
: LGTM! Test function registration is comprehensive.
The test functions are well-structured to cover different scenarios including POD types, tuple returns, and custom Data objects.
CMakeLists.txt (2)
58-60
: LGTM! Position Independent Code is essential for modern builds.
Enabling PIC globally is a good practice, especially when building components that might be used in shared libraries or plugins.
124-124
: LGTM! Consistent spdlog configuration.
The SPDLOG_FMT_EXTERNAL definition is correctly added, matching the existing configuration set above.
src/spider/worker/FunctionManager.hpp (8)
93-95
: response_get_error
function implementation is sound
The addition of response_get_error
enhances error handling by providing a structured way to extract error information from the buffer.
96-98
: create_error_response
improves error message consistency
The create_error_response
function effectively standardizes error responses, which aids in consistent error handling across the system.
112-114
: Verify robustness of array size check in response_get_result
The condition object.via.array.size == 2
assumes the array always contains exactly two elements. Ensure this assumption holds true in all contexts to prevent potential parsing errors.
116-119
: Confirm accuracy of response type comparison
Double-check that the comparison with TaskExecutorResponseType::Result
correctly identifies the expected response type and that the as<>()
conversion is safe and reliable.
130-136
: create_result_response
function is well-structured
The function correctly packs the result into a msgpack::sbuffer
, ensuring that the response is properly formatted.
149-158
: create_args_request
efficiently packs arguments
The implementation of create_args_request
effectively serializes the arguments into a request buffer, facilitating seamless communication.
208-208
: Ensure exception safety when returning results
While returning results, confirm that any exceptions during the packing process are appropriately handled to prevent unexpected crashes.
249-250
: register_function_invoker
method is appropriately implemented
The addition of register_function_invoker
allows for direct registration of function invokers, increasing flexibility.
auto send_message(boost::asio::writable_pipe& pipe, msgpack::sbuffer const& request) -> bool; | ||
|
||
auto send_message(boost::asio::posix::stream_descriptor& fd, msgpack::sbuffer const& request) | ||
-> bool; | ||
|
||
auto receive_message_async(std::reference_wrapper<boost::asio::readable_pipe> pipe | ||
) -> boost::asio::awaitable<std::optional<msgpack::sbuffer>>; | ||
|
||
auto receive_message(boost::asio::posix::stream_descriptor& fd) -> std::optional<msgpack::sbuffer>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add comprehensive function documentation and consider error handling strategy.
While the function signatures are well-designed, they would benefit from detailed documentation and explicit error handling guidance.
Add Doxygen-style documentation for each function:
+/**
+ * @brief Sends a message through a writable pipe
+ * @param pipe The pipe to write the message to
+ * @param request The message buffer to send
+ * @return true if the message was sent successfully, false otherwise
+ */
auto send_message(boost::asio::writable_pipe& pipe, msgpack::sbuffer const& request) -> bool;
+/**
+ * @brief Sends a message through a POSIX stream descriptor
+ * @param fd The file descriptor to write the message to
+ * @param request The message buffer to send
+ * @return true if the message was sent successfully, false otherwise
+ */
auto send_message(boost::asio::posix::stream_descriptor& fd, msgpack::sbuffer const& request)
-> bool;
+/**
+ * @brief Asynchronously receives a message from a readable pipe
+ * @param pipe Reference wrapper to the pipe to read from
+ * @return An awaitable that resolves to an optional message buffer
+ */
auto receive_message_async(std::reference_wrapper<boost::asio::readable_pipe> pipe
) -> boost::asio::awaitable<std::optional<msgpack::sbuffer>>;
+/**
+ * @brief Synchronously receives a message from a POSIX stream descriptor
+ * @param fd The file descriptor to read from
+ * @return An optional containing the received message buffer if successful
+ */
auto receive_message(boost::asio::posix::stream_descriptor& fd) -> std::optional<msgpack::sbuffer>;
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
auto send_message(boost::asio::writable_pipe& pipe, msgpack::sbuffer const& request) -> bool; | |
auto send_message(boost::asio::posix::stream_descriptor& fd, msgpack::sbuffer const& request) | |
-> bool; | |
auto receive_message_async(std::reference_wrapper<boost::asio::readable_pipe> pipe | |
) -> boost::asio::awaitable<std::optional<msgpack::sbuffer>>; | |
auto receive_message(boost::asio::posix::stream_descriptor& fd) -> std::optional<msgpack::sbuffer>; | |
/** | |
* @brief Sends a message through a writable pipe | |
* @param pipe The pipe to write the message to | |
* @param request The message buffer to send | |
* @return true if the message was sent successfully, false otherwise | |
*/ | |
auto send_message(boost::asio::writable_pipe& pipe, msgpack::sbuffer const& request) -> bool; | |
/** | |
* @brief Sends a message through a POSIX stream descriptor | |
* @param fd The file descriptor to write the message to | |
* @param request The message buffer to send | |
* @return true if the message was sent successfully, false otherwise | |
*/ | |
auto send_message(boost::asio::posix::stream_descriptor& fd, msgpack::sbuffer const& request) | |
-> bool; | |
/** | |
* @brief Asynchronously receives a message from a readable pipe | |
* @param pipe Reference wrapper to the pipe to read from | |
* @return An awaitable that resolves to an optional message buffer | |
*/ | |
auto receive_message_async(std::reference_wrapper<boost::asio::readable_pipe> pipe | |
) -> boost::asio::awaitable<std::optional<msgpack::sbuffer>>; | |
/** | |
* @brief Synchronously receives a message from a POSIX stream descriptor | |
* @param fd The file descriptor to read from | |
* @return An optional containing the received message buffer if successful | |
*/ | |
auto receive_message(boost::asio::posix::stream_descriptor& fd) -> std::optional<msgpack::sbuffer>; |
tests/CMakeLists.txt
Outdated
target_link_libraries( | ||
unitTest | ||
PRIVATE | ||
Boost::headers | ||
Boost::filesystem | ||
Boost::program_options | ||
Boost::process | ||
Boost::system | ||
spdlog::spdlog | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Remove unused Boost components from test dependencies
Based on the code analysis, some Boost components are not being used in the test suite:
Boost::filesystem
andBoost::program_options
are not used anywhere in the tests- Only the following Boost components are actually needed:
Boost::headers
(for UUID functionality)Boost::process
andBoost::system
(for asio functionality in MessagePipe tests)
Additionally, while spdlog is linked, it appears to be unused in the test code and can be removed.
🔗 Analysis chain
Verify usage of all Boost components
While the dependencies are properly configured, let's verify that all Boost components are actually needed by the test suite.
Let me gather more information about the actual usage of Boost components in the test files.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check usage of Boost components in test files
echo "Checking Boost::filesystem usage..."
rg -l "boost/filesystem" tests/
echo "Checking Boost::program_options usage..."
rg -l "boost/program_options" tests/
echo "Checking Boost::process usage..."
rg -l "boost/process" tests/
echo "Checking Boost::system usage..."
rg -l "boost/system" tests/
echo "Checking spdlog usage..."
rg -l "spdlog" tests/
Length of output: 547
Script:
#!/bin/bash
# Let's check the actual content of test files to see Boost usage
echo "Searching for Boost headers and usage..."
rg "boost" tests/ -A 3
echo -e "\nSearching for spdlog usage..."
rg "spdlog" tests/ -A 3
Length of output: 7844
cmake -DCMAKE_POSITION_INDEPENDENT_CODE=ON ../ | ||
make -j${num_cpus} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider enhancing build verification and cleanup
The script should verify the build success and clean up on failure. Consider adding:
- Build status verification
- Cleanup trap for failures
Apply this diff to improve error handling:
cmake -DCMAKE_POSITION_INDEPENDENT_CODE=ON ../
+if [ $? -ne 0 ]; then
+ echo "CMAKE configuration failed"
+ exit 1
+fi
make -j${num_cpus}
+if [ $? -ne 0 ]; then
+ echo "Build failed"
+ exit 1
+fi
And at the start of the script, after set -e
:
# Exit on any error
set -e
+
+# Cleanup on error
+cleanup() {
+ if [ $? -ne 0 ]; then
+ echo "Error occurred, cleaning up..."
+ rm -rf $temp_dir
+ fi
+}
+trap cleanup EXIT
Committable suggestion skipped: line range outside the PR's diff.
@@ -71,7 +71,7 @@ fi | |||
cd ${extracted_dir} | |||
mkdir -p build | |||
cd build | |||
cmake ../ | |||
cmake -DSPDLOG_FMT_EXTERNAL=ON .. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Version mismatch between build script and CMake requirement
The build system has a version mismatch for the fmt library:
- CMakeLists.txt requires fmt version 8.0.1
- install-lib.sh installs fmt version 11.0.2
While the newer version should be backwards compatible, it's recommended to:
- Update CMakeLists.txt to require fmt 11.0.2, or
- Update install-lib.sh to install fmt 8.0.1
Otherwise, the -DSPDLOG_FMT_EXTERNAL=ON
flag is properly supported as:
- fmtlib.sh is installed before spdlog.sh
- The installation scripts handle dependencies correctly
🔗 Analysis chain
Verify fmt library dependency for external usage
The addition of -DSPDLOG_FMT_EXTERNAL=ON
is appropriate for the messaging pipe feature. However, we should ensure the fmt library dependency is properly handled.
Let's verify the fmt library dependency:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for fmt library installation script and dependencies
# Check for fmt library installation script
echo "Checking for fmt library installation script..."
fd -t f "fmt.sh" "tools/scripts/lib_install"
# Check if fmt is mentioned in any CMake files
echo "Checking for fmt in CMake files..."
rg -l "fmt" -g "CMakeLists.txt"
# Check for fmt package requirements
echo "Checking for fmt package requirements..."
rg "fmt" -g "package.xml|requirements.txt|dependencies"
Length of output: 593
Script:
#!/bin/bash
# Let's check the content of CMakeLists.txt files that mention fmt
echo "Checking CMakeLists.txt content for fmt dependency..."
rg -A 5 -B 5 "fmt" CMakeLists.txt src/spider/CMakeLists.txt
# Check if fmt library is installed before spdlog
echo "Checking spdlog.sh script for fmt dependency handling..."
cat tools/scripts/lib_install/spdlog.sh
# Look for any fmt installation scripts with different naming
echo "Looking for any fmt-related installation scripts..."
fd -t f ".*fmt.*" tools/scripts/lib_install/
Length of output: 4134
Script:
#!/bin/bash
# Let's check the fmtlib installation script
echo "Checking fmtlib.sh content..."
cat tools/scripts/lib_install/fmtlib.sh
# Check if fmtlib is installed before spdlog in any build scripts
echo "Checking for build order in scripts..."
rg -l "fmtlib.sh|spdlog.sh" tools/scripts/
# Verify fmt version requirement matches
echo "Checking fmt version requirements..."
rg "fmt.*8\.0\.1" -g "*.sh"
Length of output: 2407
Script:
#!/bin/bash
# Check the installation order in the Linux install script
echo "Checking library installation order..."
cat tools/scripts/lib_install/linux/install-lib.sh
# Check if there are any version specifications for fmt in CMake
echo "Checking fmt version in CMake files..."
rg "fmt.*VERSION" -A 2 -B 2 CMakeLists.txt
Length of output: 1287
void create_error_buffer( | ||
FunctionInvokeError error, | ||
std::string const& message, | ||
msgpack::sbuffer& buffer | ||
) { | ||
msgpack::packer packer{buffer}; | ||
packer.pack_array(2); | ||
packer.pack(error); | ||
packer.pack(message); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Array size difference is intentional, but buffer capacity check is recommended
The difference in array sizes between create_error_buffer
(2 elements) and create_error_response
(3 elements) is intentional:
create_error_response
includes the response type as its first element (TaskExecutorResponseType::Error
)create_error_buffer
is used when the response type is already known/written
However, the second concern about buffer capacity remains valid. Consider adding a check or using a buffer with automatic growth.
🔗 Analysis chain
Review array size inconsistency and add buffer capacity check
Two concerns:
- This function packs 2 elements while
create_error_response
packs 3 elements. Is this intentional? - Consider checking buffer capacity before writing to prevent overflow.
Let me gather more information about the create_error_response
function implementation to understand the array size difference.
Let me search for the implementation of create_error_response
to compare with create_error_buffer
.
Let me try one more search with a different approach to find the implementation.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Search for other uses of these functions to understand the expected array sizes
ast-grep --pattern 'create_error_buffer($$$)' -A 5
ast-grep --pattern 'create_error_response($$$)' -A 5
Length of output: 3579
Script:
#!/bin/bash
# Search for create_error_response implementation
ast-grep --pattern 'create_error_response($$$)' --lang cpp
rg "create_error_response.*\{" -A 5
Length of output: 2217
Script:
#!/bin/bash
# Search for create_error_response implementation in cpp files
rg -t cpp "create_error_response.*\{" -B 2 -A 10
# Also check the header file for its declaration
rg -t hpp "create_error_response" -B 2 -A 2
Length of output: 122
Script:
#!/bin/bash
# Search for create_error_response implementation
rg "create_error_response" -A 10 src/spider/worker/FunctionManager.cpp
# Also look for any msgpack array packing in the response creation
rg "pack_array" -B 5 -A 5 src/spider/worker/FunctionManager.cpp
Length of output: 1113
auto receive_message_async(std::reference_wrapper<boost::asio::readable_pipe> pipe | ||
) -> boost::asio::awaitable<std::optional<msgpack::sbuffer>> { | ||
std::array<char, cHeaderSize> header_buffer{0}; | ||
// NOLINTNEXTLINE(clang-analyzer-core.NullDereference) | ||
auto [header_ec, header_n] = co_await boost::asio::async_read( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Investigate suppressed null dereference warning
The comment // NOLINTNEXTLINE(clang-analyzer-core.NullDereference)
suppresses a potential null dereference warning. Instead of suppressing the warning, consider addressing the underlying issue to ensure code safety.
Review whether pipe.get()
could be null and add necessary checks or assertions.
src/spider/worker/message_pipe.cpp
Outdated
size_t response_size = 0; | ||
try { | ||
response_size = std::stoul(std::string{header_buffer.data(), cHeaderSize}); | ||
} catch (std::exception& e) { | ||
spdlog::error( | ||
"Cannot parse header: {} {}", | ||
e.what(), | ||
std::string{header_buffer.data(), cHeaderSize} | ||
); | ||
co_return std::nullopt; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle exceptions when parsing response size
Similar to the synchronous version, ensure that exceptions thrown by std::stoul
when parsing response_size
are properly handled. A malformed header could cause unexpected termination.
Ensure a try-catch block is present and include size checks:
try {
response_size = std::stoul(std::string{header_buffer.data(), cHeaderSize});
if (response_size >= max_allowed_size) {
spdlog::error("Response size exceeds maximum allowed limit: {}", response_size);
co_return std::nullopt;
}
} catch (std::exception& e) {
spdlog::error(
"Cannot parse header: {} {}",
e.what(),
std::string{header_buffer.data(), cHeaderSize}
);
co_return std::nullopt;
}
src/spider/worker/message_pipe.cpp
Outdated
if (response_size != response_n) { | ||
spdlog::error( | ||
"Response read size not match. Expect {}. Got {}.", | ||
response_size, | ||
response_n | ||
); | ||
co_return std::nullopt; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ensure complete asynchronous read of the message body
After the asynchronous read, verify that the number of bytes read matches the expected response_size
to handle any incomplete reads.
Modify the check:
if (response_size != response_n) {
spdlog::error(
"Response read size does not match. Expected {} bytes, got {} bytes.",
response_size,
response_n
);
co_return std::nullopt;
}
// NOLINTBEGIN(cppcoreguidelines-macro-usage) | ||
#define CONCAT_DIRECT(s1, s2) s1##s2 | ||
#define CONCAT(s1, s2) CONCAT_DIRECT(s1, s2) | ||
#define ANONYMOUS_VARIABLE(str) CONCAT(str, __COUNTER__) | ||
// NOLINTEND(cppcoreguidelines-macro-usage) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Replace macros with constexpr
functions for better type safety
Using macros like CONCAT
, CONCAT_DIRECT
, and ANONYMOUS_VARIABLE
can lead to less type-safe code and harder-to-debug errors. Consider replacing them with constexpr
functions or inline variables to enhance type safety and maintainability.
inline const auto ANONYMOUS_VARIABLE(var) \ | ||
= spider::core::FunctionManager::get_instance().register_function(#func, func); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Avoid macros in REGISTER_TASK
; use templates instead
The REGISTER_TASK
macro could be replaced with a templated function to reduce macro usage and improve code clarity. This change would enhance type safety and make the code easier to debug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (6)
src/spider/worker/message_pipe.cpp (2)
120-122
: Maintain consistency with synchronous versionThe async version checks for zero body size while the synchronous version doesn't. This inconsistency could lead to different behaviour between the two implementations.
Either add the check to the synchronous version or document why the behaviours differ.
1-148
: Add documentation for message format and size limitsThe code would benefit from documentation describing:
- The message format specification (16-byte header + variable body)
- Maximum message size limits and their rationale
- Error handling behaviour and recovery strategies
Consider adding a detailed comment block at the top of the file explaining these aspects.
src/spider/worker/FunctionManager.hpp (4)
93-103
: Add documentation for error handling functionsThese public API functions would benefit from documentation explaining:
- Expected input format
- Possible error conditions
- Return value semantics
112-118
: Replace magic numbers with named constantsThe array size check uses magic number '2'. Consider defining a constant for better maintainability:
+ static constexpr std::size_t EXPECTED_RESPONSE_SIZE = 2; - if (msgpack::type::ARRAY != object.type || 2 != object.via.array.size) { + if (msgpack::type::ARRAY != object.type || EXPECTED_RESPONSE_SIZE != object.via.array.size) {
Line range hint
177-218
: Enhance error messages with more contextConsider adding more detailed error messages that include:
- The actual values that caused parsing failures
- The specific type conversion that failed
- Stack trace information when available
253-253
: Document pointer ownership semanticsPlease add documentation clarifying:
- Pointer ownership (who owns the returned Function pointer)
- Lifetime guarantees
- Whether the pointer can be null
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
src/spider/CMakeLists.txt
(2 hunks)src/spider/worker/FunctionManager.cpp
(1 hunks)src/spider/worker/FunctionManager.hpp
(6 hunks)src/spider/worker/message_pipe.cpp
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- src/spider/CMakeLists.txt
- src/spider/worker/FunctionManager.cpp
🔇 Additional comments (5)
src/spider/worker/message_pipe.cpp (3)
1-19
: LGTM! Header organization and namespace setup look good.
The includes are well-organized, and the IWYU pragmas are appropriately used for managing header dependencies.
21-32
:
Add size validation before formatting message header
The current implementation doesn't validate the message size before formatting it into the header string. This could lead to issues with very large messages.
Add size validation before formatting:
auto send_message(boost::asio::writable_pipe& pipe, msgpack::sbuffer const& request) -> bool {
try {
size_t const size = request.size();
+ if (size >= 1e16) {
+ spdlog::error("Message size exceeds maximum allowed limit: {}", size);
+ return false;
+ }
std::string const size_str = fmt::format("{:016d}", size);
90-94
:
Address suppressed null dereference warning
The suppressed warning suggests a potential null dereference issue that should be addressed rather than suppressed.
Add explicit null check:
- // NOLINTNEXTLINE(clang-analyzer-core.NullDereference)
+ if (&pipe.get() == nullptr) {
+ spdlog::error("Null pipe reference");
+ co_return std::nullopt;
+ }
auto [header_ec, header_n] = co_await boost::asio::async_read(
src/spider/worker/FunctionManager.hpp (2)
21-29
: Consider modernizing macro usage with constexpr alternatives
The use of macros for concatenation and variable generation could be replaced with more type-safe alternatives.
149-158
: Consider adding buffer size checks
For safety, consider adding a maximum limit to the number of arguments to prevent potential buffer overflow issues.
src/spider/worker/message_pipe.cpp
Outdated
auto send_message(boost::asio::posix::stream_descriptor& fd, msgpack::sbuffer const& request) | ||
-> bool { | ||
try { | ||
size_t const size = request.size(); | ||
std::string const size_str = fmt::format("{:016d}", size); | ||
boost::asio::write(fd, boost::asio::buffer(size_str)); | ||
boost::asio::write(fd, boost::asio::buffer(request.data(), size)); | ||
return true; | ||
} catch (boost::system::system_error const& e) { | ||
spdlog::error("Failed to send message: {}", e.what()); | ||
return false; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Refactor duplicated send_message logic
Both send_message overloads share identical logic, differing only in the type of the output device.
Consider implementing a template function to reduce code duplication:
+template<typename T>
+auto send_message_impl(T& output, msgpack::sbuffer const& request) -> bool {
+ try {
+ size_t const size = request.size();
+ if (size >= 1e16) {
+ spdlog::error("Message size exceeds maximum allowed limit: {}", size);
+ return false;
+ }
+ std::string const size_str = fmt::format("{:016d}", size);
+ boost::asio::write(output, boost::asio::buffer(size_str));
+ boost::asio::write(output, boost::asio::buffer(request.data(), size));
+ return true;
+ } catch (boost::system::system_error const& e) {
+ spdlog::error("Failed to send message: {}", e.what());
+ return false;
+ }
+}
+
+auto send_message(boost::asio::writable_pipe& pipe, msgpack::sbuffer const& request) -> bool {
+ return send_message_impl(pipe, request);
+}
+
+auto send_message(boost::asio::posix::stream_descriptor& fd, msgpack::sbuffer const& request) -> bool {
+ return send_message_impl(fd, request);
+}
Committable suggestion skipped: line range outside the PR's diff.
auto receive_message(boost::asio::posix::stream_descriptor& fd) -> std::optional<msgpack::sbuffer> { | ||
std::array<char, cHeaderSize> header_buffer{0}; | ||
try { | ||
boost::asio::read(fd, boost::asio::buffer(header_buffer)); | ||
} catch (boost::system::system_error& e) { | ||
if (boost::asio::error::eof != e.code()) { | ||
spdlog::error("Fail to read header: {}", e.what()); | ||
} | ||
return std::nullopt; | ||
} | ||
size_t body_size = 0; | ||
try { | ||
body_size = std::stoul(std::string{header_buffer.data(), cHeaderSize}); | ||
} catch (std::exception& e) { | ||
spdlog::error( | ||
"Cannot parse header: {} {}", | ||
e.what(), | ||
std::string{header_buffer.data(), cHeaderSize} | ||
); | ||
return std::nullopt; | ||
} | ||
|
||
std::vector<char> body_buffer(body_size); | ||
try { | ||
size_t body_read_size = boost::asio::read(fd, boost::asio::buffer(body_buffer)); | ||
if (body_read_size != body_size) { | ||
spdlog::error( | ||
"Message body read size mismatch. Expect {}. Got {}", | ||
body_size, | ||
body_read_size | ||
); | ||
return std::nullopt; | ||
} | ||
} catch (boost::system::system_error& e) { | ||
spdlog::error("Fail to read response body: {}", e.what()); | ||
return std::nullopt; | ||
} | ||
msgpack::sbuffer buffer; | ||
buffer.write(body_buffer.data(), body_buffer.size()); | ||
return buffer; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add memory safety checks for message body allocation
The function allocates a buffer based on the parsed size without checking if the allocation is reasonable or possible.
Add memory safety checks:
try {
body_size = std::stoul(std::string{header_buffer.data(), cHeaderSize});
+ constexpr size_t max_message_size = 1024 * 1024 * 1024; // 1GB or adjust as needed
+ if (body_size > max_message_size) {
+ spdlog::error("Message size {} exceeds maximum allowed size {}", body_size, max_message_size);
+ return std::nullopt;
+ }
} catch (std::exception& e) {
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
auto receive_message(boost::asio::posix::stream_descriptor& fd) -> std::optional<msgpack::sbuffer> { | |
std::array<char, cHeaderSize> header_buffer{0}; | |
try { | |
boost::asio::read(fd, boost::asio::buffer(header_buffer)); | |
} catch (boost::system::system_error& e) { | |
if (boost::asio::error::eof != e.code()) { | |
spdlog::error("Fail to read header: {}", e.what()); | |
} | |
return std::nullopt; | |
} | |
size_t body_size = 0; | |
try { | |
body_size = std::stoul(std::string{header_buffer.data(), cHeaderSize}); | |
} catch (std::exception& e) { | |
spdlog::error( | |
"Cannot parse header: {} {}", | |
e.what(), | |
std::string{header_buffer.data(), cHeaderSize} | |
); | |
return std::nullopt; | |
} | |
std::vector<char> body_buffer(body_size); | |
try { | |
size_t body_read_size = boost::asio::read(fd, boost::asio::buffer(body_buffer)); | |
if (body_read_size != body_size) { | |
spdlog::error( | |
"Message body read size mismatch. Expect {}. Got {}", | |
body_size, | |
body_read_size | |
); | |
return std::nullopt; | |
} | |
} catch (boost::system::system_error& e) { | |
spdlog::error("Fail to read response body: {}", e.what()); | |
return std::nullopt; | |
} | |
msgpack::sbuffer buffer; | |
buffer.write(body_buffer.data(), body_buffer.size()); | |
return buffer; | |
} | |
auto receive_message(boost::asio::posix::stream_descriptor& fd) -> std::optional<msgpack::sbuffer> { | |
std::array<char, cHeaderSize> header_buffer{0}; | |
try { | |
boost::asio::read(fd, boost::asio::buffer(header_buffer)); | |
} catch (boost::system::system_error& e) { | |
if (boost::asio::error::eof != e.code()) { | |
spdlog::error("Fail to read header: {}", e.what()); | |
} | |
return std::nullopt; | |
} | |
size_t body_size = 0; | |
try { | |
body_size = std::stoul(std::string{header_buffer.data(), cHeaderSize}); | |
constexpr size_t max_message_size = 1024 * 1024 * 1024; // 1GB or adjust as needed | |
if (body_size > max_message_size) { | |
spdlog::error("Message size {} exceeds maximum allowed size {}", body_size, max_message_size); | |
return std::nullopt; | |
} | |
} catch (std::exception& e) { | |
spdlog::error( | |
"Cannot parse header: {} {}", | |
e.what(), | |
std::string{header_buffer.data(), cHeaderSize} | |
); | |
return std::nullopt; | |
} | |
std::vector<char> body_buffer(body_size); | |
try { | |
size_t body_read_size = boost::asio::read(fd, boost::asio::buffer(body_buffer)); | |
if (body_read_size != body_size) { | |
spdlog::error( | |
"Message body read size mismatch. Expect {}. Got {}", | |
body_size, | |
body_read_size | |
); | |
return std::nullopt; | |
} | |
} catch (boost::system::system_error& e) { | |
spdlog::error("Fail to read response body: {}", e.what()); | |
return std::nullopt; | |
} | |
msgpack::sbuffer buffer; | |
buffer.write(body_buffer.data(), body_buffer.size()); | |
return buffer; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (4)
src/spider/worker/FunctionManager.hpp (4)
93-104
: Add documentation for public API functionsThese public API functions would benefit from documentation explaining:
- Parameters and return values
- Error conditions and handling
- Usage examples
112-118
: Replace magic number with named constantThe array size check uses a magic number '2'. Consider defining a named constant for better maintainability.
+ static constexpr std::size_t RESPONSE_ARRAY_SIZE = 2; - if (msgpack::type::ARRAY != object.type || 2 != object.via.array.size) { + if (msgpack::type::ARRAY != object.type || RESPONSE_ARRAY_SIZE != object.via.array.size) {
130-136
: Implementation looks good, consider consistent constant usageThe implementation is correct, but should use the same named constant for array size as suggested above.
Line range hint
177-218
: Enhance error messages for better debuggingConsider including more context in error messages:
- For ArgumentParsingError: Include the actual parsing error message from the exception
- For FunctionExecutionError: Include the exception message from std::exception
- fmt::format("Cannot parse arguments.") + fmt::format("Cannot parse arguments: {}", e.what())- "Function execution error" + fmt::format("Function execution error: {}", e.what())
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
src/spider/worker/FunctionManager.hpp
(6 hunks)src/spider/worker/message_pipe.cpp
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/spider/worker/message_pipe.cpp
🔇 Additional comments (3)
src/spider/worker/FunctionManager.hpp (3)
21-29
: Consider modernizing macro-based implementations
The macro-based approach for variable name generation could be replaced with more type-safe alternatives.
149-158
: LGTM! Clean and efficient implementation
The implementation makes good use of variadic templates and parameter pack expansion.
249-255
: Verify type safety of direct function registration
The register_function_invoker
method allows bypassing the type checking provided by FunctionInvoker
. Ensure that all direct registrations maintain proper type safety.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
tests/worker/test-FunctionManager.cpp (1)
Line range hint
28-61
: Consider adding test case descriptions using SECTIONWhile the test coverage is comprehensive, consider using Catch2's SECTION macro to better organize and describe different test scenarios.
Here's how you could refactor this:
TEST_CASE("Register and run function with POD inputs", "[core]") { spider::core::FunctionManager const& manager = spider::core::FunctionManager::get_instance(); + SECTION("Unregistered function returns nullptr") { // Get the function that has not been registered should return nullptr REQUIRE(nullptr == manager.get_function("foo")); + } + SECTION("Successful execution with correct arguments") { // Get the registered function should succeed spider::core::Function const* function = manager.get_function("int_test"); spider::core::ArgsBuffer const args_buffers = spider::core::create_args_buffers(2, 3); msgpack::sbuffer const result = (*function)(args_buffers); REQUIRE(5 == spider::core::response_get_result<int>(result).value_or(0)); + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
tests/worker/test-FunctionManager.cpp
(4 hunks)
🧰 Additional context used
🪛 cppcheck (2.10-2)
tests/worker/test-FunctionManager.cpp
[error] 28-28: syntax error
(syntaxError)
🔇 Additional comments (2)
tests/worker/test-FunctionManager.cpp (2)
24-26
: LGTM! Good test coverage across different data types.
The registration of test functions provides comprehensive coverage across primitive types (int_test), compound types (tuple_ret_test), and custom types (data_test).
Line range hint 77-93
: Enhance const correctness in data test
While the test is thorough, consider improving const correctness for better safety and clarity.
Apply this change:
- spider::core::Data data{"test"};
+ spider::core::Data const data{"test"};
Also, verify that the Data class's get_id() and get_value() methods are marked as const:
✅ Verification successful
Both get_id() and get_value() are already const-correct, only data variable needs const
The Data class implementation in src/spider/core/Data.hpp shows that both get_id() and get_value() methods are properly marked as const. The only improvement needed is making the data variable const in the test case, as originally suggested.
- spider::core::Data data{"test"};
+ spider::core::Data const data{"test"};
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check if Data class methods are properly marked as const
ast-grep --pattern 'class Data {
$$$
get_id() const {
$$$
}
$$$
get_value() const {
$$$
}
$$$
}'
Length of output: 120
Script:
#!/bin/bash
# Let's try to find the Data class definition first
rg -l "class Data"
# Then let's look for the method declarations
rg "get_id|get_value" -A 2
Length of output: 9341
Script:
#!/bin/bash
# Let's examine the Data class definition in core
cat src/spider/core/Data.hpp
Length of output: 1940
TEST_CASE("Register and run function with tuple return", "[core]") { | ||
SPIDER_WORKER_REGISTER_TASK(tuple_ret_test); | ||
|
||
spider::core::FunctionManager& manager = spider::core::FunctionManager::get_instance(); | ||
spider::core::FunctionManager const& manager = spider::core::FunctionManager::get_instance(); | ||
|
||
spider::core::Function const* function = manager.get_function("tuple_ret_test"); | ||
|
||
spider::core::ArgsBuffer const args_buffers = spider::core::create_args_buffers("test", 3); | ||
msgpack::sbuffer const result = (*function)(args_buffers); | ||
REQUIRE(std::make_tuple("test", 3) | ||
== spider::core::buffer_get<std::tuple<std::string, int>>(result).value_or( | ||
== spider::core::response_get_result<std::tuple<std::string, int>>(result).value_or( | ||
std::make_tuple("", 0) | ||
)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Consider adding error cases for tuple return test
The current test only covers the happy path. Consider adding test cases for error scenarios, such as incorrect argument types or missing arguments.
Here's a suggested addition:
// Test with wrong argument type
spider::core::ArgsBuffer const wrong_args = spider::core::create_args_buffers(42, 3); // int instead of string
msgpack::sbuffer const error_result = (*function)(wrong_args);
auto error = spider::core::response_get_error(error_result);
REQUIRE(error.has_value());
REQUIRE(spider::core::FunctionInvokeError::ArgumentParsingError == std::get<0>(error.value()));
Description
Add task executor message through pipe.
Validation performed
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Tests
Chores