feat: Add message pipe #27

sitaowang1998 · 2024-11-29T16:58:52Z

Description

Add task executor message through pipe.

Validation performed

lint:check
Unit test on devcontainer

Summary by CodeRabbit

Release Notes

New Features
- Introduced new header and source files for enhanced error handling and messaging capabilities in the spider application.
- Added asynchronous message handling functions for improved communication.
- Enhanced build configuration with updated library dependencies and settings for position-independent code.
Bug Fixes
- Updated error handling methods to ensure consistency and robustness in function execution and response generation.
Tests
- Added new test cases for message-pipe functionality and updated existing tests for the FunctionManager to improve coverage and reliability.
Chores
- Updated installation scripts for libraries to ensure compatibility and improved error handling.

Bump spdlog version to solve conflict with fmtlib. Add using external fmt to spdlog build. Add PIC in fmt build. Add process to boost build.

coderabbitai · 2024-11-29T16:58:59Z

Walkthrough

The pull request introduces several enhancements to the build configuration and source files of the project. Key changes include updates to the CMakeLists.txt to enable Position Independent Code (PIC) and to expand the search for Boost components. New source files are added, including FunctionManager.cpp, BoostAsio.hpp, and TaskExecutorMessage.hpp, which enhance functionality related to error handling and message processing. Additionally, new test cases are introduced to validate the new features, and various scripts are updated for improved installation processes.

Changes

File Path	Change Summary
`CMakeLists.txt`	Added `set(CMAKE_POSITION_INDEPENDENT_CODE ON)`, modified `find_package(Boost ...)`, added `add_compile_definitions(SPDLOG_FMT_EXTERNAL)`.
`src/spider/CMakeLists.txt`	Updated `SPIDER_CORE_SOURCES` and `SPIDER_WORKER_SOURCES`, modified target sources and linked libraries for `spider_worker`.
`src/spider/core/BoostAsio.hpp`	New header file added for Boost.Asio functionalities.
`src/spider/worker/FunctionManager.cpp`	New source file added with functions for error handling and response creation.
`src/spider/worker/FunctionManager.hpp`	Updated function signatures and added new methods for error and result handling.
`src/spider/worker/TaskExecutorMessage.hpp`	New header file added defining enums and functions for message handling.
`src/spider/worker/message_pipe.cpp`	Implemented message transmission and reception functionalities using Boost.Asio.
`src/spider/worker/message_pipe.hpp`	New header file defining message communication functions.
`tests/CMakeLists.txt`	Added new test source file and updated target sources and libraries for unit tests.
`tests/worker/test-FunctionManager.cpp`	Updated test cases to use new response handling methods and added new tests.
`tests/worker/test-MessagePipe.cpp`	New test file added for testing message-pipe functionality.
`tools/scripts/lib_install/boost.sh`	Modified to include `process` library during Boost installation.
`tools/scripts/lib_install/fmtlib.sh`	Added `-DCMAKE_POSITION_INDEPENDENT_CODE=ON` flag to CMake command.
`tools/scripts/lib_install/linux/install-lib.sh`	Updated `spdlog` library version from `1.14.1` to `1.15.0`.
`tools/scripts/lib_install/spdlog.sh`	Added `-DSPDLOG_FMT_EXTERNAL=ON` flag to CMake command for external fmt library usage.

Possibly related PRs

build: Add CMake config and libraries #13: Related changes in CMakeLists.txt for library management and configuration settings.
build: Split core into spearate library #15: Updates to CMakeLists.txt for C++ standards and linking libraries.
build: Add mariadb-connector-cpp library #18: Modifications in handling the MariaDB client library in the CMakeLists.txt.
test: Add tests for data storage #21: Enhancements to the build configuration and library linking with unit tests.
feat: Add function manager for register and run function by function name #25: Introduction of the FunctionManager aligns with updates to the project’s structure and functionality.

Suggested reviewers

kirkrodrigues

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 16

🧹 Outside diff range and nitpick comments (25)

src/spider/.clang-format (1)
13-14: Remove unnecessary empty alternative in regex pattern

The regex pattern includes an empty alternative | at the start which serves no purpose and could be removed.

Apply this diff to simplify the pattern:
-  - Regex: "^<(|absl|boost|catch2|fmt|mariadb|msgpack|spdlog)"
+  - Regex: "^<(absl|boost|catch2|fmt|mariadb|msgpack|spdlog)"
src/spider/worker/message_pipe.hpp (1)
1-9: LGTM! Consider adding file documentation.

The include guards and header organization are well-structured. The IWYU pragmas indicate thoughtful dependency management.

Consider adding a brief file-level documentation comment explaining the purpose and usage of this message pipe facility:
 #ifndef SPIDER_WORKER_MESSAGE_PIPE_HPP
 #define SPIDER_WORKER_MESSAGE_PIPE_HPP
 
+/**
+ * @file message_pipe.hpp
+ * @brief Provides message passing functionality through pipes for worker communication
+ *
+ * This header defines interfaces for synchronous and asynchronous message passing
+ * using Boost.Asio pipes and POSIX stream descriptors.
+ */
+
 #include <functional>
tests/CMakeLists.txt (1)
16-20: Consider using CMake's path handling functions

The current path construction using string concatenation could be fragile. Consider using cmake_path() or file(REAL_PATH) for more robust path handling.

Here's a more robust approach:
 set(SPIDER_TEST_WORKER_SOURCES)
+set(SPIDER_SRC_DIR "${CMAKE_CURRENT_SOURCE_DIR}/../src/spider")
 foreach(worker_source ${SPIDER_WORKER_SOURCES})
-    list(APPEND SPIDER_TEST_WORKER_SOURCES "../src/spider/${worker_source}")
+    cmake_path(APPEND SPIDER_SRC_DIR "${worker_source}" OUTPUT_VARIABLE source_path)
+    list(APPEND SPIDER_TEST_WORKER_SOURCES "${source_path}")
 endforeach()
src/spider/CMakeLists.txt (2)
38-44: Separate headers from source files

Consider creating a separate SPIDER_WORKER_HEADERS variable for .hpp files to maintain better CMake structure.
 set(SPIDER_WORKER_SOURCES
     worker/message_pipe.cpp
+    CACHE INTERNAL
+    "spider worker source files"
+)
+
+set(SPIDER_WORKER_HEADERS
     worker/message_pipe.hpp
     worker/TaskExecutorMessage.hpp
     CACHE INTERNAL
-    "spider worker source files"
+    "spider worker header files"
)
50-59: Add version constraints for spdlog dependency

While Boost dependencies are well-specified, consider adding version constraints for spdlog to ensure compatibility.
+find_package(spdlog 1.9.0 REQUIRED)
 target_link_libraries(
     spider_worker
     PRIVATE
         Boost::headers
         Boost::filesystem
         Boost::process
         Boost::program_options
         Boost::system
         spdlog::spdlog
)
tools/scripts/lib_install/spdlog.sh (1)
Line range hint 1-7: Update dependency documentation

The script now requires the fmt library due to the external spdlog configuration, but this isn't documented in the dependencies section.

Add fmt to the dependencies list:
 # Dependencies:
 # - cmake
 # - curl
 # - g++
+# - fmt
 # NOTE: Dependencies should be installed outside the script to allow the script to be largely distro-agnostic
src/spider/worker/FunctionManager.cpp (1)
13-38: Consider adding error logging in the catch block

While the error handling is comprehensive, adding logging in the catch block would help with debugging when type errors occur.
    } catch (msgpack::type_error& e) {
+       // Log the error for debugging purposes
+       spdlog::debug("Failed to unpack message: {}", e.what());
        return std::nullopt;
    }
tests/worker/test-MessagePipe.cpp (3)
36-36: Consider making the timeout duration configurable.

The 5-second timeout is hardcoded. Consider making this a named constant or configuration parameter for better maintainability.
+    static constexpr auto RESPONSE_TIMEOUT = std::chrono::seconds(5);
-    REQUIRE(future.wait_for(std::chrono::seconds(5)) == std::future_status::ready);
+    REQUIRE(future.wait_for(RESPONSE_TIMEOUT) == std::future_status::ready);
41-52: Simplify nested validation logic.

The nested if statements can be simplified since we've already verified the presence of values with REQUIRE.
-    if (response_option.has_value()) {
-        msgpack::sbuffer const& response_buffer = response_option.value();
-        REQUIRE(spider::worker::TaskExecutorResponseType::Result
-                == spider::worker::get_response_type(response_buffer));
-        std::optional<std::tuple<std::string, int>> const parse_response
-                = spider::core::response_get_result<std::tuple<std::string, int>>(response_buffer);
-        REQUIRE(parse_response.has_value());
-        if (parse_response.has_value()) {
-            std::tuple<std::string, int> result = parse_response.value();
-            REQUIRE(cSampleResult == result);
-        }
-    }
+    msgpack::sbuffer const& response_buffer = response_option.value();
+    REQUIRE(spider::worker::TaskExecutorResponseType::Result
+            == spider::worker::get_response_type(response_buffer));
+    std::optional<std::tuple<std::string, int>> const parse_response
+            = spider::core::response_get_result<std::tuple<std::string, int>>(response_buffer);
+    REQUIRE(parse_response.has_value());
+    REQUIRE(cSampleResult == parse_response.value());
16-53: Consider adding more test cases for error scenarios.

The current test case only covers the happy path. Consider adding test cases for:

Invalid message format

Pipe disconnection

Timeout scenarios

Buffer overflow conditions

Would you like me to help generate additional test cases for these scenarios?
src/spider/worker/TaskExecutorMessage.hpp (4)
6-6: Consider documenting the IWYU pragma

While the IWYU pragma is correctly used, it would be helpful to add a brief comment explaining why this pragma is necessary for MsgPack.hpp.

9-16: Add documentation for enum values

Each response type should be documented to explain its purpose and when it's used. This helps maintainers understand the message protocol.

Example:
 enum class TaskExecutorResponseType : std::uint8_t {
+    /// Indicates an invalid or unrecognized response
     Unknown = 0,
+    /// Contains the successful execution result
     Result,
+    /// Contains error information from execution
     Error,
     // ... document other states
 };
40-54: Reduce code duplication with get_response_type

Consider creating a template function to handle the common message parsing logic.
+template<typename T>
+inline auto get_message_type(msgpack::sbuffer const& buffer) -> T {
+    // NOLINTBEGIN(cppcoreguidelines-pro-type-union-access,cppcoreguidelines-pro-bounds-pointer-arithmetic)
+    msgpack::object_handle const handle = msgpack::unpack(buffer.data(), buffer.size());
+    msgpack::object const object = handle.get();
+    if (object.type != msgpack::type::ARRAY || object.via.array.size < 2) {
+        return T::Unknown;
+    }
+    msgpack::object const header = object.via.array.ptr[0];
+    try {
+        return header.as<T>();
+    } catch (msgpack::type_error const&) {
+        return T::Unknown;
+    }
+    // NOLINTEND(cppcoreguidelines-pro-type-union-access,cppcoreguidelines-pro-bounds-pointer-arithmetic)
+}
+
+inline auto get_response_type(msgpack::sbuffer const& buffer) -> TaskExecutorResponseType {
+    return get_message_type<TaskExecutorResponseType>(buffer);
+}
+
+inline auto get_request_type(msgpack::sbuffer const& buffer) -> TaskExecutorRequestType {
+    return get_message_type<TaskExecutorRequestType>(buffer);
+}
1-70: Consider adding protocol versioning and documentation

To ensure forward compatibility and maintainability:

Add a version field to the message protocol

Create protocol documentation explaining the message format, types, and their use cases

Consider adding a protocol validation layer

This will help with future protocol evolution and make it easier for new developers to understand the messaging system.
tests/worker/test-FunctionManager.cpp (3)
Line range hint 28-62: Consider enhancing error handling validation.

While the test covers both successful and error scenarios well, consider these improvements:

Validate error message content in addition to error codes

Simplify optional handling using std::optional methods

Here's a suggested improvement for the error handling section:
-    if (wrong_result_option.has_value()) {
-        REQUIRE(spider::core::FunctionInvokeError::WrongNumberOfArguments
-                == std::get<0>(wrong_result_option.value()));
-    }
+    REQUIRE(std::get<0>(wrong_result_option.value()) == 
+            spider::core::FunctionInvokeError::WrongNumberOfArguments);
+    REQUIRE(!std::get<1>(wrong_result_option.value()).empty());
64-75: Consider adding error case tests for tuple returns.

The happy path is well tested, but consider adding tests for:

Wrong argument types

Wrong number of arguments

Edge cases with empty strings

Would you like me to provide example test cases for these scenarios?

Additional test coverage needed for Data handling

Looking at the codebase, while there are some Data-related tests in test-DataStorage.cpp, they don't fully cover edge cases. The Data class implementation in src/spider/core/Data.hpp shows that it accepts any string values without validation, making it important to test boundary conditions.

Specific test cases to add:

Empty strings for both key and value

Very large strings to ensure no implicit size limitations

Strings with special characters and Unicode content

Edge cases around UUID generation when using the default constructor

Duplicate key scenarios (partially covered in DataStorage tests)

🔗 Analysis chain

Line range hint 77-93: Consider adding edge cases for Data handling.

The current test verifies basic functionality well. Consider adding tests for:

Empty data values

Very large data values

Special characters in data

Let's check if there are any existing edge case tests elsewhere:
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Search for other Data-related tests
rg -l "TEST_CASE.*Data" tests/
Length of output: 32

Script:
#!/bin/bash
# Let's try a broader search for Data-related tests
rg "TEST_CASE.*[^a-zA-Z]Data" tests/ -A 5

# Also search for any Data usage in tests
rg "spider::core::Data" tests/ -A 2
Length of output: 3701

Script:
#!/bin/bash
# Let's examine the Data class implementation to understand what edge cases should be tested
rg -A 10 "class Data" src/

# Also check if there are any size limits or validation in Data implementation
ast-grep --pattern 'Data::Data($$$) {
  $$$
}'
Length of output: 1491
src/spider/worker/message_pipe.cpp (3)
19-19: Use std::size_t instead of size_t for consistency

Consider using std::size_t to explicitly specify the namespace and maintain consistency across the codebase.

21-46: Eliminate code duplication in send_message functions

The two send_message functions have identical implementations except for the type of the pipe parameter. Consider templating the function or using a common base class to reduce code duplication and improve maintainability.

Here's how you might refactor using a template:
template <typename Pipe>
auto send_message(Pipe& pipe, msgpack::sbuffer const& request) -> bool {
    try {
        size_t const size = request.size();
        std::string const size_str = fmt::format("{:016d}", size);
        boost::asio::write(pipe, boost::asio::buffer(size_str));
        boost::asio::write(pipe, boost::asio::buffer(request.data(), size));
        return true;
    } catch (boost::system::system_error const& e) {
        spdlog::error("Failed to send message: {}", e.what());
        return false;
    }
}
91-99: Consistent error logging for EOF condition

When checking for boost::asio::error::eof, ensure that the error logging is consistent with the synchronous version. Currently, the asynchronous function may not log all relevant errors.

Consider aligning the error handling with receive_message.
src/spider/worker/FunctionManager.hpp (5)
19-20: Include guard verification for "TaskExecutorMessage.hpp"

Ensure that "TaskExecutorMessage.hpp" has proper include guards or uses #pragma once to prevent multiple inclusions, which could lead to compilation issues.

99-103: Consolidate create_error_response and create_error_buffer to reduce duplication

Both functions serve similar purposes in creating error responses. Merging them could simplify the codebase and eliminate redundancy.

210-213: Include exception details in result parsing error messages

Adding e.what() to the error message would provide more context, aiding in debugging efforts.

Apply this diff to include exception details:
} catch (msgpack::type_error& e) {
    return create_error_response(
        FunctionInvokeError::ResultParsingError,
-       fmt::format("Cannot parse result.")
+       fmt::format("Cannot parse result: {}", e.what())
    );
}
215-218: Provide exception information in function execution errors

Including the exception message e.what() in the error response would offer valuable insights during debugging.

Apply this diff to enhance the error message:
} catch (std::exception& e) {
    return create_error_response(
        FunctionInvokeError::FunctionExecutionError,
-       "Function execution error"
+       fmt::format("Function execution error: {}", e.what())
    );
}
253-254: Avoid returning raw pointers from get_function

Returning a raw pointer may lead to safety issues. Consider returning a reference or an std::optional<const Function&> to enhance safety and express the possibility of a missing function.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between b7ce91e and f7f8e3f.

📒 Files selected for processing (16)

CMakeLists.txt (3 hunks)
src/spider/.clang-format (1 hunks)
src/spider/CMakeLists.txt (2 hunks)
src/spider/core/BoostAsio.hpp (1 hunks)
src/spider/worker/FunctionManager.cpp (1 hunks)
src/spider/worker/FunctionManager.hpp (6 hunks)
src/spider/worker/TaskExecutorMessage.hpp (1 hunks)
src/spider/worker/message_pipe.cpp (1 hunks)
src/spider/worker/message_pipe.hpp (1 hunks)
tests/CMakeLists.txt (1 hunks)
tests/worker/test-FunctionManager.cpp (4 hunks)
tests/worker/test-MessagePipe.cpp (1 hunks)
tools/scripts/lib_install/boost.sh (1 hunks)
tools/scripts/lib_install/fmtlib.sh (1 hunks)
tools/scripts/lib_install/linux/install-lib.sh (1 hunks)
tools/scripts/lib_install/spdlog.sh (1 hunks)

✅ Files skipped from review due to trivial changes (1)

src/spider/core/BoostAsio.hpp

🧰 Additional context used

🪛 cppcheck (2.10-2)

tests/worker/test-FunctionManager.cpp

[error] 28-28: syntax error

(syntaxError)

🔇 Additional comments (26)

src/spider/.clang-format (1)

13-14: Verify completeness of library headers list

Let's ensure all required libraries for the message pipe feature are included in the pattern.

✅ Verification successful

Library headers pattern in .clang-format is complete

All major libraries are included: Abseil (absl), Boost (boost), Catch2 (catch2), fmt (fmt), MariaDB (mariadb), MessagePack (msgpack), and spdlog (spdlog)
The remaining includes are either:
- Standard C++ headers (like <vector>, <string>, etc.)
- Project-specific includes using quotes
- Nested includes from the already-covered libraries

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for any missing library includes that should be added to .clang-format

# Search for library includes in source files
echo "Searching for library includes..."
rg -g '*.{cpp,hpp,h}' '^\s*#include\s*<([^>]+)>' -o | \
  sed -E 's/.*<([^/]+).*/\1/' | \
  sort -u | \
  grep -vE '^(absl|boost|catch2|fmt|mariadb|msgpack|spdlog|spider|std.*)'

echo "Checking CMake dependencies..."
# Look for find_package calls in CMake files
fd -e cmake -e txt -x grep -h "find_package" {}

Length of output: 1117

Script:

#!/bin/bash
# Let's check for any includes from Abseil, Boost, or other libraries that might be hidden in subdirectories
echo "Checking for library includes with path components..."
rg -g '*.{cpp,hpp,h}' '^\s*#include\s*<([^>]+)>' -o | \
  grep -v '^#include <[^/]*>' | \
  sort -u

# Also check for any third-party includes that might be using quotes instead of angle brackets
echo "Checking for quoted includes..."
rg -g '*.{cpp,hpp,h}' '^\s*#include\s*"([^"]+)"' -o | \
  grep -v '\.\./' | \
  grep -v '\./' | \
  sort -u

Length of output: 11602

tools/scripts/lib_install/linux/install-lib.sh (1)

24-24: Verify spdlog version compatibility

The version update to spdlog 1.15.0 looks good, but let's verify its stability and compatibility with the external fmt library requirement.

✅ Verification successful

The previous results show that spdlog 1.15.0 is a stable release, but there's a known issue with VS2017 and potential compatibility concerns with older fmt versions. Let's verify the fmt library configuration in the codebase to ensure compatibility.

Version update to spdlog 1.15.0 is compatible and properly configured

The update is safe to proceed because:

The project correctly uses fmt 8.0.1, which is compatible with spdlog 1.15.0
SPDLOG_FMT_EXTERNAL is properly configured in both CMake and build scripts
While there's an open VS2017 issue, it doesn't affect this Linux-specific installation script

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify spdlog 1.15.0 compatibility and release information

# Check if spdlog 1.15.0 is tagged as stable
curl -s "https://api.github.com/repos/gabime/spdlog/releases/tags/v1.15.0" | jq -r '.prerelease'

# Check minimum required fmt version for spdlog 1.15.0
rg -A 5 "FMT_VERSION|fmt_version" "$(fd -t f -e cmake -e txt CMakeLists)"

# Verify if any known issues exist with 1.15.0
curl -s "https://api.github.com/repos/gabime/spdlog/issues?state=all" | \
  jq -r '.[] | select(.title | contains("1.15.0")) | {title: .title, state: .state, labels: [.labels[].name]}'

Length of output: 877

Script:

#!/bin/bash
# Let's check for fmt library version requirements and usage in the codebase

# Search for fmt version requirements in CMake files
rg -A 3 "find_package\(fmt" 

# Check if SPDLOG_FMT_EXTERNAL is defined
rg "SPDLOG_FMT_EXTERNAL|FMT_EXTERNAL"

# Look for fmt library usage patterns
rg -l "fmt::|#include.*fmt"

Length of output: 568

src/spider/worker/message_pipe.hpp (2)

22-24: LGTM! Proper namespace and include guard closing.

The closing comments for both namespace and include guard enhance readability in larger codebases.

12-20: Verify error handling and message protocol details.

The boolean and optional return types suggest basic error handling, but we should verify the implementation details.

✅ Verification successful

Let me gather more information about the message protocol implementation.

Error handling and message protocol implementation are well-structured and robust

The implementation shows a solid message protocol with:

Fixed-size header (16 bytes) containing message length
Proper error handling for both sync and async operations
Comprehensive error logging for all failure scenarios
Graceful EOF handling
Consistent error propagation through boolean/optional returns

Error handling covers:

System errors during read/write operations
Header parsing failures
Size mismatches
EOF conditions

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check implementation details for error handling and message protocol

# Look for error handling patterns in the implementation
echo "Checking error handling implementation..."
rg -A 5 "catch|throw|error|fail" src/spider/worker/message_pipe.cpp

# Look for message format/protocol details
echo "Checking message protocol implementation..."
ast-grep --pattern 'msgpack::pack($_)'

Length of output: 2467

Script:

#!/bin/bash
# Check the message protocol implementation details
echo "Checking message protocol implementation in message_pipe.cpp..."
rg -A 10 "send_message|receive_message" src/spider/worker/message_pipe.cpp

# Look for message size/format constants
echo "Checking message format constants..."
rg "constexpr|#define.*size" src/spider/worker/message_pipe.cpp

Length of output: 2460

tests/CMakeLists.txt (1)

9-9: LGTM: Test file addition is well-structured

The addition of test-MessagePipe.cpp follows the established pattern for test organization and is properly integrated into the test suite.

src/spider/CMakeLists.txt (2)

2-7: LGTM: Core sources configuration is well-structured

The CACHE INTERNAL setting ensures proper variable scope accessibility, and the source files are logically organized.

15-16: Verify Boost.Asio and MessagePack dependencies

The addition of BoostAsio.hpp and MsgPack.hpp suggests new dependencies. Let's verify the configuration.

✅ Verification successful

Let me check the CMakeLists.txt content to verify the dependency configuration.

Dependencies are properly configured

The dependencies for both Boost.Asio and MessagePack are correctly set up:

Boost is configured with required components including system (which provides Asio)
MessagePack is properly configured via find_package(msgpack-cxx 7.0.0 REQUIRED)
Both libraries are correctly linked:
- Boost via Boost::boost and Boost::system
- MessagePack via msgpack-cxx

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for proper dependency configuration

# Check if Boost.Asio is properly configured
rg -l "find_package.*Boost.*asio" --type cmake

# Check if MessagePack is properly configured
rg -l "find_package.*msgpack" --type cmake

# Verify usage of these dependencies
rg -l "include.*BoostAsio\.hpp"
rg -l "include.*MsgPack\.hpp"

Length of output: 594

Script:

#!/bin/bash
# Check root CMakeLists.txt for dependency configuration
cat CMakeLists.txt

# Check spider's CMakeLists.txt
cat src/spider/CMakeLists.txt

Length of output: 6302

tools/scripts/lib_install/fmtlib.sh (1)

67-67: LGTM: Position Independent Code flag is correctly added

The addition of -DCMAKE_POSITION_INDEPENDENT_CODE=ON is appropriate for building shared libraries and aligns with the project's broader changes.

src/spider/worker/FunctionManager.cpp (4)

1-11: LGTM! Well-organized header includes

The header organization follows best practices with proper grouping (standard library, external, then internal headers).

62-67: LGTM! Clean and efficient implementation

The method follows best practices with proper const usage and efficient map lookup.

71-72: LGTM! Proper use of BOOST_DLL_ALIAS

The DLL alias is correctly implemented with appropriate NOLINT directive.

40-49: Verify message size constraints

The function should validate the message size to prevent potential memory issues with very large error messages.

tests/worker/test-MessagePipe.cpp (2)

1-14: LGTM! Well-structured includes with appropriate pragmas.

The includes are well-organized and the IWYU/NOLINT pragmas are appropriately used for test code.

1-56: Verify test integration in the test suite.

Let's ensure this test is properly integrated into the build system and test suite.

✅ Verification successful

Let me check the CMakeLists.txt content to verify how the test is integrated.

Test is properly integrated in the test suite

The test file is correctly integrated:

Listed in SPIDER_TEST_SOURCES variable in tests/CMakeLists.txt
Included in the unitTest executable target
Has proper library linkage with required dependencies (Boost, Catch2)
Is the only MessagePipe test file, which is appropriate given its comprehensive coverage of the message pipe functionality

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify test integration in CMake configuration

# Check if the test file is included in CMakeLists.txt
echo "Checking test file inclusion in CMakeLists.txt..."
fd -t f "CMakeLists.txt" tests/ -x grep -l "test-MessagePipe.cpp"

# Check for other message pipe related tests
echo "Checking for related test files..."
fd -t f "test.*MessagePipe.*\\.cpp$" tests/

Length of output: 351

Script:

#!/bin/bash
# Check the content of tests/CMakeLists.txt
cat tests/CMakeLists.txt

Length of output: 962

src/spider/worker/TaskExecutorMessage.hpp (1)

66-68: LGTM! MessagePack registration is properly implemented

The enum registration is correctly placed in the global namespace with clear documentation.

tests/worker/test-FunctionManager.cpp (1)

24-26: LGTM! Test function registration is comprehensive.

The test functions are well-structured to cover different scenarios including POD types, tuple returns, and custom Data objects.

CMakeLists.txt (2)

58-60: LGTM! Position Independent Code is essential for modern builds.

Enabling PIC globally is a good practice, especially when building components that might be used in shared libraries or plugins.

124-124: LGTM! Consistent spdlog configuration.

The SPDLOG_FMT_EXTERNAL definition is correctly added, matching the existing configuration set above.

src/spider/worker/FunctionManager.hpp (8)

93-95: response_get_error function implementation is sound

The addition of response_get_error enhances error handling by providing a structured way to extract error information from the buffer.

96-98: create_error_response improves error message consistency

The create_error_response function effectively standardizes error responses, which aids in consistent error handling across the system.

112-114: Verify robustness of array size check in response_get_result

The condition object.via.array.size == 2 assumes the array always contains exactly two elements. Ensure this assumption holds true in all contexts to prevent potential parsing errors.

116-119: Confirm accuracy of response type comparison

Double-check that the comparison with TaskExecutorResponseType::Result correctly identifies the expected response type and that the as<>() conversion is safe and reliable.

130-136: create_result_response function is well-structured

The function correctly packs the result into a msgpack::sbuffer, ensuring that the response is properly formatted.

149-158: create_args_request efficiently packs arguments

The implementation of create_args_request effectively serializes the arguments into a request buffer, facilitating seamless communication.

208-208: Ensure exception safety when returning results

While returning results, confirm that any exceptions during the packing process are appropriately handled to prevent unexpected crashes.

249-250: register_function_invoker method is appropriately implemented

The addition of register_function_invoker allows for direct registration of function invokers, increasing flexibility.

coderabbitai · 2024-11-29T17:04:50Z

src/spider/worker/message_pipe.hpp

+auto send_message(boost::asio::writable_pipe& pipe, msgpack::sbuffer const& request) -> bool;
+
+auto send_message(boost::asio::posix::stream_descriptor& fd, msgpack::sbuffer const& request)
+        -> bool;
+
+auto receive_message_async(std::reference_wrapper<boost::asio::readable_pipe> pipe
+) -> boost::asio::awaitable<std::optional<msgpack::sbuffer>>;
+
+auto receive_message(boost::asio::posix::stream_descriptor& fd) -> std::optional<msgpack::sbuffer>;


🛠️ Refactor suggestion

Add comprehensive function documentation and consider error handling strategy.

While the function signatures are well-designed, they would benefit from detailed documentation and explicit error handling guidance.

Add Doxygen-style documentation for each function:

+/** + * @brief Sends a message through a writable pipe + * @param pipe The pipe to write the message to + * @param request The message buffer to send + * @return true if the message was sent successfully, false otherwise + */ auto send_message(boost::asio::writable_pipe& pipe, msgpack::sbuffer const& request) -> bool; +/** + * @brief Sends a message through a POSIX stream descriptor + * @param fd The file descriptor to write the message to + * @param request The message buffer to send + * @return true if the message was sent successfully, false otherwise + */ auto send_message(boost::asio::posix::stream_descriptor& fd, msgpack::sbuffer const& request) -> bool; +/** + * @brief Asynchronously receives a message from a readable pipe + * @param pipe Reference wrapper to the pipe to read from + * @return An awaitable that resolves to an optional message buffer + */ auto receive_message_async(std::reference_wrapper<boost::asio::readable_pipe> pipe ) -> boost::asio::awaitable<std::optional<msgpack::sbuffer>>; +/** + * @brief Synchronously receives a message from a POSIX stream descriptor + * @param fd The file descriptor to read from + * @return An optional containing the received message buffer if successful + */ auto receive_message(boost::asio::posix::stream_descriptor& fd) -> std::optional<msgpack::sbuffer>;

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

auto send_message(boost::asio::writable_pipe& pipe, msgpack::sbuffer const& request) -> bool;

auto send_message(boost::asio::posix::stream_descriptor& fd, msgpack::sbuffer const& request)

-> bool;

auto receive_message_async(std::reference_wrapper<boost::asio::readable_pipe> pipe

) -> boost::asio::awaitable<std::optional<msgpack::sbuffer>>;

auto receive_message(boost::asio::posix::stream_descriptor& fd) -> std::optional<msgpack::sbuffer>;

/**

* @brief Sends a message through a writable pipe

* @param pipe The pipe to write the message to

* @param request The message buffer to send

* @return true if the message was sent successfully, false otherwise

*/

auto send_message(boost::asio::writable_pipe& pipe, msgpack::sbuffer const& request) -> bool;

/**

* @brief Sends a message through a POSIX stream descriptor

* @param fd The file descriptor to write the message to

* @param request The message buffer to send

* @return true if the message was sent successfully, false otherwise

*/

auto send_message(boost::asio::posix::stream_descriptor& fd, msgpack::sbuffer const& request)

-> bool;

/**

* @brief Asynchronously receives a message from a readable pipe

* @param pipe Reference wrapper to the pipe to read from

* @return An awaitable that resolves to an optional message buffer

*/

auto receive_message_async(std::reference_wrapper<boost::asio::readable_pipe> pipe

) -> boost::asio::awaitable<std::optional<msgpack::sbuffer>>;

/**

* @brief Synchronously receives a message from a POSIX stream descriptor

* @param fd The file descriptor to read from

* @return An optional containing the received message buffer if successful

*/

auto receive_message(boost::asio::posix::stream_descriptor& fd) -> std::optional<msgpack::sbuffer>;

coderabbitai · 2024-11-29T17:04:50Z

tests/CMakeLists.txt

+target_link_libraries(
+    unitTest
+    PRIVATE
+    Boost::headers
+    Boost::filesystem
+    Boost::program_options
+    Boost::process
+    Boost::system
+    spdlog::spdlog
+)


💡 Codebase verification

Remove unused Boost components from test dependencies

Based on the code analysis, some Boost components are not being used in the test suite:

Boost::filesystem and Boost::program_options are not used anywhere in the tests

Only the following Boost components are actually needed:

Boost::headers (for UUID functionality)

Boost::process and Boost::system (for asio functionality in MessagePipe tests)

Additionally, while spdlog is linked, it appears to be unused in the test code and can be removed.

🔗 Analysis chain

Verify usage of all Boost components

While the dependencies are properly configured, let's verify that all Boost components are actually needed by the test suite.

Let me gather more information about the actual usage of Boost components in the test files.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash # Description: Check usage of Boost components in test files echo "Checking Boost::filesystem usage..." rg -l "boost/filesystem" tests/ echo "Checking Boost::program_options usage..." rg -l "boost/program_options" tests/ echo "Checking Boost::process usage..." rg -l "boost/process" tests/ echo "Checking Boost::system usage..." rg -l "boost/system" tests/ echo "Checking spdlog usage..." rg -l "spdlog" tests/

Length of output: 547

Script:

#!/bin/bash # Let's check the actual content of test files to see Boost usage echo "Searching for Boost headers and usage..." rg "boost" tests/ -A 3 echo -e "\nSearching for spdlog usage..." rg "spdlog" tests/ -A 3

Length of output: 7844

coderabbitai · 2024-11-29T17:04:50Z

tools/scripts/lib_install/fmtlib.sh

+cmake -DCMAKE_POSITION_INDEPENDENT_CODE=ON ../
 make -j${num_cpus}


🛠️ Refactor suggestion

Consider enhancing build verification and cleanup

The script should verify the build success and clean up on failure. Consider adding:

Build status verification

Cleanup trap for failures

Apply this diff to improve error handling:

cmake -DCMAKE_POSITION_INDEPENDENT_CODE=ON ../ +if [ $? -ne 0 ]; then + echo "CMAKE configuration failed" + exit 1 +fi make -j${num_cpus} +if [ $? -ne 0 ]; then + echo "Build failed" + exit 1 +fi

And at the start of the script, after set -e:

# Exit on any error set -e + +# Cleanup on error +cleanup() { + if [ $? -ne 0 ]; then + echo "Error occurred, cleaning up..." + rm -rf $temp_dir + fi +} +trap cleanup EXIT

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai · 2024-11-29T17:04:50Z

tools/scripts/lib_install/spdlog.sh

@@ -71,7 +71,7 @@ fi
 cd ${extracted_dir}
 mkdir -p build
 cd build
-cmake ../
+cmake -DSPDLOG_FMT_EXTERNAL=ON ..


💡 Codebase verification

Version mismatch between build script and CMake requirement

The build system has a version mismatch for the fmt library:

CMakeLists.txt requires fmt version 8.0.1

install-lib.sh installs fmt version 11.0.2

While the newer version should be backwards compatible, it's recommended to:

Update CMakeLists.txt to require fmt 11.0.2, or

Update install-lib.sh to install fmt 8.0.1

Otherwise, the -DSPDLOG_FMT_EXTERNAL=ON flag is properly supported as:

fmtlib.sh is installed before spdlog.sh

The installation scripts handle dependencies correctly

🔗 Analysis chain

Verify fmt library dependency for external usage

The addition of -DSPDLOG_FMT_EXTERNAL=ON is appropriate for the messaging pipe feature. However, we should ensure the fmt library dependency is properly handled.

Let's verify the fmt library dependency:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash # Description: Check for fmt library installation script and dependencies # Check for fmt library installation script echo "Checking for fmt library installation script..." fd -t f "fmt.sh" "tools/scripts/lib_install" # Check if fmt is mentioned in any CMake files echo "Checking for fmt in CMake files..." rg -l "fmt" -g "CMakeLists.txt" # Check for fmt package requirements echo "Checking for fmt package requirements..." rg "fmt" -g "package.xml|requirements.txt|dependencies"

Length of output: 593

Script:

#!/bin/bash # Let's check the content of CMakeLists.txt files that mention fmt echo "Checking CMakeLists.txt content for fmt dependency..." rg -A 5 -B 5 "fmt" CMakeLists.txt src/spider/CMakeLists.txt # Check if fmt library is installed before spdlog echo "Checking spdlog.sh script for fmt dependency handling..." cat tools/scripts/lib_install/spdlog.sh # Look for any fmt installation scripts with different naming echo "Looking for any fmt-related installation scripts..." fd -t f ".*fmt.*" tools/scripts/lib_install/

Length of output: 4134

Script:

#!/bin/bash # Let's check the fmtlib installation script echo "Checking fmtlib.sh content..." cat tools/scripts/lib_install/fmtlib.sh # Check if fmtlib is installed before spdlog in any build scripts echo "Checking for build order in scripts..." rg -l "fmtlib.sh|spdlog.sh" tools/scripts/ # Verify fmt version requirement matches echo "Checking fmt version requirements..." rg "fmt.*8\.0\.1" -g "*.sh"

Length of output: 2407

Script:

#!/bin/bash # Check the installation order in the Linux install script echo "Checking library installation order..." cat tools/scripts/lib_install/linux/install-lib.sh # Check if there are any version specifications for fmt in CMake echo "Checking fmt version in CMake files..." rg "fmt.*VERSION" -A 2 -B 2 CMakeLists.txt

Length of output: 1287

coderabbitai · 2024-11-29T17:04:50Z

src/spider/worker/FunctionManager.cpp

+void create_error_buffer(
+        FunctionInvokeError error,
+        std::string const& message,
+        msgpack::sbuffer& buffer
+) {
+    msgpack::packer packer{buffer};
+    packer.pack_array(2);
+    packer.pack(error);
+    packer.pack(message);
+}


💡 Codebase verification

⚠️ Potential issue

Array size difference is intentional, but buffer capacity check is recommended

The difference in array sizes between create_error_buffer (2 elements) and create_error_response (3 elements) is intentional:

create_error_response includes the response type as its first element (TaskExecutorResponseType::Error)

create_error_buffer is used when the response type is already known/written

However, the second concern about buffer capacity remains valid. Consider adding a check or using a buffer with automatic growth.

🔗 Analysis chain

Review array size inconsistency and add buffer capacity check

Two concerns:

This function packs 2 elements while create_error_response packs 3 elements. Is this intentional?

Consider checking buffer capacity before writing to prevent overflow.

Let me gather more information about the create_error_response function implementation to understand the array size difference.

Let me search for the implementation of create_error_response to compare with create_error_buffer.

Let me try one more search with a different approach to find the implementation.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash # Search for other uses of these functions to understand the expected array sizes ast-grep --pattern 'create_error_buffer($$$)' -A 5 ast-grep --pattern 'create_error_response($$$)' -A 5

Length of output: 3579

Script:

#!/bin/bash # Search for create_error_response implementation ast-grep --pattern 'create_error_response($$$)' --lang cpp rg "create_error_response.*\{" -A 5

Length of output: 2217

Script:

#!/bin/bash # Search for create_error_response implementation in cpp files rg -t cpp "create_error_response.*\{" -B 2 -A 10 # Also check the header file for its declaration rg -t hpp "create_error_response" -B 2 -A 2

Length of output: 122

Script:

#!/bin/bash # Search for create_error_response implementation rg "create_error_response" -A 10 src/spider/worker/FunctionManager.cpp # Also look for any msgpack array packing in the response creation rg "pack_array" -B 5 -A 5 src/spider/worker/FunctionManager.cpp

Length of output: 1113

coderabbitai · 2024-11-29T17:04:51Z

src/spider/worker/message_pipe.cpp

+auto receive_message_async(std::reference_wrapper<boost::asio::readable_pipe> pipe
+) -> boost::asio::awaitable<std::optional<msgpack::sbuffer>> {
+    std::array<char, cHeaderSize> header_buffer{0};
+    // NOLINTNEXTLINE(clang-analyzer-core.NullDereference)
+    auto [header_ec, header_n] = co_await boost::asio::async_read(


🛠️ Refactor suggestion

Investigate suppressed null dereference warning

The comment // NOLINTNEXTLINE(clang-analyzer-core.NullDereference) suppresses a potential null dereference warning. Instead of suppressing the warning, consider addressing the underlying issue to ensure code safety.

Review whether pipe.get() could be null and add necessary checks or assertions.

coderabbitai · 2024-11-29T17:04:51Z

src/spider/worker/message_pipe.cpp

+    size_t response_size = 0;
+    try {
+        response_size = std::stoul(std::string{header_buffer.data(), cHeaderSize});
+    } catch (std::exception& e) {
+        spdlog::error(
+                "Cannot parse header: {} {}",
+                e.what(),
+                std::string{header_buffer.data(), cHeaderSize}
+        );
+        co_return std::nullopt;
+    }


⚠️ Potential issue

Handle exceptions when parsing response size

Similar to the synchronous version, ensure that exceptions thrown by std::stoul when parsing response_size are properly handled. A malformed header could cause unexpected termination.

Ensure a try-catch block is present and include size checks:

try { response_size = std::stoul(std::string{header_buffer.data(), cHeaderSize}); if (response_size >= max_allowed_size) { spdlog::error("Response size exceeds maximum allowed limit: {}", response_size); co_return std::nullopt; } } catch (std::exception& e) { spdlog::error( "Cannot parse header: {} {}", e.what(), std::string{header_buffer.data(), cHeaderSize} ); co_return std::nullopt; }

coderabbitai · 2024-11-29T17:04:51Z

src/spider/worker/message_pipe.cpp

+    if (response_size != response_n) {
+        spdlog::error(
+                "Response read size not match. Expect {}. Got {}.",
+                response_size,
+                response_n
+        );
+        co_return std::nullopt;
+    }


⚠️ Potential issue

Ensure complete asynchronous read of the message body

After the asynchronous read, verify that the number of bytes read matches the expected response_size to handle any incomplete reads.

Modify the check:

if (response_size != response_n) { spdlog::error( "Response read size does not match. Expected {} bytes, got {} bytes.", response_size, response_n ); co_return std::nullopt; }

coderabbitai · 2024-11-29T17:04:52Z

src/spider/worker/FunctionManager.hpp

+// NOLINTBEGIN(cppcoreguidelines-macro-usage)
+#define CONCAT_DIRECT(s1, s2) s1##s2
+#define CONCAT(s1, s2) CONCAT_DIRECT(s1, s2)
+#define ANONYMOUS_VARIABLE(str) CONCAT(str, __COUNTER__)
+// NOLINTEND(cppcoreguidelines-macro-usage)


🛠️ Refactor suggestion

Replace macros with constexpr functions for better type safety

Using macros like CONCAT, CONCAT_DIRECT, and ANONYMOUS_VARIABLE can lead to less type-safe code and harder-to-debug errors. Consider replacing them with constexpr functions or inline variables to enhance type safety and maintainability.

coderabbitai · 2024-11-29T17:04:52Z

src/spider/worker/FunctionManager.hpp

+    inline const auto ANONYMOUS_VARIABLE(var) \
+            = spider::core::FunctionManager::get_instance().register_function(#func, func);


🛠️ Refactor suggestion

Avoid macros in REGISTER_TASK; use templates instead

The REGISTER_TASK macro could be replaced with a templated function to reduce macro usage and improve code clarity. This change would enhance type safety and make the code easier to debug.

coderabbitai

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (6)

src/spider/worker/message_pipe.cpp (2)

120-122: Maintain consistency with synchronous version

The async version checks for zero body size while the synchronous version doesn't. This inconsistency could lead to different behaviour between the two implementations.

Either add the check to the synchronous version or document why the behaviours differ.

1-148: Add documentation for message format and size limits

The code would benefit from documentation describing:

The message format specification (16-byte header + variable body)

Maximum message size limits and their rationale

Error handling behaviour and recovery strategies

Consider adding a detailed comment block at the top of the file explaining these aspects.
src/spider/worker/FunctionManager.hpp (4)
93-103: Add documentation for error handling functions

These public API functions would benefit from documentation explaining:

Expected input format

Possible error conditions

Return value semantics

112-118: Replace magic numbers with named constants

The array size check uses magic number '2'. Consider defining a constant for better maintainability:
+    static constexpr std::size_t EXPECTED_RESPONSE_SIZE = 2;
-    if (msgpack::type::ARRAY != object.type || 2 != object.via.array.size) {
+    if (msgpack::type::ARRAY != object.type || EXPECTED_RESPONSE_SIZE != object.via.array.size) {
Line range hint 177-218: Enhance error messages with more context

Consider adding more detailed error messages that include:

The actual values that caused parsing failures

The specific type conversion that failed

Stack trace information when available

253-253: Document pointer ownership semantics

Please add documentation clarifying:

Pointer ownership (who owns the returned Function pointer)

Lifetime guarantees

Whether the pointer can be null

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 6fb2f0b and 034e309.

📒 Files selected for processing (4)

src/spider/CMakeLists.txt (2 hunks)
src/spider/worker/FunctionManager.cpp (1 hunks)
src/spider/worker/FunctionManager.hpp (6 hunks)
src/spider/worker/message_pipe.cpp (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

src/spider/CMakeLists.txt
src/spider/worker/FunctionManager.cpp

🔇 Additional comments (5)

src/spider/worker/message_pipe.cpp (3)

1-19: LGTM! Header organization and namespace setup look good.

The includes are well-organized, and the IWYU pragmas are appropriately used for managing header dependencies.

21-32: ⚠️ Potential issue

Add size validation before formatting message header

The current implementation doesn't validate the message size before formatting it into the header string. This could lead to issues with very large messages.

Add size validation before formatting:

 auto send_message(boost::asio::writable_pipe& pipe, msgpack::sbuffer const& request) -> bool {
     try {
         size_t const size = request.size();
+        if (size >= 1e16) {
+            spdlog::error("Message size exceeds maximum allowed limit: {}", size);
+            return false;
+        }
         std::string const size_str = fmt::format("{:016d}", size);

90-94: ⚠️ Potential issue

Address suppressed null dereference warning

The suppressed warning suggests a potential null dereference issue that should be addressed rather than suppressed.

Add explicit null check:

-    // NOLINTNEXTLINE(clang-analyzer-core.NullDereference)
+    if (&pipe.get() == nullptr) {
+        spdlog::error("Null pipe reference");
+        co_return std::nullopt;
+    }
     auto [header_ec, header_n] = co_await boost::asio::async_read(

src/spider/worker/FunctionManager.hpp (2)

21-29: Consider modernizing macro usage with constexpr alternatives

The use of macros for concatenation and variable generation could be replaced with more type-safe alternatives.

149-158: Consider adding buffer size checks

For safety, consider adding a maximum limit to the number of arguments to prevent potential buffer overflow issues.

coderabbitai · 2024-11-30T05:11:19Z

src/spider/worker/message_pipe.cpp

+auto send_message(boost::asio::posix::stream_descriptor& fd, msgpack::sbuffer const& request)
+        -> bool {
+    try {
+        size_t const size = request.size();
+        std::string const size_str = fmt::format("{:016d}", size);
+        boost::asio::write(fd, boost::asio::buffer(size_str));
+        boost::asio::write(fd, boost::asio::buffer(request.data(), size));
+        return true;
+    } catch (boost::system::system_error const& e) {
+        spdlog::error("Failed to send message: {}", e.what());
+        return false;
+    }
+}


🛠️ Refactor suggestion

Refactor duplicated send_message logic

Both send_message overloads share identical logic, differing only in the type of the output device.

Consider implementing a template function to reduce code duplication:

+template<typename T> +auto send_message_impl(T& output, msgpack::sbuffer const& request) -> bool { + try { + size_t const size = request.size(); + if (size >= 1e16) { + spdlog::error("Message size exceeds maximum allowed limit: {}", size); + return false; + } + std::string const size_str = fmt::format("{:016d}", size); + boost::asio::write(output, boost::asio::buffer(size_str)); + boost::asio::write(output, boost::asio::buffer(request.data(), size)); + return true; + } catch (boost::system::system_error const& e) { + spdlog::error("Failed to send message: {}", e.what()); + return false; + } +} + +auto send_message(boost::asio::writable_pipe& pipe, msgpack::sbuffer const& request) -> bool { + return send_message_impl(pipe, request); +} + +auto send_message(boost::asio::posix::stream_descriptor& fd, msgpack::sbuffer const& request) -> bool { + return send_message_impl(fd, request); +}

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai · 2024-11-30T05:11:19Z

src/spider/worker/message_pipe.cpp

+auto receive_message(boost::asio::posix::stream_descriptor& fd) -> std::optional<msgpack::sbuffer> {
+    std::array<char, cHeaderSize> header_buffer{0};
+    try {
+        boost::asio::read(fd, boost::asio::buffer(header_buffer));
+    } catch (boost::system::system_error& e) {
+        if (boost::asio::error::eof != e.code()) {
+            spdlog::error("Fail to read header: {}", e.what());
+        }
+        return std::nullopt;
+    }
+    size_t body_size = 0;
+    try {
+        body_size = std::stoul(std::string{header_buffer.data(), cHeaderSize});
+    } catch (std::exception& e) {
+        spdlog::error(
+                "Cannot parse header: {} {}",
+                e.what(),
+                std::string{header_buffer.data(), cHeaderSize}
+        );
+        return std::nullopt;
+    }
+
+    std::vector<char> body_buffer(body_size);
+    try {
+        size_t body_read_size = boost::asio::read(fd, boost::asio::buffer(body_buffer));
+        if (body_read_size != body_size) {
+            spdlog::error(
+                    "Message body read size mismatch. Expect {}. Got {}",
+                    body_size,
+                    body_read_size
+            );
+            return std::nullopt;
+        }
+    } catch (boost::system::system_error& e) {
+        spdlog::error("Fail to read response body: {}", e.what());
+        return std::nullopt;
+    }
+    msgpack::sbuffer buffer;
+    buffer.write(body_buffer.data(), body_buffer.size());
+    return buffer;
+}


⚠️ Potential issue

Add memory safety checks for message body allocation

The function allocates a buffer based on the parsed size without checking if the allocation is reasonable or possible.

Add memory safety checks:

try { body_size = std::stoul(std::string{header_buffer.data(), cHeaderSize}); + constexpr size_t max_message_size = 1024 * 1024 * 1024; // 1GB or adjust as needed + if (body_size > max_message_size) { + spdlog::error("Message size {} exceeds maximum allowed size {}", body_size, max_message_size); + return std::nullopt; + } } catch (std::exception& e) {

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

auto receive_message(boost::asio::posix::stream_descriptor& fd) -> std::optional<msgpack::sbuffer> {

std::array<char, cHeaderSize> header_buffer{0};

try {

boost::asio::read(fd, boost::asio::buffer(header_buffer));

} catch (boost::system::system_error& e) {

if (boost::asio::error::eof != e.code()) {

spdlog::error("Fail to read header: {}", e.what());

}

return std::nullopt;

}

size_t body_size = 0;

try {

body_size = std::stoul(std::string{header_buffer.data(), cHeaderSize});

} catch (std::exception& e) {

spdlog::error(

"Cannot parse header: {} {}",

e.what(),

std::string{header_buffer.data(), cHeaderSize}

);

return std::nullopt;

}

std::vector<char> body_buffer(body_size);

try {

size_t body_read_size = boost::asio::read(fd, boost::asio::buffer(body_buffer));

if (body_read_size != body_size) {

spdlog::error(

"Message body read size mismatch. Expect {}. Got {}",

body_size,

body_read_size

);

return std::nullopt;

}

} catch (boost::system::system_error& e) {

spdlog::error("Fail to read response body: {}", e.what());

return std::nullopt;

}

msgpack::sbuffer buffer;

buffer.write(body_buffer.data(), body_buffer.size());

return buffer;

}

auto receive_message(boost::asio::posix::stream_descriptor& fd) -> std::optional<msgpack::sbuffer> {

std::array<char, cHeaderSize> header_buffer{0};

try {

boost::asio::read(fd, boost::asio::buffer(header_buffer));

} catch (boost::system::system_error& e) {

if (boost::asio::error::eof != e.code()) {

spdlog::error("Fail to read header: {}", e.what());

}

return std::nullopt;

}

size_t body_size = 0;

try {

body_size = std::stoul(std::string{header_buffer.data(), cHeaderSize});

constexpr size_t max_message_size = 1024 * 1024 * 1024; // 1GB or adjust as needed

if (body_size > max_message_size) {

spdlog::error("Message size {} exceeds maximum allowed size {}", body_size, max_message_size);

return std::nullopt;

}

} catch (std::exception& e) {

spdlog::error(

"Cannot parse header: {} {}",

e.what(),

std::string{header_buffer.data(), cHeaderSize}

);

return std::nullopt;

}

std::vector<char> body_buffer(body_size);

try {

size_t body_read_size = boost::asio::read(fd, boost::asio::buffer(body_buffer));

if (body_read_size != body_size) {

spdlog::error(

"Message body read size mismatch. Expect {}. Got {}",

body_size,

body_read_size

);

return std::nullopt;

}

} catch (boost::system::system_error& e) {

spdlog::error("Fail to read response body: {}", e.what());

return std::nullopt;

}

msgpack::sbuffer buffer;

buffer.write(body_buffer.data(), body_buffer.size());

return buffer;

}

…ang-tidy

coderabbitai

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (4)

src/spider/worker/FunctionManager.hpp (4)
93-104: Add documentation for public API functions

These public API functions would benefit from documentation explaining:

Parameters and return values

Error conditions and handling

Usage examples

112-118: Replace magic number with named constant

The array size check uses a magic number '2'. Consider defining a named constant for better maintainability.
+ static constexpr std::size_t RESPONSE_ARRAY_SIZE = 2;
- if (msgpack::type::ARRAY != object.type || 2 != object.via.array.size) {
+ if (msgpack::type::ARRAY != object.type || RESPONSE_ARRAY_SIZE != object.via.array.size) {
130-136: Implementation looks good, consider consistent constant usage

The implementation is correct, but should use the same named constant for array size as suggested above.

Line range hint 177-218: Enhance error messages for better debugging

Consider including more context in error messages:

For ArgumentParsingError: Include the actual parsing error message from the exception

For FunctionExecutionError: Include the exception message from std::exception
- fmt::format("Cannot parse arguments.")
+ fmt::format("Cannot parse arguments: {}", e.what())
- "Function execution error"
+ fmt::format("Function execution error: {}", e.what())

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 7b55d7b and 56f5684.

📒 Files selected for processing (2)

src/spider/worker/FunctionManager.hpp (6 hunks)
src/spider/worker/message_pipe.cpp (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

src/spider/worker/message_pipe.cpp

🔇 Additional comments (3)

src/spider/worker/FunctionManager.hpp (3)

21-29: Consider modernizing macro-based implementations

The macro-based approach for variable name generation could be replaced with more type-safe alternatives.

149-158: LGTM! Clean and efficient implementation

The implementation makes good use of variadic templates and parameter pack expansion.

249-255: Verify type safety of direct function registration

The register_function_invoker method allows bypassing the type checking provided by FunctionInvoker. Ensure that all direct registrations maintain proper type safety.

coderabbitai

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)

tests/worker/test-FunctionManager.cpp (1)

Line range hint 28-61: Consider adding test case descriptions using SECTION

While the test coverage is comprehensive, consider using Catch2's SECTION macro to better organize and describe different test scenarios.

Here's how you could refactor this:

 TEST_CASE("Register and run function with POD inputs", "[core]") {
     spider::core::FunctionManager const& manager = spider::core::FunctionManager::get_instance();
 
+    SECTION("Unregistered function returns nullptr") {
     // Get the function that has not been registered should return nullptr
     REQUIRE(nullptr == manager.get_function("foo"));
+    }
 
+    SECTION("Successful execution with correct arguments") {
     // Get the registered function should succeed
     spider::core::Function const* function = manager.get_function("int_test");
     spider::core::ArgsBuffer const args_buffers = spider::core::create_args_buffers(2, 3);
     msgpack::sbuffer const result = (*function)(args_buffers);
     REQUIRE(5 == spider::core::response_get_result<int>(result).value_or(0));
+    }

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 56f5684 and 88b37a3.

📒 Files selected for processing (1)

tests/worker/test-FunctionManager.cpp (4 hunks)

🧰 Additional context used

🪛 cppcheck (2.10-2)

tests/worker/test-FunctionManager.cpp

[error] 28-28: syntax error

(syntaxError)

🔇 Additional comments (2)

tests/worker/test-FunctionManager.cpp (2)

24-26: LGTM! Good test coverage across different data types.

The registration of test functions provides comprehensive coverage across primitive types (int_test), compound types (tuple_ret_test), and custom types (data_test).

Line range hint 77-93: Enhance const correctness in data test

While the test is thorough, consider improving const correctness for better safety and clarity.

Apply this change:

-    spider::core::Data data{"test"};
+    spider::core::Data const data{"test"};

Also, verify that the Data class's get_id() and get_value() methods are marked as const:

✅ Verification successful

Both get_id() and get_value() are already const-correct, only data variable needs const

The Data class implementation in src/spider/core/Data.hpp shows that both get_id() and get_value() methods are properly marked as const. The only improvement needed is making the data variable const in the test case, as originally suggested.

-    spider::core::Data data{"test"};
+    spider::core::Data const data{"test"};

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if Data class methods are properly marked as const
ast-grep --pattern 'class Data {
  $$$
  get_id() const {
    $$$
  }
  $$$
  get_value() const {
    $$$
  }
  $$$
}'

Length of output: 120

Script:

#!/bin/bash
# Let's try to find the Data class definition first
rg -l "class Data"

# Then let's look for the method declarations
rg "get_id|get_value" -A 2

Length of output: 9341

Script:

#!/bin/bash
# Let's examine the Data class definition in core
cat src/spider/core/Data.hpp

Length of output: 1940

coderabbitai · 2024-11-30T06:15:59Z

tests/worker/test-FunctionManager.cpp

 TEST_CASE("Register and run function with tuple return", "[core]") {
-    SPIDER_WORKER_REGISTER_TASK(tuple_ret_test);
-
-    spider::core::FunctionManager& manager = spider::core::FunctionManager::get_instance();
+    spider::core::FunctionManager const& manager = spider::core::FunctionManager::get_instance();

    spider::core::Function const* function = manager.get_function("tuple_ret_test");

    spider::core::ArgsBuffer const args_buffers = spider::core::create_args_buffers("test", 3);
    msgpack::sbuffer const result = (*function)(args_buffers);
    REQUIRE(std::make_tuple("test", 3)
-            == spider::core::buffer_get<std::tuple<std::string, int>>(result).value_or(
+            == spider::core::response_get_result<std::tuple<std::string, int>>(result).value_or(
                    std::make_tuple("", 0)
            ));
 }


🛠️ Refactor suggestion

Consider adding error cases for tuple return test

The current test only covers the happy path. Consider adding test cases for error scenarios, such as incorrect argument types or missing arguments.

Here's a suggested addition:

// Test with wrong argument type spider::core::ArgsBuffer const wrong_args = spider::core::create_args_buffers(42, 3); // int instead of string msgpack::sbuffer const error_result = (*function)(wrong_args); auto error = spider::core::response_get_error(error_result); REQUIRE(error.has_value()); REQUIRE(spider::core::FunctionInvokeError::ArgumentParsingError == std::get<0>(error.value()));

sitaowang1998 added 5 commits November 29, 2024 10:48

Add message pipe

0186863

Add libraries to clang format

95b6ba8

Add TaskExecutorMessage and update function manager

31a74d2

Update build scripts

dbc7d7b

Bump spdlog version to solve conflict with fmtlib. Add using external fmt to spdlog build. Add PIC in fmt build. Add process to boost build.

Add worker files in test cmake build

f7f8e3f

coderabbitai bot reviewed Nov 29, 2024

View reviewed changes

sitaowang1998 added 3 commits November 29, 2024 19:18

Fix cmake lint

6fb2f0b

Merge branch 'main' into message_pipe

872616f

Improve read size check

034e309

coderabbitai bot reviewed Nov 30, 2024

View reviewed changes

sitaowang1998 added 2 commits November 30, 2024 05:17

Use template to remove duplicated send_message code

7b55d7b

Rename worker register task due to error in previous merge and fix cl…

56f5684

…ang-tidy

coderabbitai bot reviewed Nov 30, 2024

View reviewed changes

Fix renaming

88b37a3

coderabbitai bot reviewed Nov 30, 2024

View reviewed changes

sitaowang1998 merged commit c0c9fbc into y-scope:main Nov 30, 2024
2 checks passed

sitaowang1998 deleted the message_pipe branch November 30, 2024 16:45

This was referenced Nov 30, 2024

feat: Add task executor #28

Merged

feat: Add scheduler scheduling #31

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Add message pipe #27

feat: Add message pipe #27

sitaowang1998 commented Nov 29, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 29, 2024 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

coderabbitai bot Nov 29, 2024

coderabbitai bot Nov 29, 2024

coderabbitai bot Nov 29, 2024

coderabbitai bot Nov 29, 2024

coderabbitai bot Nov 29, 2024

coderabbitai bot Nov 29, 2024

coderabbitai bot Nov 29, 2024

coderabbitai bot Nov 29, 2024

coderabbitai bot Nov 29, 2024

coderabbitai bot Nov 29, 2024

coderabbitai bot left a comment

coderabbitai bot Nov 30, 2024

coderabbitai bot Nov 30, 2024

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot Nov 30, 2024

-auto send_message(boost::asio::writable_pipe& pipe, msgpack::sbuffer const& request) -> bool;
-auto send_message(boost::asio::posix::stream_descriptor& fd, msgpack::sbuffer const& request)
-        -> bool;
-auto receive_message_async(std::reference_wrapper<boost::asio::readable_pipe> pipe
-) -> boost::asio::awaitable<std::optional<msgpack::sbuffer>>;
-auto receive_message(boost::asio::posix::stream_descriptor& fd) -> std::optional<msgpack::sbuffer>;
+/**
+ * @brief Sends a message through a writable pipe
+ * @param pipe The pipe to write the message to
+ * @param request The message buffer to send
+ * @return true if the message was sent successfully, false otherwise
+ */
+auto send_message(boost::asio::writable_pipe& pipe, msgpack::sbuffer const& request) -> bool;
+/**
+ * @brief Sends a message through a POSIX stream descriptor
+ * @param fd The file descriptor to write the message to
+ * @param request The message buffer to send
+ * @return true if the message was sent successfully, false otherwise
+ */
+auto send_message(boost::asio::posix::stream_descriptor& fd, msgpack::sbuffer const& request)
+        -> bool;
+/**
+ * @brief Asynchronously receives a message from a readable pipe
+ * @param pipe Reference wrapper to the pipe to read from
+ * @return An awaitable that resolves to an optional message buffer
+ */
+auto receive_message_async(std::reference_wrapper<boost::asio::readable_pipe> pipe
+) -> boost::asio::awaitable<std::optional<msgpack::sbuffer>>;
+/**
+ * @brief Synchronously receives a message from a POSIX stream descriptor
+ * @param fd The file descriptor to read from
+ * @return An optional containing the received message buffer if successful
+ */
+auto receive_message(boost::asio::posix::stream_descriptor& fd) -> std::optional<msgpack::sbuffer>;

		cmake -DCMAKE_POSITION_INDEPENDENT_CODE=ON ../
		make -j${num_cpus}

		inline const auto ANONYMOUS_VARIABLE(var) \
		= spider::core::FunctionManager::get_instance().register_function(#func, func);

feat: Add message pipe #27

feat: Add message pipe #27

Conversation

sitaowang1998 commented Nov 29, 2024 • edited by coderabbitai bot Loading

Description

Validation performed

Summary by CodeRabbit

Release Notes

coderabbitai bot commented Nov 29, 2024 • edited Loading

Walkthrough

Changes

Possibly related PRs

Suggested reviewers

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Nov 29, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 29, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 29, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 29, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 29, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 29, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 29, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 29, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 29, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 29, 2024

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Nov 30, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 30, 2024

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Nov 30, 2024

Choose a reason for hiding this comment

sitaowang1998 commented Nov 29, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 29, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)