feat: Add variable size message sending and receiving through socket #32
Walkthrough
The pull request introduces significant modifications to the Spider project's CMake configuration and file organization. Key changes include the addition of new source and header files related to I/O operations, specifically for MessagePack serialization. Several header files have been relocated from the
Changes
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (2)
src/spider/io/msgpack_message.hpp (1)
14-17: Simplify Function Parameters by Passing References Directly
In the asynchronous functions send_message_async and receive_message_async, using std::reference_wrapper for parameters may be unnecessary. Passing the socket and buffer by reference directly can simplify the code and improve readability.
Suggested change:
- auto send_message_async(
-     std::reference_wrapper<boost::asio::ip::tcp::socket> socket,
-     std::reference_wrapper<msgpack::sbuffer> buffer
- ) -> boost::asio::awaitable<bool>;
+ auto send_message_async(
+     boost::asio::ip::tcp::socket& socket,
+     msgpack::sbuffer& buffer
+ ) -> boost::asio::awaitable<bool>;
And similarly for receive_message_async:
- auto receive_message_async(std::reference_wrapper<boost::asio::ip::tcp::socket> socket
- ) -> boost::asio::awaitable<std::optional<msgpack::sbuffer>>;
+ auto receive_message_async(boost::asio::ip::tcp::socket& socket
+ ) -> boost::asio::awaitable<std::optional<msgpack::sbuffer>>;
Also applies to: 21-22
tests/io/test-MsgpackMessage.cpp (1)
83-89: Consider extracting message creation to a helper function.
The message creation logic is duplicated between sync and async tests. Consider extracting it to improve maintainability.
+namespace {
+auto create_test_buffer(size_t buffer_size) -> msgpack::sbuffer {
+    msgpack::sbuffer buffer;
+    for (size_t i = 0; i < buffer_size; ++i) {
+        char const value = i % 256;
+        buffer.write(&value, sizeof(value));
+    }
+    return buffer;
+}
+}  // namespace
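The helper idea above can be tried out without msgpack. What follows is a minimal sketch, assuming a std::vector<char> stands in for msgpack::sbuffer; create_test_pattern and matches_test_pattern are hypothetical names and are not part of the PR.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the suggested create_test_buffer helper:
// fills a byte vector with the repeating pattern 0, 1, ..., 255, 0, ...
std::vector<char> create_test_pattern(std::size_t size) {
    std::vector<char> bytes;
    bytes.reserve(size);
    for (std::size_t i = 0; i < size; ++i) {
        bytes.push_back(static_cast<char>(i % 256));
    }
    return bytes;
}

// Checks that a received payload has the expected size and carries the same pattern.
bool matches_test_pattern(std::vector<char> const& bytes, std::size_t expected_size) {
    if (bytes.size() != expected_size) {
        return false;
    }
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (static_cast<unsigned char>(bytes[i]) != i % 256) {
            return false;
        }
    }
    return true;
}
```

With both halves in one place, the sync and async tests could share the producer on the client side and the checker on the server side.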
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (20)
src/spider/CMakeLists.txt (2 hunks)
src/spider/client/Data.hpp (1 hunks)
src/spider/client/task.hpp (1 hunks)
src/spider/core/Data.hpp (1 hunks)
src/spider/io/BoostAsio.hpp (1 hunks)
src/spider/io/msgpack_message.cpp (1 hunks)
src/spider/io/msgpack_message.hpp (1 hunks)
src/spider/worker/FunctionManager.cpp (1 hunks)
src/spider/worker/FunctionManager.hpp (1 hunks)
src/spider/worker/TaskExecutor.cpp (1 hunks)
src/spider/worker/TaskExecutor.hpp (1 hunks)
src/spider/worker/TaskExecutorMessage.hpp (1 hunks)
src/spider/worker/message_pipe.cpp (1 hunks)
src/spider/worker/message_pipe.hpp (1 hunks)
src/spider/worker/task_executor.cpp (1 hunks)
tests/CMakeLists.txt (1 hunks)
tests/io/test-MsgpackMessage.cpp (1 hunks)
tests/worker/test-FunctionManager.cpp (1 hunks)
tests/worker/test-MessagePipe.cpp (1 hunks)
tests/worker/test-TaskExecutor.cpp (1 hunks)
✅ Files skipped from review due to trivial changes (12)
- src/spider/worker/FunctionManager.hpp
- src/spider/client/Data.hpp
- src/spider/worker/TaskExecutorMessage.hpp
- src/spider/client/task.hpp
- src/spider/worker/message_pipe.hpp
- src/spider/worker/TaskExecutor.hpp
- tests/worker/test-TaskExecutor.cpp
- src/spider/core/Data.hpp
- src/spider/worker/task_executor.cpp
- tests/worker/test-FunctionManager.cpp
- src/spider/worker/message_pipe.cpp
- tests/worker/test-MessagePipe.cpp
🧰 Additional context used
🪛 cppcheck (2.10-2)
tests/io/test-MsgpackMessage.cpp
[error] 25-25: syntax error (syntaxError)
🔇 Additional comments (9)
src/spider/io/msgpack_message.cpp (1)
1-252: Verify C++20 Support for char8_t and std::bit_cast
The code uses C++20 features such as char8_t and std::bit_cast. Please verify that the project is configured to compile with the C++20 standard; otherwise, the code may not compile.
Run the following script to check if the CMakeLists.txt files specify C++20:
✅ Verification successful
C++20 Support is Properly Configured
The project correctly specifies C++20 as the required C++ standard in the root CMakeLists.txt file, which ensures support for modern features like char8_t and std::bit_cast.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Search for CMakeLists.txt files specifying C++20
# Test: Search for occurrences of 'CXX_STANDARD 20' or 'c++20'
# Expect: The project should specify C++20 as the C++ standard.
fd -e txt -e cmake CMakeLists.txt | xargs grep -H -E 'CXX_STANDARD.*20|c\+\+20'
Length of output: 123
src/spider/io/msgpack_message.hpp (1)
8-8: Correct the IWYU Pragma Syntax
There is a typographical error in the IWYU pragma comment. The colon must directly follow pragma, with a single space before keep, so "pragma :keep" should read "pragma: keep".
Change:
-#include "MsgPack.hpp" // IWYU pragma :keep
+#include "MsgPack.hpp" // IWYU pragma: keep
src/spider/io/BoostAsio.hpp (1)
14-34: Additional Includes Are Appropriate
The added Boost.Asio headers are necessary for the new I/O functionalities implemented in the project. Including these headers ensures that all required components are available for socket communication and asynchronous operations.
tests/CMakeLists.txt (1)
11-11: Good Addition of Unit Test for MsgpackMessage
Including io/test-MsgpackMessage.cpp ensures that the new message handling functionalities are adequately tested. This enhances code reliability and helps detect potential issues early.
src/spider/worker/FunctionManager.cpp (1)
9-9: Update Include Path to Reflect Codebase Restructuring
Changing the include path to ../io/MsgPack.hpp aligns with the reorganization of the codebase. This ensures that the correct header file is included, maintaining consistency and preventing potential build issues.
src/spider/CMakeLists.txt (1)
5-5: LGTM! Well-structured reorganization of IO-related components.
The movement of network and serialization headers to the io/ directory and addition of the new msgpack implementation improves code organization and modularity.
Also applies to: 16-19
src/spider/worker/TaskExecutor.cpp (1)
11-12: LGTM! Include paths properly updated.
The include path changes correctly reflect the reorganization of IO-related headers.
tests/io/test-MsgpackMessage.cpp (2)
102-104: LGTM! Proper timeout handling for async operations.
The 5-second timeout for futures and context restart between iterations ensures robust async testing.
1-16: LGTM! Well-organized includes.
The includes are properly organized and grouped by standard library, third-party, and project headers.
src/spider/io/msgpack_message.cpp
Outdated
if (boost::asio::error::eof != header_ec) {
    spdlog::error(
        "Cannot read message body size or body from socket {}: {}",
        body_size_ec.value(),
        body_size_ec.message()
    );
}
Correct Error Handling to Use Appropriate Error Codes
In receive_message_async, the error codes checked after the async_read operations are mismatched:
- At lines 193-199, when handling body_size_ec, the code tests header_ec against eof instead of body_size_ec.
- At lines 230-236, when handling body_ec, the code tests header_ec against eof instead of body_ec.
This could lead to incorrect error handling and mask potential read errors.
Apply this diff to correct the error code checks:
- if (boost::asio::error::eof != header_ec) {
+ if (boost::asio::error::eof != body_size_ec) {
      spdlog::error(
          "Cannot read message body size or body from socket {}: {}",
          body_size_ec.value(),
          body_size_ec.message()
      );
  }
- if (boost::asio::error::eof != header_ec) {
+ if (boost::asio::error::eof != body_ec) {
      spdlog::error(
          "Cannot read message body from socket {}: {}",
          body_ec.value(),
          body_ec.message()
      );
  }
Also applies to: 230-236
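One way to rule out this kind of copy-paste mismatch is to route every check through a helper that receives the error code once and uses that same value for both the comparison and the log message. The sketch below is an illustration under assumptions: it uses std::error_code, fake_eof is a stand-in for boost::asio::error::eof, and check_read is a hypothetical name that does not appear in the PR.

```cpp
#include <iostream>
#include <string>
#include <system_error>

// Stand-in for boost::asio::error::eof (assumption: any distinguished code works here).
inline std::error_code fake_eof() {
    return std::make_error_code(std::errc::no_message_available);
}

// Returns true if the read succeeded. On failure, logs the error unless the
// peer simply closed the connection. The same `ec` drives the check and the log.
bool check_read(std::error_code const& ec, std::string const& stage) {
    if (!ec) {
        return true;
    }
    if (ec != fake_eof()) {
        std::cerr << "Cannot read " << stage << " from socket " << ec.value() << ": "
                  << ec.message() << '\n';
    }
    return false;
}
```

Each read stage would then call the helper with its own error code, for example check_read(body_size_ec, "message body size"), so no second variable name is available to be confused with it.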
src/spider/io/msgpack_message.cpp
Outdated
}
msgpack::sbuffer buffer;
// NOLINTNEXTLINE(cppcoreguidelines-pro-type-reinterpret-cast)
buffer.write(reinterpret_cast<char*>(&body_size_vec[1]), body_size_vec.size() - 1);
Avoid reinterpret_cast Between char8_t* and char*
The code uses reinterpret_cast to convert char8_t* to char*, which may lead to undefined behaviour since char8_t and char are distinct types in C++20. Consider using std::bit_cast or copying the data to a std::vector<char> to safely convert between these types.
Here's a possible fix:
- buffer.write(reinterpret_cast<char*>(&body_size_vec[1]), body_size_vec.size() - 1);
+ std::vector<char> data(body_size_vec.size() - 1);
+ std::memcpy(data.data(), &body_size_vec[1], data.size());
+ buffer.write(data.data(), data.size());
Also applies to: 151-151
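The copy-based alternative can be sketched in isolation. Note that the C++ aliasing rules do permit reading any object's bytes through a char*, so the copy mainly removes the need for a cast and its lint suppression. The names wire_byte and payload_after_type_byte are hypothetical, and the fallback to unsigned char is only there so the sketch also builds where char8_t is unavailable.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// char8_t needs C++20; fall back to unsigned char so the sketch builds in older modes.
#if defined(__cpp_char8_t)
using wire_byte = char8_t;
#else
using wire_byte = unsigned char;
#endif

// Copies everything after the leading type byte into a char-based buffer,
// without any pointer cast between wire_byte* and char*.
std::string payload_after_type_byte(std::vector<wire_byte> const& raw) {
    if (raw.size() <= 1) {
        return {};
    }
    std::string payload(raw.size() - 1, '\0');
    std::memcpy(&payload[0], raw.data() + 1, payload.size());
    return payload;
}
```

The cost is one extra copy of the payload per message, which may or may not matter for large messages.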
src/spider/io/msgpack_message.cpp
Outdated
auto send_message(boost::asio::ip::tcp::socket& socket, msgpack::sbuffer const& buffer) -> bool {
    msgpack::sbuffer message_buffer;
    msgpack::packer packer{message_buffer};
    packer.pack_ext(buffer.size(), msgpack::type::BIN);
    packer.pack_ext_body(buffer.data(), buffer.size());
    size_t const size = boost::asio::write(
        socket,
        boost::asio::buffer(message_buffer.data(), message_buffer.size())
    );
    return size == message_buffer.size();
}
Handle Exceptions in send_message Function
The send_message function uses boost::asio::write, which may throw exceptions on failure. Currently, there is no exception handling, which could lead to unhandled exceptions and potential application crashes. It's recommended to add try-catch blocks to handle exceptions and maintain robust error handling.
Apply this diff to add exception handling:
+ try {
      size_t const size = boost::asio::write(
          socket,
          boost::asio::buffer(message_buffer.data(), message_buffer.size())
      );
      return size == message_buffer.size();
+ } catch (boost::system::system_error& e) {
+     spdlog::error("Failed to send message: {}", e.what());
+     return false;
+ }
Committable suggestion skipped: line range outside the PR's diff.
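The pattern behind this suggestion, turning a thrown error into a false return value, can be shown with the standard library alone. In this sketch std::system_error stands in for boost::system::system_error, and try_send is a hypothetical wrapper rather than code from the PR. Boost.Asio also provides overloads of write and read that take a boost::system::error_code& and report failures without throwing, which is another way to reach the same result.

```cpp
#include <cstddef>
#include <functional>
#include <iostream>
#include <system_error>

// Hypothetical wrapper: runs a write operation that may throw and converts a
// std::system_error into a `false` return, mirroring the suggested try/catch.
bool try_send(std::function<std::size_t()> const& write_op, std::size_t expected_size) {
    try {
        // A short write is also reported as failure.
        return write_op() == expected_size;
    } catch (std::system_error const& e) {
        std::cerr << "Failed to send message: " << e.what() << '\n';
        return false;
    }
}
```

A caller would pass a lambda that performs the actual write, so the bool return type of send_message covers both short writes and thrown errors.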
src/spider/io/msgpack_message.cpp
Outdated
auto receive_message(boost::asio::ip::tcp::socket& socket) -> std::optional<msgpack::sbuffer> {
    // Read header
    char8_t header = 0;
    boost::asio::read(socket, boost::asio::buffer(&header, sizeof(header)));
    std::optional<std::pair<size_t, bool>> const optional_body_pair = read_ext_type(header);
    if (false == optional_body_pair.has_value()) {
        return std::nullopt;
    }

    // Read next
    std::pair<size_t, bool> const body_pair = optional_body_pair.value();
    std::vector<char8_t> body_size_vec(body_pair.first);
    boost::asio::read(socket, boost::asio::buffer(body_size_vec));
    if (false == body_pair.second) {
        // Entire body read with type. Validate type to be bin.
        if (body_size_vec[0] != msgpack::type::BIN) {
            return std::nullopt;
        }
        msgpack::sbuffer buffer;
        // NOLINTNEXTLINE(cppcoreguidelines-pro-type-reinterpret-cast)
        buffer.write(reinterpret_cast<char*>(&body_size_vec[1]), body_size_vec.size() - 1);
        return buffer;
    }
    std::optional<size_t> const optional_body_size
            = read_ext_body_size(std::u8string_view{body_size_vec.data(), body_size_vec.size()});
    if (false == optional_body_size.has_value()) {
        return std::nullopt;
    }
    size_t const body_size = optional_body_size.value();

    // Read body
    std::vector<char8_t> body_vec(body_size + 1);
    boost::asio::read(socket, boost::asio::buffer(body_vec));
    // Validate type to be bin
    if (body_vec[0] != msgpack::type::BIN) {
        return std::nullopt;
    }
    msgpack::sbuffer buffer;
    // NOLINTNEXTLINE(cppcoreguidelines-pro-type-reinterpret-cast)
    buffer.write(reinterpret_cast<char*>(&body_vec[1]), body_vec.size() - 1);
    return buffer;
}
Handle Exceptions in receive_message Function
The receive_message function calls boost::asio::read, which can throw exceptions if an error occurs during the read operation. Currently, exceptions are not caught, which could lead to application crashes. It's advisable to wrap the read operations in try-catch blocks to handle potential exceptions gracefully.
Here's a suggested change:
+ try {
      // Read header
      char8_t header = 0;
      boost::asio::read(socket, boost::asio::buffer(&header, sizeof(header)));
      // Rest of the code...
+ } catch (boost::system::system_error& e) {
+     spdlog::error("Failed to receive message: {}", e.what());
+     return std::nullopt;
+ }
Committable suggestion skipped: line range outside the PR's diff.
tests/io/test-MsgpackMessage.cpp
Outdated
constexpr std::array<size_t, 12> cBufferSizes{1, 2, 3, 4, 5, 6, 7, 8, 9, 17, 257, 65'537};
constexpr unsigned cPort = 6021;
🛠️ Refactor suggestion
Consider using dynamic port allocation for tests.
Using a fixed port (6021) could cause test failures if the port is already in use. Consider using port 0 to let the system assign an available port, then retrieve it using acceptor.local_endpoint().port().
-constexpr unsigned cPort = 6021;
+tcp::acceptor acceptor{context, {boost::asio::ip::address::from_string("127.0.0.1"), 0}};
+unsigned const port = acceptor.local_endpoint().port();
Committable suggestion skipped: line range outside the PR's diff.
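The port-0 mechanism itself is an operating system feature and can be observed with plain POSIX sockets. The sketch below is an illustration only, not the PR's code: pick_ephemeral_port is a hypothetical name, it targets POSIX systems, and it closes the socket before returning, whereas a real test would keep the acceptor open so the port stays reserved.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cstdint>
#include <optional>

// Binds a TCP socket to 127.0.0.1 with port 0 and returns the port the OS chose.
// Returns std::nullopt if any system call fails. POSIX only.
std::optional<std::uint16_t> pick_ephemeral_port() {
    int const fd = ::socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return std::nullopt;
    }
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;  // 0 asks the OS for any free port
    std::optional<std::uint16_t> port;
    socklen_t len = sizeof(addr);
    if (::bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) == 0
        && ::getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) == 0)
    {
        port = ntohs(addr.sin_port);  // port actually assigned by the OS
    }
    ::close(fd);
    return port;
}
```

In the Boost.Asio version the acceptor plays the role of the bound socket, and local_endpoint().port() corresponds to the getsockname call.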
This reverts commit 870f3d5.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (2)
tests/io/test-MsgpackMessage.cpp (2)
21-23: Consider expanding buffer size test cases.
While the current test cases cover a good range, consider adding these scenarios:
- Maximum message size limit test
- Zero-byte message test (if supported)
- Size boundary tests (e.g., 65,536 and 65,538)
-constexpr std::array<size_t, 12> cBufferSizes{1, 2, 3, 4, 5, 6, 7, 8, 9, 17, 257, 65'537};
+constexpr std::array<size_t, 15> cBufferSizes{
+    0,                          // Zero-byte test
+    1, 2, 3, 4, 5, 6, 7, 8, 9,  // Small sizes
+    17, 257,                    // Medium sizes
+    65'536, 65'537, 65'538      // Size boundary tests
+};
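The existing sizes appear to track the msgpack ext format family, although the PR does not say so; that reading is an assumption. The sketch below maps a payload size to the first header byte defined in the msgpack specification (fixext 1/2/4/8/16, then ext 8, ext 16, ext 32). ext_format_byte is a hypothetical helper, and sizes above the 32-bit limit are not handled.

```cpp
#include <cstddef>
#include <cstdint>

// Returns the first byte of a msgpack ext header for a payload of `size` bytes,
// following the format table in the msgpack specification.
std::uint8_t ext_format_byte(std::size_t size) {
    switch (size) {
        case 1: return 0xd4;   // fixext 1
        case 2: return 0xd5;   // fixext 2
        case 4: return 0xd6;   // fixext 4
        case 8: return 0xd7;   // fixext 8
        case 16: return 0xd8;  // fixext 16
        default: break;
    }
    if (size <= 0xff) {
        return 0xc7;  // ext 8: one length byte follows
    }
    if (size <= 0xffff) {
        return 0xc8;  // ext 16: two length bytes follow
    }
    return 0xc9;  // ext 32: four length bytes follow
}
```

Under this table the format switches between 255 and 256 bytes and between 65,535 and 65,536 bytes, so those pairs are natural candidates for boundary tests as well.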
74-129: Enhance async test robustness and error reporting.
The asynchronous test case is well-structured but could be improved:
 TEST_CASE("Async socket msgpack", "[io]") {
+    constexpr auto OPERATION_TIMEOUT = std::chrono::seconds(5);
     boost::asio::io_context context;
+    boost::system::error_code ec;
     // ... existing code ...
     std::future<bool> client_future = boost::asio::co_spawn(
         context,
         spider::core::send_message_async(client_socket, client_buffer),
         boost::asio::use_future
     );
+
+    auto check_future = [OPERATION_TIMEOUT](auto& future, const char* operation) {
+        if (future.wait_for(OPERATION_TIMEOUT) != std::future_status::ready) {
+            FAIL(operation << " timed out after " << OPERATION_TIMEOUT.count() << " seconds");
+        }
+    };
+
+    check_future(client_future, "Client send operation");
+    check_future(server_future, "Server receive operation");
-    REQUIRE(client_future.wait_for(std::chrono::seconds(5)) == std::future_status::ready);
-    REQUIRE(server_future.wait_for(std::chrono::seconds(5)) == std::future_status::ready);
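The timeout check at the centre of this suggestion does not depend on Boost or Catch2. A minimal sketch using only the standard library, with ready_within as a hypothetical helper name:

```cpp
#include <chrono>
#include <future>

// Hypothetical helper: true if the future becomes ready within `timeout`.
// The future is left untouched, so the caller can still call get() afterwards.
template <typename T>
bool ready_within(std::future<T>& future, std::chrono::milliseconds timeout) {
    return future.wait_for(timeout) == std::future_status::ready;
}
```

A test could then assert on ready_within(client_future, std::chrono::seconds(5)) and attach its own message naming the operation that timed out.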
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
src/spider/io/msgpack_message.cpp (1 hunks)
tests/io/test-MsgpackMessage.cpp (1 hunks)
🧰 Additional context used
🪛 cppcheck (2.10-2)
tests/io/test-MsgpackMessage.cpp
[error] 24-24: syntax error (syntaxError)
🔇 Additional comments (3)
src/spider/io/msgpack_message.cpp (2)
241-248: Correct Error Handling to Use Appropriate Error Codes
In receive_message_async, the error code checked after the async_read operation for the message body is mismatched:
- At lines 242-248, when handling body_ec, the code tests body_size_ec against eof instead of body_ec.
This could lead to incorrect error handling and mask potential read errors.
Apply this diff to correct the error code check:
- if (boost::asio::error::eof != body_size_ec) {
+ if (boost::asio::error::eof != body_ec) {
      spdlog::error(
          "Cannot read message body size or body from socket {}: {}",
          body_ec.value(),
          body_ec.message()
      );
  }
139-139: Avoid Incorrect Use of std::bit_cast Between char8_t* and char*
The code uses std::bit_cast to convert char8_t* to char*, which is not appropriate. std::bit_cast is intended for casting between objects of the same size, not pointers. This may lead to undefined behaviour.
Apply this diff to correct the casting:
- buffer.write(std::bit_cast<char*>(&body_size_vec[1]), body_size_vec.size() - 1);
+ std::vector<char> data(body_size_vec.size() - 1);
+ std::memcpy(data.data(), &body_size_vec[1], data.size());
+ buffer.write(data.data(), data.size());
Also applies to:
- buffer.write(std::bit_cast<char*>(&body_vec[1]), body_vec.size() - 1);
+ std::vector<char> data(body_vec.size() - 1);
+ std::memcpy(data.data(), &body_vec[1], data.size());
+ buffer.write(data.data(), data.size());
Also applies to: 158-158
tests/io/test-MsgpackMessage.cpp (1)
1-20: LGTM! Headers and namespace setup are well-organized.
The includes are comprehensive and properly organized. The use of IWYU pragmas and anonymous namespace demonstrates good practice.
TEST_CASE("Sync socket msgpack", "[io]") {
    boost::asio::io_context context;
    // Create server acceptor
    tcp::endpoint const local_endpoint{address::from_string("127.0.0.1"), 0};
    tcp::acceptor acceptor{context, local_endpoint};

    std::thread server_thread([&acceptor, &context]() {
        // Create server socket
        tcp::socket socket{context};
        acceptor.accept(socket);

        // NOLINTBEGIN(clang-analyzer-unix.Malloc)
        for (size_t const buffer_size : cBufferSizes) {
            std::optional<msgpack::sbuffer> const optional_buffer
                    = spider::core::receive_message(socket);
            REQUIRE(optional_buffer.has_value());
            if (optional_buffer.has_value()) {
                msgpack::sbuffer const& buffer = optional_buffer.value();
                REQUIRE(buffer_size == buffer.size());
                for (size_t i = 0; i < buffer.size(); ++i) {
                    // NOLINTNEXTLINE(cppcoreguidelines-pro-bounds-pointer-arithmetic)
                    REQUIRE(i % 256 == std::bit_cast<uint8_t>(buffer.data()[i]));
                }
            }
        }
        // NOLINTEND(clang-analyzer-unix.Malloc)
    });

    // Create client socket
    tcp::socket socket(context);
    boost::asio::connect(
            socket,
            std::vector{tcp::endpoint{
                    address::from_string("127.0.0.1"),
                    acceptor.local_endpoint().port()
            }}
    );

    for (size_t const buffer_size : cBufferSizes) {
        msgpack::sbuffer buffer;
        for (size_t i = 0; i < buffer_size; ++i) {
            // NOLINTNEXTLINE(bugprone-narrowing-conversions,cppcoreguidelines-narrowing-conversions)
            char const value = i % 256;
            buffer.write(&value, sizeof(value));
        }
        REQUIRE(spider::core::send_message(socket, buffer));
    }
    server_thread.join();
}
🛠️ Refactor suggestion
Add timeout and error handling mechanisms.
The synchronous test case could be more robust with these improvements:
- Add a timeout mechanism for the server thread
- Implement RAII for socket cleanup
- Add error handling for socket operations
Here's a suggested implementation:
+class ScopedThread {
+    std::thread thread_;
+public:
+    template<typename F>
+    explicit ScopedThread(F&& f) : thread_(std::forward<F>(f)) {}
+    ~ScopedThread() { if(thread_.joinable()) thread_.join(); }
+};
 TEST_CASE("Sync socket msgpack", "[io]") {
     boost::asio::io_context context;
     tcp::endpoint const local_endpoint{address::from_string("127.0.0.1"), 0};
     tcp::acceptor acceptor{context, local_endpoint};
-    std::thread server_thread([&acceptor, &context]() {
+    auto server_thread = ScopedThread([&acceptor, &context]() {
+        try {
             tcp::socket socket{context};
             acceptor.accept(socket);
             // ... existing code ...
+        } catch (const std::exception& e) {
+            FAIL("Server thread failed: " << e.what());
+        }
     });
     // ... client code ...
-    server_thread.join();
Committable suggestion skipped: line range outside the PR's diff.
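The join-on-destruction idea can be checked on its own. Destroying a std::thread that is still joinable calls std::terminate, so a wrapper like the one below keeps an early exit from the enclosing scope from aborting the test binary. This is a standalone sketch rather than the PR's code; C++20's std::jthread provides the same joining behaviour out of the box.

```cpp
#include <atomic>
#include <thread>
#include <utility>

// Joins the wrapped thread on destruction so that leaving the enclosing
// scope early cannot destroy a still-joinable std::thread.
class ScopedThread {
public:
    template <typename F>
    explicit ScopedThread(F&& f) : thread_(std::forward<F>(f)) {}
    ScopedThread(ScopedThread const&) = delete;
    ScopedThread& operator=(ScopedThread const&) = delete;
    ~ScopedThread() {
        if (thread_.joinable()) {
            thread_.join();
        }
    }

private:
    std::thread thread_;
};
```

Note that joining still blocks until the thread finishes, so this handles cleanup but does not by itself add the timeout mentioned in the first bullet.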
🧰 Tools
🪛 cppcheck (2.10-2)
[error] 24-24: syntax error
(syntaxError)
Description
Add variable size msgpack message sending and receiving through socket. Add unit tests for message sending and receiving.
Validation performed