Draft the design document and prepare a rough mockup of the C++ API #1
Changes from 7 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,107 @@ | ||
# Open Cyphal Vehicle System Management Daemon for GNU/Linux | ||
|
||
This project implements a user-facing C++14 library backed by a GNU/Linux daemon used to asynchronously perform certain common operations on an OpenCyphal network. Being based on LibCyphal, the solution can theoretically support all transport protocols supported by LibCyphal, notably Cyphal/UDP and Cyphal/CAN. | ||
|
||
The implementation is planned to proceed in multiple stages. The milestones achieved at every stage are described here along with the overall longer-term vision. | ||
|
||
The design of the C++ API is inspired by the [`ravemodemfactory`](https://github.com/aleksander0m/ravemodemfactory) project (see `src/librmf/rmf-operations.h`). | ||
|
||
[Yakut](https://github.com/OpenCyphal/yakut) is a distantly related project with the following key differences: | ||
|
||
- Yakut is a developer tool, while OCVSMD is a well-packaged component intended for deployment in production systems. | ||
|
||
- Yakut is a user-interactive tool with a CLI, while OCVSMD is equipped with a machine-friendly interface -- a C++ API. Eventually, OCVSMD may be equipped with a CLI as well, but it will always come secondary to the well-formalized C++ API. | ||
|
||
- Yakut is entirely written in Python, and thus it tends to be resource-heavy when used in embedded computers. | ||
|
||
|
||
- Yakut is not designed to be highly robust. | ||
|
||
|
||
## Long-term vision | ||
|
||
Not all of the listed items will be implemented the way they are seen at the time of writing this document, but the current description provides a general direction things are expected to develop in. | ||
|
||
OCVSMD is focused on solving problems that are pervasive in intra-vehicular OpenCyphal networks with minimal focus on any application-specific details. This list may eventually include: | ||
|
||
- Publish/subscribe on Cyphal subjects with arbitrary DSDL data types loaded at runtime, with the message objects represented as dynamically typed structures. More on this below. | ||
- RPC client for invoking arbitrarily-typed RPC servers with DSDL types loaded at runtime. | ||
- Support for the common Cyphal network services out of the box, configurable via the daemon API: | ||
- File server running with the specified set of root directories (see Yakut). | ||
- Firmware update on a directly specified remote node with a specified filename. | ||
- Automatic firmware update as implemented in Yakut. | ||
- Centralized (eventually could be distributed for fault tolerance) plug-and-play node-ID allocation server. | ||
- Depending on how the named topics project develops (many an intern has despaired over it), the Cyphal resource name server may also be implemented as part of OCVSMD at some point. | ||
|
||
|
||
Being a daemon designed for unattended operation in deeply-embedded vehicular computers, OCVSMD must meet the following requirements: | ||
|
||
|
||
- Ability to operate from a read-only filesystem. | ||
- Startup time much faster than that of Yakut. This should not be an issue for a native application since most of the Yakut startup time is spent on the Python runtime initialization, compilation, and module importing. | ||
- Local node configuration ((redundant) transport configuration, node-ID, node description, etc.) is loaded from a file, which is common for daemons. | ||
|
||
### Dynamic DSDL loading | ||
|
||
Dynamic DSDL loading is proposed to be implemented by creating serializer objects whose behavior is defined by the DSDL definition ingested at runtime. The deserialization method is to accept a byte stream and to produce a DSDL object model providing named field accessors, similar to what one would find in a JSON serialization library; the serialization method is the inverse of that. Naturally, said model will heavily utilize PMR for storage. An API mockup is given in `dsdl.hpp`. | ||
|
||
One approach assumes that instances of `dsdl::Object` are not exchanged between the client and the daemon; instead, only their serialized representations are transferred between the processes; thus, the entire DSDL support machinery exists in the client's process only. This approach involves certain work duplication between clients, and may impair their ability to start up quickly if DSDL parsing needs to be done. Another approach is to use shared-memory-friendly containers like Boost Interprocess or specialized PMR. | ||
I think there are two approaches to define:
For 1. we should simply create the types in C++ and make them part of the client library (we could use Nunavut with custom templates to generate these at build-time) with serialization being opaque to the client. In the first implementation we could cheat by exchanging the objects directly over Unix domain sockets and then work in type-safe message exchange later. For 2. the dsdl::Object approach seems better suited.
|
||
|
||
### C++ API | ||
|
||
The API will consist of several well-segregated C++ interfaces, each dedicated to a particular feature subset. The interface-based design is chosen to simplify testing in client applications. The API is intentionally designed to not hide the structure of the Cyphal protocol itself; that is to say that it is intentionally low-level. Higher-level abstractions can be built on top of it on the client side rather than the daemon side to keep the IPC protocol stable. | ||
|
||
The `Error` type used in the API definition here is a placeholder for the actual algebraic type listing all possible error states per API entity. | ||
|
||
The main file of the C++ API is `daemon.hpp`, which contains the abstract factory `Daemon` for the specialized interfaces, as well as the static factory factory (sic) `connect() -> Daemon`. | ||
|
||
### Anonymous mode considerations | ||
|
||
Normally, the daemon should have a node-ID of its own. It should be possible to run it without one, in the anonymous mode, with limited functionality: | ||
|
||
- The Monitor will not be able to query GetInfo. | ||
- The RegisterClient, PnPNodeIDAllocator, FileServer, NodeCommandClient, etc. will not be operational. | ||
|
||
### Configuration file format | ||
|
||
The daemon configuration is stored in a TSV file, where each row contains a key, followed by at least one whitespace separator, followed by the value. The keys are register names. Example: | ||
|
||
```tsv | ||
uavcan.node.id 123 | ||
uavcan.node.description This is the OCVSMD | ||
uavcan.udp.iface 192.168.1.33 192.168.2.33 | ||
``` | ||
|
||
For the standard register names, refer to <https://github.com/OpenCyphal/public_regulated_data_types/blob/f9f67906cc0ca5d7c1b429924852f6b28f313cbf/uavcan/register/384.Access.1.0.dsdl#L103-L199>. | ||
|
||
### CLI | ||
|
||
TBD | ||
|
||
### Common use cases | ||
|
||
#### Firmware update | ||
|
||
Per the design of OpenCyphal's standard network services, the firmware update process is entirely driven by the node being updated (updatee) rather than the node providing the new firmware file (updater). While it is possible to indirectly infer the progress of the update process by observing the offset of the file reads done by the updatee, this solution is fragile because there is ultimately no guarantee that the updatee will read the file sequentially, or even read it in its entirety. Per the OpenCyphal design, the only relevant parameters of a remote node that can be identified robustly are: | ||
|
||
- Whether a firmware update is currently in progress or not. | ||
- The version numbers, CRC, and VCS ID of the firmware that is currently being executed. | ||
|
||
The proposed API allows one to commence an update process and wait for its completion as follows: | ||
|
||
1. Identify the node that requires a firmware update, and locate a suitable firmware image file on the local machine. | ||
2. `daemon.get_file_server().add_root(firmware_path)`, where `firmware_path` is the path to the new image. | ||
3. `daemon.get_node_command_client().begin_software_update(node_id, firmware_name)`, where `firmware_name` is the last component of the `firmware_path`. | ||
4. Using `daemon.get_monitor().snap()`, ensure that the node in question has entered the firmware update mode. Abort if not. | ||
5. Using `daemon.get_monitor().snap()`, wait until the node has left the firmware update mode. | ||
6. Using `daemon.get_monitor().snap()`, ensure that the firmware version numbers match those of the new image. | ||
|
||
It is possible to build a convenience method that manages the above steps. Said method will be executed on the client side as opposed to the daemon side. | ||
|
||
## Milestone 0 | ||
|
||
This milestone includes the very barebones implementation, including only: | ||
|
||
- The daemon itself, compatible with System V architecture only. Support for systemd will be introduced in a future milestone. | ||
- Running a local Cyphal/UDP node. No support for other transports yet. | ||
- Loading the configuration from the configuration file as defined above. | ||
- File server. | ||
- Node command client. | ||
|
||
These items will be sufficient to perform firmware updates on remote nodes, but not to monitor the update progress. Progress monitoring will require the Monitor module. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
namespace ocvsmd | ||
{ | ||
|
||
/// An abstract factory for the specialized interfaces. | ||
class Daemon | ||
{ | ||
public: | ||
virtual std::expected<std::unique_ptr<Publisher>, Error> make_publisher(const dsdl::Type& type, | ||
const std::uint16_t subject_id) = 0; | ||
|
||
virtual std::expected<std::unique_ptr<Subscriber>, Error> make_subscriber(const dsdl::Type& type, | ||
const std::uint16_t subject_id) = 0; | ||
|
||
virtual std::expected<std::unique_ptr<RPCClient>, Error> make_client(const dsdl::Type& type, | ||
const std::uint16_t service_id) = 0; | ||
Does the Client need to include the
Yes, forgot this one. I think we'll go the same way we did in LibCyphal and PyCyphal -- the server node-ID is to be the factory parameter.
||
|
||
virtual FileServer& get_file_server() = 0; | ||
virtual const FileServer& get_file_server() const = 0; | ||
|
||
virtual NodeCommandClient& get_node_command_client() = 0; | ||
|
||
virtual RegisterClient& get_register_client() = 0; | ||
|
||
virtual Monitor& get_monitor() = 0; | ||
virtual const Monitor& get_monitor() const = 0; | ||
|
||
virtual PnPNodeIDAllocator& get_pnp_node_id_allocator() = 0; | ||
virtual const PnPNodeIDAllocator& get_pnp_node_id_allocator() const = 0; | ||
}; | ||
|
||
/// A factory for the abstract factory that connects to the daemon. | ||
/// Returns nullptr if the daemon cannot be connected to (not running). | ||
std::unique_ptr<Daemon> connect(); | ||
|
||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
namespace ocvsmd::dsdl | ||
{ | ||
/// Represents a DSDL object of any type. | ||
class Object | ||
{ | ||
friend class Type; | ||
public: | ||
/// Field accessor by name. Empty if no such field. | ||
std::optional<Object> operator[](const std::string_view field_name) const; | ||
|
||
/// Array element accessor by index. Empty if out of range. | ||
std::optional<std::span<Object>> operator[](const std::size_t array_index); | ||
std::optional<std::span<const Object>> operator[](const std::size_t array_index) const; | ||
|
||
/// Coercion to primitives (implicit truncation or the loss of precision are possible). | ||
operator std::optional<std::int64_t>() const; | ||
operator std::optional<std::uint64_t>() const; | ||
operator std::optional<double>() const; | ||
|
||
/// Coercion from primitives (implicit truncation or the loss of precision are possible). | ||
Object& operator=(const std::int64_t value); | ||
Object& operator=(const std::uint64_t value); | ||
Object& operator=(const double value); | ||
|
||
const class Type& get_type() const noexcept; | ||
|
||
std::expected<void, Error> serialize(const std::span<std::byte> output) const; | ||
std::expected<void, Error> deserialize(const std::span<const std::byte> input); | ||
}; | ||
|
||
/// Represents a parsed DSDL definition. | ||
class Type | ||
{ | ||
friend std::pmr::unordered_map<TypeNameAndVersion, Type> read_namespaces(directories, pmr, ...); | ||
public: | ||
/// Constructs a default-initialized Object of this Type. | ||
Object instantiate() const; | ||
... | ||
}; | ||
|
||
using TypeNameAndVersion = std::tuple<std::pmr::string, std::uint8_t, std::uint8_t>; | ||
|
||
/// Reads all definitions from the specified namespaces and returns mapping from the full type name | ||
/// and version to its type model. | ||
/// Optionally, the function should cache the results per namespace, with an option to disable the cache. | ||
std::pmr::unordered_map<TypeNameAndVersion, Type> read_namespaces(directories, pmr, ...); | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
namespace ocvsmd | ||
{ | ||
|
||
/// The daemon always has the standard file server running. | ||
/// This interface can be used to configure it. | ||
/// It is not possible to stop the server; the closest alternative is to remove all root directories. | ||
class FileServer | ||
{ | ||
public: | ||
/// When the file server handles a request, it will attempt to locate the path relative to each of its root | ||
/// directories. See Yakut for a hands-on example. | ||
/// The daemon will canonicalize the path and resolve symlinks. | ||
/// The same path may be added multiple times to avoid interference across different clients. | ||
/// The path may be that of a file rather than a directory. | ||
virtual std::expected<void, Error> add_root(const std::string_view path) = 0; | |
|
||
/// Does nothing if such root does not exist (no error reported). | ||
/// If such root is listed more than once, only one copy is removed. | ||
/// The daemon will canonicalize the path and resolve symlinks. | ||
virtual std::expected<void, Error> remove_root(const std::string_view path) = 0; | |
|
||
/// The returned paths are canonicalized. The entries are not unique. | ||
virtual std::expected<std::pmr::vector<std::pmr::string>, Error> list_roots() const = 0; | |
}; | ||
|
||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,68 @@ | ||
#include <uavcan/node/Heartbeat_1.hpp> | ||
#include <uavcan/node/GetInfo_1.hpp> | ||
|
||
namespace ocvsmd | ||
{ | ||
|
||
/// The monitor continuously maintains a list of online nodes in the network. | ||
class Monitor | ||
{ | ||
public: | ||
using Heartbeat = uavcan::node::Heartbeat_1; | ||
using NodeInfo = uavcan::node::GetInfo_1::Response; | ||
|
||
/// An avatar represents the latest known state of the remote node. | ||
/// The info struct is available only if the node responded to a uavcan.node.GetInfo request since last bootup. | ||
/// GetInfo requests are sent continuously until a response is received. | ||
/// If heartbeat publications cease, the corresponding node is marked as offline. | ||
struct Avatar | ||
|
||
{ | ||
std::uint16_t node_id; | ||
|
||
bool is_online; ///< If not online, the other fields contain the latest known information. | ||
|
||
std::chrono::system_clock::time_point last_heartbeat_at; | ||
Heartbeat last_heartbeat; | ||
|
||
/// The info is automatically reset when the remote node is detected to have restarted. | ||
/// It is automatically re-populated as soon as a GetInfo response is received. | ||
struct Info final | ||
{ | ||
std::chrono::system_clock::time_point received_at; | ||
NodeInfo info; | ||
}; | ||
std::optional<Info> info; | ||
|
||
/// The port list is automatically reset when the remote node is detected to have restarted. | ||
/// It is automatically re-populated as soon as an update is received. | ||
struct PortList final | ||
{ | ||
std::chrono::system_clock::time_point received_at; | ||
std::bitset<65536> publishers; | ||
std::bitset<65536> subscribers; | ||
std::bitset<512> clients; | ||
std::bitset<512> servers; | ||
}; | ||
std::optional<PortList> port_list; | ||
}; | ||
|
||
struct Snapshot final | ||
{ | ||
/// If a node appears online at least once, it will be given a slot in the table permanently. | ||
/// If it goes offline, it will be retained in the table but its is_online field will be false. | |
/// The table is ordered by node-ID. Use binary search for fast lookup. | ||
std::pmr::vector<Avatar> table; | ||
std::tuple<Heartbeat, NodeInfo> daemon; | ||
bool has_anonymous; ///< If any anonymous nodes are online (e.g., someone is trying to get a PnP node-ID allocation) | ||
}; | ||
|
||
/// Returns a snapshot of the current network state plus the daemon's own node state. | ||
virtual Snapshot snap() const = 0; | ||
|
||
// TODO: Eventually, we could equip the monitor with snooping support so that we could also obtain: | ||
// - Actual traffic per port. | ||
// - Update node info and local register cache without sending separate requests. | ||
// Yakut does that with the help of the snooping support in PyCyphal, but LibCyphal does not currently have that capability. | ||
}; | ||
|
||
} |
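
The binary-search lookup suggested in the `Snapshot` comment could be sketched as follows. The `Avatar` struct here is a trimmed stand-in for `Monitor::Avatar`, and `find_avatar` is a hypothetical helper that is not part of the mockup; a plain `std::vector` is used instead of the PMR vector to keep the example short.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Trimmed stand-in for Monitor::Avatar; only the fields needed for the lookup are kept.
struct Avatar
{
    std::uint16_t node_id;
    bool          is_online;
};

// Hypothetical helper: finds the avatar with the given node-ID in a table ordered by node-ID.
// Returns nullptr if the node has never been seen online.
inline const Avatar* find_avatar(const std::vector<Avatar>& table, const std::uint16_t node_id)
{
    const auto it = std::lower_bound(table.begin(),
                                     table.end(),
                                     node_id,
                                     [](const Avatar& avatar, const std::uint16_t id) { return avatar.node_id < id; });
    return ((it != table.end()) && (it->node_id == node_id)) ? &*it : nullptr;
}
```

The returned pointer refers into the table, so it is valid only for as long as the snapshot it was obtained from is kept alive and unmodified.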
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
#include <uavcan/node/ExecuteCommand_1.hpp> | ||
|
||
namespace ocvsmd | ||
{ | ||
|
||
/// A helper for invoking the uavcan.node.ExecuteCommand service on the specified remote nodes. | ||
/// The daemon always has a set of uavcan.node.ExecuteCommand clients ready. | ||
class NodeCommandClient | ||
{ | ||
public: | ||
using Request = uavcan::node::ExecuteCommand_1::Request; | ||
using Response = uavcan::node::ExecuteCommand_1::Response; | ||
|
||
/// Empty response indicates that the associated node did not respond in time. | ||
using Result = std::expected<std::pmr::unordered_map<std::uint16_t, std::optional<Response>>, Error>; | ||
|
||
/// Empty option indicates that the corresponding node did not return a response on time. | ||
/// All requests are sent concurrently and the call returns when the last response has arrived, | ||
/// or the timeout has expired. | ||
virtual Result send_custom_command(const std::span<const std::uint16_t> node_ids, | ||
const Request& request, | ||
const std::chrono::microseconds timeout = 1s) = 0; | ||
|
||
/// A convenience method for invoking send_custom_command() with COMMAND_RESTART. | ||
Result restart(const std::span<const std::uint16_t> node_ids, const std::chrono::microseconds timeout = 1s) | ||
{ | ||
return send_custom_command(node_ids, {65535, ""}, timeout); | ||
} | ||
|
||
/// A convenience method for invoking send_custom_command() with COMMAND_BEGIN_SOFTWARE_UPDATE. | ||
/// The file_path is relative to one of the roots configured in the file server. | ||
Result begin_software_update(const std::span<const std::uint16_t> node_ids, | ||
const std::string_view file_path, | ||
const std::chrono::microseconds timeout = 1s) | ||
{ | ||
return send_custom_command(node_ids, {65533, file_path}, timeout); | ||
} | ||
|
||
// TODO: add convenience methods for the other standard commands. | ||
}; | ||
|
||
} |
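
Since an empty option in the result map means that the corresponding node did not respond in time, a client will typically want to extract the list of such nodes. A minimal sketch, assuming a stand-in `Response` type and a plain `std::unordered_map` in place of the PMR map wrapped in `std::expected`; the helper name `unresponsive_nodes` is hypothetical:

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <vector>

// Stand-in for uavcan::node::ExecuteCommand_1::Response; only the status field is kept.
struct Response
{
    std::uint8_t status;
};

// Hypothetical helper: returns the node-IDs that did not respond in time, in ascending order.
inline std::vector<std::uint16_t> unresponsive_nodes(
    const std::unordered_map<std::uint16_t, std::optional<Response>>& responses)
{
    std::vector<std::uint16_t> result;
    for (const auto& [node_id, response] : responses)
    {
        if (!response.has_value())
        {
            result.push_back(node_id);
        }
    }
    std::sort(result.begin(), result.end());  // The map is unordered; sort for a stable output.
    return result;
}
```

A caller could, for instance, retry the command for the returned node-IDs only.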
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
namespace ocvsmd | ||
{ | ||
|
||
/// Implementation detail: internally, the PnP allocator uses the Monitor because the Monitor continuously | ||
/// maintains the mapping between node-IDs and their unique-IDs. It needs to subscribe to notifications from the | ||
/// monitor; this is not part of the API though. See pycyphal.application.plug_and_play.Allocator. | ||
class PnPNodeIDAllocator | ||
{ | ||
public: | ||
/// Maps unique-ID <=> node-ID. | ||
/// For some node-IDs there may be no unique-ID (at least temporarily until a GetInfo response is received). | ||
/// The table includes the daemon's node as well. | ||
using UID = std::array<std::uint8_t, 16>; | ||
using Entry = std::tuple<std::uint16_t, std::optional<UID>>; | ||
using Table = std::pmr::vector<Entry>; | ||
|
||
/// The method is infallible because the corresponding publishers/subscribers are always active; | ||
/// when enabled==false, the allocator simply refuses to send responses. | ||
virtual void set_enabled(const bool enabled) = 0; | ||
virtual bool is_enabled() const = 0; | ||
|
||
/// The allocation table may or may not be persistent (retained across daemon restarts). | ||
virtual Table get_table() const = 0; | ||
|
||
/// Forget all allocations; the table will be rebuilt from the Monitor state. | ||
virtual void drop_table() = 0; | ||
}; | ||
|
||
} |
Any concerns with including daemon in the name if it's also going to be a CLI? I know we're still deciding on the name but might be something to consider.
The CLI would be a different entry-point/executable (I think)
The CLI sits on top of the daemon, so I think it is fine.