RFC: timeout & deadline overhaul #22
J-Loudet
started this conversation in
Zenoh Flow
Summary
Instead of using the term "timeout" to express this strictly enforced maximum duration on a path of a data flow, we propose to revisit our definition of a deadline.
The new definition would associate a mode to a deadline that expresses how Zenoh Flow enforces it: relaxed or strict. The relaxed mode notifies the user after the message reaches the last node in the path of the deadline, while the strict mode tries to detect a violation as soon as possible (without any strong guarantee: for instance, if a message is stuck in a queue, Zenoh Flow cannot start the associated timer).
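To make the difference between the two modes concrete, here is a minimal sketch in Python (not Zenoh Flow code). The names `Mode` and `miss_detected_at` are hypothetical, times are plain integers in milliseconds, and `start` is assumed to be the moment Zenoh Flow first sees the message, so time spent in a queue before that moment is not covered.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    RELAXED = "relaxed"
    STRICT = "strict"


def miss_detected_at(mode: Mode, start: int, budget: int, arrival: int) -> Optional[int]:
    """Return when a deadline miss is noticed, or None if the deadline is met.

    start:   time at which measuring begins (assumed: first seen by the runtime)
    budget:  maximum duration allowed on the path
    arrival: time at which the message reaches the end of the path
    """
    if arrival - start <= budget:
        return None  # deadline met, nothing to report
    if mode is Mode.STRICT:
        # A timer armed at `start` fires as soon as the budget is exhausted.
        return start + budget
    # Relaxed: the miss is only noticed once the message reaches the end of the path.
    return arrival


late = dict(start=0, budget=10, arrival=25)
print(miss_detected_at(Mode.STRICT, **late))   # noticed when the timer fires
print(miss_detected_at(Mode.RELAXED, **late))  # noticed when the message arrives
```

In this sketch the two modes agree on whether a deadline was missed and differ only in when the miss is noticed.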
The new definition would also remove the distinction between local (i.e. that applies on a single node) and end-to-end (i.e. that applies on a path) deadlines: a deadline will be able to start at an input and end at an output, so if both input and output belong to the same node, the deadline would be local.
The new definition will also require users to fully qualify the path on which a deadline applies. This change is to make the behaviour of a deadline more predictable: as there can be multiple paths connecting two nodes, without qualifying the path, we could end up in a situation where several "deadline miss notifications" are sent for the same message. By fully qualifying the path of a deadline, we ensure that only one notification can be sent for a single message as there would only be one path for the deadline.
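The ambiguity can be illustrated with a small sketch that enumerates the paths between two nodes. The graph and its node names (`source`, `left`, `right`, `sink`) are made up for the example, and `simple_paths` is a hypothetical helper, not part of Zenoh Flow.

```python
from typing import Dict, List

Graph = Dict[str, List[str]]


def simple_paths(graph: Graph, source: str, target: str) -> List[List[str]]:
    """Enumerate every cycle-free path from `source` to `target`."""
    paths: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        if node == target:
            paths.append(path)
            return
        for successor in graph.get(node, []):
            if successor not in path:  # never visit a node twice on one path
                walk(successor, path + [successor])

    walk(source, [source])
    return paths


# A diamond-shaped flow: two distinct paths connect "source" to "sink".
flow: Graph = {
    "source": ["left", "right"],
    "left": ["sink"],
    "right": ["sink"],
    "sink": [],
}

for path in simple_paths(flow, "source", "sink"):
    print(" -> ".join(path))
```

A deadline declared only as "from source to sink" would match both paths here, so one late message could trigger two notifications. Declaring the deadline on exactly one of these paths removes the duplicate.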
Intended outcome
Deadlines would be declared in a single section of the YAML description of a data flow.
The format would be as follows:
Caveats
Measuring a deadline
The first caveat has to do with the moment at which Zenoh Flow starts measuring a deadline. This moment depends on where the path of the deadline starts: if it starts at the input of a node, measuring begins the moment the message is first processed by the Input Rules; if it starts at the output of a node, measuring begins the moment the message is sent on the output (hence, after the Output Rules).
Deadline notifications
The second caveat has to do with where the notification of a deadline miss is sent. This depends on where the path of a deadline ends: if it is at the input of a node the notification is sent to the Input Rules; if it is at the output then it is sent to the Output Rules. In both cases, this notification is sent as a Control Message.
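The two caveats above, together with the "local" case from the summary, can be written down as a small lookup. This is only a sketch of the rules as stated in this RFC: `Port`, `Rules`, `Endpoint` and the three functions are hypothetical names, not Zenoh Flow types.

```python
from dataclasses import dataclass
from enum import Enum


class Port(Enum):
    INPUT = "input"
    OUTPUT = "output"


class Rules(Enum):
    INPUT_RULES = "Input Rules"
    OUTPUT_RULES = "Output Rules"


@dataclass(frozen=True)
class Endpoint:
    """One end of a deadline: a node plus the kind of port the path touches."""
    node: str
    port: Port


def measurement_starts(start: Endpoint) -> str:
    """Describe the moment at which measuring begins, given where the path starts."""
    if start.port is Port.INPUT:
        return f"when {start.node} first processes the message in its Input Rules"
    return f"when {start.node} sends the message on its output, after its Output Rules"


def notification_target(end: Endpoint) -> Rules:
    """Pick which rules receive the deadline-miss Control Message."""
    return Rules.INPUT_RULES if end.port is Port.INPUT else Rules.OUTPUT_RULES


def is_local(start: Endpoint, end: Endpoint) -> bool:
    """A deadline is local when both of its ends belong to the same node."""
    return start.node == end.node


start, end = Endpoint("A", Port.INPUT), Endpoint("A", Port.OUTPUT)
print(measurement_starts(start))
print(notification_target(end).value)
print(is_local(start, end))
```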
Edge-cases
Additionally, there are some edge-cases that we cannot prevent and that could lead to somewhat unexpected notifications.
Branches
For instance, in the following scenario, if a strict deadline is violated at A and the end result was supposed to go to node B (something that Zenoh Flow cannot predict), node C will still receive a notification that the deadline was violated.
Nested deadlines
The behaviour in case of nested deadlines is not necessarily straightforward: if an outer strict deadline is violated, inner deadlines will not receive a notification. If an inner deadline is violated, the other deadlines are still checked, as nodes have the possibility to provide default values (through Input & Output Rules). In the latter case, if no value is produced after a deadline miss (i.e. no data is produced on an output tied to any of the remaining deadlines), the other deadlines are dropped and no other notifications are sent to downstream operators.
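One possible reading of these rules, sketched as a pure function. It assumes that deadlines are listed from outermost to innermost, that "other deadlines" means the deadlines enclosing the violated one, and that a single flag says whether a default value was produced after the miss. `after_miss` and its parameters are hypothetical names, and the actual runtime behaviour may differ.

```python
from typing import Dict, List


def after_miss(nesting: List[str], violated: str, default_produced: bool) -> Dict[str, List[str]]:
    """Classify nested deadlines once `violated` has been missed.

    nesting: deadline names ordered from outermost to innermost (assumption).
    """
    index = nesting.index(violated)
    enclosing = nesting[:index]   # deadlines that contain the violated one
    nested = nesting[index + 1:]  # deadlines contained in the violated one

    if default_produced:
        # A default value keeps flowing, so enclosing deadlines can still be checked.
        still_checked, dropped = enclosing, nested
    else:
        # Nothing was produced: every remaining deadline is dropped silently.
        still_checked, dropped = [], enclosing + nested

    return {"notified": [violated], "still_checked": still_checked, "dropped": dropped}


print(after_miss(["outer", "inner"], "outer", default_produced=True))
print(after_miss(["outer", "inner"], "inner", default_produced=True))
print(after_miss(["outer", "inner"], "inner", default_produced=False))
```

In all three calls only the violated deadline is notified; the calls differ in whether the remaining deadlines keep being checked or are dropped.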