Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add docs best effort responses #3865

Open
wants to merge 22 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
22 commits
Select commit Hold shift + click to select a range
f61fe5a
WIP: adjust the docs to best-effort responses
oggy-dfin Nov 29, 2024
5f89f7e
Adjust the safe retries doc
oggy-dfin Dec 3, 2024
44806de
Adjust security best practices
oggy-dfin Dec 5, 2024
410e042
Improve the terminology in the message execution properties doc
oggy-dfin Dec 6, 2024
2916a32
Revamp the overview doc for inter-canister calls
oggy-dfin Dec 6, 2024
0613d32
Fix some typos
oggy-dfin Dec 6, 2024
d344cd9
Some more improvements
oggy-dfin Dec 10, 2024
ddf5818
Apply suggestions from code review
oggy-dfin Dec 11, 2024
e213314
Address David's and Jessie's comments
oggy-dfin Dec 12, 2024
7d82156
Update docs/references/message-execution-properties.mdx
oggy-dfin Dec 12, 2024
d545ab8
Update docs/references/message-execution-properties.mdx
oggy-dfin Dec 12, 2024
0132c93
Update docs/references/message-execution-properties.mdx
oggy-dfin Dec 12, 2024
9eb0ce2
Update docs/references/message-execution-properties.mdx
oggy-dfin Dec 12, 2024
7f4f09b
Update docs/references/message-execution-properties.mdx
oggy-dfin Dec 12, 2024
d626bb4
Address Alin's comments (WIP)
oggy-dfin Dec 12, 2024
7da5a43
Address Alin's comments
oggy-dfin Dec 12, 2024
dacf8bc
Minor improvements
oggy-dfin Dec 12, 2024
7a4f424
Update docs/developer-docs/smart-contracts/advanced-features/async-co…
oggy-dfin Dec 13, 2024
2cd1f77
Andy's comments
oggy-dfin Dec 23, 2024
bf6cb13
Merge branch 'master' into oggy/best-effort-responses
oggy-dfin Dec 23, 2024
2ad10b6
Improve the messaging properties doc a bit
oggy-dfin Dec 23, 2024
f7f7e1a
Alin's comment on users vs applications
oggy-dfin Dec 23, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -309,29 +309,33 @@ Finally, note that the same guard can be used in several methods to restrict par

### Security concern

As stated by the Property 6 above, inter-canister calls can fail in which case they result in a **reject**. See [reject codes](/docs/current/references/ic-interface-spec#reject-codes) for more detail. The caller must correctly deal with the reject cases, as they can happen in normal operation, because of insufficient cycles on the sender or receiver side, or because some data structures like message queues are full.
As stated by the Property 6 above, inter-canister calls can fail in which case they result in a **reject**. See [reject codes](/docs/current/references/ic-interface-spec#reject-codes) for more detail. The caller must correctly deal with the reject cases, as they can happen in normal operation, because of insufficient cycles on the sender or receiver side, or even for reasons outside of the sender's or receiver's control, like the system (Internet Computer) being under heavy load (e.g., message queues becoming full).

Not handling the error cases correctly is risky: For example, if a ledger transfer results in an error, the callback dealing with that error must interpret it correctly. That is, it must be interpreted as "the transfer did not happen".
Not handling the reject cases correctly is risky: For example, if a ledger transfer results in a reject, the callback dealing with that error must interpret it correctly. That is, it should be interpreted as "the transfer did not happen", unless:

1. the call was issued as a best-effort response call, and the system responded with a `SYS_UNKNOWN` reject code. In this case, the caller cannot be a priori sure whether the call took effect or not.
2. the system responded with a `CANISTER_ERROR` reject code. This indicates a bug in the ledger canister. In this case, it is still possible that the call had a partial effect on the ledger canister.
3. the system responded with a `CANISTER_REJECT` reject code. This means that the call was explicitly rejected by the ledger canister. Normally, this indicates that the transfer didn't happen, but this depends on the ledger canister. The ICP ledger canister for example never rejects calls explicitly.

### Recommendation

When making inter-canister calls, always handle the error cases (rejects) correctly. These errors imply that the message has not been successfully executed.
When making inter-canister calls, always handle the error cases (rejects) correctly. Other than the `SYS_UNKNOWN` error code, these errors imply that the message has not been successfully executed. For `SYS_UNKNOWN`, follow the guidelines in the [safe retries & idempotency](/docs/current/developer-docs/smart-contracts/best-practices/idempotency) document to handle this scenario correctly.

## Be aware of the risks involved in calling untrustworthy canisters

### Security concern

- If inter-canister calls are made to potentially malicious canisters, this can lead to DoS issues or there could be issues related to candid decoding. Also, the data returned from a canister call could be assumed to be trustworthy when it is not.

- When another canister is called with a callback being registered, and the receiver stalls the response indefinitely by not responding, the result would be a DoS. Additionally, that canister can no longer be upgraded if it has callbacks registered. Recovery would require wiping the state of the canister by reinstalling it. Note that even a trustworthy canister could have a bug causing it to stall indefinitely. However, such a bug seems rather unlikely to occur.
- When a canister `C1` calls a canister `C2` using a guaranteed-response inter-canister call, and `C2` stalls the response indefinitely by not responding, the result would be a DoS on `C1`. Additionally, since the call registers a callback on `C1`, `C1` can no longer be stopped because of the outstanding callback, and thus can no longer be cleanly upgraded. Recovery would require wiping the state of the canister by reinstalling it. Note that even if `C2` was trustworthy it could still stall indefinitely. This could happen due to a bug in`C2` (which is rather unlikely to occur). But other causes could be a stall of the subnet hosting `C2` (assuming that `C1` and `C2` are on different subnets), or `C2` making a downstream call to an untrusted canister `C3`.

- In summary, this can DoS a canister, consume an excessive amount of resources, or lead to logic bugs if the behavior of the canister depends on the inter-canister call response.

### Recommendation

- Making inter-canister calls to trustworthy canisters is safe, except for the rather unlikely case that there is a bug in the callee that makes it stall forever.
- Making inter-canister calls to trustworthy canisters is safe, except for the rather unlikely case that there is a bug in the callee or its subnet that makes it stall forever.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.

I would leave out "a bug in [...] its subnet", unless you mean "a temporarily stalled subnet".


- Interacting with untrustworthy canisters is still possible by using a state-free proxy canister which could easily be re-installed if it is attacked as described above and is stuck. When the proxy is reinstalled, the caller obtains an error response to the open calls.
- Interacting with untrustworthy canisters is still possible by using best-effort response calls, which cannot be stalled by the recipient. In particular, when using calls that do not change the callee's state (e.g., just fetching information), prefer using best-effort response calls. Another option is using guaranteed response calls through a state-free proxy canister which could easily be re-installed if it is attacked as described above and is stuck. When the proxy is reinstalled, the caller obtains an error response to the open calls.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point the proxy canister is a poor man's best-effort call. Which is I guess what it always was. I don't know whether it's still worth mentioning as an option.


- Sanitize data returned from inter-canister calls.

Expand All @@ -348,7 +352,7 @@ Loops in the call graph (e.g. canister A calling B, B calling C, C calling A) ma

### Recommendation

- Avoid such loops.
- Avoid such loops, or rely on best-effort response calls instead, since these provide timeouts.

- For more information, see [current limitations of the Internet Computer](https://wiki.internetcomputer.org/wiki/Current_limitations_of_the_Internet_Computer), section "Loops in call graphs".

Loading
Loading