Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate all VaultsSecrets handlers to RawHandler #822

Open
4 tasks
aryanjassal opened this issue Oct 8, 2024 · 7 comments
Open
4 tasks

Migrate all VaultsSecrets handlers to RawHandler #822

aryanjassal opened this issue Oct 8, 2024 · 7 comments
Assignees
Labels
development Standard development

Comments

@aryanjassal
Copy link
Member

aryanjassal commented Oct 8, 2024

Specification

Most, if not all, VaultsSecrets handlers transfer binary data. Doing so using plain JSON isn't the most efficient, as the required bandwidth can be as high as 200% of actual data size, making regular JSON extremely inefficient for transferring binary data.

This is solved by the RawHandler, which is able to stream raw bytes of data directly, bypassing the large overhead. However, it also has many issues, like not supporting error serialisation by default. As such, any errors thrown within the context will hit the transport layer, causing a read 0 error.

To solve this, each handler needs to implement its own method of error serialisation and deserialisation. The handler side needs to serialise the errors so they are passed through as binary data, and the client side needs to deserialise the data back into the original error. Serialising and deserialising all possible errors is not feasible, and should not be done either. Only create serialisation and deserialisation handlers for relevant errors that are expected, and raise a generic error for all other, unexpected errors.

Additional context

Tasks

  1. Migrate the following to use RawHandlers
    • VaultsSecretsCat
    • VaultsSecretsWrite
    • VaultsSecretsCopy
    • VaultsSecretsMove
@aryanjassal aryanjassal added the development Standard development label Oct 8, 2024
@aryanjassal aryanjassal self-assigned this Oct 8, 2024
Copy link

linear bot commented Oct 8, 2024

@CMCDragonkai
Copy link
Member

This needs to be weighed up against making the RPC too unique to our usecases and making it more difficult to use in other clients.

Copy link
Member Author

Actually, not all VaultsSecrets handlers need to be converted. Only the ones which deal with actual binary data like VaultsSecretsCat. I don't think a raw stream is required to file names, as it would be much rarer for filenames to have escaped characters than binary file contents.

This would be crucial, however, for VaultsSecretsCopy and VaultsSecretsMove, as both the handlers will send a tar file storing the entire file tree. This will have a lot of unprintable binary, and escaping it all for server streams will lead to a massively blown-up size.

Copy link
Member

Raw stream just means a json message initially and then just raw binary data afterwards. We don't have a trailing message protocol so it's easy. This also makes it possible to protocol matching, like enabling the ability to raw pipe binary data from the fs straight into the stream after the initial json message.

Copy link
Member

This makes it far more efficient too. It was the original intention!

Copy link
Member Author

Another point to note would be that error serialisation would be restricted. Normally, the regular streams would throw errors normally, and they would get caught by the RPC internals, get serialised, and sent over the RPC as JSON. Then, the other side will convert it back. For raw streams, however, if an error is thrown from inside the RPC, then it will hit the transport layer and bring down the entire connection with a read 0 message.

To deal with this, we would need to do manual error serialisation/deserialisation, which is fine but it would restrict what errors we can send over. If we restrict the errors we can send over, then we are restricting the visibility of bugs. If an error happens on the handler, then instead of getting the error, we would get a UnknownError or something. If we implement a robust error handing, then that is almost what we already have in Polykey, and it would be repetition of effort to repeat it. Perhaps the existing implementation for codeToError and errorToCode might be reusable as a drop-in method, which would simplify the whole process, but if it isn't, then that would be a major issue for the usability and boilerplate needed for the specific RPC handlers.

@tegefaulkes
Copy link
Contributor

If you want to send errors over the binary stream then we need to work out a protocol that can multiplex these errors over the stream. It would have to be implemented by a re-usable transform stream that handles it for us.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
development Standard development
Development

No branches or pull requests

3 participants