Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RPC message parser can support arbitrary whitespace such as '\n' within and between JSON RPC messages #1

Open
2 tasks
tegefaulkes opened this issue Feb 16, 2023 · 17 comments · Fixed by #8
Assignees
Labels
development Standard development r&d:polykey:supporting activity Supporting core activity

Comments

@tegefaulkes
Copy link
Contributor

tegefaulkes commented Feb 16, 2023

Specification

Currently the agnostic RPC message parser separates the messages based on the}{ boundry., which makes it difficult to read messages since they are in a single line.

The objective is to improve readability by formatting a message with newlines and whitespaces, and ensuring that the parser is able to handle these characters.

// OLD
{some: 'message'}{other: 'message'}...

// new
{some: 'message'}
{other: 'message'}
...

This should be possible by adjusting the separator parameter which should be able to discard the newline and other whitespace characters when constructing the JsonStreamParser. (Constructing the stream)

Additional context

Related MatrixAI/Polykey#249
Related MatrixAI/Polykey#498

Tasks

  • 1. Format the message stream to be human-readable by incorporating newlines and whitespace.
  • 2. Modify parser to discard ' ', \n, \t, \r, \f, \v, \u200B, \u00A0.
@tegefaulkes tegefaulkes added the development Standard development label Feb 16, 2023
@CMCDragonkai
Copy link
Member

Arbitrary newlines and spaces should be possible.

@CMCDragonkai
Copy link
Member

Any whitespace that could exist... can be automatically discarded and should not hit the buffer limit.

@CMCDragonkai
Copy link
Member

Might need to just check what jsonstream parser supports. It could be limited to only a single character delimiter or no delimiters, if it can take in variable length delimitation of arbitrary whitespace, that would also be useful.

@CMCDragonkai CMCDragonkai added r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management r&d:polykey:supporting activity Supporting core activity and removed r&d:polykey:core activity 1 Secret Vault Sharing and Secret History Management labels Jul 9, 2023
@CMCDragonkai CMCDragonkai changed the title Message parser '\n' delimitation RPC message parser '\n' delimitation for JSON RPC messages Aug 1, 2023
@CMCDragonkai
Copy link
Member

Regarding websocket transport, each websocket frame is already a separate message frame.

Therefore, when we are translating websocket connections to web streams, we are simply concatenating the messages in the stream.

Due to being transport agnostic, the stream has no idea that the underlying transport, in this case websocket is already doing message framing. It just appears as a continuous byte stream to the RPC handler.

This issue is therefore about enabling the ability to parse optional unlimited whitespace in between complete JSON RPC messages.

In fact... it should be possible to parse unlimited whitespace anywhere that the JSON spec allows. This would depend on the jsonstream parser and what it is capable of doing.

Note that after the HTTP handshake, every websocket frame is already binary on the wire-level. It's not a text protocol like HTTP 1.1.

This means even if \n were to exist possibly due to something like conn.write(jsonMessage + '\n');, it would not actually aid in debugging the client service. This is because if you were to use wireshark or something similar, you would still have to decode the websocket binary frames first and then get to the payload which you would have to parse.

The addition of \n is there not necessary when using websocket as a transport for the RPC which is the case for client service.

However for quic as the transport, there's no equivalent to the websocket frame.

Anyway... the problem is 2 things:

  1. Enabling \n isn't really going to increase the ability to debug RPC messages because of the underlying transport. At the end of the day, something is going to be looking at the transport protocol and having to stitch together complete JSON messages.
  2. However being able to accept arbitrary whitespace including newlines would make our RPC system more robust to third party JSON RPC implementations... it would make it less strict in this sense.

It is important that if whitespace were to exist inside the JSON stream, that these whitespace characters are dropped to avoid a build up of memory. If they are in fact dropped, they should not count towards our buffer limit. This needs to be verified.

@CMCDragonkai CMCDragonkai changed the title RPC message parser '\n' delimitation for JSON RPC messages RPC message parser can support arbitrary whitespace such as '\n' delimitation for JSON RPC messages Aug 1, 2023
@CMCDragonkai CMCDragonkai changed the title RPC message parser can support arbitrary whitespace such as '\n' delimitation for JSON RPC messages RPC message parser can support arbitrary whitespace such as '\n' within and between JSON RPC messages Aug 1, 2023
@CMCDragonkai
Copy link
Member

Going to move this to js-rpc as it would be done over there.

@CMCDragonkai CMCDragonkai transferred this issue from MatrixAI/Polykey Aug 11, 2023
@addievo addievo closed this as completed Sep 12, 2023
@CMCDragonkai
Copy link
Member

Hey @addievo don't close issues directly. Only when a PR is merged is the issue auto-closed.

@CMCDragonkai CMCDragonkai reopened this Sep 12, 2023
@CMCDragonkai
Copy link
Member

Since the PR hasn't been merged. You need to say Fixes #1 in the list. No need for checkboxes. The template was designed to be good enough.

@addievo addievo pinned this issue Sep 13, 2023
@tegefaulkes
Copy link
Contributor Author

The core goal of this issue is to allow the stream JSON parser to allow for arbitrary white-space for it's input.

What the binaryToJsonMessageStream middleware does is takes a stringified JSON stream and converts it to JSON objects. Right now the input stream is expected to be separated by nothing. That is the messages are formatted with {...message}{...message}.

We want to allow for arbitrary white space in the stream when separating the messages. So any combination of white space should be valid on the stream. For example the following should be valid

// Normal separation
{...message}{...message}{...message}

// Spaces
{...message} {...message}    {...message}

// New lines
{...message}
{...message}
{...message}

// Any combination
{...message}                                  {...message}
                       {...message}
{...message}

Note that this is on the input to the parser and stream transform.

This is to allow flexibility with 3rd parties using the RPC API. I for example could connect to the rpc through netcat and write my own hand crafted JSONRPC messages that are separated by newlines.

addievo added a commit that referenced this issue Sep 15, 2023
[ci skip]
@addievo
Copy link
Contributor

addievo commented Sep 20, 2023

Whitespace removal - possible change required in jsonstreamparser. See juanjoDiaz/streamparser-json#35 - I suspect this requires going into the source code and modifying the tokeniser so it tokenises a whitespace token based on an array of candidate characters. Extend the separator option to have an array of potential strings instead of a single string like ['\n', '\t', '\r']. Alternatively, allow separator to instead be a regular expression. Remember that the tokeniser should be capable of greedily eating up all the separators into 1 token.

addievo added a commit that referenced this issue Sep 26, 2023
callers and handlers are now refactored

* WIP - Newline now works, refers issue #1

node v20 fix

feat: handlers implementations are now abstract arrow functions

* Fixes #5

[ci skip]

* resolves issue 5, makes RPC handlers abstract arrow function properties

feat: rename to uppercase

[ci skip]

fix: handler export fix

[ci skip]

fix: tsconf from quic

[ci skip]

fix: dependencies (js quic), events and errors versions, changing to relative imports, jest dev dependency, js-quic tsconfig

[ci skip]

fix: tests imports, using @

[ci skip]

chore: removed sysexits

chore: fix default exports for callers and handlers
Fixed index for handlers

fix: remove @matrixai/id

fix: remove @matrixai/id and ix

chore : diagram

[ci skip]

chore : lintfix
fix: errors now extend AbstractError

[ci skip]

fix: undoing fix #1

[ci skip]

replacd errorCode with just code, references std error codes from rpc spec

feat: events based createDestroy

[ci skip]

chore: img format fix

[ci skip]

chore: img in README.md

[ci skip]

feat: allows the user to pass in a generator function if the user wishes to specify a particular id

[ci skip]

fix: fixes #7

* Removes graceTimer and related jests

chore: idGen name change. idGen parameter in creation and constructor. No longer optional. Only defaulted in one place.

wip: added idgen to jests, was missing.

[ci skip]

wip: reimported ix, since a few tests rely on it.
removed, matrixai/id

wip: jests for #4
removed, matrixai/id

wip: * Implements custom RPC Error codes.
     * Fixed jest for concurrent timeouts
     * All errors now have a cause
     * All errors now use custom error codes.

wip: *Client uses ctx timer now

wip: *Jests to test concurrency

wip: *custom RPC based errors for RPC Client, now all errors have a cause and an error message

WIP: * Refactor out sensitiveReplacer

WIP: * Refactor out sensitiveReplacer

WIP: * Update to latest async init and events
* set default timeout to Infinity
* jest to check server and client with infinite timeout
* fixing jests which broke after changing default timeout to infinity

WIP: f1x #4

WIP: f1x #11

f1x: parameterize toError, fromError and replacer

wip: tofrom

fix: parameterize toError, fromError and replacer

fix: Makes concurrent jests non deterministic

* Related #4

fix: parameterize replacer toError and fromError, change fromError to return JSONValue, stringify fromError usages

* Related #10

fix: Converted global state for fromError to handle it internally.

*Related: #10
Reviewed-by: @tegefaulkes
[ci skip]

chore: Jests for fromError and toError, and using a custom replacer.

related: #10

[ci skip]
@addievo addievo reopened this Sep 26, 2023
@CMCDragonkai
Copy link
Member

Will depend on juanjoDiaz/streamparser-json#35. Putting this into the to do to revisit.

@addievo
Copy link
Contributor

addievo commented Oct 25, 2023

@CMCDragonkai
Copy link
Member

juanjoDiaz/streamparser-json#36

@addievo what does that have to do with this problem?

@juanjoDiaz
Copy link

In case you don't follow the streamparser-json repo, it turns our that using random whitespaces as separators was always allowed by simply setting the separator to and empty string ''

@tegefaulkes tegefaulkes removed their assignment Jul 31, 2024
@CMCDragonkai
Copy link
Member

@amydevs can you verify if this is is already done as per the comment by @juanjoDiaz?

@CMCDragonkai
Copy link
Member

Upstream says it is supported already. @tegefaulkes @aryanjassal can check if this can be a quick fix.

@CMCDragonkai
Copy link
Member

Make sure to close this as you're working on it @aryanjassal.

@aryanjassal
Copy link
Member

I don't think I've gotten around to working on this issue yet, but I will update it when I start.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
development Standard development r&d:polykey:supporting activity Supporting core activity
Development

Successfully merging a pull request may close this issue.

6 participants