How do you represent bencoded strings that contain a <hex> tag to avoid ambiguity?
#7
Comments
Proposal 1
Maybe we can always include some tags. For byte sequences containing valid UTF-8:
"<utf8>spam</utf8>"
And for non-UTF-8 sequences:
"<bytes>fffe</bytes>"
The encoded value |
Proposal 2
For byte sequences containing valid UTF-8:
"<string>spam</string>"
And for non-UTF-8 sequences:
"<hex>fffe</hex>"
This would be partially compatible with the other implementation. |
Proposal 3
Maybe we can even simplify the metadata by using just a prefix instead of an HTML-style tag:
"string:spam"
And for non-UTF-8 sequences:
"hex:fffe" |
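To compare the three proposals side by side, here is a minimal Python sketch. The function names (`is_utf8`, `proposal_1`, `proposal_2`, `proposal_3`) are made up for illustration and are not part of any existing implementation; the sketch only assumes that a bencoded string is available as raw `bytes`.

```python
def is_utf8(data: bytes) -> bool:
    """Return True if the byte sequence is valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def proposal_1(data: bytes) -> str:
    # Hypothetical helper: always tag, <utf8> for text and <bytes> for the rest.
    if is_utf8(data):
        return f"<utf8>{data.decode('utf-8')}</utf8>"
    return f"<bytes>{data.hex()}</bytes>"


def proposal_2(data: bytes) -> str:
    # Hypothetical helper: always tag, <string> for text and <hex> for the rest.
    if is_utf8(data):
        return f"<string>{data.decode('utf-8')}</string>"
    return f"<hex>{data.hex()}</hex>"


def proposal_3(data: bytes) -> str:
    # Hypothetical helper: a plain prefix instead of an HTML-style tag.
    if is_utf8(data):
        return f"string:{data.decode('utf-8')}"
    return f"hex:{data.hex()}"


for convert in (proposal_1, proposal_2, proposal_3):
    print(convert(b"spam"), convert(b"\xff\xfe"))
```

Because every value carries a tag or prefix, a bencoded string whose text happens to be `<hex>fffe</hex>` comes out wrapped (for example as `<string><hex>fffe</hex></string>` under proposal 2) and no longer collides with the bytes `ff fe`.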
Hey, @da2ce7, can you add an example of what we discussed today? I mean one example like this:
For the problem with escaped chars you mentioned in the meeting. I can't come up with one example. |
Hi @da2ce7, if you have a concrete example for this ☝🏼 that would be awesome! Thanks. |
The problem is that it is a lossy transformation. There are multiple bencode inputs that can produce identical JSON results. The issue is that if we escape XML-style tags, then there are two inputs that produce the same output. The So we don't know if the bencode input was "pre-escaped" or not. |
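A rough Python sketch of the kind of ambiguity described here, under an assumed escaping rule: `<` is prefixed with a backslash unless it already has one. Both the rule and the name `escape_unless_escaped` are hypothetical, chosen only to show how a "pre-escaped" input can collide with an unescaped one.

```python
import re


def escape_unless_escaped(text: str) -> str:
    # Hypothetical rule: put a backslash before '<' only when the '<'
    # is not already preceded by a backslash.
    return re.sub(r"(?<!\\)<", lambda match: "\\<", text)


plain = "<hex>"          # input without any escaping
pre_escaped = "\\<hex>"  # input that already contains a backslash before '<'

# Two different inputs, one output: the transformation is lossy.
print(escape_unless_escaped(plain) == escape_unless_escaped(pre_escaped))
```

Given only the output, there is no way to tell which of the two inputs was the original.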
Hi @da2ce7, the forward slash does not need to be escaped in JSON. The two examples you mention
For the backslash (hex
|
I think that you are correct. If we always insert the escaping character, then the encoding will be reversible. |
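A minimal sketch of why always escaping is reversible, assuming a made-up scheme in which the backslash itself and `<` are both always prefixed with a backslash. The `escape` and `unescape` helpers are hypothetical and only illustrate the idea.

```python
def escape(text: str) -> str:
    # Escape the escape character first, then the character we want to protect.
    return text.replace("\\", "\\\\").replace("<", "\\<")


def unescape(text: str) -> str:
    # A backslash always means "take the next character literally".
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            out.append(text[i + 1])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


samples = ["<hex>", "\\<hex>", "a\\", "plain"]
for sample in samples:
    assert unescape(escape(sample)) == sample

# Distinct inputs now give distinct outputs.
print(escape("<hex>") != escape("\\<hex>"))
```

Since the backslash is escaped unconditionally, an input that already contains `\<` gets a different encoding from one that contains a bare `<`, so decoding recovers the original.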
Hey @da2ce7, which proposal do you prefer? I prefer proposal 2
Because:
"<hex>ff</hex><metadata>...</metadata>" as long as we include With proposal 3, we can only add new prefixes. |
@da2ce7 has just told me that proposal 2 is fine. |
After merging this change, we have to update the draft TEP. |
And also bump the version to |
Relates to: Chocobo1/bencode_online#3
How do you represent bencoded strings that contain a <hex> tag, so that they are not ambiguous with the tags introduced for non-UTF-8 bencoded strings?
Submitted on reddit by Icarium-Lifestealer
UPDATE
Example, two bencoded values producing the same output:
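As a hypothetical illustration in Python: assume a converter that emits valid UTF-8 strings unchanged and wraps anything else in `<hex>` tags. The function name `naive_to_json_string` and the two sample values are assumptions made for this sketch.

```python
def naive_to_json_string(data: bytes) -> str:
    # Assumed behaviour: valid UTF-8 passes through untouched,
    # anything else becomes a hex dump inside <hex> tags.
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return f"<hex>{data.hex()}</hex>"


raw_bytes = b"\xff\xfe"            # bencoded as 2:<ff><fe>, not valid UTF-8
literal_text = b"<hex>fffe</hex>"  # bencoded as 15:<hex>fffe</hex>, valid UTF-8

# Different bencoded values, identical JSON string.
print(naive_to_json_string(raw_bytes) == naive_to_json_string(literal_text))
```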
List of special characters used in JSON that need to be escaped in string values:
\b  Backspace (ASCII code 08)
\f  Form feed (ASCII code 0C)
\n  New line
\r  Carriage return
\t  Tab
\"  Double quote
\\  Backslash character
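A quick way to check this list is Python's standard `json` module, which applies exactly these short escapes when serializing a string. The helper name `json_escape` is made up for this sketch.

```python
import json


def json_escape(text: str) -> str:
    # Serialize as a JSON string and drop the surrounding double quotes.
    return json.dumps(text)[1:-1]


# Raw character -> the escape sequence expected in the JSON output.
special = {
    "\b": "\\b",
    "\f": "\\f",
    "\n": "\\n",
    "\r": "\\r",
    "\t": "\\t",
    '"': '\\"',
    "\\": "\\\\",
}

for raw, expected in special.items():
    assert json_escape(raw) == expected
    # Decoding the JSON string gives back the original character.
    assert json.loads(json.dumps(raw)) == raw

# The forward slash is left as-is.
assert json_escape("/") == "/"
```

Note that the backslash is always escaped, so a string that already contains `\<` serializes differently from one that contains a bare `<`.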