Issues with whitespace definition #361
Comments
I came across this difference when looking at why the The conclusion of this exercise is that apart from Tabulating the characters involved:
If we agree to use the Unicode
<production name="whitespace">
  <alt>
    <character set="White_Space"/>
    <character set="FS"/>
    <character set="GS"/>
    <character set="RS"/>
    <character set="US"/>
  </alt>
</production> |
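To make the proposed production concrete, here is a minimal Java sketch of a predicate that accepts the same characters. The class and method names are hypothetical and not part of openCypher, and it assumes the regex binary property \p{IsWhite_Space} from java.util.regex is an acceptable stand-in for Unicode White_Space (its coverage follows the Unicode version bundled with the JDK in use).

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of openCypher: mirrors the proposed
// production as Unicode White_Space plus FS, GS, RS and US.
final class ProposedWhitespace {
    // java.util.regex exposes the Unicode binary property as IsWhite_Space.
    private static final Pattern WHITE_SPACE = Pattern.compile("\\p{IsWhite_Space}");

    // True if the JDK's Unicode data marks the code point as White_Space.
    static boolean isUnicodeWhiteSpace(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return WHITE_SPACE.matcher(s).matches();
    }

    // True if the code point would match the proposed whitespace production.
    static boolean matches(int codePoint) {
        // U+001C..U+001F are FS, GS, RS and US.
        boolean isSeparatorControl = codePoint >= 0x1C && codePoint <= 0x1F;
        return isSeparatorControl || isUnicodeWhiteSpace(codePoint);
    }

    public static void main(String[] args) {
        int[] samples = {0x0020, 0x0085, 0x001C, 0x0041};
        for (int cp : samples) {
            System.out.printf("U+%04X -> %b%n", cp, matches(cp));
        }
    }
}
```

In this sketch FS, GS, RS and US are accepted only through the explicit range check, which is why the production lists them separately from White_Space.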
Looking at commit history, it appears as if at some point Java's |
I think it makes good sense to stick with Unicode here. Do we even need the special additions of FS, GS, RS and US? |
The They are likely to not occur in Cypher queries. I'd say it's harmless to either include or exclude them. |
I agree. I would lean towards going with Unicode rather than Java (and abandon Cypher's implementation history), but I don't feel strongly about it. I wonder if either of the two alternatives makes a difference for implementability? I doubt it. |
See #530 |
Neither Java's Character.isWhitespace(int), nor Character.isSpaceChar(int), nor the Unicode [:White_Space:] specification treats \u180E (MONGOLIAN VOWEL SEPARATOR) as whitespace. Yet the openCypher grammar considers this a whitespace character. Why?
openCypher/grammar/basic-grammar.xml
Line 781 in 346aa0d
Furthermore, the definition of whitespace in the openCypher grammar does not consider \u0085 (NEXT LINE) to be whitespace, although it is part of the Unicode [:White_Space:] specification. Perhaps that should be added? (It is not considered whitespace by either Character.isWhitespace(int) or Character.isSpaceChar(int), which explains why it is not in the grammar.)
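The two Java predicates mentioned above can be checked directly against both code points. The following is a small sketch with a hypothetical class name. The result for \u180E may differ between JDK releases, since each release bundles its own version of the Unicode data, so no expected output is given.

```java
// Hypothetical probe, not taken from the openCypher code base.
final class WhitespaceProbe {
    // Formats how the running JDK classifies one code point.
    static String describe(int codePoint, String name) {
        return String.format("U+%04X %s: isWhitespace=%b, isSpaceChar=%b",
                codePoint, name,
                Character.isWhitespace(codePoint),
                Character.isSpaceChar(codePoint));
    }

    public static void main(String[] args) {
        System.out.println(describe(0x180E, "MONGOLIAN VOWEL SEPARATOR"));
        System.out.println(describe(0x0085, "NEXT LINE"));
    }
}
```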