-
Notifications
You must be signed in to change notification settings - Fork 238
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
model: Add wrapWithBacktickFence, to use with quote-and-reply #179
Conversation
12a1030
to
55a689f
Compare
lib/model/compose.dart
Outdated
final matches = _openingBacktickFenceRegex.allMatches(content); | ||
int result = 3; | ||
for (final match in matches) { | ||
result = max(result, match[1]!.length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
match[1]!.length + 1
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eep, yep, thanks for the catch! An example of why we'll want this code to have tests before we merge it. 😅
lib/model/compose.dart
Outdated
for (int i = 0; i < fenceLength; i++) { | ||
resultBuffer.write('`'); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FTR Dart suports string repetition with '`' * fenceLength
.
test/model/compose_test.dart
Outdated
// TODO: Find a nice compact way to write test cases. :-) The tricky bit is | ||
// that almost all of them will involve two multiline strings that need to | ||
// be easy to read (input and expected output). Can we e.g. read data from | ||
// a separate file in a format that's convenient to maintain? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd start by writing the strings in a way like this:
print('''\
```quote
words
```
''');
Note the backslash after the opening triple-quote — that allows the source to have a newline, so that the string appears in the source with the same alignment as it'd have on its own, without introducing a leading newline in the resulting string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. But in DartPad:
main() {
const stringA = '''\
```quote
words
```
''';
const stringB = '''
```quote
words
```
''';
print(stringA == stringB); // true
print(stringB[0]); // `
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if the opening """ is followed by an optional \
and then a newline, the newline is ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, if I want the ending """ to be set off by a newline that doesn't appear in the source, I'm not sure I can accomplish that with \
:
main() {
const stringA = '''
```quote
words
```
''';
const stringB = '''
```quote
words
```\
''';
print(stringA == stringB); // true
print(stringB.endsWith('\n')); // true
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if the opening """ is followed by an optional
\
and then a newline, the newline is ignored.
Huh indeed! Convenient, then.
Also, if I want the ending """ to be set off by a newline that doesn't appear in the source, I'm not sure I can accomplish that with
\
:
From the examples, I think you mean a newline that does appear in the source but not in the resulting string.
That seems fine — I think we'll generally want these test strings to end with a newline anyway, or just won't care.
If there are some test cases for situations where the lack of a final newline affects things, then you could add a bit of code manipulating the string. For example define a chomp
function like String chomp(String s) => s.replaceFirst(RegExp(r'\n$'), '')
, and write:
checkFoo(chomp('''
```quote
words
```
'''));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the examples, I think you mean a newline that does appear in the source but not in the resulting string.
Ah, yes, thanks.
55a689f
to
6405798
Compare
Thanks for the reviews! Revision pushed, with tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I appreciate the detailed comments and references to the CommonMark spec. Comments below.
test/model/compose_test.dart
Outdated
final expectedWithInfoString = chomp(''' | ||
``````info | ||
``` | ||
hello world | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One change that I think would make it easier to read these tests is to remove the infoString: …
case from almost all of them. Then there could be a couple of tests that are specifically about the infoString
behavior.
That wouldn't sacrifice any meaningful degree of test coverage, I think, because there's very little interaction between the handling of infoString
and the rest of what this wrapWithBacktickFence
function does. And on the other hand, it would make each test case quite a bit shorter and less repetitive, which would help in scanning through them and seeing what kinds of situations are covered.
test/model/compose_test.dart
Outdated
final expected = chomp(''' | ||
``` | ||
// hello world | ||
|
||
~~~~~~ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then I think a further improvement to the test code would be to use a test helper to deduplicate between content
and expected
, probably in all of the test cases. That would be a further big reduction in the repetitiveness of the cases, making it easier to scan through to understand what they're all doing.
The key here is that content
is straightforward to infer from expected
. So the test helper can take expected
as an argument; remove the first and last lines; pass that to wrapWithBacktickFence
; and just check that the result equals the original expected
.
Then each of these test cases will have just a single copy of the string, and a simple function call around it.
test/model/compose_test.dart
Outdated
test('no panic on badly positioned fences', () { | ||
final content = chomp(''' | ||
````javascript | ||
// hello world | ||
``` | ||
asdf | ||
```` | ||
```python | ||
'''); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I wouldn't even call these "bad" in any way.
There's a "javascript" code block containing:
// hello world
```
asdf
followed by a "python" code block that's empty.
That isn't well-formed JS inside the "javascript" block, but the Markdown here is perfectly fine — certainly according to CommonMark and to the Zulip server.
check(wrapWithBacktickFence(content: content, infoString: 'info')).equals(expectedWithInfoString); | ||
}); | ||
|
||
test('backtick fences may be indented up to three spaces', () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this test tests this.
In particular, if I remove that logic:
- const lineStart = r'^ {0,3}';
+ const lineStart = r'^';
the test still passes.
Once you make the test simplifications discussed above, this test can look like:
checkFenceWrap('''
````
```
````
''');
checkFenceWrap('''
````
```
````
''');
checkFenceWrap('''
````
```
````
''');
// but at 4 spaces of indentation it no longer counts:
checkFenceWrap('''
```
```
```
''');
lib/model/compose.dart
Outdated
resultBuffer.write('`' * fenceLength); | ||
return resultBuffer.toString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be writing a newline at the end, too.
That's necessary in order for the string to consist of complete lines, rather than end with a line fragment. And the syntax we're aiming to produce is a multi-line piece of syntax, in the context of Markdown which is quite line-oriented in general.
Concretely, if we take the output of this and it lacks a final newline, and then go on to put anything at all after it without first adding that missing newline, then that subsequent text will not be interpreted as intended, and will corrupt the closing code fence that this function intended. That's because that subsequent text (or the beginning of it) would become part of the last line this function started.
lib/model/compose.dart
Outdated
/// This does not parse [content] to make sure its backtick fences are properly | ||
/// paired and nested. If they aren't, it's probably not clear what | ||
/// backtick-fenced output would be more reasonable anyway -- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way for them to be improperly paired and nested?
I'm not sure there is. I think if you take one Markdown document with some fenced blocks in it, and you shuffle around the order of the fences, you get another perfectly fine Markdown document that just means something else: some blocks may be closed implicitly, and some lines that had been opening fences with info strings are now just part of the text because they can't be closing fences.
More generally, I believe Markdown is a language where there's no such thing as malformed syntax — just syntax that doesn't mean what you intended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CommonMark does not require closing fences to be paired.
The content of the code block consists of all subsequent lines, until a closing code fence of the same type as the code block began with (backticks or tildes), and with at least as many backticks or tildes as the opening code fence.
If the end of the containing block (or document) is reached and no closing code fence has been found, the code block contains all of the lines after the opening code fence until the end of the containing block (or document).
Our ```quote
construct that parses its contents as Markdown is not part of CommonMark, but if it were, the only consistent nesting behavior to specify would be that a closing fence closes the outermost block it can close under these rules. Unremarkably, we don’t currently live up to that consistency (zulip/zulip#17610).
lib/model/compose.dart
Outdated
// Allow "up to three spaces of indentation". | ||
// As of Zulip Server 7.0, we don't comply with that detail: | ||
// https://chat.zulip.org/#narrow/stream/6-frontend/topic/quote-and-reply.20fence.20length/near/1588273 | ||
// and that's a bug. | ||
const lineStart = r'^ {0,3}'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is kind of concerning on the face of it, because it sounds like it's saying that this code's behavior disagrees with the server.
I think just s/and/but/ would already help.
Also helpful would be to be more specific about "we don't comply". I think what's meant is that the server doesn't currently accept those fences. But we're not generating indented fences here, only watching out for them; and it's harmless to make our own fence longer even if the server wouldn't notice the internal fence that we're steering clear of.
6405798
to
93c0d1e
Compare
Thanks for the reviews! Revision pushed. I'm not sure I've found the right comment here: // (A) Why not panic on dangling opening fences
// (ones without a corresponding closing fence)?
// Because the spec allows leaving the closing fence implicit:
// > If the end of the containing block (or document) is reached
// > and no closing code fence has been found,
// > the code block contains all of the lines after the opening code fence
// > until the end of the containing block (or document).
//
// (B) Why not look for dangling closing fences (ones without an opening fence)?
// Because technically there's no such thing:
// they would be indistinguishable from dangling opening fences,
// and parsers (and this function) will treat them that way.
final fenceLength = getUnusedOpeningBacktickFenceLength(content); but I did want to write something for readers who might be confused (like I was when I first read the TypeScript implementation in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I think the tests are quite readable now. Comments below; I think there's still one small behavior change we'll want.
test/model/compose_test.dart
Outdated
final openingFenceLength = firstLineMatch![1]!.length; | ||
assert(RegExp(r'^`{' + openingFenceLength.toString() + r'}$').hasMatch(lines.removeAt(lines.length - 1)), | ||
'test error: closing fence not found in `expected`'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same as saying that that line is equal to the string firstLineMatch![1]
, right?
test/model/compose_test.dart
Outdated
assert(lines.removeAt(lines.length - 1) == '', | ||
'test error: `expected` should end with a newline'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to avoid having things inside assert
that have side effects which the non-assert code is relying on.
It's probably OK in practice here in this test, because when running tests one generally wants assertions on anyway. But there are some cases where one runs test code without assertions, too, like for benchmarks. So one can't even quite draw a bright line like "in test code, always assume assertions run and don't worry about the other case".
test/model/compose_test.dart
Outdated
final lines = expected.split('\n'); | ||
final firstLineMatch = RegExp(r'^(`{3,})[^`]*$').allMatches(lines.removeAt(0)).singleOrNull; | ||
assert(firstLineMatch != null, | ||
'test error: opening fence not found in `expected`'); | ||
assert(lines.removeAt(lines.length - 1) == '', | ||
'test error: `expected` should end with a newline'); | ||
final openingFenceLength = firstLineMatch![1]!.length; | ||
assert(RegExp(r'^`{' + openingFenceLength.toString() + r'}$').hasMatch(lines.removeAt(lines.length - 1)), | ||
'test error: closing fence not found in `expected`'); | ||
final content = lines.join('\n'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's another way to express the same set of allowed expected
and resulting content
:
final lines = expected.split('\n'); | |
final firstLineMatch = RegExp(r'^(`{3,})[^`]*$').allMatches(lines.removeAt(0)).singleOrNull; | |
assert(firstLineMatch != null, | |
'test error: opening fence not found in `expected`'); | |
assert(lines.removeAt(lines.length - 1) == '', | |
'test error: `expected` should end with a newline'); | |
final openingFenceLength = firstLineMatch![1]!.length; | |
assert(RegExp(r'^`{' + openingFenceLength.toString() + r'}$').hasMatch(lines.removeAt(lines.length - 1)), | |
'test error: closing fence not found in `expected`'); | |
final content = lines.join('\n'); | |
final re = RegExp(r'^(`{3,})[^`]*?\n(.*)\n\1\n$', dotAll: true); | |
final content = re.firstMatch(expected)![2]!; |
I think it's probably actually better, though, to simplify away this helper's knowledge of things like backticks and matching fence lengths, and let the code under test speak for itself in that respect. That then looks like this:
final re = RegExp(r'^.*?\n(.*)\n.*\n$', dotAll: true);
final content = re.firstMatch(expected)![1]!;
and the helper's doc can just say it removes the first and last line to get the input.
Then if the test passes something with, say, mismatched fences:
checkFenceWrap('''
hello world
''');
the developer gets output from check
like:
00:01 +0 -1: wrapWithBacktickFence single line with no code blocks [E]
Expected: a String that:
equals '````
hello world
```'
Actual: '```
hello world
```'
Which: differs at offset 3:
````\nhello w ...
```\nhello wo ...
^
package:checks/src/checks.dart 72:9 check.<fn>
package:checks/src/checks.dart 717:12 _TestContext.expect
package:checks/src/extensions/string.dart 108:13 StringChecks.equals
test/model/compose_test.dart 20:78 main.<fn>.checkFenceWrap
test/model/compose_test.dart 24:7 main.<fn>.<fn>
which seems pretty clear, and appropriate — the result of the test is that the code under test did not produce the output that the test said was expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to make content
consist of complete lines as discussed in my other comment, the capturing group just needs to enclose the newline:
final re = RegExp(r'^.*?\n(.*\n).*\n$', dotAll: true);
Then in order to test that the function correctly handles the case of an incomplete final line too, we can add another flag to the helper:
void checkFenceWrap(String expected, {String? infoString, bool chopNewline = false}) {
final re = RegExp(r'^.*?\n(.*\n).*\n$', dotAll: true);
String content = re.firstMatch(expected)![1]!;
if (chopNewline) content = content.substring(0, content.length - 1);
and have a couple of test cases using that.
lib/model/compose.dart
Outdated
resultBuffer.write(content); | ||
resultBuffer.write('\n'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So one thing this function is doing is it's effectively assuming that its argument content
is missing its final newline — that it ends with an incomplete line. If the argument is instead made of complete lines, then this will add a blank line at the end before the closing fence.
If you look back at the spec: https://spec.commonmark.org/0.30/#fenced-code-blocks
and at the HTML output for the examples, the content (enclosed in <code>…</code>
) always consists of complete lines — it's always either empty (so consists of zero lines, all of them complete), or ends with a newline. The only exceptions are examples 121, 138, and 145… which demonstrate syntax that does not make a fenced code block, and instead makes some inline code.
And that's reflected in the spec text: "The content of the code block consists of all subsequent lines", emphasis added.
I think the norm for this function should therefore be that the input consists of complete lines, just as the output will. Probably it's best if it also cheerfully accepts an incomplete final line, and supplies its newline. The implementation then just needs to say:
resultBuffer.write(content); | |
resultBuffer.write('\n'); | |
resultBuffer.write(content); | |
if (!content.endsWith('\n')) { | |
resultBuffer.write('\n'); | |
} |
check(wrapWithBacktickFence(content: content, infoString: infoString)).equals(expected); | ||
} | ||
|
||
test('single line with no code blocks', () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One other set of scenarios that would be good to test: empty content; content consisting of blank lines.
lib/model/compose.dart
Outdated
return RegExp(lineStart + backticks + trailing, multiLine: true); | ||
})(); | ||
|
||
/// The shortest opening backtick fence that's longer than any in [content]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the reasoning here can actually be simplified by removing the word "opening", from this summary and the function's name. We're finding a backtick fence that's longer than any backtick fence, whether opening or closing, in the content.
That's what the code is actually already doing, because the regex matches closing fences as well as opening fences.
It's easier to reason about, because a fence that's an opening fence in the original content in isolation could become a closing fence, if we were to try to enclose it in fences that were too short.
And it's what we need to do:
- We need our opening fence to be long enough that it won't be closed by any fence in the content.
- We need our closing fence to be long enough that it will close any outstanding opening fences in the content.
lib/model/compose.dart
Outdated
// (A) Why not panic on dangling opening fences | ||
// (ones without a corresponding closing fence)? | ||
// Because the spec allows leaving the closing fence implicit: | ||
// > If the end of the containing block (or document) is reached | ||
// > and no closing code fence has been found, | ||
// > the code block contains all of the lines after the opening code fence | ||
// > until the end of the containing block (or document). | ||
// | ||
// (B) Why not look for dangling closing fences (ones without an opening fence)? | ||
// Because technically there's no such thing: | ||
// they would be indistinguishable from dangling opening fences, | ||
// and parsers (and this function) will treat them that way. | ||
final fenceLength = getUnusedOpeningBacktickFenceLength(content); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To your comment above about this comment: I think a good direction to go in commenting on this would be on the lines of @andersk's comment at #179 (comment) , after making the simplification I point out in the comment before this one.
Or perhaps just take the two "what we need to do" points from the end of that comment of mine, with a link to that comment of Anders's.
93c0d1e
to
fd79d07
Compare
Thanks for the review! Revision pushed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Small comments.
test/model/compose_test.dart
Outdated
test('three-backtick block; incomplete final line', () { | ||
checkFenceWrap(chopNewline: true, ''' | ||
``` | ||
hello | ||
world | ||
``` | ||
'''); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test seems not what you intended.
I think it can just be dropped, though — there's already the test above with the same contents, as well as the five-backtick version below.
lib/model/compose.dart
Outdated
/// For example, for this Markdown: | ||
/// | ||
/// ```javascript | ||
/// console.log('Hello world!'); | ||
/// ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
/// For example, for this Markdown: | |
/// | |
/// ```javascript | |
/// console.log('Hello world!'); | |
/// ``` | |
/// For example, for this Markdown: | |
/// | |
/// ```javascript | |
/// console.log('Hello world!'); | |
/// ``` |
and similarly below, so that this dartdoc renders properly, e.g. in the IDE at a call site
test/model/compose_test.dart
Outdated
test('empty content', () { | ||
checkFenceWrap(chopNewline: true, ''' | ||
``` | ||
|
||
``` | ||
'''); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, this example doesn't look right. The output is a block whose contents are one empty line, not a block with empty contents.
I think the trouble is that that condition I suggested above:
if (!content.endsWith('\n')) {
resultBuffer.write('\n');
is incorrectly treating an empty string as ending with an incomplete line. So it should instead be like if (content.isNotEmpty && !content.endsWith('\n'))
.
fd79d07
to
96a13e1
Compare
Aha, thanks! Revision pushed. |
For the corresponding logic used in the web app and zulip-mobile, see web/shared/src/fenced_code.ts. I believe the logic here differs from that in just this way: it follows the CommonMark spec more closely by disqualifying backtick-fence lines where the "info string" has a backtick, since that's not allowed: > If the info string comes after a backtick fence, it may not > contain any backtick characters. (The reason for this restriction > is that otherwise some inline code would be incorrectly > interpreted as the beginning of a fenced code block.) Regarding the new file lib/model/compose.dart, we do have existing code that could reasonably move here, but it's pretty simple. It's the code that gives the upload-file Markdown; see registerUploadStart and registerUploadEnd in [ContentTextEditingController]. Related: zulip#116
96a13e1
to
b6556a6
Compare
Thanks! Looks good — merging 😅 |
For the corresponding logic used in the web app and zulip-mobile, see web/shared/src/fenced_code.ts.
I believe the logic here differs from that in just this way: it follows the CommonMark spec more closely by disqualifying backtick-fence lines where the "info string" has a backtick, since that's not allowed:
Regarding the new file lib/model/compose.dart, we do have existing code that could reasonably move here, but it's pretty simple. It's the code that gives the upload-file Markdown; see
registerUploadStart and registerUploadEnd in
[ContentTextEditingController].
Related: #116