Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests, fixes, and a small (optional) feature to allow a FluxBB config to work #6

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

harry-wood
Copy link

@harry-wood harry-wood commented Nov 9, 2022

A few different things here:

Generic issues (Not FluxBB specific):

  • Paragraphs in mid-tag "children_html" content will mess up the markdown output. This seems like a big issue by the way (more than a minor edge case I would say)

  • We're missing support for the standard [s]strike-through text[/s]

  • We're missing support for the standard [code] block, and we needed a new plain_tag feature to provide that (Thanks to @tomhughes for figuring that out)

  • The handling of [quote]s is weird because it's not actually converted from bbcode, but also it had some problems when it comes to using the output in discourse, plus handling of triple nested quotes lead to another fix deep in the tag-handling code by @tomhughes

For my FluxBB work:

  • Some tests which, although they are related to FluxBB, are done via the additional_tags config feature, so they serve as an example of that.
  • And we're adding a feature to let me get "## heading" markdown output working. I'm looking to do this for [h] bbcode by passing "additional_tags" because it's not part of the bbcode.org standard, however I need to add an optional :newlines => :to_br feature to allow me to do so.

Swap out tab characters for spaces to give consistent indenting.

Drop trailing whitespace characters at the ends of lines.

Add a consitent trailing newline on every file.

Done using rubocop:
  `rubocop --only Layout/IndentationStyle --autocorrect`
  `rubocop --only Layout/TrailingWhitespace --autocorrect`
  `rubocop --only Layout/TrailingEmptyLines --autocorrect`
This test is supposed to be testing [b] tag with newlines.
Add support for [s] to give strike-through formatting with a "~~" markdown. This is standard bbcode https://www.bbcode.org/making-strikethrough-text-with-bbcode.php
@harry-wood harry-wood force-pushed the fluxbb-tests-and-fixes branch from be91da6 to 921b526 Compare November 16, 2022 15:40
@Firefishy
Copy link

@harry-wood Is this PR ready for review?

@harry-wood
Copy link
Author

harry-wood commented Jan 7, 2023

I'm waiting on @nlalonde (or someone else working at discourse maybe) to take a look at my other basic PR #4

Is all of this ready merge? I think so, but I could certainly understand if the maintainers would prefer to discuss some bits as separate PRs. That was my plan.

Maybe the "Test overriding tags to fix img alt text processing" commit has some test stuff the maintainers might not want to merge here. I can't think of any reason the rest of the commits are not good to be merged.

(Note: Similar situation with my PRs to improve the importer script in discourse. Some of the later commits there require the changes in this PR, but we can bring in these changes, even though they're not merged, by referencing this branch in the gemfile. This means it's actually easy for us to run an import using both sets of unmerged changes, but obviously it would be nice if the maintainers would take a look at these PRs)

@harry-wood harry-wood changed the title WIP Assorted tests and fixes encountered with FluxBB Tests, fixes, and a small (optional) feature to allow a FluxBB config to work Jan 9, 2023
@harry-wood harry-wood marked this pull request as ready for review January 9, 2023 12:21
Start a set of tests for processing FluxBB bbcode by passing in an 'additional_tags' list. In this way we can make it do fluxBB-specific things.

The first of which is to change treatment of the [img=x] parameter, overriding the defaul WidthxHeight behaviour and allowing passing alt text instead.
When converting bbcode tags to markdown, we run into trouble if there's new paragraphs (double newlines) in their content (in their `children_html`), this may not apply to all bbcode tags, but many of them e.g the basic [b] tag. We cannot output markdown looking like this:

```
**paragraph 1

paragraph 2**
```

It doesn't work (at least not in most markdown processors)

An ideal solution might be to output
```
**paragraph 1**

**parapgraph 2**
```
To do this properly will entail splitting the processing in two and re-applying our conversion logic to each part (and re-applying any other wrapping bbcode tags). There are data structures which might allow such tricks, but this is too complicated for me to take on right now.

The ugly workaround here is to swap in a `<br>` tag to give the same vertical spacing as a new paragraph while actually being part of the same paragraph.
```
**paragraph 1
<br>paragraph 2**
```
Ugly to look at, and ugly because we're outputting html when we're supposed to be converting to markdown. Furthermore this will only work where <br> tag is allowed (github and discourse at least). Better solutions welcome!
@harry-wood harry-wood force-pushed the fluxbb-tests-and-fixes branch 2 times, most recently from dbcd37e to ca00fd3 Compare January 9, 2023 12:33
@harry-wood
Copy link
Author

harry-wood commented Jan 9, 2023

I've tidied up the "Test overriding tags to fix img alt text processing" commit a bit now, and taken this out of "Draft" mode. This whole thing is ready for review.

@gschlager
Copy link

@nlalonde This PR looks good. Unfortunately I can't merge it. Can you press the buttons please? 😉
I think it's kinda preventing us from merging discourse/discourse#18953

@Firefishy
Copy link

Firefishy commented Jan 31, 2023

FYI: We have 1 more PR which fixes [quote], strikethrough and [code]. harry-wood#1

@gschlager
Copy link

@harry-wood Feel free to attach your fixes to this PR or open a new PR in this repo.

harry-wood and others added 4 commits February 19, 2023 02:52
Allow a bbcode tag config to force all newlines into <br> html tags within its children_html. This is useful for outputting markdown headings as follows. Headings need to be all on one line even if the source bbcode allows newline chars.

`## My heading<br>continued on the next line.`

Outputting headings? This not part of the bbcode.org standard, but something we would like to be able to do as a custom "additional tag".
Add an example tag config which makes use of the new 'to_br' feature. This is how we would use it for FluxBB (but iobviously doesn't effect default behaviour)
Alow a new tag config option, `'plain_tag' => true`, to indicate that all the between text, regardless of what bbcode or other markup characters are in it, should be passed through analtered.
Add support for the [code] bbcode, making use of the new plain_text config option.

The code block is converted to be wrapped in "```" (on its own line), and crucially the block contents are unaltered by any further parsing.

TODO: In fact it will still convert ">" to "&gt;" and other html entities, which ideally it wouldn't, but when calling with `bbcode_to_md(false)` this is disactivated.

TODO: We can pass a language to achieve syntax highlighting, such as [code=javascript] in some bbcode flavours, and in the bbcode.org reference. This could be converted to "```javascript" in some markdown flavours. Currently it will just ignore the language information and convert a plain "```" block.
@harry-wood harry-wood force-pushed the fluxbb-tests-and-fixes branch from 9ae7307 to 6d7c8c8 Compare February 19, 2023 02:54
harry-wood and others added 4 commits February 19, 2023 16:13
Two test changes to begin tackling two different problems with nested quotes.

Firstly add a comment against the first test. This output does not work within discourse. Why talk about discourse specifically? The tag definition for [quote] doesn't do a proper "to markdown" conversion. It remains as bbcode, but with some newlines added. This is presumably to cater for the primary discourse use case, maybe once upon a time it did, but testing in discourse at the moment you cannot place the following test:

[quote=Kitten]
[quote=creatiu]f1[/quote]

f2[/quote]

We need a newline after 'f2' otherwise the oute quote is not rendered.

Secondly add a test for triple nesting of quotes. This test documents the current output we get, which is clearly not correct. There's no way the letters "EF" should end up next-to eachother. There's a problem with stacked handling of nesting.
Fix the conversion of [quotes] to insert more newlines so that the output works in discourse (as it was presumably originally intended to).

Note: This changes all the output even for simple quote example, which may be surprising. The original output on one line works OK, but spacing them out onto multiple lines also looks OK. The problem we're fixing here only shows up in more complex nested cases.

Specifically this works when we paste it into descourse:

[quote=Kitten]

[quote=creatiu]
f1
[/quote]
f2
[/quote]

...but this does not render fully:

[quote=Kitten]
[quote=creatiu]f1[/quote]

f2[/quote]
Modify test output to show the correct expected output for the triple-nesting example. Nesting handling is now fixed.
@tomhughes
Copy link

To be clear the bits I did, especially the one that changes how child nodes are tracked, were quick hacks to get our import working and I didn't expect them to be offered upstream like this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants