current-non-scribble-entity-handler #48
base: master
Scribble does not include `'hellip` as a recognised HTML entity in pre-content.

Entities not in: `'mdash`, `'ndash`, `'ldquo`, `'lsquo`, `'rdquo`, `'rsquo`, `'larr`, `'rarr`, or `'prime` are passed through `current-non-scribble-entity-handler`.

`current-non-scribble-entity-handler` defaults to `values`, i.e. `((current-non-scribble-entity-handler) 'hellip)` -> `'hellip`.

Other possibilities are:

```racket
(match-lambda ['hellip "..."]) ; literal dot dot dot replacement
(match-lambda ['hellip "…"]) ; unicode horizontal ellipsis character
```
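To make the description above concrete, here is a minimal sketch of how such a parameter could sit in front of entity generation. The helper `emit-entity` and the list `scribble-entities` are hypothetical names used only for illustration; they are not taken from the PR's actual diff.

```racket
#lang racket/base
(require racket/match)

;; The parameter described above; it defaults to `values`, so
;; non-Scribble entities pass through unchanged.
(define current-non-scribble-entity-handler (make-parameter values))

;; Hypothetical: the short list of entity symbols named in the description.
(define scribble-entities
  '(mdash ndash ldquo lsquo rdquo rsquo larr rarr prime))

;; Hypothetical stand-in for the point in the parser where an entity
;; symbol is produced. Only entities outside the list reach the handler.
(define (emit-entity sym)
  (if (memq sym scribble-entities)
      sym
      ((current-non-scribble-entity-handler) sym)))

(emit-entity 'hellip) ; 'hellip, because the default handler is `values`

(parameterize ([current-non-scribble-entity-handler
                (match-lambda ['hellip "…"])])
  (list (emit-entity 'hellip)    ; "…"
        (emit-entity 'rsquo)))   ; 'rsquo, never reaches the handler
```

Note that a `match-lambda` with only a `'hellip` clause raises a match error for any other non-Scribble entity, so a catch-all clause would probably be wanted in practice.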
Thank you very much for taking time to prepare the pull request!

My first reaction is to wonder if this should actually be in the markdown library? Or at least, wonder if it should be something quite this specific to For example a user of the library already could walk the returned x-expressions and do this transformation.

However let's say that's inconvenient and maybe slow. In that case, I like your idea of a parameterizable function that defaults to By the way, To be clear, I'm just discussing this right now -- not asking you to change your PR like this. Curious what you think (independent of whether you or I would write the code).
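The "walk the returned x-expressions" option mentioned above might look roughly like the following. This is a sketch only: `replace-entities` is a hypothetical helper and not part of the markdown library, and it assumes the usual x-expression shapes `(tag ((attr "value") ...) body ...)` and `(tag body ...)`.

```racket
#lang racket/base
(require racket/match)

;; Hypothetical helper: walk an x-expression and pass every symbol
;; entity found in element bodies through `handler`.
(define (replace-entities handler x)
  (define (walk b) (replace-entities handler b))
  (match x
    ;; a bare symbol in body position is an entity
    [(? symbol? entity) (handler entity)]
    ;; element with an attribute list, which is left untouched
    [(list* (? symbol? tag)
            (and attrs (list (list (? symbol?) (? string?)) ...))
            body)
     `(,tag ,attrs ,@(map walk body))]
    ;; element without an attribute list
    [(list* (? symbol? tag) body)
     `(,tag ,@(map walk body))]
    ;; strings, numbers and anything else are returned as-is
    [other other]))

(replace-entities
 (match-lambda ['hellip "..."] [other other])
 '(p ((class "x")) "Wait" hellip (em "really" hellip)))
;; => '(p ((class "x")) "Wait" "..." (em "really" "..."))
```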
First things first... what I present here is a hack that gets me past Scribble! I, too, considered doing this from the Scribble end... but the multiple back ends (HTML/LaTeX/text) made it look like a lot of work and risk to do this properly. It's not just There's a choice of handling all entities (symbols) or non-scribble entities. The way I went allows for something that can be left transparent with

```racket
(match-lambda
  [(and x (or 'rsquo 'rdquo ...)) x] ; Scribble entities clause
  ['hellip "something special"]
  ;; maybe here we have no match -- causing early concerns?
  )
```

which isn't as simple.

Question is... is Scribble so important a target for the markdown parser that we consider Scribble's special entity/symbols to be special? Or, as you ask, is there something special about all symbols? Personally, I came across the Oh, I haven't gotten as far as seeing how your back-referencing "back hooks" on the footnotes play with Scribble (if Scribble even sees them, that is). But again, potentially a non-scribble-entity might find its way into the xexpr.
Sorry, wrong button pressage |
Thanks for the reply.
I think the footnotes are n/a for Scribble. I only mentioned this to say that the markdown library already does a full recursive walk of the x-expressions, for this purpose. So the incremental cost of having it look for
IIUC the PR handles just I guess what I had in mind is:
TL;DR I had in mind something more general. But maybe you're right, that Scribble is a special case that's worth handling specially as in your PR.
Correct. I'd've expected the author to wrap it, just as I wrapped your 'hellip. So we're now heading towards unDRYness. I like the
Of course, I think if you're naming a function

```racket
(contract-out current-entity-handler ; pseudo-code at best
  (parameter/c
    (-> pre-content?
        (or/c
          pre-content?
          ;; #f, if inserted into an xexpr will bust it, and could make composition easier?
          #f))))
```

Then I can use... oh... is there a "composable or" combinator out there?

```racket
(define ((or-combinator f1 f2) v) (or (f1 v) (f2 v)))
```

Then I can use

```racket
(current-entity-handler
  (or-combinator
    scribble-entity-handler
    (match-lambda ['hellip "..."] [_ #f])))
;; actually, more likely:
(current-entity-handler
  (match-lambda
    ['hellip "..."]
    [(app scribble-entity-handler v) v]))
```
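As a runnable illustration of the "composable or" idea in this comment, the sketch below assumes single-argument handlers that return `#f` when they have nothing to say about an entity. The bodies of `scribble-entity-handler` and `hellip-handler` here are assumptions made for the example, not code from the library.

```racket
#lang racket/base
(require racket/match)

;; Curried define: (or-combinator f1 f2) returns a one-argument handler
;; that tries f1 first and falls back to f2 when f1 returns #f.
(define ((or-combinator f1 f2) v)
  (or (f1 v) (f2 v)))

;; Assumed shape: accept Scribble's entities, decline everything else.
(define scribble-entity-handler
  (match-lambda
    [(and sym (or 'mdash 'ndash 'ldquo 'lsquo 'rdquo 'rsquo 'larr 'rarr 'prime))
     sym]
    [_ #f]))

;; Assumed shape: replace 'hellip, decline everything else.
(define hellip-handler
  (match-lambda ['hellip "..."] [_ #f]))

(define handler (or-combinator scribble-entity-handler hellip-handler))

(handler 'mdash)  ; 'mdash
(handler 'hellip) ; "..."
(handler 'nbsp)   ; #f, neither handler accepted it
```

The `#f` result for an unhandled entity is what makes the handlers composable, but the caller then has to decide what to put in the x-expression in that case.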
Thinking about this more, it seems like the key concept is this: Every time the markdown parser wants to do something "fancy" where it automatically replaces some x-expr (like This sketch looks more complicated than it is, due to spelling out contracts for everything, but:

```racket
;; contract-out and parameter/c come from racket/contract,
;; match from racket/match, and xexpr/c from xml.
(require racket/contract racket/match xml)

(provide
 (contract-out
  [current-entity-handler (parameter/c entity-handler/c)]
  [default-entity-handler entity-handler/c]
  [scribble-entity-handler entity-handler/c]))

;; Given an original xexpr and a proposed symbol entity substitution,
;; an entity-handler returns which to use.
(define entity-handler/c (-> xexpr/c symbol? xexpr/c))

;; A default entity-handler that accepts every proposed substitution.
(define (default-entity-handler _ sym)
  sym)

;; An entity-handler suitable for use to produce x-expressions that
;; you want to give to scribble, which expects only a limited list.
(define (scribble-entity-handler orig sym)
  (match sym
    [(or 'mdash 'ndash 'ldquo 'lsquo 'rdquo 'rsquo 'larr 'rarr 'prime) sym]
    [_ orig]))

(define current-entity-handler (make-parameter default-entity-handler))
```

The remaining change is that I should go through parse.rkt, and anyplace it's doing auto-fancy-pants stuff, run it through this handler. Does that make sense?
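A call site for the two-argument handler could look something like this. The function `ellipsis->xexpr` is a hypothetical stand-in for one of the "auto-fancy-pants" spots in parse.rkt; the handler definitions repeat the sketch above without contracts so the example runs on its own.

```racket
#lang racket/base
(require racket/match)

;; Accept every proposed substitution.
(define (default-entity-handler _ sym)
  sym)

;; Accept only Scribble's short list; otherwise keep the original text.
(define (scribble-entity-handler orig sym)
  (match sym
    [(or 'mdash 'ndash 'ldquo 'lsquo 'rdquo 'rsquo 'larr 'rarr 'prime) sym]
    [_ orig]))

(define current-entity-handler (make-parameter default-entity-handler))

;; Hypothetical call site: the parser has matched `orig` in the source and
;; proposes 'hellip; the current handler decides which of the two to emit.
(define (ellipsis->xexpr orig)
  ((current-entity-handler) orig 'hellip))

(ellipsis->xexpr "...") ; 'hellip under the default handler

(parameterize ([current-entity-handler scribble-entity-handler])
  (ellipsis->xexpr "...")) ; "..." because 'hellip is not on Scribble's list
```

Passing the original text alongside the proposed symbol means a handler can decline a substitution without needing to know how to reverse it.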
The current-entity-handler parameter enables customizing which HTML entities are allowed in the resulting x-expressions. The default-entity-handler allows any. The scribble-entity-handler allows only those on Scribble's short list (and was the motivating use case). This is a possible alternative approach to: #48
One thing I'm not sure of is the phrase:
This means a repeated piece of code every time you generate (or consider generating) an entity symbol. Whereas your solutions that involve doing the work during a walk of the xexpr need only be extended in the one place. But... Frankly I'd go with what you've just suggested. It is, at least, transparent - as opposed to the opacity of putting it in a walker people possibly don't even know they're calling. Unless you have a walker |
You're right, and that concern is what tilted me towards suggesting a walk, originally. What tilted me back was the If there were many dozens of these occurrences and a high likelihood of changes in the future, the walk is justified and I could store both the original text and the entity text, for the walk to choose later. That's the most robust. But I think that's probably over-designing things in this case. |
In fact, being careful to store both the original text and the entity text, in every appropriate such place ... would be no more robust than being careful to use But probably no one will add more entity symbols at all, much less frequently. |
currently If someone adds another "fancy" string to entity mapping (I have Robust, maybe not. But close to self-healing. |
Currently the markdown parser converts "..." (and friends) to an `'hellip` symbol. This works lovely for `xexpr->html`, but `'hellip` isn't a supported `pre-content` symbol.

The parameter `current-non-scribble-entity-handler` is called when `'hellip` is generated, and is a hook to substitute other "representations" of "...". It defaults to `values`, which leaves the output as the current parser does. It could, equally be:

this handler can be used to extend any other "smart" symbol mappings.