Provide .toMathML() rendering of expressions, possibly replacing .toHTML() ? #3327
-
Of course, it might add too much to the shipped mathjs bundle; since plurimath is translated from Ruby by Opal, I imagine it is a pretty heavy package.
-
In general, now that MathML is supported across popular browsers, it appears that the best way for mathjs to display its expressions attractively typeset on the web is to provide a .toMathML() generation method on its expression tree Node types. This would be directly analogous to .toHTML(): one could insert the result of .toMathML() into the DOM of a web page and see the expression nicely laid out.

In fact, these days it seems to me that most or all of the use cases for .toHTML() would be as well or better served by .toMathML(). So one approach would be to replace the .toHTML() code with .toMathML() code, perhaps leaving .toHTML() as an alias for .toMathML() so as not to completely break existing code that calls .toHTML(). After all, these days MathML essentially is a part of HTML, so it wouldn't really be a lie to call it .toHTML(), just a major change in its behavior. Or the .toHTML() code could be left in, with a declaration that it is deprecated, will not be maintained, and will be pruned at a later time as it bitrots.

In any case, implementing .toMathML() should not be particularly difficult. It will be entirely analogous to .toHTML(), just with different particulars at each node type.

From an architectural point of view, though (and this is the main reason for having a design discussion here rather than diving in), I think one should be skeptical of simply replicating the .toHTML() machinery completely analogously. I am very suspicious of the proliferation of "renderer" methods on each Node type: .toString(), .toTex(), .toHTML(); even .toJSON() is sort of in this category. Adding another one feels like the straw that breaks the camel's back. These are all very similar transformations of an expression tree. It seems like it might be better to try to unify them in a sort of visitor pattern: have a traverse() type of method that takes the "guts" of the operation as an argument, organized in as clear and simple a way as possible.
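As a rough illustration of that visitor idea, here is a minimal, self-contained sketch. Everything in it is an assumption made for illustration: the node shapes, the `traverse` function, and the `toStringFormat` / `toMathMLFormat` records are hypothetical and are not mathjs API. It also does not escape operators such as `<` for MathML.

```javascript
// Hypothetical sketch; none of these names are existing mathjs API.
// Tiny stand-ins for expression tree nodes.
const sym = (name) => ({ type: 'SymbolNode', name })
const num = (value) => ({ type: 'ConstantNode', value })
const op = (symbol, ...args) => ({ type: 'OperatorNode', op: symbol, args })

// One generic bottom-up traversal: render the children first, then hand the
// node and its rendered children to the format's handler for that node type.
function traverse (node, format) {
  const children = (node.args || []).map((child) => traverse(child, format))
  const handler = format[node.type]
  if (!handler) throw new Error('No handler for ' + node.type)
  return handler(node, children)
}

// Each output format is one record, so everything about it lives in one place.
const toStringFormat = {
  SymbolNode: (node) => node.name,
  ConstantNode: (node) => String(node.value),
  OperatorNode: (node, kids) => '(' + kids.join(' ' + node.op + ' ') + ')'
}

const toMathMLFormat = {
  SymbolNode: (node) => '<mi>' + node.name + '</mi>',
  ConstantNode: (node) => '<mn>' + node.value + '</mn>',
  OperatorNode: (node, kids) =>
    '<mrow>' + kids.join('<mo>' + node.op + '</mo>') + '</mrow>'
}

// x + 2 * y, rendered twice by the same traversal code.
const expr = op('+', sym('x'), op('*', num(2), sym('y')))
const asString = traverse(expr, toStringFormat)
const asMathML = traverse(expr, toMathMLFormat)
```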
It seems like this could clarify/simplify existing code as well, by reducing the structural redundancy among the existing .toString, .toTex, and .toHTML methods.

One big decision if pursuing this sort of refactoring is where the details for each operation should be collected. Right now the pieces of how to convert to each format (string, LaTeX, HTML) live in each Node type, so that all the knowledge about a RelationalNode (say) is in one place, but all of the information about HTML (say) is scattered. So adding a new format, like MathML, involves adding a piece to every node type. Maybe this pattern has ended up cluttering the code for the syntax trees, and it would be better to keep only the information about how to perform traversals in the trees, and then collect all of the information about how to render to HTML in one place.

But I think either organization (information collected with each Node type, or collected with each format) should be possible with a refactor that removes the need to add a new method to each Node type every time a new format is needed, and that reuses the traversal code rather than rewriting it for each format. We might need to pick one way or the other for how to distribute the pieces. Since they are currently localized with the Nodes, we could keep it that way, but hopefully refactor so that rather than writing a new Node method for each format, you're just adding a quick "combiner function" for that format for that Node type and putting it in a record of formats that the node supports.
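To show what keeping the pieces with the Nodes might look like, here is a small sketch under stated assumptions. The `Leaf` and `BinaryNode` classes, the `render` method, the `formats` records, and the `parenthesize` table are all hypothetical names invented for this example, not existing mathjs code. The shared `render` handles precedence and parenthesization once, and each format contributes only a short combiner function.

```javascript
// Hypothetical sketch; class and method names are illustrative, not mathjs API.

// How to parenthesize a rendered subexpression, coded once per format.
const parenthesize = {
  string: (s) => '(' + s + ')',
  tex: (s) => '\\left(' + s + '\\right)',
  mathml: (s) => '<mrow><mo>(</mo>' + s + '<mo>)</mo></mrow>'
}

class Leaf {
  constructor (name) { this.name = name }
  get precedence () { return Infinity } // a leaf never needs parentheses
  render (format) { return Leaf.formats[format](this.name) }
}
// Record of formats this node type supports.
Leaf.formats = {
  string: (name) => name,
  tex: (name) => name,
  mathml: (name) => '<mi>' + name + '</mi>'
}

class BinaryNode {
  constructor (op, left, right) {
    this.op = op
    this.left = left
    this.right = right
  }

  get precedence () { return BinaryNode.precedences[this.op] }

  // Shared logic, written once: render the children, parenthesize any child
  // that binds less tightly than this node, then call the format's combiner.
  render (format) {
    const combine = BinaryNode.formats[format]
    if (!combine) throw new Error('Unsupported format: ' + format)
    const parts = [this.left, this.right].map((child) => {
      const text = child.render(format)
      return child.precedence < this.precedence
        ? parenthesize[format](text)
        : text
    })
    return combine(this.op, parts)
  }
}
BinaryNode.precedences = { '+': 1, '*': 2 }
// Adding a new format means adding one key here, not a new method.
BinaryNode.formats = {
  string: (op, [a, b]) => a + ' ' + op + ' ' + b,
  tex: (op, [a, b]) => a + ' ' + (op === '*' ? '\\cdot' : op) + ' ' + b,
  mathml: (op, [a, b]) => '<mrow>' + a + '<mo>' + op + '</mo>' + b + '</mrow>'
}

// (x + y) * z
const product = new BinaryNode('*',
  new BinaryNode('+', new Leaf('x'), new Leaf('y')),
  new Leaf('z'))
console.log(product.render('string'))
console.log(product.render('mathml'))
```

In this arrangement the `mathml` entries are the only code a new format would need per node type; whether that is preferable to one record per format is the open design question.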
To make this more concrete, if you look at RelationalNode.js, you will see that the code for toString, toTex, and toHTML has a lot of similarity/redundancy: each method has to compare the node's precedence with that of its children, get the rendered versions of the children, possibly parenthesize children depending on precedence and parenthesis mode, and then combine those renderings in some way. It would be a real improvement, I think, to put the code that handles precedence and gets the children's renderings properly parenthesized in a single place. In fact, at least at the moment, all of these methods consist of just interleaving the child renderings with some string based on each relation, so we could even combine that common code into a putative _render method for the Relational node type (that would take the output format as an argument), and put just the information of how
That's it; as far as I can see, other than how to parenthesize a subexpression in each format, which should also be coded just once somewhere, not in each Node class, those are the only bits of code that differ among these three renderers. We could lose a lot of lines of code by doing similar things for each node type. And then adding a MathML renderer would be as simple as adding one more key to this record, e.g.
where entityMap gives the HTML entity (either as an
So just to make the final proposal clear: we would add a
Thoughts?
-
The plurimath project can convert arbitrary math formulas into numerous formats (LaTeX, MathML, AsciiMath, etc.). Providing a conversion from a mathjs parse tree to plurimath could replace the toTex renderer and provide significant additional rendering capability.