HTML insertion / format: html performance issue #234
Comments
This is likely caused by DOMDocument having to parse the new HTML and import it into the primary document. It's worth testing this without Transphporm by creating one document and parsing the 1kb HTML in a second instance, then inserting it into the first using With that test done, we can then work out the overhead of Transphporm on this process. |
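A minimal sketch of the kind of standalone test described above, with no Transphporm involved. It assumes `importNode` followed by `appendChild` is the insertion step, and the HTML string and variable names are made up for illustration:

```php
<?php
// Hypothetical stand-in for the ~1kb HTML string discussed above.
$html = '<div>' . str_repeat('<p>row</p>', 100) . '</div>';

// The "primary" document that receives the content.
$target = new DOMDocument();
$target->loadHTML('<html><body><div></div></body></html>');
$container = $target->getElementsByTagName('div')->item(0);

$start = microtime(true);

// Parse the HTML in a second, independent DOMDocument instance.
$source = new DOMDocument();
$source->loadHTML($html);

// Deep-copy the parsed <div> into the target document, then attach it.
$parsedDiv = $source->getElementsByTagName('div')->item(0);
$imported = $target->importNode($parsedDiv, true);
$container->appendChild($imported);

$elapsed = microtime(true) - $start;
$count = $target->getElementsByTagName('p')->length;
printf("Imported %d <p> elements in %.4f s\n", $count, $elapsed);
```

Raising the repeat count gives a rough baseline for parse-plus-import cost that Transphporm's own overhead can be compared against.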
ok, it's not that. Here's a test doing that:
This runs fairly quickly. The issue must be here: I tested the convertNode method in isolation with this data (I just plugged it into the test above) and it's not that either. I wonder if the generator is slow. |
Here's a very cut down version of what Transphporm is doing. There's nothing slow here:
|
The time involved to generate the HTML in Transphporm grows much faster than linearly, so small tests all look pretty fast. For example, the 4 million row test I posted takes roughly 37 seconds, but doubling to 8 million rows takes 267 seconds. That makes me suspect something like undesired recursive behavior on the part of DOMDocument, or something else of that nature. The HTML string being inserted should not need to be parsed, to my way of thinking, but rather assumed correct. Am I misunderstanding something or overlooking some need? |
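As a quick check on the two timings quoted above, the ratio between them can be turned into an implied scaling exponent. Two data points cannot tell polynomial growth from exponential growth, so this is only a rough indication, and the variable names are illustrative:

```php
<?php
$secondsAt4M = 37.0;   // reported time for the 4 million test
$secondsAt8M = 267.0;  // reported time after doubling to 8 million

// How much longer the doubled input took.
$ratio = $secondsAt8M / $secondsAt4M;

// If time ~ n^k, doubling n multiplies time by 2^k, so k = log2(ratio).
$exponent = log($ratio, 2);

printf("ratio: %.2f, implied exponent: %.2f\n", $ratio, $exponent);
```

Linear behaviour would give a ratio near 2 and an exponent near 1. The figures here come out at roughly 7.2 and 2.85, which is well beyond linear.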
Increasing the HTML being appended in your example above ( Hmm, this may be fairly useless data, as much of that time might just be moving bytes around in memory during a copy or something. Yeah, it looks linear going from 4M to 8M, as the time just doubles. |
Not calling that |
|
I think this is because when using I do not understand why the loop iterates on the Since I have between 1,000 and 30,000 rows in my table, those |
It's because the HTML gets imported into a DOMDocument, so it wraps in a container tag as XML cannot have multiple root nodes. Because the container tag should not appear in the output, it has to iterate over all the child nodes and import them individually. Add a tbody tag around the rows and it will import just that one element. |
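The difference described above can be sketched roughly as follows. This is not Transphporm's actual code: the function names and the `wrapper` tag are hypothetical, and `loadXML` is used instead of HTML parsing so the example behaves predictably:

```php
<?php
// Multiple top-level rows: parsed inside a throwaway wrapper, then each
// child is imported on its own so the wrapper never reaches the output.
function importWrapped(DOMDocument $target, DOMNode $parent, string $rows): int
{
    $source = new DOMDocument();
    $source->loadXML('<wrapper>' . $rows . '</wrapper>');
    $calls = 0;
    foreach ($source->documentElement->childNodes as $child) {
        $parent->appendChild($target->importNode($child, true));
        $calls++;
    }
    return $calls;
}

// Rows supplied inside a single <tbody>: one deep import covers everything.
function importSingleRoot(DOMDocument $target, DOMNode $parent, string $rows): int
{
    $source = new DOMDocument();
    $source->loadXML('<tbody>' . $rows . '</tbody>');
    $parent->appendChild($target->importNode($source->documentElement, true));
    return 1;
}

$rows = str_repeat('<tr><td>cell</td></tr>', 1000);

$target = new DOMDocument();
$root = $target->createElement('body');
$target->appendChild($root);
$tableA = $target->createElement('table');
$tableB = $target->createElement('table');
$root->appendChild($tableA);
$root->appendChild($tableB);

$start = microtime(true);
$callsA = importWrapped($target, $tableA, $rows);
$timeA = microtime(true) - $start;

$start = microtime(true);
$callsB = importSingleRoot($target, $tableB, $rows);
$timeB = microtime(true) - $start;

printf("wrapped: %d importNode calls, %.4f s\n", $callsA, $timeA);
printf("single root: %d importNode call, %.4f s\n", $callsB, $timeB);
```

Both tables end up with the same rows. The only difference is how many `importNode` calls were needed to get them there.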
Thanks. I’ll experiment with that.
One of these days I will fully grok Transphporm’s internals, but that day is not today. ;-)
|
Tom @TRPB : There seem to be a number of "nice to know" things related to Transphporm. Any chance the documentation will ever be updated to reflect them and features that aren't yet documented? What is your procedure for others to add documentation updates? |
Wrapping the inserted HTML in a container such as @jpeurich I cannot speak for Tom, but I think changes to the README or added documentation in the project repository itself are probably handled via Pull Requests. Changes to the Wiki pages appear to be open to anyone with a GitHub account. |
There are some features which need better documentation, the plugin system in particular. Now that I've finished my PhD I can spend more time on my open source projects. :) In this instance, though, it's just a quirk of how DOMDocument works and I hadn't really envisaged the feature being used with such a large amount of HTML that it would make a noticeable difference. This looks like it might be a faster alternative: https://www.php.net/manual/en/domdocumentfragment.appendxml.php but looking online it will only accept valid XML (there is no appendHTML equivalent like there is for loadXML/loadHTML) so any HTML entities or self-closing tags left open will break it. There are a few solutions:
edit: Actually, there might be some performance benefits to using |
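To illustrate the `appendXML` limitation mentioned above, here is a small sketch. The markup is invented for the example, and `@` is used only to silence the parser warning on the failing call:

```php
<?php
$doc = new DOMDocument();
$table = $doc->createElement('table');
$doc->appendChild($table);

// Well-formed XML with several top-level nodes: a fragment does not
// need a single root element, so no wrapper tag is required.
$fragment = $doc->createDocumentFragment();
$ok = $fragment->appendXML('<tr><td>one</td></tr><tr><td>two</td></tr>');
if ($ok) {
    // Appending a fragment moves its children into the table.
    $table->appendChild($fragment);
}

// Acceptable as HTML, but not well-formed XML: <br> is never closed.
$badFragment = $doc->createDocumentFragment();
$bad = @$badFragment->appendXML('<tr><td>line<br>break</td></tr>');

var_dump($ok, $bad);
echo $doc->saveXML($table), "\n";
```

The second call is the failure mode to plan for: any HTML that is not also well-formed XML would need to be normalised before this route could be used.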
Congratulations on finishing your PhD! Solution no. 3 above would suit my purposes well and would be the option I would choose. I would be happy to A/B test alternatives for performance. edit: once I've had some breakfast, I'll attempt to make some |
I've just had another thought on this. Transphporm currently turns your rows into this:
Then parses this into a DOMDocument before extracting each and importing it into the target element:
I wonder if it would be faster to:
I haven't tested this but I would assume that moving an existing element around the document is faster than importing a new element into it. |
Tom @TRPB, Yes - Congratulations on completing your PhD. Sorry for getting off track in this thread, but this had to be said. Thanks for Transphporm. |
Now I find, in my "toy" command line test case, adding the container |
We ultimately had to build the large HTML table using just straight PHP with string interpolation (e.g. HEREDOC). As described in the original issue up top, inserting this string into the template still consumed too much time. So we've unfortunately resorted to generating the full page HTML from Transphporm templating but containing a marker string as an HTML comment at the table. Then we |
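A rough sketch of a marker-comment workaround along these lines. The thread does not say which function or marker string was actually used, so plain `str_replace`, the marker text, and the sample data are all assumptions:

```php
<?php
// Hypothetical marker comment left in the template where the table belongs.
$marker = '<!-- TABLE_PLACEHOLDER -->';

// Stand-in for the full page HTML produced by the template engine.
$page = '<html><body><h1>Report</h1>' . $marker . '</body></html>';

// Build the rows with plain string interpolation, outside the DOM.
$data = [['Ann', 3], ['Bob', 5]];
$rowsHtml = '';
foreach ($data as [$name, $count]) {
    $name = htmlspecialchars($name);
    $rowsHtml .= "<tr><td>{$name}</td><td>{$count}</td></tr>";
}

$table = <<<HTML
<table><tbody>{$rowsHtml}</tbody></table>
HTML;

// Swap the marker for the table without re-parsing the table as DOM.
$output = str_replace($marker, $table, $page);
echo $output, "\n";
```

Because the table never passes through DOMDocument, the cost of this step depends on string length only, not on the number of nodes.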
What's going on here, that causes such a large performance hit when inserting HTML into a template?
And the output on my laptop: