
Generating HTML table with 800+ rows consumes 60+ seconds #222

Open
cj-clx opened this issue Jan 29, 2020 · 18 comments

Comments

@cj-clx

cj-clx commented Jan 29, 2020

PHP's php.ini file specifies a maximum CPU time limit via the max_execution_time parameter. Most installations have a reasonably small value configured, for example 30 seconds, because most web users are not going to wait a long time for their page to appear.

In my application using Transphporm, I sometimes need to generate HTML <table>s with 800 to 3000 rows. Doing so with straight PHP (i.e. no templating system) takes 2 to 3 seconds, tops. It takes 65 seconds to generate an 800-row table using Transphporm.
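
For reference, this is roughly what the straight-PHP baseline looks like. This is a hypothetical reconstruction, not the reporter's actual code; the row shape and field handling are made up for illustration.

```php
<?php
// Hypothetical baseline: render an HTML table with plain string
// concatenation, no templating engine involved.
function renderTable(array $rows): string {
    $html = "<table>\n";
    foreach ($rows as $row) {
        $html .= '  <tr>';
        foreach ($row as $cell) {
            // Escape cell values so the comparison is fair to a templating
            // engine, which escapes output by default.
            $html .= '<td>' . htmlspecialchars((string)$cell, ENT_QUOTES) . '</td>';
        }
        $html .= "</tr>\n";
    }
    return $html . '</table>';
}

// 800 rows of 10 columns, matching the size in the report.
$rows = array_fill(0, 800, range(0, 9));

$t1 = microtime(true);
$out = renderTable($rows);
$t2 = microtime(true);

echo 'rendered ' . substr_count($out, '<tr>') . ' rows in ' . ($t2 - $t1) . "s\n";
```

On typical hardware this completes in a small fraction of a second, which is the gap the profiling below is trying to explain.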

As a result, I am beginning to profile Transphporm to see where the inefficiency is and whether there is a way to improve its performance. This issue is here to track my research, and hopefully at some point it will be referenced by a PR that fixes the problem. Also, a project maintainer more familiar with the code may have a quicker and better grasp of the possible solutions once I provide the profile data showing where the CPU time is being consumed.

@cj-clx
Author

cj-clx commented Jan 29, 2020

Here is the relative time callgraph, relative to all of Transphporm.

[attached image: profile-graph]

@TRPB
Member

TRPB commented Jan 29, 2020

Are you using caching?

Take a look here: https://r.je/transphporm-caching-guide

At minimum, use setCache to set the parser cache. This means the TSS file does not need to be parsed every time.
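
A sketch of what enabling the parser cache looks like. The cache object passed here is an assumption (an ArrayObject stand-in); check the linked caching guide for the exact cache class your Transphporm version expects, and note that template.xml, template.tss, and $data are placeholder names.

```php
<?php
// Sketch only — the cache type below is an assumption, not Transphporm's
// documented API. See the caching guide for the real cache class.
require 'vendor/autoload.php';

$data = ['rows' => []]; // placeholder data structure

$template = new \Transphporm\Builder('template.xml', 'template.tss');

// Enable the parser cache so the TSS file isn't re-parsed on every request.
// Substitute the cache object recommended in the caching guide.
$template->setCache(new \ArrayObject());

echo $template->output($data)->body;
```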

If you can speed up the process, then I'm happy to accept a PR, though it may be down to XPath and DOMDocument.

@TRPB
Member

TRPB commented Jan 29, 2020

Is 130,000 repeat runs about right? I'd guess it would be if you had 800 rows and were also repeating the columns.

Edit: actually, even with 800 rows you'd only trigger repeat once. With columns you should only trigger repeat once per row. So 800 rows with 10 columns should only be 801 repeat calls. I think that is the issue: why is repeat being called 130,000 times?

Can you provide some sample TSS and html?

@garrettw
Contributor

> Here is the relative time callgraph, relative to all of Transphporm.

It seems to indicate parseTokens is where most of the time is spent -- am I reading this right? And the "x" numbers are how many times each function is hit?

@cj-clx
Author

cj-clx commented Jan 29, 2020

@garrettw : I think that's correct. I'm still getting up to speed on understanding the output from qcachegrind.

@TRPB : Yes, I can provide the TSS and HTML. Let me work on packaging that up and getting it attached here.

@cj-clx
Author

cj-clx commented Jan 29, 2020

Here are the TSS and HTML, stripped down a bit, as I'm using some additional partials for navigation and top-of-page information that are not included. The main code generating the table is in the get-remit.tss and get-remit.xml files, inside the <div class="check-claims"> element at the bottom of the page.

@TRPB : I made some initial attempts at enabling caching, but it failed and gave me white/empty pages because it got confused about file paths to my partials in subdirectories. I will spend some more time trying to make it work today, and may be filing another issue for that.

files.zip

@cj-clx
Author

cj-clx commented Jan 30, 2020

I have gotten Transphporm's caching working for me. Thanks much for the link to the caching guide article.

@TRPB
Member

TRPB commented Feb 1, 2020

Can you also provide a sample data structure for use with this template? It's difficult trying to recreate it just from the TSS rules.

@TRPB
Member

TRPB commented Feb 1, 2020

Here's a bare-bones example:

<?php

$rows = [];

for ($i = 0; $i < 1000; $i++) {
	$rows[$i] = range(0, 9);
}

$xml = '<table>

	<tr>
		<td>A</td>
		<td>B</td>
		<td>C</td>
		<td>D</td>
		<td>E</td>
		<td>F</td>
	</tr>
</table>';


$tss = 'tr {
	repeat: data();
}

tr td:nth-child(1) {
	content: iteration(1);
}

tr td:nth-child(2) {
	content: iteration(2);
}

tr td:nth-child(3) {
	content: iteration(3);
}

tr td:nth-child(4) {
	content: iteration(4);
}

tr td:nth-child(5) {
	content: iteration(5);
}

tr td:nth-child(6) {
	content: iteration(6);
}
';


$t1 = microtime(true);
require 'vendor/autoload.php';

$template = new \Transphporm\Builder($xml, $tss);

echo $template->output($rows)->body;
$t2 = microtime(true);

echo '<p>' . ($t2 - $t1) . '</p>';

It takes 11 seconds on my machine, which is longer than I'd like.

It's not repeat that's the issue, it's the various content: rules.

If I comment out the content properties and just do the lookups:

tr td:nth-child(6) {
	//content: iteration(6);
}

This is 9 seconds. So most of the time is spent finding the elements: tr td:nth-child(6)

edit: Enabling xdebug's profiler makes this go over 30 seconds so the profiler slows down the script significantly. 11 seconds is still too long though.

@TRPB
Member

TRPB commented Feb 1, 2020

I've found the source of the issue. PropertyHook.php:39

if (!$this->pseudoMatcher->matches($element)) return;

Removing the pseudo check takes the script from 9 seconds down to 2.6.

@TRPB
Member

TRPB commented Feb 1, 2020

Can you try this patch?

diff --git a/src/Parser/Value.php b/src/Parser/Value.php
index acf029f..1c188d6 100644
--- a/src/Parser/Value.php
+++ b/src/Parser/Value.php
@@ -54,13 +54,17 @@ class Value {
 	}
 
 	public function parseTokens($tokens, $data = null) {
+
+
+		if (count($tokens) <= 0) return [$data];
+		// If there's only one token, then there is no function call or data lookup, just return the token.
+		if (count($tokens) == 1) return $tokens->read();
+
 		$this->result = new ValueResult();
 		$this->data = new ValueData($data ? $data : $this->baseData);
 		$this->last = new Last($this->data, $this->result, $this->autoLookup);
 		$this->traversing = false;
 
-		if (count($tokens) <= 0) return [$data];
-
 		foreach (new TokenFilterIterator($tokens, [Tokenizer::WHITESPACE, Tokenizer::NEW_LINE]) as $name => $token) {
 			$this->{$this->tokenFuncs[$token['type']]}($token);
 		}

In my example above it gives a ~50% improvement in speed.
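
The idea behind the patch, illustrated with a self-contained sketch (these are not Transphporm's real classes): when the token stream is empty or contains a single literal token, return immediately instead of constructing the result/data/lookup machinery for every call.

```php
<?php
// Illustration of the early-return pattern the patch applies.
// TrivialParser and its fields are hypothetical names for this demo.
class TrivialParser {
    public int $expensiveSetups = 0; // instrumentation: counts full setups

    public function parseTokens(array $tokens, $data = null): array {
        // Empty stream: nothing to evaluate.
        if (count($tokens) === 0) return [$data];
        // Single token: no function call or data lookup is possible,
        // so return it directly without any setup.
        if (count($tokens) === 1) return [$tokens[0]];

        // Only multi-token streams pay for the full evaluation machinery.
        $this->expensiveSetups++;
        return $this->evaluate($tokens, $data);
    }

    private function evaluate(array $tokens, $data): array {
        // Stand-in for the real token-walking evaluator.
        return $tokens;
    }
}

$p = new TrivialParser();
$p->parseTokens([]);              // empty fast path
$p->parseTokens(['"literal"']);   // single-token fast path
$p->parseTokens(['a', '+', 'b']); // full evaluation
echo $p->expensiveSetups . "\n";  // prints 1 — only the three-token call paid
```

Since most content: rules resolve to a single literal or lookup token, skipping the setup for those calls is where the ~50% comes from.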

@TRPB
Member

TRPB commented Feb 1, 2020

edit: Ignore, see below

@TRPB
Member

TRPB commented Feb 1, 2020

diff --git a/src/Hook/PseudoMatcher.php b/src/Hook/PseudoMatcher.php
index 888c180..0f8d42e 100644
--- a/src/Hook/PseudoMatcher.php
+++ b/src/Hook/PseudoMatcher.php
@@ -11,10 +11,13 @@ class PseudoMatcher {
 	private $pseudo;
 	private $valueParser;
 	private $functions = [];
+	private $funcParts = [];
 
 	public function __construct($pseudo, \Transphporm\Parser\Value $valueParser) {
-		$this->pseudo = $pseudo;
 		$this->valueParser = $valueParser;
+		foreach ($pseudo as $i => $tokens) {
+			$this->funcParts[$i] = $this->getFuncParts($tokens);
+		}
 	}
 
 	public function registerFunction(\Transphporm\Pseudo $pseudo) {
@@ -22,18 +25,17 @@ class PseudoMatcher {
 	}
 
 	public function matches($element) {
-		foreach ($this->pseudo as $tokens) {
+		foreach ($this->funcParts as $parts) {
 			foreach ($this->functions as $function) {
-				$matches = $this->match($tokens, $function, $element);
+				$matches = $this->match($parts, $function, $element);
 				if ($matches === false) return false;
 			}
 		}
 		return true;
 	}
 
-	private function match($tokens, $function, $element) {
+	private function match($parts, $function, $element) {
 		try {
-			$parts = $this->getFuncParts($tokens);
 			$matches = $function->match($parts['name'], $parts['args'], $element);
 			if ($matches === false) return false;
 		}
@@ -62,14 +64,13 @@ class PseudoMatcher {
 	}
 
 	public function hasFunction($name) {
-		foreach ($this->pseudo as $tokens) {
-			if ($name === $this->getFuncName($tokens)) return true;
+		foreach ($this->funcParts as $parts) {
+			if ($name == $parts['name']) return true;
 		}
 	}
 
 	public function getFuncArgs($name) {
-		foreach ($this->pseudo as $tokens) {
-			$parts = $this->getFuncParts($tokens);
+		foreach ($this->funcParts as $parts) {
 			if ($name === $parts['name']) return $parts['args'];
 		}
 	}

This patch should give an 80% performance increase. With the two patches applied, my demo above using 100 rows instead of 1000 goes from nearly 8 seconds to 2.5.

A decent increase for a few minor tweaks!

@TRPB
Member

TRPB commented Feb 1, 2020

And here's the real winner: 1000 rows now take 0.3 seconds.

diff --git a/src/Template.php b/src/Template.php
index f46567f..9404f20 100644
--- a/src/Template.php
+++ b/src/Template.php
@@ -61,16 +61,17 @@ class Template {
 
 	/** Assigns a $hook which will be run on any element that matches the given $xpath query */
 	public function addHook($xpath, $hook) {
-		$this->hooks[] = [$xpath, $hook];
+		if (!isset($this->hooks[$xpath])) {
+			$this->hooks[$xpath] = [];
+		}
+		$this->hooks[$xpath][] = $hook;
 	}
 
 	/** Loops through all assigned hooks, runs the Xpath query and calls the hook */
 	private function processHooks() {
-		foreach ($this->hooks as list($query, $hook)) {
+		foreach ($this->hooks as $query => $hooks) {
 			foreach ($this->xpath->query($query) as $element) {
-				if ($element->getAttribute('transphporm') !== 'immutable' || $hook->runOnImmutableElements()) {
-					$hook->run($element);
-				}
+				foreach ($hooks as $hook) $hook->run($element);
 			}
 		}
 		$this->hooks = [];

This is a huge gain because previously these rules:

tr td:nth-child(1) {
}

tr td:nth-child(2) {
}

Would do this:

  1. Find all //tr/td elements
  2. Loop over all the tds
  3. Check to see if they match "nth-child(1)"
  4. Find all //tr/td elements
  5. Loop over all the tds
  6. Check to see if they match "nth-child(2)"

So if there are 6000 elements, it was looping over those same 6000 elements twice. Or, in my example, 6 times (because I had 6 different nth-child rules).
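
The keying idea can be shown with a simplified, self-contained sketch (HookRunner is a hypothetical name, and the immutable-element check from the real Template class is omitted): hooks are grouped by their XPath query string, so each distinct query runs once no matter how many rules share it.

```php
<?php
// Simplified sketch of grouping hooks by XPath query.
class HookRunner {
    private array $hooks = [];
    public int $queriesRun = 0; // instrumentation for the demo

    public function addHook(string $xpath, callable $hook): void {
        $this->hooks[$xpath][] = $hook; // group by query string
    }

    public function processHooks(\DOMXPath $xpath): void {
        foreach ($this->hooks as $query => $hooks) {
            $this->queriesRun++; // one query() call per distinct selector
            foreach ($xpath->query($query) as $element) {
                foreach ($hooks as $hook) $hook($element);
            }
        }
        $this->hooks = [];
    }
}

$doc = new \DOMDocument();
$doc->loadXML('<table><tr><td>A</td><td>B</td></tr></table>');

$runner = new HookRunner();
// Two rules sharing the //tr/td query collapse into one query() call.
$runner->addHook('//tr/td', function ($el) { /* rule 1 */ });
$runner->addHook('//tr/td', function ($el) { /* rule 2 */ });
$runner->processHooks(new \DOMXPath($doc));

echo $runner->queriesRun . "\n"; // prints 1, not 2
```

With nth-child rules the selectors differ, so this only helps when rules share an identical query, which may explain the inconsistent gains reported below.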

@TRPB
Member

TRPB commented Feb 2, 2020

OK, my results were rather weird. Trying the last fix again, I'm now seeing no real performance benefit and I'm not sure why.

It also breaks some tests by running the hooks on the wrong elements. I've committed one beneficial patch which takes my example down from 11 seconds to 3 seconds. But it won't help in a lot of cases: dynamic calls in pseudo elements, e.g. :data or :iteration, can't be cached (at least not the way I'm doing it here; I could add closures to do it, but it wouldn't help much) because the value of iteration isn't always going to be the same.

@TRPB
Member

TRPB commented Feb 2, 2020

Here's something really annoying too. XPath is really inconsistent.

Try this on a 1000-line table (I was just using the output from my earlier example):

$doc = new \DomDocument;
$doc->loadXml($xml);

$xpath = new \DomXpath($doc);

$t1 = microtime(true);
// tr td.one
$xpath->query('//tr//td[contains(concat(\' \', normalize-space(@class), \' \'), \' one \')]');
$t2 = microtime(true);


echo '<p>tr td.one: ' . ($t2 - $t1) . '</p>';

$t1 = microtime(true);
// td.one
$xpath->query('//td[contains(concat(\' \', normalize-space(@class), \' \'), \' one \')]');
$t2 = microtime(true);


echo '<p>td.one: ' . ($t2 - $t1) . '</p>';
$t1 = microtime(true);
// tr td
$xpath->query('//tr//td');
$t2 = microtime(true);

echo '<p>tr td:' . ($t2 - $t1) . '</p>';

Results:

tr td.one: 0.24244713783264

td.one: 0.0057270526885986

tr td: 0.00080299377441406

So for some reason //tr//td is really fast, and just looking up any td with a specific class is also quite fast. But finding a td with a specific class inside a tr is over 40 times slower than that.

What the hell is going on here with xpath?
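
The measurement above can be reproduced in one self-contained script by generating the table inline. Exact timings vary by machine; the point is the relative gap between the three query shapes.

```php
<?php
// Build a 1000-row table and time three DOMXPath query shapes against it.
$xml = '<table>';
for ($i = 0; $i < 1000; $i++) {
    $xml .= '<tr><td class="one">A</td><td class="two">B</td></tr>';
}
$xml .= '</table>';

$doc = new \DOMDocument();
$doc->loadXML($xml);
$xpath = new \DOMXPath($doc);

// The class-matching predicate Transphporm generates for ".one".
$classTest = 'contains(concat(" ", normalize-space(@class), " "), " one ")';
$queries = [
    'tr td.one' => "//tr//td[$classTest]",
    'td.one'    => "//td[$classTest]",
    'tr td'     => '//tr//td',
];

foreach ($queries as $label => $query) {
    $t1 = microtime(true);
    $result = $xpath->query($query);
    $t2 = microtime(true);
    printf("%-9s %d nodes in %.6fs\n", $label, $result->length, $t2 - $t1);
}
```

A plausible explanation is that the descendant axis in the predicate-bearing query forces libxml2 to evaluate the class predicate once per tr/td pairing rather than once per td, but that is speculation; the numbers above are the observable fact.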

@TRPB
Member

TRPB commented Feb 2, 2020

The real-world result of that is that with 6 selectors this TSS takes 1.9 seconds for 1000 records:


td.one {
	content: iteration(1);
}

td.two {
	content: iteration(2);
}

td.three {
	content: iteration(3);
}

td.four {
	content: iteration(4);
}

td.five {
	content: iteration(5);
}

td.six {
	content: iteration(6);
}

While this takes 3.5:


tr td.one {
	content: iteration(1);
}

tr td.two {
	content: iteration(2);
}

tr td.three {
	content: iteration(3);
}

tr td.four {
	content: iteration(4);
}

tr td.five {
	content: iteration(5);
}

tr td.six {
	content: iteration(6);
}

Obviously the speed is slower if there are more selectors in the file.

I recommend using tr > td whenever you know that one element is a direct child of another, as it's considerably faster.

In your example, replace



/* This is the potentially large table: */
.claim-rows {
    repeat: data(output.claim_list);
}

.claim-rows .pctlnum a {
    content: iteration(patient_control_number);
}
.claim-rows .pctlnum a:attr(href):[iteration(ref_d9)!=""] {
    content: "/claim/", iteration(ref_d9);
}
.claim-rows .pctlnum :[iteration(ref_d9)=""] {
    content: iteration(patient_control_number);
}
.claim-rows .status {
    content: iteration(clx_status);
}
.claim-rows .description span {
    content: iteration(payer_description);
}
.claim-rows .description a:attr(href) {
    content: "/get-remit-detail/", iteration(status_handle);
}
.claim-rows .description a:[iteration(status_handle)=""] {
    display: none;
}
.claim-rows .description a:attr(data-target) {
    content: "#", iteration(status_handle);
}
.claim-rows .description .modal:attr(id) {
    content: iteration(status_handle);
}
.claim-rows .charged {
    content: "$", iteration(claim_charge_amount);
    format: decimal 2;
}
.claim-rows .paid-amount {
    content: "$", iteration(payer_paid_amount);
    format: decimal 2;
}
.claim-rows .patient-amount {
    content: "$", iteration(payer_patient_responsible);
    format: decimal 2;
}
.claim-rows .audit-number {
    content: iteration(payer_audit_number);
}

With



/* This is the potentially large table: */
.claim-rows {
    repeat: data(output.claim_list);
}

.claim-rows > .pctlnum a {
    content: iteration(patient_control_number);
}
.claim-rows > .pctlnum a:attr(href):[iteration(ref_d9)!=""] {
    content: "/claim/", iteration(ref_d9);
}
.claim-rows > .pctlnum :[iteration(ref_d9)=""] {
    content: iteration(patient_control_number);
}
.claim-rows > .status {
    content: iteration(clx_status);
}
.claim-rows > .description span {
    content: iteration(payer_description);
}
.claim-rows > .description a:attr(href) {
    content: "/get-remit-detail/", iteration(status_handle);
}
.claim-rows > .description > a:[iteration(status_handle)=""] {
    display: none;
}
.claim-rows > .description > a:attr(data-target) {
    content: "#", iteration(status_handle);
}
.claim-rows > .description > .modal:attr(id) {
    content: iteration(status_handle);
}
.claim-rows > .charged {
    content: "$", iteration(claim_charge_amount);
    format: decimal 2;
}
.claim-rows > .paid-amount {
    content: "$", iteration(payer_paid_amount);
    format: decimal 2;
}
.claim-rows > .patient-amount {
    content: "$", iteration(payer_patient_responsible);
    format: decimal 2;
}
.claim-rows > .audit-number {
    content: iteration(payer_audit_number);
}

and you'll see a performance improvement just from more efficient XPath queries behind the scenes.

If you do the following:

  1. Use the direct descendant operator where applicable
  2. Pull the latest master branch with several performance tweaks

Based on my figures above you should be able to go from ~60 seconds to ~6-10. A decent improvement but still a little slower than I'd like.

I managed to get my example above down from 11 seconds to 1.7 with these changes.

Also, remember to turn off xdebug's profiler for a real world test. With XDEBUG_PROFILE=1 my script runs 10x slower than without it set.

@cj-clx
Author

cj-clx commented Feb 3, 2020

Sorry, I had a busy weekend, so I wasn't able to be here to provide the data structure. XPath does seem to be behaving rather strangely. I'll change my TSS to use the direct descendant operator and pull the latest master. I've outfitted my code with some instrumentation so that I always record the total render time without using Xdebug (as you mentioned, it really kills performance). I greatly appreciate all the work. I constantly evangelize Transphporm, as every other templating system really fails the basic test of separating content from presentation.

Perhaps if there are enough specific cases of this TSS being faster than that TSS, I can write up a wiki page regarding performance optimizing.
