Skip to content
This repository has been archived by the owner on Jan 12, 2024. It is now read-only.

Possibly suboptimal translation to XML #20

Open
ajashton opened this issue Mar 7, 2011 · 18 comments
Open

Possibly suboptimal translation to XML #20

ajashton opened this issue Mar 7, 2011 · 18 comments

Comments

@ajashton
Copy link
Contributor

ajashton commented Mar 7, 2011

I'm not sure whether this is a bug, but I managed to create ~50 lines of Carto that get translated into something like 12,000 lines of XML. I probably should just be breaking things up differently, but perhaps there's room for optimization on Carto's side as well.

The offending style is at https://gist.github.com/859095 and the data to go with it is http://dl.dropbox.com/u/2398828/country-labels.geojson

Not the main issue, but I noticed many identical fontsets are created as well.

@tmcw
Copy link
Contributor

tmcw commented Jun 15, 2011

The fontset duplication was fixed in #21, I'll check out this output to see if there's a way around duplication here, though I'm suspicious that this style layout simply creates some exponential number of styles because it needs to.

@yhahn yhahn closed this as completed in 370e3f9 Aug 4, 2011
@yhahn yhahn reopened this Aug 5, 2011
@gravitystorm
Copy link
Contributor

This is triggered by having multiple column names in the same style (Z_ABBREV, Z_ADMIN, Z_NAME in ajashton's example). You can get the same thing easily with osm rules with amenity, landuse, leisure etc exploding exponentially.

A workaround that ajashton suggested is to use a different ::attachment for each object. It's sufficient to use one for each column name, for example:

.poi[amenity='post_box'][zoom>=17]::amenity { .,. }
.poi[railway='station'][zoom>=10]::railway { ... }
.poi[amenity='post_office][zoom>=17]::amenity { ... }

Note that if you do this, then the ordering of the features changes from that expected from the layer datasource, since all the objects with a given ::attachment are rendered first, then all the second ::attachment, and so on.

@tmcw
Copy link
Contributor

tmcw commented Sep 7, 2011

Pushing this to 0.2.5: very high priority for a fix, but barely livable for the time being.

@tmcw
Copy link
Contributor

tmcw commented Dec 18, 2012

Okay, this is still a high priority, but I'm still kind of at a loss for a compact summary of the expected output and how this differs.

I can see some issue in https://gist.github.com/4329932 in which excess rules are generated that will never be used, but not sure if this is the same problem that this ticket addresses.

tmcw added a commit that referenced this issue Dec 18, 2012
@tmcw
Copy link
Contributor

tmcw commented Dec 18, 2012

Experimental branch for this at https://github.com/mapbox/carto/tree/condense

Since this work will likely change the output of a lot of tests and we don't have visual tests for carto, it's not an easy thing to do on my own.

@springmeyer
Copy link

re: condense branch, sweet, happy to try to test
re: overall issue of what the heck is ideal/expected - I'm also kind of at a loss. I need to play around with small examples more to get a handle on this. But, what I'm wondering is if we may also need a non-cascading mode (e.g. a non filter-mode="first" mode)...

@tmcw
Copy link
Contributor

tmcw commented Dec 18, 2012

The main issue I'm seeing is that zoom rules are not properly taken into consideration when we're condensing filters, so we're generating 3 rules where there should be two for each zoom/filter combo.

@springmeyer
Copy link

fantastic observation.

@springmeyer
Copy link

Adding more benchmarking statements shows that for the above testcase (https://gist.github.com/859095) most time is taken up in tree.Definition.prototype.toXML. It does not appear that any one definition takes a long time, but rather that their are so many to process (245). So, @tmcw 's attempt to reduce rule generation - in any way possible - seems solid.

$ ./bin/carto issue_20.mss -b
Parsing time: 8ms
Rule generation time: 1ms
Inheritance time: 28ms
Sorted time: 1ms
Create style "0": 0ms
        Style level properties: 5ms
        Calling toXML on 245 definitions (rules): 2161ms
Calling toXML for style "0": 2168ms
Total Style toXML time: 2168ms
TOTAL: 2214ms

@tmcw
Copy link
Contributor

tmcw commented Dec 19, 2012

Finally got some profiling going: https://gist.github.com/4338438

Basic approach:

node --prof ./bin/carto -b test/rendering/issue20.mss

Check out v8, install, set

export D8_PATH=/Users/tmcw/src/v8/

then

~/src/v8/tools/mac-tick-processor >> analysis.txt

Basic jist of this is in 4ef52a8 which reduces render time for this input from 0.35s to 0.14s on my machine. This is only a generation improvement, though - the output still needs to be fixed.

@tmcw
Copy link
Contributor

tmcw commented Dec 19, 2012

Losing track of how this is a bug again - here's a reduced test case: https://gist.github.com/4339857

I get the 'extra filter generated' case, but that's at most adding a few more rules - I need a better example of ideal versus generated output, since it's very difficult to tell the difference between basic filter-explosions based on NxMxX combinations (which we can never fix in Carto until Mapnik does cascading) and a bug that can be fixed here.

@springmeyer
Copy link

okay, back in my court to see if I can gin up a better testcase.

@gravitystorm
Copy link
Contributor

So I think this issue represents a "fundamental" problem/feature of carto, rather than a bug that is easily addressed.

It's really, really easy to run into when dealing with OSM data because of the multiple-key approach to tagging. But what's easy in XML turns into a filter explosion in carto. A typical landcover layer - say 20 polygon types spread over 4 or 5 column names - is easy to write in XML with 20 filters in the XML. In carto, however, 20 rules end up with hundreds or thousands of filters in the output, and you end up with the kind of complex gymnastics shown in the sql query at gravitystorm/openstreetmap-carto#15 to achieve a similar rendering effect.

I see two ways forward with this, but neither are great:

a) Have a non-filter-mode-first output option for particular layers. This would allow writing carto that is a direct port of a simple xml landcover-style layer, i.e. where each feature could match multiple rules without needing NxMxX filters. But I suspect that might involve a bit of a rewrite of carto.

b) Have an "I don't care" or "match-one-rule" flag for particular layers. This would keep the filter-mode-first flag on the xml output, but instruct carto not to build all of the NxMxX filters. It would effectively say "if there's a feature with both landuse=forest and tourism=attraction, I'm happy with whichever rule comes first. Don't try figuring out the NxM combo". It's of limited applicability but would solve the opaque-polygon-fill case.

Slightly hand-wavy, but maybe provides ideas? My aim is for the carto required for these scenarios to be as simple to write as the XML, and ideally keep the complexity of the SQL to a minimum too.

@tmcw
Copy link
Contributor

tmcw commented Jan 7, 2013

Hey @gravitystorm - I agree. There are some filter-layer combinations that weren't really thought of in the early days of Carto and are handled poorly.

At this point, there are a few factors which basically explain why this is not moving quickly: limited time to work on Carto, lack of non-XML tests, requirement of backwards compatibility, lack of 'ideal output examples' for this ticket. Hence everyone being aware of the problem and its importance but a little limited in ability to jfdi it.

At some point, Dane will be working on symbolizer-based tests, and if you can figure out blue-sky examples of input & output, that would be super-helpful.

@lxbarth
Copy link

lxbarth commented Jan 7, 2013

How much of a show stopper is this issue for rolling out osm-carto on osm.org? Is @springmeyer's suggestion to patch in the old landcover style XML a feasible stop gap until we have the time to tackle the underlying issue?

@gravitystorm
Copy link
Contributor

It's not a showstopper at all - there are workarounds, and some are in place. This is more about the general case, and removing the need for workarounds in the first place. #235 is more of an problem for roll-out at the moment.

@emiltin
Copy link

emiltin commented Sep 14, 2013

I experienced the XML explosion and had a hard time figuring out what was causing it. I haven't looked deep into the issue, but it does seems the XML has a lot of unnecessary rules. For example, the style I'm working with generates this nonsensical rule that looks for ways that are bridges and tunnels at the same time:

  <Rule>
    <MaxScaleDenominator>400000</MaxScaleDenominator>
    <MinScaleDenominator>500</MinScaleDenominator>
    <Filter>([tunnel] = 1) and ([bridge] = 1)</Filter>
    <LineSymbolizer stroke-dasharray="3, 3" stroke="#adaca8" stroke-linecap="butt" stroke-linejoin="round" />
  </Rule>

I didn't to a reduction of this yet, so there isn't not much point is showing the carto code, but nowhere does it look for bridgetunnels :-)

@pnorman
Copy link
Contributor

pnorman commented Jun 29, 2014

Are there any updates on ways to avoid this issue without doing complicated "feature" column expressions in SQL?

@nebulon42 nebulon42 added wishlist and removed osm labels Mar 2, 2016
amandasaurus pushed a commit to geofabrik/openstreetmap-carto-vector-tiles that referenced this issue Mar 28, 2017
…sible

Previously, this style had attempted to replicate the upstream osm-carto
exactly. This meant the using the `way_pixels` and zoom trick to get the
same results. But that results in a combinatorial explosion for
amenity-points, which means carto can go from 30 seconds to parse the
file, to 3½minutes. By removing this inner selectors, the carto
performance will be back to about 30 seconds.

This affects the start up time of tessera/tilelive.

cf.:

 * mapbox/carto#470
 * mapbox/carto#20
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

9 participants