Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for LineStringZM #171

Merged
merged 5 commits into from
Oct 9, 2023
Merged

Conversation

kanatohodets
Copy link
Contributor

This PR adds support for LineStringZM, a Line String Z with a 4th value for
arbitrary metadata, like timestamps of measurements.

If this looks good to you, I'm happy to send more PRs adding 'M' variants of
other geo types (individually or batched, whatever your preference).

Kudos for an easy codebase to work with! I found it super straightforward to
implement this, even as a newcomer to GIS.

@kanatohodets kanatohodets force-pushed the btyler/line_string_zm branch from f925f0f to 7eb2282 Compare April 16, 2022 06:38
@kanatohodets
Copy link
Contributor Author

Hey @bryanjos - I realized I hadn't updated the README, so that's been fixed.

Does it make sense to you for me to go ahead and add support for the other "ZM" variants which are not currently supported by geo?

No sweat if you don't have time to review or merge anything for a while, just wanted to make sure I'm not digging too much of a hole.

@bryanjos
Copy link
Contributor

@kanatohodets I think what you have here is good. But there is a conflict here with the PR I just merged in. I also think that PR tightened up the JSON parsing to be more GeoJSON compliant and maybe the ZM variants don't work for GeoJSON. Other than that, everything else looks good!

@kanatohodets
Copy link
Contributor Author

Well! Things got a bit interesting as I dug into the merge conflict.

TL;DR: I think I found an old bug with 'Z' things and GeoJSON decoding. I'm
happy to work on solutions, but it likely means some breaking changes.

  1. The conflict was in the tests, and was really just a positional conflict (I shuffled around a neighboring test for the new one; the spec compliance PR renamed it).

  2. Interestingly, when I untangled the conflict and left my test in place, both tests passed! This was weird, because the PR should have exactly broken my test of LineStringZM GeoJSON round-tripping.

  3. It turns out that my test was passing because only for LineStringZ
    list_to_tuple/1 (with the "drop 3rd position or beyond" behavior) is not
    called on GeoJSON decode. Instead, it uses List.to_tuple. I copied this when
    I made LineStringZM. This dates from f49c6f1.

As far as I understand the GeoJSON spec, it does not allow LineStringZ (or any
other 'Z' types) in the "type" field. So, I think f49c6f was in error. The same
goes for MultiPolygonZ, introduced in 0b49b0b.

PointZ, however, is correct - it does not parse "PointZ".

I think this means that geo has been accidentally supporting "type" names of
LineStringZ and MultiPolygonZ when decoding GeoJSON. However, it was
a misleading situation, as it has discarded the Z bit since 2019
(9ebef60).

I would suggest the following:

  1. Remove support for any 'Z' (or M) type names from the GeoJSON decoder. Error
    if such a type is given in the GeoJSON, as PostGIS does. This is a breaking
    change, but support for these types was already not working (as the 3rd
    position was stripped).

  2. When GeoJSON types such as LineString are given with the 3rd position
    present in coordinates, assume it is Z, as the spec indicates. Parse it into
    the correct 'Z' type. This means that a LineString GeoJSON could be decoded
    into either Geo.LineString or Geo.LineStringZ, depending on the data.

  3. Continue to strip out 4th or greater positions on GeoJSON decode, such as
    'M'. Decode 4+ member coordinates into 3 member 'Z' types.

  4. (more of a discussion topic, as I'm really new to this space) - maybe
    there's appetite out there for "relaxed GeoJSON" which supports a wider range
    of types? could imagine this as an alternative function, with a config setting
    for the default behavior (defaulting that setting to "strict").

Do any of these suggestions resonate with you? I think doing 1-3 will bring
decoding into compliance with the spec, and also behave similarly to PostGIS.
4 is probably dubious, fomenting minor rebellion against the spec. That said,
it could provide a path for folks who have been relying on geo to parse
GeoJSON with the 'Z' type names.

I'm happy to pick up this work (or something else entirely!) in whatever order
you think is reasonable.

Examples from my reference parser, PostGIS 3.2:

ST_GeomFromGeoJSON throws an error when given 'LineStringZ' as type:

  yolo_dev=# select ST_AsText(ST_GeomFromGeoJSON('{"type": "LineStringZ",
  "coordinates":[[1, 2, 3], [4, 5, 6]]}'));
  ERROR:  invalid GeoJson representation

The same function decodes into a LineStringZ when the GeoJSON coords have
3 members:

yolo_dev=# select ST_AsText(ST_GeomFromGeoJSON('{"type": "LineString", "coordinates": [[1, 2, 3], [4, 5, 6]]}'));
         st_astext
----------------------------
 LINESTRING Z (1 2 3,4 5 6)

And when GeoJSON is provided with the 'M' member, it is ignored and decoded as
'LineStringZ':

yolo_dev=# select ST_AsText(ST_GeomFromGeoJSON('{"type": "LineString", "coordinates":[[1, 2, 3, 4], [5, 6, 7, 8]]}'));
         st_astext
----------------------------
 LINESTRING Z (1 2 3,5 6 7)

@bryanjos
Copy link
Contributor

@kanatohodets thanks for the research you put in here. I am aligned in your thinking with points 1-3. And for point 4, I would just stick to the spec. There's been enough discussions and issues around it that I think it's best to stick to the spec.

@bryanjos
Copy link
Contributor

@kanatohodets And I think geo will need a breaking change for some unrelated things so with what you have set forth, I think you can add some deprecations until that change is ready.

Copy link
Contributor

@s3cur3 s3cur3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for your work on this, @kanatohodets! I learned a lot reading through this discussion, and the code and tests look great.

I'm gonna merge this with the tests fixed to match the existing behavior of dropping the M coordinate on GeoJSON encode, since it broadly aligns with the way the rest of the Z and M handling works already, but I've filed #188 to follow up on the GeoJSON compatibility stuff. I think I'm with you all of points 1-4—in particular, my preference would be to continue accepting "type": "LineStringZ" for backward compatibility. Even though it's out of spec, it seems easy enough to avoid breaking in case somebody's relying on that behavior, and I don't think anyone's using Geo's parser as a check for whether a file is spec-compliant.

Comment on lines 147 to 150
# GeoJSON does not allow "m", and does not recognize "LineStringZM" as
# a type
coordinates = Enum.map(coordinates, fn({x, y, z, _m}) -> [x, y, z] end)
%{"type" => "LineString", "coordinates" => coordinates}
Copy link
Contributor

@s3cur3 s3cur3 Oct 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a fine compromise for compatibility. I wonder, though, whether it'll be surprising to folks that the m value totally disappears. If people complain, we may want to do something like include the m in the GeoJSON feature's properties. That seems like a hack, though, so not something I'd do proactively.

@@ -1,12 +1,14 @@
defmodule Geo.WKB.Decoder do
@moduledoc false

# these numbers can be referenced against postgis.git/doc/ZMSgeoms.txt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The link is very much appreciated!

@s3cur3 s3cur3 merged commit f989295 into felt:master Oct 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants