Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add explicit language around edges and ensure available options capture engine behaviours that import/export Parquet #250

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

paleolimbot
Copy link
Collaborator

This PR is an attempt to improve the language around the "edges" key, to ensure it can capture the intent of all potential producers. We've done more research since the original wording and have some specific examples and use cases where implementations and specifications have struggled. Notably, this PR:

  • Adds a "geodesic" option. I only knew about the spherical approximation because I'd only worked with s2, but it's been rightly pointed out that geodesic edges that consider the ellipsoid are a thing in several major engines.
  • Adds examples of producers and consumers that use each edge type
  • Adds explicit language around "planar" from simple features access
  • Adds practical considerations around importing a column with an unsupported edge type (basically: you can import them silently if there are no edges; you should force an opt-in or display a prominent warning otherwise).

I've also added a PR to add similar language to GeoArrow geoarrow/geoarrow#79 .

@paleolimbot paleolimbot changed the title Add explicit Add explicit language around edges and ensure available options capture engine behaviours that import/export Parquet Dec 4, 2024
> **simple feature** feature with all geometric attributes described piecewise
> by straight line or planar interpolation between sets of points (Section 4.19).

- `"spherical"`: Edges follow the shortest distance between vertices approximated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can one describe the perfect sphere via a geodetic CRS? That way, we can merge spherical and geodesic together and recommend everyone to look into the crs key for interpreting the edge?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, a sphere is described by an ellipsoid with an inverse flattening conventionally set to 0 (it should actually be infinity, but the 0 convention is heavily used)
I believe an issue though (mentionned in some other discussion) is that in some situations some software might apply the formulas for spheres on non-spherical ellipsoids as a simplification avoiding using elliptical integrals

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we did this we would loose the original CRS information, which is important for ensuring that the position of every single vertex is precisely defined (e.g., a CRS transform from a perfect sphere ellipsoid to UTM, for example, is not the transform we want to invoke on the longitude and latitude of the vertices). The spherical approximation is strictly limited to the (usually small) space between adjacent vertices.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference between the geodesic and spherical are probably always trivial (I can at some point do some math to get a handle on the maximum/average error), but if we want a precise definition, we need it to be there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I understand. spherical mode actually overrides the edge specified in the crs. That's where the confusion came from.

I think this is making great progress. I will schedule another meeting among all parties to settle down on the geography type of Iceberg

Co-authored-by: Joris Van den Bossche <[email protected]>
@jorisvandenbossche
Copy link
Collaborator

I think this is good and clarifying content to add (thanks Dewey!), but one practical concern: do we have an idea if anyone would actually be planning to use the distinction between "spherical" and "geodesic" ?
For example, would bigquery, when importing data that is marked as "geodesic", do a similar edge tessellation like with importing planar data to ensure there is a certain max error? (and bigquery here is just an example, same applies to all other engines or packages mentioned. But at least BigQuery does correct planar data, while I have not seen similar documentation for several of the other systems)

Because if there is not actually an "ask" for this, do we need to add an actual separate option (instead of just documenting this and adding most of the language you are adding here)

@paleolimbot
Copy link
Collaborator Author

Because if there is not actually an "ask" for this, do we need to add an actual separate option (instead of just documenting this and adding most of the language you are adding here)

That's a great point (we could just acknowledge that such edges exist without allowing the option). I think allowing the option is good because in GeoArrow we actually will interact with that edge type as a vessel for database results coming from PostGIS and Redshift at least, and it would be a bit odd if we allowed the option there but not here.

In the same sphere (pun slightly intended), we could also mention loxodrome edges (constant bearing), which GeographicLib can also interpret. I don't know of any producers of Arrow or Parquet that do this but it does exist.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants