LD_LIBRARY_PATH override stomps EMR settings #6
Comments
Things to keep in mind:
EmrConfig("spark-env").withProperties(
"LD_LIBRARY_PATH" -> "/usr/local/lib:$LD_LIBRARY_PATH"
)
CloudNiner added a commit to geotrellis/geotrellis-osm-diff-demo that referenced this issue on Jun 3, 2019
This is a multi-stage improvement:

1. Switch to a full join so that we can compare all combinations of OSM and Bing buildings
2. Match with a series of tiered strategies. First filter any geoms from the match list that intersect with no other geoms. Then perform the original centroid check. Then perform an intersection area overlap check.
3. Allow one to many matching for Bing -> OSM. Bing areas are generally larger and more poorly defined since they're satellite derived footprints. This change generally allows us to retain fidelity in the output when Bing sees one large area that is actually a tight cluster of buildings such as a block of row homes.

The schema of the output tiles is changed. The properties for each output geometry are:

Old:
- "hasOsm": Boolean

New:
- "source": String, one of "osm"|"bing"|"both"
- "name": String, value of osm tag "name", if available. Defaults to empty string.
- "building_type": String, value of osm tag "building", if available. Defaults to empty string.

Adds support to the CLI for dumping the input RDD layers in addition to the output diff layer via the `--source` argument.

Fixes a bug where the geomesa geojson writer fails to properly encode string properties that contain double quotes. See:
- https://geomesa.atlassian.net/browse/GEOMESA-2631
- https://geomesa.atlassian.net/browse/GEOMESA-2630

Fixes a bug where invalid geoms could persist to the diff algorithm and throw errors during the intersection check phase.

Fixes a EMR bootstrap bug where our library path override meant to allow EMR to find our GDAL installation overwrites the defaults in EMR which leaves the cluster executors unable to find other libs. See: geotrellis/geotrellis-spark-job.g8#6
Leading to the following exception in any job where Spark attempts to compress data:
This was verified by running the following:
versus
We override this setting so that executors are able to find the GDAL bindings installed by the bootstrap script. I'd expect to have the Spark cluster configured such that all of the native EMR-installed libs as well as GDAL are available on the LD_LIBRARY_PATH.
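To make the expected end state concrete, here is a minimal sketch of merging an extra directory into an existing colon-separated library path instead of replacing it. `LibraryPath.prepend` is a hypothetical helper written for this illustration; it is not part of sbt-lighter, Spark, or EMR. The `/example/native/...` entries are placeholders, not the actual EMR defaults. The sketch only shows what the resulting value should look like and makes no claim about how EMR applies the `spark-env` classification.

```scala
// Hypothetical helper for illustration; not part of sbt-lighter, Spark, or EMR.
object LibraryPath {

  /** Prepend `extraDirs` to a colon-separated search path.
    *
    * Every entry already present in `existing` is kept, in order. Empty
    * segments are dropped and duplicates are removed, keeping the first
    * occurrence.
    */
  def prepend(extraDirs: Seq[String], existing: String): String =
    (extraDirs ++ existing.split(":").toSeq)
      .filter(_.nonEmpty)
      .distinct
      .mkString(":")
}

object LibraryPathExample extends App {
  // Placeholder standing in for whatever the cluster sets by default.
  val clusterDefault = "/example/native/a:/example/native/b"

  // The GDAL directory goes first; the cluster's own entries survive.
  println(LibraryPath.prepend(Seq("/usr/local/lib"), clusterDefault))
}
```

Putting the GDAL directory first means the bootstrap-installed bindings are found before anything else. Keeping the existing entries avoids the failure described in this issue, where executors lose access to the other native libraries.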