This connector extends Amazon Athena with Lambda-backed user-defined functions (UDFs) that wrap selected h3-java functions, enabling geospatial indexing and queries with Uber's H3. The API documentation for this repository is published as a Maven Site on GitHub Pages.
- Find the App in the AWS Serverless Application Repository
- Click 'Deploy'
# build
mvn clean verify package -Dpublishing=true
# deploy
sam deploy \
--resolve-s3 \
--stack-name aws-athena-udfs-h3-stack \
--template-file ./template.yaml \
--capabilities CAPABILITY_IAM
In your AWS SAM template.yaml file:
Resources:
  AwsAthenaUdfsH3:
    Type: AWS::Serverless::Application
    Properties:
      Location:
        ApplicationId: arn:aws:serverlessrepo:us-east-1:922535613973:applications/aws-athena-udfs-h3
        SemanticVersion: 1.0.0-rc7
      Parameters:
        # The name of the Lambda function, which calls the H3AthenaUDFHandler
        # LambdaFunctionName: 'h3-athena-udf-handler' # Uncomment to override default value
        # Lambda memory in MB
        # LambdaMemory: '3008' # Uncomment to override default value
        # Maximum Lambda invocation runtime in seconds
        # LambdaTimeout: '300' # Uncomment to override default value
The API is very similar to the h3-java API.
USING EXTERNAL FUNCTION geo_to_h3(lat DOUBLE, lng DOUBLE, res INTEGER)
RETURNS BIGINT
LAMBDA 'h3-athena-udf-handler'
SELECT geo_to_h3(52.495999878401896, 13.414889023293945, 13) h3_index;
|h3_index |
|------------------|
|635554602371582271|
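The index returned above carries its own resolution. As a minimal sketch, assuming the standard H3 cell index layout in which bits 52-55 hold the resolution (this layout is not described in this README), the resolution can be read back with plain bit arithmetic. The class and method names are hypothetical and are not part of this connector.

```java
// Sketch only: reads the resolution field from an H3 cell index.
// Assumption: bits 52-55 of the 64-bit index hold the resolution (0-15).
final class H3Resolution {

    static int resolutionOf(long h3Index) {
        // Shift the 4-bit resolution field down to the low bits, then mask it.
        return (int) ((h3Index >>> 52) & 0xFL);
    }

    public static void main(String[] args) {
        long h3Index = 635554602371582271L; // value returned by geo_to_h3 above
        System.out.println(resolutionOf(h3Index)); // prints 13
    }
}
```

The result matches the res argument (13) passed to geo_to_h3 in the query above.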
A GeoCoord in the h3-java API is represented as a well-known-text (WKT) point, which is compatible with Athena geospatial functions.
USING EXTERNAL FUNCTION h3_to_geo(h3 BIGINT)
RETURNS VARCHAR
LAMBDA 'h3-athena-udf-handler'
select h3_to_geo(635554602371582271) wkt_point;
|wkt_point |
|---------------------------|
|POINT (13.414849 52.496016)|
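Note that the WKT point lists longitude first, while geo_to_h3 takes latitude first. The following is a minimal client-side sketch for reading such a value back into latitude and longitude. The class and method names are hypothetical, and the pattern only handles plain decimal coordinates, not scientific notation or other WKT geometry types.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: parses a WKT point such as "POINT (13.414849 52.496016)".
final class WktPoint {

    // Two decimal numbers inside "POINT ( ... )", separated by whitespace.
    private static final Pattern POINT = Pattern.compile(
        "\\s*POINT\\s*\\(\\s*(-?\\d+(?:\\.\\d+)?)\\s+(-?\\d+(?:\\.\\d+)?)\\s*\\)\\s*");

    /** Returns {lat, lng}. WKT stores x (longitude) before y (latitude). */
    static double[] toLatLng(String wkt) {
        Matcher m = POINT.matcher(wkt);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a WKT point: " + wkt);
        }
        double lng = Double.parseDouble(m.group(1));
        double lat = Double.parseDouble(m.group(2));
        return new double[] {lat, lng};
    }

    public static void main(String[] args) {
        double[] latLng = toLatLng("POINT (13.414849 52.496016)");
        System.out.println("lat=" + latLng[0] + ", lng=" + latLng[1]);
    }
}
```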
USING EXTERNAL FUNCTION h3_to_string(h3 BIGINT)
RETURNS VARCHAR
LAMBDA 'h3-athena-udf-handler'
SELECT h3_to_string(635554602371582271) h3_address;
|h3_address     |
|---------------|
|8d1f18b25b9093f|
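The two sample values above are consistent with the H3 address being the lowercase hexadecimal form of the 64-bit index. As a sketch under that assumption, the conversion can be reproduced client-side with the JDK alone. The class and method names are hypothetical; inside Athena, prefer the h3_to_string UDF.

```java
// Sketch only: converts between a BIGINT H3 index and its string address.
// Assumption: the address is the hexadecimal rendering of the 64-bit index.
final class H3Address {

    static String toAddress(long h3Index) {
        return Long.toHexString(h3Index);
    }

    static long toIndex(String address) {
        // parseUnsignedLong accepts the full 64-bit range of hex input.
        return Long.parseUnsignedLong(address, 16);
    }

    public static void main(String[] args) {
        System.out.println(toAddress(635554602371582271L)); // prints 8d1f18b25b9093f
        System.out.println(toIndex("8d1f18b25b9093f"));     // prints 635554602371582271
    }
}
```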
See Querying with User Defined Functions
In the Athena console, using a workgroup with Athena engine version 2 enabled, choose a udf_name (any public method of the H3AthenaUDFHandler) and declare its signature as follows:
USING EXTERNAL FUNCTION udf_name(variable1 data_type[, variable2 data_type][,...])
RETURNS data_type
LAMBDA 'lambda-function-name' -- the LambdaFunctionName of the serverless app.
SELECT [...] udf_name(expression) [...]
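When queries are generated from application code, the template above can be filled in programmatically. The following is a minimal sketch: the class and method names are hypothetical, and it does no escaping or validation, so it should only be used with trusted, hard-coded inputs.

```java
// Sketch only: assembles a USING EXTERNAL FUNCTION query from its parts.
final class UdfStatement {

    static String build(String udfName, String returnType, String lambdaName,
                        String select, String... params) {
        // Each entry in params is a "name TYPE" pair, e.g. "lat DOUBLE".
        return "USING EXTERNAL FUNCTION " + udfName + "(" + String.join(", ", params) + ")\n"
            + "RETURNS " + returnType + "\n"
            + "LAMBDA '" + lambdaName + "'\n"
            + select + ";";
    }

    public static void main(String[] args) {
        System.out.println(build(
            "geo_to_h3", "BIGINT", "h3-athena-udf-handler",
            "SELECT geo_to_h3(52.495999878401896, 13.414889023293945, 13) h3_index",
            "lat DOUBLE", "lng DOUBLE", "res INTEGER"));
    }
}
```

Running it prints the same geo_to_h3 statement shown earlier in this README, which can then be submitted through any Athena client.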
Most h3-java API functions have an equivalent, snake-cased method in the H3AthenaUDFHandler API. Some do not.
- Functions returning lists of lists in the h3-java API are not supported. There is a limitation in the UserDefinedFunctionHandler that does not allow serialization of complex/nested types. These include:
  - kRings
  - kRingDistances
  - hexRange
- Experimental I, J coordinate h3-java API functions are not supported.
- The following UDFs do not work as expected and should not be used:
  - get_res_0_indexes() RETURNS ARRAY<BIGINT>
    - Note: always throws NullPointerException
  - get_res_0_indexes_addresses() RETURNS ARRAY<VARCHAR>
    - Note: always throws NullPointerException
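To guess a UDF name from an h3-java method name, the snake-case convention mentioned above can be approximated as follows. This is a rough sketch with hypothetical class and method names: it reproduces the three UDF names used in the examples in this README, but it does not split before digits (it would yield get_res0_indexes rather than get_res_0_indexes), so the public methods of H3AthenaUDFHandler remain the authoritative list.

```java
import java.util.Locale;

// Sketch only: approximates the camelCase to snake_case naming convention.
final class UdfNames {

    static String toSnakeCase(String camelCase) {
        // Insert "_" between a lowercase letter or digit and the uppercase letter after it.
        return camelCase
            .replaceAll("([a-z0-9])([A-Z])", "$1_$2")
            .toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(toSnakeCase("geoToH3"));    // prints geo_to_h3
        System.out.println(toSnakeCase("h3ToGeo"));    // prints h3_to_geo
        System.out.println(toSnakeCase("h3ToString")); // prints h3_to_string
    }
}
```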
In the Athena console, run the query in create_planet.sql to create some test data from the current OpenStreetMap database.
Then run test_udfs_planet.sql to verify that the H3 functions provided by this application are registered and working correctly.
In the Athena console, run create_hrsl.sql, and then run repair_hrsl.sql to create some test data from the Facebook Data For Good Population Density dataset.
In your SQL client, run the SQL script create_hrsl_h3.sql (or run each statement individually in the Athena console).
Then run create_planet_h3.sql.
The created tables have an H3 index at resolution 15.
Run restaurants_per_person.sql to get restaurants per person in Germany at H3 resolution 7. The query outputs H3 index strings for mapping with tools like Unfolded.ai.
Format your Java contributions with the Spotless Maven plugin. This is done automatically when running mvn verify or mvn install. Modify pom.xml to change formatting rules.
mvn spotless:apply
The GitHub Pages site is built with mvn site and is published manually. Change the contents of the site by modifying pom.xml and site.xml.
Build the site locally.
mvn -Preporting site site:stage
# Open the built site in your browser
open ./target/site/index.html
Publish the site to GitHub Pages.
mvn scm-publish:publish-scm
Publishing this code to the AWS Serverless Application Repository is done manually. Publish a new semantic version for each new tagged commit on the main branch of this repository.
# build
mvn spotless:apply clean install -Dpublishing=true
# package
sam package \
--resolve-s3 \
--output-template-file ./target/packaged.yaml
# publish
sam publish \
--template-file ./target/packaged.yaml \
--semantic-version 1.0.0-rc7
See the AWS blog post Translate and analyze text using SQL functions with Amazon Athena, Amazon Translate, and Amazon Comprehend
This project is licensed under the Apache-2.0 License.