This GitHub lab will take you through hands-on exercise of visualizing graph data in Amazon Neptune using VIS.js library. Amazon Neptune is a fast, reliable, fully managed graph database service available from AWS. With Amazon Neptune you can use open source and popular graph query languages such as Apache TinkerPop Gremlin for property graph databases or SPARQL for W3C RDF model graph databases.
Currently, customers can leverage our partner solutions mentioned below to perform visualization of data in Amazon Neptune. Some of our partners in this area are -
- Tom Sawyer Perspectives by TomSawyer Softwares
- Keylines by Cambridge Intelligence
- Metaphactory by Metaphacts etc.
Customers can leverage above commercial solutions to visualize data in Amazon Neptune.
There are also open source libraries and solutions available. Few of them are - - GraphExp open source visualization tool by
- D3.js javascript library by D3JS.org
- VIS.js open source library by VISJS.org etc. Customers can use these visualization libraries to build their own applications and products on top of Amazon Neptune.
In the rest of the lab, we will focus on visualizing data in Amazon Neptune using VIS.js. VIS.js is a Javascript library used for visualizing graph data. It has various components such as DataSet, Timeline, Graph2D, Graph3D, Network etc. for displaying data in various ways.
- Amazon Neptune Cluster
- Access to AWS IAM Permissions for
-> creating AWS Lambda functions
-> creating IAM roles for Amazon Neptune cluster to access S3 and, for API Gateway to access AWS Lambda functions
-> APIs in Amazon API Gateway
-> Amazon S3 buckets
-> creating and attaching VPC Endpoints to the VPC
Below is the architecture that we will be using to visualize data in Amazon Neptune using VIS.js. Since Amazon Neptune instance can't be accessed outside the VPC, we are using AWS Lambda function in VPC. This AWS Lambda function is accessed through the proxy created and exposed to internet using Amazon API Gateway. Once the proxy is exposed, we can access the APIs from Javascript code being executed from Amazon S3 (static website).
- Provision Amazon Neptune Cluster
- Load sample data into Amazon Neptune
- Create and Configure AWS Lambda Function
- Create and Configure Amazon API Gateway - Proxy API
- Configure Amazon S3 bucket for hosting a static website
Follow the procedure mentioned in Amazon Neptune Documentation to provision a new Amazon Neptune Instance inside a VPC.
This GitHub repo provides you sample twitter data that you can load into Amazon Neptune.
To load the data into Amazon Neptune, you will need to
- copy the sample data to Amazon S3
- create IAM Role for accessing S3 bucket and attach it to Amazon Neptune cluster
- create Amazon S3 VPC Endpoint and attach it to the VPC
- create a "bastion host" inside the same VPC to load data from Amazon S3 into Amazon Neptune.
For details on how to load data from Amazon S3 to Amazon Neptune please refer this link from AWS documentation.
Once you load data into Amazon Neptune, you need to create the AWS Lambda function to access this data and expose it over RESTful interface through Amazon API Gateway.
Execute below steps from the terminal to create the deployment package and create AWS Lambda function -
sudo yum install git
sudo yum install npm
git clone https://github.com/aws-samples/amazon-neptune-samples.git
cd gremlin/visjs-neptune
npm install
zip lambdapackage.zip -r node_modules/ indexLambda.js
Once AWS Lambda deployment package (.zip file) is ready, we can create the Lambda function using AWS CLI.
To install and configure AWS CLI on your operating system, please refer to - https://docs.aws.amazon.com/cli/latest/userguide/installing.html
After installing AWS CLI, run aws configure
to set the access_key, secret_key and AWS region.
Run below commands to create AWS Lambda function within the same VPC as Amazon Neptune cluster.
AWS Lambda function would need an execution role to be able to create ENIs in the VPC for accessing the Neptune instance.
aws iam create-role --path /service-role/ --role-name lambda-vpc-access-role --assume-role-policy-document '{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "lambda.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}' --description "VPC Access role for lambda function"
Below is the command to attach the policy provided by AWS to the above role.
aws iam attach-role-policy --role-name lambda-vpc-access-role --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaENIManagementAccess
We will now create the AWS Lambda function using the deployment package and IAM role created in previous steps.
NOTE: Use subnet-ids
from the VPC in which Amazon Neptune cluster is provisioned.
aws lambda create-function --function-name <lambda-function-name> \
--role "arn:aws:iam::<aws-account-number>:role/service-role/lambda-vpc-access-role" \
--runtime nodejs14.x --handler indexLambda.handler \
--description "Lambda function to make gremlin calls to Amazon Neptune" \
--timeout 120 --memory-size 256 --publish \
--vpc-config SubnetIds=<subnet-ids>,SecurityGroupIds=<sec-group-id> \
--zip-file fileb://lambdapackage.zip \
--environment Variables="{NEPTUNE_CLUSTER_ENDPOINT=<your-neptune-cluster-endpoint>,NEPTUNE_PORT=<your-neptune-db-port>}"
We recommend you to go through the AWS Lambda function source code at this point to understand how to query data using Gremlin
APIs and how to parse and reformat the data to send it over to the clients.
We would expose the AWS Lambda function created in the earlier step through Amazon API Gateway Proxy API.
For more details on this approach please refer to AWS documentation @https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-lambda-proxy-integrations.html
First we create the Restful API using the below command from AWS CLI.
aws apigateway create-rest-api --name lambda-neptune-proxy-api --description "API Proxy for AWS Lambda function in VPC accessing Amazon Neptune"
Note the value of "id" field from the earlier output and use it as a <rest-api-id>
value below.
aws apigateway get-resources --rest-api-id <rest-api-id>
Note the value of "id" field from the earlier output and use it as a <parent-id>
value below.
Below command would create a resource under the root strutrure of the API.
aws apigateway create-resource --rest-api-id <rest-api-id> --parent-id <parent-id> --path-part {proxy+}
Note the value of "id" field from the output and use it as a <resource-id>
in the command below.
aws apigateway put-method --rest-api-id <rest-api-id> --resource-id <resource-id> --http-method ANY \
--authorization-type NONE
So far we created, an API, API Resource and Methods for that Resource (GET/PUT/POST/DELETE or ANY for all methods). We will now create the API method integration, that would identify the AWS Lambda function for which this resource will be acting as a PROXY.
Use appropriate values obtained from the previous commands.
aws apigateway put-integration --rest-api-id <rest-api-id> \
--resource-id <resource-id> --http-method ANY --type AWS_PROXY \
--integration-http-method POST \
--uri arn:aws:apigateway:<aws-region-code>:lambda:path/2015-03-31/functions/arn:aws:lambda:<aws-region-code>:<aws-account-number>:function:<lambda-function-name>/invocations
Finally we deploy the API using below command.
aws apigateway create-deployment --rest-api-id <rest-api-id> --stage-name test
In order for Amazon API Gateway API, to invoke AWS Lambda function we either need to provide the "execution-role" to API Integration or we can also add the permission (subscription) in AWS Lambda explicitly that says an API "X" can invoke a lambda function. This API Gateway subscription is also reflected in AWS Console.
Execute below command to add API Gateway subscription/permission to AWS Lambda function.
aws lambda add-permission --function-name <lamnda-function-name> \
--statement-id <any-unique-id> --action lambda:* \
--principal apigateway.amazonaws.com \
--source-arn arn:aws:execute-api:<aws-region-code>:<aws-account-number>:<rest-api-id>/*/*/*
We have now created an API Gateway proxy for the AWS Lambda function.
Now that we have all the backend infrastructure ready for handling the API requests to get data out from Amazon Neptune, let's create an Amazon S3 bucket that will be used to host a static website.
Run below commands to create an Amazon S3 bucket as a static website and upload the visualize-graph.html
into it.
--create Amazon S3 bucket with public read access
aws s3api create-bucket --bucket <bucket-name> --acl public-read --region <aws-region-code> --create-bucket-configuration LocationConstraint=<aws-region-code>
--configure website hosting on S3 bucket
aws s3api put-bucket-website --bucket <bucket-name> --website-configuration '{
"IndexDocument": {
"Suffix": "visualize-graph.html"
},
"ErrorDocument": {
"Key": "visualization-error.html"
}
}'
The visualize-graph.html file that we have as a part of this GitHub repository has to be updated to reflect the API Gateway Endpoint that we created in the steps above.
Execute below commands to replace value of API_GATEWAY_ENDPOINT
placeholder by real API Gateway Endpoint.
You can obtain the value of API_GATEWAY_ENDPOINT
from AWS Management Console by navigating to API Gateway -> APIs -> Select the API Name -> Stages. Copy the value of Invoke URL
field as API_GATEWAY_ENDPOINT
value as replace it in the below commands.
You can also construct this URL using https://<rest-api-id>.execute-api.<aws-region-code>.amazonaws.com/<stage-name>
.
NOTE: While executing the below commands make sure to use escape character "\" in URLs.
For Linux:
sed -i -e 's/API_GATEWAY_ENDPOINT/<API-Gateway-Endpoint>/g' visualize-graph.html
e.g.
sed -i -e 's/API_GATEWAY_ENDPOINT/https:\/\/7brms4lx43.execute-api.us-east-2.amazonaws.com\/test/g' visualize-graph.html
For MacOS:
find . -type f -name visualize-graph.html | xargs sed -i '' 's/API_GATEWAY_ENDPOINT/<API-Gateway-Endpoint>/g'
e.g.
find . -type f -name visualize-graph.html | xargs sed -i '' 's/API_GATEWAY_ENDPOINT/https:\/\/7brms4lx43.execute-api.us-east-2.amazonaws.com\/test/g'
Once you have replace the value of placeholder API_GATEWAY_ENDPOINT
in visualize-graph.html file, upload the file to S3 using below command.
--upload the html document with public read access
aws s3 cp ./ s3://<bucket-name> --recursive --exclude "*" --include "vis*" --acl public-read
And, you are all set!
Visualize the graph data through this application from below URL.
http://<bucket-name>.s3-website.<aws-region-code>.amazonaws.com
In this GitHub lab, you learned how to visualize data from Amazon Neptune using VIS.js library, Amazon API Gateway and AWS Lambda service. Feel free to send us your feedback and suggestions. We would be happy to incorporate those in our lab.