This Greengrass accelerator demonstrates how to run machine learning inference (MLI) on a local device at the edge and publish the inference results to connected Greengrass Aware Devices (GGADs) and back to the cloud.
This accelerator covers the following:
- Various options for the source of local inference
- Local functions that invoke the inference services running on the device
Example use cases for local MLI:
- Image Analytics in Surveillance - Instead of having a security camera stream its entire video feed to the cloud to be analyzed for situations such as detection of unknown people or objects, the analysis can be done locally in the camera, keeping the cloud-based analysis to a much smaller segment of the video data. Such an architecture reduces network congestion, often reduces cost because less data is sent to the cloud, and avoids latencies that are crucial in real-time surveillance applications.
- Sensor Analytics - Sensors attached to a machine can measure vibration, temperature, or noise levels, and inference at the edge can predict the state of the equipment and potential anomalies for early indications of failure.
Imagine an automated sorting and recycling facility with a number of cameras that identify waste.
Each camera snaps an image of the waste at a predetermined frequency of once per second.
The local process captures the image at that frequency and forwards it to the local inference engine.
Finally, the process runs local logic based on the predictions, as sketched below.
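A minimal sketch of that capture-infer-act loop, assuming hypothetical `capture_image()`, `preprocess()`, `infer()`, and `actuate()` helpers (these names are placeholders, not part of the accelerator code):

```python
import time

CAPTURE_INTERVAL_SECONDS = 1  # predetermined frequency of once per second


def sorting_loop(capture_image, preprocess, infer, actuate):
    """Capture, infer, and act in a loop; the callables are hypothetical stand-ins."""
    while True:
        started = time.time()

        image = capture_image()          # raw frame from the local camera
        model_input = preprocess(image)  # e.g. resize and normalize the image
        prediction = infer(model_input)  # local inference (connector or in-process)
        actuate(prediction)              # local logic, e.g. route the waste to a bin

        # keep roughly to the one-second capture frequency
        elapsed = time.time() - started
        time.sleep(max(0.0, CAPTURE_INTERVAL_SECONDS - elapsed))
```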
The design is composed of three main parts:
- The source - the data source, such as a camera connected locally, or a stream from a network streaming protocol such as RTSP.
- Greengrass Core - Local hardware that can connect to the camera and has a connection to the AWS Cloud.
- AWS Cloud - For model training, and feedback to improve model accuracy
The processing flow of the accelerator is:
- Data acquisition - This function acquires the raw data inputs from the source, such as readings from sensors or images from image capture devices.
- Data preprocessor - This function pre-processes the data, for example resizing the image or normalizing the sensor data, and then forwards it to the local inference engine for predictions.
- The inference engine can run either in a Greengrass Connector or in-process within the Lambda process (see the sketches after this list).
- Greengrass Connector - If the algorithm is supported, a Greengrass Connector can be used to run inference with the model from Amazon SageMaker. Lambda processes interface with the Greengrass Connectors using the `greengrass-ml-sdk` library.
- In-process - If the model is compiled with the Amazon SageMaker Neo compiler, the model can be loaded using the Neo Deep Learning Runtime (NEO-DLR). Otherwise, the model can be loaded using the respective runtime, such as MXNet, TensorFlow, or Chainer.
- For in-process inference, the model can be accessed as a local Greengrass Machine Learning Resource.
- Model sources for Greengrass are either models uploaded to S3 buckets or models saved as Amazon SageMaker training jobs.
- Models in S3 buckets - Amazon SageMaker Neo-compiled models, or pre-trained models such as Inception-BN from the model zoo, can be uploaded to an S3 bucket and configured as Greengrass Machine Learning Resources.
- Amazon SageMaker training jobs - Greengrass supports models that are saved as Amazon SageMaker training jobs.
- Based on the result of the inference, the Lambda function can handle the prediction locally, for example triggering an actuator based on the prediction result.
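A minimal sketch of the connector path, assuming the Image Classification connector is deployed; the service name `imageClassification` is a placeholder that must match your connector configuration:

```python
import greengrass_machine_learning_sdk as ml

# Client for the local inference service exposed by the Greengrass Connector.
ml_client = ml.client('inference')


def classify(image_bytes):
    """Send one JPEG image to the Image Classification connector and return the raw result."""
    response = ml_client.invoke_inference_service(
        AlgoType='image-classification',
        ServiceName='imageClassification',  # placeholder: must match the connector configuration
        ContentType='image/jpeg',
        Body=image_bytes,
    )
    return response['Body'].read()
```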
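And a minimal sketch of the in-process path with a Neo-compiled model, assuming the Machine Learning Resource is mounted at `/ml_model` and the model expects a 224x224 RGB input named `data` (both are assumptions that depend on your resource and model configuration):

```python
import json

import numpy as np
import greengrasssdk
from dlr import DLRModel

# The Greengrass Machine Learning Resource is mounted into the Lambda container;
# '/ml_model' is an assumed destination path configured on the resource.
model = DLRModel('/ml_model', 'cpu')
iot_client = greengrasssdk.client('iot-data')


def infer_and_publish(image):
    """Run in-process inference on a preprocessed image and publish the result."""
    # Assumed input layout: NCHW float32, as produced by the preprocessing step.
    batch = np.asarray(image, dtype=np.float32).reshape(1, 3, 224, 224)
    outputs = model.run({'data': batch})
    prediction = int(np.argmax(outputs[0]))

    # Publish the prediction locally/to the cloud over MQTT.
    iot_client.publish(
        topic='mli/predictions',  # placeholder topic
        payload=json.dumps({'class_id': prediction}),
    )
    return prediction
```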
The diagram below summarizes the decision tree for choosing which Greengrass Machine Learning runtime to use:
- If the algorithm is available as a Greengrass connector, such as Image Classification, run the model using the Greengrass connector.
- If the model is compiled with Amazon SageMaker Neo, the model can be deployed with the NEO-AI (NEO-DLR) runtime.
- Otherwise, the models can be deployed as Machine Learning resources to the Greengrass Group with the MXNet/TensorFlow/Chainer runtime.
- In either case, the models can be uploaded to S3 and deployed as Machine Learning resources to the Greengrass Group.
```
machine_learning_inference/
├── README.md                              <-- This file!
├── S3_MODELS.md
├── cfn
│   ├── greengrass_mli-s3_models.cfn.yaml
│   └── lambda_functions
│       ├── s3_models
│       └── cfn_custom_resources
└── assets
```
The `cfn/` directory contains the CloudFormation assets to create the Greengrass configuration.
The `cfn/lambda_functions` directory contains the Lambda functions, including the long-lived Lambda functions that run in the Greengrass Core to perform the inference.
For example, the `cfn/lambda_functions/s3_models/` folder contains the Lambda functions that run the local inference with pre-trained S3 models.
Details of how to run the machine learning inference are in a separate document: