Merge pull request #11 from moose-in-australia/main
Poetry Python custom image example
jaipreet-s authored May 4, 2021
2 parents a259aa2 + 6f9122b commit 1323260
Showing 7 changed files with 177 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -8,6 +8,7 @@ This repository contains examples of Docker images that are valid custom images

- [echo-kernel-image](examples/echo-kernel-image) - This example uses the echo_kernel from Jupyter as a "Hello World" introduction into writing custom KernelGateway images.
- [jupyter-docker-stacks-julia-image](examples/jupyter-docker-stacks-julia-image) - This example leverages the Data Science image from Jupyter Docker Stacks to add a Julia kernel.
- [python-poetry-image](examples/python-poetry-image) - This example uses Poetry to manage the package dependencies in Python.
- [r-image](examples/r-image) - This example contains the `ir` kernel and a selection of R packages, along with the AWS Python SDK (boto3) and the SageMaker Python SDK which can be used from R using `reticulate`
- [rapids-image](examples/rapids-image) - This example uses the official rapids.ai image from Docker Hub. Use with a GPU instance on Studio.
- [scala-image](examples/scala-image) - This example adds a Scala kernel based on [Almond Scala Kernel](https://almond.sh/).
40 changes: 40 additions & 0 deletions examples/python-poetry-image/Dockerfile
@@ -0,0 +1,40 @@
FROM python:3.7

ARG NB_USER="sagemaker-user"
ARG NB_UID="1000"
ARG NB_GID="100"

######################
# OVERVIEW
# 1. Creates the `sagemaker-user` user with UID/GID 1000/100.
# 2. Ensures this user can `sudo` by default.
# 3. Installs and configures Poetry, then installs the environment defined in pyproject.toml
# 4. Configures the kernel (ipykernel should be installed on the parent image or defined in pyproject.toml)
# 5. Makes the default shell `bash`. This improves the experience inside a Jupyter terminal, which otherwise defaults to `sh`.
######################

# Set up the "sagemaker-user" user with root privileges (passwordless sudo).
RUN \
apt-get update && \
apt-get install -y sudo && \
useradd -m -s /bin/bash -N -u $NB_UID $NB_USER && \
chmod g+w /etc/passwd && \
echo "${NB_USER} ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers && \
# Prevent apt-get cache from being persisted to this layer.
rm -rf /var/lib/apt/lists/*

# Install Poetry
RUN pip install poetry
# Disable virtual environments (see notes in README.md)
RUN poetry config virtualenvs.create false --local
# Copy the environment definition file and install the environment
COPY pyproject.toml /
RUN poetry install

# Configure the kernel
RUN python -m ipykernel install --sys-prefix

# Make the default shell bash (vs "sh") for a better Jupyter terminal UX
ENV SHELL=/bin/bash

USER $NB_UID
67 changes: 67 additions & 0 deletions examples/python-poetry-image/README.md
@@ -0,0 +1,67 @@
## Python Poetry Image

### Overview

This example creates a custom image in Amazon SageMaker Studio using [Poetry](https://python-poetry.org/) to manage the Python dependencies.

### Building the image

Build the Docker image and push it to Amazon ECR.
```
# Modify these as required. The Docker registry endpoint depends on your region; see https://docs.aws.amazon.com/general/latest/gr/ecr.html#ecr-docker-endpoints
REGION=<aws-region>
ACCOUNT_ID=<account-id>
# Build the image
IMAGE_NAME=custom-poetry-kernel
aws --region ${REGION} ecr get-login-password | docker login --username AWS --password-stdin ${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/smstudio-custom
docker build . -t ${IMAGE_NAME} -t ${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/smstudio-custom:${IMAGE_NAME}
```

```
docker push ${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/smstudio-custom:${IMAGE_NAME}
```
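
The image reference used in the build and push commands above is assembled from the account ID, region, repository, and image name. As a rough sketch of that composition (the helper name `ecr_image_uri` is hypothetical, and the `amazonaws.com` suffix is an assumption that holds for standard regions only; check the endpoint list linked above for yours):

```python
def ecr_image_uri(account_id, region, image_name, repository="smstudio-custom"):
    """Compose the ECR image reference in the form used by the commands above.

    Hypothetical helper for illustration. Note that this example uses the
    image name as the *tag* inside a single `smstudio-custom` repository.
    """
    # Assumption: standard-partition endpoint suffix.
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    return f"{registry}/{repository}:{image_name}"
```

Calling `ecr_image_uri("<account-id>", "<aws-region>", "custom-poetry-kernel")` yields the same string that the shell variables expand to in the `docker push` command.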

### Using it with SageMaker Studio

Create a SageMaker Image with the image in ECR.

```
# Role in your account to be used for the SageMaker Image
ROLE_ARN=<role-arn>
aws --region ${REGION} sagemaker create-image \
--image-name ${IMAGE_NAME} \
--role-arn ${ROLE_ARN}
aws --region ${REGION} sagemaker create-image-version \
--image-name ${IMAGE_NAME} \
--base-image "${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/smstudio-custom:${IMAGE_NAME}"
# Verify the image-version is created successfully. Do NOT proceed if image-version is in CREATE_FAILED state or in any other state apart from CREATED.
aws --region ${REGION} sagemaker describe-image-version --image-name ${IMAGE_NAME}
```
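
The status check in the last command above can also be scripted. The following is a minimal sketch that inspects the JSON returned by `describe-image-version`; it assumes the response carries an `ImageVersionStatus` field (and a `FailureReason` on failure), and the function name `image_version_ready` is hypothetical:

```python
def image_version_ready(response):
    """Return True once the image version is CREATED.

    `response` is the parsed JSON output of `describe-image-version`.
    Raises RuntimeError on CREATE_FAILED so a script stops instead of
    proceeding; returns False for any other state (e.g. CREATING).
    """
    status = response.get("ImageVersionStatus")
    if status == "CREATED":
        return True
    if status == "CREATE_FAILED":
        reason = response.get("FailureReason", "no reason reported")
        raise RuntimeError(f"image version creation failed: {reason}")
    return False
```

One way to use it is to save the CLI output to a file, load it with `json.load`, and re-run the describe command until the function returns `True`.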

Create an AppImageConfig for this image.

```
aws --region ${REGION} sagemaker create-app-image-config --cli-input-json file://app-image-config-input.json
```
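
In this example, the `FileSystemConfig` values in `app-image-config-input.json` mirror the user created in the Dockerfile (UID 1000, GID 100, home directory `/home/sagemaker-user`). A small sanity check along these lines may help catch drift between the two files; the function name and its defaults are illustrative assumptions, not part of the sample:

```python
def check_file_system_config(app_image_config, uid=1000, gid=100,
                             home="/home/sagemaker-user"):
    """List mismatches between the AppImageConfig and the Dockerfile user.

    `app_image_config` is the parsed app-image-config-input.json.
    The defaults are the NB_UID/NB_GID/NB_USER values from this example's
    Dockerfile; pass your own if you changed the build arguments.
    """
    fs = app_image_config["KernelGatewayImageConfig"]["FileSystemConfig"]
    expected = {"DefaultUid": uid, "DefaultGid": gid, "MountPath": home}
    return [
        f"{key}: expected {want!r}, found {fs.get(key)!r}"
        for key, want in expected.items()
        if fs.get(key) != want
    ]
```

An empty list means the configuration agrees with the Dockerfile defaults.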

Create a Domain, providing the SageMaker Image and AppImageConfig in the Domain creation. Replace the placeholders for VPC ID, Subnet IDs, and Execution Role in `create-domain-input.json`.

```
aws --region ${REGION} sagemaker create-domain --cli-input-json file://create-domain-input.json
```

If you have an existing Domain, you can use the `update-domain` command.

```
aws --region ${REGION} sagemaker update-domain --cli-input-json file://update-domain-input.json
```
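
The `ImageName` and `AppImageConfigName` in the domain input files must match the names used when the image and AppImageConfig were created. If you prefer to generate `update-domain-input.json` rather than edit it by hand, a sketch like the following could work; the function name `update_domain_payload` is hypothetical, and the structure simply reproduces the JSON file shipped with this example:

```python
import json


def update_domain_payload(domain_id, image_name, app_image_config_name):
    """Build the same structure as this example's update-domain-input.json."""
    return {
        "DomainId": domain_id,
        "DefaultUserSettings": {
            "KernelGatewayAppSettings": {
                "CustomImages": [
                    {
                        "ImageName": image_name,
                        "AppImageConfigName": app_image_config_name,
                    }
                ]
            }
        },
    }


def write_payload(path, payload):
    """Write the payload as indented JSON for use with --cli-input-json."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)
```

The resulting file can then be passed to `update-domain` with `--cli-input-json file://update-domain-input.json` as shown above.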

### Notes

* SageMaker Studio overrides the `ENTRYPOINT` and `CMD` instructions (see [documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/studio-byoi-specs.html)), so the image cannot rely on them to activate a virtual environment at startup. This sample therefore disables Poetry's virtual environments, as recommended in the Poetry [FAQ](https://python-poetry.org/docs/faq/#i-dont-want-poetry-to-manage-my-virtual-environments-can-i-disable-it), and installs the dependencies into the system Python.
* `ipykernel` must be installed on custom images for SageMaker Studio. If the parent image does not already provide it, include it in the `pyproject.toml` file.
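
The `ipykernel` requirement can be checked before building the image. The sketch below parses `pyproject.toml` and looks for the package under `[tool.poetry.dependencies]`. It assumes Python 3.11+ for the standard-library `tomllib` (falling back to the third-party `tomli` package if installed), so it is meant for a development machine rather than the Python 3.7 image itself; the function name is hypothetical:

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # assumption: the `tomli` backport is installed


def declares_dependency(pyproject_text, package="ipykernel"):
    """Return True if `package` is listed under [tool.poetry.dependencies]."""
    data = tomllib.loads(pyproject_text)
    dependencies = data.get("tool", {}).get("poetry", {}).get("dependencies", {})
    return package in dependencies
```

For example, `declares_dependency(open("pyproject.toml").read())` should return `True` for the file in this example. This only inspects the declared dependencies; it does not detect `ipykernel` provided by the parent image.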
16 changes: 16 additions & 0 deletions examples/python-poetry-image/app-image-config-input.json
@@ -0,0 +1,16 @@
{
"AppImageConfigName": "custom-poetry-kernel-image-config",
"KernelGatewayImageConfig": {
"KernelSpecs": [
{
"Name": "python3",
"DisplayName": "Python 3 (poetry)"
}
],
"FileSystemConfig": {
"MountPath": "/home/sagemaker-user",
"DefaultUid": 1000,
"DefaultGid": 100
}
}
}
19 changes: 19 additions & 0 deletions examples/python-poetry-image/create-domain-input.json
@@ -0,0 +1,19 @@
{
"DomainName": "domain-with-poetry-kernel-image",
"VpcId": "<vpc-id>",
"SubnetIds": [
"<subnet-ids>"
],
"DefaultUserSettings": {
"ExecutionRole": "<execution-role>",
"KernelGatewayAppSettings": {
"CustomImages": [
{
"ImageName": "custom-poetry-kernel",
"AppImageConfigName": "custom-poetry-kernel-image-config"
}
]
}
},
"AuthMode": "IAM"
}
21 changes: 21 additions & 0 deletions examples/python-poetry-image/pyproject.toml
@@ -0,0 +1,21 @@
[tool.poetry]
name = "custom_poetry_image"
version = "0.1.0"
description = "An example of a custom Poetry image for SageMaker Studio."
authors = ["Your Name <[email protected]>"]

[tool.poetry.dependencies]
python = ">=3.7.1,<3.10"
boto3 = "^1.17.51"
ipykernel = "^5.5.3"
numpy = "^1.20.2"
pandas = "^1.2.4"
sagemaker = "^2.39.0"
scikit-learn = "^0.24.1"
scipy = "^1.6.2"

[tool.poetry.dev-dependencies]

[build-system]
requires = ["poetry-core>=1.1.6"]
build-backend = "poetry.core.masonry.api"
13 changes: 13 additions & 0 deletions examples/python-poetry-image/update-domain-input.json
@@ -0,0 +1,13 @@
{
"DomainId": "<domain-id>",
"DefaultUserSettings": {
"KernelGatewayAppSettings": {
"CustomImages": [
{
"ImageName": "custom-poetry-kernel",
"AppImageConfigName": "custom-poetry-kernel-image-config"
}
]
}
}
}
