Automatically provision a Prefect Cloud account with resources for the debug tutorial #2

Open: wants to merge 8 commits into base: main
10 changes: 9 additions & 1 deletion .gitignore
@@ -1,2 +1,10 @@
env
# Python
__pycache__
env
venv

# Terraform files
.terraform
*.tfstate
*.tfstate.*
.terraform.lock.hcl
32 changes: 32 additions & 0 deletions infra/destroy_env.sh
@@ -0,0 +1,32 @@
#!/bin/bash

What do you think about converting this script to Python? We can safely assume our users are familiar with Python, whereas bash syntax might be new to them. You could then also use Prefect's built-in conveniences for things like getting the active profile.

Author

I'm down with converting this to Python if it simplifies the script. I started with Terraform, and then wrapped it in a bash script when I realized it didn't do everything I needed.
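
A rough sketch of what a Python version of the destroy step might look like, offered as an illustration rather than the PR's code. The helper names `account_id_from_api_url`, `terraform_env`, and `destroy` are hypothetical, and the parsing assumes the Prefect Cloud API URL has the form `https://api.prefect.cloud/api/accounts/<account-id>/workspaces/<workspace-id>`. The API key and URL are taken as plain arguments here; they could presumably be read from Prefect's settings for the active profile instead of parsing `profiles.toml` with awk, but how the key is unwrapped depends on the Prefect version, so that part is left out.

```python
import os
import re
import subprocess

# Assumed shape of a Prefect Cloud API URL:
# https://api.prefect.cloud/api/accounts/<account-id>/workspaces/<workspace-id>
_ACCOUNT_RE = re.compile(r"/accounts/([0-9a-fA-F-]{36})(?:/|$)")


def account_id_from_api_url(api_url: str) -> str:
    """Return the account ID embedded in a Prefect Cloud API URL."""
    match = _ACCOUNT_RE.search(api_url)
    if match is None:
        raise ValueError(f"No account ID found in API URL: {api_url!r}")
    return match.group(1)


def terraform_env(api_key: str, api_url: str) -> dict:
    """Copy the current environment and add the TF_VAR_* values Terraform reads."""
    env = dict(os.environ)
    env["TF_VAR_prefect_api_key"] = api_key
    env["TF_VAR_prefect_account_id"] = account_id_from_api_url(api_url)
    return env


def destroy(api_key: str, api_url: str, infra_dir: str = "infra") -> None:
    """Run `terraform destroy` in `infra_dir` with credentials passed via env."""
    subprocess.run(
        ["terraform", "destroy", "-auto-approve"],
        cwd=infra_dir,
        env=terraform_env(api_key, api_url),
        check=True,  # raise on a non-zero exit, similar in spirit to `set -e`
    )
```

Keeping the URL parsing in a small pure function makes it easy to unit test without a Prefect Cloud account or Terraform installed.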


###############################################################################
# This script destroys any Prefect Cloud resources created by `setup_env.sh` #
###############################################################################

# Exit on any error
set -e

echo "🔑 Reading Prefect API key and account ID..."

# Get active profile from `profiles.toml`
ACTIVE_PROFILE=$(awk -F ' = ' '/^active/ {gsub(/"/, "", $2); print $2}' ~/.prefect/profiles.toml)

# Get API key for the active profile from `profiles.toml`
API_KEY=$(awk -v profile="profiles.$ACTIVE_PROFILE" '
$0 ~ "\\[" profile "\\]" {in_section=1; next}
in_section && /^\[/ {in_section=0}
in_section && /PREFECT_API_KEY/ {
gsub(/"/, "", $3)
print $3
exit
}
' ~/.prefect/profiles.toml)
export TF_VAR_prefect_api_key=$API_KEY

# Extract account ID from `prefect config view`
ACCOUNT_ID=$(prefect config view | awk -F'/' '/^https:\/\/app.prefect.cloud\/account\// {print $5}')
export TF_VAR_prefect_account_id=$ACCOUNT_ID

echo "🏗️ Running Terraform to destroy infrastructure..."
terraform destroy -auto-approve
38 changes: 38 additions & 0 deletions infra/main.tf
@@ -0,0 +1,38 @@
terraform {

In practice, users will also want to create service accounts and provision API keys for workers that way. Example of that here and here.

required_providers {
prefect = {
source = "PrefectHQ/prefect"
}
}
}

provider "prefect" {
api_key = var.prefect_api_key
account_id = var.prefect_account_id
}

# Create staging workspace

If we want to get a bit fancier we should separate these into modules.

resource "prefect_workspace" "staging" {
name = "Staging"
handle = "staging"
}

# Create production workspace
resource "prefect_workspace" "production" {
name = "Production"
handle = "production"
}

# Create default work pool in staging workspace
resource "prefect_work_pool" "staging_default" {
name = "default-work-pool"
workspace_id = prefect_workspace.staging.id
type = "docker"
}

# Create default work pool in production workspace
resource "prefect_work_pool" "production_default" {
name = "default-work-pool"
workspace_id = prefect_workspace.production.id
type = "docker"
}
Comment on lines +26 to +38

What do you think about picking a cloud-based work pool, for example GCP Cloud Run? There would be more infrastructure, but it would also be a more real-world example.

10 changes: 10 additions & 0 deletions infra/variables.tf
@@ -0,0 +1,10 @@
variable "prefect_api_key" {
description = "Prefect Cloud API key"
type = string
sensitive = true
}

variable "prefect_account_id" {
description = "Prefect Cloud Account ID"
type = string
}
165 changes: 165 additions & 0 deletions setup_env.sh
@@ -0,0 +1,165 @@
#!/bin/bash

###############################################################################
# This script sets up a _paid_ Prefect Cloud account with resources: #
# #
# 1. Two workspaces: `production` and `staging` #
# 2. A default Docker work pool in each workspace #
# 3. A flow in each workspace #
# 4. The flow in each workspace is run multiple times #
# 5. The flow in `staging` has failures to demonstrate debugging #
# #
# NOTE: You must have Docker and Terraform installed #
###############################################################################

# Exit on any error
set -e

###############################################################################
# Check for dependencies
###############################################################################

# Check if Docker is running
echo "🐳 Checking if Docker is running..."
if ! docker info > /dev/null 2>&1; then
echo "❌ Error: Docker is not running. Please start Docker and try again."
exit 1
fi

echo "✅ Docker is running"

# Check if Terraform is installed
echo "🔧 Checking if Terraform is installed..."
if ! command -v terraform &> /dev/null; then
echo "❌ Error: Terraform is not installed. Please install Terraform and try again."
exit 1
fi

echo "✅ Terraform is installed"

# Check if Python is installed and determine the Python command
echo "🐍 Checking if Python is installed..."
if command -v python3 &> /dev/null; then
PYTHON_CMD="python3"
elif command -v python &> /dev/null; then
PYTHON_CMD="python"
else
echo "❌ Error: Python is not installed. Please install Python 3.9 or higher and try again."
exit 1
fi

# Verify Python version is 3.9 or higher
if ! $PYTHON_CMD -c "import sys; assert sys.version_info >= (3, 9), 'Python 3.9 or higher is required'" &> /dev/null; then
echo "❌ Error: Python 3.9 or higher is required. Found $($PYTHON_CMD --version)"
exit 1
fi

echo "✅ $(${PYTHON_CMD} --version) is installed"

###############################################################################
# Set up virtual environment
###############################################################################

# Create and activate virtual environment
echo "🌟 Setting up Python virtual environment..."
$PYTHON_CMD -m venv temp_venv
source temp_venv/bin/activate

# Install requirements
echo "📦 Installing Python packages..."
pip install -r requirements.txt

We should encourage the use of uv.


echo "🔑 Reading Prefect API key and account ID..."

###############################################################################
# Get auth credentials
###############################################################################

# Get active profile from `profiles.toml`
ACTIVE_PROFILE=$(awk -F ' = ' '/^active/ {gsub(/"/, "", $2); print $2}' ~/.prefect/profiles.toml)

# Get API key for the active profile from `profiles.toml`
API_KEY=$(awk -v profile="profiles.$ACTIVE_PROFILE" '
$0 ~ "\\[" profile "\\]" {in_section=1; next}
in_section && /^\[/ {in_section=0}
in_section && /PREFECT_API_KEY/ {
gsub(/"/, "", $3)
print $3
exit
}
' ~/.prefect/profiles.toml)
export TF_VAR_prefect_api_key=$API_KEY

# Extract account ID from `prefect config view`
ACCOUNT_ID=$(prefect config view | awk -F'/' '/^https:\/\/app.prefect.cloud\/account\// {print $5}')
export TF_VAR_prefect_account_id=$ACCOUNT_ID

# Get account handle for the account ID given above
ACCOUNT_HANDLE=$(curl -s "https://api.prefect.cloud/api/accounts/$ACCOUNT_ID" -H "Authorization: Bearer $API_KEY" | awk -F'"handle":"' '{print $2}' | awk -F'"' '{print $1}')

###############################################################################
# Provision Prefect Cloud resources

In practice a big part of this kind of setup work is creating all the related objects: blocks, variables, automations, etc.

It would be great to show an example of creating a few blocks as well as a simple "notify on failure" automation. I believe that can all be expressed in both Terraform and Python.

###############################################################################

echo "🏗️ Running Terraform to provision infrastructure..."
cd infra/
terraform init
terraform apply -auto-approve
cd ..

###############################################################################
# Run flows in production
###############################################################################

echo "🚀 Populating production workspace..."

# Start worker for production workspace with suppressed output
prefect cloud workspace set --workspace "$ACCOUNT_HANDLE/production"
prefect worker start --pool "default-work-pool" > /dev/null 2>&1 &
PROD_WORKER_PID=$!

# Give workers time to start
sleep 5

# Run in production workspace
python simulate_failures.py &
PROD_SIM_PID=$!

# Wait for simulations to complete
wait $PROD_SIM_PID

# Kill worker process
kill $PROD_WORKER_PID

###############################################################################
# Run flows in staging
###############################################################################

echo "🚀 Populating staging workspace..."

# Start worker for staging workspace with suppressed output
prefect cloud workspace set --workspace "$ACCOUNT_HANDLE/staging"
prefect worker start --pool "default-work-pool" > /dev/null 2>&1 &
STAGING_WORKER_PID=$!

# Give workers time to start
sleep 5

# Run in staging workspace
python simulate_failures.py --fail-at-run 3 &
STAGING_SIM_PID=$!

# Wait for simulations to complete
wait $STAGING_SIM_PID

# Kill worker process
kill $STAGING_WORKER_PID

###############################################################################
# Cleanup virtual environment
###############################################################################

deactivate
rm -rf temp_venv

echo "✅ All done!"
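
One thing worth noting about the start-worker, run-simulation, kill-worker sequence in `setup_env.sh`: under `set -e`, a failing `simulate_failures.py` makes `wait` return non-zero, so the script exits before `kill` runs and the background worker keeps running. If the script is ported to Python, a context manager is one way to guarantee cleanup. The sketch below is only an illustration under that assumption: `background_process` and `populate_workspace` are hypothetical names, and the fixed `startup_delay` is a crude stand-in for the script's `sleep 5`, not a real readiness check.

```python
import subprocess
import sys
import time
from contextlib import contextmanager


@contextmanager
def background_process(cmd, startup_delay=0.0):
    """Start `cmd` in the background and always stop it when the block exits."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        time.sleep(startup_delay)  # give the process time to start
        yield proc
    finally:
        proc.terminate()
        try:
            proc.wait(timeout=10)
        except subprocess.TimeoutExpired:
            # The process ignored the termination request; force it to stop.
            proc.kill()
            proc.wait()


def populate_workspace(workspace, sim_args=()):
    """Hypothetical equivalent of one 'Run flows in ...' section of the script."""
    subprocess.run(
        ["prefect", "cloud", "workspace", "set", "--workspace", workspace],
        check=True,
    )
    worker_cmd = ["prefect", "worker", "start", "--pool", "default-work-pool"]
    with background_process(worker_cmd, startup_delay=5):
        subprocess.run(
            [sys.executable, "simulate_failures.py", *sim_args], check=True
        )
```

With this shape, the staging section would reduce to something like `populate_workspace(f"{account_handle}/staging", ["--fail-at-run", "3"])`, and the worker is stopped whether or not the simulation succeeds.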