Update lesson 7
maximearmstrong committed Jul 24, 2024
1 parent b6f77b5 commit 07b7ca3
Showing 2 changed files with 53 additions and 17 deletions.
@@ -6,13 +6,13 @@ lesson: '7'

# Creating the manifest during deployment

To recap, our deployment failed in the last section because Dagster couldn’t find a dbt manifest file, which it needs to turn dbt models into Dagster assets. This is because we built this file by running `dbt parse` during local development. You ran this manually in Lesson 3 and improved the experience in Lesson 4. However, you'll also need to build your dbt manifest file during deployment, which will require a couple additional steps. We recommend adopting CI/CD to automate this process.
To recap, our deployment failed in the last section because Dagster couldn’t find a dbt manifest file, which it needs to turn dbt models into Dagster assets. This is because we built this file by running `dbt parse` during local development. You ran this manually in Lesson 3 and improved the experience using `DbtProject`'s `prepare_if_dev` in Lesson 4. However, you'll also need to build your dbt manifest file during deployment, which will require a couple additional steps. We recommend adopting CI/CD to automate this process.

Building your manifest for your production deployment will be needed for both open source and Dagster+ deployments. In this case, Dagster+’s out-of-the-box `deploy.yml` GitHub Action isn’t aware that you’re also trying to deploy a dbt project with Dagster.

Since your CI/CD will be running in a fresh environment, you'll need to install dbt and run `dbt deps` before building your manifest with `dbt parse`.
Since your CI/CD will be running in a fresh environment, you'll need to install dbt and other dependencies before building your manifest.

To get our deployment working, we need to add a step to our GitHub Actions workflow that runs the dbt commands required to generate the `manifest.json`. Specifically, we need to run `dbt deps` and `dbt parse` in the dbt project, just like you did during local development.
To get our deployment working, we need to add a step to our GitHub Actions workflow that runs the commands required to generate the `manifest.json`. Specifically, we need to run the `dagster-dbt project prepare-and-package` command, available in the `dagster_dbt` package discussed earlier.

1. In your Dagster project, locate the `.github/workflows` directory.
2. Open the `deploy.yml` file.
@@ -29,8 +29,14 @@ To get our deployment working, we need to add a step to our GitHub Actions workf
dagster-dbt project prepare-and-package --file dagster_university/assets/dbt.py
shell: bash
```
5. Save and commit the changes. Make sure to push them to the remote!
The code above:
1. Creates a step named `Prepare DBT project for deployment`
2. Upgrades `pip`, the package installer for Python
3. Navigates inside the `project-repo` folder
4. Upgrades the project dependencies
5. Prepares the manifest file by running the `dagster-dbt project prepare-and-package` command, specifying the file in which the `DbtProject` object is located.

Once the new step is pushed to the remote, GitHub will automatically try to run a new job using the updated workflow.
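
If you want to confirm locally that the command produced a usable manifest, a small check like the following may help. This is a sketch rather than part of the workflow: the function name `load_manifest_nodes` is hypothetical, and it assumes the manifest is written to dbt's default location, `target/manifest.json` inside the dbt project directory (`analytics` in this course).

```python
import json
from pathlib import Path
from typing import List


def load_manifest_nodes(dbt_project_dir: str) -> List[str]:
    """Return the sorted node IDs found in a dbt manifest.

    Assumes the default dbt layout: <dbt_project_dir>/target/manifest.json.
    """
    manifest_path = Path(dbt_project_dir) / "target" / "manifest.json"
    if not manifest_path.is_file():
        # Same symptom as the failed deployment: no manifest to read.
        raise FileNotFoundError(f"No dbt manifest found at {manifest_path}")
    with manifest_path.open() as f:
        manifest = json.load(f)
    # The manifest keeps models, tests, seeds, etc. under the "nodes" key.
    return sorted(manifest.get("nodes", {}))
```

Calling `load_manifest_nodes("analytics")` from the root of the Dagster project should list the dbt nodes if the manifest was built, and raise `FileNotFoundError` otherwise.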

@@ -61,22 +61,47 @@ Because we’re still using a DuckDB-backed database, our `type` will also be `d

---

## Adding a prod target to DbtProject

Next, we need to update the `DbtProject` object in `dagster_university/assets/dbt.py` to specify what profile to target. To optimize the developer experience, let’s use an environment variable to specify the profile to target.

1. In the `.env` file, define an environment variable named `DBT_TARGET` and set it to `dev`:

```bash
DBT_TARGET=dev
```

2. Next, import the `os` module at the top of the `dbt.py` file so the environment variable is accessible:

```python
import os
```

3. Finally, scroll to the initialization of the `DbtProject` object, and use the new environment variable to access the profile to target. This should be on or around line 11:

```python
dbt_project = DbtProject(
    project_dir=Path(__file__).joinpath("..", "..", "..", "analytics").resolve(),
    target=os.getenv("DBT_TARGET"),
)
```
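
Note that `os.getenv` returns `None` when the variable is unset, for example if the `.env` file wasn't loaded. A minimal sketch of a fallback, assuming you want `dev` as the default; the helper name `resolve_dbt_target` is hypothetical and not part of `dagster_dbt`:

```python
import os


def resolve_dbt_target(default: str = "dev") -> str:
    """Return the dbt target named by DBT_TARGET, or `default` if it is unset or empty."""
    # `or` also covers the case where the variable is set to an empty string.
    return os.getenv("DBT_TARGET") or default
```

One option would be to pass `target=resolve_dbt_target()` to `DbtProject` so that local runs keep working even without a `.env` file, while deployments still pick up `prod` from the environment.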

---

## Adding a prod target to deploy.yml

Next, we need to update the dbt commands in the `.github/workflows/deploy.yml` file to target the new `prod` profile. This will ensure that dbt uses the correct connection details when the GitHub Action runs as part of our Dagster+ deployment.
After that, we need to update the dbt commands in the `.github/workflows/deploy.yml` file to target the new `prod` profile. This will ensure that dbt uses the correct connection details when the GitHub Action runs as part of our Dagster+ deployment.

Open the file, scroll to the dbt step you added, and add `-- target prod` after the `dbt parse` command. This command should be on or around line 52:
Open the file, scroll to the environment variables section, and set an environment variable named `DBT_TARGET` to `prod`. This should be on or around line 12:

```bash
- name: Parse dbt project and package with Dagster project
  if: steps.prerun.outputs.result == 'pex-deploy'
  run: |
    pip install pip --upgrade
    pip install dbt-duckdb
    cd project-repo/analytics
    dbt deps
    dbt parse --target prod ## add this flag
  shell: bash
env:
  DAGSTER_CLOUD_URL: ${{ secrets.DAGSTER_CLOUD_ORGANIZATION }}
  DAGSTER_CLOUD_API_TOKEN: ${{ secrets.DAGSTER_CLOUD_API_TOKEN }}
  ENABLE_FAST_DEPLOYS: 'true'
  PYTHON_VERSION: '3.8'
  DAGSTER_CLOUD_FILE: 'dagster_cloud.yaml'
  DBT_TARGET: 'prod'
```

Save and commit the file to git. Don’t forget to push to remote!
@@ -104,7 +129,12 @@ The following table contains the environment variables we need to create in Dags
---

- `DAGSTER_ENVIRONMENT`
- Set this to `prod`. This will be used by your dbt resource to decide which target to use.
- Set this to `prod`. This will be used by your resources and constants.

---

- `DBT_TARGET`
- Set this to `prod`. This will be used by your dbt project and dbt resource to decide which target to use.

---
