Add structure for remaining guides content (#26423)
## Summary & Motivation

Reorganize 'guides' and 'getting started' content (['Docs'
section](https://docs-preview.dagster.io/) of docs) to prepare for
remaining content.

No need for a line-level review on this one; we just need to make sure
the tests are green (except Vale—that's a bigger problem), and that
staging loads and looks basically fine.

## How I Tested These Changes

Local build

## Changelog

> Insert changelog entry or delete this section.

---------

Signed-off-by: nikki everett <[email protected]>
neverett authored Jan 2, 2025
1 parent 24b60b3 commit 9de40ee
Showing 116 changed files with 416 additions and 240 deletions.
@@ -22,11 +22,11 @@ The default I/O manager cannot be used if you are a Serverless user who:
- Are otherwise working with data subject to GDPR or other such regulations
:::

In Serverless, code that uses the default [I/O manager](/guides/build/configure/io-managers) is automatically adjusted to save data in Dagster+ managed storage. This automatic change is useful because the Serverless filesystem is ephemeral, which means the default I/O manager wouldn't work as expected.
In Serverless, code that uses the default [I/O manager](/guides/operate/io-managers) is automatically adjusted to save data in Dagster+ managed storage. This automatic change is useful because the Serverless filesystem is ephemeral, which means the default I/O manager wouldn't work as expected.

However, this automatic change also means potentially sensitive data could be **stored** and not just processed or orchestrated by Dagster+.

To prevent this, you can use [another I/O manager](/guides/build/configure/io-managers#built-in) that stores data in your infrastructure or [adapt your code to avoid using an I/O manager](/guides/build/configure/io-managers#before-you-begin).
To prevent this, you can use [another I/O manager](/guides/operate/io-managers#built-in) that stores data in your infrastructure or [adapt your code to avoid using an I/O manager](/guides/operate/io-managers#before-you-begin).
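
For illustration (a sketch, not part of this diff; the bucket and asset names are hypothetical), swapping in the S3 I/O manager from `dagster-aws` might look like this:

```python
from dagster import Definitions, asset
from dagster_aws.s3 import S3PickleIOManager, S3Resource


@asset
def my_asset():
    return [1, 2, 3]


defs = Definitions(
    assets=[my_asset],
    resources={
        # Outputs are pickled to a bucket you control instead of Dagster+ managed storage
        "io_manager": S3PickleIOManager(
            s3_resource=S3Resource(),
            s3_bucket="my-company-bucket",  # hypothetical bucket name
            s3_prefix="dagster-io",
        ),
    },
)
```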

:::note
You must have [boto3](https://pypi.org/project/boto3/) or `dagster-cloud[serverless]` installed as a project dependency otherwise the Dagster+ managed storage can fail and silently fall back to using the default I/O manager.
@@ -132,4 +132,4 @@ compute_logs:
ServerSideEncryption: "AES256"
show_url_only: true
region: "us-west-1"
```
```
@@ -115,7 +115,7 @@ TODO: add picture previously at "/images/dagster-cloud/user-token-management/cod
| Start and stop [schedules](/guides/automate/schedules) ||||||
| Start and stop [schedules](/guides/automate/sensors) ||||||
| Wipe assets ||||||
| Launch and cancel [schedules](/guides/build/backfill) ||||||
| Launch and cancel [schedules](/guides/automate/schedules) ||||||
| Add dynamic partitions ||||||

### Deployments
@@ -18,7 +18,7 @@ In this guide, we'll walk you through configuring [Okta SCIM provisioning](https
With Dagster+'s Okta SCIM provisioning feature, you can:

- **Create users**. Users that are assigned to the Dagster+ application in the IdP will be automatically added to your Dagster+ organization.
- **Update user attributes.** Updating a users name or email address in the IdP will automatically sync the change to your user list in Dagster+.
- **Update user attributes.** Updating a user's name or email address in the IdP will automatically sync the change to your user list in Dagster+.
- **Remove users.** Deactivating or unassigning a user from the Dagster+ application in the IdP will remove them from the Dagster+ organization
{/* - **Push user groups.** Groups and their members in the IdP can be pushed to Dagster+ as [Teams](/dagster-plus/account/managing-users/managing-teams). */}
- **Push user groups.** Groups and their members in the IdP can be pushed to Dagster+ as
2 changes: 1 addition & 1 deletion docs/docs-beta/docs/dagster-plus/features/catalog-views.md
@@ -17,7 +17,7 @@ In this guide, you'll learn how to create, access, and share catalog views with
<summary>Prerequisites</summary>

- **Organization Admin**, **Admin**, or **Editor** permissions on Dagster+
- Familiarity with [Assets](/guides/build/assets-concepts/index.mdx and [Asset metadata](/guides/build/create-a-pipeline/metadata)
- Familiarity with [Assets](/guides/build/create-asset-pipelines/assets-concepts/index.mdx and [Asset metadata](/guides/build/create-asset-pipelines/metadata)

</details>

@@ -8,7 +8,7 @@ unlisted: true
This guide is applicable to Dagster+.
:::

Branch Deployments Change Tracking makes it eaiser for you and your team to identify how changes in a pull request will impact data assets. By the end of this guide, you'll understand how Change Tracking works and what types of asset changes can be detected.
Branch Deployments Change Tracking makes it easier for you and your team to identify how changes in a pull request will impact data assets. By the end of this guide, you'll understand how Change Tracking works and what types of asset changes can be detected.

## How it works

@@ -8,14 +8,14 @@ unlisted: true
This guide is applicable to Dagster+.
:::

This guide details a workflow to test Dagster code in your cloud environment without impacting your production data. To highlight this functionality, we’ll leverage Dagster+ branch deployments and a Snowflake database to:
This guide details a workflow to test Dagster code in your cloud environment without impacting your production data. To highlight this functionality, we'll leverage Dagster+ branch deployments and a Snowflake database to:

- Execute code on a feature branch directly on Dagster+
- Read and write to a unique per-branch clone of our Snowflake data

With these tools, we can merge changes with confidence in the impact on our data platform and with the assurance that our code will execute as intended.

Here’s an overview of the main concepts we’ll be using:
Here’s an overview of the main concepts we'll be using:

{/* - [Assets](/concepts/assets/software-defined-assets) - We'll define three assets that each persist a table to Snowflake. */}
- [Assets](/todo) - We'll define three assets that each persist a table to Snowflake.
@@ -35,7 +35,7 @@ Here’s an overview of the main concepts we’ll be using:
## Prerequisites

:::note
This guide is an extension of the <a href="/guides/dagster/transitioning-data-pipelines-from-development-to-production"> Transitioning data pipelines from development to production </a> guide, illustrating a workflow for staging deployments. We’ll use the examples from this guide to build a workflow atop Dagster+’s branch deployment feature.
This guide is an extension of the <a href="/guides/dagster/transitioning-data-pipelines-from-development-to-production"> Transitioning data pipelines from development to production </a> guide, illustrating a workflow for staging deployments. We'll use the examples from this guide to build a workflow atop Dagster+’s branch deployment feature.

:::

To complete the steps in this guide, you'll need:
@@ -52,7 +52,7 @@ To complete the steps in this guide, you'll need:

## Overview

We have a `PRODUCTION` Snowflake database with a schema named `HACKER_NEWS`. In our production cloud environment, we’d like to write tables to Snowflake containing subsets of Hacker News data. These tables will be:
We have a `PRODUCTION` Snowflake database with a schema named `HACKER_NEWS`. In our production cloud environment, we'd like to write tables to Snowflake containing subsets of Hacker News data. These tables will be:

- `ITEMS` - A table containing the entire dataset
- `COMMENTS` - A table containing data about comments
@@ -128,14 +128,14 @@ As you can see, our assets use an [I/O manager](/todo) named `snowflake_io_manager`

## Step 2: Configure our assets for each environment

At runtime, we’d like to determine which environment our code is running in: branch deployment, or production. This information dictates how our code should execute, specifically with which credentials and with which database.
At runtime, we'd like to determine which environment our code is running in: branch deployment, or production. This information dictates how our code should execute, specifically with which credentials and with which database.

To ensure we can't accidentally write to production from within our branch deployment, we’ll use a different set of credentials from production and write to our database clone.
To ensure we can't accidentally write to production from within our branch deployment, we'll use a different set of credentials from production and write to our database clone.

{/* Dagster automatically sets certain [environment variables](/dagster-plus/managing-deployments/reserved-environment-variables) containing deployment metadata, allowing us to read these environment variables to discern between deployments. We can access the `DAGSTER_CLOUD_IS_BRANCH_DEPLOYMENT` environment variable to determine the currently executing environment. */}
Dagster automatically sets certain [environment variables](/todo) containing deployment metadata, allowing us to read these environment variables to discern between deployments. We can access the `DAGSTER_CLOUD_IS_BRANCH_DEPLOYMENT` environment variable to determine the currently executing environment.
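
As a rough sketch of that check (not taken from the guide's example files; the exact value comparison is an assumption):

```python
import os


def get_current_environment() -> str:
    # Assumption: Dagster+ sets DAGSTER_CLOUD_IS_BRANCH_DEPLOYMENT to "1" in branch
    # deployments; anything else is treated as production here.
    if os.getenv("DAGSTER_CLOUD_IS_BRANCH_DEPLOYMENT") == "1":
        return "branch"
    return "prod"
```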

Because we want to configure our assets to write to Snowflake using a different set of credentials and database in each environment, we’ll configure a separate I/O manager for each environment:
Because we want to configure our assets to write to Snowflake using a different set of credentials and database in each environment, we'll configure a separate I/O manager for each environment:

```python file=/guides/dagster/development_to_production/branch_deployments/repository_v1.py startafter=start_repository endbefore=end_repository
# definitions.py
@@ -232,7 +232,7 @@ def drop_prod_clone():
drop_database_clone()
```

We’ve defined `drop_database_clone` and `clone_production_database` to utilize the <PyObject object="SnowflakeResource" module="dagster_snowflake" />. The Snowflake resource will use the same configuration as the Snowflake I/O manager to generate a connection to Snowflake. However, while our I/O manager writes outputs to Snowflake, the Snowflake resource executes queries against Snowflake.
We've defined `drop_database_clone` and `clone_production_database` to utilize the <PyObject object="SnowflakeResource" module="dagster_snowflake" />. The Snowflake resource will use the same configuration as the Snowflake I/O manager to generate a connection to Snowflake. However, while our I/O manager writes outputs to Snowflake, the Snowflake resource executes queries against Snowflake.
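
The full op definitions are collapsed in this diff; a minimal sketch of an op that runs a query through the Snowflake resource (the clone name is hypothetical) could look like:

```python
from dagster import op
from dagster_snowflake import SnowflakeResource


@op
def drop_database_clone(snowflake: SnowflakeResource):
    # The resource yields a Snowflake connection for running queries,
    # rather than loading or storing asset outputs like the I/O manager does.
    with snowflake.get_connection() as conn:
        conn.cursor().execute('DROP DATABASE IF EXISTS "PRODUCTION_CLONE"')  # hypothetical clone name
```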

We now need to define resources that configure our jobs to the current environment. We can modify the resource mapping by environment as follows:

@@ -322,7 +322,7 @@ Opening a pull request for our current branch will automatically kick off a branch

Alternatively, the logs for the branch deployment workflow can be found in the **Actions** tab on the GitHub pull request.

We can also view our database in Snowflake to confirm that a clone exists for each branch deployment. When we materialize our assets within our branch deployment, we’ll now be writing to our clone of `PRODUCTION`. Within Snowflake, we can run queries against this clone to confirm the validity of our data:
We can also view our database in Snowflake to confirm that a clone exists for each branch deployment. When we materialize our assets within our branch deployment, we'll now be writing to our clone of `PRODUCTION`. Within Snowflake, we can run queries against this clone to confirm the validity of our data:

![Instance overview](/images/guides/development_to_production/branch_deployments/snowflake.png)

@@ -383,7 +383,7 @@ Opening a merge request for our current branch will automatically kick off a bra

![Instance overview](/images/guides/development_to_production/branch_deployments/instance_overview.png)

We can also view our database in Snowflake to confirm that a clone exists for each branch deployment. When we materialize our assets within our branch deployment, we’ll now be writing to our clone of `PRODUCTION`. Within Snowflake, we can run queries against this clone to confirm the validity of our data:
We can also view our database in Snowflake to confirm that a clone exists for each branch deployment. When we materialize our assets within our branch deployment, we'll now be writing to our clone of `PRODUCTION`. Within Snowflake, we can run queries against this clone to confirm the validity of our data:

![Instance overview](/images/guides/development_to_production/branch_deployments/snowflake.png)

@@ -489,4 +489,4 @@ close_branch:

After merging our branch, viewing our Snowflake database will confirm that our branch deployment step has successfully deleted our database clone.

We’ve now built an elegant workflow that enables future branch deployments to automatically have access to their own clones of our production database that are cleaned up upon merge!
We've now built an elegant workflow that enables future branch deployments to automatically have access to their own clones of our production database that are cleaned up upon merge!
@@ -21,7 +21,7 @@ You'll need one or more assets that emit the same metadata key at run time. Insi
are most valuable when you have multiple assets that emit the same kind of metadata, such as
the number of rows processed or the size of a file uploaded to object storage.

Follow [the metadata guide](/guides/build/create-a-pipeline/metadata#runtime-metadata) to add numeric metadata
Follow [the metadata guide](/guides/build/create-asset-pipelines/metadata#runtime-metadata) to add numeric metadata
to your asset materializations.
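
For example (a sketch, not the linked guide's exact code; the asset and metadata key names are illustrative), an asset can attach a numeric metadata entry to each materialization, and the key only needs to be consistent across the assets you want to compare:

```python
from dagster import MaterializeResult, asset


@asset
def orders():
    rows_written = 42  # placeholder for the count your pipeline actually computes
    return MaterializeResult(metadata={"rows_written": rows_written})
```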

## Step 2: Enable viewing your metadata in Dagster+ Insights
2 changes: 1 addition & 1 deletion docs/docs-beta/docs/dagster-plus/index.md
@@ -7,7 +7,7 @@ Dagster+ is a managed orchestration platform built on top of Dagster's open sour

Dagster+ is built to be the most performant, reliable, and cost effective way for data engineering teams to run Dagster in production. Dagster+ is also great for students, researchers, or individuals who want to explore Dagster with minimal overhead.

Dagster+ comes in two flavors: a fully [Serverless](/dagster-plus/deployment/deployment-types/serverless) offering and a [Hybrid](/dagster-plus/deployment/deployment-types/hybrid) offering. In both cases, Dagster+ does the hard work of managing your data orchestration control plane. Compared to a [Dagster open source deployment](/guides/), Dagster+ manages:
Dagster+ comes in two flavors: a fully [Serverless](/dagster-plus/deployment/deployment-types/serverless) offering and a [Hybrid](/dagster-plus/deployment/deployment-types/hybrid) offering. In both cases, Dagster+ does the hard work of managing your data orchestration control plane. Compared to a [Dagster open source deployment](guides/deploy/index.md), Dagster+ manages:

- Dagster's web UI at https://dagster.plus
- Metadata stores for data cataloging and cost insights
1 change: 0 additions & 1 deletion docs/docs-beta/docs/getting-started/glossary.md
@@ -1,7 +1,6 @@
---
title: Glossary
sidebar_position: 30
sidebar_label: Glossary
unlisted: true
---

4 changes: 1 addition & 3 deletions docs/docs-beta/docs/getting-started/installation.md
@@ -5,8 +5,6 @@ sidebar_position: 20
sidebar_label: Installation
---

# Installing Dagster

To follow the steps in this guide, you'll need:

- To install Python 3.9 or higher. **Python 3.12 is recommended**.
@@ -72,4 +70,4 @@ If you encounter any issues during the installation process:
## Next steps

- Get up and running with your first Dagster project in the [Quickstart](/getting-started/quickstart)
- Learn to [create data assets in Dagster](/guides/build/create-a-pipeline/data-assets)
- Learn to [create data assets in Dagster](/guides/build/create-asset-pipelines/data-assets)
6 changes: 2 additions & 4 deletions docs/docs-beta/docs/getting-started/quickstart.md
@@ -1,12 +1,10 @@
---
title: "Dagster quickstart"
title: Build your first Dagster project
description: Learn how to quickly get up and running with Dagster
sidebar_position: 30
sidebar_label: "Quickstart"
---

# Build your first Dagster project

Welcome to Dagster! In this guide, you'll use Dagster to create a basic pipeline that:

- Extracts data from a CSV file
@@ -154,4 +152,4 @@ id,name,age,city,age_group
Congratulations! You've just built and run your first pipeline with Dagster. Next, you can:

- Continue with the [ETL pipeline tutorial](/tutorial/tutorial-etl) to learn how to build a more complex ETL pipeline
- Learn how to [Think in assets](/guides/build/assets-concepts/index.md)
- Learn how to [Think in assets](/guides/build/create-asset-pipelines/assets-concepts/index.md)
4 changes: 2 additions & 2 deletions docs/docs-beta/docs/guides/automate/about-automation.md
@@ -3,6 +3,8 @@ title: About Automation
unlisted: true
---

{/* TODO combine with index page and delete this page */}

There are several ways to automate the execution of your data pipelines with Dagster.

The first system, and the most basic, is the [Schedule](/guides/automate/schedules), which responds to time.
@@ -24,8 +26,6 @@ as the schedule is processed.
Schedules were one of the first types of automation in Dagster, created before the introduction of Software-Defined Assets.
As such, you may find that many of the examples can seem foreign if you are used to only working within the asset framework.

For more on how assets and ops inter-relate, read about [Assets and Ops](/guides/build/assets-concepts#assets-and-ops)

The `dagster-daemon` process is responsible for submitting runs by checking each schedule at a regular interval to determine
if it's time to execute the underlying job.
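
As a quick illustration (the job name and cron string are hypothetical, not taken from this commit), a schedule pairs a job with a cron expression that the daemon evaluates:

```python
from dagster import ScheduleDefinition, define_asset_job

daily_refresh_job = define_asset_job("daily_refresh", selection="*")

# The dagster-daemon process checks this schedule and launches a run at 06:00 UTC each day
daily_schedule = ScheduleDefinition(job=daily_refresh_job, cron_schedule="0 6 * * *")
```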

21 changes: 8 additions & 13 deletions docs/docs-beta/docs/guides/automate/asset-sensors.md
@@ -1,22 +1,17 @@
---
title: Triggering cross-job dependencies with Asset Sensors
sidebar_position: 300
sidebar_label: Cross-job dependencies
title: Trigger cross-job dependencies with asset sensors
sidebar_position: 40
---

Asset sensors in Dagster provide a powerful mechanism for monitoring asset materializations and triggering downstream computations or notifications based on those events.

This guide covers the most common use cases for asset sensors, such as defining cross-job and cross-code location dependencies.

<details>
<summary>Prerequisites</summary>
:::note

To follow this guide, you'll need:
This documentation assumes familiarity with [assets](/guides/build/create-asset-pipelines/assets-concepts/index.md) and [ops and jobs](/guides/build/ops-jobs)

- Familiarity with [Assets](/guides/build/assets-concepts/index.mdx
- Familiarity with [Ops and Jobs](/guides/build/ops-jobs)

</details>
:::

## Getting started

@@ -54,7 +49,7 @@ This is an example of an asset sensor that triggers a job when an asset is mater

<CodeExample filePath="guides/automation/simple-asset-sensor-example.py" language="python" />

## Customize evaluation logic
## Customizing the evaluation function of an asset sensor

You can customize the evaluation function of an asset sensor to include specific logic for deciding when to trigger a run. This allows for fine-grained control over the conditions under which downstream jobs are executed.

@@ -83,15 +78,15 @@ In the following example, the `@asset_sensor` decorator defines a custom evaluat

<CodeExample filePath="guides/automation/asset-sensor-custom-eval.py" language="python"/>

## Trigger a job with configuration
## Triggering a job with custom configuration

By providing a configuration to the `RunRequest` object, you can trigger a job with a specific configuration. This is useful when you want to trigger a job with custom parameters based on custom logic you define.

For example, you might use a sensor to trigger a job when an asset is materialized, but also pass metadata about that materialization to the job:

<CodeExample filePath="guides/automation/asset-sensor-with-config.py" language="python" />

## Monitor multiple assets
## Monitoring multiple assets

When building a pipeline, you may want to monitor multiple assets with a single sensor. This can be accomplished with a multi-asset sensor.
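
A hedged sketch of such a sensor (the monitored asset keys and downstream job are hypothetical) that fires only when every monitored asset has a new materialization:

```python
from dagster import (
    AssetKey,
    MultiAssetSensorEvaluationContext,
    RunRequest,
    define_asset_job,
    multi_asset_sensor,
)

summary_job = define_asset_job("summary_job", selection="daily_summary")  # hypothetical downstream job


@multi_asset_sensor(
    monitored_assets=[AssetKey("orders"), AssetKey("users")],
    job=summary_job,
)
def orders_and_users_sensor(context: MultiAssetSensorEvaluationContext):
    # Only fire once both monitored assets have unconsumed materializations
    records = context.latest_materialization_records_by_key()
    if all(records.values()):
        context.advance_all_cursors()
        return RunRequest()
```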

6 changes: 0 additions & 6 deletions docs/docs-beta/docs/guides/automate/declarative-automation.md

This file was deleted.

@@ -0,0 +1,5 @@
---
title: Arbitrary Python automation conditions
sidebar_position: 500
unlisted: true
---
@@ -0,0 +1,11 @@
---
title: Automation conditions operands and operators
sidebar_position: 600
unlisted: true
---

## Operands

## Operators

## Composite conditions

1 comment on commit 9de40ee


@github-actions github-actions bot commented on 9de40ee Jan 2, 2025


Deploy preview for dagster-docs-beta ready!

✅ Preview
https://dagster-docs-beta-ly99z0wge-elementl.vercel.app

Built with commit 9de40ee.
This pull request is being automatically deployed with vercel-action
