[dagster-airlift] Federation tutorial overview and setup
Showing 6 changed files with 112 additions and 3 deletions.
`docs/content/integrations/airlift/federation-tutorial/overview.mdx` (29 additions, 1 deletion)

Removed: "Will be filled out in a future PR."
# Airflow Federation Tutorial

This tutorial demonstrates using `dagster-airlift` to observe DAGs from multiple Airflow instances, and to federate execution between them using Dagster as a centralized control plane.

Using `dagster-airlift`, we can:

- Observe Airflow DAGs and their execution history
- Directly trigger Airflow DAGs from Dagster
- Set up federated execution _across_ Airflow instances

All of this can be done with no changes to Airflow code.

## Overview

This tutorial follows an imaginary data platform team facing the following scenario:

- An Airflow instance, `warehouse`, run by another team, is responsible for loading data into a data warehouse.
- An Airflow instance, `metrics`, run by the data platform team, deploys all the metrics that data scientists construct on top of the data warehouse.

Two DAGs have been causing the team a lot of pain lately: `warehouse.load_customers` and `metrics.customer_metrics`. The `warehouse.load_customers` DAG loads customer data into the data warehouse, and the `metrics.customer_metrics` DAG computes metrics on top of that data. There is a cross-instance dependency between these two DAGs, but it is neither observable nor controllable. Ideally, the data platform team would like to rebuild the `metrics.customer_metrics` DAG _only_ when the `warehouse.load_customers` DAG has new data. In this guide, we'll use `dagster-airlift` to observe the `warehouse` and `metrics` Airflow instances, and set up federated execution, controlled by Dagster, that triggers the `metrics.customer_metrics` DAG only when the `warehouse.load_customers` DAG has new data. This process won't require any changes to the Airflow code.
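The control-plane behavior described above can be sketched with nothing but the Python standard library. The endpoint paths below follow Airflow's stable REST API (`/api/v1/dags/{dag_id}/dagRuns`), and the ports and DAG IDs match this tutorial's two instances, but the polling approach is an illustration of the idea, not `dagster-airlift`'s internals.

```python
# Illustration of cross-instance federation: watch the upstream DAG's runs,
# and trigger the downstream DAG only when a new successful run appears.
import json
import urllib.request

WAREHOUSE_URL = "http://localhost:8081"  # upstream instance
METRICS_URL = "http://localhost:8082"    # downstream instance

def latest_dag_runs_url(base_url: str, dag_id: str) -> str:
    # Airflow stable REST API: list runs for a DAG, newest first.
    return (f"{base_url}/api/v1/dags/{dag_id}/dagRuns"
            "?order_by=-execution_date&limit=1")

def trigger_dag_request(base_url: str, dag_id: str) -> urllib.request.Request:
    # POSTing to the same collection endpoint creates (triggers) a new run.
    return urllib.request.Request(
        f"{base_url}/api/v1/dags/{dag_id}/dagRuns",
        data=json.dumps({"conf": {}}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# A control plane would poll latest_dag_runs_url(WAREHOUSE_URL, "load_customers")
# and, on observing a new successful run, send
# trigger_dag_request(METRICS_URL, "customer_metrics").
```

In the rest of the tutorial, Dagster plays the role of this control plane, with the observation and triggering handled by `dagster-airlift` rather than hand-rolled HTTP calls.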
## Pages

<ArticleList>
  <ArticleListItem
    title="Setup"
    href="/integrations/airlift/federation-tutorial/setup"
  ></ArticleListItem>
</ArticleList>
`docs/content/integrations/airlift/federation-tutorial/setup.mdx` (76 additions)
# Airflow Federation Tutorial: Setup

In this step, we'll:

- Install the example code
- Set up a local environment
- Ensure we can run Airflow locally

## Installation & Project Structure

First, clone the tutorial example repo locally, and enter the repo directory:

```bash
git clone [email protected]:dagster-io/airlift-federation-tutorial.git
cd airlift-federation-tutorial
```
Next, we'll create a fresh virtual environment using `uv` and activate it:

```bash
pip install uv
uv venv
source .venv/bin/activate
```
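If you want to double-check that the virtual environment is actually active before installing anything into it, a quick stdlib-only check works on any platform:

```python
# Optional sanity check: are we running inside a virtual environment?
import sys

def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the venv directory, while
    # sys.base_prefix still points at the interpreter it was created from.
    return sys.prefix != sys.base_prefix

print(in_virtualenv())  # should print True once `.venv` is activated
```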
## Running Airflow locally

The tutorial example involves running two local Airflow instances. This can be done by running the following commands from the root of the `airlift-federation-tutorial` directory.

First, install the required python packages:

```bash
make airflow_install
```

Next, scaffold the two Airflow instances we'll be using for this tutorial:

```bash
make airflow_setup
```

Finally, run the two Airflow instances with their environment variables set.

In one shell, run:

```bash
make upstream_airflow_run
```

In a separate shell, run:

```bash
make downstream_airflow_run
```
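Under the hood, running two Airflow instances side by side generally amounts to giving each one its own `AIRFLOW_HOME` and webserver port. The sketch below shows that isolation pattern; the directory names are assumptions for illustration, and the tutorial's Makefile remains the source of truth for what the targets actually do.

```python
# Sketch: build isolated environments for two side-by-side Airflow instances.
# AIRFLOW_HOME isolates each instance's metadata DB, dags/ folder, and logs;
# the webserver port is set via Airflow's standard env-var config override.
# The .airflow_* directory names are hypothetical, not from the tutorial repo.
import os

def airflow_env(home_dir: str, port: int) -> dict:
    env = dict(os.environ)
    env["AIRFLOW_HOME"] = home_dir
    env["AIRFLOW__WEBSERVER__WEB_SERVER_PORT"] = str(port)
    return env

upstream_env = airflow_env(".airflow_upstream", 8081)
downstream_env = airflow_env(".airflow_downstream", 8082)
# Each shell would then launch Airflow (e.g. `airflow standalone`)
# with its own environment, so the instances never share state.
```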
This will start two Airflow web UIs, one for each Airflow instance. You should now be able to access the upstream Airflow UI at `http://localhost:8081`, with both the default username and password set to `admin`.

You should be able to see the `load_customers` DAG in the Airflow UI.

<Image
  alt="load_customers DAG"
  src="/images/integrations/airlift/load_customers.png"
  width={1484}
  height={300}
/>

Similarly, you should be able to access the downstream Airflow UI at `http://localhost:8082`, with both the default username and password set to `admin`.

You should be able to see the `customer_metrics` DAG in the Airflow UI.

<Image
  alt="customer_metrics DAG"
  src="/images/integrations/airlift/customer_metrics.png"
  width={1484}
  height={300}
/>
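Beyond eyeballing the UIs, you can also confirm both webservers are up programmatically: Airflow's webserver exposes an unauthenticated `/health` endpoint that reports on the metadata database and scheduler. This is a stdlib-only sketch, and `is_healthy` will only succeed while both `make ..._airflow_run` targets are running.

```python
# Poll each local Airflow webserver's /health endpoint to verify it is up.
import json
import urllib.request

def health_url(port: int) -> str:
    return f"http://localhost:{port}/health"

def is_healthy(port: int) -> bool:
    # /health returns JSON like {"metadatabase": {"status": "healthy"}, ...}
    with urllib.request.urlopen(health_url(port), timeout=5) as resp:
        body = json.load(resp)
    return body.get("metadatabase", {}).get("status") == "healthy"

# With both instances running:
#   is_healthy(8081) -> upstream, is_healthy(8082) -> downstream
```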