Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate to external ETL regions dataset #4534

Merged
merged 2 commits into from
Feb 6, 2025
Merged

Conversation

lucasrodes
Copy link
Member

We want to stop generating grapher/regions/latest/regions dataset, which is meant for our Grapher database and generates this dataset in Admin which no one uses: Regions data (OWID, 2023).

Instead, we should rely on step data://external/owid_grapher/latest/regions, which should just be equivalent for use-cases like this one in owid-grapher.

Would the change in this PR suffice? Are there other instances where we rely on this dataset?

@lucasrodes
Copy link
Member Author

cc. @pabloarosado since you pointed me out to this issue.

@pabloarosado
Copy link
Contributor

pabloarosado commented Feb 6, 2025

For more context:

  • ETL currently supports two solutions:
    • One is to create an export step. This automatically writes to external repositories (optionally creating a PR).
    • The other option is an external step, which is a step we know is read by an external repository.

Either of those solutions would be better than directly calling a data step from an external repos.

But from memory, I think grapher (or cloudflare-workers?) may be requiring two different regions files, so the changes proposed in this PR may not be enough. We'd need to revisit this a conversation.
NOTE: If it turns out we need two files, we could simply create two external steps.

@owidbot
Copy link
Contributor

owidbot commented Feb 6, 2025

Quick links (staging server):

Site Dev Site Preview Admin Wizard Docs

Login: ssh owid@staging-site-migrate-regions-dataset

SVG tester:

Number of differences (default views): 0 ✅
Number of differences (all views): 0 ✅

Edited: 2025-02-06 11:34:10 UTC
Execution time: 1.18 seconds

Copy link
Member

@marcelgerber marcelgerber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! Yes this makes sense.
I've just added some lines that exclude income groups from the regions file, otherwise these would start showing up as "Your current location" in the entity selector from now.
Which is something we eventually want to do, but with a proper design for that case - see #3517.

But now this is good to go, there are no differences when run, and we indeed can get rid of the old etl step for this.

@lucasrodes
Copy link
Member Author

Lovely, thanks @marcelgerber !

@lucasrodes lucasrodes merged commit a497ef4 into master Feb 6, 2025
25 checks passed
@lucasrodes lucasrodes deleted the migrate-regions-dataset branch February 6, 2025 12:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants