Run db:migrate in pre- and post-install/upgrade (#18, #26) #37
base: main
Conversation
As recommended in Mastodon release notes, run db:migrate with SKIP_POST_DEPLOYMENT_MIGRATIONS=true in pre-install and pre-upgrade hooks, and again without the flag in post-install and post-upgrade.
Co-authored-by: Sheogorath <[email protected]>
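For readers skimming the thread, here is a minimal sketch of what the pre-hook half of such a migration Job could look like. The template names, values keys, and hook policies below are illustrative assumptions, not necessarily what this PR ships:

```yaml
# Sketch only: the post-install/post-upgrade counterpart would be the same Job
# without the SKIP_POST_DEPLOYMENT_MIGRATIONS variable.
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "mastodon.fullname" . }}-db-migrate
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: db-migrate
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          command: ["bundle", "exec", "rails", "db:migrate"]
          env:
            - name: SKIP_POST_DEPLOYMENT_MIGRATIONS
              value: "true"
          envFrom:
            - configMapRef:
                name: {{ include "mastodon.fullname" . }}-env
```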
pre-install and pre-upgrade hooks run before the persistent ConfigMap resources are installed. As suggested in helm/helm#8694, create a hook with lower hook-weight and resource-policy=keep to make the same ConfigMap available in pre- hooks.
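A rough sketch of the hook-flavoured ConfigMap this describes; the weight and annotation values are assumptions for illustration, not necessarily the ones used in the PR:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "mastodon.fullname" . }}-env
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "-10"      # lower weight than the db-migrate Job, so it exists first
    "helm.sh/resource-policy": keep   # per helm/helm#8694, keep it around after the hooks run
data:
  # same keys as the regular, non-hook mastodon-env ConfigMap
```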
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "mastodon.fullname" . }}-env
why do we need a second configmap with the same name and data?
Because Helm doesn't create the other ConfigMap until after the pre-install hooks, and deletes this one after running the pre-install hooks. It's a catch-22: for the mastodon-env ConfigMap to be available when the db-migrate job runs, it has to be created as a hook with a lower weight; but since it's created as a hook, it gets cleaned up after db-migrate is done, so we also have to keep the non-hook version of the same ConfigMap. See also the commit message in a749654.
oh i didn't realize that! thanks for the clarification.
For some additional clarification: this only matters the first time the chart gets installed; after that, the ConfigMap will already be there.
Not sure if it shows up for you, but GitHub tells me:
I don't think this PR is going to land until someone does the needful.
Could you please merge this PR? It seems fine. Otherwise, this chart cannot be used in a highly automated environment with Terraform + Helm: the Helm installation never ends and the migration job is never triggered by Helm, making it impossible to use for a fresh install. cc @dunn
I'm not actually a maintainer of this repo, so I can't merge.
Oops, so sorry, I saw your comments and just thought you were a maintainer too.
Just tried this PR, and it doesn't work: the migrate job requires the PVC to already be created, otherwise the job cannot be executed. The question is, does the migration job require the PVC?
I tried again using a bucket instead of a PVC, and now the problem is the required Redis instance, which needs to be up and running for the job to finish.
I tried running the migration job along with the other deployments, and it worked fine; it is the same approach the GitLab chart takes. The concept is to let the scheduler keep restarting the services until the migration job finishes; once it is done, the pods come up and work fine. GitLab migration job used as a reference: https://docs.gitlab.com/charts/charts/gitlab/migrations/
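To make the contrast with the hook-based approach in this PR concrete, here is a rough sketch of the GitLab-style variant described above (illustrative only, not what this PR implements):

```yaml
# No helm.sh/hook annotations: the Job is created together with the
# Deployments, and the application pods restart until migrations finish.
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "mastodon.fullname" . }}-db-migrate
spec:
  backoffLimit: 10
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: db-migrate
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          command: ["bundle", "exec", "rails", "db:migrate"]
```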
Thanks for your work on this @paolomainardi!
How would this work for version upgrades? The pre-upgrade migrations need to run before any of the new-version application pods start, otherwise those can generate errors on some requests (when trying to access a table that has not been migrated yet). I worry this will create user-facing errors during the migration, or even make the server unavailable if the migration does not happen and all pods are upgraded to the new version.
@renchap yes, you're right; the issue with this approach is that users can face problems while migrations are running. This cannot really be solved with Helm alone; the best choice is to avoid running migrations the way this chart does and move most of the complexity to the application side. Looking again at how GitLab does it, they have open-sourced the database migration types they support: https://docs.gitlab.com/ee/development/migration_style_guide.html. The case for Mastodon is "Regular migrations", which according to their document must always take under 3 minutes; anything longer must be moved to post-deployment or background migrations. It is indeed not very clear what happens during that 3-minute window; maybe the migrations are always written so that the previous and next releases stay compatible. This is the migrations Helm chart documentation: https://docs.gitlab.com/charts/charts/gitlab/migrations, and from my direct experience, they run alongside the other deployments.
is there any chance this could be moved forward?
Apologies for the long delay in getting to this... Generally I like how this is done, including the generalization of the deploy job into pre & post runs. Though I do want to clarify something first.
@paolomainardi This job is using pre/post-upgrade/install Helm hooks, which, if I'm not mistaken, means that the pre- job runs first before anything is installed/updated, THEN resources are added/changed, THEN the post- job is run. The scheduler shouldn't be restarting anything while the pre-migration is running. Is this something you saw while trying this yourself?
@timetinytim, thanks for your reply; no worries about the long delay. It should work like that: a pod is created as part of the Helm hooks to run the migrations; once it finishes, the rest is applied, such as updating the Kubernetes Deployment with the new Mastodon container image, which triggers the scheduler to act. I have not been managing a Mastodon K8s instance lately, but I can give you some tips.
@paolomainardi Alright cool, I just wanted to confirm; that's the behavior we expect. As long as pre-migrations run before the new pods come up, that should prevent any user-facing errors during an update. Let me do some testing on my own side, and if everything looks good, I'll get this approved/merged in the next day or two.
I've been doing some testing, and I've come to a conclusion: the idea of a pre/post-hook ConfigMap isn't a bad one, but at the end of the day the job only needs a couple of pieces of information: the DB connection info and the Redis connection info. There are also situations where the former would be different for the migration job (e.g. when using a connection pooler, which wouldn't work for migrations). What I'm suggesting is, rather than creating the ConfigMap twice, we just add the env vars the job needs directly. It solves the case mentioned above and simplifies things a little. @angdraug, let me know what you think. Given how long it's taken me to look at this, I totally understand if you've moved on, so no worries if that's the case; I'll just pick this up myself.
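A rough sketch of what that could look like on the migration Job's container; the values paths and Secret name below are hypothetical placeholders, not the chart's actual keys:

```yaml
env:
  - name: DB_HOST
    value: {{ .Values.postgresql.host | quote }}  # can point at the primary, bypassing a pooler
  - name: DB_PORT
    value: "5432"
  - name: DB_NAME
    value: {{ .Values.postgresql.database | quote }}
  - name: DB_USER
    value: {{ .Values.postgresql.username | quote }}
  - name: DB_PASS
    valueFrom:
      secretKeyRef:
        name: {{ include "mastodon.fullname" . }}-postgresql  # hypothetical Secret name
        key: password
  - name: REDIS_HOST
    value: {{ .Values.redis.host | quote }}
  - name: REDIS_PORT
    value: "6379"
```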
Speaking from experience, this will work perfectly fine, and it is the simplest solution tbh.
I have indeed moved on from this, please feel free to take over, thanks!