use dedicated GCP runners #184
This is, of course, doable. We'd just need to set up the runners in a GCP project. Do we have a project dedicated to the HDL GH org?
Actually, we do want to use "self-hosted" runners in multiple repos of this org (incl. containers, conda-* and maybe packages). See hdl/containers#51.

There is an 'hdl-containers' project in GCP, which is used for the container registry (gcr.io/hdl-containers). I do have management access there, using my personal gmail account. That allows me to e.g. update the tokens, which expire every 1-3 months. See https://hdl.github.io/containers/dev/Tasks.html#credentials. However, I'm unsure whether 'hdl-containers' and 'github-hdl' are the same project in GCP. Precisely, I don't have permissions to access https://console.cloud.google.com/?project=github-hdl; I'm missing …

At the end of September 2021, I talked to @mithro and @QuantamHD about this (through e-mail). Ethan said it was problematic to add either my university account or my personal gmail account for these purposes. Since I do have an antmicro account now, maybe we can reconsider.
Looks like both the … I believe the plan was that … I thus think it makes the most sense to deploy the GitHub runners under …
@PiotrZierhoffer @ajelinski @kgugala how can I help to set up a dedicated runner for conda-eda? Per #210 (comment), this will become necessary for bigger packages like Xyce.
@proppy We can set up the infrastructure. Do you know what machines will be needed?
There are currently 64 package-building jobs run concurrently on every commit. The longest job (XLS) takes around 2 hours and dominates the total build time. Assuming you can't have multiple runners running on a single node, what about starting with 4x n2-standard-32?
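To make the trade-off above concrete, here is a small illustrative sketch that estimates wall-clock time for a runner pool under a greedy longest-job-first schedule. The job durations (one ~2 h XLS job plus 63 shorter ones at an assumed 30 min each) are rough assumptions based on this thread, not measured numbers.

```python
# Rough capacity estimate for a self-hosted runner pool (illustrative only;
# job counts and durations are assumptions based on the discussion above).
def makespan(durations_h, runners):
    """Greedy longest-job-first schedule: assign each job to the
    least-loaded runner and return the resulting wall-clock hours."""
    loads = [0.0] * runners
    for d in sorted(durations_h, reverse=True):
        i = loads.index(min(loads))
        loads[i] += d
    return max(loads)

# 64 jobs: one ~2 h job (XLS) plus 63 assumed ~0.5 h jobs, one job per node.
jobs = [2.0] + [0.5] * 63
print(makespan(jobs, 4))   # 4x n2-standard-32 → 8.5 h wall clock
print(makespan(jobs, 64))  # full concurrency   → 2.0 h wall clock
```

Under these assumed durations, 4 nodes would stretch the wall clock well past the current 2 h; the estimate only returns to ~2 h at full concurrency, which is why the node count matters.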
We can choose the machine type per job in CI (you can configure this in the YAML file). We just need a list of the machine types we want to use.
Per job or per workflow?
Per job, I think?
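For reference, per-job runner selection in GitHub Actions is done with `runs-on` labels, so a sketch of the workflow YAML could look like the following. The label names (`gcp-n2-standard-32` etc.) and build commands are hypothetical placeholders, not the labels actually configured for this org.

```yaml
# Hypothetical sketch: per-job machine selection via runs-on labels.
jobs:
  build-small:
    runs-on: [self-hosted, gcp-e2-standard-8]   # placeholder label
    steps:
      - uses: actions/checkout@v2
      - run: ./build.sh small-package           # placeholder command
  build-xls:
    # XLS is the long pole (~2 h), so give it a bigger machine.
    runs-on: [self-hosted, gcp-n2-standard-32]  # placeholder label
    steps:
      - uses: actions/checkout@v2
      - run: ./build.sh xls                     # placeholder command
```

Each job matches whichever registered runner carries all of its labels, which is what makes per-job (rather than per-workflow) machine types possible.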
@proppy what's the issue with building Xyce on the default runners? We build it in hdl/containers and it takes less than 2h, which is far below the limit. Are you cross-compiling it for architectures other than x64?
@umarcor this was based on a conversation with @cbalint13 here: #210 (comment)
Trilinos takes a full ~15h (Release, no-debug) for all 3x complete builds {native, openmpi, mpich}:
Xyce takes a full ~1h (Release, no-debug) for all 3x complete builds {native, openmpi, mpich}:
Automated builds have all possible flags/features enabled, except CUDA for now (coming soon).
@AdamOlech configured the custom runner and posted a PoC here:
@ajelinski @PiotrZierhoffer should we migrate all the jobs at once, or incrementally (starting with the ones that currently fail because of limited resources: #263, #238)?
We are working on moving the whole workflow. It should be easier than splitting the current one, as the dependencies should be generally the same for everything.
Thanks for doing this!
We should consider using https://antmicro.com/blog/2021/08/open-source-github-actions-runners-with-gcp-and-terraform/ to get quicker feedback on the CI jobs.
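The blog post above covers Terraform-managed runners; as a rough, hedged illustration of what the VM side involves, here is a manual provisioning sketch. The instance name, zone, labels, repo URL, and `RUNNER_TOKEN` are placeholders (the token comes from the repo/org "Add new self-hosted runner" settings page); `config.sh`/`run.sh` are part of GitHub's official actions/runner release.

```shell
# Illustrative sketch only: names, zone, repo, and token are placeholders.
# Create a build VM (machine type per the sizing discussion above).
gcloud compute instances create gha-runner-1 \
  --project=hdl-containers \
  --machine-type=n2-standard-32 \
  --zone=us-central1-a

# On the VM: download GitHub's official runner and register it.
curl -sL -o runner.tar.gz \
  https://github.com/actions/runner/releases/download/v2.285.1/actions-runner-linux-x64-2.285.1.tar.gz
tar xzf runner.tar.gz
./config.sh --url https://github.com/hdl/conda-eda \
            --token "$RUNNER_TOKEN" \
            --labels self-hosted,gcp-n2-standard-32 \
            --unattended
./run.sh
```

The Terraform approach in the linked post automates exactly these steps (instance lifecycle plus runner registration), which is what makes autoscaling and quicker CI feedback practical.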