Replacing ofBorg with GitHub Actions #355847
Comments
This issue has been mentioned on NixOS Discourse. There might be relevant details there: |
Evaluation checks take too many resources. I'm worried about whether GitHub Actions machines can run them in a reasonable time. |
@Bot-wxt1221 I managed to run it in 5 minutes for a naive nix-env evaluation based on the default.nix entry point, and in 15 minutes using the same logic that ofborg uses: https://github.com/Mic92/nixpkgs/actions/workflows/eval.yml Both are already faster than the hours of waiting in the ofborg queue that we experience today. Also, this is not yet the end of the line for optimizations. We still have https://github.com/Mic92/nixpkgs/blob/main/pkgs/top-level/release-attrpaths-superset.nix to split evaluation into smaller parts that can even run in parallel. |
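As a rough illustration of the splitting idea in the comment above, here is a minimal sketch that assumes the attribute paths are already available as a flat list. The function name `split_into_chunks` and the sample attribute names are hypothetical; this is not taken from ofborg or from the linked workflow.

```python
from typing import List, Sequence


def split_into_chunks(attr_paths: Sequence[str], chunk_size: int) -> List[List[str]]:
    """Split attribute paths into consecutive chunks of at most chunk_size items.

    Each chunk could then be handed to a separate evaluation job.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        list(attr_paths[start:start + chunk_size])
        for start in range(0, len(attr_paths), chunk_size)
    ]


if __name__ == "__main__":
    # Hypothetical attribute names, for illustration only.
    paths = ["hello", "git", "python3", "firefox", "chromium"]
    for index, chunk in enumerate(split_into_chunks(paths, 2)):
        print(index, chunk)
```

Consecutive chunks keep the original ordering, which makes it easy to concatenate the per-chunk results afterwards. Whether equally sized chunks also take equal time to evaluate is a separate question that this sketch does not address.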
Will PR commands like |
I worry that bot accounts like ryantm-r can easily hit the limit of CI. CC @ryantm |
Yes it's possible:
|
Well. We have to try and see. Right now it's speculation whether it works or not. |
Good to know, though huge builds like the kernel and its modules, chromium and firefox will obviously not work. And we'll possibly have to set up a blacklist, or else even individual contributors will hit their limits. |
According to github doc:
So maybe we don't need to worry about time? |
You can run builds for 12h. Obviously we should establish some reasonable timeouts to be a good citizen in the ecosystem. |
Added a ^ meeting date for this. |
Maybe of interest for this issue, at least just for inspiration, but I've also (ab)used GitHub actions to build tests in my project using a dynamically generated matrix. My project uses flakes but this should be adaptable to non-flakes https://github.com/ibizaman/selfhostblocks/blob/main/.github/workflows/build.yaml |
See the meeting notes for today's infra meeting where we mainly discussed the CI situation: https://github.com/NixOS/infra/blob/7688f20babbeb27a10e4d8669fffe4b0ed00e17f/docs/meeting-notes/2024-11-14.md Here is the high-level plan:
Independently of the meeting, we also have other discussions about how we can develop ofborg in the future. However, this might not happen before February, so we need some alternative solution in the meantime, if not longer. |
I've opened a draft PR here for evaluating Nixpkgs using GitHub Actions: #356023. For just evaluation (and those only taking 5 minutes on each arch) instead of also building, I don't think we need to do the running-on-forks dance. Building is harder to get, but it's arguably also less important (and very orthogonal to evaluation). |
One important aspect that ofborg currently provides, and that this issue doesn't mention, is the performance report. For the majority of PRs the performance report is not important, but for work on The report currently does not report the impact of |
Could that be another on-demand GitHub Actions job? We could even run it automatically if certain paths have been changed. |
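The "run automatically if certain paths have been changed" idea could look roughly like the sketch below. The path prefixes are placeholders chosen for illustration only; the thread does not say which paths would count as performance-sensitive, and `should_run_perf_report` is a hypothetical helper name.

```python
from typing import Iterable, Sequence

# Placeholder prefixes, NOT taken from the thread: which paths should
# trigger a performance report is an open question there.
PERF_SENSITIVE_PREFIXES: Sequence[str] = ("lib/", "pkgs/top-level/")


def should_run_perf_report(
    changed_files: Iterable[str],
    prefixes: Sequence[str] = PERF_SENSITIVE_PREFIXES,
) -> bool:
    """Return True if any changed file lives under a watched path prefix."""
    watched = tuple(prefixes)
    return any(path.startswith(watched) for path in changed_files)


if __name__ == "__main__":
    print(should_run_perf_report(["lib/strings.nix"]))
    print(should_run_perf_report(["pkgs/by-name/he/hello/package.nix"]))
```

A job could call such a check on the PR's list of changed files and skip the report when it returns False, while still allowing the report to be requested on demand.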
Building the Linux kernel is fine on GitHub Actions: the CPU time is sufficient, it takes less than 2 hours to build the Jovian-NixOS Linux kernel, and GitHub Actions offers a maximum of 6 hours per run. The only concern is disk space; workarounds:
All of the above workarounds are implemented in https://github.com/azuwis/actions/blob/main/nix/prepare.sh. Well, except for 2), which can be set by:
|
Sounds good to me. |
I am concerned about building the kernel modules (both in tree and out of tree). |
Well. We should be able to quickly filter out and blacklist packages we don't want to build once the source of truth lives in the repository? Also, we can actually stop GitHub Actions runs, which was not possible with ofborg builds. |
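A minimal sketch of such a blacklist filter, assuming the list of attributes to build is already known. The patterns and the helper name `filter_buildable` are hypothetical; the thread only names the kernel and its modules, chromium and firefox as examples of builds that are too large.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence

# Hypothetical patterns for illustration; a real list would live in the
# repository and be maintained there.
DENYLIST_PATTERNS: Sequence[str] = ("chromium", "firefox", "linuxPackages*")


def filter_buildable(
    attrs: Iterable[str],
    patterns: Sequence[str] = DENYLIST_PATTERNS,
) -> List[str]:
    """Drop every attribute that matches a denylist glob pattern."""
    return [
        attr
        for attr in attrs
        if not any(fnmatchcase(attr, pattern) for pattern in patterns)
    ]


if __name__ == "__main__":
    candidates = ["hello", "chromium", "linuxPackages.zfs", "git"]
    print(filter_buildable(candidates))
```

`fnmatchcase` is used rather than `fnmatch` so that matching is case-sensitive on every platform, which matters because attribute names are case-sensitive.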
@ibizaman did you see this? https://github.com/thecaralice/flake-gha |
Edge case: When only changing the base branch without force pushing a rebase, eval will not run again, changed packages will not be updated, maintainers will not be requested for review. Fix in #372475, running eval on base branch changes, too. |
Avoid requesting maintainer reviews in draft mode, similar to codeowners: #372479 |
Removed in #356023 (comment) Due to #355847 (comment) #355847 (comment) #355847 (comment) (cherry picked from commit 9ccdc41)
This issue has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/prs-stuck-in-passthru-tests-for-darwin/59257/3 |
I got this failure when pushing a commit to a r-ryantm PR (#376432).
Not bothering me too much, but thought people here would like to know... |
Eval / Process fails with |
That seems like a .nix issue. Another thing: it would be nice if we could get a performance report like ofborg did. There was already an implementation in SQL by @paparodeo (#362844 (comment)), example. Perhaps anyone wants to reimplement it in Nix? |
Addressed by #377434 |
This needs to be fixed still: #371223 |
When "Eval / Comparison" can't find the "run" on the target commit, it will silently pass the job, but "Eval / Tag" will be skipped. In this case, we don't get any rebuild labels, rebuild status, maintainer pings. Random example: https://github.com/NixOS/nixpkgs/actions/runs/12954374703/job/36136463837 I'd argue that we should fail in this case, because this is not expected. Edit: Opened #378909 to do just that. |
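The "fail instead of silently passing" behaviour argued for above can be sketched as follows. This is only an illustration of the idea, not the change made in #378909; the mapping from commit to run ID, the exception `TargetRunNotFound` and the function `find_target_run_id` are all hypothetical.

```python
from typing import Mapping


class TargetRunNotFound(RuntimeError):
    """Raised when no eval run exists for the target commit."""


def find_target_run_id(runs_by_commit: Mapping[str, int], target_sha: str) -> int:
    """Return the run ID for target_sha, or fail loudly if there is none.

    Raising here makes the comparison step fail visibly instead of
    passing and leaving later steps skipped.
    """
    run_id = runs_by_commit.get(target_sha)
    if run_id is None:
        raise TargetRunNotFound(f"no eval run found for target commit {target_sha}")
    return run_id


if __name__ == "__main__":
    # Hypothetical data: commit SHA -> workflow run ID.
    known_runs = {"abc123": 1001}
    print(find_target_run_id(known_runs, "abc123"))
```

An uncaught exception gives the process a non-zero exit status, so a CI step running such a check would be marked as failed rather than passed.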
Can anyone please help me understand this eval error: 🙏 |
I think this is just GHA being very flaky lately. Sometimes jobs fail for no reason and work fine when you restart them... |
I restarted it and it failed exactly the same way... After a rebase and force push it got fixed. Could it be it was due to an eval error that was present in branch |
One thing that also happens is GHA being flaky in a shard-dependent way, where changing the commit hash rerolls the dice; typically this case doesn't look like your example, though. But maybe they have found new ways to be flaky. |
The eval error regarding "target run id" is this:
When you force push your PR, GHA will create a new temporary merge commit. If something has happened on the target branch already, this might then be based off of a different commit on the target branch - one where CI was passing, thus fixing CI for you. So the failure you are seeing is essentially a cascaded effect of GHA flakiness... |
What if we start using impure derivations and do all of this composition using Nix itself? GitHub Actions would only set up and run Nix once. That way we could also run more extensive tests on packages. |
IIUC, you are suggesting to eval both the PR branch and the target branch in the PR's job. Eval already takes a lot of resources right now, and this would double the resources needed. Using the already evaluated results from the target branch does speed up CI and save resources significantly, if the target run succeeded. |
I contacted GitHub support about those failures. Let's see whether we get any useful feedback that way. |
Feedback from support is as follows:
So nothing we can do about that, really. |
For me "Rerun all jobs" seems to work. Could it be that one of the previous jobs fails to upload some artifact? |
See #355847 (comment). If there is enough time between the first run and your re-run, GHA will already have created a new temporary merge commit - so this can fix it as well. |
Let's say I hit the issue mentioned in this comment. How do I restart the failed job? Do I need some additional permissions to do so? (I don't see a Re-run button.) Do I have to rebase to re-trigger? |
Yes, as a non-committer, this is your only option. |
This is one of the two plans to ensure we can also perform GitHub evaluation checks in the future.
See https://discourse.nixos.org/t/infrastructure-announcement-the-future-of-ofborg-your-help-needed/56025
for more information.
To replace OfBorg’s functions with GitHub Actions the following tasks need to be implemented:
I already created a proof of concept pull request here: #352808
Update
We have our first Jitsi meeting to coordinate the migration on 2024-11-14 (today) at 17:00 UTC (18:00 Berlin time) at https://jitsi.lassul.us/nixos-infra