Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce amount of CI round-tripping required by contributors #10263

Open
mpickering opened this issue Aug 19, 2024 · 8 comments
Open

Reduce amount of CI round-tripping required by contributors #10263

mpickering opened this issue Aug 19, 2024 · 8 comments
Labels
re: devx Improving the cabal developer experience (internal issue) type: enhancement

Comments

@mpickering
Copy link
Collaborator

mpickering commented Aug 19, 2024

There is currently quite a lot of back-and-forth which is required when contributing patches to Cabal which makes the developer experience sometimes cumbersome.

In my mind, the goal should be to perform 1 round-trip with CI. A developer submits a MR, sees ALL issues with the patch and can update the patch once in order to fix all the problems. Ideally all the issues are reproducible locally so that iteration can be performed locally on a specific test.

Off the top of my head here are some reasons why you might have to perform multiple CI rounds:

  • The changelog.d template has a MR number field, which can only be filled in after a MR is created.
  • Formatting often initially fails for contributors who are not aware of the formatting requirement.
  • Tests such as the structured hash test require a developer to copy and paste a value from CI into their local tree. This is different on different GHC versions so it's difficult to update locally.
  • If one CI job fails (for example a windows job), then all other jobs are cancelled, which can hide further failures on other platforms or versions.
  • Tests are run on CI sequentially, some tests are not run if earlier tests fail (for example, unit tests run before the package tests). In a bad situation you may do multiple rounds if a unit test fails and then a package test for ./Setup and then a package test for cabal-install.

There is also the suggestion of adding a file which tracks the API of Cabal and Cabal-syntax (#10259) which would add another bullet point to this list.

In order to fix these issues, here are some possible suggestions:

  • Remove tests such as structured hash (I'm not sure what the purpose of this test is as it is blindly updated)
  • Do not cancel concurrent jobs if one CI job fails
  • Always run all tests, don't bail out early if one testsuite fails.
  • Remove MR field from changelog.d entry, require each commit to reference an issue. (At which point the issue will link to the MR automatically)
@ffaf1 ffaf1 added the re: devx Improving the cabal developer experience (internal issue) label Aug 19, 2024
@ffaf1
Copy link
Collaborator

ffaf1 commented Aug 19, 2024

require each commit to reference an issue

We do but it is not enforced strictly.

@geekosaur
Copy link
Collaborator

The changlog.d template has a MR number field, which can only be filled in after a MR is created.

It's possible to predict these, actually: the PR number is shared with issue numbers, but otherwise your only problem is someone else getting an issue or PR in before you do. I haven't specifically dealt with this, but multiple times have predicted links to PRs I'm about to put up.

@geekosaur
Copy link
Collaborator

Remove tests such as structured hash (I'm not sure what the purpose of this test is as it is blindly updated)

These tell us when core structures change, which among other things is an indication that a changelog.d entry is needed and that it will invalidate people's existing dist or dist-newstyle directories.

@mpickering
Copy link
Collaborator Author

An example from my recent PR: #10287

The job is killed after lib-tests runs, so I have no idea if there are going to be more failures in subsequent testsuites.

Why not always run all the tests and report all the failures at once to a user?

@ulysses4ever
Copy link
Collaborator

@mpickering I think there's no strong opposition to what you propose — just a matter of cooking a PR flipping the right switch. If memory serves, @andreasabel worked a bit on orchestrating CI, maybe he has an opinion?

@fgaz
Copy link
Member

fgaz commented Aug 28, 2024

iirc it was done to conserve resources. However since then we got more free ci time and parallelism from github, so changing it should be fine.

@andreasabel
Copy link
Member

I agree. I generally use fail-fast: false because:

  • I want to see all problems CI can uncover as soon as possible.
  • Often jobs fail for random reasons (like some web resource not available, some ill-configured host machine) and then they drag other jobs into the abyss.

ulysses4ever added a commit that referenced this issue Aug 29, 2024
@ulysses4ever
Copy link
Collaborator

Thanks all! I submitted #10291 setting fail-fast: false on the main validate job. Please, review.

ulysses4ever added a commit that referenced this issue Aug 30, 2024
Which means that if a Windows job fails, all other jobs in the matrix
will be allowed to finish (other platforms, as well as other compilers on Windows, etc.)

Inspired by the discussion at #10263
Mikolaj pushed a commit that referenced this issue Sep 1, 2024
Which means that if a Windows job fails, all other jobs in the matrix
will be allowed to finish (other platforms, as well as other compilers on Windows, etc.)

Inspired by the discussion at #10263
mergify bot pushed a commit that referenced this issue Sep 14, 2024
Which means that if a Windows job fails, all other jobs in the matrix
will be allowed to finish (other platforms, as well as other compilers on Windows, etc.)

Inspired by the discussion at #10263

(cherry picked from commit f0e0985)
Mikolaj pushed a commit that referenced this issue Sep 14, 2024
Which means that if a Windows job fails, all other jobs in the matrix
will be allowed to finish (other platforms, as well as other compilers on Windows, etc.)

Inspired by the discussion at #10263

(cherry picked from commit f0e0985)
geekosaur pushed a commit that referenced this issue Sep 14, 2024
Which means that if a Windows job fails, all other jobs in the matrix
will be allowed to finish (other platforms, as well as other compilers on Windows, etc.)

Inspired by the discussion at #10263

(cherry picked from commit f0e0985)
Mikolaj pushed a commit that referenced this issue Sep 14, 2024
Which means that if a Windows job fails, all other jobs in the matrix
will be allowed to finish (other platforms, as well as other compilers on Windows, etc.)

Inspired by the discussion at #10263

(cherry picked from commit f0e0985)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
re: devx Improving the cabal developer experience (internal issue) type: enhancement
Projects
None yet
Development

No branches or pull requests

6 participants