This is used to track any maintenance information for PNNL CI. We will also track any current TODOs/notes for developers.
We currently have only 2 k8s runners, so at most 2 pipelines can run concurrently.
Since we added resource groups to control how many pipelines can run at once, we will use 2 runners in a given pipeline stage.
If another pipeline is running, you will have to wait for it to finish before yours can start.
CI runs on a variety of hardware architectures by using a perl script when the job is run to select the right partition:
The possible GPU architectures are:
GPU Type | CUDA Architecture | SLURM partition |
---|---|---|
A100 | 80 | a100/a100_80 _shared |
P100 | 60 | dl/dl_shared |
V100 | 70 | dlv/dlv_shared |
RTX 2080 Ti | 75 | dlt/dlt_shared |
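
The table above can also be read as a lookup from SLURM partition to CUDA architecture. The following is only a minimal sketch of that lookup in Python; the CI itself uses a perl script, and the names `PARTITION_CUDA_ARCH` and `cuda_arch_for_partition` are hypothetical. Only the first A100 partition name from the table is included.

```python
# Illustrative lookup only: the real CI selects partitions with a perl script.
# Partition names and CUDA architectures are copied from the table above.
# Only the first A100 partition name ("a100") is included here.
PARTITION_CUDA_ARCH = {
    "a100": 80,
    "dl": 60,
    "dl_shared": 60,
    "dlv": 70,
    "dlv_shared": 70,
    "dlt": 75,
    "dlt_shared": 75,
}


def cuda_arch_for_partition(partition: str) -> int:
    """Return the CUDA architecture number for a SLURM partition name."""
    try:
        return PARTITION_CUDA_ARCH[partition]
    except KeyError:
        raise ValueError(f"unknown partition: {partition!r}") from None


if __name__ == "__main__":
    # dl_shared maps to the P100 row of the table.
    print(cuda_arch_for_partition("dl_shared"))
```

Keeping the mapping in one place means a job only needs its partition name to find the matching CUDA architecture.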
Only the HPC runners are currently configured to run, so all stages share that base configuration.
You can either add `[haero-rebuild]` or `[rebuild-haero]` directly to your commit message, or go to https://code.pnnl.gov/e3sm/eagles/mam4xx and trigger the rebuild pipeline manually once you have pushed to your branch.
Make sure you either push to GitHub and wait for the mirror to update first, or push to GitLab directly.
PNNL CI will only run when you are adding new commits to an existing merge request.
You can add `[skip-ci]` to your commit message to prevent CI jobs from running at PNNL. A remaining TODO is adding support for skipping CI when certain tags are present in a PR.
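
As a rough illustration of how the commit message tags above could be interpreted, here is a small Python sketch. It is not the logic used in the pipeline YAML; the function name `ci_action` and the rule that `[skip-ci]` wins over a rebuild tag are assumptions made for this example.

```python
# Hypothetical helper: classifies a commit message by the tags described above.
REBUILD_TAGS = ("[haero-rebuild]", "[rebuild-haero]")
SKIP_TAG = "[skip-ci]"


def ci_action(commit_message: str) -> str:
    """Return "skip", "rebuild-haero", or "default" for a commit message."""
    # Assumption: a skip tag takes precedence over a rebuild tag.
    if SKIP_TAG in commit_message:
        return "skip"
    if any(tag in commit_message for tag in REBUILD_TAGS):
        return "rebuild-haero"
    return "default"


if __name__ == "__main__":
    print(ci_action("Fix nucleation test [rebuild-haero]"))
```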
- Port pipeline to AMD architectures
- Consider cleaning up old installations and adding permissions changes so all users can use shared installation
- Add support for a variety of partitions on Deception. We currently only target dl_shared as we can only choose one CUDA arch
- Add way to skip CI using a GitHub tag in both GitLab and GitHub
- Run CI based on commit message or manual trigger
- Get mam4xx building for GPU locally, then get working in CI
- Only run 2 jobs at a time as we only have 2 runners
- Only run PNNL CI in PRs
- Refactor CI YAML to remove duplication across scripts
- Support full matrix of build types (single, double etc.)
- Rebuild HAERO in manual pipeline
- Add CMake / ctest configuration in CI
- Add way to skip CI using a commit message
- Add support for cloning with ssh in CI, with documentation
- Build HAERO without cloning mam4xx in CI step
- Ensure that pipelines are not false positive/negative
- Streamline CI rebuilding of HAERO to happen with one button (need to work around 2 max job limit)
- Use installed HAERO in project share to avoid re-building each time
@cameronrutherford initially maintained the access token used to enable GitHub mirroring. @jaelynlitz is the current maintainer.
Each token is set to expire after one year of use, and they will need to be regenerated each year to maintain integration.
Reference: https://code.pnnl.gov/help/user/project/repository/mirror/pull.md
We have manually configured PNNL CI to point to the YAML file in `/.github/pnnl-ci/pnnl.gitlab-ci.yml`. Make sure to re-configure this if you need to re-configure the repository.
We are going to set up a push mirror that is updated with each pull request update. Through a GitHub action, each check will:
- Push updates to all branches in the GitLab
- Trigger a pipeline to run using the Pipeline Trigger token
- Have PNNL GitLab post a new message describing pipeline status in a separate check
In order to set this up:
- Create an empty project in GitLab. DO NOT initialize using the built-in GitHub integration, as this is broken for running pipelines.
- Enable the GitHub integration in Settings > Integrations in GitLab. This will post pipeline status to the relevant Pull Requests. You will also need to add a personal access token here.
- Ensure your YAML has correct syntax, and you should be good to go!
Since the pipeline status is automatically configured through GitLab premium + GitHub integration, pipeline status will automatically be posted to commits/PRs.
There is a way to orchestrate this pipeline posting through non-premium GitLab as well - https://ecp-ci.gitlab.io/docs/guides/build-status-gitlab.html...
A Personal Access Token is needed to enable GitHub/GitLab integration.
@jaelynlitz currently holds the PAT, set to expire Jan 2026. We followed the instructions here under "Connect manually". In summary:
- in GitHub, generate a Personal Access Token (classic) with permissions `repo` and `admin:repo_hook`
- have Owner permissions in GitLab
- in GitLab, go to Settings > Integrations > GitHub - paste your PAT in the new token field, test settings to see if connection is successful, then save changes.
The GitHub action in `/.github/workflows/pnnl_push_mirror.yml` relies on the following GitHub secrets. Make sure to configure these if they are expired/broken:
- `GITLAB_ACCESS_TOKEN`: This is the Project Access Token configured with write permissions for the push mirror action [generated in GitLab and pasted as a secret in GitHub]
- `GITLAB_PIPELINE_TRIGGER_TOKEN`: This is a separate token that allows you to use the pipeline trigger API [generated in GitLab and pasted as a secret in GitHub]
- `GITLAB_REPO_URL`: The same url that one would use for adding mam4xx GitLab as a remote w/ https connection
- `GITLAB_USER`: The username to associate with push mirror actions (can be any valid user)
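
When the mirror action fails, a first check is whether all four secrets are actually set. The sketch below shows one way to do that in Python, assuming the secrets are exposed to the job as environment variables of the same names; the helper `missing_secrets` is hypothetical and is not part of the workflow.

```python
import os

# Secret names taken from the list above.
REQUIRED_SECRETS = (
    "GITLAB_ACCESS_TOKEN",
    "GITLAB_PIPELINE_TRIGGER_TOKEN",
    "GITLAB_REPO_URL",
    "GITLAB_USER",
)


def missing_secrets(env=None):
    """Return the required names that are unset or empty in `env`."""
    # Assumption: secrets are passed in as environment variables.
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_secrets()
    if missing:
        print("Missing secrets: " + ", ".join(missing))
    else:
        print("All secrets are set.")
```

This only detects absent values; an expired token still has to be regenerated in GitLab as described in the following sections.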
This is a Project Access Token generated in GitLab here.
It needs to have the Developer role and `write_repository` permissions.
This also creates a bot user that can run the jobs. You can set the variable `GITLAB_USER` to be the name of the token. It is currently `githubsync`.
Once you have generated this in GitLab, paste the token as a secret in the GitHub variable `GITLAB_ACCESS_TOKEN` here.
You will need to set up a pipeline trigger token in order to allow GitHub Actions to trigger CI pipelines. This is a pipeline trigger token generated in GitLab under CI/CD in Settings.
Once generated by clicking "Add new token", paste the token as a secret in the GitHub variable `GITLAB_PIPELINE_TRIGGER_TOKEN` here.
Note: The curl syntax in `pnnl_push_mirror.yml` is given in the GitLab dropdown "View trigger token usage examples" after generating a pipeline trigger token.
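
For readers more comfortable with Python than curl, the trigger call can be sketched as below. This is an illustration and not the contents of `pnnl_push_mirror.yml`: it builds (but does not send) a POST to GitLab's pipeline trigger endpoint, `/api/v4/projects/<id>/trigger/pipeline`, with `token` and `ref` form fields. The project ID `123` and the token value are placeholders, and `build_trigger_request` is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_trigger_request(host: str, project_id: int, token: str, ref: str) -> Request:
    """Build a POST request for GitLab's pipeline trigger API without sending it."""
    url = f"https://{host}/api/v4/projects/{project_id}/trigger/pipeline"
    # The trigger API accepts the token and the branch/tag ref as form fields.
    body = urlencode({"token": token, "ref": ref}).encode("ascii")
    # Supplying a body makes urllib issue a POST.
    return Request(url, data=body)


if __name__ == "__main__":
    # Placeholder values: substitute the real project ID, token, and branch.
    request = build_trigger_request("code.pnnl.gov", 123, "TRIGGER_TOKEN", "main")
    print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would actually trigger a pipeline, so the sketch stops short of that.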
This value is the url given when you want to clone the GitLab repository via https. Paste the value as a secret in the GitHub variable `GITLAB_REPO_URL`.
This value can be any valid GitLab username. Paste the value as a secret in the GitHub variable `GITLAB_USER`.
There are shared environment variables that are propagated across both scripts, and each job shares the same template in order to reduce code duplication.
The shared variables are:
- `HAERO_INSTALL` - specifying where haero is/should be installed
- `BUILD_TYPE` - Debug/Release
- `PRECISION` - Single/Double, only applies to haero build stage
Used to build and test mam4xx in CI using HAERO installed in project share.
Similar to the `rebuild-haero.sh` script, since we are building in CI, SSH submodules will not suffice. As such, this script clones the validation repo manually after applying a perl script to the `.gitmodules` file.
Used to re-configure HAERO in project share, along with configuring permissions so other users can configure with shared installation.
Since we are installing in GitLab pipelines, we are unable to clone with SSH. This requirement resulted in a separate script for CI, where HTTPS is used for submodules instead of SSH.
It does this by finding and replacing, in each repository's `.gitmodules` file where relevant, `git@...:` with `https://.../`.
Additionally, `SYSTEM_NAME` is configured on PNNL login nodes, but its value proves unhelpful when running in a job. As such, we export `SYSTEM_NAME=deception` in this script before running.
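
The `.gitmodules` rewrite described above (SSH URLs replaced by HTTPS URLs) is done by the scripts themselves. As an illustration only, the same substitution can be sketched in Python; the function `ssh_to_https` and the example submodule URL are hypothetical.

```python
import re

# Matches the "git@<host>:" prefix of an SSH-style clone URL.
_SSH_PREFIX = re.compile(r"git@([^:\s]+):")


def ssh_to_https(gitmodules_text: str) -> str:
    """Rewrite git@host:path URLs as https://host/path in .gitmodules text."""
    return _SSH_PREFIX.sub(r"https://\1/", gitmodules_text)


if __name__ == "__main__":
    # Hypothetical .gitmodules content used only for demonstration.
    example = (
        '[submodule "validation"]\n'
        "\tpath = validation\n"
        "\turl = git@github.com:example-org/example-repo.git\n"
    )
    print(ssh_to_https(example))
```

Only the URL prefix changes, so submodule names and paths are left as they are.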