Reproducing PR Testing Errors

Anderson Chauphan edited this page Oct 23, 2024 · 67 revisions

Note: These steps assume that the system you will be running on has been set up with the proper modules, meets the hardware requirements, and has access to the various git repositories needed for configuration.

Steps to reproduce a Trilinos_PR_* Pull Request Build

Prerequisites

You must have an account on cee-gitlab.sandia.gov. These instructions are only valid for SRN users.

Gather the necessary information

  1. Navigate to the most recent CDash link posted by the AutoTester:

    PullRequestComment
  2. On CDash, find the failing build and navigate to the files uploaded within the build by clicking the tan box:

    PullRequestFiles

    In particular, later on you'll need:

    • packageEnables.cmake
    • the GenConfig build name string stored in genconfig-build-name.txt

    Important: If you do not have access to CDash, please ask someone with access to gather the information you need.
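
Once genconfig-build-name.txt has been downloaded, the build name can be read into a shell variable instead of being copied by hand. The following is a minimal sketch, assuming the file holds only the build name string and that build names contain no whitespace. The function name read_build_name and the variable BUILD_NAME are illustrative and are not part of Trilinos or GenConfig.

```shell
# Hypothetical helper: print the build name stored in the file given as $1,
# with all whitespace (including the trailing newline) removed.
read_build_name() {
    tr -d '[:space:]' < "$1"
}

# Usage (assumes the file was saved to the current directory):
#   BUILD_NAME="$(read_build_name genconfig-build-name.txt)"
```

The resulting $BUILD_NAME can then stand in for <BUILD_NAME> in the gen-config.sh command later on.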

Set up the build environment

For the remaining steps, $TRILINOS_BUILD denotes the absolute path to your local Trilinos build directory, and $TRILINOS_SRC denotes the absolute path to your local Trilinos source directory. git 2.29.0 or later is required.
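
As a sketch of the setup described above, the two variables can be exported once per shell session and the git version checked before continuing. The default paths below are placeholders, not required locations, and the version_ge helper is hypothetical; it relies on sort -V, which is available in GNU coreutils and may be missing on other systems.

```shell
# Placeholder locations (assumptions); substitute your own absolute paths.
export TRILINOS_SRC="${TRILINOS_SRC:-$HOME/Trilinos}"
export TRILINOS_BUILD="${TRILINOS_BUILD:-$HOME/trilinos-build}"

# Hypothetical helper: succeeds if version $1 is at least version $2.
# Sorts the two versions and checks that $2 comes first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# "git version 2.39.3" -> "2.39.3"
git_version="$(git --version 2>/dev/null | awk '{print $3}')"
if version_ge "${git_version:-0}" "2.29.0"; then
    echo "git ${git_version} is new enough"
else
    echo "git ${git_version:-not found} is older than 2.29.0" >&2
fi
```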

  1. mkdir $TRILINOS_BUILD

  2. Download packageEnables.cmake (identified in step 2 of "Gather the necessary information") to $TRILINOS_BUILD

  3. Check out the correct commit of Trilinos:

    cd $TRILINOS_SRC
    git checkout --track origin/<BRANCH>
    git pull
    git merge origin/develop
    

    where <BRANCH> is the source branch of your GitHub PR.

  4. Fetch GenConfig and ini files.

    Important: Ensure your SSH public key is added to cee-gitlab.sandia.gov before running get_dependencies.sh.

    • To test on an SRN machine:
      cd $TRILINOS_SRC
      ./packages/framework/get_dependencies.sh --srn
      
  5. Pull in the SEMS V2 modules.

    source /projects/sems/modulefiles/utils/sems-modules-init.sh
    

    See the JIRA docs for how to do this.

  6. Set up the environment and configuration:

    cd $TRILINOS_BUILD
    source $TRILINOS_SRC/packages/framework/GenConfig/gen-config.sh --cmake-fragment gen-config.cmake <BUILD_NAME> $TRILINOS_SRC
    

    Note: If you encounter an error similar to "ERROR: Hostname 'ascicgpu032' matched to system 'rhel7'", add the --force option to the gen-config.sh command above.

    <BUILD_NAME> is the GenConfig build name string gathered in step 2 of "Gather the necessary information". It can also be found at https://github.com/trilinos/Trilinos/wiki/Pull-Request-Testing-Interface or in the pull-down menu of the failing PR:

    how-to-find-build-name

    Important: If you're encountering a configuration failure, please go to "Update the Trilinos source code" below.

  7. Enable the correct Trilinos packages:

    cd $TRILINOS_BUILD
    rm -rf CMake* && cmake -G Ninja -C packageEnables.cmake -C gen-config.cmake $TRILINOS_SRC
    

    Note: See step 2 of this section for how to acquire packageEnables.cmake.

    In some cases, the CMake fragment file is passed with -DTrilinos_CONFIGURE_OPTIONS_FILE instead of -C. This enables testing of the 'reduced tarball' generation capability in Trilinos; see https://github.com/trilinos/Trilinos/issues/12024 for more details. To find out which form a given build used, check the 'Configure Command' reported to CDash for that build. The -D form looks like this:

    cd $TRILINOS_BUILD
    rm -rf CMake* && cmake -G Ninja -C packageEnables.cmake -D Trilinos_CONFIGURE_OPTIONS_FILE=gen-config.cmake $TRILINOS_SRC
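
The two configure variants in the last step differ only in how gen-config.cmake is passed. The following sketch makes that choice explicit. The function name configure_command and the mode names "cache" and "options-file" are hypothetical; the cmake arguments themselves are copied from the commands above.

```shell
# Hypothetical helper: print the configure command for one of the two variants.
#   $1 = "cache"        -> gen-config.cmake is passed with -C
#   $1 = "options-file" -> gen-config.cmake is passed with
#                          -D Trilinos_CONFIGURE_OPTIONS_FILE
#   $2 = path to the Trilinos source directory
configure_command() {
    case "$1" in
        cache)
            echo "cmake -G Ninja -C packageEnables.cmake -C gen-config.cmake $2"
            ;;
        options-file)
            echo "cmake -G Ninja -C packageEnables.cmake -D Trilinos_CONFIGURE_OPTIONS_FILE=gen-config.cmake $2"
            ;;
        *)
            echo "unknown mode: $1" >&2
            return 1
            ;;
    esac
}
```

Running configure_command options-file "$TRILINOS_SRC" prints the command so it can be compared against the 'Configure Command' shown on CDash before being run from $TRILINOS_BUILD.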
    

Build the binaries and reproduce the test failures

  1. Build and test Trilinos:

    ninja -j <X>
    ctest -R <TEST_NAME>

    where <X> is the number of parallel build jobs and <TEST_NAME> is a regular expression capturing the tests you're interested in reproducing.

  2. Verify that the test is failing the same way as reported via CDash.
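
The build and test commands above can be wrapped so that the job count is picked automatically and the output of failing tests is printed for comparison with CDash. This is a sketch only: default_jobs and build_and_test are hypothetical names, and using every core reported by nproc is an assumption that may not suit a shared machine.

```shell
# Hypothetical helper: pick a job count for ninja.
# Uses nproc when it is available, otherwise falls back to 1.
default_jobs() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        echo 1
    fi
}

# Hypothetical wrapper: build, then run only the tests matching the regex in $1.
# Call it from $TRILINOS_BUILD after a successful configure.
build_and_test() {
    ninja -j "$(default_jobs)" || return 1
    # --output-on-failure prints the output of each failing test,
    # which makes it easier to compare against the CDash report.
    ctest -R "$1" --output-on-failure
}
```

For example, build_and_test '<TEST_NAME>' builds and then runs the matching tests, with <TEST_NAME> replaced by your own regular expression.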

Update the Trilinos source code

  1. Update the Trilinos source code and repeat the build and test steps above (or the configure step, if you're encountering a configuration failure) until you're satisfied with the results.
  2. Commit your changes and push them to <BRANCH> so the AutoTester can confirm that you have fixed the pull request build.