Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[UR][CI] add manually triggered benchmark action #17088

Merged
merged 1 commit into from
Feb 21, 2025

Conversation

pbalcer
Copy link
Contributor

@pbalcer pbalcer commented Feb 20, 2025

This is a first step towards reenabling UR performance testing CI. This introduces the reusable yml workflow and a way to trigger it manually.

Here's an example how it looks:
pbalcer#2 (comment)

@pbalcer
Copy link
Contributor Author

pbalcer commented Feb 20, 2025

ping @ianayl

@lukaszstolarczuk
Copy link
Contributor

also, FYI, python code formatting failed

@pbalcer
Copy link
Contributor Author

pbalcer commented Feb 20, 2025

This should be ready to review, but we still don't have the runner configured.

@lukaszstolarczuk
Copy link
Contributor

LGTM (I can't close my issues)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The naming of this file could conform better to what's currently in .github/workflows: how about something like ur-benchmarks-reusable.yml for now, and then it can be mulled upon later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do that. It's not just UR though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, renamed them to ur-benchmarks-reusable.yml.

env:
PR_NO: ${{ inputs.pr_no }}
run: |
git fetch -- https://github.com/${{github.repository}} +refs/pull/${PR_NO}/*:refs/remotes/origin/pr/${PR_NO}/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
git fetch -- https://github.com/${{github.repository}} +refs/pull/${PR_NO}/*:refs/remotes/origin/pr/${PR_NO}/*
git fetch -- https://github.com/intel/llvm "+refs/pull/${PR_NO}/*:refs/remotes/origin/pr/${PR_NO}/*"

Two quick things:

  • Shell variables being used should always be quoted to avoid code injection
  • NIT: Honestly this is way paranoid, but the "best" known security practice right now is to not trust the github context, as it's been used in code injection before. I'll leave this up to your discretion though.

Copy link
Contributor Author

@pbalcer pbalcer Feb 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shell variables being used should always be quoted to avoid code injection

Fixed.

PR_NO: ${{ inputs.pr_no }}
run: |
git fetch -- https://github.com/${{github.repository}} +refs/pull/${PR_NO}/*:refs/remotes/origin/pr/${PR_NO}/*
git checkout origin/pr/${PR_NO}/merge
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
git checkout origin/pr/${PR_NO}/merge
git checkout "origin/pr/${PR_NO}/merge"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the PR_NO env var.

run: |
git fetch -- https://github.com/${{github.repository}} +refs/pull/${PR_NO}/*:refs/remotes/origin/pr/${PR_NO}/*
git checkout origin/pr/${PR_NO}/merge
git rev-parse origin/pr/${PR_NO}/merge
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
git rev-parse origin/pr/${PR_NO}/merge
git rev-parse "origin/pr/${PR_NO}/merge"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the PR_NO env var.

${{matrix.adapter.sycl_config}}

- name: Build SYCL
run: cmake --build ${{github.workspace}}/sycl_build -j $(nproc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All workloads in intel/llvm use https://github.com/intel/llvm/blob/sycl/.github/workflows/sycl-linux-build.yml for building, which produces an artifact that is then accepted by multiple different workflows in intel/llvm.

I think the goal'd be to get this workflow up as fast as possible, so for the time being this is probably fine. However, we'll want this changed to use sycl-linux-build.yml eventually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, let's leave that as TODO.

run: |
# Compute the core range for the first NUMA node; second node is for UMF jobs.
# Skip the first 4 cores - the kernel is likely to schedule more work on these.
CORES=$(lscpu | awk '
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
CORES=$(lscpu | awk '
CORES="$(lscpu | awk '

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

split(a[4], b, ",")
sub(/^0/, "4", b[1])
print b[1]
}')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
}')
}')"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

working-directory: ${{ github.workspace }}
id: benchmarks
run: >
taskset -c ${{ env.CORES }} ${{ github.workspace }}/sycl-repo/unified-runtime/scripts/benchmarks/main.py
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
taskset -c ${{ env.CORES }} ${{ github.workspace }}/sycl-repo/unified-runtime/scripts/benchmarks/main.py
taskset -c "${{ env.CORES }}" ${{ github.workspace }}/sycl-repo/unified-runtime/scripts/benchmarks/main.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@ianayl
Copy link
Contributor

ianayl commented Feb 20, 2025

Unfortunately I'm not in dpcpp-devops-reviewers, so final say is with them instead. But:

  • I see you've your own llvm branch for testing. I think it makes sense but I'm not sure if that's enough: We may want to see testing triggered from intel/llvm as well, in which you'll have to string this up to another existing workflow. This is up to dpcpp-devops-reviewers though however.
  • Is the runner you wanted UR_DNP_INTEL_05_01? Or is it another machine?

@pbalcer
Copy link
Contributor Author

pbalcer commented Feb 20, 2025

  • I see you've your own llvm branch for testing. I think it makes sense but I'm not sure if that's enough: We may want to see testing triggered from intel/llvm as well, in which you'll have to string this up to another existing workflow. This is up to dpcpp-devops-reviewers though however.

I don't believe I have write access to intel/llvm to push the workflow on a branch here.

Is the runner you wanted UR_DNP_INTEL_05_01? Or is it another machine?

It's UR_DNP_INTEL_06_01 I think? I talked with @lukaszstolarczuk and he will setup a runner with PVC_PERF label that matches what's in this PR.

@lukaszstolarczuk
Copy link
Contributor

FYI, so the new runner should be up now (it's actually called UR_DNP_INTEL_06_03).

@pbalcer
Copy link
Contributor Author

pbalcer commented Feb 20, 2025

FYI, so the new runner should be up now (it's actually called UR_DNP_INTEL_06_03).

Thanks!

@pbalcer pbalcer marked this pull request as ready for review February 20, 2025 17:28
@pbalcer pbalcer requested review from a team as code owners February 20, 2025 17:28
@pbalcer
Copy link
Contributor Author

pbalcer commented Feb 20, 2025

@intel/unified-runtime-reviewers @intel/dpcpp-devops-reviewers please review. This is just porting an existing workflow from UR. It's not fully conformant with how other intel/llvm workflows are written, and we plan on addressing that after a merge.

Example comment is here: pbalcer#2 (comment)
Logs here: https://github.com/pbalcer/llvm/actions/runs/13440448856/job/37553283023

From what I understand, it's not possible to test workflow_dispatch directly in intel/llvm prior to a merge. Is my testing on the fork enough or is there anything specific you'd like me to do?

Copy link
Contributor

@kbenzie kbenzie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UR LGTM

@aelovikov-intel
Copy link
Contributor

From what I understand, it's not possible to test workflow_dispatch directly in intel/llvm prior to a merge. Is my testing on the fork enough or is there anything specific you'd like me to do?

Correct. No, your testing should be enough.

Copy link
Contributor

@aelovikov-intel aelovikov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like that it still feels as if SYCL and UR are two completely different projects. Can't we unify the build here and re-use normal artifacts? Why does it have to be different from how we run SYCL E2E tests?

It's not fully conformant with how other intel/llvm workflows are written, and we plan on addressing that after a merge.

The merge has happened, hasn't it?

@pbalcer
Copy link
Contributor Author

pbalcer commented Feb 20, 2025

I don't like that it still feels as if SYCL and UR are two completely different projects. Can't we unify the build here and re-use normal artifacts? Why does it have to be different from how we run SYCL E2E tests?

This is the quickest way of getting the perf testing functionality back. Right now we have no way to test performance changes in the adapters.
I agree that it's not ideal, and we are committed to improving these workflows iteratively so that they match how the rest of intel/llvm workflows work.

It's not fully conformant with how other intel/llvm workflows are written, and we plan on addressing that after a merge.

The merge has happened, hasn't it?

I was referring to a merge of this patch.

@aelovikov-intel
Copy link
Contributor

Still a draft because there's no runner in intel/llvm (WIP).

Please fix PR descrption.

Comment on lines +178 to +183
github.rest.issues.createComment({
issue_number: pr_no,
owner: context.repo.owner,
repo: context.repo.repo,
body: body
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future, do we want it to be a comment or a summary of this job (like, e.g., https://github.com/intel/llvm/actions/runs/13438348841)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I think a comment is more visible. Right now we do not fail a job if the scripts think there's a regression, so if the status is hidden in the job logs, people might forget to check.

In the near future we also plan on uploading a set of html charts per PR (basically this: https://oneapi-src.github.io/unified-runtime/benchmark_results.html but with a PR marked on the chart, so you can easily compare against previous nightly runs). Again, I think people are more likely to use this functionality if it's right there in the comment.

@pbalcer
Copy link
Contributor Author

pbalcer commented Feb 20, 2025

Still a draft because there's no runner in intel/llvm (WIP).

Please fix PR descrption.

Done.

This is a first step towards reenabling UR performance testing CI. This
introduces the reusable yml workflow and a way to trigger it manually.
@pbalcer
Copy link
Contributor Author

pbalcer commented Feb 21, 2025

@intel/llvm-gatekeepers please merge.
The CI failure is unrelated (this PR doesn't change any runtime code) and is failing in other PRs as well (e.g., https://github.com/intel/llvm/actions/runs/13441609022/job/37557207692?pr=17101).

@uditagarwal97
Copy link
Contributor

Failed Tests (1):
  SYCL :: e2e_test_requirements/no-unsupported-without-info.cpp

was fixed in e4d65e0

@uditagarwal97 uditagarwal97 merged commit 770afbf into intel:sycl Feb 21, 2025
28 of 29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants