joint_matrix tests failing on BMG #16922

Open
sarnex opened this issue Feb 7, 2025 · 0 comments
Labels
bug Something isn't working confirmed

sarnex commented Feb 7, 2025

Describe the bug

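The following joint_matrix e2e tests fail on BMG. Each one aborts (exit code -6) on a failed `matrix_compare` assertion after reporting an incorrect result:

SYCL :: Matrix/joint_matrix_bf16_fill_k_cache_arg_dim.cpp
SYCL :: Matrix/joint_matrix_bf16_fill_k_cache_runtime_dim.cpp
SYCL :: Matrix/joint_matrix_out_bounds.cpp

CI log:
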
2025-02-06T23:28:25.2775472Z ********************
2025-02-06T23:28:25.2775632Z FAIL: SYCL :: Matrix/joint_matrix_bf16_fill_k_cache_arg_dim.cpp (1646 of 2274)
2025-02-06T23:28:25.2775793Z ******************** TEST 'SYCL :: Matrix/joint_matrix_bf16_fill_k_cache_arg_dim.cpp' FAILED ********************
2025-02-06T23:28:25.2775878Z Exit Code: -6
2025-02-06T23:28:25.2775881Z 
2025-02-06T23:28:25.2776038Z Command Output (stdout):
2025-02-06T23:28:25.2776133Z --
2025-02-06T23:28:25.2776202Z # RUN: at line 10
2025-02-06T23:28:25.2776836Z /__w/llvm/llvm/toolchain/bin//clang++  -Werror -I /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/Inputs  -fsycl -fsycl-targets=spir64  /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/joint_matrix_bf16_fill_k_cache_arg_dim.cpp -o /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bf16_fill_k_cache_arg_dim.cpp.tmp_arg_dim_vnni.out -ffp-model=precise -DARG_DIM -DVNNI
2025-02-06T23:28:25.2777499Z # executed command: /__w/llvm/llvm/toolchain/bin//clang++ -Werror -I /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/Inputs -fsycl -fsycl-targets=spir64 /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/joint_matrix_bf16_fill_k_cache_arg_dim.cpp -o /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bf16_fill_k_cache_arg_dim.cpp.tmp_arg_dim_vnni.out -ffp-model=precise -DARG_DIM -DVNNI
2025-02-06T23:28:25.2777626Z # note: command had no output on stdout or stderr
2025-02-06T23:28:25.2777719Z # RUN: at line 11
2025-02-06T23:28:25.2778023Z env ONEAPI_DEVICE_SELECTOR=level_zero:gpu  /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bf16_fill_k_cache_arg_dim.cpp.tmp_arg_dim_vnni.out
2025-02-06T23:28:25.2778304Z # executed command: env ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bf16_fill_k_cache_arg_dim.cpp.tmp_arg_dim_vnni.out
2025-02-06T23:28:25.2778405Z # .---command stdout------------
2025-02-06T23:28:25.2778476Z # | Testing: 8 x 16 x 16 [TM x TN x TK]
2025-02-06T23:28:25.2778562Z # | DONE for size 256
2025-02-06T23:28:25.2778668Z # | GOPS is 3767.83 Gop/s
2025-02-06T23:28:25.2778749Z # | Testing: 16 x 16 x 16 [TM x TN x TK]
2025-02-06T23:28:25.2778842Z # | DONE for size 256
2025-02-06T23:28:25.2778915Z # | GOPS is 3390.06 Gop/s
2025-02-06T23:28:25.2778984Z # | Testing: 32 x 64 x 16 [TM x TN x TK]
2025-02-06T23:28:25.2779110Z # `-----------------------------
2025-02-06T23:28:25.2779186Z # .---command stderr------------
2025-02-06T23:28:25.2779362Z # | Incorrect result in matrix. i: 0, j: 16, Ref: 1.86786, Val: 0.864229, Diff: 1.00363, Epsilon: 0.1
2025-02-06T23:28:25.2780015Z # | joint_matrix_bf16_fill_k_cache_arg_dim.cpp.tmp_arg_dim_vnni.out: /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/Inputs/joint_matrix_bf16_fill_k_cache_impl.hpp:416: void test(size_t) [T = sycl::ext::oneapi::bfloat16, TResult = float, vnniFactor = 2UL, TM = 32UL, TN = 64UL, TK = 16UL, MCache1 = 32UL, NCache1 = 64UL, KCache1 = 16UL, MCache2 = 256UL, NCache2 = 256UL, KCache2 = 32UL]: Assertion `matrix_compare(matrix_size, matrix_size, C, refC)' failed.
2025-02-06T23:28:25.2780265Z # | Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
2025-02-06T23:28:25.2780385Z # | 0  libsycl.so.8                                                    0x0000742ceb946202
2025-02-06T23:28:25.2780517Z # | 1  libsycl.so.8                                                    0x0000742ceb9436c6
2025-02-06T23:28:25.2780652Z # | 2  libc.so.6                                                       0x0000742ceb433320
2025-02-06T23:28:25.2780785Z # | 3  libc.so.6                                                       0x0000742ceb48cb1c pthread_kill + 284
2025-02-06T23:28:25.2780904Z # | 4  libc.so.6                                                       0x0000742ceb43326e gsignal + 30
2025-02-06T23:28:25.2781028Z # | 5  libc.so.6                                                       0x0000742ceb4168ff abort + 223
2025-02-06T23:28:25.2781133Z # | 6  libc.so.6                                                       0x0000742ceb41681b
2025-02-06T23:28:25.2781227Z # | 7  libc.so.6                                                       0x0000742ceb429507
2025-02-06T23:28:25.2781386Z # | 8  joint_matrix_bf16_fill_k_cache_arg_dim.cpp.tmp_arg_dim_vnni.out 0x0000000000405bbd
2025-02-06T23:28:25.2781522Z # | 9  joint_matrix_bf16_fill_k_cache_arg_dim.cpp.tmp_arg_dim_vnni.out 0x0000000000403970
2025-02-06T23:28:25.2781633Z # | 10 libc.so.6                                                       0x0000742ceb4181ca
2025-02-06T23:28:25.2781754Z # | 11 libc.so.6                                                       0x0000742ceb41828b __libc_start_main + 139
2025-02-06T23:28:25.2781919Z # | 12 joint_matrix_bf16_fill_k_cache_arg_dim.cpp.tmp_arg_dim_vnni.out 0x00000000004035e5
2025-02-06T23:28:25.2781993Z # `-----------------------------
2025-02-06T23:28:25.2782102Z # error: command failed with exit status: -6
2025-02-06T23:28:25.2782105Z 
2025-02-06T23:28:25.2782183Z --
2025-02-06T23:28:25.2782185Z 
2025-02-06T23:28:25.2782243Z ********************
2025-02-06T23:28:25.2782422Z FAIL: SYCL :: Matrix/joint_matrix_bf16_fill_k_cache_runtime_dim.cpp (1651 of 2274)
2025-02-06T23:28:25.2782584Z ******************** TEST 'SYCL :: Matrix/joint_matrix_bf16_fill_k_cache_runtime_dim.cpp' FAILED ********************
2025-02-06T23:28:25.2782711Z Exit Code: -6
2025-02-06T23:28:25.2782714Z 
2025-02-06T23:28:25.2782791Z Command Output (stdout):
2025-02-06T23:28:25.2782867Z --
2025-02-06T23:28:25.2782945Z # RUN: at line 10
2025-02-06T23:28:25.2783590Z /__w/llvm/llvm/toolchain/bin//clang++  -Werror -I /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/Inputs  -fsycl -fsycl-targets=spir64  /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/joint_matrix_bf16_fill_k_cache_runtime_dim.cpp -o /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bf16_fill_k_cache_runtime_dim.cpp.tmp_runtime_dim_vnni.out -ffp-model=precise -DRUNTIME_DIM -DVNNI
2025-02-06T23:28:25.2784277Z # executed command: /__w/llvm/llvm/toolchain/bin//clang++ -Werror -I /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/Inputs -fsycl -fsycl-targets=spir64 /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/joint_matrix_bf16_fill_k_cache_runtime_dim.cpp -o /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bf16_fill_k_cache_runtime_dim.cpp.tmp_runtime_dim_vnni.out -ffp-model=precise -DRUNTIME_DIM -DVNNI
2025-02-06T23:28:25.2784407Z # note: command had no output on stdout or stderr
2025-02-06T23:28:25.2784478Z # RUN: at line 11
2025-02-06T23:28:25.2784737Z env ONEAPI_DEVICE_SELECTOR=level_zero:gpu  /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bf16_fill_k_cache_runtime_dim.cpp.tmp_runtime_dim_vnni.out 256
2025-02-06T23:28:25.2785073Z # executed command: env ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_bf16_fill_k_cache_runtime_dim.cpp.tmp_runtime_dim_vnni.out 256
2025-02-06T23:28:25.2785143Z # .---command stdout------------
2025-02-06T23:28:25.2792145Z # | Testing: 8 x 16 x 16 [TM x TN x TK]
2025-02-06T23:28:25.2792211Z # | DONE for size 256
2025-02-06T23:28:25.2792274Z # | GOPS is 4236.51 Gop/s
2025-02-06T23:28:25.2792336Z # | Testing: 16 x 16 x 16 [TM x TN x TK]
2025-02-06T23:28:25.2792395Z # | DONE for size 256
2025-02-06T23:28:25.2792451Z # | GOPS is 3029.42 Gop/s
2025-02-06T23:28:25.2792511Z # | Testing: 32 x 64 x 16 [TM x TN x TK]
2025-02-06T23:28:25.2792581Z # `-----------------------------
2025-02-06T23:28:25.2792643Z # .---command stderr------------
2025-02-06T23:28:25.2792867Z # | Incorrect result in matrix. i: 0, j: 16, Ref: 0.132577, Val: 13.537, Diff: 13.4044, Epsilon: 0.1
2025-02-06T23:28:25.2793554Z # | joint_matrix_bf16_fill_k_cache_runtime_dim.cpp.tmp_runtime_dim_vnni.out: /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/Inputs/joint_matrix_bf16_fill_k_cache_impl.hpp:416: void test(size_t) [T = sycl::ext::oneapi::bfloat16, TResult = float, vnniFactor = 2UL, TM = 32UL, TN = 64UL, TK = 16UL, MCache1 = 32UL, NCache1 = 64UL, KCache1 = 16UL, MCache2 = 256UL, NCache2 = 256UL, KCache2 = 32UL]: Assertion `matrix_compare(matrix_size, matrix_size, C, refC)' failed.
2025-02-06T23:28:25.2793787Z # | Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
2025-02-06T23:28:25.2793882Z # | 0  libsycl.so.8                                                            0x000079dbe6d46202
2025-02-06T23:28:25.2793976Z # | 1  libsycl.so.8                                                            0x000079dbe6d436c6
2025-02-06T23:28:25.2794059Z # | 2  libc.so.6                                                               0x000079dbe6833320
2025-02-06T23:28:25.2794154Z # | 3  libc.so.6                                                               0x000079dbe688cb1c pthread_kill + 284
2025-02-06T23:28:25.2794243Z # | 4  libc.so.6                                                               0x000079dbe683326e gsignal + 30
2025-02-06T23:28:25.2794328Z # | 5  libc.so.6                                                               0x000079dbe68168ff abort + 223
2025-02-06T23:28:25.2794410Z # | 6  libc.so.6                                                               0x000079dbe681681b
2025-02-06T23:28:25.2794488Z # | 7  libc.so.6                                                               0x000079dbe6829507
2025-02-06T23:28:25.2794666Z # | 8  joint_matrix_bf16_fill_k_cache_runtime_dim.cpp.tmp_runtime_dim_vnni.out 0x0000000000406485
2025-02-06T23:28:25.2794800Z # | 9  joint_matrix_bf16_fill_k_cache_runtime_dim.cpp.tmp_runtime_dim_vnni.out 0x0000000000403abb
2025-02-06T23:28:25.2794884Z # | 10 libc.so.6                                                               0x000079dbe68181ca
2025-02-06T23:28:25.2794980Z # | 11 libc.so.6                                                               0x000079dbe681828b __libc_start_main + 139
2025-02-06T23:28:25.2795111Z # | 12 joint_matrix_bf16_fill_k_cache_runtime_dim.cpp.tmp_runtime_dim_vnni.out 0x0000000000403635
2025-02-06T23:28:25.2795171Z # `-----------------------------
2025-02-06T23:28:25.2795245Z # error: command failed with exit status: -6
2025-02-06T23:28:25.2795249Z 
2025-02-06T23:28:25.2795300Z --
2025-02-06T23:28:25.2795302Z 
2025-02-06T23:28:25.2795352Z ********************
2025-02-06T23:28:25.2795458Z FAIL: SYCL :: Matrix/joint_matrix_out_bounds.cpp (1796 of 2274)
2025-02-06T23:28:25.2795579Z ******************** TEST 'SYCL :: Matrix/joint_matrix_out_bounds.cpp' FAILED ********************
2025-02-06T23:28:25.2795632Z Exit Code: -6
2025-02-06T23:28:25.2795640Z 
2025-02-06T23:28:25.2795698Z Command Output (stdout):
2025-02-06T23:28:25.2795747Z --
2025-02-06T23:28:25.2795802Z # RUN: at line 13
2025-02-06T23:28:25.2796289Z /__w/llvm/llvm/toolchain/bin//clang++  -Werror -I /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/Inputs  -fsycl -fsycl-targets=spir64  /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/joint_matrix_out_bounds.cpp -o /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_out_bounds.cpp.tmp.out
2025-02-06T23:28:25.2796795Z # executed command: /__w/llvm/llvm/toolchain/bin//clang++ -Werror -I /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/Inputs -fsycl -fsycl-targets=spir64 /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/joint_matrix_out_bounds.cpp -o /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_out_bounds.cpp.tmp.out
2025-02-06T23:28:25.2796871Z # note: command had no output on stdout or stderr
2025-02-06T23:28:25.2796927Z # RUN: at line 14
2025-02-06T23:28:25.2797117Z env ONEAPI_DEVICE_SELECTOR=level_zero:gpu  /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_out_bounds.cpp.tmp.out
2025-02-06T23:28:25.2797366Z # executed command: env ONEAPI_DEVICE_SELECTOR=level_zero:gpu /__w/llvm/llvm/build-e2e/Matrix/Output/joint_matrix_out_bounds.cpp.tmp.out
2025-02-06T23:28:25.2797427Z # .---command stdout------------
2025-02-06T23:28:25.2797485Z # | A row major, B row major:
2025-02-06T23:28:25.2797556Z # | bf16: 1044x1044x1048, 8x16x16: SG size: 16 
2025-02-06T23:28:25.2797610Z # `-----------------------------
2025-02-06T23:28:25.2797669Z # .---command stderr------------
2025-02-06T23:28:25.2797839Z # | Incorrect result in matrix. i: 0, j: 0, Ref: -162.136, Val: -227.44, Diff: 65.3033, Epsilon: 0.1
2025-02-06T23:28:25.2798529Z # | joint_matrix_out_bounds.cpp.tmp.out: /__w/llvm/llvm/llvm/sycl/test-e2e/Matrix/Inputs/joint_matrix_out_bounds_impl.hpp:139: void test() [Tab = sycl::ext::oneapi::bfloat16, Tc = float, MATRIX_M = 1044UL, MATRIX_N = 1044UL, MATRIX_K = 1048UL, TM = 8UL, TN = 16UL, TK = 16UL, A_layout = sycl::ext::oneapi::experimental::matrix::layout::row_major, B_layout = sycl::ext::oneapi::experimental::matrix::layout::row_major]: Assertion `matrix_compare(MATRIX_M, MATRIX_N, C, D)' failed.
2025-02-06T23:28:25.2798755Z # | Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
2025-02-06T23:28:25.2798840Z # | 0  libsycl.so.8                        0x00007a1442746202
2025-02-06T23:28:25.2798928Z # | 1  libsycl.so.8                        0x00007a14427436c6
2025-02-06T23:28:25.2798999Z # | 2  libc.so.6                           0x00007a1442233320
2025-02-06T23:28:25.2799085Z # | 3  libc.so.6                           0x00007a144228cb1c pthread_kill + 284
2025-02-06T23:28:25.2799164Z # | 4  libc.so.6                           0x00007a144223326e gsignal + 30
2025-02-06T23:28:25.2799243Z # | 5  libc.so.6                           0x00007a14422168ff abort + 223
2025-02-06T23:28:25.2799348Z # | 6  libc.so.6                           0x00007a144221681b
2025-02-06T23:28:25.2799420Z # | 7  libc.so.6                           0x00007a1442229507
2025-02-06T23:28:25.2799504Z # | 8  joint_matrix_out_bounds.cpp.tmp.out 0x000000000040436a
2025-02-06T23:28:25.2799584Z # | 9  joint_matrix_out_bounds.cpp.tmp.out 0x0000000000403942
2025-02-06T23:28:25.2799668Z # | 10 joint_matrix_out_bounds.cpp.tmp.out 0x00000000004037b2
2025-02-06T23:28:25.2799736Z # | 11 libc.so.6                           0x00007a14422181ca
2025-02-06T23:28:25.2799824Z # | 12 libc.so.6                           0x00007a144221828b __libc_start_main + 139
2025-02-06T23:28:25.2799903Z # | 13 joint_matrix_out_bounds.cpp.tmp.out 0x00000000004035c5
2025-02-06T23:28:25.2799957Z # `-----------------------------
2025-02-06T23:28:25.2800026Z # error: command failed with exit status: -6
2025-02-06T23:28:25.2800029Z 
2025-02-06T23:28:25.2800074Z --
2025-02-06T23:28:25.2800077Z 

https://github.com/intel/llvm/actions/runs/13188912768/job/36819654925?pr=16910
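
For anyone triaging the log: in all three failures the printed `Diff` matches the absolute difference between `Ref` and `Val` (for example |1.86786 - 0.864229| ≈ 1.00363), and it is far above the `Epsilon` of 0.1. The sketch below shows a check that would produce a message of that shape. It is an assumption for illustration only: `matrix_compare_sketch` is a hypothetical name, and the real `matrix_compare` helper in the test inputs is not shown in this issue and may differ.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for the test suite's matrix_compare (assumption:
// element-wise absolute-difference check against a fixed epsilon).
// Both matrices are row-major with `cols` elements per row.
template <typename T>
bool matrix_compare_sketch(std::size_t rows, std::size_t cols, const T *val,
                           const T *ref, double epsilon = 0.1) {
  for (std::size_t i = 0; i < rows; ++i) {
    for (std::size_t j = 0; j < cols; ++j) {
      const double r = static_cast<double>(ref[i * cols + j]);
      const double v = static_cast<double>(val[i * cols + j]);
      const double diff = std::fabs(r - v);
      // Written as !(diff <= epsilon) so that a NaN also counts as a mismatch.
      if (!(diff <= epsilon)) {
        std::fprintf(stderr,
                     "Incorrect result in matrix. i: %zu, j: %zu, Ref: %g, "
                     "Val: %g, Diff: %g, Epsilon: %g\n",
                     i, j, r, v, diff, epsilon);
        return false; // stop at the first mismatch, as the log appears to
      }
    }
  }
  return true;
}
```

Under this reading, the first mismatch is at `j: 16` for the 32 x 64 x 16 configuration in both `fill_k_cache` tests, while the 8 x 16 x 16 and 16 x 16 x 16 configurations complete.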

To reproduce

No response

Environment

No response

Additional context

No response

@sarnex sarnex added bug Something isn't working confirmed labels Feb 7, 2025