
Bug when using MKLSparse following use of MKL #36

Closed
simonp0420 opened this issue Feb 15, 2023 · 28 comments

@simonp0420

When I load MKLSparse after a previous using MKL, a bug manifests when multiplying a SparseMatrixCSC{ComplexF64, Int64} matrix whose entries are all purely real by a ComplexF64 or Float64 vector. The following output exhibits the bug:

julia> include("reproducer.jl")
typeof.((QAAQ, fsrc, b)) = (SparseMatrixCSC{ComplexF64, Int64}, Vector{ComplexF64}, Vector{ComplexF64})
size.((QAAQ, fsrc, b)) = ((429567, 429567), (429567,), (429567,))
nnz(QAAQ) = 11532
maximum(abs ∘ imag, QAAQ) = 0.0
maximum(abs, QAAQ - real(QAAQ)) = 0.0
maximum(abs, b - real(QAAQ) * fsrc) = 0.0
0.0

julia> using MKL  # This does not introduce the bug

julia> include("reproducer.jl")
typeof.((QAAQ, fsrc, b)) = (SparseMatrixCSC{ComplexF64, Int64}, Vector{ComplexF64}, Vector{ComplexF64})
size.((QAAQ, fsrc, b)) = ((429567, 429567), (429567,), (429567,))
nnz(QAAQ) = 11532
maximum(abs ∘ imag, QAAQ) = 0.0
maximum(abs, QAAQ - real(QAAQ)) = 0.0
maximum(abs, b - real(QAAQ) * fsrc) = 0.0
0.0

julia> using MKLSparse  # this will introduce the bug

julia> include("reproducer.jl")
typeof.((QAAQ, fsrc, b)) = (SparseMatrixCSC{ComplexF64, Int64}, Vector{ComplexF64}, Vector{ComplexF64})
size.((QAAQ, fsrc, b)) = ((429567, 429567), (429567,), (429567,))
nnz(QAAQ) = 11532
maximum(abs ∘ imag, QAAQ) = 0.0
maximum(abs, QAAQ - real(QAAQ)) = 0.0
maximum(abs, b - real(QAAQ) * fsrc) = 119.4196587781747
119.4196587781747

The final output should be zero, since b was computed as QAAQ * fsrc and every element of QAAQ has zero imaginary part. Note that the bug does not manifest if I load the two packages in the other order: MKLSparse first, followed by MKL.
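The full reproducer lives in the gist linked below, but a minimal self-contained check in the same spirit (the small diagonal matrix here is illustrative, not the actual QAAQ) would be:

```julia
using SparseArrays, LinearAlgebra

# A complex-typed sparse matrix whose entries are all purely real,
# mirroring the structure of QAAQ (sizes and values are illustrative).
A = sparse(ComplexF64[2 0 0; 0 3 0; 0 0 4])
x = rand(ComplexF64, 3)
b = A * x

# With a correct backend both products agree exactly, so this is 0.0.
maximum(abs, b - real(A) * x)
```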

The four files necessary for reproducing this are in this gist. Here is my version info:

julia> versioninfo()
Julia Version 1.8.5
Commit 17cfb8e65e (2023-01-08 06:45 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 8 × Intel(R) Core(TM) i7-9700 CPU @ 3.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 8 on 8 virtual cores
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 8

and my package status:

(error_reproducer) pkg> st
Status `D:\peter\Documents\julia\book_codes\RumpfFDFD\error_reproducer\Project.toml`
  [33e6dc65] MKL v0.6.0
  [0c723cd3] MKLSparse v1.1.0
@TobiasHolicki

TobiasHolicki commented Sep 10, 2023

I encountered essentially the same bug. Here is a smaller example that might be helpful:

using SparseArrays
A = sparse([1.0 1.0]); B = [1 0; 0 1.0]

using MKL
A * B # this returns [1.0 1.0] which is ok

using MKLSparse
A * B # this returns [1.0 0.0] which is bad

B = [1 0; 0 1];
A * B # this returns [1.0 1.0] which is ok again

@Leebre

Leebre commented Apr 27, 2024

I am seeing similar problems with MKLSparse. However, note that the package seems to be abandoned, as it hasn't been updated in over 2 years:

https://github.com/JuliaSparse/MKLSparse.jl

@amontoison
Member

I encountered essentially the same bug. Here is a smaller example that might be helpful:

A = sparse([1.0 1.0]); B = [1 0; 0 1.0]

using MKL
A * B # this returns [1.0 1.0] which is ok

using MKLSparse
A * B # this returns [1.0 0.0] which is bad

B = [1 0; 0 1];
A * B # this returns [1.0 1.0] which is ok again

I highly suspect that the issue is in MKL.jl.
When we do a using MKL, we change the threading mode here.
That probably has a side effect on how the sparse products are performed internally.
MKL.jl and MKLSparse.jl use the same library, MKL_jll.libmkl_rt.

@amontoison
Member

amontoison commented May 6, 2024

I found the culprit: MKL.jl loads the LP64 interface by default:
JuliaLinearAlgebra/MKL.jl@cbbaacd
This was changed in a recent release of MKL.jl (v0.6.0).

We load the ILP64 interface here in MKLSparse.jl:
https://github.com/JuliaSparse/MKLSparse.jl/blob/master/src/MKLSparse.jl#L37
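For reference, when the single runtime library libmkl_rt is used, the integer interface is selected at runtime via MKL's documented MKL_Set_Interface_Layer function. A minimal sketch of calling it from Julia (constant values taken from mkl_service.h):

```julia
using MKL_jll  # provides the path to libmkl_rt

# Values from mkl_service.h.
const MKL_INTERFACE_LP64  = Cint(0)
const MKL_INTERFACE_ILP64 = Cint(1)

# Returns the interface layer actually in effect. Per MKL's documentation
# this must be called before the first call to any other MKL function,
# which is why two packages setting it differently in __init__ conflict.
set_interface_layer(layer::Cint) =
    ccall((:MKL_Set_Interface_Layer, MKL_jll.libmkl_rt), Cint, (Cint,), layer)
```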

@amontoison
Member

@ViralBShah Is it not possible to still load the ILP64 interface in MKL.jl?

@ViralBShah
Member

ViralBShah commented May 6, 2024

I thought we use the ILP64 by default in MKL. MKL now has the _64 suffixes for ILP64 as well.

@amontoison
Member

I thought we use the ILP64 by default in MKL. MKL now has the _64 suffixes for ILP64 as well.

Do you know in which version of Intel MKL the _64 suffixes were added for ILP64?
I opened a PR in MKL.jl, but we may need to update the compat entry for MKL_jll.jl.

@ViralBShah
Member

IIRC, it was in 2023, but there were some issues where a few symbols got left out. So, I think 2024 is perhaps the safest. In fact, we may want to make 2024 a minimum requirement even in MKL.jl.

@amontoison
Member

We will drop support for Mac, but I think it's fine. It never worked well and required sequential threading.

@mhaense1

mhaense1 commented May 13, 2024

Hi,

I also ran into a similar issue today, even though the package was updated recently.
Sorry if I am stating known problems here, but it took me a long time to figure out the related problem in one of my codes.

MWE:

julia> using MKL, MKLSparse

julia> using SparseArrays

julia> idx1 = rand(1:10_000, 100_000);

julia> idx2 = rand(1:10_000, 100_000);

julia> vals = rand(100_000);

julia> spm = sparse(idx1, idx2, vals, 100_000, 100_000);

julia> spm * ones(100_000)
100000-element Vector{Float64}:
    3.2610336365211516
    5.981257329062043
    3.6314630135321475
    2.706724008231419
    4.515169252866177
    4.121828693918578
    2.410895086369234
    3.073361618309217
    1.6881840883567407
    0.0
    4.18252725784919
    0.8644031768747882
    5.747019246623832
    ⋮
    0.0
    0.0
    0.0
    0.0
    0.0
    0.0
    0.0
    0.0
    0.0
    0.0
    0.0
 9349.209078325439

Note that, by construction, the matrix spm has only zeros in all rows > 10_000. The result of spm * ones(100_000) should be the vector of row sums of spm, so the output above is obviously wrong.
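A cheap way to catch this class of bug is to compare the product against row sums computed by SparseArrays itself, which do not dispatch to the BLAS backend (rebuilding spm as above):

```julia
using SparseArrays

idx1 = rand(1:10_000, 100_000)
idx2 = rand(1:10_000, 100_000)
vals = rand(100_000)
spm  = sparse(idx1, idx2, vals, 100_000, 100_000)

# sum(spm, dims=2) iterates the stored entries inside SparseArrays,
# so it bypasses mul! and any external BLAS.
rowsums = vec(sum(spm, dims=2))

# On a correct backend this is ~0 (up to floating-point roundoff).
maximum(abs, spm * ones(100_000) - rowsums)
```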

Within a project I was working on, MKLSparse also seems to have caused Julia to crash in weird ways, as I eventually figured out here. But I couldn't produce an easy MWE for that.

My Pkg status:

Pkg.status()
Status `C:\Users\matth\.julia\environments\v1.10\Project.toml`
⌃ [31c24e10] Distributions v0.25.107
  [33e6dc65] MKL v0.6.3
  [0c723cd3] MKLSparse v1.2.0
⌃ [c3e4b0f8] Pluto v0.19.37
  [295af30f] Revise v3.5.14
Info Packages marked with ⌃ have new versions available and may be upgradable.

@amontoison
Member

MKL.jl loads the LP64 interface, whereas MKLSparse.jl loads the ILP64 interface.
We don't know how to fix the issue yet.

@ViralBShah
Member

Does it work if you switch MKL.jl to the ILP64 interface? The problem is that MKL.jl only uses the _64 suffix APIs in LP64. In ILP64, both 32-bit and 64-bit symbols have the same name, creating a lot of headaches.

Is it possible to set ILP64 just for the MKLSparse stuff without affecting the BLAS?

@amontoison
Member

We can't set the ILP64 interface for MKLSparse.jl in the __init__ function like here, because it's overridden by MKL.jl.

However, we could change the interface within mul!: set the ILP64 interface before cscmv! and reset it to LP64 afterwards.
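A sketch of that idea: set_interface_layer wraps MKL's documented MKL_Set_Interface_Layer entry point, and the kernel call is left abstract (with_ilp64 is a hypothetical helper, not MKLSparse.jl code). Note that MKL's documentation says the layer must be chosen before the first MKL call, so whether a mid-session switch like this is honored is exactly the open question:

```julia
using MKL_jll

const MKL_INTERFACE_LP64  = Cint(0)   # values from mkl_service.h
const MKL_INTERFACE_ILP64 = Cint(1)

set_interface_layer(layer) =
    ccall((:MKL_Set_Interface_Layer, MKL_jll.libmkl_rt), Cint, (Cint,), Cint(layer))

# Hypothetical helper: run `f` (e.g. a cscmv! call) under ILP64, then restore
# LP64 so MKL.jl's dense BLAS keeps receiving 32-bit integers.
function with_ilp64(f)
    set_interface_layer(MKL_INTERFACE_ILP64)
    try
        return f()
    finally
        set_interface_layer(MKL_INTERFACE_LP64)
    end
end
```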

@ViralBShah
Member

ViralBShah commented May 14, 2024

Do these MKL sparse routines give us a clear way of picking which API to call for 32- vs. 64-bit integers? Another possibility may be to default this package to SparseMatrixCSC with 32-bit ints only.

@ViralBShah
Member

ViralBShah commented May 14, 2024

If we were able to use the C API for MKLSparse, it might steer clear of the issues we are facing with the INTERFACE setting, which probably changes the size of the Fortran INTEGER based on LP64 vs. ILP64.

Another solution might be to avoid libmkl_rt and directly pick up the relevant library, such as libmkl_intel_ilp64.so.

@amontoison
Member

The performance of the C API was worse than the Fortran routines the last time I tested.
I am at SIAM LA24 this week, but I can review your PRs and check that everything works this weekend.

@ViralBShah
Member

My PR didn't work, so I closed it. Huh, it is weird that the C routines have worse performance. Can you check that once again?

@ViralBShah
Member

@mkrainiuk Any insight you have here might be valuable.

@mkrainiuk

@mkrainiuk Any insight you have here might be valuable.

The only potential performance change I'd expect is when comparing the 32-bit integer API vs. the 64-bit integer API, not Fortran vs. C. Could you please clarify which APIs you compared?

As a side note, I'd highly recommend using the IE spBLAS API instead of the deprecated NIST-style spBLAS API, since we plan to completely remove the latter from oneMKL in the future. We have a short KB article about migration here.

@ViralBShah
Member

@mkrainiuk Thank you. If we are using the LP64 interface of MKL, would the integer that is passed to the sparse API calls have to be 32-bit ints? (Currently this package passes 64-bit ints, and when MKL uses the LP64 interface, we observe the crash).

@mkrainiuk

@mkrainiuk Thank you. If we are using the LP64 interface of MKL, would the integer that is passed to the sparse API calls have to be 32-bit ints? (Currently this package passes 64-bit ints, and when MKL uses the LP64 interface, we observe the crash).

Hi @ViralBShah, yes, the LP64 interface expects 32-bit integers, and this is most likely why you observe the crash. You need to use the _64 API from the LP64 interface library, or the API from the ILP64 interface.
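On the Julia side, a data-layout workaround consistent with that explanation is to store the matrix with 32-bit indices, which SparseMatrixCSC supports as a type parameter. Whether the package's wrappers then dispatch to the 32-bit entry points is up to MKLSparse.jl, so this is only a sketch of the data side:

```julia
using SparseArrays

A64 = sprand(1_000, 1_000, 0.01)                    # default Int (64-bit) indices
A32 = convert(SparseMatrixCSC{Float64,Int32}, A64)  # same values, Int32 indices

eltype(rowvals(A32))  # Int32
```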

@ViralBShah
Member

ViralBShah commented May 15, 2024

@mkrainiuk The thing I couldn't figure out is if the sparse functions also have the _64 interface. For example, there is dgemm_64, but not mkl_dcsrmultcsr_64.

@mkrainiuk

mkrainiuk commented May 15, 2024

oneMKL has a _64 API for the inspector-executor SpBLAS routines, e.g. mkl_sparse_d_mm_64. The _64 API was introduced in oneMKL 2024.0.

mkl_dcsrmultcsr is a deprecated NIST-style API, so there are no plans to extend it, only to remove it in the future.

@ViralBShah
Member

ViralBShah commented May 20, 2024

@rayegun Do you understand how the Intel sparse BLAS compares to Tim's GraphBLAS? Should people take a look at your GraphBLAS Julia package as well?

@rayegun
Member

rayegun commented May 20, 2024

@rayegun Do you understand how the Intel sparse BLAS compares to Tim's GraphBLAS? Should people take a look at your GraphBLAS Julia package as well?

Around JuliaCon I would recommend it, yes. It's in fine shape now, maybe a little rough around the edges, but I'm about 3/4 of the way through a total rewrite. GraphBLAS is a strict superset of Sparse BLAS, with the exception of solvers, and SuiteSparse:GraphBLAS is almost always quite a bit faster than MKL. The real problem, which I will have solved by JuliaCon, is that the GraphBLAS interface is not exactly what Sparse BLAS people want.

@amontoison
Member

amontoison commented Jul 7, 2024

@simonp0420 @TobiasHolicki @Leebre
I released version 2.0.0 of MKLSparse.jl.
It uses the new API and fixes this issue.

@amontoison
Member

@mkrainiuk Any insight you have here might be valuable.

The only one potential performance change I'd expect is when you compare 32bit integer API vs 64bit integer API, but not Fortran vs C. Could you please clarify what type of APIs you compared?

As a side note, I'd highly recommend to use IE spBLAS API instead of deprecated NIST style spBLAS API, since we plan to completely remove it from oneMKL in the future. We have a short KB article about migration here

@mkrainiuk Sorry for not answering earlier. I tested the Fortran API (v1.2.0 of MKLSparse.jl) and the C API (v2.0.0 of MKLSparse.jl) for sparse matrix-vector products with CSC matrices.

I don't see any performance regression with version 2024.2.0 of MKL. 👍
When I tested in 2022, the C API was less efficient than the Fortran one.

My benchmark is the following code:

using BenchmarkTools, SuiteSparseMatrixCollection, MatrixMarket
using SparseArrays, LinearAlgebra, Printf, JLD, Base.Threads

ssmc = ssmc_db(verbose=false)
dataset = ssmc[(ssmc.binary .== false) .& (10000 .≤ ssmc.nrows .≤ 20000) .& (ssmc.numerical_symmetry .== 1.0), :]
# dataset = ssmc[(ssmc.binary .== false) .& (1000 .≤ ssmc.nrows .≤ 10000), :]
paths = fetch_ssmc(dataset, format="MM")
names = dataset[!,:name]

# nb_pbs = length(paths)  # 796 matrices...
nb_pbs = 15
benchmark_sparse = true
benchmark_mkl = true
time_sparse = zeros(nb_pbs)
time_mkl = zeros(nb_pbs)

if benchmark_sparse
  for i = 1 : nb_pbs
    name = dataset[!,:name][i]
    path = paths[i]
    @printf("SparseArrays -- %3d / %3d -- %s\n", i, nb_pbs, name)
    A = Float64.(MatrixMarket.mmread(path * "/$name.mtx"))
    n, m = size(A)
    x = rand(m)
    y = rand(n)
    time_sparse[i] = @belapsed for k in 1:1000 mul!($y, $A, $x) end
  end
  save("time_sparse.jld", "time_sparse", time_sparse)
else
  time_sparse = load("time_sparse.jld", "time_sparse")
end

using MKLSparse

if benchmark_mkl
  for i = 1 : nb_pbs
    name = names[i]
    path = paths[i]
    @printf("MKLSparse -- %3d / %3d -- %s\n", i, nb_pbs, name)
    A = MatrixMarket.mmread(path * "/$name.mtx")
    n, m = size(A)
    x = rand(m)
    y = rand(n)
    time_mkl[i] = @belapsed for k in 1:1000 mul!($y, $A, $x) end
  end
  save("time_mkl.jld", "time_mkl", time_mkl)
else
  time_mkl = load("time_mkl.jld", "time_mkl")
end

using PlotlyJS

p = plot([
  bar(name="mul! SparseArrays", x=names[1:nb_pbs], y=time_sparse),
  bar(name="mul! MKLSparse", x=names[1:nb_pbs], y=time_mkl),
  ],
)
relayout!(p, barmode="group")
savefig(p, "julia_vs_mkl_eps", format="eps")
savefig(p, "julia_vs_mkl_svg", format="svg")

Fortran API: (benchmark bar chart screenshot, 2024-07-07)

C API: (benchmark bar chart screenshot, 2024-07-07)
