Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename extensions to include MPI in name #802

Open
luraess opened this issue Nov 24, 2023 · 15 comments
Open

Rename extensions to include MPI in name #802

luraess opened this issue Nov 24, 2023 · 15 comments

Comments

@luraess
Copy link
Contributor

luraess commented Nov 24, 2023

Developing a package that also requires AMDGPU and CUDA extensions and MPI, we hit an issue about conflicting extension names that can't be resolved.

A solution is to name the extensions in a way to include the package name they are belonging to. For MPI, it means renaming extensions to MPIAMDGPUext and MPICUDAext.

If this sounds good I could draft a PR.

@simonbyrne
Copy link
Member

Can you give more details on the issue (stack trace, reproducer)?

@utkinis
Copy link
Contributor

utkinis commented Nov 24, 2023

Reproducer

Assume we have a package ExtensionsConflict.jl, that depends on MPI.jl, and also has an extension for CUDA.jl. The structure of the package is as follows:

iutkin$ tree
.
├── Manifest.toml
├── Project.toml
├── ext
│   └── CUDAExt
│       └── CUDAExt.jl
└── src
    └── ExtensionsConflict.jl

3 directories, 4 files

iutkin$ cat Project.toml
name = "ExtensionsConflict"
uuid = "97c6e3c7-976d-4ed6-9a3c-28cf76c81daa"
authors = ["Ivan Utkin <[email protected]>"]
version = "0.1.0"

[deps]
MPI = "da04e1cc-30fd-572f-bb4f-1f8673147195"

[weakdeps]
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"

[extensions]
CUDAExt = "CUDA"

iutkin$ cat src/ExtensionsConflict.jl
module ExtensionsConflict

greet() = print("Hello World!")

end # module ExtensionsConflict

iutkin$ cat ext/CUDAExt/CUDAExt.jl
module CUDAExt

using ExtensionsConflict, CUDA

ExtensionsConflict.greet(dev::CuDevice) = "device $(deviceid(dev))"

end

Then, when trying to use the package, CUDA, and MPI and the same time leads to the name conflict:

(ExtensionsConflict) pkg> up
    Updating registry at `~/.julia/registries/General.toml`
  No Changes to `/scratch-1/iutkin/ExtensionsConflict.jl/Project.toml`
  No Changes to `/scratch-1/iutkin/ExtensionsConflict.jl/Manifest.toml`

julia> using ExtensionsConflict

julia> using CUDA

julia> using MPI
[ Info: Precompiling CUDAExt [11b7e2e0-d079-575b-885e-0ab22ef3252c]
ERROR: LoadError: ArgumentError: Package CUDAExt does not have ExtensionsConflict in its dependencies:
- You may have a partially installed environment. Try `Pkg.instantiate()`
  to ensure all packages in the environment are installed.
- Or, if you have CUDAExt checked out for development and have
  added ExtensionsConflict as a dependency but haven't updated your primary
  environment's manifest file, try `Pkg.resolve()`.
- Otherwise you may need to report an issue with CUDAExt
Stacktrace:
 [1] macro expansion
   @ ./loading.jl:1634 [inlined]
 [2] macro expansion
   @ ./lock.jl:267 [inlined]
 [3] require(into::Module, mod::Symbol)
   @ Base ./loading.jl:1611
 [4] include
   @ ./Base.jl:457 [inlined]
 [5] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt128}}, source::Nothing)
   @ Base ./loading.jl:2049
 [6] top-level scope
   @ stdin:3
in expression starting at /scratch-1/iutkin/ExtensionsConflict.jl/ext/CUDAExt/CUDAExt.jl:1
in expression starting at stdin:3
┌ Error: Error during loading of extension CUDAExt of MPI, use `Base.retry_load_extensions()` to retry.
│   exception =1-element ExceptionStack:
│    Failed to precompile CUDAExt [11b7e2e0-d079-575b-885e-0ab22ef3252c] to "/home/iutkin/.julia/compiled/v1.9/CUDAExt/jl_WoAYgY".
│    Stacktrace:
│      [1] error(s::String)
│        @ Base ./error.jl:35
│      [2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, keep_loaded_modules::Bool)
│        @ Base ./loading.jl:2294
│      [3] compilecache
│        @ ./loading.jl:2167 [inlined]
│      [4] _require(pkg::Base.PkgId, env::Nothing)
│        @ Base ./loading.jl:1805
│      [5] _require_prelocked(uuidkey::Base.PkgId, env::Nothing)
│        @ Base ./loading.jl:1660
│      [6] _require_prelocked(uuidkey::Base.PkgId)
│        @ Base ./loading.jl:1658
│      [7] run_extension_callbacks(extid::Base.ExtensionId)
│        @ Base ./loading.jl:1255
│      [8] run_extension_callbacks(pkgid::Base.PkgId)
│        @ Base ./loading.jl:1290
│      [9] run_package_callbacks(modkey::Base.PkgId)
│        @ Base ./loading.jl:1124
│     [10] _require_prelocked(uuidkey::Base.PkgId, env::String)
│        @ Base ./loading.jl:1667
│     [11] macro expansion
│        @ ./loading.jl:1648 [inlined]
│     [12] macro expansion
│        @ ./lock.jl:267 [inlined]
│     [13] require(into::Module, mod::Symbol)
│        @ Base ./loading.jl:1611
│     [14] top-level scope
│        @ ~/.julia/packages/CUDA/YIj5X/src/initialization.jl:208
│     [15] eval
│        @ ./boot.jl:370 [inlined]
│     [16] eval_user_input(ast::Any, backend::REPL.REPLBackend, mod::Module)
│        @ REPL /scratch-1/soft/julia/julia-1.9.4/share/julia/stdlib/v1.9/REPL/src/REPL.jl:153
│     [17] repl_backend_loop(backend::REPL.REPLBackend, get_module::Function)
│        @ REPL /scratch-1/soft/julia/julia-1.9.4/share/julia/stdlib/v1.9/REPL/src/REPL.jl:249
│     [18] start_repl_backend(backend::REPL.REPLBackend, consumer::Any; get_module::Function)
│        @ REPL /scratch-1/soft/julia/julia-1.9.4/share/julia/stdlib/v1.9/REPL/src/REPL.jl:234
│     [19] run_repl(repl::REPL.AbstractREPL, consumer::Any; backend_on_current_task::Bool, backend::Any)
│        @ REPL /scratch-1/soft/julia/julia-1.9.4/share/julia/stdlib/v1.9/REPL/src/REPL.jl:379
│     [20] run_repl(repl::REPL.AbstractREPL, consumer::Any)
│        @ REPL /scratch-1/soft/julia/julia-1.9.4/share/julia/stdlib/v1.9/REPL/src/REPL.jl:365
│     [21] (::Base.var"#1018#1020"{Bool, Bool, Bool})(REPL::Module)
│        @ Base ./client.jl:421
│     [22] #invokelatest#2
│        @ ./essentials.jl:819 [inlined]
│     [23] invokelatest
│        @ ./essentials.jl:816 [inlined]
│     [24] run_main_repl(interactive::Bool, quiet::Bool, banner::Bool, history_file::Bool, color_set::Bool)
│        @ Base ./client.jl:405
│     [25] exec_options(opts::Base.JLOptions)
│        @ Base ./client.jl:322
│     [26] _start()
│        @ Base ./client.jl:522
└ @ Base loading.jl:1261

julia>

Motivation

@luraess suggests prefixing the extension name with the package name, to avoid such conflicts, i.e., in MPI.jl extensions will be named MPICUDAExt and AMDGPUCUDAExt. This is also consistent with the Pkg.jl docs: https://pkgdocs.julialang.org/v1/creating-packages/#Conditional-loading-of-code-in-packages-(Extensions).

I'm not sure that this is the expected behaviour though, at least it's not documented anywhere, and it doesn't make sense for extensions of different packages to know each other, so probably this is rather the bug in Pkg.

@luraess
Copy link
Contributor Author

luraess commented Nov 24, 2023

Thanks @utkinis for providing the MWE and for adding some context. Although the doc is not explicit about this, note that in the example, the naming convention for the extension appends the package name to the package name the extension supports PlottingContourExt. Following this logic, we should have MPIAMDGPUExt. But as said in previous comment, IDK if this is meant to be so or if there is an issue with Pkg.

@simonbyrne
Copy link
Member

@utkinis how are you loading CUDA, since it isn't a dependency of ExtensionsConflict? do you have it via another environment?

@simonbyrne
Copy link
Member

BTW, I'm not opposed to the rename: it shouldn't break anything, I just wanted to figure out what exactly is going on.

@luraess
Copy link
Contributor Author

luraess commented Dec 12, 2023

@utkinis how are you loading CUDA, since it isn't a dependency of ExtensionsConflict? do you have it via another environment?

We tend to have the GPU backend dependency in the base env (on GPU servers, supercomputers) as this would then make it available to all other projects. But we also got the above mentioned issue with e.g. CUDA as deps of ExtensionsConflict.

@luraess
Copy link
Contributor Author

luraess commented Dec 12, 2023

BTW, I'm not opposed to the rename: it shouldn't break anything, I just wanted to figure out what exactly is going on.

We guess the issue could be addressed in Pkg if extensions would be resolved based on UUIDs instead of names (have no insights there - maybe there is a good reason for things to be as they are).

@simonbyrne
Copy link
Member

But we also got the above mentioned issue with e.g. CUDA as deps of ExtensionsConflict.

This is a bit of a weird case: apparently if a package is both in deps and weakdeps, it is treated as a weakdeps:
JuliaLang/Pkg.jl#3727 (comment)

@luraess
Copy link
Contributor Author

luraess commented Dec 13, 2023

But we also got the above mentioned issue with e.g. CUDA as deps of ExtensionsConflict.

This is a bit of a weird case: apparently if a package is both in deps and weakdeps, it is treated as a weakdeps: JuliaLang/Pkg.jl#3727 (comment)

EDIT:
But we also got the above mentioned issue within a project using both CUDA and ExtensionsConflict.

@simonbyrne
Copy link
Member

But we also got the above mentioned issue within a project using both CUDA and ExtensionsConflict.

Do you have an example of that?

@luraess
Copy link
Contributor Author

luraess commented Dec 13, 2023

But we also got the above mentioned issue within a project using both CUDA and ExtensionsConflict.

Do you have an example of that?

@utkinis could you test this ☝️ in your MWE?

@KristofferC
Copy link
Contributor

We guess the issue could be addressed in Pkg if extensions would be resolved based on UUIDs instead of names

That should already be the case. I'll try see what is going on.

@KristofferC
Copy link
Contributor

I tried this:

pkg> activate --temp
  Activating new project at `/tmp/jl_SWjSGu`

(jl_SWjSGu) pkg> dev .
   Resolving package versions...
    Updating `/tmp/jl_SWjSGu/Project.toml`
  [97c6e3c7] + ExtensionsConflict v0.1.0 `../../home/kc/JuliaTests/ExtensionsConflict`
    Updating `/tmp/jl_SWjSGu/Manifest.toml`

(jl_SWjSGu) pkg> add CUDA, MPI

julia> using ExtensionsConflict

julia> using CUDA

julia> using MPI

julia> Base.get_extension(ExtensionsConflict, :CUDAExt)
CUDAExt

I don't see where the possible name collision could be.

@luraess
Copy link
Contributor Author

luraess commented Dec 18, 2023

Indeed, in this setting ☝️ no error occurs. Seems the issue occurs when within the ExtensionsConflict package, and using CUDA from the global environment (e.g. @v1.9).

@simonbyrne
Copy link
Member

Indeed, in this setting ☝️ no error occurs. Seems the issue occurs when within the ExtensionsConflict package, and using CUDA from the global environment (e.g. @v1.9).

Yes, I was able to reproduce it in this case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants