RAPIDS deployment on Snowflake Notebook Container Runtime #496

Open
3 tasks
ncclementi opened this issue Jan 22, 2025 · 6 comments

@ncclementi
Contributor

Snowflake supports notebooks via Container Runtime; see https://quickstarts.snowflake.com/guide/notebook-container-runtime/#0

You can't bring your own runtime, but you can pip install packages, including from a specific index, by adding an External Access Integration hooked up to a PyPI network rule.

I think that modifying this part of the setup to include pypi.nvidia.com in the value list should make it possible to pip install RAPIDS.

-- Substep #2: create external access integration (these are account-level objects; end users need access to this to access the public internet with endpoints defined in network rules)

CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION allow_all_integration
  ALLOWED_NETWORK_RULES = (allow_all_rule)
  ENABLED = true;

CREATE OR REPLACE NETWORK RULE pypi_network_rule
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = ('pypi.org', 'pypi.python.org', 'pythonhosted.org',  'files.pythonhosted.org');

TODO:

  • Create a Snowflake notebook container runtime with access to pypi.nvidia.com
  • Install RAPIDS with pip
  • Run an example
@ncclementi
Contributor Author

Update: I gave this a first try, but it is currently not working.

! pip install \
    --extra-index-url=https://pypi.nvidia.com \
    "cudf-cu12==24.12.*"

but when trying to import cudf, I got:

/opt/conda/lib/python3.10/site-packages/cudf/utils/_ptxcompiler.py:64: UserWarning: Error getting driver and runtime versions:
stdout:
stderr:
Traceback (most recent call last):
File "<string>", line 7, in <module>
File "/opt/conda/lib/python3.10/site-packages/numba_cuda/numba/cuda/cudadrv/runtime.py", line 111, in get_version
self.cudaRuntimeGetVersion(ctypes.byref(rtver))
File "/opt/conda/lib/python3.10/site-packages/numba_cuda/numba/cuda/cudadrv/runtime.py", line 65, in __getattr__
self._initialize()
File "/opt/conda/lib/python3.10/site-packages/numba_cuda/numba/cuda/cudadrv/runtime.py", line 51, in _initialize
self.lib = open_cudalib('cudart')
File "/opt/conda/lib/python3.10/site-packages/numba_cuda/numba/cuda/cudadrv/libs.py", line 65, in open_cudalib
return ctypes.CDLL(path)
File "/opt/conda/lib/python3.10/ctypes/__init__.py", line 374, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libcudart.so: cannot open shared object file: No such file or directory
Not patching Numba
warnings.warn(msg, UserWarning)

OSError: libcudart.so: cannot open shared object file: No such file or directory
Traceback:
File "Cell [cell3]", line 1, in <module>
    import cudf
File "/opt/conda/lib/python3.10/site-packages/cudf/__init__.py", line 20, in <module>
    validate_setup()
File "/opt/conda/lib/python3.10/site-packages/cudf/utils/gpu_utils.py", line 96, in validate_setup
    cuda_runtime_version = runtimeGetVersion()
File "/opt/conda/lib/python3.10/site-packages/rmm/_cuda/gpu.py", line 88, in runtimeGetVersion
    major, minor = numba.cuda.runtime.get_version()
File "/opt/conda/lib/python3.10/site-packages/numba/cuda/cudadrv/runtime.py", line 111, in get_version
    self.cudaRuntimeGetVersion(ctypes.byref(rtver))
File "/opt/conda/lib/python3.10/site-packages/numba/cuda/cudadrv/runtime.py", line 65, in __getattr__
    self._initialize()
File "/opt/conda/lib/python3.10/site-packages/numba/cuda/cudadrv/runtime.py", line 51, in _initialize
    self.lib = open_cudalib('cudart')
File "/opt/conda/lib/python3.10/site-packages/numba/cuda/cudadrv/libs.py", line 65, in open_cudalib
    return ctypes.CDLL(path)
File "/opt/conda/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)

@ncclementi
Contributor Author

ncclementi commented Feb 12, 2025

I inspected this a bit more but can't quite figure out how to make it work. Maybe you have another idea, @jacobtomlinson.

What I discovered:

  • There is no access to a terminal, so all commands have to be run from a notebook cell using the ! magic.
  • There is one big conda env where everything gets installed. I tried creating another environment, but activating it fails, and I cannot init micromamba because the "terminal" is sh, I can't switch shells, and micromamba doesn't allow me to use sh.
  • !conda info
libmamba version : 1.5.8  
    micromamba version : 1.5.8  
          curl version : libcurl/8.6.0 OpenSSL/3.2.1 zlib/1.2.13 zstd/1.5.5 libssh2/1.11.0 nghttp2/1.58.0  
    libarchive version : libarchive 3.7.2 zlib/1.2.13 bz2lib/1.0.8 libzstd/1.5.5  
      envs directories : /opt/conda/envs  
         package cache : /opt/conda/pkgs  
                         /root/.mamba/pkgs  
           environment : base (active)  
          env location : /opt/conda  
     user config files : /root/.mambarc  
populated config files : /root/.condarc  
      virtual packages : __unix=0=0  
                         __linux=5.4.181=0  
                         __glibc=2.31=0  
                         __archspec=1=x86_64-v3  
                         __cuda=12.4=0  
              channels : file:///opt/repo/internal/linux-64  
                         file:///opt/repo/internal/noarch  
      base environment : /opt/conda  
              platform : linux-64
  • !find / -name "libcudart.so*"
/opt/conda/lib/python3.10/site-packages/nvidia/cuda_runtime/lib/libcudart.so.12
  • there is no /usr/local/cuda
  • ! echo $LD_LIBRARY_PATH returns empty
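
Since there is no terminal, the same search can be done from a notebook cell in plain Python. This is a minimal sketch, not something from the runtime itself: the helper name find_cudart is made up for illustration, and it only separates the unversioned name (libcudart.so, the name in the error message) from versioned files such as libcudart.so.12.

```python
from pathlib import Path
import sysconfig


def find_cudart(root: Path) -> dict:
    """Group libcudart files under `root` by unversioned vs. versioned name.

    `find_cudart` is a hypothetical helper, not part of cudf, numba, or Snowflake.
    """
    found = {"unversioned": [], "versioned": []}
    for path in sorted(root.rglob("libcudart.so*")):
        # "libcudart.so" is the exact name the loader was asked for in the error;
        # anything else (e.g. "libcudart.so.12") carries a version suffix.
        key = "unversioned" if path.name == "libcudart.so" else "versioned"
        found[key].append(path)
    return found


if __name__ == "__main__":
    # Assumption: the pip-installed NVIDIA wheels live under the active
    # interpreter's site-packages directory, as the find output above suggests.
    site_packages = Path(sysconfig.get_paths()["purelib"])
    print(find_cudart(site_packages))
```

If the "unversioned" list comes back empty while "versioned" does not, that matches the find output above, where only libcudart.so.12 exists.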

I tried running this, as suggested by @jameslamb, but ended up with a different error. I'm not quite sure how to move on from here.

!LD_LIBRARY_PATH="/opt/conda/lib/python3.10/site-packages/nvidia/cuda_runtime/lib/" python -c "import cudf; print(cudf.__version__)"

/opt/conda/lib/python3.10/site-packages/cudf/utils/_ptxcompiler.py:64: UserWarning: Error getting driver and runtime versions:  

stdout:  

stderr:  

Traceback (most recent call last):  
  File "<string>", line 7, in <module>  
  File "/opt/conda/lib/python3.10/site-packages/numba_cuda/numba/cuda/cudadrv/runtime.py", line 111, in get_version  
    self.cudaRuntimeGetVersion(ctypes.byref(rtver))  
  File "/opt/conda/lib/python3.10/site-packages/numba_cuda/numba/cuda/cudadrv/runtime.py", line 65, in __getattr__  
    self._initialize()  
  File "/opt/conda/lib/python3.10/site-packages/numba_cuda/numba/cuda/cudadrv/runtime.py", line 51, in _initialize  
    self.lib = open_cudalib('cudart')  
  File "/opt/conda/lib/python3.10/site-packages/numba_cuda/numba/cuda/cudadrv/libs.py", line 65, in open_cudalib  
    return ctypes.CDLL(path)  
  File "/opt/conda/lib/python3.10/ctypes/__init__.py", line 374, in __init__  
    self._handle = _dlopen(self._name, mode)  
OSError: libcudart.so: cannot open shared object file: No such file or directory  


Not patching Numba  
  warnings.warn(msg, UserWarning)  
Traceback (most recent call last):  
  File "<string>", line 1, in <module>  
  File "/opt/conda/lib/python3.10/site-packages/cudf/__init__.py", line 20, in <module>  
    validate_setup()  
  File "/opt/conda/lib/python3.10/site-packages/cudf/utils/gpu_utils.py", line 96, in validate_setup  
    cuda_runtime_version = runtimeGetVersion()  
  File "/opt/conda/lib/python3.10/site-packages/rmm/_cuda/gpu.py", line 88, in runtimeGetVersion  
    major, minor = numba.cuda.runtime.get_version()  
  File "/opt/conda/lib/python3.10/site-packages/numba_cuda/numba/cuda/cudadrv/runtime.py", line 111, in get_version  
    self.cudaRuntimeGetVersion(ctypes.byref(rtver))  
  File "/opt/conda/lib/python3.10/site-packages/numba_cuda/numba/cuda/cudadrv/runtime.py", line 65, in __getattr__  
    self._initialize()  
  File "/opt/conda/lib/python3.10/site-packages/numba_cuda/numba/cuda/cudadrv/runtime.py", line 51, in _initialize  
    self.lib = open_cudalib('cudart')  
  File "/opt/conda/lib/python3.10/site-packages/numba_cuda/numba/cuda/cudadrv/libs.py", line 65, in open_cudalib  
    return ctypes.CDLL(path)  
  File "/opt/conda/lib/python3.10/ctypes/__init__.py", line 374, in __init__  
    self._handle = _dlopen(self._name, mode)  
OSError: libcudart.so: cannot open shared object file: No such file or directory

I tried creating a symbolic link

!ln -s /opt/conda/lib/python3.10/site-packages/nvidia/cuda_runtime/lib/libcudart.so.12 /opt/conda/lib/python3.10/site-packages/nvidia/cuda_runtime/lib/libcudart.so
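
The reasoning behind the link, as far as I can tell: the error names libcudart.so without a version suffix, while the only file found on disk is libcudart.so.12, so pointing LD_LIBRARY_PATH at that directory is probably not enough on its own. A minimal Python sketch of the same step, safe to re-run from a notebook cell; ensure_unversioned_link is a hypothetical helper, and unlike the ln -s command above it creates a relative link.

```python
import os
from pathlib import Path


def ensure_unversioned_link(
    lib_dir: Path,
    versioned: str = "libcudart.so.12",
    unversioned: str = "libcudart.so",
) -> Path:
    """Create `unversioned` as a symlink to `versioned` inside `lib_dir`.

    Hypothetical helper for illustration; it does nothing if the
    unversioned name already exists.
    """
    target = lib_dir / versioned
    link = lib_dir / unversioned
    if not target.exists():
        raise FileNotFoundError(target)
    if link.is_symlink() or link.exists():
        return link  # leave whatever is already there untouched
    # A relative link target is resolved relative to the link's own directory.
    os.symlink(target.name, link)
    return link


if __name__ == "__main__":
    # Path taken from the find output earlier in this comment.
    ensure_unversioned_link(
        Path("/opt/conda/lib/python3.10/site-packages/nvidia/cuda_runtime/lib")
    )
```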

then running

!LD_LIBRARY_PATH="/opt/conda/lib/python3.10/site-packages/nvidia/cuda_runtime/lib/" python -c "import cudf; print(cudf.__version__)"

then got

Traceback (most recent call last):  
  File "<string>", line 1, in <module>  
  File "/opt/conda/lib/python3.10/site-packages/cudf/__init__.py", line 19, in <module>  
    _setup_numba()  
  File "/opt/conda/lib/python3.10/site-packages/cudf/utils/_numba.py", line 121, in _setup_numba  
    shim_ptx_cuda_version = _get_cuda_build_version()  
  File "/opt/conda/lib/python3.10/site-packages/cudf/utils/_numba.py", line 16, in _get_cuda_build_version  
    from cudf._lib import strings_udf  
  File "/opt/conda/lib/python3.10/site-packages/cudf/_lib/__init__.py", line 4, in <module>  
    from . import (  
  File "binaryop.pyx", line 1, in init cudf._lib.binaryop  
  File "column.pyx", line 1, in init cudf._lib.column  
  File "scalar.pyx", line 1, in init cudf._lib.scalar  
  File "/opt/conda/lib/python3.10/site-packages/pylibcudf/__init__.py", line 13, in <module>  
    from . import (  
  File "expressions.pyx", line 5, in init pylibcudf.expressions  
  File "/opt/conda/lib/python3.10/site-packages/pyarrow/__init__.py", line 65, in <module>  
    import pyarrow.lib as _lib  
ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /opt/conda/lib/python3.10/site-packages/pyarrow/lib.cpython-310-x86_64-linux-gnu.so)

@jacobtomlinson
Member

This might require a little more LD_LIBRARY_PATH tweaking to help it find GLIBCXX_3.4.29 if it is indeed installed.

Are you installing via pip? Given that it's a conda environment are you able to do conda install -c rapidsai -c conda-forge -c nvidia cudf=24.12 instead?

@ncclementi
Contributor Author

> This might require a little more LD_LIBRARY_PATH tweaking to help it find GLIBCXX_3.4.29 if it is indeed installed.

Yes, I need to check if I can even find it. Not having a terminal makes it very annoying.
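
One way to check without a terminal is to scan a libstdc++.so.6 file for its GLIBCXX version strings, roughly what strings piped into grep would do. This is a rough sketch under assumptions: the helper names are made up, and the /opt/conda/lib path is a guess about where a newer copy might live, not something confirmed in this environment.

```python
import re
from pathlib import Path

# Matches version tags such as GLIBCXX_3.4 or GLIBCXX_3.4.29 in the raw bytes.
_GLIBCXX = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)+)")


def glibcxx_versions(lib_path) -> list:
    """Return the GLIBCXX version numbers found in a library, sorted numerically.

    Hypothetical helper; it scans the file bytes rather than parsing ELF.
    """
    data = Path(lib_path).read_bytes()
    versions = {m.group(1).decode() for m in _GLIBCXX.finditer(data)}
    return sorted(versions, key=lambda v: tuple(int(p) for p in v.split(".")))


def provides(lib_path, wanted: str) -> bool:
    """True if the library mentions GLIBCXX_<wanted>, e.g. wanted='3.4.29'."""
    return wanted in glibcxx_versions(lib_path)


if __name__ == "__main__":
    candidates = [
        "/usr/lib/x86_64-linux-gnu/libstdc++.so.6",  # path from the ImportError
        "/opt/conda/lib/libstdc++.so.6",  # assumption: may not exist here
    ]
    for candidate in candidates:
        if Path(candidate).exists():
            print(candidate, provides(candidate, "3.4.29"))
```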

> Are you installing via pip?

I'm installing via pip. We can't install via conda: I tried conda install -c rapidsai -c conda-forge -c nvidia cudf=24.12 but got

critical libmamba Multiple errors occured:
  Download error (6) Couldn't resolve host name [https://conda.anaconda.org/conda-forge/noarch/repodata.json.zst]  
    Could not resolve host: conda.anaconda.org  
    Subdir conda-forge/noarch not loaded!

It's network restricted. I'm not entirely sure whether it is possible to allow all kinds of network access to try to fix this.
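
A quick way to see which hosts fail at the name-resolution step, as conda.anaconda.org did above, is to try resolving each one from a notebook cell. This is only a sketch: check_hosts and resolvable are hypothetical helpers, and a successful lookup does not prove that egress to that host is actually allowed.

```python
import socket


def resolvable(host: str, resolver=socket.getaddrinfo) -> bool:
    """Return True if `host` resolves; `resolver` is injectable for testing."""
    try:
        resolver(host, 443)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return True


def check_hosts(hosts, resolver=socket.getaddrinfo) -> dict:
    """Map each host name to whether it could be resolved."""
    return {host: resolvable(host, resolver) for host in hosts}


if __name__ == "__main__":
    # Hosts mentioned in this thread.
    for host, ok in check_hosts(
        ["pypi.org", "pypi.nvidia.com", "conda.anaconda.org"]
    ).items():
        print(f"{host}: {'resolves' if ok else 'does not resolve'}")
```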

@jacobtomlinson
Member

> It's network restricted.

Yeah this makes it kinda painful. I think they allowlisted pypi.org and that's about it.

> Not having a terminal makes it very annoying.

I wonder if there is some way to get tmate on there? Or use netcat in a similar way to this GitHub Actions blog post I wrote a while back.

If their network restrictions are just DNS based then you could use IP addresses of the Ngrok server to set up the connection.

@ncclementi
Contributor Author

I wonder if I can hack around and add conda.anaconda.org as part of the list here. That's where I added pypi.nvidia.com. I might try.

CREATE OR REPLACE NETWORK RULE pypi_network_rule
  MODE = EGRESS
  TYPE = HOST_PORT
  VALUE_LIST = ('pypi.org', 'pypi.python.org', 'pythonhosted.org',  'files.pythonhosted.org', 'pypi.nvidia.com');
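
To try different host lists without hand-editing the statement each time, the rule text can be generated from a Python list. This is a small sketch that only builds a string of the same shape as the statement above; network_rule_sql is a hypothetical helper, and whether adding conda.anaconda.org is enough for conda to work is untested.

```python
def network_rule_sql(name: str, hosts: list) -> str:
    """Build a CREATE NETWORK RULE statement shaped like the one above.

    Hypothetical helper: it only formats text and does not talk to Snowflake.
    """
    for host in hosts:
        if "'" in host:
            raise ValueError(f"unexpected quote in host name: {host!r}")
    value_list = ", ".join(f"'{host}'" for host in hosts)
    return (
        f"CREATE OR REPLACE NETWORK RULE {name}\n"
        "  MODE = EGRESS\n"
        "  TYPE = HOST_PORT\n"
        f"  VALUE_LIST = ({value_list});"
    )


if __name__ == "__main__":
    hosts = [
        "pypi.org",
        "pypi.python.org",
        "pythonhosted.org",
        "files.pythonhosted.org",
        "pypi.nvidia.com",
        "conda.anaconda.org",  # the untested addition discussed above
    ]
    print(network_rule_sql("pypi_network_rule", hosts))
```

The external access integration still has to list this rule in ALLOWED_NETWORK_RULES for it to take effect.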
